WorldWideScience

Sample records for novo protein structure

  1. De novo protein structure generation from incomplete chemical shift assignments

    Energy Technology Data Exchange (ETDEWEB)

    Shen Yang [National Institutes of Health, Laboratory of Chemical Physics, National Institute of Diabetes and Digestive and Kidney Diseases (United States); Vernon, Robert; Baker, David [University of Washington, Department of Biochemistry and Howard Hughes Medical Institute (United States); Bax, Ad [National Institutes of Health, Laboratory of Chemical Physics, National Institute of Diabetes and Digestive and Kidney Diseases (United States)], E-mail: bax@nih.gov

    2009-02-15

    NMR chemical shifts provide important local structural information for proteins. Consistent structure generation from NMR chemical shift data has recently become feasible for proteins with sizes of up to 130 residues, and such structures are of a quality comparable to those obtained with the standard NMR protocol. This study investigates the influence of the completeness of chemical shift assignments on structures generated from chemical shifts. The Chemical-Shift-Rosetta (CS-Rosetta) protocol was used for de novo protein structure generation with various degrees of completeness of the chemical shift assignment, simulated by omission of entries in the experimental chemical shift data previously used for the initial demonstration of the CS-Rosetta approach. In addition, a new CS-Rosetta protocol is described that improves robustness of the method for proteins with missing or erroneous NMR chemical shift input data. This strategy, which uses traditional Rosetta for pre-filtering of the fragment selection process, is demonstrated for two paramagnetic proteins and also for two proteins with solid-state NMR chemical shift assignments.

  2. De novo protein structure determination using sparse NMR data

    International Nuclear Information System (INIS)

    Bowers, Peter M.; Strauss, Charlie E.M.; Baker, David

    2000-01-01

    We describe a method for generating moderate to high-resolution protein structures using limited NMR data combined with the ab initio protein structure prediction method Rosetta. Peptide fragments are selected from proteins of known structure based on sequence similarity and consistency with chemical shift and NOE data. Models are built from these fragments by minimizing an energy function that favors hydrophobic burial, strand pairing, and satisfaction of NOE constraints. Models generated using this procedure with ∼1 NOE constraint per residue are in some cases closer to the corresponding X-ray structures than the published NMR solution structures. The method requires only the sparse constraints available during initial stages of NMR structure determination, and thus holds promise for increasing the speed with which protein solution structures can be determined.

  3. Building a Better Fragment Library for De Novo Protein Structure Prediction

    Science.gov (United States)

    de Oliveira, Saulo H. P.; Shi, Jiye; Deane, Charlotte M.

    2015-01-01

    Fragment-based approaches are the current standard for de novo protein structure prediction. These approaches rely on accurate and reliable fragment libraries to generate good structural models. In this work, we describe a novel method for structure fragment library generation and its application in fragment-based de novo protein structure prediction. The importance of correct testing procedures in assessing the quality of fragment libraries is demonstrated. In particular, homologs to the target must be excluded from the libraries to correctly simulate a de novo protein structure prediction scenario, something which surprisingly is not always done. We demonstrate that fragments presenting different predominant predicted secondary structures should be treated differently during the fragment library generation step and that exhaustive and random search strategies should both be used. This information was used to develop a novel method, Flib. On a validation set of 41 structurally diverse proteins, Flib libraries present both higher precision and higher coverage than two state-of-the-art methods, NNMake and HHFrag. Flib also achieves better precision and coverage on the set of 275 protein domains used in the two previous experiments of the Critical Assessment of Structure Prediction (CASP9 and CASP10). We compared Flib libraries against NNMake libraries in a structure prediction context. Of the 13 cases in which a correct answer was generated, Flib models were more accurate than NNMake models for 10. Flib is available for download at: http://www.stats.ox.ac.uk/research/proteins/resources. PMID:25901595
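
    The precision and coverage figures quoted above can be illustrated with a minimal sketch. It assumes the common convention that a fragment is "good" when its RMSD to the native structure falls below a cutoff, that precision is the fraction of good fragments in the library, and that coverage is the fraction of target residues spanned by at least one good fragment. The 1.5 Å cutoff, the function name, and the input layout are illustrative assumptions, not taken from the record.

```python
from typing import Dict, List, Tuple

# Assumed cutoff (in angstroms) below which a fragment counts as "good".
GOOD_RMSD_CUTOFF = 1.5


def precision_and_coverage(
    library: Dict[int, List[float]],
    target_length: int,
    fragment_length: int,
    cutoff: float = GOOD_RMSD_CUTOFF,
) -> Tuple[float, float]:
    """Return (precision, coverage) for a fragment library.

    `library` maps a 0-based start position in the target to the RMSDs
    (fragment vs. native structure) of the fragments proposed there.
    This layout is hypothetical and chosen only for the example.
    """
    total = sum(len(rmsds) for rmsds in library.values())
    good = sum(1 for rmsds in library.values() for r in rmsds if r < cutoff)

    covered = set()
    for start, rmsds in library.items():
        if any(r < cutoff for r in rmsds):
            # Every residue spanned by a good fragment counts as covered.
            end = min(start + fragment_length, target_length)
            covered.update(range(start, end))

    precision = good / total if total else 0.0
    coverage = len(covered) / target_length if target_length else 0.0
    return precision, coverage
```

    For instance, a 10-residue target with 3-residue fragments at positions 0, 3 and 5, of which two out of four fragments are good, gives a precision of 0.5; the good fragments at positions 0 and 5 span six residues, giving a coverage of 0.6.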

  4. Building a better fragment library for de novo protein structure prediction.

    Directory of Open Access Journals (Sweden)

    Saulo H P de Oliveira

    Fragment-based approaches are the current standard for de novo protein structure prediction. These approaches rely on accurate and reliable fragment libraries to generate good structural models. In this work, we describe a novel method for structure fragment library generation and its application in fragment-based de novo protein structure prediction. The importance of correct testing procedures in assessing the quality of fragment libraries is demonstrated. In particular, homologs to the target must be excluded from the libraries to correctly simulate a de novo protein structure prediction scenario, something which surprisingly is not always done. We demonstrate that fragments presenting different predominant predicted secondary structures should be treated differently during the fragment library generation step and that exhaustive and random search strategies should both be used. This information was used to develop a novel method, Flib. On a validation set of 41 structurally diverse proteins, Flib libraries present both higher precision and higher coverage than two state-of-the-art methods, NNMake and HHFrag. Flib also achieves better precision and coverage on the set of 275 protein domains used in the two previous experiments of the Critical Assessment of Structure Prediction (CASP9 and CASP10). We compared Flib libraries against NNMake libraries in a structure prediction context. Of the 13 cases in which a correct answer was generated, Flib models were more accurate than NNMake models for 10. Flib is available for download at: http://www.stats.ox.ac.uk/research/proteins/resources.

  5. Sequential search leads to faster, more efficient fragment-based de novo protein structure prediction.

    Science.gov (United States)

    de Oliveira, Saulo H P; Law, Eleanor C; Shi, Jiye; Deane, Charlotte M

    2018-04-01

    Most current de novo structure prediction methods randomly sample protein conformations and thus require large amounts of computational resources. Here, we consider a sequential sampling strategy, building on ideas from recent experimental work which shows that many proteins fold cotranslationally. We have investigated whether a pseudo-greedy search approach, which begins sequentially from one of the termini, can improve the performance and accuracy of de novo protein structure prediction. We observed that our sequential approach converges when fewer than 20 000 decoys have been produced, fewer than commonly expected. Using our software, SAINT2, we also compared the run time and quality of models produced in a sequential fashion against a standard, non-sequential approach. Sequential prediction produces an individual decoy 1.5-2.5 times faster than non-sequential prediction. When considering the quality of the best model, sequential prediction led to a better model being produced for 31 out of 41 soluble protein validation cases and for 18 out of 24 transmembrane protein cases. Correct models (TM-Score > 0.5) were produced for 29 of these cases by the sequential mode and for only 22 by the non-sequential mode. Our comparison reveals that a sequential search strategy can be used to drastically reduce computational time of de novo protein structure prediction and improve accuracy. Data are available for download from: http://opig.stats.ox.ac.uk/resources. SAINT2 is available for download from: https://github.com/sauloho/SAINT2. Contact: saulo.deoliveira@dtc.ox.ac.uk. Supplementary data are available at Bioinformatics online.
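
    The head-to-head tallies in this abstract (targets where one mode gives the better best model, and targets with a correct model at TM-Score > 0.5) amount to a simple per-target comparison. The sketch below shows that bookkeeping only; the function name and input layout are hypothetical, and it is not part of SAINT2.

```python
from typing import Dict, Sequence


def compare_modes(
    sequential: Sequence[float],
    non_sequential: Sequence[float],
    cutoff: float = 0.5,
) -> Dict[str, int]:
    """Tally per-target results from best-model TM-scores of two modes.

    Both inputs hold one best-model TM-score per target, in the same
    target order. A model counts as correct when its TM-score is
    strictly greater than `cutoff`, following the abstract's criterion.
    """
    if len(sequential) != len(non_sequential):
        raise ValueError("both modes must cover the same targets")

    pairs = list(zip(sequential, non_sequential))
    return {
        "targets": len(pairs),
        "correct_sequential": sum(1 for s, _ in pairs if s > cutoff),
        "correct_non_sequential": sum(1 for _, n in pairs if n > cutoff),
        "sequential_better": sum(1 for s, n in pairs if s > n),
    }
```

    Feeding it the real per-target scores (not reproduced in this record) would yield the kind of counts reported above.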

  6. Algorithm for selection of optimized EPR distance restraints for de novo protein structure determination

    Science.gov (United States)

    Kazmier, Kelli; Alexander, Nathan S.; Meiler, Jens; Mchaourab, Hassane S.

    2010-01-01

    A hybrid protein structure determination approach combining sparse Electron Paramagnetic Resonance (EPR) distance restraints and Rosetta de novo protein folding has been previously demonstrated to yield high quality models (Alexander et al., 2008). However, widespread application of this methodology to proteins of unknown structures is hindered by the lack of a general strategy to place spin label pairs in the primary sequence. In this work, we report the development of an algorithm that optimally selects spin labeling positions for the purpose of distance measurements by EPR. For the α-helical subdomain of T4 lysozyme (T4L), simulated restraints that maximize sequence separation between the two spin labels while simultaneously ensuring pairwise connectivity of secondary structure elements yielded vastly improved models by Rosetta folding. 50% of all these models have the correct fold compared to only 21% and 8% correctly folded models when randomly placed restraints or no restraints are used, respectively. Moreover, the improvements in model quality require a limited number of optimized restraints, the number of which is determined by the pairwise connectivities of T4L α-helices. The predicted improvement in Rosetta model quality was verified by experimental determination of distances between spin label pairs selected by the algorithm. Overall, our results reinforce the rationale for the combined use of sparse EPR distance restraints and de novo folding. By alleviating the experimental bottleneck associated with restraint selection, this algorithm sets the stage for extending computational structure determination to larger, traditionally elusive protein topologies of critical structural and biochemical importance. PMID:21074624

  7. Structural Insight into the Core of CAD, the Multifunctional Protein Leading De Novo Pyrimidine Biosynthesis.

    Science.gov (United States)

    Moreno-Morcillo, María; Grande-García, Araceli; Ruiz-Ramos, Alba; Del Caño-Ochoa, Francisco; Boskovic, Jasminka; Ramón-Maiques, Santiago

    2017-06-06

    CAD, the multifunctional protein initiating and controlling de novo biosynthesis of pyrimidines in animals, self-assembles into ∼1.5 MDa hexamers. The structures of the dihydroorotase (DHO) and aspartate transcarbamoylase (ATC) domains of human CAD have been previously determined, but we lack information on how these domains associate and interact with the rest of CAD forming a multienzymatic unit. Here, we prove that a construct covering human DHO and ATC oligomerizes as a dimer of trimers and that this arrangement is conserved in CAD-like from fungi, which holds an inactive DHO-like domain. The crystal structures of the ATC trimer and DHO-like dimer from the fungus Chaetomium thermophilum confirm the similarity with the human CAD homologs. These results demonstrate that, despite being inactive, the fungal DHO-like domain has a conserved structural function. We propose a model that sets the DHO and ATC complex as the central element in the architecture of CAD. Copyright © 2017 Elsevier Ltd. All rights reserved.

  8. Get phases from arsenic anomalous scattering: de novo SAD phasing of two protein structures crystallized in cacodylate buffer.

    Directory of Open Access Journals (Sweden)

    Xiang Liu

    The crystal structures of two proteins, a putative pyrazinamidase/nicotinamidase from the dental pathogen Streptococcus mutans (SmPncA) and the human caspase-6 (Casp6), were solved by the de novo arsenic single-wavelength anomalous diffraction (As-SAD) phasing method. Arsenic (As), an uncommonly used element in SAD phasing, was covalently introduced into proteins by cacodylic acid, the buffering agent in the crystallization reservoirs. In SmPncA, the only cysteine was bound to dimethylarsinoyl, which is a pentavalent arsenic group (As (V)). This arsenic atom and a protein-bound zinc atom both generated anomalous signals. The predominant contribution, however, was from the As anomalous signals, which were sufficient to phase the SmPncA structure alone. In Casp6, four cysteines were found to bind cacodyl, a trivalent arsenic group (As (III)), in the presence of the reducing agent, dithiothreitol (DTT), and arsenic atoms were the only anomalous scatterers for SAD phasing. Analyses and discussion of these two As-SAD phasing examples and comparison of As with other traditional heavy atoms that generate anomalous signals, together with a few arsenic-based de novo phasing cases reported previously, strongly suggest that As is an ideal anomalous scatterer for SAD phasing in protein crystallography.

  9. De novo protein structure prediction by dynamic fragment assembly and conformational space annealing.

    Science.gov (United States)

    Lee, Juyong; Lee, Jinhyuk; Sasaki, Takeshi N; Sasai, Masaki; Seok, Chaok; Lee, Jooyoung

    2011-08-01

    Ab initio protein structure prediction is a challenging problem that requires both an accurate energetic representation of a protein structure and an efficient conformational sampling method for successful protein modeling. In this article, we present an ab initio structure prediction method which combines a recently suggested novel way of fragment assembly, dynamic fragment assembly (DFA), with the conformational space annealing (CSA) algorithm. In DFA, model structures are scored by continuous functions constructed based on short- and long-range structural restraint information from a fragment library. Here, DFA is represented by the full-atom model of CHARMM with the addition of the empirical potential of DFIRE. The relative contributions of the various energy terms are optimized using linear programming. The conformational sampling was carried out with the CSA algorithm, which can find low-energy conformations more efficiently than the simulated annealing used in the existing DFA study. The newly introduced DFA energy function and CSA sampling algorithm are implemented in CHARMM. Test results on 30 small single-domain proteins and 13 template-free modeling targets of the 8th Critical Assessment of protein Structure Prediction show that the current method provides comparable and complementary prediction results to existing top methods. Copyright © 2011 Wiley-Liss, Inc.

  10. The dual role of fragments in fragment-assembly methods for de novo protein structure prediction

    Science.gov (United States)

    Handl, Julia; Knowles, Joshua; Vernon, Robert; Baker, David; Lovell, Simon C.

    2013-01-01

    In fragment-assembly techniques for protein structure prediction, models of protein structure are assembled from fragments of known protein structures. This process is typically guided by a knowledge-based energy function and uses a heuristic optimization method. The fragments play two important roles in this process: they define the set of structural parameters available, and they also assume the role of the main variation operators that are used by the optimiser. Previous analysis has typically focused on the first of these roles. In particular, the relationship between local amino acid sequence and local protein structure has been studied by a range of authors. The correlation between the two has been shown to vary with the window length considered, and the results of these analyses have informed directly the choice of fragment length in state-of-the-art prediction techniques. Here, we focus on the second role of fragments and aim to determine the effect of fragment length from an optimization perspective. We use theoretical analyses to reveal how the size and structure of the search space changes as a function of insertion length. Furthermore, empirical analyses are used to explore additional ways in which the size of the fragment insertion influences the search both in a simulation model and for the fragment-assembly technique, Rosetta. PMID:22095594
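
    The "second role" of fragments described above, as the variation operator of the search, can be sketched as a single fragment-insertion move on a backbone torsion-angle representation. This is a toy illustration under stated assumptions: the (phi, psi) representation, the library layout and all function names are hypothetical, and no energy function or Rosetta call is involved.

```python
import random
from typing import Dict, List, Tuple

Torsions = Tuple[float, float]  # (phi, psi) in degrees, one pair per residue


def num_insertion_windows(chain_length: int, fragment_length: int) -> int:
    """Number of positions where a fragment of this length fits in the chain."""
    return max(chain_length - fragment_length + 1, 0)


def insert_fragment(
    conformation: List[Torsions], fragment: List[Torsions], start: int
) -> List[Torsions]:
    """Return a copy of `conformation` with `fragment` written at `start`."""
    end = start + len(fragment)
    if start < 0 or end > len(conformation):
        raise ValueError("fragment does not fit at this position")
    return conformation[:start] + list(fragment) + conformation[end:]


def random_insertion(
    conformation: List[Torsions],
    library: Dict[int, List[List[Torsions]]],
    rng: random.Random,
) -> List[Torsions]:
    """One variation move: pick a window, then a fragment for that window.

    `library` maps a start position to the candidate fragments for it
    (a hypothetical layout chosen for the example).
    """
    start = rng.choice(sorted(library))
    fragment = rng.choice(library[start])
    return insert_fragment(conformation, fragment, start)
```

    In this picture a longer insertion changes more torsion angles per move but fits in fewer windows, which is one simple way to see why fragment length matters from an optimization perspective.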

  11. Foldability of a Natural De Novo Evolved Protein.

    Science.gov (United States)

    Bungard, Dixie; Copple, Jacob S; Yan, Jing; Chhun, Jimmy J; Kumirov, Vlad K; Foy, Scott G; Masel, Joanna; Wysocki, Vicki H; Cordes, Matthew H J

    2017-11-07

    The de novo evolution of protein-coding genes from noncoding DNA is emerging as a source of molecular innovation in biology. Studies of random sequence libraries, however, suggest that young de novo proteins will not fold into compact, specific structures typical of native globular proteins. Here we show that Bsc4, a functional, natural de novo protein encoded by a gene that evolved recently from noncoding DNA in the yeast S. cerevisiae, folds to a partially specific three-dimensional structure. Bsc4 forms soluble, compact oligomers with high β sheet content and a hydrophobic core, and undergoes cooperative, reversible denaturation. Bsc4 lacks a specific quaternary state, however, existing instead as a continuous distribution of oligomer sizes, and binds dyes indicative of amyloid oligomers or molten globules. The combination of native-like and non-native-like properties suggests a rudimentary fold that could potentially act as a functional intermediate in the emergence of new folded proteins de novo. Copyright © 2017 Elsevier Ltd. All rights reserved.

  12. De novo structural modeling and computational sequence analysis ...

    African Journals Online (AJOL)

    Different bioinformatics tools and machine learning techniques were used for protein structural classification. De novo protein modeling was performed by using I-TASSER server. The final model obtained was assessed by PROCHECK and DFIRE2, which confirmed that the final model is reliable. Until complete biochemical ...

  13. De Novo Construction of Redox Active Proteins.

    Science.gov (United States)

    Moser, C C; Sheehan, M M; Ennist, N M; Kodali, G; Bialas, C; Englander, M T; Discher, B M; Dutton, P L

    2016-01-01

    Relatively simple principles can be used to plan and construct de novo proteins that bind redox cofactors and participate in a range of electron-transfer reactions analogous to those seen in natural oxidoreductase proteins. These designed redox proteins are called maquettes. Hydrophobic/hydrophilic binary patterning of heptad repeats of amino acids linked together in a single-chain self-assemble into 4-alpha-helix bundles. These bundles form a robust and adaptable frame for uncovering the default properties of protein embedded cofactors independent of the complexities introduced by generations of natural selection and allow us to better understand what factors can be exploited by man or nature to manipulate the physical chemical properties of these cofactors. Anchoring of redox cofactors such as hemes, light active tetrapyrroles, FeS clusters, and flavins by His and Cys residues allow cofactors to be placed at positions in which electron-tunneling rates between cofactors within or between proteins can be predicted in advance. The modularity of heptad repeat designs facilitates the construction of electron-transfer chains and novel combinations of redox cofactors and new redox cofactor assisted functions. Developing de novo designs that can support cofactor incorporation upon expression in a cell is needed to support a synthetic biology advance that integrates with natural bioenergetic pathways. © 2016 Elsevier Inc. All rights reserved.

  14. De novo origin of human protein-coding genes.

    Directory of Open Access Journals (Sweden)

    Dong-Dong Wu

    2011-11-01

    The de novo origin of a new protein-coding gene from non-coding DNA is considered to be a very rare occurrence in genomes. Here we identify 60 new protein-coding genes that originated de novo on the human lineage since divergence from the chimpanzee. The functionality of these genes is supported by both transcriptional and proteomic evidence. RNA-seq data indicate that these genes have their highest expression levels in the cerebral cortex and testes, which might suggest that these genes contribute to phenotypic traits that are unique to humans, such as improved cognitive ability. Our results are inconsistent with the traditional view that the de novo origin of new genes is very rare, thus there should be greater appreciation of the importance of the de novo origination of genes.

  15. De Novo Origin of Human Protein-Coding Genes

    Science.gov (United States)

    Wu, Dong-Dong; Irwin, David M.; Zhang, Ya-Ping

    2011-01-01

    The de novo origin of a new protein-coding gene from non-coding DNA is considered to be a very rare occurrence in genomes. Here we identify 60 new protein-coding genes that originated de novo on the human lineage since divergence from the chimpanzee. The functionality of these genes is supported by both transcriptional and proteomic evidence. RNA–seq data indicate that these genes have their highest expression levels in the cerebral cortex and testes, which might suggest that these genes contribute to phenotypic traits that are unique to humans, such as improved cognitive ability. Our results are inconsistent with the traditional view that the de novo origin of new genes is very rare, thus there should be greater appreciation of the importance of the de novo origination of genes. PMID:22102831

  16. Automated de novo phasing and model building of coiled-coil proteins.

    Science.gov (United States)

    Rämisch, Sebastian; Lizatović, Robert; André, Ingemar

    2015-03-01

    Models generated by de novo structure prediction can be very useful starting points for molecular replacement for systems where suitable structural homologues cannot be readily identified. Protein-protein complexes and de novo-designed proteins are examples of systems that can be challenging to phase. In this study, the potential of de novo models of protein complexes for use as starting points for molecular replacement is investigated. The approach is demonstrated using homomeric coiled-coil proteins, which are excellent model systems for oligomeric systems. Despite the stereotypical fold of coiled coils, initial phase estimation can be difficult and many structures have to be solved with experimental phasing. A method was developed for automatic structure determination of homomeric coiled coils from X-ray diffraction data. In a benchmark set of 24 coiled coils, ranging from dimers to pentamers with resolutions down to 2.5 Å, 22 systems were automatically solved, 11 of which had previously been solved by experimental phasing. The generated models contained 71-103% of the residues present in the deposited structures, had the correct sequence and had free R values that deviated on average by 0.01 from those of the respective reference structures. The electron-density maps were of sufficient quality that only minor manual editing was necessary to produce final structures. The method, named CCsolve, combines methods for de novo structure prediction, initial phase estimation and automated model building into one pipeline. CCsolve is robust against errors in the initial models and can readily be modified to make use of alternative crystallographic software. The results demonstrate the feasibility of de novo phasing of protein-protein complexes, an approach that could also be employed for other small systems beyond coiled coils.

  17. Getting the best out of long-wavelength X-rays: de novo chlorine/sulfur SAD phasing of a structural protein from ATV

    DEFF Research Database (Denmark)

    Goulet, Adeline; Vestergaard, Gisle Alberg; Felisberto-Rodrigues, Catarina

    2010-01-01

    The structure of a 14 kDa structural protein from Acidianus two-tailed virus (ATV) was solved by single-wavelength anomalous diffraction (SAD) phasing using X-ray data collected at 2.0 Å wavelength. Although the anomalous signal from methionine sulfurs was expected to suffice to solve the structu...... on intrinsic protein light atoms along with associated chloride ions from the solvent. In such cases, data collection at long wavelengths may be a time-efficient alternative to selenomethionine substitution and heavy-atom derivatization....

  18. Hominoid-specific de novo protein-coding genes originating from long non-coding RNAs.

    Directory of Open Access Journals (Sweden)

    Chen Xie

    2012-09-01

    Tinkering with pre-existing genes has long been known as a major way to create new genes. Recently, however, motherless protein-coding genes have been found to have emerged de novo from ancestral non-coding DNAs. How these genes originated is not well addressed to date. Here we identified 24 hominoid-specific de novo protein-coding genes with precise origination timing in vertebrate phylogeny. Strand-specific RNA-Seq analyses were performed in five rhesus macaque tissues (liver, prefrontal cortex, skeletal muscle, adipose, and testis), which were then integrated with public transcriptome data from human, chimpanzee, and rhesus macaque. On the basis of comparing the RNA expression profiles in the three species, we found that most of the hominoid-specific de novo protein-coding genes encoded polyadenylated non-coding RNAs in rhesus macaque or chimpanzee with a similar transcript structure and correlated tissue expression profile. According to the rule of parsimony, the majority of these hominoid-specific de novo protein-coding genes appear to have acquired a regulated transcript structure and expression profile before acquiring coding potential. Interestingly, although the expression profile was largely correlated, the coding genes in human often showed higher transcriptional abundance than their non-coding counterparts in rhesus macaque. The major findings we report in this manuscript are robust and insensitive to the parameters used in the identification and analysis of de novo genes. Our results suggest that at least a portion of long non-coding RNAs, especially those with active and regulated transcription, may serve as a birth pool for protein-coding genes, which are then further optimized at the transcriptional level.

  19. A de novo designed monomeric, compact three helix bundle protein on a carbohydrate template

    DEFF Research Database (Denmark)

    Malik, Leila; Nygård, Jesper; Christensen, Niels Johan

    2015-01-01

    De novo design and chemical synthesis of proteins and of other artificial structures, which mimic them, is a central strategy for understanding protein folding and for accessing proteins with novel functions. We have previously described carbohydrates as templates for the assembly of artificial...... the template could facilitate protein folding. Here we report the design and synthesis of 3-helix bundle carboproteins on deoxy-hexopyranosides. The carboproteins were analyzed by CD, AUC, SAXS, and NMR, which revealed the formation of the first compact, and folded monomeric carboprotein distinctly different...

  20. De Novo Discovery of Structured ncRNA Motifs in Genomic Sequences

    DEFF Research Database (Denmark)

    Ruzzo, Walter L; Gorodkin, Jan

    2014-01-01

    De novo discovery of "motifs" capturing the commonalities among related noncoding structured RNAs is among the most difficult problems in computational biology. This chapter outlines the challenges presented by this problem, together with some approaches towards solving them, with an emphasis on an approach based on the CMfinder program as a case study. Applications to genomic screens for novel de novo structured ncRNAs, including structured RNA elements in untranslated portions of protein-coding genes, are presented.

  1. Massively parallel de novo protein design for targeted therapeutics

    KAUST Repository

    Chevalier, Aaron; Silva, Daniel-Adriano; Rocklin, Gabriel J.; Hicks, Derrick R.; Vergara, Renan; Murapa, Patience; Bernard, Steffen M.; Zhang, Lu; Lam, Kwok-Ho; Yao, Guorui; Bahl, Christopher D.; Miyashita, Shin-Ichiro; Goreshnik, Inna; Fuller, James T.; Koday, Merika T.; Jenkins, Cody M.; Colvin, Tom; Carter, Lauren; Bohn, Alan; Bryan, Cassie M.; Fernández-Velasco, D. Alejandro; Stewart, Lance; Dong, Min; Huang, Xuhui; Jin, Rongsheng; Wilson, Ian A.; Fuller, Deborah H.; Baker, David

    2017-01-01

    De novo protein design holds promise for creating small stable proteins with shapes customized to bind therapeutic targets. We describe a massively parallel approach for designing, manufacturing and screening mini-protein binders, integrating large-scale computational design, oligonucleotide synthesis, yeast display screening and next-generation sequencing. We designed and tested 22,660 mini-proteins of 37-43 residues that target influenza haemagglutinin and botulinum neurotoxin B, along with 6,286 control sequences to probe contributions to folding and binding, and identified 2,618 high-affinity binders. Comparison of the binding and non-binding design sets, which are two orders of magnitude larger than any previously investigated, enabled the evaluation and improvement of the computational model. Biophysical characterization of a subset of the binder designs showed that they are extremely stable and, unlike antibodies, do not lose activity after exposure to high temperatures. The designs elicit little or no immune response and provide potent prophylactic and therapeutic protection against influenza, even after extensive repeated dosing.

  2. Massively parallel de novo protein design for targeted therapeutics

    KAUST Repository

    Chevalier, Aaron

    2017-09-26

    De novo protein design holds promise for creating small stable proteins with shapes customized to bind therapeutic targets. We describe a massively parallel approach for designing, manufacturing and screening mini-protein binders, integrating large-scale computational design, oligonucleotide synthesis, yeast display screening and next-generation sequencing. We designed and tested 22,660 mini-proteins of 37-43 residues that target influenza haemagglutinin and botulinum neurotoxin B, along with 6,286 control sequences to probe contributions to folding and binding, and identified 2,618 high-affinity binders. Comparison of the binding and non-binding design sets, which are two orders of magnitude larger than any previously investigated, enabled the evaluation and improvement of the computational model. Biophysical characterization of a subset of the binder designs showed that they are extremely stable and, unlike antibodies, do not lose activity after exposure to high temperatures. The designs elicit little or no immune response and provide potent prophylactic and therapeutic protection against influenza, even after extensive repeated dosing.

  3. Massively parallel de novo protein design for targeted therapeutics

    Science.gov (United States)

    Chevalier, Aaron; Silva, Daniel-Adriano; Rocklin, Gabriel J.; Hicks, Derrick R.; Vergara, Renan; Murapa, Patience; Bernard, Steffen M.; Zhang, Lu; Lam, Kwok-Ho; Yao, Guorui; Bahl, Christopher D.; Miyashita, Shin-Ichiro; Goreshnik, Inna; Fuller, James T.; Koday, Merika T.; Jenkins, Cody M.; Colvin, Tom; Carter, Lauren; Bohn, Alan; Bryan, Cassie M.; Fernández-Velasco, D. Alejandro; Stewart, Lance; Dong, Min; Huang, Xuhui; Jin, Rongsheng; Wilson, Ian A.; Fuller, Deborah H.; Baker, David

    2018-01-01

    De novo protein design holds promise for creating small stable proteins with shapes customized to bind therapeutic targets. We describe a massively parallel approach for designing, manufacturing and screening mini-protein binders, integrating large-scale computational design, oligonucleotide synthesis, yeast display screening and next-generation sequencing. We designed and tested 22,660 mini-proteins of 37–43 residues that target influenza haemagglutinin and botulinum neurotoxin B, along with 6,286 control sequences to probe contributions to folding and binding, and identified 2,618 high-affinity binders. Comparison of the binding and non-binding design sets, which are two orders of magnitude larger than any previously investigated, enabled the evaluation and improvement of the computational model. Biophysical characterization of a subset of the binder designs showed that they are extremely stable and, unlike antibodies, do not lose activity after exposure to high temperatures. The designs elicit little or no immune response and provide potent prophylactic and therapeutic protection against influenza, even after extensive repeated dosing. PMID:28953867

  4. The Folding of de Novo Designed Protein DS119 via Molecular Dynamics Simulations

    Directory of Open Access Journals (Sweden)

    Moye Wang

    2016-04-01

    As they are not subjected to the process of natural selection, de novo designed proteins usually fold in a manner different from natural proteins. Recently, a de novo designed mini-protein DS119, with a βαβ motif and 36 amino acids, has folded unusually slowly in experiments, and transient dimers have been detected in the folding process. Here, by means of all-atom replica exchange molecular dynamics (REMD) simulations, several comparably stable intermediate states were observed on the folding free-energy landscape of DS119. Conventional molecular dynamics (CMD) simulations showed that when two unfolded DS119 proteins bound together, most binding sites of dimeric aggregates were located at the N-terminal segment, especially residues 5–10, which were supposed to form a β-sheet with the protein's own C-terminal segment. Furthermore, a large percentage of individual proteins in the dimeric aggregates adopted conformations similar to those in the intermediate states observed in REMD simulations. These results indicate that, during the folding process, DS119 can easily become trapped in intermediate states. Then, with diffusion, a transient dimer would be formed and stabilized with the binding interface located at the N-termini. This means that it could not quickly fold to the native structure. The complicated folding behavior of DS119 implies an important influence of natural selection on protein-folding kinetics, and indicates that further improvement is needed in rational protein design.

  5. Recurrence risk in de novo structural chromosomal rearrangements.

    Science.gov (United States)

    Röthlisberger, Benno; Kotzot, Dieter

    2007-08-01

    According to the textbook of Gardner and Sutherland [2004], the standard on genetic counseling for chromosome abnormalities, the recurrence risk of de novo structural or combined structural and numeric chromosome rearrangements is less than 0.5-2% and takes into account recurrence by chance, gonadal mosaicism, and somatic-gonadal mosaicism. However, these figures are roughly estimated and neither any systematic study nor exact or evidence-based risk calculations are available. To address this question, an extensive literature search was performed and surprisingly only 29 case reports of recurrence of de novo structural or combined structural and numeric chromosomal rearrangements were found. Thirteen of them were with a trisomy 21 due to an i(21q) replacing one normal chromosome 21. In eight of them low-level mosaicism in one of the parents was found either in fibroblasts or in blood or in both. As a consequence of the low number of cases and theoretical considerations (clinical consequences, mechanisms of formation, etc.), the recurrence risk should be reduced to less than 1% for a de novo i(21q) and to even less than 0.3% for all other de novo structural or combined structural and numeric chromosomal rearrangements. As the latter is lower than the commonly accepted risk of approximately 0.3% for indicating an invasive prenatal diagnosis and as the risk of abortion of a healthy fetus after chorionic villous sampling or amniocentesis is higher than approximately 0.5%, invasive prenatal investigation in most cases is not indicated and should only be performed if explicitly asked by the parents subsequent to appropriate genetic counseling. (c) 2007 Wiley-Liss, Inc.

  6. Emergence, Retention and Selection: A Trilogy of Origination for Functional De Novo Proteins from Ancestral LncRNAs in Primates.

    Directory of Open Access Journals (Sweden)

    Jia-Yu Chen

    2015-07-01

    While some human-specific protein-coding genes have been proposed to originate from ancestral lncRNAs, the transition process remains poorly understood. Here we identified 64 hominoid-specific de novo genes and report a mechanism for the origination of functional de novo proteins from ancestral lncRNAs with precise splicing structures and specific tissue expression profiles. Whole-genome sequencing of dozens of rhesus macaque animals revealed that these lncRNAs are generally not more selectively constrained than other lncRNA loci. The existence of these newly-originated de novo proteins is also not beyond anticipation under neutral expectation, as they generally have a longer theoretical lifespan than their current age, due to their GC-rich sequence property enabling stable ORFs with a lower chance of nonsense mutations. Interestingly, although the emergence and retention of these de novo genes are likely driven by neutral forces, a population genetics study in 67 human individuals and 82 macaque animals revealed signatures of purifying selection on these genes specifically in the human population, indicating that a proportion of these newly-originated proteins are already functional in humans. We thus propose a mechanism for the creation of functional de novo proteins from ancestral lncRNAs during primate evolution, which may contribute to human-specific genetic novelties by taking advantage of existing genomic contexts.

  7. De novo structural modeling and computational sequence analysis ...

    African Journals Online (AJOL)

    Jane

    2011-07-25

    Jul 25, 2011 ... fold recognition and ab initio protein structures, classification of structural motifs and ... stringent cross validation method to evaluate the method's performance ..... Hauser H, Jagels K, Moule S, Mungall K, Norbertczak H,.

  8. Spaced Seed Data Structures for De Novo Assembly

    Directory of Open Access Journals (Sweden)

    Inanç Birol

    2015-01-01

    De novo assembly of the genome of a species is essential in the absence of a reference genome sequence. Many scalable assembly algorithms use the de Bruijn graph (DBG) paradigm to reconstruct genomes, where a table of subsequences of a certain length is derived from the reads, and their overlaps are analyzed to assemble sequences. Despite longer subsequences unlocking longer genomic features for assembly, the associated increase in compute resources limits the practicability of DBG over other assembly archetypes already designed for longer reads. Here, we revisit the DBG paradigm to adapt it to the changing sequencing technology landscape and introduce three data structure designs for spaced seeds in the form of paired subsequences. These data structures address memory and run time constraints imposed by longer reads. We observe that when a fixed distance separates seed pairs, it provides increased sequence specificity with increased gap length. Further, we note that Bloom filters would be suitable to implicitly store spaced seeds and be tolerant to sequencing errors. Building on this concept, we describe a data structure for tracking the frequencies of observed spaced seeds. These data structure designs will have applications in genome, transcriptome and metagenome assemblies, and read error correction.
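
The idea of paired subsequences stored implicitly in a Bloom filter can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `spaced_seeds`, the `BloomFilter` class, the `left|right` key encoding, and all parameter values (`k`, `gap`, filter size, number of hashes) are assumptions chosen only for illustration.

```python
import hashlib
from collections import Counter


def spaced_seeds(read, k, gap):
    """Yield (left, right) k-mer pairs separated by a fixed gap of `gap` bases."""
    span = 2 * k + gap  # total footprint of one spaced seed on the read
    for i in range(len(read) - span + 1):
        yield read[i:i + k], read[i + k + gap:i + span]


class BloomFilter:
    """Toy Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, n_bits=1 << 16, n_hashes=3):
        self.n_bits = n_bits
        self.n_hashes = n_hashes
        self.bits = bytearray(n_bits // 8 + 1)

    def _positions(self, item):
        # Derive n_hashes bit positions by salting one hash function.
        for seed in range(self.n_hashes):
            digest = hashlib.blake2b(item.encode(), salt=bytes([seed])).digest()
            yield int.from_bytes(digest[:8], "big") % self.n_bits

    def add(self, item):
        for p in self._positions(item):
            self.bits[p >> 3] |= 1 << (p & 7)

    def __contains__(self, item):
        return all(self.bits[p >> 3] & (1 << (p & 7)) for p in self._positions(item))


# Hypothetical read and parameters, for illustration only.
read = "ACGTACGTTGCA"
seeds = list(spaced_seeds(read, k=3, gap=2))

bloom = BloomFilter()
counts = Counter()  # explicit frequency table, shown here for comparison
for left, right in seeds:
    key = f"{left}|{right}"
    bloom.add(key)
    counts[key] += 1
```

Only the two flanking k-mers are hashed, so a sequencing error that falls inside the gap does not change the stored key; this is one way to read the abstract's remark about tolerance to sequencing errors. The explicit `Counter` is a stand-in for the frequency-tracking structure, which the abstract does not describe in detail.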

  9. Improved protein quality in transgenic soybean expressing a de novo synthetic protein, MB-16.

    Science.gov (United States)

    Zhang, Yunfang; Schernthaner, Johann; Labbé, Natalie; Hefford, Mary A; Zhao, Jiping; Simmonds, Daina H

    2014-06-01

    To improve soybean [Glycine max (L.) Merrill] seed nutritional quality, a synthetic gene, MB-16 was introduced into the soybean genome to boost seed methionine content. MB-16, an 11 kDa de novo protein enriched in the essential amino acids (EAAs) methionine, threonine, lysine and leucine, was originally developed for expression in rumen bacteria. For efficient seed expression, constructs were designed using the soybean codon bias, with and without the KDEL ER retention sequence, and β-conglycinin or cruciferin seed specific protein storage promoters. Homozygous lines, with single locus integrations, were identified for several transgenic events. Transgene transmission and MB-16 protein expression were confirmed to the T5 and T7 generations, respectively. Quantitative RT-PCR analysis of developing seed showed that the transcript peaked in growing seed, 5-6 mm long, remained at this peak level to the full-sized green seed and then was significantly reduced in maturing yellow seed. Transformed events carrying constructs with the rumen bacteria codon preference showed the same transcription pattern as those with the soybean codon preference, but the transcript levels were lower at each developmental stage. MB-16 protein levels, as determined by immunoblots, were highest in full-sized green seed but the protein virtually disappeared in mature seed. However, amino acid analysis of mature seed, in the best transgenic line, showed a significant increase of 16.2 and 65.9 % in methionine and cysteine, respectively, as compared to the parent. This indicates that MB-16 elevated the sulfur amino acids, improved the EAA seed profile and confirms that a de novo synthetic gene can enhance the nutritional quality of soybean.

  10. Top-down approach in protein RDC data analysis: de novo estimation of the alignment tensor

    International Nuclear Information System (INIS)

    Chen Kang; Tjandra, Nico

    2007-01-01

    In solution NMR spectroscopy the residual dipolar coupling (RDC) is invaluable in improving both the precision and accuracy of NMR structures during their structural refinement. The RDC also provides a potential to determine protein structure de novo. These procedures are only effective when an accurate estimate of the alignment tensor has already been made. Here we present a top-down approach, starting from the secondary structure elements and finishing at the residue level, for RDC data analysis in order to obtain a better estimate of the alignment tensor. Using only the RDCs from N-H bonds of residues in α-helices and CA-CO bonds in β-strands, we are able to determine the offset and the approximate amplitude of the RDC modulation-curve for each secondary structure element, which are subsequently used as targets for global minimization. The alignment order parameters and the orientation of the major principal axis of individual helix or strand, with respect to the alignment frame, can be determined in each of the eight quadrants of a sphere. The following minimization against RDC of all residues within the helix or strand segment can be carried out with fixed alignment order parameters to improve the accuracy of the orientation. For the helical protein Bax, the three components A_xx, A_yy and A_zz of the alignment order can be determined with this method on average to within 2.3% deviation from the values calculated with the available atomic coordinates. Similarly, for the β-sheet protein ubiquitin they agree on average to within 8.5%. The larger discrepancy in β-strand parameters comes from both the diversity of the β-sheet structure and the lower precision of CA-CO RDCs. This top-down approach is a robust method for alignment tensor estimation and also holds promise for providing a protein topological fold using limited sets of RDCs.

  11. Catalysis by a de novo zinc-mediated protein interface: implications for natural enzyme evolution and rational enzyme engineering.

    Science.gov (United States)

    Der, Bryan S; Edwards, David R; Kuhlman, Brian

    2012-05-08

    Here we show that a recent computationally designed zinc-mediated protein interface is serendipitously capable of catalyzing carboxyester and phosphoester hydrolysis. Although the original motivation was to design a de novo zinc-mediated protein-protein interaction (called MID1-zinc), we observed in the homodimer crystal structure a small cleft and open zinc coordination site. We investigated if the cleft and zinc site at the designed interface were sufficient for formation of a primitive active site that can perform hydrolysis. MID1-zinc hydrolyzes 4-nitrophenyl acetate with a rate acceleration of 10^5 and a k_cat/K_M of 630 M^-1 s^-1 and 4-nitrophenyl phosphate with a rate acceleration of 10^4 and a k_cat/K_M of 14 M^-1 s^-1. These rate accelerations by an unoptimized active site highlight the catalytic power of zinc and suggest that the clefts formed by protein-protein interactions are well-suited for creating enzyme active sites. This discovery has implications for protein evolution and engineering: from an evolutionary perspective, three-coordinated zinc at a homodimer interface cleft represents a simple evolutionary path to nascent enzymatic activity; from a protein engineering perspective, future efforts in de novo design of enzyme active sites may benefit from exploring clefts at protein interfaces for active site placement.

  12. HBV core protein allosteric modulators differentially alter cccDNA biosynthesis from de novo infection and intracellular amplification pathways

    Science.gov (United States)

    Guo, Fang; Zhao, Qiong; Cheng, Junjun; Qi, Yonghe; Su, Qing; Wei, Lai; Li, Wenhui; Chang, Jinhong

    2017-01-01

    Hepatitis B virus (HBV) core protein assembles viral pre-genomic (pg) RNA and DNA polymerase into nucleocapsids for reverse transcriptional DNA replication to take place. Several chemotypes of small molecules, including heteroaryldihydropyrimidines (HAPs) and sulfamoylbenzamides (SBAs), have been discovered to allosterically modulate core protein structure and consequentially alter the kinetics and pathway of core protein assembly, resulting in formation of irregularly-shaped core protein aggregates or “empty” capsids devoid of pre-genomic RNA and viral DNA polymerase. Interestingly, in addition to inhibiting nucleocapsid assembly and subsequent viral genome replication, we have now demonstrated that HAPs and SBAs differentially modulate the biosynthesis of covalently closed circular (ccc) DNA from de novo infection and intracellular amplification pathways by inducing disassembly of nucleocapsids derived from virions as well as double-stranded DNA-containing progeny nucleocapsids in the cytoplasm. Specifically, the mistimed cuing of nucleocapsid uncoating prevents cccDNA formation during de novo infection of hepatocytes, while transiently accelerating cccDNA synthesis from cytoplasmic progeny nucleocapsids. Our studies indicate that elongation of positive-stranded DNA induces structural changes of nucleocapsids, which confers ability of mature nucleocapsids to bind CpAMs and triggers its disassembly. Understanding the molecular mechanism underlying the dual effects of the core protein allosteric modulators on nucleocapsid assembly and disassembly will facilitate the discovery of novel core protein-targeting antiviral agents that can more efficiently suppress cccDNA synthesis and cure chronic hepatitis B. PMID:28945802

  13. HBV core protein allosteric modulators differentially alter cccDNA biosynthesis from de novo infection and intracellular amplification pathways.

    Science.gov (United States)

    Guo, Fang; Zhao, Qiong; Sheraz, Muhammad; Cheng, Junjun; Qi, Yonghe; Su, Qing; Cuconati, Andrea; Wei, Lai; Du, Yanming; Li, Wenhui; Chang, Jinhong; Guo, Ju-Tao

    2017-09-01

    Hepatitis B virus (HBV) core protein assembles viral pre-genomic (pg) RNA and DNA polymerase into nucleocapsids for reverse transcriptional DNA replication to take place. Several chemotypes of small molecules, including heteroaryldihydropyrimidines (HAPs) and sulfamoylbenzamides (SBAs), have been discovered to allosterically modulate core protein structure and consequentially alter the kinetics and pathway of core protein assembly, resulting in formation of irregularly-shaped core protein aggregates or "empty" capsids devoid of pre-genomic RNA and viral DNA polymerase. Interestingly, in addition to inhibiting nucleocapsid assembly and subsequent viral genome replication, we have now demonstrated that HAPs and SBAs differentially modulate the biosynthesis of covalently closed circular (ccc) DNA from de novo infection and intracellular amplification pathways by inducing disassembly of nucleocapsids derived from virions as well as double-stranded DNA-containing progeny nucleocapsids in the cytoplasm. Specifically, the mistimed cuing of nucleocapsid uncoating prevents cccDNA formation during de novo infection of hepatocytes, while transiently accelerating cccDNA synthesis from cytoplasmic progeny nucleocapsids. Our studies indicate that elongation of positive-stranded DNA induces structural changes of nucleocapsids, which confers ability of mature nucleocapsids to bind CpAMs and triggers its disassembly. Understanding the molecular mechanism underlying the dual effects of the core protein allosteric modulators on nucleocapsid assembly and disassembly will facilitate the discovery of novel core protein-targeting antiviral agents that can more efficiently suppress cccDNA synthesis and cure chronic hepatitis B.

  14. HBV core protein allosteric modulators differentially alter cccDNA biosynthesis from de novo infection and intracellular amplification pathways.

    Directory of Open Access Journals (Sweden)

    Fang Guo

    2017-09-01

    Hepatitis B virus (HBV) core protein assembles viral pre-genomic (pg) RNA and DNA polymerase into nucleocapsids for reverse transcriptional DNA replication to take place. Several chemotypes of small molecules, including heteroaryldihydropyrimidines (HAPs) and sulfamoylbenzamides (SBAs), have been discovered to allosterically modulate core protein structure and consequentially alter the kinetics and pathway of core protein assembly, resulting in formation of irregularly-shaped core protein aggregates or "empty" capsids devoid of pre-genomic RNA and viral DNA polymerase. Interestingly, in addition to inhibiting nucleocapsid assembly and subsequent viral genome replication, we have now demonstrated that HAPs and SBAs differentially modulate the biosynthesis of covalently closed circular (ccc) DNA from de novo infection and intracellular amplification pathways by inducing disassembly of nucleocapsids derived from virions as well as double-stranded DNA-containing progeny nucleocapsids in the cytoplasm. Specifically, the mistimed cuing of nucleocapsid uncoating prevents cccDNA formation during de novo infection of hepatocytes, while transiently accelerating cccDNA synthesis from cytoplasmic progeny nucleocapsids. Our studies indicate that elongation of positive-stranded DNA induces structural changes of nucleocapsids, which confers ability of mature nucleocapsids to bind CpAMs and triggers its disassembly. Understanding the molecular mechanism underlying the dual effects of the core protein allosteric modulators on nucleocapsid assembly and disassembly will facilitate the discovery of novel core protein-targeting antiviral agents that can more efficiently suppress cccDNA synthesis and cure chronic hepatitis B.

  15. Cavitation during the protein misfolding cyclic amplification (PMCA) method – The trigger for de novo prion generation?

    International Nuclear Information System (INIS)

    Haigh, Cathryn L.; Drew, Simon C.

    2015-01-01

    The protein misfolding cyclic amplification (PMCA) technique has become a widely-adopted method for amplifying minute amounts of the infectious conformer of the prion protein (PrP). PMCA involves repeated cycles of 20 kHz sonication and incubation, during which the infectious conformer seeds the conversion of normally folded protein by a templating interaction. Recently, it has proved possible to create an infectious PrP conformer without the need for an infectious seed, by including RNA and the phospholipid POPG as essential cofactors during PMCA. The mechanism underpinning this de novo prion formation remains unknown. In this study, we first establish by spin trapping methods that cavitation bubbles formed during PMCA provide a radical-rich environment. Using a substrate preparation comparable to that employed in studies of de novo prion formation, we demonstrate by immuno-spin trapping that PrP- and RNA-centered radicals are generated during sonication, in addition to PrP-RNA cross-links. We further show that serial PMCA produces protease-resistant PrP that is oxidatively modified. We suggest a unique confluence of structural (membrane-mimetic hydrophobic/hydrophilic bubble interface) and chemical (ROS) effects underlie the phenomenon of de novo prion formation by PMCA, and that these effects have meaningful biological counterparts of possible relevance to spontaneous prion formation in vivo. - Highlights: • Sonication during PMCA generates free radicals at the surface of cavitation bubbles. • PrP-centered and RNA-centered radicals are formed in addition to PrP-RNA adducts. • De novo prions may result from ROS and structural constraints during cavitation

  16. Cavitation during the protein misfolding cyclic amplification (PMCA) method – The trigger for de novo prion generation?

    Energy Technology Data Exchange (ETDEWEB)

    Haigh, Cathryn L., E-mail: chaigh@unimelb.edu.au [Department of Pathology, The University of Melbourne, Victoria 3010 (Australia); Drew, Simon C., E-mail: sdrew@unimelb.edu.au [Florey Department of Neuroscience and Mental Health, The University of Melbourne, Victoria 3010 (Australia)

    2015-06-05

    The protein misfolding cyclic amplification (PMCA) technique has become a widely-adopted method for amplifying minute amounts of the infectious conformer of the prion protein (PrP). PMCA involves repeated cycles of 20 kHz sonication and incubation, during which the infectious conformer seeds the conversion of normally folded protein by a templating interaction. Recently, it has proved possible to create an infectious PrP conformer without the need for an infectious seed, by including RNA and the phospholipid POPG as essential cofactors during PMCA. The mechanism underpinning this de novo prion formation remains unknown. In this study, we first establish by spin trapping methods that cavitation bubbles formed during PMCA provide a radical-rich environment. Using a substrate preparation comparable to that employed in studies of de novo prion formation, we demonstrate by immuno-spin trapping that PrP- and RNA-centered radicals are generated during sonication, in addition to PrP-RNA cross-links. We further show that serial PMCA produces protease-resistant PrP that is oxidatively modified. We suggest a unique confluence of structural (membrane-mimetic hydrophobic/hydrophilic bubble interface) and chemical (ROS) effects underlie the phenomenon of de novo prion formation by PMCA, and that these effects have meaningful biological counterparts of possible relevance to spontaneous prion formation in vivo. - Highlights: • Sonication during PMCA generates free radicals at the surface of cavitation bubbles. • PrP-centered and RNA-centered radicals are formed in addition to PrP-RNA adducts. • De novo prions may result from ROS and structural constraints during cavitation.

  17. Identification of De Novo Synthesized and Relatively Older Proteins

    OpenAIRE

    Jaleel, Abdul; Henderson, Gregory C.; Madden, Benjamin J.; Klaus, Katherine A.; Morse, Dawn M.; Gopala, Srinivas; Nair, K. Sreekumaran

    2010-01-01

    OBJECTIVE: The accumulation of old and damaged proteins likely contributes to complications of diabetes, but currently no methodology is available to measure the relative age of a specific protein alongside assessment of posttranslational modifications (PTM). To accomplish our goal of studying the impact of insulin deficiency and hyperglycemia in type 1 diabetes upon accumulation of old damaged isoforms of plasma apolipoprotein A-1 (ApoA-1), we sought to develop a novel methodology, which is r...

  18. BayesMotif: de novo protein sorting motif discovery from impure datasets.

    Science.gov (United States)

    Hu, Jianjun; Zhang, Fan

    2010-01-18

    Protein sorting is the process by which newly synthesized proteins are transported to their target locations within or outside of the cell. This process is precisely regulated by protein sorting signals in different forms. A major category of sorting signals consists of amino acid sub-sequences usually located at the N-termini or C-termini of protein sequences. Genome-wide experimental identification of protein sorting signals is extremely time-consuming and costly. Effective computational algorithms for de novo discovery of protein sorting signals are needed to improve the understanding of protein sorting mechanisms. We formulated the protein sorting motif discovery problem as a classification problem and proposed a Bayesian classifier based algorithm (BayesMotif) for de novo identification of a common type of protein sorting motif in which a highly conserved anchor is present along with a less conserved motif region. A false positive removal procedure is developed to iteratively remove sequences that are unlikely to contain true motifs so that the algorithm can identify motifs from impure input sequences. Experiments on both implanted motif datasets and real-world datasets showed that the enhanced BayesMotif algorithm can identify anchored sorting motifs from pure or impure protein sequence datasets. They also show that the false positive removal procedure can help to identify true motifs even when only 20% of the input sequences contain true motif instances. We proposed BayesMotif, a novel Bayesian classification based algorithm for de novo discovery of a special category of anchored protein sorting motifs from impure datasets. Compared to conventional motif discovery algorithms such as MEME, our algorithm can find less-conserved motifs with short highly conserved anchors. Our algorithm also has the advantage of easy incorporation of additional meta-sequence features such as hydrophobicity or charge of the motifs which may help to overcome the limitations of

  19. Identification of a novel Plasmopara halstedii elicitor protein combining de novo peptide sequencing algorithms and RACE-PCR

    Directory of Open Access Journals (Sweden)

    Madlung Johannes

    2010-05-01

    Background: Often high-quality MS/MS spectra of tryptic peptides do not match any database entry because genomes are only partially sequenced, and protein identification therefore requires de novo peptide sequencing. To achieve protein identification of the economically important but still unsequenced plant pathogenic oomycete Plasmopara halstedii, we first evaluated the performance of three different de novo peptide sequencing algorithms applied to protein digests of standard proteins using a quadrupole TOF (QStar Pulsar i). Results: The performance order of the algorithms was PEAKS online > PepNovo > CompNovo. In summary, PEAKS online correctly predicted 45% of measured peptides for a protein test data set. All three de novo peptide sequencing algorithms were used to identify MS/MS spectra of tryptic peptides of an unknown 57 kDa protein of P. halstedii. We found ten de novo sequenced peptides that showed homology to a protein of Phytophthora infestans, an organism closely related to P. halstedii. Employing a second complementary approach, verification of peptide prediction and protein identification was performed by creation of degenerate primers for RACE-PCR and led to an ORF of 1,589 bp for a hypothetical phosphoenolpyruvate carboxykinase. Conclusions: Our study demonstrated that identification of proteins within minute amounts of sample material improved significantly by combining sensitive LC-MS methods with different de novo peptide sequencing algorithms. In addition, this is the first study that verified protein prediction from MS data by also employing a second complementary approach, in which RACE-PCR led to identification of a novel elicitor protein in P. halstedii.

  20. Pushing the frontiers of atomic models for protein tertiary structure ...

    Indian Academy of Sciences (India)

    as an NP complete or NP hard problem [4,5]. This notwithstanding, the dire need for tertiary structures of proteins in drug discovery and other areas [6–8] has propelled the development of a multitude of computational recipes. In this article, we focus on ab initio/de novo strategies, Bhageerath in particular, for protein tertiary ...

  1. A human-specific de novo protein-coding gene associated with human brain functions.

    Directory of Open Access Journals (Sweden)

    Chuan-Yun Li

    2010-03-01

    To understand whether any human-specific new genes may be associated with human brain functions, we computationally screened the genetic vulnerable factors identified through Genome-Wide Association Studies and linkage analyses of nicotine addiction and found one human-specific de novo protein-coding gene, FLJ33706 (alternative gene symbol C20orf203). Cross-species analysis revealed interesting evolutionary paths of how this gene had originated from noncoding DNA sequences: insertion of repeat elements especially Alu contributed to the formation of the first coding exon and six standard splice junctions on the branch leading to humans and chimpanzees, and two subsequent substitutions in the human lineage escaped two stop codons and created an open reading frame of 194 amino acids. We experimentally verified FLJ33706's mRNA and protein expression in the brain. Real-Time PCR in multiple tissues demonstrated that FLJ33706 was most abundantly expressed in brain. Human polymorphism data suggested that FLJ33706 encodes a protein under purifying selection. A specifically designed antibody detected its protein expression across human cortex, cerebellum and midbrain. Immunohistochemistry study in normal human brain cortex revealed the localization of FLJ33706 protein in neurons. Elevated expressions of FLJ33706 were detected in Alzheimer's brain samples, suggesting the role of this novel gene in human-specific pathogenesis of Alzheimer's disease. FLJ33706 provided the strongest evidence so far that human-specific de novo genes can have protein-coding potential and differential protein expression, and be involved in human brain functions.

  2. Apoprotein Structure and Metal Binding Characterization of a de Novo Designed Peptide, α3DIV, that Sequesters Toxic Heavy Metals.

    Science.gov (United States)

    Plegaria, Jefferson S; Dzul, Stephen P; Zuiderweg, Erik R P; Stemmler, Timothy L; Pecoraro, Vincent L

    2015-05-12

    De novo protein design is a biologically relevant approach that provides a novel process in elucidating protein folding and modeling the metal centers of metalloproteins in a completely unrelated or simplified fold. An integral step in de novo protein design is the establishment of a well-folded scaffold with one conformation, which is a fundamental characteristic of many native proteins. Here, we report the NMR solution structure of apo α3DIV at pH 7.0, a de novo designed three-helix bundle peptide containing a triscysteine motif (Cys18, Cys28, and Cys67) that binds toxic heavy metals. The structure comprises 1067 NOE restraints derived from multinuclear multidimensional NOESY, as well as 138 dihedral angles (ψ, φ, and χ1). The backbone and heavy atoms of the 20 lowest energy structures have a root mean square deviation from the mean structure of 0.79 (0.16) Å and 1.31 (0.15) Å, respectively. When compared to the parent structure α3D, the substitution of Leu residues to Cys enhanced the α-helical content of α3DIV while maintaining the same overall topology and fold. In addition, solution studies on the metalated species illustrated metal-induced stability. An increase in the melting temperatures was observed for Hg(II), Pb(II), or Cd(II) bound α3DIV by 18-24 °C compared to its apo counterpart. Further, the extended X-ray absorption fine structure analysis on Hg(II)-α3DIV produced an average Hg(II)-S bond length at 2.36 Å, indicating a trigonal T-shaped coordination environment. Overall, the structure of apo α3DIV reveals an asymmetric distorted triscysteine metal binding site, which offers a model for native metalloregulatory proteins with thiol-rich ligands that function in regulating toxic heavy metals, such as ArsR, CadC, MerR, and PbrR.

  3. De novo generation of infectious prions with bacterially expressed recombinant prion protein.

    Science.gov (United States)

    Zhang, Zhihong; Zhang, Yi; Wang, Fei; Wang, Xinhe; Xu, Yuanyuan; Yang, Huaiyi; Yu, Guohua; Yuan, Chonggang; Ma, Jiyan

    2013-12-01

    The prion hypothesis is strongly supported by the fact that prion infectivity and the pathogenic conformer of prion protein (PrP) are simultaneously propagated in vitro by the serial protein misfolding cyclic amplification (sPMCA). However, due to sPMCA's enormous amplification power, whether an infectious prion can be formed de novo with bacterially expressed recombinant PrP (rPrP) remains to be satisfactorily resolved. To address this question, we performed unseeded sPMCA with rPrP in a laboratory that has never been exposed to any native prions. Two types of proteinase K (PK)-resistant and self-perpetuating recombinant PrP conformers (rPrP-res) with PK-resistant cores of 17 or 14 kDa were generated. A bioassay revealed that rPrP-res(17kDa) was highly infectious, causing prion disease in wild-type mice with an average survival time of about 172 d. In contrast, rPrP-res(14kDa) completely failed to induce any disease. Our findings reveal that sPMCA is sufficient to initiate various self-perpetuating PK-resistant rPrP conformers, but not all of them possess in vivo infectivity. Moreover, generating an infectious prion in a prion-free environment establishes that an infectious prion can be formed de novo with bacterially expressed rPrP.

  4. Mass Spectrometry Analysis Coupled with de novo Sequencing Reveals Amino Acid Substitutions in Nucleocapsid Protein from Influenza A Virus

    Directory of Open Access Journals (Sweden)

    Zijian Li

    2014-02-01

    Amino acid substitutions in influenza A virus are the main reasons for both antigenic shift and virulence change, which result from non-synonymous mutations in the viral genome. Nucleocapsid protein (NP), one of the major structural proteins of influenza virus, is responsible for regulation of viral RNA synthesis and replication. In this report we used LC-MS/MS to analyze the tryptic digest of nucleocapsid protein of influenza virus (A/Puerto Rico/8/1934 H1N1), which was isolated and purified by SDS polyacrylamide gel electrophoresis. LC-MS/MS analyses, coupled with manual de novo sequencing, allowed the determination of three substituted amino acid residues, R452K, T423A and N430T, in two tryptic peptides. The results provide experimental evidence that amino acid substitutions resulting from non-synonymous gene mutations can be directly characterized by mass spectrometry in proteins of RNA viruses such as influenza A virus.

  5. Acquisition, consolidation, reconsolidation, and extinction of eyelid conditioning responses require de novo protein synthesis.

    Science.gov (United States)

    Inda, Mari Carmen; Delgado-García, José María; Carrión, Angel Manuel

    2005-02-23

    Memory, as measured by changes in an animal's behavior some time after learning, is a reflection of many processes. Here, using a trace paradigm, in mice we show that de novo protein synthesis is required for acquisition, consolidation, reconsolidation, and extinction of classically conditioned eyelid responses. Two critical periods of protein synthesis have been found: the first, during training, the blocking of which impaired acquisition; and the second, lasting the first 4 h after training, the blocking of which impaired consolidation. The process of reconsolidation was sensitive to protein synthesis inhibition if anisomycin was injected before or just after the reactivation session. Furthermore, extinction was also dependent on protein synthesis, following the same temporal course as that followed during acquisition and consolidation. This last fact reinforces the idea that extinction is an active learning process rather than a passive event of forgetting. Together, these findings demonstrate that all of the different stages of memory formation involved in the classical conditioning of eyelid responses are dependent on protein synthesis.

  6. Pushing the size limit of de novo structure ensemble prediction guided by sparse SDSL-EPR restraints to 200 residues: The monomeric and homodimeric forms of BAX

    Science.gov (United States)

    Fischer, Axel W.; Bordignon, Enrica; Bleicken, Stephanie; García-Sáez, Ana J.; Jeschke, Gunnar; Meiler, Jens

    2016-01-01

    Structure determination remains a challenge for many biologically important proteins. In particular, proteins that adopt multiple conformations often evade crystallization in all biologically relevant states. Although computational de novo protein folding approaches often sample biologically relevant conformations, the selection of the most accurate model for different functional states remains a formidable challenge, in particular, for proteins with more than about 150 residues. Electron paramagnetic resonance (EPR) spectroscopy can obtain limited structural information for proteins in well-defined biological states and thereby assist in selecting biologically relevant conformations. The present study demonstrates that de novo folding methods are able to accurately sample the folds of 192-residue long soluble monomeric Bcl-2-associated X protein (BAX). The tertiary structures of the monomeric and homodimeric forms of BAX were predicted using the primary structure as well as 25 and 11 EPR distance restraints, respectively. The predicted models were subsequently compared to respective NMR/X-ray structures of BAX. EPR restraints improve the protein-size normalized root-mean-square-deviation (RMSD100) of the most accurate models with respect to the NMR/crystal structure from 5.9 Å to 3.9 Å and from 5.7 Å to 3.3 Å, respectively. Additionally, the model discrimination is improved, which is demonstrated by an improvement of the enrichment from 5% to 15% and from 13% to 21%, respectively. PMID:27129417
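
    The size-normalized RMSD100 quoted in this abstract can be sketched as follows. The abstract does not give the formula; the sketch assumes the widely used normalization RMSD100 = RMSD / (1 + ln sqrt(N/100)), where N is the number of residues, and the function name is hypothetical.

```python
import math


def rmsd100(rmsd: float, n_residues: int) -> float:
    """Scale an RMSD (in Å) to the value expected for a 100-residue protein.

    Assumed formula: RMSD / (1 + ln(sqrt(N / 100))).
    """
    if n_residues <= 0:
        raise ValueError("n_residues must be positive")
    denominator = 1.0 + math.log(math.sqrt(n_residues / 100.0))
    if denominator <= 0.0:
        # The normalization breaks down for very short chains.
        raise ValueError("chain too short for this normalization")
    return rmsd / denominator
```

    For a 100-residue protein the value is unchanged, while for a longer chain such as the 192-residue BAX the normalized value is smaller than the raw RMSD, which is what makes models of different sizes comparable.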

  7. The induction of the oxidative burst in Elodea densa by sulfhydryl reagent does not depend on de novo protein synthesis

    Energy Technology Data Exchange (ETDEWEB)

    Amicucci, Enrica [Milan, Univ. (Italy). Dipt. di Fisiologia e Biochimica delle Piante

    1997-12-31

    In Elodea densa Planchon leaves, N-ethylmaleimide (NEM) and other sulfhydryl-binding reagents induce a marked and temporary increase of respiration that is insensitive to cyanide, hydroxamate and propylgallate and completely inhibited by diphenylene iodonium (DPI) and by quinacrine. In this paper the author investigates whether the mechanism that causes the oxidative burst depends on the activation of preexisting oxidative systems or on the activation of de novo protein synthesis. The inhibitors used were cycloheximide (CHI), which inhibits protein synthesis in plant cells by depressing the incorporation of amino acids into proteins, and cordycepin, an effective inhibitor of mRNA synthesis. The data support the idea that the mechanism investigated depends on the activation of one or more long-lived proteins and not on de novo protein synthesis.

  8. Structures composing protein domains.

    Science.gov (United States)

    Kubrycht, Jaroslav; Sigler, Karel; Souček, Pavel; Hudeček, Jiří

    2013-08-01

    This review summarizes available data concerning intradomain structures (IS) such as functionally important amino acid residues, short linear motifs, conserved or disordered regions, peptide repeats, broadly occurring secondary structures or folds, etc. IS form structural features (units or elements) necessary for interactions with proteins or non-peptidic ligands, enzyme reactions and some structural properties of proteins. These features have often been related to a single structural level (e.g. primary structure) mostly requiring certain structural context of other levels (e.g. secondary structures or supersecondary folds) as follows also from some examples reported or demonstrated here. In addition, we deal with some functionally important dynamic properties of IS (e.g. flexibility and different forms of accessibility), and more special dynamic changes of IS during enzyme reactions and allosteric regulation. Selected notes concern also some experimental methods, still more necessary tools of bioinformatic processing and clinically interesting relationships. Copyright © 2013 Elsevier Masson SAS. All rights reserved.

  9. SV2: accurate structural variation genotyping and de novo mutation detection from whole genomes.

    Science.gov (United States)

    Antaki, Danny; Brandler, William M; Sebat, Jonathan

    2018-05-15

    Structural variation (SV) detection from short-read whole genome sequencing is error prone, presenting significant challenges for population or family-based studies of disease. Here, we describe SV2, a machine-learning algorithm for genotyping deletions and duplications from paired-end sequencing data. SV2 can rapidly integrate variant calls from multiple structural variant discovery algorithms into a unified call set with high genotyping accuracy and the capability to detect de novo mutations. SV2 is freely available on GitHub (https://github.com/dantaki/SV2). Contact: jsebat@ucsd.edu. Supplementary data are available at Bioinformatics online.

  10. Proteomic Profiling of De Novo Protein Synthesis in Starvation-Induced Autophagy Using Bioorthogonal Noncanonical Amino Acid Tagging.

    Science.gov (United States)

    Zhang, J; Wang, J; Lee, Y-M; Lim, T-K; Lin, Q; Shen, H-M

    2017-01-01

    Autophagy is an intracellular degradation process activated by stress factors such as nutrient starvation to maintain cellular homeostasis. There is emerging evidence demonstrating that de novo protein synthesis is involved in the autophagic process. However, characterizing these de novo proteins has so far been technically difficult. In this chapter, we describe a novel method to identify newly synthesized proteins during starvation-mediated autophagy by bioorthogonal noncanonical amino acid tagging (BONCAT), in conjunction with isobaric tagging for relative and absolute quantification (iTRAQ)-based quantitative proteomics. L-azidohomoalanine (AHA) is an analog of methionine, and it can be readily incorporated into the newly synthesized proteins. The AHA-containing proteins can be enriched with avidin beads after a "click" reaction between alkyne-bearing biotin and the azide moiety of AHA. The enriched proteins are then subjected to iTRAQ™ labeling for protein identification and quantification using liquid chromatography-tandem mass spectrometry (LC-MS/MS). By using this technique, we have successfully profiled more than 700 proteins that are synthesized during starvation-induced autophagy. We believe that this approach is effective in identifying newly synthesized proteins in the process of autophagy and provides useful insights into the molecular mechanisms and biological functions of autophagy. © 2017 Elsevier Inc. All rights reserved.

  11. De Novo generation of molecular structures using optimization to select graphs on a given lattice

    DEFF Research Database (Denmark)

    Bywater, R.P.; Poulsen, Thomas Agersten; Røgen, Peter

    2004-01-01

    A recurrent problem in organic chemistry is the generation of new molecular structures that conform to some predetermined set of structural constraints that are imposed in an endeavor to build certain required properties into the newly generated structure. An example of this is the pharmacophore model, used in medicinal chemistry to guide de novo design or selection of suitable structures from compound databases. We propose here a method that efficiently links up a selected number of required atom positions while at the same time directing the emergent molecular skeleton to avoid forbidden positions. The linkage process takes place on a lattice whose unit step length and overall geometry is designed to match typical architectures of organic molecules. We use an optimization method to select from the many different graphs possible. The approach is demonstrated in an example where crystal...

  12. A de novo designed 11 kDa polypeptide: model for amyloidogenic intrinsically disordered proteins.

    Science.gov (United States)

    Topilina, Natalya I; Ermolenkov, Vladimir V; Sikirzhytski, Vitali; Higashiya, Seiichiro; Lednev, Igor K; Welch, John T

    2010-07-01

    A de novo polypeptide GH(6)[(GA)(3)GY(GA)(3)GE](8)GAH(6) (YE8) has a significant number of identical weakly interacting beta-strands with the turns and termini functionalized by charged amino acids to control polypeptide folding and aggregation. YE8 exists in a soluble, disordered form at neutral pH but is responsive to changes in pH and ionic strength. The evolution of YE8 secondary structure has been successfully quantified during all stages of polypeptide fibrillation by deep UV resonance Raman (DUVRR) spectroscopy combined with other morphological, structural, spectral, and tinctorial characterization. The YE8 folding kinetics at pH 3.5 are strongly dependent on polypeptide concentration with a lag phase that can be eliminated by seeding with a solution of folded fibrillar YE8. The lag phase of polypeptide folding is concentration dependent leading to the conclusion that beta-sheet folding of the 11-kDa amyloidogenic polypeptide is completely aggregation driven.

  13. Versatile de novo enzyme activity in capsid proteins from an engineered M13 bacteriophage library.

    Science.gov (United States)

    Casey, John P; Barbero, Roberto J; Heldman, Nimrod; Belcher, Angela M

    2014-11-26

    Biocatalysis has grown rapidly in recent decades as a solution to the evolving demands of industrial chemical processes. Mounting environmental pressures and shifting supply chains underscore the need for novel chemical activities, while rapid biotechnological progress has greatly increased the utility of enzymatic methods. Enzymes, though capable of high catalytic efficiency and remarkable reaction selectivity, still suffer from relative instability, high costs of scaling, and functional inflexibility. Herein, we developed a biochemical platform for engineering de novo semisynthetic enzymes, functionally modular and widely stable, based on the M13 bacteriophage. The hydrolytic bacteriophage described in this paper catalyzes a range of carboxylic esters, is active from 25 to 80 °C, and demonstrates greater efficiency in DMSO than in water. The platform complements biocatalysts with characteristics of heterogeneous catalysis, yielding high-surface area, thermostable biochemical structures readily adaptable to reactions in myriad solvents. As the viral structure ensures semisynthetic enzymes remain linked to the genetic sequences responsible for catalysis, future work will tailor the biocatalysts to high-demand synthetic processes by evolving new activities, utilizing high-throughput screening technology and harnessing M13's multifunctionality.

  14. Transferable coarse-grained potential for de novo protein folding and design.

    Directory of Open Access Journals (Sweden)

    Ivan Coluzza

    Protein folding and design are major biophysical problems, the solution of which would lead to important applications, especially in medicine. Here we provide evidence of how a novel parametrization of the Caterpillar model may be used for both quantitative protein design and folding. With computer simulations it is shown that, for a large set of real protein structures, the model produces designed sequences with similar physical properties to the corresponding naturally occurring sequences. The designed sequences require further experimental testing. For an independent set of proteins, previously used as a benchmark, the correct folded structure of both the designed and the natural sequences is also demonstrated. The equilibrium folding properties are characterized by free energy calculations. The resulting free energy profiles not only are consistent among natural and designed proteins, but also show a remarkable precision when the folded structures are compared to the experimentally determined ones. Ultimately, the updated Caterpillar model is unique in the combination of its three fundamental features: its simplicity, its ability to produce natural foldable designed sequences, and its structure prediction precision. It is also remarkable that low frustration sequences can be obtained with such a simple and universal design procedure, and that the folding of natural proteins shows funnelled free energy landscapes without the need of any potentials based on the native structure.

  15. De novo prediction of human chromosome structures: Epigenetic marking patterns encode genome architecture.

    Science.gov (United States)

    Di Pierro, Michele; Cheng, Ryan R; Lieberman Aiden, Erez; Wolynes, Peter G; Onuchic, José N

    2017-11-14

    Inside the cell nucleus, genomes fold into organized structures that are characteristic of cell type. Here, we show that this chromatin architecture can be predicted de novo using epigenetic data derived from chromatin immunoprecipitation-sequencing (ChIP-Seq). We exploit the idea that chromosomes encode a 1D sequence of chromatin structural types. Interactions between these chromatin types determine the 3D structural ensemble of chromosomes through a process similar to phase separation. First, a neural network is used to infer the relation between the epigenetic marks present at a locus, as assayed by ChIP-Seq, and the genomic compartment in which those loci reside, as measured by DNA-DNA proximity ligation (Hi-C). Next, types inferred from this neural network are used as an input to an energy landscape model for chromatin organization [Minimal Chromatin Model (MiChroM)] to generate an ensemble of 3D chromosome conformations at a resolution of 50 kilobases (kb). After training the model, dubbed Maximum Entropy Genomic Annotation from Biomarkers Associated to Structural Ensembles (MEGABASE), on odd-numbered chromosomes, we predict the sequences of chromatin types and the subsequent 3D conformational ensembles for the even chromosomes. We validate these structural ensembles by using ChIP-Seq tracks alone to predict Hi-C maps, as well as distances measured using 3D fluorescence in situ hybridization (FISH) experiments. Both sets of experiments support the hypothesis of phase separation being the driving process behind compartmentalization. These findings strongly suggest that epigenetic marking patterns encode sufficient information to determine the global architecture of chromosomes and that de novo structure prediction for whole genomes may be increasingly possible. Copyright © 2017 the Author(s). Published by PNAS.

  16. Protein Structure Prediction by Protein Threading

    Science.gov (United States)

    Xu, Ying; Liu, Zhijie; Cai, Liming; Xu, Dong

    The seminal work of Bowie, Lüthy, and Eisenberg (Bowie et al., 1991) on "the inverse protein folding problem" laid the foundation of protein structure prediction by protein threading. By using simple measures for fitness of different amino acid types to local structural environments defined in terms of solvent accessibility and protein secondary structure, the authors derived a simple and yet profoundly novel approach to assessing if a protein sequence fits well with a given protein structural fold. Their follow-up work (Elofsson et al., 1996; Fischer and Eisenberg, 1996; Fischer et al., 1996a,b) and the work by Jones, Taylor, and Thornton (Jones et al., 1992) on protein fold recognition led to the development of a new brand of powerful tools for protein structure prediction, which we now term "protein threading." These computational tools have played a key role in extending the utility of all the experimentally solved structures by X-ray crystallography and nuclear magnetic resonance (NMR), providing structural models and functional predictions for many of the proteins encoded in the hundreds of genomes that have been sequenced up to now.

  17. Dependency on de novo protein synthesis and proteomic changes during metamorphosis of the marine bryozoan Bugula neritina

    KAUST Repository

    Wong, Yue Him

    2010-05-24

    Background: Metamorphosis in the bryozoan Bugula neritina (Linne) includes an initial phase of rapid morphological rearrangement followed by a gradual phase of morphogenesis. We hypothesized that the first phase may be independent of de novo synthesis of proteins and, instead, involves post-translational modifications of existing proteins, providing a simple mechanism to quickly initiate metamorphosis. To test our hypothesis, we challenged B. neritina larvae with transcription and translation inhibitors. Furthermore, we employed 2D gel electrophoresis to characterize changes in the phosphoproteome and proteome during early metamorphosis. Differentially expressed proteins were identified by liquid chromatography tandem mass spectrometry and their gene expression patterns were profiled using semi-quantitative real time PCR. Results: When larvae were incubated with transcription and translation inhibitors, metamorphosis initiated through the first phase but did not complete. We found a significant down-regulation of 60 protein spots and the percentage of phosphoprotein spots decreased from 15% in the larval stage to 12% during early metamorphosis. Two proteins--the mitochondrial processing peptidase beta subunit and severin--were abundantly expressed and phosphorylated in the larval stage, but down-regulated during metamorphosis. MPPbeta and severin were also down-regulated on the gene expression level. Conclusions: The initial morphogenetic changes that led to attachment of B. neritina did not depend on de novo protein synthesis, but the subsequent gradual morphogenesis did. This is the first time that the mitochondrial processing peptidase beta subunit or severin have been shown to be down-regulated on both gene and protein expression levels during the metamorphosis of B. neritina. Future studies employing immunohistochemistry to reveal the expression locality of these two proteins during metamorphosis should provide further evidence of the involvement of these two

  18. Protein design and engineering of a de novo pathway for microbial production of 1,3-propanediol from glucose.

    Science.gov (United States)

    Chen, Zhen; Geng, Feng; Zeng, An-Ping

    2015-02-01

    Protein engineering to expand the substrate spectrum of native enzymes opens new possibilities for bioproduction of valuable chemicals from non-natural pathways. No natural microorganism can directly use sugars to produce 1,3-propanediol (PDO). Here, we present a de novo route for the biosynthesis of PDO from sugar, which may overcome the mentioned limitations by expanding the homoserine synthesis pathway. The accomplishment of pathway from homoserine to PDO is achieved by protein engineering of glutamate dehydrogenase (GDH) and pyruvate decarboxylase to sequentially convert homoserine to 4-hydroxy-2-ketobutyrate and 3-hydroxypropionaldehyde. The latter is finally converted to PDO by using a native alcohol dehydrogenase. In this work, we report on experimental accomplishment of this non-natural pathway, especially by protein engineering of GDH for the key step of converting homoserine to 4-hydroxy-2-ketobutyrate. These results show the feasibility and significance of protein engineering for de novo pathway design and overproduction of desired industrial products. Copyright © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  19. Structural variation in two human genomes mapped at single-nucleotide resolution by whole genome de novo assembly

    DEFF Research Database (Denmark)

    Li, Yingrui; Zheng, Hancheng; Luo, Ruibang

    2011-01-01

    Here we use whole-genome de novo assembly of second-generation sequencing reads to map structural variation (SV) in an Asian genome and an African genome. Our approach identifies small- and intermediate-size homozygous variants (1-50 kb) including insertions, deletions, inversions and their precise...

  20. Voltammetry and In Situ Scanning Tunnelling Microscopy of De Novo Designed Heme Protein Monolayers on Au(111)-Electrode Surfaces

    DEFF Research Database (Denmark)

    Albrecht, Tim; Li, Wu; Haehnel, Wolfgang

    2006-01-01

    In the present work, we report the electrochemical characterization and in situ scanning tunnelling microscopy (STM) studies of monolayers of an artificial de novo designed heme protein MOP-C, covalently immobilized on modified Au(111) surfaces. The protein forms closely packed monolayers, which... to the tunnelling current, apparently due to slow electron transfer kinetics. As a consequence, STM images of heme-containing and heme-free MOP-C did not reveal any notable differences in apparent height or physical extension. The apparent height of heme-containing MOP-C did not show any dependence on the substrate potential being varied around the redox potential of the protein. The mere presence of an accessible molecular energy level is not sufficient to result in detectable tunnelling current modulation. (c) 2006 Elsevier B.V. All rights reserved.

  1. Computational design of proteins with novel structure and functions

    International Nuclear Information System (INIS)

    Yang Wei; Lai Lu-Hua

    2016-01-01

    Computational design of proteins is a relatively new field, where scientists search the enormous sequence space for sequences that can fold into desired structure and perform desired functions. With the computational approach, proteins can be designed, for example, as regulators of biological processes, novel enzymes, or as biotherapeutics. These approaches not only provide valuable information for understanding of sequence–structure–function relations in proteins, but also hold promise for applications to protein engineering and biomedical research. In this review, we briefly introduce the rationale for computational protein design, then summarize the recent progress in this field, including de novo protein design, enzyme design, and design of protein–protein interactions. Challenges and future prospects of this field are also discussed. (topical review)

  2. Differential requirement of de novo Arc protein synthesis in the insular cortex and the amygdala for safe and aversive taste long-term memory formation.

    Science.gov (United States)

    Guzmán-Ramos, Kioko; Venkataraman, Archana; Morin, Jean-Pascal; Osorio-Gómez, Daniel; Bermúdez-Rattoni, Federico

    2018-04-16

    Several immediate early gene (IEG) products are known to be involved in the facilitation of structural and functional modifications at distinct synapses activated through experience. The IEG-encoded protein Arc (activity regulated cytoskeletal-associated protein) has been widely implicated in long-term memory formation and stabilization. In this study, we sought to evaluate a possible role for de novo Arc protein synthesis in the insular cortex (IC) and in the amygdala (AMY) during long-term taste memory formation. We found that acute inhibition of Arc protein synthesis through the infusion of antisense oligonucleotides administered in the IC before a novel taste presentation affected consolidation of a safe taste memory trace (ST) but spared consolidation of conditioned taste aversion (CTA). Conversely, blocking Arc synthesis within the AMY impaired CTA consolidation but had no effect on ST long-term memory formation. Our results suggest that Arc-dependent plasticity during taste learning is required within distinct structures of the medial temporal lobe, depending on the emotional valence of the memory trace. Copyright © 2018 Elsevier B.V. All rights reserved.

  3. Arsenic trioxide (AT) is a novel human neutrophil pro-apoptotic agent: effects of catalase on AT-induced apoptosis, degradation of cytoskeletal proteins and de novo protein synthesis.

    Science.gov (United States)

    Binet, François; Cavalli, Hélène; Moisan, Eliane; Girard, Denis

    2006-02-01

    The anti-cancer drug arsenic trioxide (AT) induces apoptosis in a variety of transformed or proliferating cells. However, little is known regarding its ability to induce apoptosis in terminally differentiated cells, such as neutrophils. Because neutropenia has been reported in some cancer patients after AT treatment, we hypothesised that AT could induce neutrophil apoptosis, an issue that has never been investigated. Herein, we found that AT induced neutrophil apoptosis and gelsolin degradation via caspases. AT did not increase neutrophil superoxide production and did not induce mitochondrial generation of reactive oxygen species. AT induced apoptosis in PLB-985 and X-linked chronic granulomatous disease (CGD) cells (PLB-985 cells deficient in gp91(phox) mimicking CGD) at the same potency. Addition of catalase, an inhibitor of H2O2, reversed AT-induced apoptosis and degradation of the cytoskeletal proteins gelsolin, alpha-tubulin and lamin B1. Unexpectedly, AT induced de novo protein synthesis, which was reversed by catalase. Cycloheximide partially reversed AT-induced apoptosis. We conclude that AT induces neutrophil apoptosis by a caspase-dependent mechanism and via de novo protein synthesis. H2O2 is of major importance in AT-induced neutrophil apoptosis but its production does not originate from nicotinamide adenine dinucleotide phosphate dehydrogenase activation and mitochondria. Cytoskeletal structures other than microtubules can now be considered as novel targets of AT.

  4. Protein structure prediction using bee colony optimization metaheuristic

    DEFF Research Database (Denmark)

    Fonseca, Rasmus; Paluszewski, Martin; Winter, Pawel

    2010-01-01

    Predicting the native structure of proteins is one of the most challenging problems in molecular biology. The goal is to determine the three-dimensional structure from the one-dimensional amino acid sequence. De novo prediction algorithms seek to do this by developing a representation of the protein structure, an energy potential and some optimization algorithm that finds the structure with minimal energy. Bee Colony Optimization (BCO) is a relatively new approach to solving optimization problems based on the foraging behaviour of bees. Several variants of BCO have been suggested... our BCO method to generate good solutions to the protein structure prediction problem. The results show that BCO generally finds better solutions than simulated annealing, which so far has been the metaheuristic of choice for this problem.
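
    As an illustration of the baseline this abstract compares against, the following is a minimal sketch of simulated annealing minimizing an energy function. It is not the authors' BCO method or their protein representation: the one-dimensional state, the geometric cooling schedule, and all names and default parameters are assumptions chosen only to keep the example short.

```python
import math
import random


def simulated_annealing(energy, start, steps=5000, t_start=1.0, t_end=1e-3,
                        step_size=0.5, seed=0):
    """Minimize energy(x) over a single real variable with Metropolis acceptance.

    Assumptions (not from the abstract): 1-D state, uniform random moves,
    geometric cooling from t_start to t_end.
    """
    rng = random.Random(seed)
    current, current_e = start, energy(start)
    best, best_e = current, current_e
    for i in range(steps):
        # Temperature decays geometrically from t_start to t_end.
        t = t_start * (t_end / t_start) ** (i / max(steps - 1, 1))
        candidate = current + rng.uniform(-step_size, step_size)
        candidate_e = energy(candidate)
        delta = candidate_e - current_e
        # Always accept downhill moves; accept uphill moves with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            current, current_e = candidate, candidate_e
            if current_e < best_e:
                best, best_e = current, current_e
    return best, best_e
```

    In a structure prediction setting the state would be a conformation and the move a perturbation of it, for example a change in backbone angles; the acceptance rule stays the same.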

  5. Effect of metal ions on de novo aggregation of full-length prion protein

    International Nuclear Information System (INIS)

    Giese, Armin; Levin, Johannes; Bertsch, Uwe; Kretzschmar, Hans

    2004-01-01

    It is well established that the prion protein (PrP) contains metal ion binding sites with specificity for copper. Changes in copper levels have been suggested to influence incubation time in experimental prion disease. Therefore, we studied the effect of heavy metal ions (Cu2+, Mn2+, Ni2+, Co2+, and Zn2+) in vitro in a model system that utilizes changes in the concentration of SDS to induce structural conversion and aggregation of recombinant PrP. To quantify and characterize PrP aggregates, we used fluorescently labelled PrP and cross-correlation analysis as well as scanning for intensely fluorescent targets in a confocal single molecule detection system. We found a specific strong pro-aggregatory effect of Mn2+ at low micromolar concentrations that could be blocked by nanomolar concentrations of Cu2+. These findings suggest that metal ions such as copper and manganese may also affect PrP conversion in vivo.

  6. Oligomeric protein structure networks: insights into protein-protein interactions

    Directory of Open Access Journals (Sweden)

    Brinda KV

    2005-12-01

    Background: Protein-protein association is essential for a variety of cellular processes and hence a large number of investigations are being carried out to understand the principles of protein-protein interactions. In this study, oligomeric protein structures are viewed from a network perspective to obtain new insights into protein association. Structure graphs of proteins have been constructed from a non-redundant set of protein oligomer crystal structures by considering amino acid residues as nodes, with edges based on the strength of the non-covalent interactions between the residues. The analysis of such networks has been carried out in terms of amino acid clusters and hubs (highly connected residues), with special emphasis on protein interfaces. Results: A variety of interactions that occur at the interfaces, such as hydrogen bonds, salt bridges, and aromatic and hydrophobic interactions, are identified in a consolidated manner as amino acid clusters at the interface. Moreover, the characterization of the highly connected hub-forming residues at the interfaces and their comparison with the hubs from the non-interface regions and the non-hubs in the interface regions show that there is a predominance of charged interactions at the interfaces. Further, strong and weak interfaces are identified on the basis of the interaction strength between amino acid residues and the sizes of the interface clusters, which also show that many protein interfaces are stronger than their monomeric protein cores. The interface strengths evaluated based on the interface clusters and hubs also correlate well with experimentally determined dissociation constants for known complexes. Finally, the interface hubs identified using the present method correlate very well with experimentally determined hotspots in the interfaces of protein complexes obtained from the Alanine Scanning Energetics database (ASEdb). A few predictions of interface hot
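
    The network construction described in this abstract (residues as nodes, edges weighted by non-covalent interaction strength, hubs as highly connected residues) can be illustrated with a small sketch. Everything concrete below is an assumption made for the example: the residue labels, the strength values, the edge cutoff `min_strength`, and the hub cutoff `min_degree` are hypothetical and are not taken from the cited study.

```python
def build_structure_graph(strengths, min_strength):
    """Residues are nodes; an edge joins two residues whose interaction
    strength is at least `min_strength`. Returns adjacency sets."""
    graph = {}
    for (res_a, res_b), strength in strengths.items():
        graph.setdefault(res_a, set())
        graph.setdefault(res_b, set())
        if strength >= min_strength:
            graph[res_a].add(res_b)
            graph[res_b].add(res_a)
    return graph


def find_hubs(graph, min_degree):
    """Hubs are residues with at least `min_degree` neighbours."""
    return sorted(res for res, nbrs in graph.items() if len(nbrs) >= min_degree)


def interface_edges(graph):
    """Edges whose two residues sit on different chains ('chain:residue' labels)."""
    edges = set()
    for res, nbrs in graph.items():
        for other in nbrs:
            if res.split(":")[0] != other.split(":")[0]:
                edges.add(tuple(sorted((res, other))))
    return sorted(edges)


# Hypothetical pairwise interaction strengths for a two-chain complex.
strengths = {
    ("A:ARG12", "B:GLU45"): 8.0,
    ("A:ARG12", "A:LEU30"): 5.0,
    ("A:ARG12", "B:ASP48"): 6.5,
    ("A:ARG12", "A:PHE33"): 4.2,
    ("B:GLU45", "B:ASP48"): 1.0,  # below the cutoff, so no edge is created
}

graph = build_structure_graph(strengths, min_strength=4.0)
hubs = find_hubs(graph, min_degree=4)
cross_chain = interface_edges(graph)
```

    In this toy input, `A:ARG12` is the only residue that reaches the hub cutoff, and both of its cross-chain edges are reported as interface edges.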

  7. Protein Loop Structure Prediction Using Conformational Space Annealing.

    Science.gov (United States)

    Heo, Seungryong; Lee, Juyong; Joo, Keehyoung; Shin, Hang-Cheol; Lee, Jooyoung

    2017-05-22

    We have developed a protein loop structure prediction method by combining a new energy function, which we call E_PLM (energy for protein loop modeling), with the conformational space annealing (CSA) global optimization algorithm. The energy function includes stereochemistry, dynamic fragment assembly, distance-scaled finite ideal gas reference (DFIRE), and generalized orientation- and distance-dependent terms. For the conformational search of loop structures, we used the CSA algorithm, which has been quite successful in dealing with various hard global optimization problems. We assessed the performance of E_PLM with two widely used loop-decoy sets, Jacobson and RAPPER, and compared the results against the DFIRE potential. The accuracy of model selection from a pool of loop decoys as well as de novo loop modeling starting from randomly generated structures was examined separately. For the selection of a nativelike structure from a decoy set, E_PLM was more accurate than DFIRE in the case of the Jacobson set and had similar accuracy in the case of the RAPPER set. In terms of sampling more nativelike loop structures, E_PLM outperformed E_DFIRE for both decoy sets. This new approach equipped with E_PLM and CSA can serve as the state-of-the-art de novo loop modeling method.

  8. Computational design and elaboration of a de novo heterotetrameric alpha-helical protein that selectively binds an emissive abiological (porphinato)zinc chromophore.

    Science.gov (United States)

    Fry, H Christopher; Lehmann, Andreas; Saven, Jeffery G; DeGrado, William F; Therien, Michael J

    2010-03-24

    The first example of a computationally de novo designed protein that binds an emissive abiological chromophore is presented, in which a sophisticated level of cofactor discrimination is pre-engineered. This heterotetrameric, C(2)-symmetric bundle, A(His):B(Thr), uniquely binds (5,15-di[(4-carboxymethyleneoxy)phenyl]porphinato)zinc [(DPP)Zn] via histidine coordination and complementary noncovalent interactions. The A(2)B(2) heterotetrameric protein reflects ligand-directed elements of both positive and negative design, including hydrogen bonds to second-shell ligands. Experimental support for the appropriate formulation of [(DPP)Zn:A(His):B(Thr)](2) is provided by UV/visible and circular dichroism spectroscopies, size exclusion chromatography, and analytical ultracentrifugation. Time-resolved transient absorption and fluorescence spectroscopic data reveal classic excited-state singlet and triplet PZn photophysics for the A(His):B(Thr):(DPP)Zn protein (k(fluorescence) = 4 x 10(8) s(-1); tau(triplet) = 5 ms). The A(2)B(2) apoprotein has immeasurably low binding affinities for related [porphinato]metal chromophores that include a (DPP)Fe(III) cofactor and the zinc metal ion hemin derivative [(PPIX)Zn], underscoring the exquisite active-site binding discrimination realized in this computationally designed protein. Importantly, elements of design in the A(His):B(Thr) protein ensure that interactions within the tetra-alpha-helical bundle are such that only the heterotetramer is stable in solution; corresponding homomeric bundles present unfavorable ligand-binding environments and thus preclude protein structural rearrangements that could lead to binding of (porphinato)iron cofactors.

  9. Electrochemical and spectroscopic investigations of immobilized de novo designed heme proteins on metal electrodes

    DEFF Research Database (Denmark)

    Albrecht, Tim; Li, WW; Ulstrup, Jens

    2005-01-01

    On the basis of rational design principles, template-assisted four-helix-bundle proteins that include two histidines for coordinative binding of a heme were synthesized. Spectroscopic and thermodynamic characterization of the proteins in solution reveals the expected bis-histidine coordinated heme...

  10. NCYM, a Cis-antisense gene of MYCN, encodes a de novo evolved protein that inhibits GSK3β resulting in the stabilization of MYCN in human neuroblastomas.

    Directory of Open Access Journals (Sweden)

    Yusuke Suenaga

    2014-01-01

    The rearrangement of pre-existing genes has long been thought of as the major mode of new gene generation. Recently, de novo gene birth from non-genic DNA was found to be an alternative mechanism to generate novel protein-coding genes. However, its functional role in human disease remains largely unknown. Here we show that NCYM, a cis-antisense gene of the MYCN oncogene, initially thought to be a large non-coding RNA, encodes a de novo evolved protein regulating the pathogenesis of human cancers, particularly neuroblastoma. The NCYM gene is evolutionally conserved only in the taxonomic group containing humans and chimpanzees. In primary human neuroblastomas, NCYM is 100% co-amplified and co-expressed with MYCN, and NCYM mRNA expression is associated with poor clinical outcome. MYCN directly transactivates both NCYM and MYCN mRNA, whereas NCYM stabilizes MYCN protein by inhibiting the activity of GSK3β, a kinase that promotes MYCN degradation. In contrast to MYCN transgenic mice, neuroblastomas in MYCN/NCYM double transgenic mice were frequently accompanied by distant metastases, behavior reminiscent of human neuroblastomas with MYCN amplification. The NCYM protein also interacts with GSK3β, thereby stabilizing the MYCN protein in the tumors of the MYCN/NCYM double transgenic mice. Thus, these results suggest that GSK3β inhibition by NCYM stabilizes the MYCN protein both in vitro and in vivo. Furthermore, the survival of MYCN transgenic mice bearing neuroblastoma was improved by treatment with NVP-BEZ235, a dual PI3K/mTOR inhibitor shown to destabilize MYCN via GSK3β activation. In contrast, tumors caused in MYCN/NCYM double transgenic mice showed chemo-resistance to the drug. Collectively, our results show that NCYM is the first de novo evolved protein known to act as an oncopromoting factor in human cancer, and suggest that de novo evolved proteins may functionally characterize human disease.

  11. Encoding of contextual fear memory requires de novo proteins in the prelimbic cortex

    Science.gov (United States)

    Rizzo, Valerio; Touzani, Khalid; Raveendra, Bindu L.; Swarnkar, Supriya; Lora, Joan; Kadakkuzha, Beena M.; Liu, Xin-An; Zhang, Chao; Betel, Doron; Stackman, Robert W.; Puthanveettil, Sathyanarayanan V.

    2016-01-01

    Background: Despite our understanding of the significance of the prefrontal cortex in the consolidation of long-term memories (LTM), its role in the encoding of LTM remains elusive. Here we investigated the role of new protein synthesis in the mouse medial prefrontal cortex (mPFC) in encoding contextual fear memory. Methods: Because a change in the association of mRNAs to polyribosomes is an indicator of new protein synthesis, we assessed the changes in polyribosome-associated mRNAs in the mPFC following contextual fear conditioning (CFC) in the mouse. Differential gene expression in mPFC was identified by polyribosome profiling (n = 18). The role of new protein synthesis in mPFC was determined by focal inhibition of protein synthesis (n = 131) and by intra-prelimbic cortex manipulation (n = 56) of Homer 3, a candidate identified from polyribosome profiling. Results: We identified several mRNAs that are differentially and temporally recruited to polyribosomes in the mPFC following CFC. Inhibition of protein synthesis in the prelimbic (PL), but not in the anterior cingulate cortex (ACC), region of the mPFC immediately after CFC disrupted encoding of contextual fear memory. Intriguingly, inhibition of new protein synthesis in the PL 6 hours after CFC did not impair encoding. Furthermore, expression of Homer 3, an mRNA enriched in polyribosomes following CFC, in the PL constrained encoding of contextual fear memory. Conclusions: Our studies identify several molecular substrates of new protein synthesis in the mPFC and establish that encoding of contextual fear memories requires new protein synthesis in the PL subregion of the mPFC. PMID:28503670

  12. mPUMA: a computational approach to microbiota analysis by de novo assembly of operational taxonomic units based on protein-coding barcode sequences.

    Science.gov (United States)

    Links, Matthew G; Chaban, Bonnie; Hemmingsen, Sean M; Muirhead, Kevin; Hill, Janet E

    2013-08-15

    Formation of operational taxonomic units (OTU) is a common approach to data aggregation in microbial ecology studies based on amplification and sequencing of individual gene targets. The de novo assembly of OTU sequences has been recently demonstrated as an alternative to widely used clustering methods, providing robust information from experimental data alone, without any reliance on an external reference database. Here we introduce mPUMA (microbial Profiling Using Metagenomic Assembly, http://mpuma.sourceforge.net), a software package for identification and analysis of protein-coding barcode sequence data. It was developed originally for Cpn60 universal target sequences (also known as GroEL or Hsp60). Using an unattended process that is independent of external reference sequences, mPUMA forms OTUs by DNA sequence assembly and is capable of tracking OTU abundance. mPUMA processes microbial profiles both in terms of the direct DNA sequence as well as in the translated amino acid sequence for protein coding barcodes. By forming OTUs and calculating abundance through an assembly approach, mPUMA is capable of generating inputs for several popular microbiota analysis tools. Using SFF data from sequencing of a synthetic community of Cpn60 sequences derived from the human vaginal microbiome, we demonstrate that mPUMA can faithfully reconstruct all expected OTU sequences and produce compositional profiles consistent with actual community structure. mPUMA enables analysis of microbial communities while empowering the discovery of novel organisms through OTU assembly.

  13. Shape-specific nanostructured protein mimics from de novo designed chimeric peptides.

    Science.gov (United States)

    Jiang, Linhai; Yang, Su; Lund, Reidar; Dong, He

    2018-01-30

    Natural proteins self-assemble into highly-ordered nanoscaled architectures to perform specific functions. The intricate functions of proteins have provided great impetus for researchers to develop strategies for designing and engineering synthetic nanostructures as protein mimics. Compared to the success in engineering fibrous protein mimetics, the design of discrete globular protein-like nanostructures has been challenging mainly due to the lack of precise control over geometric packing and intermolecular interactions among synthetic building blocks. In this contribution, we report an effective strategy to construct shape-specific nanostructures based on the self-assembly of chimeric peptides consisting of a coiled coil dimer and a collagen triple helix folding motif. Under salt-free conditions, we showed spontaneous self-assembly of the chimeric peptides into monodisperse, trigonal bipyramidal-like nanoparticles with precise control over the stoichiometry of two folding motifs and the geometrical arrangements relative to one another. Three coiled coil dimers are interdigitated on the equatorial plane while the two collagen triple helices are located in the axial position, perpendicular to the coiled coil plane. A detailed molecular model was proposed and further validated by small angle X-ray scattering experiments and molecular dynamics (MD) simulation. The results from this study indicated that the molecular folding of each motif within the chimeric peptides and their geometric packing played important roles in the formation of discrete protein-like nanoparticles. The peptide design and self-assembly mechanism may open up new routes for the construction of highly organized, discrete self-assembling protein-like nanostructures with greater levels of control over assembly accuracy.

  14. Protein interfacial structure and nanotoxicology

    International Nuclear Information System (INIS)

    White, John W.; Perriman, Adam W.; McGillivray, Duncan J.; Lin, J.-M.

    2009-01-01

    Here we briefly recapitulate the use of X-ray and neutron reflectometry at the air-water interface to find protein structures and thermodynamics at interfaces and test a possibility for understanding those interactions between nanoparticles and proteins which lead to nanoparticle toxicology through entry into living cells. Stable monomolecular protein films have been made at the air-water interface and, with a specially designed vessel, the substrate changed from that which the air-water interfacial film was deposited. This procedure allows interactions, both chemical and physical, between introduced species and the monomolecular film to be studied by reflectometry. The method is briefly illustrated here with some new results on protein-protein interaction between β-casein and κ-casein at the air-water interface using X-rays. These two proteins are an essential component of the structure of milk. In the experiments reported, specific and directional interactions appear to cause different interfacial structures if first, a β-casein monolayer is attacked by a κ-casein solution compared to the reverse. The additional contrast associated with neutrons will be an advantage here. We then show the first results of experiments on the interaction of a β-casein monolayer with a nanoparticle titanium oxide sol, foreshadowing the study of the nanoparticle 'corona' thought to be important for nanoparticle-cell wall penetration.

  15. Protein interfacial structure and nanotoxicology

    Energy Technology Data Exchange (ETDEWEB)

    White, John W. [Research School of Chemistry, Australian National University, Canberra (Australia)], E-mail: jww@rsc.anu.edu.au; Perriman, Adam W.; McGillivray, Duncan J.; Lin, J.-M. [Research School of Chemistry, Australian National University, Canberra (Australia)

    2009-02-21

    Here we briefly recapitulate the use of X-ray and neutron reflectometry at the air-water interface to find protein structures and thermodynamics at interfaces and test a possibility for understanding those interactions between nanoparticles and proteins which lead to nanoparticle toxicology through entry into living cells. Stable monomolecular protein films have been made at the air-water interface and, with a specially designed vessel, the substrate changed from that which the air-water interfacial film was deposited. This procedure allows interactions, both chemical and physical, between introduced species and the monomolecular film to be studied by reflectometry. The method is briefly illustrated here with some new results on protein-protein interaction between β-casein and κ-casein at the air-water interface using X-rays. These two proteins are an essential component of the structure of milk. In the experiments reported, specific and directional interactions appear to cause different interfacial structures if first, a β-casein monolayer is attacked by a κ-casein solution compared to the reverse. The additional contrast associated with neutrons will be an advantage here. We then show the first results of experiments on the interaction of a β-casein monolayer with a nanoparticle titanium oxide sol, foreshadowing the study of the nanoparticle 'corona' thought to be important for nanoparticle-cell wall penetration.

  16. Degradation and de novo synthesis of D1 protein and psbA ...

    Indian Academy of Sciences (India)

    This shows that synthesis of D1 protein is not the only component involved in the recovery process. Our events, which ... transcript levels in the green alga Chlamydomonas reinhardtii in ..... and Gaba V 1996 Accelerated degradation of the D2 ...

  17. De novo design and engineering of functional metal and porphyrin-binding protein domains

    Science.gov (United States)

    Everson, Bernard H.

    In this work, I describe an approach to the rational, iterative design and characterization of two functional cofactor-binding protein domains. First, a hybrid computational/experimental method was developed with the aim of algorithmically generating a suite of porphyrin-binding protein sequences with minimal mutual sequence information. This method was explored by generating libraries of sequences, which were then expressed and evaluated for function. One successful sequence is shown to bind a variety of porphyrin-like cofactors, and exhibits light-activated electron transfer in mixed hemin:chlorin e6 and hemin:Zn(II)-protoporphyrin IX complexes. These results imply that many sophisticated functions such as cofactor binding and electron transfer require only a very small number of residue positions in a protein sequence to be fixed. Net charge and hydrophobic content are important in determining protein solubility and stability. Accordingly, rational modifications were made to the aforementioned design procedure in order to improve its overall success rate. The effects of these modifications are explored using two 'next-generation' sequence libraries, which were separately expressed and evaluated. Particular modifications to these design parameters are demonstrated to effectively double the purification success rate of the procedure. Finally, I describe the redesign of the artificial di-iron protein DF2 into CDM13, a single chain di-manganese four-helix bundle. CDM13 acts as a functional model of natural manganese catalase, exhibiting a kcat of 0.08 s^-1 under steady-state conditions. The bound manganese cofactors have a reduction potential of +805 mV vs NHE, which is too high for efficient dismutation of hydrogen peroxide. These results indicate that as a high-potential manganese complex, CDM13 may represent a promising first step toward a polypeptide model of the Oxygen Evolving Complex of the photosynthetic enzyme Photosystem II.

  18. Structural entanglements in protein complexes

    Science.gov (United States)

    Zhao, Yani; Chwastyk, Mateusz; Cieplak, Marek

    2017-06-01

    We consider multi-chain protein native structures and propose a criterion that determines whether two chains in the system are entangled or not. The criterion is based on the behavior observed by pulling at both termini of each chain simultaneously in the two chains. We have identified about 900 entangled systems in the Protein Data Bank and provided a more detailed analysis for several of them. We argue that entanglement enhances the thermodynamic stability of the system but it may have other functions: burying the hydrophobic residues at the interface and increasing the DNA or RNA binding area. We also study the folding and stretching properties of the knotted dimeric proteins MJ0366, YibK, and bacteriophytochrome. These proteins have been studied theoretically in their monomeric versions so far. The dimers are seen to separate on stretching through the tensile mechanism and the characteristic unraveling force depends on the pulling direction.

  19. Discovery, genotyping and characterization of structural variation and novel sequence at single nucleotide resolution from de novo genome assemblies on a population scale

    DEFF Research Database (Denmark)

    Liu, Siyang; Huang, Shujia; Rao, Junhua

    2015-01-01

    present a novel approach implemented in a single software package, AsmVar, to discover, genotype and characterize different forms of structural variation and novel sequence from population-scale de novo genome assemblies up to nucleotide resolution. Application of AsmVar to several human de novo genome......) as well as large deletions. However, these approaches consistently display a substantial bias against the recovery of complex structural variants and novel sequence in individual genomes and do not provide interpretation information such as the annotation of ancestral state and formation mechanism. We...... assemblies captures a wide spectrum of structural variants and novel sequences present in the human population in high sensitivity and specificity. Our method provides a direct solution for investigating structural variants and novel sequences from de novo genome assemblies, facilitating the construction...

  20. Evolution and structural organization of the C proteins of paramyxovirinae.

    Directory of Open Access Journals (Sweden)

    Michael K Lo

    The phosphoprotein (P) gene of most Paramyxovirinae encodes several proteins in overlapping frames: P and V, which share a common N-terminus (PNT), and C, which overlaps PNT. Overlapping genes are of particular interest because they encode proteins originated de novo, some of which have unknown structural folds, challenging the notion that nature utilizes only a limited, well-mapped area of fold space. The C proteins cluster in three groups, comprising measles, Nipah, and Sendai virus. We predicted that all C proteins have a similar organization: a variable, disordered N-terminus and a conserved, α-helical C-terminus. We confirmed this predicted organization by biophysically characterizing recombinant C proteins from Tupaia paramyxovirus (measles group) and human parainfluenza virus 1 (Sendai group). We also found that the C proteins of the measles and Nipah groups have statistically significant sequence similarity, indicating a common origin. Although the C proteins of the Sendai group lack sequence similarity with them, we speculate that they also have a common origin, given their similar genomic location and structural organization. Since C is dispensable for viral replication, unlike PNT, we hypothesize that C may have originated de novo by overprinting PNT in the ancestor of Paramyxovirinae. Intriguingly, in measles virus and Nipah virus, PNT encodes STAT1-binding sites that overlap different regions of the C-terminus of C, indicating they have probably originated independently. This arrangement, in which the same genetic region encodes simultaneously a crucial functional motif (a STAT1-binding site) and a highly constrained region (the C-terminus of C), seems paradoxical, since it should severely reduce the ability of the virus to adapt. The fact that it originated twice suggests that it must be balanced by an evolutionary advantage, perhaps from reducing the size of the genetic region vulnerable to mutations.

  1. Soliton concepts and protein structure

    Science.gov (United States)

    Krokhotin, Andrei; Niemi, Antti J.; Peng, Xubiao

    2012-03-01

    Structural classification shows that the number of different protein folds is surprisingly small. It also appears that proteins are built in a modular fashion from a relatively small number of components. Here we propose that the modular building blocks are made of the dark soliton solution of a generalized discrete nonlinear Schrödinger equation. We find that practically all protein loops can be obtained simply by scaling the size and by joining together a number of copies of the soliton, one after another. The soliton has only two loop-specific parameters, and we compute their statistical distribution in the Protein Data Bank (PDB). We explicitly construct a collection of 200 sets of parameters, each determining a soliton profile that describes a different short loop. The ensuing profiles cover practically all those proteins in PDB that have a resolution which is better than 2.0 Å, with a precision such that the average root-mean-square distance between the loop and its soliton is less than the experimental B-factor fluctuation distance. We also present two examples that describe how the loop library can be employed both to model and to analyze folded proteins.

  2. De novo ORFs in Drosophila are important to organismal fitness and evolved rapidly from previously non-coding sequences.

    Directory of Open Access Journals (Sweden)

    Josephine A Reinhardt

    How non-coding DNA gives rise to new protein-coding genes (de novo genes) is not well understood. Recent work has revealed the origins and functions of a few de novo genes, but common principles governing the evolution or biological roles of these genes are unknown. To better define these principles, we performed a parallel analysis of the evolution and function of six putatively protein-coding de novo genes described in Drosophila melanogaster. Reconstruction of the transcriptional history of de novo genes shows that two de novo genes emerged from novel long non-coding RNAs that arose at least 5 MY prior to evolution of an open reading frame. In contrast, four other de novo genes evolved a translated open reading frame and transcription within the same evolutionary interval, suggesting that nascent open reading frames (proto-ORFs), while not required, can contribute to the emergence of a new de novo gene. However, none of the genes arose from proto-ORFs that existed long before expression evolved. Sequence and structural evolution of de novo genes was rapid compared to nearby genes and the structural complexity of de novo genes steadily increases over evolutionary time. Despite the fact that these genes are transcribed at a higher level in males than females, and are most strongly expressed in testes, RNAi experiments show that most of these genes are essential in both sexes during metamorphosis. This lethality suggests that protein coding de novo genes in Drosophila quickly become functionally important.

  3. Improved protein structure reconstruction using secondary structures, contacts at higher distance thresholds, and non-contacts.

    Science.gov (United States)

    Adhikari, Badri; Cheng, Jianlin

    2017-08-29

    Residue-residue contacts are key features for accurate de novo protein structure prediction. For the optimal utilization of these predicted contacts in folding proteins accurately, it is important to study the challenges of reconstructing protein structures using true contacts. Because the contact-guided protein modeling approach is valuable for predicting the folds of proteins that do not have structural templates, it is necessary for reconstruction studies to focus on hard-to-predict protein structures. Using a dataset consisting of 496 structural domains released in recent CASP experiments and a dataset of 150 representative protein structures, in this work, we discuss three techniques to improve the reconstruction accuracy using true contacts - adding secondary structures, increasing contact distance thresholds, and adding non-contacts. We find that reconstruction using secondary structures and contacts can deliver accuracy higher than using full contact maps. Similarly, we demonstrate that non-contacts can improve reconstruction accuracy not only when the used non-contacts are true but also when they are predicted. On the dataset consisting of 150 proteins, we find that simply using low-ranked predicted contacts as non-contacts and adding them as additional restraints can increase the reconstruction accuracy by 5% when the reconstructed models are evaluated using TM-score. Our findings suggest that secondary structures are invaluable companions of contacts for accurate reconstruction. Confirming some earlier findings, we also find that larger distance thresholds are useful for folding many protein structures which cannot be folded using the standard definition of contacts. Our findings also suggest that for more accurate reconstruction using predicted contacts it is useful to predict contacts at higher distance thresholds (beyond 8 Å) and predict non-contacts.
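
    To make the notions of contacts at different distance thresholds and non-contacts concrete, the sketch below splits residue pairs into the two groups from a list of 3D coordinates. The 8 Å default follows the threshold mentioned in the abstract; the choice of one representative atom per residue, the minimum sequence separation, and the toy coordinates are assumptions made for illustration and do not reproduce the authors' pipeline.

```python
import math


def classify_pairs(coords, threshold=8.0, min_separation=6):
    """Split residue pairs into contacts (distance < threshold) and
    non-contacts (distance >= threshold).

    `coords` holds one (x, y, z) point per residue, for example a C-beta
    atom; pairs closer than `min_separation` in sequence are ignored.
    """
    contacts, non_contacts = [], []
    n = len(coords)
    for i in range(n):
        for j in range(i + min_separation, n):
            if math.dist(coords[i], coords[j]) < threshold:
                contacts.append((i, j))
            else:
                non_contacts.append((i, j))
    return contacts, non_contacts


# Toy chain: five residues on a straight line, 3.8 angstroms apart.
chain = [(3.8 * k, 0.0, 0.0) for k in range(5)]

standard, standard_non = classify_pairs(chain, threshold=8.0, min_separation=2)
relaxed, relaxed_non = classify_pairs(chain, threshold=12.0, min_separation=2)
```

    Raising the threshold from 8 Å to 12 Å moves the pairs three residues apart (about 11.4 Å) from the non-contact list into the contact list, which is the kind of extra restraint information the abstract refers to.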

  4. Protein Structure Refinement by Optimization

    DEFF Research Database (Denmark)

    Carlsen, Martin

    on whether the three-dimensional structure of a homologous sequence is known. Whether or not a protein model can be used for industrial purposes depends on the quality of the predicted structure. A model can be used to design a drug when the quality is high. The overall goal of this project is to assess...... that correlates maximally to a native-decoy distance. The main contribution of this thesis is methods developed for analyzing the performance of metrically trained knowledge-based potentials and for optimizing their performance while making them less dependent on the decoy set used to define them. We focus...... being at-least a local minimum of the potential. To address how far the current functional form of the potential is from an ideal potential we present two methods for finding the optimal metrically trained potential that simultaneous has a number of native structures as a local minimum. Our results...

  5. Structure and Sequence Search on Aptamer-Protein Docking

    Science.gov (United States)

    Xiao, Jiajie; Bonin, Keith; Guthold, Martin; Salsbury, Freddie

    2015-03-01

    Interactions between proteins and deoxyribonucleic acid (DNA) play a significant role in living systems, especially through gene regulation. However, short nucleic acid sequences (aptamers) with specific binding affinity to specific proteins exhibit clinical potential as therapeutics. Our capillary and gel electrophoresis selection experiments show that specific sequences of aptamers can be selected that bind specific proteins. Computationally, given the experimentally-determined structure and sequence of a thrombin-binding aptamer, we can successfully dock the aptamer onto thrombin in agreement with experimental structures of the complex. In order to further study the conformational flexibility of this thrombin-binding aptamer and to potentially develop a predictive computational model of aptamer-binding, we use GPU-enabled molecular dynamics simulations to both examine the conformational flexibility of the aptamer in the absence of binding to thrombin, and to determine our ability to fold an aptamer. This study should help further de-novo predictions of aptamer sequences by enabling the study of structural and sequence-dependent effects on aptamer-protein docking specificity.

  6. Protein 3D structure computed from evolutionary sequence variation.

    Directory of Open Access Journals (Sweden)

    Debora S Marks

    The evolutionary trajectory of a protein through sequence space is constrained by its function. Collections of sequence homologs record the outcomes of millions of evolutionary experiments in which the protein evolves according to these constraints. Deciphering the evolutionary record held in these sequences and exploiting it for predictive and engineering purposes presents a formidable challenge. The potential benefit of solving this challenge is amplified by the advent of inexpensive high-throughput genomic sequencing. In this paper we ask whether we can infer evolutionary constraints from a set of sequence homologs of a protein. The challenge is to distinguish true co-evolution couplings from the noisy set of observed correlations. We address this challenge using a maximum entropy model of the protein sequence, constrained by the statistics of the multiple sequence alignment, to infer residue pair couplings. Surprisingly, we find that the strength of these inferred couplings is an excellent predictor of residue-residue proximity in folded structures. Indeed, the top-scoring residue couplings are sufficiently accurate and well-distributed to define the 3D protein fold with remarkable accuracy. We quantify this observation by computing, from sequence alone, all-atom 3D structures of fifteen test proteins from different fold classes, ranging in size from 50 to 260 residues, including a G-protein coupled receptor. These blinded inferences are de novo, i.e., they do not use homology modeling or sequence-similar fragments from known structures. The co-evolution signals provide sufficient information to determine accurate 3D protein structure to 2.7-4.8 Å Cα-RMSD error relative to the observed structure, over at least two-thirds of the protein (method called EVfold, details at http://EVfold.org). This discovery provides insight into essential interactions constraining protein evolution and will facilitate a comprehensive survey of the universe of
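
    The idea that correlated alignment columns carry coupling signal can be illustrated with a much simpler statistic than the one used in the abstract. The sketch below computes plain mutual information between two columns of a multiple sequence alignment. It is a stand-in for illustration only, not the maximum entropy model behind EVfold, and the toy alignment and function names are hypothetical.

```python
from collections import Counter
from math import log


def column(msa, i):
    """Return the residues found at alignment position i, one per sequence."""
    return [seq[i] for seq in msa]


def mutual_information(msa, i, j):
    """Mutual information (in nats) between alignment columns i and j.

    This is a naive covariation score; it does not separate direct
    couplings from indirect correlations as a maximum entropy model does.
    """
    n = len(msa)
    counts_i = Counter(column(msa, i))
    counts_j = Counter(column(msa, j))
    counts_ij = Counter(zip(column(msa, i), column(msa, j)))
    mi = 0.0
    for (a, b), count in counts_ij.items():
        p_ab = count / n
        mi += p_ab * log(p_ab / ((counts_i[a] / n) * (counts_j[b] / n)))
    return mi


# Hypothetical toy alignment: columns 0 and 1 covary, column 3 varies independently.
toy_msa = ["AKLE", "AKLD", "GRLE", "GRLD"]
covarying = mutual_information(toy_msa, 0, 1)
independent = mutual_information(toy_msa, 0, 3)
```

    In this toy case the covarying pair scores ln 2 and the independent pair scores zero. On real alignments, raw mutual information is confounded by indirect correlations, which is the "noisy set of observed correlations" the abstract refers to.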

  7. SDSL-ESR-based protein structure characterization.

    Science.gov (United States)

    Strancar, Janez; Kavalenka, Aleh; Urbancic, Iztok; Ljubetic, Ajasja; Hemminga, Marcus A

    2010-03-01

    As proteins are key molecules in living cells, knowledge about their structure can provide important insights and applications in science, biotechnology, and medicine. However, many protein structures are still a big challenge for existing high-resolution structure-determination methods, as can be seen in the number of protein structures published in the Protein Data Bank. This is especially the case for less-ordered, more hydrophobic and more flexible protein systems. The lack of efficient methods for structure determination calls for urgent development of a new class of biophysical techniques. This work attempts to address this problem with a novel combination of site-directed spin labelling electron spin resonance spectroscopy (SDSL-ESR) and protein structure modelling, which is coupled by restriction of the conformational spaces of the amino acid side chains. Comparison of the application to four different protein systems enables us to generalize the new method and to establish a general procedure for determination of protein structure.

  8. Sequence protein identification by randomized sequence database and transcriptome mass spectrometry (SPIDER-TMS): from manual to automatic application of a 'de novo sequencing' approach.

    Science.gov (United States)

    Pascale, Raffaella; Grossi, Gerarda; Cruciani, Gabriele; Mecca, Giansalvatore; Santoro, Donatello; Sarli Calace, Renzo; Falabella, Patrizia; Bianco, Giuliana

    Sequence protein identification by a randomized sequence database and transcriptome mass spectrometry software package has been developed at the University of Basilicata in Potenza (Italy) and designed to facilitate the determination of the amino acid sequence of a peptide as well as an unequivocal identification of proteins in a high-throughput manner with enormous advantages of time, economical resource and expertise. The software package is a valid tool for the automation of a de novo sequencing approach, overcoming the main limits and a versatile platform useful in the proteomic field for an unequivocal identification of proteins, starting from tandem mass spectrometry data. The strength of this software is that it is a user-friendly and non-statistical approach, so protein identification can be considered unambiguous.

  9. de novo computational enzyme design.

    Science.gov (United States)

    Zanghellini, Alexandre

    2014-10-01

    Recent advances in systems and synthetic biology as well as metabolic engineering are poised to transform industrial biotechnology by allowing us to design cell factories for the sustainable production of valuable fuels and chemicals. To deliver on their promises, such cell factories, as much as their brick-and-mortar counterparts, will require appropriate catalysts, especially for classes of reactions that are not known to be catalyzed by enzymes in natural organisms. A recently developed methodology, de novo computational enzyme design, can be used to create enzymes catalyzing novel reactions. Here we review the different classes of chemical reactions for which active protein catalysts have been designed as well as the results of detailed biochemical and structural characterization studies. We also discuss how combining de novo computational enzyme design with more traditional protein engineering techniques can alleviate the shortcomings of state-of-the-art computational design techniques and create novel enzymes with catalytic proficiencies on par with natural enzymes. Copyright © 2014 Elsevier Ltd. All rights reserved.

  10. De novo prediction of structured RNAs from genomic sequences

    DEFF Research Database (Denmark)

    Gorodkin, Jan; Hofacker, Ivo L.; Þórarinsson, Elfar

    2010-01-01

    currently available, because evolutionary conservation highlights functionally important regions. Conserved secondary structure, rather than primary sequence, is the hallmark of many functionally important RNAs, because compensatory substitutions in base-paired regions preserve structure. Unfortunately...

  11. Computational methods for constructing protein structure models from 3D electron microscopy maps.

    Science.gov (United States)

    Esquivel-Rodríguez, Juan; Kihara, Daisuke

    2013-10-01

    Protein structure determination by cryo-electron microscopy (EM) has made significant progress in the past decades. Resolutions of EM maps have been improving as evidenced by recently reported structures that are solved at high resolutions close to 3Å. Computational methods play a key role in interpreting EM data. Among many computational procedures applied to an EM map to obtain protein structure information, in this article we focus on reviewing computational methods that model protein three-dimensional (3D) structures from a 3D EM density map that is constructed from two-dimensional (2D) maps. The computational methods we discuss range from de novo methods, which identify structural elements in an EM map, to structure fitting methods, where known high resolution structures are fit into a low-resolution EM map. A list of available computational tools is also provided. Copyright © 2013 Elsevier Inc. All rights reserved.

  12. Modularity in protein structures: study on all-alpha proteins.

    Science.gov (United States)

    Khan, Taushif; Ghosh, Indira

    2015-01-01

    Modularity is known as one of the most important features of the robust and efficient design of proteins. The architecture and topology of proteins play a vital role by providing the necessary robust scaffolds to support an organism's growth and survival under constant evolutionary pressure. These complex biomolecules can be represented by several layers of modular architecture, but it is pivotal to understand and explore the smallest biologically relevant structural component. In the present study, we have developed a component-based method, using a protein's secondary structures and their arrangements (i.e. patterns) in order to investigate its structural space. Our results on all-alpha proteins show that the known structural space is highly populated with a limited set of structural patterns. We have also noticed that these frequently observed structural patterns are present as modules or "building blocks" in large proteins (i.e. higher secondary structure content). From structural descriptor analysis, observed patterns are found to be within similar deviation; however, frequent patterns are found to be distinctly occurring in diverse functions, e.g. in enzymatic classes and reactions. In this study, we are introducing a simple approach to explore protein structural space using combinatorial- and graph-based geometry methods, which can be used to describe modularity in protein structures. Moreover, analysis indicates that protein function seems to be the driving force that shapes the known structure space.

  13. Protein enriched pasta: structure and digestibility of its protein network.

    Science.gov (United States)

    Laleg, Karima; Barron, Cécile; Santé-Lhoutellier, Véronique; Walrand, Stéphane; Micard, Valérie

    2016-02-01

    Wheat (W) pasta was enriched in 6% gluten (G), 35% faba (F) or 5% egg (E) to increase its protein content (13% to 17%). The impact of the enrichment on the multiscale structure of the pasta and on in vitro protein digestibility was studied. Increasing the protein content (W- vs. G-pasta) strengthened pasta structure at molecular and macroscopic scales but reduced its protein digestibility by 3% by forming a higher covalently linked protein network. Greater changes in the macroscopic and molecular structure of the pasta were obtained by varying the nature of protein used for enrichment. Proteins in G- and E-pasta were highly covalently linked (28-32%) resulting in a strong pasta structure. Conversely, F-protein (98% SDS-soluble) altered the pasta structure by diluting gluten and formed a weak protein network (18% covalent link). As a result, protein digestibility in F-pasta was significantly higher (46%) than in E- (44%) and G-pasta (39%). The effect of low (55 °C, LT) vs. very high temperature (90 °C, VHT) drying on the protein network structure and digestibility was shown to cause greater molecular changes than pasta formulation. Whatever the pasta, a general strengthening of its structure, a 33% to 47% increase in covalently linked proteins and a higher β-sheet structure were observed. However, these structural differences were evened out after the pasta was cooked, resulting in identical protein digestibility in LT and VHT pasta. Even after VHT drying, F-pasta had the best amino acid profile with the highest protein digestibility, proof of its nutritional interest.

  14. Neural Networks for protein Structure Prediction

    DEFF Research Database (Denmark)

    Bohr, Henrik

    1998-01-01

    This is a review about neural network applications in bioinformatics. Especially the applications to protein structure prediction, e.g. prediction of secondary structures, prediction of surface structure, fold class recognition and prediction of the 3-dimensional structure of protein backbones...

  15. Structure-based barcoding of proteins.

    Science.gov (United States)

    Metri, Rahul; Jerath, Gaurav; Kailas, Govind; Gacche, Nitin; Pal, Adityabarna; Ramakrishnan, Vibin

    2014-01-01

    A reduced representation in the format of a barcode has been developed to provide an overview of the topological nature of a given protein structure from a 3D coordinate file. The molecular structure of a protein coordinate file from the Protein Data Bank is first expressed in terms of an alpha-numero code and further converted to a barcode image. The barcode representation can be used to compare and contrast different proteins based on their structure. The utility of this method has been exemplified by comparing structural barcodes of proteins that belong to the same fold family, and across different folds. In addition to this, we have attempted to provide an illustration of (i) the structural changes often seen in a given protein molecule upon interaction with ligands and (ii) modifications in the overall topology of a given protein during evolution. The program is fully downloadable from the website http://www.iitg.ac.in/probar/. © 2013 The Protein Society.

  16. Studies on Antifungal Potential, Primary Characterization and Mode of Action of a De Novo Cytoplasmic Protein (EAF) from Human Commensal Escherichia coli Against Aspergillus spp.

    Science.gov (United States)

    Balhara, Meenakshi; Ruhil, Sonam; Dhankhar, Sandeep; Chhillar, Anil K

    2015-01-01

    A de novo protein named EAF (Escherichia antifungal protein) from the cytoplasmic pool of an Escherichia coli strain (MTCC 1652) has been purified to homogeneity using anion exchange (Q-XL Sepharose) and cation exchange (SP-Sepharose) chromatography. The MIC (minimum inhibitory concentration) values of the purified protein against A. fumigatus (the major pathogenic species) were found to be comparable with standard drugs, i.e. 3.90 µg/ml, 3.90 µg/ml and 1.25 µg/disc via microbroth dilution assay (MDA), percentage spore germination inhibition (PSGI) and disc diffusion assay (DDA) respectively. Toxicity results confirmed that it causes no haemolysis against human RBCs up to a concentration of 1000.0 µg/ml, as compared to Amphotericin B (a conventional antifungal drug), which causes hundred percent haemolysis at a concentration of only 37.50 µg/ml. The purified protein demonstrated a molecular mass of 28 kDa on SDS-PAGE, which was further authenticated by MALDI-TOF. Proteomic and bioinformatics studies deciphered its significant homology (72%) with chain A-D-ribose binding protein (cluster 2 sugar binding periplasmic proteins; sequence homologues of transcription regulatory proteins) from E. coli. Single-dimensional PAGE analysis of A. fumigatus proteins under the effect of EAF (at MIC50) revealed the inhibition of two major proteins: a heat shock protein 70, Hsp70 (68 kDa), which has a role in protein folding and functioning, and the phenylalanyl-tRNA synthetase PodG subunit protein (74 kDa), which is involved in growth polarity in fungi. Scanning electron microscopic studies depicted homologous results. We suggest that EAF most likely belongs to a new group of proteins with potent antifungal characteristics, negligible toxicity and targeting vital proteins of fungal metabolism.

  17. The interface of protein structure, protein biophysics, and molecular evolution

    Science.gov (United States)

    Liberles, David A; Teichmann, Sarah A; Bahar, Ivet; Bastolla, Ugo; Bloom, Jesse; Bornberg-Bauer, Erich; Colwell, Lucy J; de Koning, A P Jason; Dokholyan, Nikolay V; Echave, Julian; Elofsson, Arne; Gerloff, Dietlind L; Goldstein, Richard A; Grahnen, Johan A; Holder, Mark T; Lakner, Clemens; Lartillot, Nicholas; Lovell, Simon C; Naylor, Gavin; Perica, Tina; Pollock, David D; Pupko, Tal; Regan, Lynne; Roger, Andrew; Rubinstein, Nimrod; Shakhnovich, Eugene; Sjölander, Kimmen; Sunyaev, Shamil; Teufel, Ashley I; Thorne, Jeffrey L; Thornton, Joseph W; Weinreich, Daniel M; Whelan, Simon

    2012-01-01

    The interface of protein structural biology, protein biophysics, molecular evolution, and molecular population genetics forms the foundations for a mechanistic understanding of many aspects of protein biochemistry. Current efforts in interdisciplinary protein modeling are in their infancy and the state of the art of such models is described. Beyond the relationship between amino acid substitution and static protein structure, protein function, and corresponding organismal fitness, other considerations are also discussed. More complex mutational processes such as insertion and deletion and domain rearrangements and even circular permutations should be evaluated. The role of intrinsically disordered proteins is still controversial, but may be increasingly important to consider. Protein geometry and protein dynamics as a deviation from static considerations of protein structure are also important. Protein expression level is known to be a major determinant of evolutionary rate and several considerations including selection at the mRNA level and the role of interaction specificity are discussed. Lastly, the relationship between modeling and needed high-throughput experimental data as well as experimental examination of protein evolution using ancestral sequence resurrection and in vitro biochemistry are presented, towards an aim of ultimately generating better models for biological inference and prediction. PMID:22528593

  18. SDSL-ESR-based protein structure characterization

    NARCIS (Netherlands)

    Strancar, J.; Kavalenka, A.A.; Urbancic, I.; Ljubetic, A.; Hemminga, M.A.

    2010-01-01

    As proteins are key molecules in living cells, knowledge about their structure can provide important insights and applications in science, biotechnology, and medicine. However, many protein structures are still a big challenge for existing high-resolution structure-determination methods, as can be

  19. Overcoming barriers to membrane protein structure determination.

    Science.gov (United States)

    Bill, Roslyn M; Henderson, Peter J F; Iwata, So; Kunji, Edmund R S; Michel, Hartmut; Neutze, Richard; Newstead, Simon; Poolman, Bert; Tate, Christopher G; Vogel, Horst

    2011-04-01

    After decades of slow progress, the pace of research on membrane protein structures is beginning to quicken thanks to various improvements in technology, including protein engineering and microfocus X-ray diffraction. Here we review these developments and, where possible, highlight generic new approaches to solving membrane protein structures based on recent technological advances. Rational approaches to overcoming the bottlenecks in the field are urgently required as membrane proteins, which typically comprise ~30% of the proteomes of organisms, are dramatically under-represented in the structural database of the Protein Data Bank.

  20. Mapping monomeric threading to protein-protein structure prediction.

    Science.gov (United States)

    Guerler, Aysam; Govindarajoo, Brandon; Zhang, Yang

    2013-03-25

    The key step of template-based protein-protein structure prediction is the recognition of complexes from experimental structure libraries that have similar quaternary fold. Maintaining two monomer and dimer structure libraries is however laborious, and inappropriate library construction can degrade template recognition coverage. We propose a novel strategy SPRING to identify complexes by mapping monomeric threading alignments to protein-protein interactions based on the original oligomer entries in the PDB, which does not rely on library construction and increases the efficiency and quality of complex template recognitions. SPRING is tested on 1838 nonhomologous protein complexes which can recognize correct quaternary template structures with a TM score >0.5 in 1115 cases after excluding homologous proteins. The average TM score of the first model is 60% and 17% higher than that by HHsearch and COTH, respectively, while the number of targets with an interface RMSD benchmark proteins. Although the relative performance of SPRING and ZDOCK depends on the level of homology filters, a combination of the two methods can result in a significantly higher model quality than ZDOCK at all homology thresholds. These data demonstrate a new efficient approach to quaternary structure recognition that is ready to use for genome-scale modeling of protein-protein interactions due to the high speed and accuracy.
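
    The TM score threshold quoted above can be made concrete with a short sketch. The abstract does not spell out the formula, so the code assumes the commonly used Zhang and Skolnick definition, with a length-dependent distance scale d0. It scores a single fixed superposition (the published score is the maximum over superpositions), and the 0.5 Å floor on d0 and the function names are assumptions made for this illustration.

```python
def tm_d0(target_length):
    """Length-dependent distance scale d0 in Å (assumed Zhang-Skolnick form)."""
    if target_length <= 15:
        # Assumed floor: the cube-root formula is not meaningful for such short chains.
        return 0.5
    return max(0.5, 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8)


def tm_score(distances, target_length):
    """TM score for one fixed superposition.

    distances: Cα-Cα distances in Å for the aligned residue pairs.
    target_length: number of residues in the target, used for normalization.
    """
    if target_length <= 0:
        raise ValueError("target_length must be positive")
    d0 = tm_d0(target_length)
    return sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances) / target_length


# A perfect match over the full length scores 1; unaligned residues add nothing.
perfect = tm_score([0.0] * 100, 100)
half_aligned = tm_score([0.0] * 50, 100)
```

    Because each term is normalized by the target length rather than the aligned length, a model that matches only part of the target is penalized, which is why a cutoff such as 0.5 is usable across proteins of different sizes.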

  1. PSAIA – Protein Structure and Interaction Analyzer

    Directory of Open Access Journals (Sweden)

    Vlahoviček Kristian

    2008-04-01

    Background: PSAIA (Protein Structure and Interaction Analyzer) was developed to compute geometric parameters for large sets of protein structures in order to predict and investigate protein-protein interaction sites. Results: In addition to most relevant established algorithms, PSAIA offers a new method, PIADA (Protein Interaction Atom Distance Algorithm), for the determination of residue interaction pairs. We found that PIADA produced more satisfactory results than comparable algorithms implemented in PSAIA. Particular advantages of PSAIA include its capacity to combine different methods to detect the locations and types of interactions between residues and its ability, without any further automation steps, to handle large numbers of protein structures and complexes. Generally, the integration of a variety of methods enables PSAIA to offer easier automation of analysis and greater reliability of results. PSAIA can be used either via a graphical user interface or from the command line. Results are generated in either tabular or XML format. Conclusion: In a straightforward fashion and for large sets of protein structures, PSAIA enables the calculation of protein geometric parameters and the determination of location and type for protein-protein interaction sites. XML-formatted output enables easy conversion of results to various formats suitable for statistical analysis. Results from smaller data sets demonstrated the influence of geometry on protein interaction sites. Comprehensive analysis of properties of large data sets led to new information useful in the prediction of protein-protein interaction sites.

  2. NAPS: Network Analysis of Protein Structures

    Science.gov (United States)

    Chakrabarty, Broto; Parekh, Nita

    2016-01-01

    Traditionally, protein structures have been analysed by the secondary structure architecture and fold arrangement. An alternative approach that has shown promise is modelling proteins as a network of non-covalent interactions between amino acid residues. The network representation of proteins provides a systems approach to topological analysis of complex three-dimensional structures irrespective of secondary structure and fold type, and provides insights into structure-function relationships. We have developed a web server for network-based analysis of protein structures, NAPS, that facilitates quantitative and qualitative (visual) analysis of residue–residue interactions in single chains, protein complexes, modelled protein structures and trajectories (e.g. from molecular dynamics simulations). The user can specify the atom type for network construction, the distance range (in Å) and the minimal amino acid separation along the sequence. NAPS lets users select node(s) and their neighbourhood based on centrality measures, physicochemical properties of amino acids or clusters of well-connected residues (k-cliques) for further analysis. Visual analysis of interacting domains and protein chains, and shortest path lengths between pairs of residues, are additional features that aid in functional analysis. NAPS supports various analyses and visualization views for identifying functional residues, and provides insight into mechanisms of protein folding and into domain-domain and protein–protein interactions, for understanding communication within and between proteins. URL: http://bioinf.iiit.ac.in/NAPS/. PMID:27151201
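
    The network construction described above (one node per residue, an edge when two residues lie within a distance range and are far enough apart in sequence) can be sketched in a few lines. This is an illustration under stated assumptions, not the NAPS implementation: one representative atom per residue is assumed, and the function names, parameter names and toy coordinates are hypothetical.

```python
from math import dist


def contact_network(coords, lower=0.0, upper=7.0, min_separation=1):
    """Build residue-residue edges from one representative atom per residue.

    coords: list of (x, y, z) positions in Å, indexed by residue order.
    lower, upper: inclusive distance range in Å for drawing an edge.
    min_separation: minimal sequence separation |i - j| for a pair to count.
    """
    edges = set()
    n = len(coords)
    for i in range(n):
        for j in range(i + min_separation, n):
            if lower <= dist(coords[i], coords[j]) <= upper:
                edges.add((i, j))
    return edges


def degree_centrality(edges, n):
    """Number of edges touching each residue, a basic centrality measure."""
    degree = [0] * n
    for i, j in edges:
        degree[i] += 1
        degree[j] += 1
    return degree


# Hypothetical toy chain: five residues spaced 3.8 Å apart on a straight line.
toy_coords = [(3.8 * k, 0.0, 0.0) for k in range(5)]
toy_edges = contact_network(toy_coords, upper=8.0, min_separation=2)
toy_degree = degree_centrality(toy_edges, len(toy_coords))
```

    Raising `min_separation` removes trivial contacts between sequence neighbours, so the remaining edges reflect the fold rather than chain connectivity.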

  3. Solution NMR structure determination of proteins revisited

    International Nuclear Information System (INIS)

    Billeter, Martin; Wagner, Gerhard; Wuethrich, Kurt

    2008-01-01

    This 'Perspective' bears on the present state of protein structure determination by NMR in solution. The focus is on a comparison of the infrastructure available for NMR structure determination when compared to protein crystal structure determination by X-ray diffraction. The main conclusion emerges that the unique potential of NMR to generate high resolution data also on dynamics, interactions and conformational equilibria has contributed to a lack of standard procedures for structure determination which would be readily amenable to improved efficiency by automation. To spark renewed discussion on the topic of NMR structure determination of proteins, procedural steps with high potential for improvement are identified

  4. Extracting knowledge from protein structure geometry

    DEFF Research Database (Denmark)

    Røgen, Peter; Koehl, Patrice

    2013-01-01

    potential from geometric knowledge extracted from native and misfolded conformers of protein structures. This new potential, Metric Protein Potential (MPP), has two main features that are key to its success. Firstly, it is composite in that it includes local and nonlocal geometric information on proteins...

  5. Integrated Structural Biology for α-Helical Membrane Protein Structure Determination.

    Science.gov (United States)

    Xia, Yan; Fischer, Axel W; Teixeira, Pedro; Weiner, Brian; Meiler, Jens

    2018-04-03

    While great progress has been made, only 10% of the nearly 1,000 integral, α-helical, multi-span membrane protein families are represented by at least one experimentally determined structure in the PDB. Previously, we developed the algorithm BCL::MP-Fold, which samples the large conformational space of membrane proteins de novo by assembling predicted secondary structure elements guided by knowledge-based potentials. Here, we present a case study of rhodopsin fold determination by integrating sparse and/or low-resolution restraints from multiple experimental techniques including electron microscopy, electron paramagnetic resonance spectroscopy, and nuclear magnetic resonance spectroscopy. Simultaneous incorporation of orthogonal experimental restraints not only significantly improved the sampling accuracy but also allowed identification of the correct fold, which is demonstrated by a protein size-normalized transmembrane root-mean-square deviation as low as 1.2 Å. The protocol developed in this case study can be used for the determination of unknown membrane protein folds when limited experimental restraints are available. Copyright © 2018 Elsevier Ltd. All rights reserved.

  6. Validation-driven protein-structure improvement

    NARCIS (Netherlands)

    Touw, W.G.

    2016-01-01

    High-quality protein structure models are essential for many Life Science applications, such as protein engineering, molecular dynamics, drug design, and homology modelling. The WHAT_CHECK model validation project and the PDB_REDO model optimisation project have shown that many structure models in

  7. Heterochiral Knottin Protein: Folding and Solution Structure.

    Science.gov (United States)

    Mong, Surin K; Cochran, Frank V; Yu, Hongtao; Graziano, Zachary; Lin, Yu-Shan; Cochran, Jennifer R; Pentelute, Bradley L

    2017-10-31

    Homochirality is a general feature of biological macromolecules, and Nature includes few examples of heterochiral proteins. Herein, we report on the design, chemical synthesis, and structural characterization of heterochiral proteins possessing loops of amino acids of chirality opposite to that of the rest of a protein scaffold. Using the protein Ecballium elaterium trypsin inhibitor II, we discover that selective β-alanine substitution favors the efficient folding of our heterochiral constructs. Solution nuclear magnetic resonance spectroscopy of one such heterochiral protein reveals a homogeneous global fold. Additionally, steered molecular dynamics simulation indicate β-alanine reduces the free energy required to fold the protein. We also find these heterochiral proteins to be more resistant to proteolysis than homochiral l-proteins. This work informs the design of heterochiral protein architectures containing stretches of both d- and l-amino acids.

  8. Amino acid code of protein secondary structure.

    Science.gov (United States)

    Shestopalov, B V

    2003-01-01

    The calculation of protein three-dimensional structure from the amino acid sequence is a fundamental problem to be solved. This paper presents principles of the code theory of protein secondary structure, and their consequence, the amino acid code of protein secondary structure. The doublet code model of protein secondary structure, developed earlier by the author (Shestopalov, 1990), is part of this theory. The bases of the theory are: 1) the name secondary structure is assigned to the conformation stabilized only by the nearest (intraresidual) and middle-range (at a distance no more than that between residues i and i + 5) interactions; 2) the secondary structure consists of regular (alpha-helical and beta-structural) and irregular (coil) segments; 3) the alpha-helices, beta-strands and coil segments are encoded, respectively, by residue pairs (i, i + 4), (i, i + 2), (i, i + 1), according to the numbers of residues per period, 3.6, 2, 1; 4) all such pairs in the amino acid sequence are codons for elementary structural elements, or structurons; 5) the codons are divided into 21 types depending on their strength, i.e. their encoding capability; 6) overlappings of structurons of one and the same structure generate the longer segments of this structure; 7) overlapping of structurons of different structures is forbidden, and therefore selection of codons is required; the codon selection is hierarchic; 8) the code theory of protein secondary structure generates six variants of the amino acid code of protein secondary structure. There are two possible kinds of model construction based on the theory: the physical one using physical properties of amino acid residues, and the statistical one using results of statistical analysis of a great body of structural data. Some evident consequences of the theory are: a) the theory can be used for calculating the secondary structure from the amino acid sequence as a partial solution of the problem of calculation of protein three
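
    Point 3 of the theory can be illustrated by enumerating the candidate residue pairs for each structure type, taking the coil pair to be (i, i + 1). The sketch only lists the pairs; the abstract does not give the 21 strength types or the hierarchic selection rules, so none are modelled here. The function name and example sequence are hypothetical.

```python
# Sequence offsets of the encoding residue pairs, as listed in point 3.
PAIR_OFFSETS = {"helix": 4, "strand": 2, "coil": 1}


def candidate_codons(sequence):
    """List every residue pair (i, i + offset) for each structure type.

    Returns a dict mapping structure type to a list of
    (position i, residue at i, residue at i + offset) tuples.
    """
    codons = {}
    for kind, offset in PAIR_OFFSETS.items():
        codons[kind] = [
            (i, sequence[i], sequence[i + offset])
            for i in range(len(sequence) - offset)
        ]
    return codons


# Hypothetical six-residue peptide, one-letter amino acid codes.
example = candidate_codons("ACDEFG")
```

    For a sequence of length n this yields n - 4 helix pairs, n - 2 strand pairs and n - 1 coil pairs; choosing among overlapping pairs of different types is the codon selection step that point 7 describes.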

  9. Fast iodide-SAD phasing for high-throughput membrane protein structure determination.

    Science.gov (United States)

    Melnikov, Igor; Polovinkin, Vitaly; Kovalev, Kirill; Gushchin, Ivan; Shevtsov, Mikhail; Shevchenko, Vitaly; Mishin, Alexey; Alekseev, Alexey; Rodriguez-Valera, Francisco; Borshchevskiy, Valentin; Cherezov, Vadim; Leonard, Gordon A; Gordeliy, Valentin; Popov, Alexander

    2017-05-01

    We describe a fast, easy, and potentially universal method for the de novo solution of the crystal structures of membrane proteins via iodide-single-wavelength anomalous diffraction (I-SAD). The potential universality of the method is based on a common feature of membrane proteins: the availability at the hydrophobic-hydrophilic interface of positively charged amino acid residues with which iodide strongly interacts. We demonstrate the solution using I-SAD of four crystal structures representing different classes of membrane proteins, including a human G protein-coupled receptor (GPCR), and we show that I-SAD can be applied using data collection strategies based on either standard or serial x-ray crystallography techniques.

  10. K-nearest uphill clustering in the protein structure space

    KAUST Repository

    Cui, Xuefeng; Gao, Xin

    2016-01-01

    The protein structure classification problem, which is to assign a protein structure to a cluster of similar proteins, is one of the most fundamental problems in the construction and application of the protein structure space. Early manually curated

  11. Automated protein structure calculation from NMR data

    International Nuclear Information System (INIS)

    Williamson, Mike P.; Craven, C. Jeremy

    2009-01-01

    Current software is almost at the stage to permit completely automatic structure determination of small proteins of <15 kDa, from NMR spectra to structure validation with minimal user interaction. This goal is welcome, as it makes structure calculation more objective and therefore more easily validated, without any loss in the quality of the structures generated. Moreover, it releases expert spectroscopists to carry out research that cannot be automated. It should not take much further effort to extend automation to ca. 20 kDa. However, there are technological barriers to further automation, of which the biggest are identified as: routines for peak picking; adoption and sharing of a common framework for structure calculation, including the assembly of an automated and trusted package for structure validation; and sample preparation, particularly for larger proteins. These barriers should be the main target for development of methodology for protein structure determination, particularly by structural genomics consortia.

  12. Structural anatomy of telomere OB proteins.

    Science.gov (United States)

    Horvath, Martin P

    2011-10-01

    Telomere DNA-binding proteins protect the ends of chromosomes in eukaryotes. A subset of these proteins are constructed with one or more OB folds and bind with G+T-rich single-stranded DNA found at the extreme termini. The resulting DNA-OB protein complex interacts with other telomere components to coordinate critical telomere functions of DNA protection and DNA synthesis. While the first crystal and NMR structures readily explained protection of telomere ends, the picture of how single-stranded DNA becomes available to serve as primer and template for synthesis of new telomere DNA is only recently coming into focus. New structures of telomere OB fold proteins alongside insights from genetic and biochemical experiments have made significant contributions towards understanding how protein-binding OB proteins collaborate with DNA-binding OB proteins to recruit telomerase and DNA polymerase for telomere homeostasis. This review surveys telomere OB protein structures alongside highly comparable structures derived from replication protein A (RPA) components, with the goal of providing a molecular context for understanding telomere OB protein evolution and mechanism of action in protection and synthesis of telomere DNA.

  13. Understanding Protein-Protein Interactions Using Local Structural Features

    DEFF Research Database (Denmark)

    Planas-Iglesias, Joan; Bonet, Jaume; García-García, Javier

    2013-01-01

    Protein-protein interactions (PPIs) play a relevant role among the different functions of a cell. Identifying the PPI network of a given organism (interactome) is useful to shed light on the key molecular mechanisms within a biological system. In this work, we show the role of structural features...... interacting and non-interacting protein pairs to classify the structural features that sustain the binding (or non-binding) behavior. Our study indicates that not only the interacting region but also the rest of the protein surface are important for the interaction fate. The interpretation...... to score the likelihood of the interaction between two proteins and to develop a method for the prediction of PPIs. We have tested our method on several sets with unbalanced ratios of interactions and non-interactions to simulate real conditions, obtaining accuracies higher than 25% in the most unfavorable...

  14. Introducing site-specific cysteines into nanobodies for mercury labelling allows de novo phasing of their crystal structures

    DEFF Research Database (Denmark)

    Hansen, Simon Boje; Laursen, Nick Stub; Andersen, Gregers Rom

    2017-01-01

    of the presence of free cysteines in the target protein could considerably facilitate the process of obtaining unbiased experimental phases. Nanobodies (single-domain antibodies) have recently been shown to promote the crystallization and structure determination of flexible proteins and complexes. To extend...... phased using single-wavelength anomalous dispersion (SAD) and single isomorphous replacement with anomalous signal (SIRAS), taking advantage of radiation-induced changes in Cys-Hg bonding. Importantly, Hg labelling influenced neither the interaction of Nb36 with its antigen complement C5 nor its...

  15. Dependency on de novo protein synthesis and proteomic changes during metamorphosis of the marine bryozoan Bugula neritina

    KAUST Repository

    Wong, Yue Him; Arellano, Shawn M; Zhang, Huoming; Ravasi, Timothy; Qian, Pei-Yuan

    2010-01-01

    synthesis of proteins and, instead, involves post-translational modifications of existing proteins, providing a simple mechanism to quickly initiate metamorphosis. To test our hypothesis, we challenged B. neritina larvae with transcription and translation

  16. Algorithms for Protein Structure Prediction

    DEFF Research Database (Denmark)

    Paluszewski, Martin

    -trace. Here we present three different approaches for reconstruction of C-traces from predictable measures. In our first approach [63, 62], the C-trace is positioned on a lattice and a tabu-search algorithm is applied to find minimum energy structures. The energy function is based on half-sphere-exposure (HSE......) is more robust than standard Monte Carlo search. In the second approach for reconstruction of C-traces, an exact branch and bound algorithm has been developed [67, 65]. The model is discrete and makes use of secondary structure predictions, HSE, CN and radius of gyration. We show how to compute good lower...... bounds for partial structures very fast. Using these lower bounds, we are able to find global minimum structures in a huge conformational space in reasonable time. We show that many of these global minimum structures are of good quality compared to the native structure. Our branch and bound algorithm...

  17. Structural symmetry and protein function.

    Science.gov (United States)

    Goodsell, D S; Olson, A J

    2000-01-01

    The majority of soluble and membrane-bound proteins in modern cells are symmetrical oligomeric complexes with two or more subunits. The evolutionary selection of symmetrical oligomeric complexes is driven by functional, genetic, and physicochemical needs. Large proteins are selected for specific morphological functions, such as formation of rings, containers, and filaments, and for cooperative functions, such as allosteric regulation and multivalent binding. Large proteins are also more stable against denaturation and have a reduced surface area exposed to solvent when compared with many individual, smaller proteins. Large proteins are constructed as oligomers for reasons of error control in synthesis, coding efficiency, and regulation of assembly. Symmetrical oligomers are favored because of stability and finite control of assembly. Several functions limit symmetry, such as interaction with DNA or membranes, and directional motion. Symmetry is broken or modified in many forms: quasisymmetry, in which identical subunits adopt similar but different conformations; pleomorphism, in which identical subunits form different complexes; pseudosymmetry, in which different molecules form approximately symmetrical complexes; and symmetry mismatch, in which oligomers of different symmetries interact along their respective symmetry axes. Asymmetry is also observed at several levels. Nearly all complexes show local asymmetry at the level of side chain conformation. Several complexes have reciprocating mechanisms in which the complex is asymmetric, but, over time, all subunits cycle through the same set of conformations. Global asymmetry is only rarely observed. Evolution of oligomeric complexes may favor the formation of dimers over complexes with higher cyclic symmetry, through a mechanism of prepositioned pairs of interacting residues. 
    However, examples have been found for all of the crystallographic point groups, demonstrating that functional need can drive the evolution of

  18. Efficient protein structure search using indexing methods.

    Science.gov (United States)

    Kim, Sungchul; Sael, Lee; Yu, Hwanjo

    2013-01-01

    Understanding functions of proteins is one of the most important challenges in many studies of biological processes. The function of a protein can be predicted by analyzing the functions of structurally similar proteins, so finding structurally similar proteins accurately and efficiently in a large set of proteins is crucial. A protein structure can be represented as a vector by the 3D-Zernike Descriptor (3DZD), which compactly represents the surface shape of the protein tertiary structure. This simplified representation accelerates the search process. However, computing the similarity of two protein structures is still computationally expensive, so it is hard to process many simultaneous requests for structurally similar proteins efficiently. This paper proposes indexing techniques that substantially reduce the search time needed to find structurally similar proteins. In particular, we first exploit two indexing techniques, iDistance and iKernel, on the 3DZDs. We then extend the techniques to further improve the search speed for protein structures. The extended indexing techniques build and use a reduced index constructed from the first few attributes of the 3DZDs of protein structures. To retrieve the top-k similar structures, the top-10 × k similar structures are first found using the reduced index, and the top-k structures are selected among them. We also modify the indexing techniques to support θ-based nearest neighbor search, which returns data points whose distance to the query point is less than θ. The results show that both iDistance and iKernel significantly enhance the search speed. In top-k nearest neighbor search, the search time is reduced by 69.6%, 77%, 77.4% and 87.9% using iDistance, iKernel, the extended iDistance, and the extended iKernel, respectively. In θ-based nearest neighbor search, the search time is reduced by 80%, 81%, 95.6% and 95.6% using iDistance, iKernel, the extended iDistance, and the extended iKernel, respectively.
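The two-stage top-k retrieval described in this abstract (filter on a reduced representation, then refine on the full descriptor) can be sketched roughly as follows. This is only an illustration under stated assumptions: plain Python lists stand in for 3DZD vectors, a linear scan stands in for the iDistance and iKernel indexes (which are not implemented here), and the function names `top_k_filter_refine` and `range_search` are hypothetical.

```python
import heapq
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def top_k_filter_refine(query, descriptors, k, prefix_len, factor=10):
    """Approximate top-k search: filter on a descriptor prefix, refine on full vectors.

    descriptors: list of equal-length vectors (stand-ins for 3DZDs; assumption).
    prefix_len:  number of leading attributes used in the cheap first pass.
    factor:      candidate multiplier; the abstract describes top-10 x k.
    """
    q_prefix = query[:prefix_len]
    # Filter: cheap distance on the first few attributes only.
    # A linear scan is used here in place of a real reduced index.
    candidates = heapq.nsmallest(
        factor * k,
        range(len(descriptors)),
        key=lambda i: euclidean(q_prefix, descriptors[i][:prefix_len]),
    )
    # Refine: full-length distance on the surviving candidates only.
    return heapq.nsmallest(
        k, candidates, key=lambda i: euclidean(query, descriptors[i])
    )


def range_search(query, descriptors, theta):
    """Theta-based search: indices of descriptors closer than theta to the query."""
    return [i for i, d in enumerate(descriptors) if euclidean(query, d) < theta]
```

Because the first pass looks only at a prefix of each vector, the result is approximate: a true neighbor can be lost if it falls outside the candidate set, which is presumably why a generous candidate multiplier is used.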

  19. Protein structure: geometry, topology and classification

    Energy Technology Data Exchange (ETDEWEB)

    Taylor, William R.; May, Alex C.W.; Brown, Nigel P.; Aszodi, Andras [Division of Mathematical Biology, National Institute for Medical Research, London (United Kingdom)

    2001-04-01

    The structural principles of proteins are reviewed and analysed from a geometric perspective with a view to revealing the underlying regularities in their construction. Computer methods for the automatic comparison and classification of these structures are then reviewed with an analysis of the statistical significance of comparing different shapes. Following an analysis of the current state of the classification of proteins, more abstract geometric and topological representations are explored, including the occurrence of knotted topologies. The review concludes with a consideration of the origin of higher-level symmetries in protein structure. (author)

  20. Taking advantage of local structure descriptors to analyze interresidue contacts in protein structures and protein complexes.

    Science.gov (United States)

    Martin, Juliette; Regad, Leslie; Etchebest, Catherine; Camproux, Anne-Claude

    2008-11-15

    Interresidue contacts in protein structures and at protein-protein interfaces are classically described by the amino acid types of the interacting residues; the local structural context of the contact, if described at all, is given in terms of secondary structure. In this study, we present an alternative analysis of interresidue contacts using local structures defined by the structural alphabet introduced by Camproux et al. This structural alphabet allows a 3D structure to be described as a sequence of prototype fragments, called structural letters, of 27 different types. Each residue can then be assigned to a particular local structure, even in loop regions. The analysis of interresidue contacts within protein structures defined using Voronoï tessellations reveals that pairwise contact specificity is greater in terms of structural letters than amino acids. Using a simple heuristic based on specificity score comparison, we find that 74% of the long-range contacts within protein structures are better described using structural letters than amino acid types. The investigation is extended to a set of protein-protein complexes, showing that similar global rules apply as for intraprotein contacts, with 64% of the interprotein contacts best described by local structures. We then present an evaluation of pairing functions that integrate structural letters into decoy scoring and show that some complexes could benefit from the use of structural letter-based pairing functions.

  1. Fast loop modeling for protein structures

    Science.gov (United States)

    Zhang, Jiong; Nguyen, Son; Shang, Yi; Xu, Dong; Kosztin, Ioan

    2015-03-01

    X-ray crystallography is the main method for determining 3D protein structures. In many cases, however, flexible loop regions of proteins cannot be resolved by this approach. This leads to incomplete structures in the Protein Data Bank, preventing further computational study and analysis of these proteins. For instance, all-atom molecular dynamics (MD) simulation studies of structure-function relationships require complete protein structures. To address this shortcoming, we have developed and implemented an efficient computational method for building missing protein loops. The method is database driven and uses deep learning and multi-dimensional scaling algorithms. We have implemented the method as a simple stand-alone program, which can also be used as a plugin in existing molecular modeling software, e.g., VMD. The quality and stability of the generated structures are assessed and tested via energy scoring functions and by equilibrium MD simulations. The proposed method can also be used in template-based protein structure prediction. Work supported by the National Institutes of Health [R01 GM100701]. Computer time was provided by the University of Missouri Bioinformatics Consortium.

  2. Simultaneous determination of protein structure and dynamics

    DEFF Research Database (Denmark)

    Lindorff-Larsen, Kresten; Best, Robert B.; DePristo, M. A.

    2005-01-01

    We present a protocol for the experimental determination of ensembles of protein conformations that represent simultaneously the native structure and its associated dynamics. The procedure combines the strengths of nuclear magnetic resonance spectroscopy (for obtaining experimental information at the atomic level about the structural and dynamical features of proteins) with the ability of molecular dynamics simulations to explore a wide range of protein conformations. We illustrate the method for human ubiquitin in solution and find that there is considerable conformational heterogeneity throughout the protein structure. The interior atoms of the protein are tightly packed in each individual conformation that contributes to the ensemble but their overall behaviour can be described as having a significant degree of liquid-like character. The protocol is completely general and should lead to significant...

  3. Protein Molecular Structures, Protein SubFractions, and Protein Availability Affected by Heat Processing: A Review

    International Nuclear Information System (INIS)

    Yu, P.

    2007-01-01

    The utilization and availability of protein depended on the types of protein and their specific susceptibility to enzymatic hydrolysis (inhibitory activities) in the gastrointestinal tract and was highly associated with protein molecular structures. Studying internal protein structure and protein subfraction profiles led to an understanding of the components that make up a whole protein. An understanding of the molecular structure of the whole protein was often vital to understanding its digestive behavior and nutritive value in animals. In this review, recently obtained information on protein molecular structural effects of heat processing was reviewed, in relation to protein characteristics affecting digestive behavior and nutrient utilization and availability. The emphasis of this review was on (1) using the newly advanced synchrotron technology (S-FTIR) as a novel approach to reveal protein molecular chemistry affected by heat processing within intact plant tissues; (2) revealing the effects of heat processing on the profile changes of protein subfractions associated with digestive behaviors and kinetics manipulated by heat processing; (3) predicting the changes in protein availability and supply after heat processing, using the advanced DVE/OEB and NRC-2001 models; and (4) obtaining information on optimal processing conditions of protein as an intestinal protein source to achieve target values for potential high net absorbable protein in the small intestine. The information described in this article may give better insight into the mechanisms involved and the intrinsic protein molecular structural changes occurring upon processing.

  4. Human cancer protein-protein interaction network: a structural perspective.

    Directory of Open Access Journals (Sweden)

    Gozde Kar

    2009-12-01

    Protein-protein interaction networks provide a global picture of cellular function and biological processes. Some proteins act as hub proteins, highly connected to others, whereas some others have few interactions. The dysfunction of some interactions causes many diseases, including cancer. Proteins interact through their interfaces. Therefore, studying the interface properties of cancer-related proteins will help explain their role in the interaction networks. Similar or overlapping binding sites should be used repeatedly in single interface hub proteins, making them promiscuous. Alternatively, multi-interface hub proteins make use of several distinct binding sites to bind to different partners. We propose a methodology to integrate protein interfaces into cancer interaction networks (ciSPIN, cancer structural protein interface network). The interactions in the human protein interaction network are replaced by interfaces, coming from either known or predicted complexes. We provide a detailed analysis of cancer related human protein-protein interfaces and the topological properties of the cancer network. The results reveal that cancer-related proteins have smaller, more planar, more charged and less hydrophobic binding sites than non-cancer proteins, which may indicate low affinity and high specificity of the cancer-related interactions. We also classified the genes in ciSPIN according to phenotypes. Within phenotypes, for breast cancer, colorectal cancer and leukemia, interface properties were found to be discriminating from non-cancer interfaces with an accuracy of 71%, 67%, 61%, respectively. In addition, cancer-related proteins tend to interact with their partners through distinct interfaces, corresponding mostly to multi-interface hubs, which comprise 56% of cancer-related proteins, and constituting the nodes with higher essentiality in the network (76%). We illustrate the interface related affinity properties of two cancer-related hub

  5. Protein Structure and the Sequential Structure of mRNA

    DEFF Research Database (Denmark)

    Brunak, Søren; Engelbrecht, Jacob

    1996-01-01

    A direct comparison of experimentally determined protein structures and their corresponding protein coding mRNA sequences has been performed. We examine whether real world data support the hypothesis that clusters of rare codons correlate with the location of structural units in the resulting protein. The degeneracy of the genetic code allows for a biased selection of codons which may control the translational rate of the ribosome, and may thus in vivo have a catalyzing effect on the folding of the polypeptide chain. A complete search for GenBank nucleotide sequences coding for structural entries in the Brookhaven Protein Data Bank produced 719 protein chains with matching mRNA sequence, amino acid sequence, and secondary structure assignment. By neural network analysis, we found strong signals in mRNA sequence regions surrounding helices and sheets. These signals do not originate from...
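As a rough illustration of what "clusters of rare codons" could mean operationally, the sketch below scans a coding sequence with a sliding window and reports windows that contain several rare codons. It is not the neural network analysis used in the study. The codon set, the window size, and the threshold are placeholder assumptions chosen for the example; a real analysis would derive rarity from organism-specific codon usage tables.

```python
# Placeholder set of "rare" codons, for illustration only (assumption).
RARE_CODONS = {"AGG", "AGA", "CGA", "CUA", "AUA", "CCC"}


def codons(mrna):
    """Split an in-frame coding sequence into codons, dropping a trailing partial codon."""
    usable = len(mrna) - len(mrna) % 3
    return [mrna[i:i + 3] for i in range(0, usable, 3)]


def rare_codon_clusters(mrna, window=5, min_rare=3):
    """Return start positions (in codons) of windows with at least min_rare rare codons."""
    # Accept DNA or RNA letters by normalising T to U.
    flags = [c in RARE_CODONS for c in codons(mrna.upper().replace("T", "U"))]
    return [
        start
        for start in range(len(flags) - window + 1)
        if sum(flags[start:start + window]) >= min_rare
    ]
```

The reported codon positions could then be compared with the residue ranges of helices and sheets in the matching structure to look for the kind of correlation the hypothesis proposes.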

  6. Protein structure database search and evolutionary classification.

    Science.gov (United States)

    Yang, Jinn-Moon; Tung, Chi-Hua

    2006-01-01

    As more protein structures become available and structural genomics efforts provide structural models in a genome-wide strategy, there is a growing need for fast and accurate methods for discovering homologous proteins and evolutionary classifications of newly determined structures. We have developed 3D-BLAST, in part, to address these issues. 3D-BLAST is as fast as BLAST and calculates the statistical significance (E-value) of an alignment to indicate the reliability of the prediction. Using this method, we first identified 23 states of the structural alphabet that represent pattern profiles of the backbone fragments and then used them to represent protein structure databases as structural alphabet sequence databases (SADB). Our method enhanced BLAST as a search method, using a new structural alphabet substitution matrix (SASM) to find the longest common substructures with high-scoring structured segment pairs from an SADB database. Using personal computers with Intel Pentium4 (2.8 GHz) processors, our method searched more than 10 000 protein structures in 1.3 s and achieved a good agreement with search results from detailed structure alignment methods. [3D-BLAST is available at http://3d-blast.life.nctu.edu.tw].

  7. Modeling protein structures: construction and their applications.

    Science.gov (United States)

    Ring, C S; Cohen, F E

    1993-06-01

    Although no general solution to the protein folding problem exists, the three-dimensional structures of proteins are being successfully predicted when experimentally derived constraints are used in conjunction with heuristic methods. In the case of interleukin-4, mutagenesis data and CD spectroscopy were instrumental in the accurate assignment of secondary structure. In addition, the tertiary structure was highly constrained by six cysteines separated by many residues that formed three disulfide bridges. Although the correct structure was a member of a short list of plausible structures, the "best" structure was the topological enantiomer of the experimentally determined conformation. For many proteases, other experimentally derived structures can be used as templates to identify the secondary structure elements. In a procedure called modeling by homology, the structure of a known protein is used as a scaffold to predict the structure of another related protein. This method has been used to model a serine and a cysteine protease that are important in the schistosome and malarial life cycles, respectively. The model structures were then used to identify putative small molecule enzyme inhibitors computationally. Experiments confirm that some of these nonpeptidic compounds are active at concentrations of less than 10 microM.

  8. Symptomatic type 1 protein C deficiency caused by a de novo Ser270Leu mutation in the catalytic domain

    DEFF Research Database (Denmark)

    Lind, B; Koefoed, P; Thorsen, S

    2001-01-01

    Heterozygosity for a C8524T transition in the protein C gene converting Ser270(TCG) to Leu(TTG) in the protease domain was identified in a family with venous thrombosis. The mutation was associated with parallel reduction in plasma levels of protein C anticoagulant activity and protein C antigen, which is consistent with a type 1 deficiency. Transient expression of mutant protein C cDNA in human kidney 293 cells and analysis of protein C antigen in culture media and cell lysates showed that the secretion of mutant protein compared with wild-type protein was reduced by at least 97% while the intracellular content of mutant and wild-type protein was similar. Northern blot analysis of total mRNA from transfected cells showed no reduction of the mutant protein C mRNA compared with wild-type protein C mRNA. Collectively, these results indicate that the Ser270Leu mutation in the affected family caused...

  9. Proteins with Novel Structure, Function and Dynamics

    Science.gov (United States)

    Pohorille, Andrew

    2014-01-01

    Recently, a small enzyme that ligates two RNA fragments at a rate 10⁶ times above background was evolved in vitro (Seelig and Szostak, Nature 448:828-831, 2007). This enzyme does not resemble any contemporary protein (Chao et al., Nature Chem. Biol. 9:81-83, 2013). It consists of a dynamic, catalytic loop, a small, rigid core containing two zinc ions coordinated by neighboring amino acids, and two highly flexible tails that might be unimportant for protein function. In contrast to other proteins, this enzyme does not contain ordered secondary structure elements, such as alpha-helix or beta-sheet. The loop is kept together by just two interactions of a charged residue and a histidine with a zinc ion, which they coordinate on the opposite side of the loop. Such a structure appears to be very fragile. Surprisingly, computer simulations indicate otherwise. When the coordinating, charged residue is mutated to alanine, another, nearby charged residue takes its place, thus keeping the structure nearly intact. If this residue is also substituted by alanine, a salt bridge involving two other charged residues on opposite sides of the loop keeps the loop in place. These adjustments are facilitated by the high flexibility of the protein. Computational predictions have been confirmed experimentally, as both mutants retain full activity and overall structure. These results challenge our notions about what is required for protein activity and about the relationship between protein dynamics, stability and robustness. We hypothesize that small, highly dynamic proteins could be both active and fault tolerant in ways that many other proteins are not, i.e. they can adjust to retain their structure and activity even if subjected to mutations in structurally critical regions. This opens the door to designing proteins with novel functions, structures and dynamics that have not yet been considered.

  10. Overcoming barriers to membrane protein structure determination

    NARCIS (Netherlands)

    Bill, Roslyn M.; Henderson, Peter J. F.; Iwata, So; Kunji, Edmund R. S.; Michel, Hartmut; Neutze, Richard; Newstead, Simon; Poolman, Bert; Tate, Christopher G.; Vogel, Horst

    After decades of slow progress, the pace of research on membrane protein structures is beginning to quicken thanks to various improvements in technology, including protein engineering and microfocus X-ray diffraction. Here we review these developments and, where possible, highlight generic new

  11. Protein structural similarity search by Ramachandran codes

    Directory of Open Access Journals (Sweden)

    Chang Chih-Hung

    2007-08-01

    Background: Protein structural data has increased exponentially, such that fast and accurate tools are necessary for structure similarity search. To improve the search speed, several methods have been designed to reduce three-dimensional protein structures to one-dimensional text strings that are then analyzed by traditional sequence alignment methods; however, the accuracy is usually sacrificed and the speed is still unable to match sequence similarity search tools. Here, we aimed to improve the linear encoding methodology and develop efficient search tools that can rapidly retrieve structural homologs from large protein databases. Results: We propose a new linear encoding method, SARST (Structural similarity search Aided by Ramachandran Sequential Transformation). SARST transforms protein structures into text strings through a Ramachandran map organized by nearest-neighbor clustering and uses a regenerative approach to produce substitution matrices. Then, classical sequence similarity search methods can be applied to the structural similarity search. Its accuracy is similar to that of Combinatorial Extension (CE), and it works over 243,000 times faster, searching 34,000 proteins in 0.34 sec with a 3.2-GHz CPU. SARST provides statistically meaningful expectation values to assess the retrieved information. It has been implemented as a web service and a stand-alone Java program that is able to run on many different platforms. Conclusion: As a database search method, SARST can rapidly distinguish high from low similarities and efficiently retrieve homologous structures. It demonstrates that the easily accessible linear encoding methodology has the potential to serve as a foundation for efficient protein structural similarity search tools. These search tools are expected to be applicable to automated and high-throughput functional annotations or predictions for the ever increasing number of published protein structures in this post-genomic era.
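The core idea of linear encoding, turning backbone torsion angles into a text string that ordinary sequence tools can search, can be sketched as below. This is a simplified stand-in and not the SARST encoding itself: SARST organizes the Ramachandran map by nearest-neighbor clustering, whereas this sketch uses a uniform grid. The 72-degree cell width, the 25-letter alphabet, and the function names are all assumptions made for the example.

```python
import string

BIN_DEG = 72                  # hypothetical cell width: 360 / 72 = 5 bins per axis
BINS = 360 // BIN_DEG         # 5 x 5 = 25 cells on the (phi, psi) map
ALPHABET = string.ascii_uppercase[: BINS * BINS]  # one letter per cell, 'A'..'Y'


def encode_residue(phi, psi):
    """Map one (phi, psi) pair in degrees, each in [-180, 180], to a letter."""
    # The modulo wraps +180 onto the same cell as -180 (the angles are periodic).
    col = int((phi + 180) // BIN_DEG) % BINS
    row = int((psi + 180) // BIN_DEG) % BINS
    return ALPHABET[row * BINS + col]


def encode_structure(torsions):
    """Encode a list of per-residue (phi, psi) pairs as a one-dimensional string."""
    return "".join(encode_residue(phi, psi) for phi, psi in torsions)
```

Strings produced this way could be handed to a sequence similarity search tool, although meaningful scores would also require a substitution matrix suited to the structural letters, which this sketch does not attempt to build.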

  12. A 'periodic table' for protein structures.

    Science.gov (United States)

    Taylor, William R

    2002-04-11

    Current structural genomics programs aim systematically to determine the structures of all proteins coded in both human and other genomes, providing a complete picture of the number and variety of protein structures that exist. In the past, estimates have been made on the basis of the incomplete sample of structures currently known. These estimates have varied greatly (between 1,000 and 10,000; see for example refs 1 and 2), partly because of limited sample size but also owing to the difficulties of distinguishing one structure from another. This distinction is usually topological, based on the fold of the protein; however, in strict topological terms (neglecting to consider intra-chain cross-links), protein chains are open strings and hence are all identical. To avoid this trivial result, topologies are determined by considering secondary links in the form of intra-chain hydrogen bonds (secondary structure) and tertiary links formed by the packing of secondary structures. However, small additions to or loss of structure can make large changes to these perceived topologies and such subjective solutions are neither robust nor amenable to automation. Here I formalize both secondary and tertiary links to allow the rigorous and automatic definition of protein topology.

  13. DeNovoGUI: an open source graphical user interface for de novo sequencing of tandem mass spectra.

    Science.gov (United States)

    Muth, Thilo; Weilnböck, Lisa; Rapp, Erdmann; Huber, Christian G; Martens, Lennart; Vaudel, Marc; Barsnes, Harald

    2014-02-07

    De novo sequencing is a popular technique in proteomics for identifying peptides from tandem mass spectra without having to rely on a protein sequence database. Despite the strong potential of de novo sequencing algorithms, their adoption threshold remains quite high. We here present a user-friendly and lightweight graphical user interface called DeNovoGUI for running parallelized versions of the freely available de novo sequencing software PepNovo+, greatly simplifying the use of de novo sequencing in proteomics. Our platform-independent software is freely available under the permissive Apache2 open source license. Source code, binaries, and additional documentation are available at http://denovogui.googlecode.com.

  14. De Novo Mutations in Protein Kinase Genes CAMK2A and CAMK2B Cause Intellectual Disability

    NARCIS (Netherlands)

    Küry, Sébastien; van Woerden, Geeske M; Besnard, Thomas; Proietti Onori, Martina; Latypova, Xénia; Towne, Meghan C; Cho, Megan T.; Prescott, Trine E; Ploeg, Melissa A; Sanders, Jan-Stephan; Stessman, Holly A F; Pujol, Aurora; Distel, Ben; Robak, Laurie A; Bernstein, Jonathan A; Denommé-Pichon, Anne-Sophie; Lesca, Gaëtan; Sellars, Elizabeth A; Berg, Jonathan; Carré, Wilfrid; Busk, Øyvind Løvold; van Bon, Bregje W M; Waugh, Jeff L; Deardorff, Matthew; Hoganson, George E; Bosanko, Katherine B; Johnson, Diana S; Dabir, Tabib; Holla, Øystein Lunde; Sarkar, Ajoy; Tveten, Kristian; de Bellescize, Julitta; Braathen, Geir J; Terhal, Paulien A; Grange, Dorothy K; van Haeringen, Arie; Lam, Christina; Mirzaa, Ghayda; Burton, Jennifer; Bhoj, Elizabeth J.; Douglas, Jessica; Santani, Avni B; Nesbitt, Addie I; Helbig, Katherine L; Andrews, Marisa V; Begtrup, Amber; Tang, Sha; van Gassen, Koen L I; Juusola, Jane; Foss, Kimberly; Enns, Gregory M; Moog, Ute; Hinderhofer, Katrin; Paramasivam, Nagarajan; Lincoln, Sharyn; Kusako, Brandon H; Lindenbaum, Pierre; Charpentier, Eric; Nowak, Catherine B; Cherot, Elouan; Simonet, Thomas; Ruivenkamp, Claudia A L; Hahn, Sihoun; Brownstein, Catherine A; Xia, Fan; Schmitt, Sébastien; Deb, Wallid; Bonneau, Dominique; Nizon, Mathilde; Quinquis, Delphine; Chelly, Jamel; Rudolf, Gabrielle; Sanlaville, Damien; Parent, Philippe; Gilbert-Dussardier, Brigitte; Toutain, Annick; Sutton, Vernon R; Thies, Jenny; Peart-Vissers, Lisenka E L M; Boisseau, Pierre; Vincent, Marie; Grabrucker, Andreas M; Dubourg, Christèle; Tan, Wen-Hann; Verbeek, Nienke E; Granzow, Martin; Santen, Gijs W E; Shendure, Jay; Isidor, Bertrand; Pasquier, Laurent; Redon, Richard; Yang, Yaping; State, Matthew W; Kleefstra, Tjitske; Cogné, Benjamin; Petrovski, Slavé; Retterer, Kyle; Eichler, Evan E.; Rosenfeld, Jill A; Agrawal, Pankaj B; Bézieau, Stéphane; Odent, Sylvie; Elgersma, Ype; Mercier, Sandra

    2017-01-01

    Calcium/calmodulin-dependent protein kinase II (CAMK2) is one of the first proteins shown to be essential for normal learning and synaptic plasticity in mice, but its requirement for human brain development has not yet been established. Through a multi-center collaborative study based on a

  15. Tryptophan tags and de novo designed complementary affinity ligands for the expression and purification of recombinant proteins.

    Science.gov (United States)

    Pina, Ana Sofia; Carvalho, Sara; Dias, Ana Margarida G C; Guilherme, Márcia; Pereira, Alice S; Caraça, Luciana T; Coroadinha, Ana Sofia; Lowe, Christopher R; Roque, A Cecília A

    2016-11-11

    A common strategy for the production and purification of recombinant proteins is to fuse a tag to the protein terminal residues and employ a "tag-specific" ligand for fusion protein capture and purification. In this work, we explored the effect of two tryptophan-based tags, NWNWNW and WFWFWF, on the expression and purification of Green Fluorescent Protein (GFP) used as a model fusion protein. The titers obtained with the expression of these fusion proteins in soluble form were 0.11 mg ml⁻¹ and 0.48 mg ml⁻¹ for WFWFWF and NWNWNW, respectively. A combinatorial library comprising 64 ligands based on the Ugi reaction was prepared and screened for binding GFP-tagged and non-tagged proteins. Complementary ligands A2C2 and A3C1 were selected for the effective capture of NWNWNW and WFWFWF tagged proteins, respectively, in soluble forms. These affinity pairs displayed 10⁶ M⁻¹ affinity constants and Qmax values of 19.11 ± 2.60 µg g⁻¹ and 79.39 µg g⁻¹ for the systems WFWFWF and NWNWNW, respectively. GFP fused to the WFWFWF affinity tag was also produced as inclusion bodies, and an on-column refolding strategy was explored using the ligand A4C8, selected from the combinatorial library of ligands but in the presence of denaturant agents. Copyright © 2016 Elsevier B.V. All rights reserved.

  16. RosettaTMH: a method for membrane protein structure elucidation combining EPR distance restraints with assembly of transmembrane helices

    Directory of Open Access Journals (Sweden)

    Andrew Leaver-Fay

    2015-12-01

    Membrane proteins make up approximately one third of all proteins, and they play key roles in a plethora of physiological processes. However, membrane proteins make up less than 2% of experimentally determined structures, despite significant advances in structure determination methods, such as X-ray crystallography, nuclear magnetic resonance spectroscopy, and cryo-electron microscopy. One potential alternative means of structure elucidation is to combine computational methods with experimental EPR data. In 2011, Hirst and others introduced RosettaEPR and demonstrated that this approach could be successfully applied to fold soluble proteins. Furthermore, few computational methods for de novo folding of integral membrane proteins have been presented. In this work, we present RosettaTMH, a novel algorithm for structure prediction of helical membrane proteins. A benchmark set of 34 proteins, in which the proteins ranged in size from 91 to 565 residues, was used to compare RosettaTMH to Rosetta’s two existing membrane protein folding protocols: the published RosettaMembrane folding protocol (“MembraneAbinitio”) and folding from an extended chain (“ExtendedChain”). When EPR distance restraints are used, RosettaTMH+EPR outperforms ExtendedChain+EPR for 11 proteins, including the largest six proteins tested. RosettaTMH+EPR is capable of achieving native-like folds for 30 of 34 proteins tested, including receptors and transporters. For example, the average RMSD100SSE relative to the crystal structure for rhodopsin was 6.1 ± 0.4 Å and 6.5 ± 0.6 Å for the 449-residue nitric oxide reductase subunit B, where the standard deviation reflects variance in RMSD100SSE values across ten different EPR distance restraint sets. The addition of RosettaTMH and RosettaTMH+EPR to the Rosetta family of de novo folding methods broadens the scope of helical membrane proteins that can be accurately modeled with this software suite.
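
    The abstract above reports size-normalized RMSD values (RMSD100SSE) without defining them. As a minimal sketch, assuming the metric follows the commonly used length normalization RMSD100 = RMSD / (1 + ln sqrt(N/100)), the calculation could look as follows. The function name and the example numbers are hypothetical, and the restriction to secondary structure elements implied by the "SSE" suffix is not modeled here.

```python
import math


def rmsd100(rmsd, n_residues):
    """Normalize an RMSD (in angstroms) to a reference length of 100 residues.

    Assumed formula: RMSD100 = RMSD / (1 + ln(sqrt(N / 100))).
    This is an illustrative assumption, not taken from the abstract.
    """
    if n_residues <= 0:
        raise ValueError("n_residues must be positive")
    denominator = 1.0 + math.log(math.sqrt(n_residues / 100.0))
    if denominator <= 0.0:
        # The normalization breaks down for very short chains (about 13 residues or fewer).
        raise ValueError("chain too short for this normalization")
    return rmsd / denominator


# Hypothetical example: a raw RMSD of 8.0 angstroms over 400 aligned residues.
normalized = rmsd100(8.0, 400)
print(f"RMSD100 = {normalized:.2f} A")
```

    Under this assumption a 100-residue protein keeps its raw RMSD, while larger proteins are scaled down, which is what makes values comparable across a benchmark spanning 91 to 565 residues.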

  17. Structural analysis of recombinant human protein QM

    International Nuclear Information System (INIS)

    Gualberto, D.C.H.; Fernandes, J.L.; Silva, F.S.; Saraiva, K.W.; Affonso, R.; Pereira, L.M.; Silva, I.D.C.G.

    2012-01-01

    The ribosomal protein QM belongs to a family of ribosomal proteins that is highly conserved from yeast to humans. The presence of the QM protein is necessary for joining the 60S and 40S subunits in a late step of the initiation of mRNA translation. Although the exact extra-ribosomal functions of QM are not yet fully understood, it has been identified as a putative tumor suppressor. This protein was reported to interact with the transcription factor c-Jun and thereby prevent c-Jun from activating genes involved in cellular growth. In this study, the human QM protein was expressed in a bacterial system in soluble form, and its structure was analyzed by circular dichroism and fluorescence. The circular dichroism results showed that this protein has less alpha helix than beta sheet, as described in the literature. The QM protein does not contain a leucine zipper region; however, the zinc ion is necessary for binding of QM to c-Jun. We therefore analyzed the relationship between the removal of zinc ions and the folding of the protein. Preliminary fluorescence results showed a gradual increase in fluorescence with increasing concentrations of EDTA. This suggests that zinc is important for the tertiary structure of the protein. Further studies are under way to better understand these results. (author)

  18. Against the odds? De novo structure determination of a pilin with two cysteine residues by sulfur SAD.

    Science.gov (United States)

    Gorgel, Manuela; Bøggild, Andreas; Ulstrup, Jakob Jensen; Weiss, Manfred S; Müller, Uwe; Nissen, Poul; Boesen, Thomas

    2015-05-01

    Exploiting the anomalous signal of the intrinsic S atoms to phase a protein structure is advantageous, as ideally only a single well diffracting native crystal is required. However, sulfur is a weak anomalous scatterer at the typical wavelengths used for X-ray diffraction experiments, and therefore sulfur SAD data sets need to be recorded with a high multiplicity. In this study, the structure of a small pilin protein was determined by sulfur SAD despite several obstacles such as a low anomalous signal (a theoretical Bijvoet ratio of 0.9% at a wavelength of 1.8 Å), radiation damage-induced reduction of the cysteines and a multiplicity of only 5.5. The anomalous signal was improved by merging three data sets from different volumes of a single crystal, yielding a multiplicity of 17.5, and a sodium ion was added to the substructure of anomalous scatterers. In general, all data sets were balanced around the threshold values for a successful phasing strategy. In addition, a collection of statistics on structures from the PDB that were solved by sulfur SAD are presented and compared with the data. Looking at the quality indicator R(anom)/R(p.i.m.), an inconsistency in the documentation of the anomalous R factor is noted and reported.

  19. Protein Structure Determination Using Chemical Shifts

    DEFF Research Database (Denmark)

    Christensen, Anders Steen

    is determined using only chemical shifts recorded and assigned through automated processes. The Cα-RMSD to the experimental X-ray for this structure is 1.1 Å. Additionally, the method is combined with very sparse NOE-restraints and evolutionary distance restraints and tested on several protein structures >100...

  20. On characterization of anisotropic plant protein structures

    NARCIS (Netherlands)

    Krintiras, G.A.; Göbel, J.; Bouwman, W.G.; Goot, van der A.J.; Stefanidis, G.D.

    2014-01-01

    In this paper, a set of complementary techniques was used to characterize surface and bulk structures of an anisotropic Soy Protein Isolate (SPI)–vital wheat gluten blend after it was subjected to heat and simple shear flow in a Couette Cell. The structured biopolymer blend can form a basis for a

  1. Hidden Structural Codes in Protein Intrinsic Disorder.

    Science.gov (United States)

    Borkosky, Silvia S; Camporeale, Gabriela; Chemes, Lucía B; Risso, Marikena; Noval, María Gabriela; Sánchez, Ignacio E; Alonso, Leonardo G; de Prat Gay, Gonzalo

    2017-10-17

    Intrinsic disorder is a major structural category in biology, accounting for more than 30% of coding regions across the domains of life, yet consists of conformational ensembles in equilibrium, a major challenge in protein chemistry. Anciently evolved papillomavirus genomes constitute an unparalleled case for sequence to structure-function correlation in cases in which there are no folded structures. E7, the major transforming oncoprotein of human papillomaviruses, is a paradigmatic example among the intrinsically disordered proteins. Analysis of a large number of sequences of the same viral protein allowed for the identification of a handful of residues with absolute conservation, scattered along the sequence of its N-terminal intrinsically disordered domain, which intriguingly are mostly leucine residues. Mutation of these led to a pronounced increase in both α-helix and β-sheet structural content, reflected by drastic effects on equilibrium propensities and oligomerization kinetics, and uncovers the existence of local structural elements that oppose canonical folding. These folding relays suggest the existence of yet undefined hidden structural codes behind intrinsic disorder in this model protein. Thus, evolution pinpoints conformational hot spots that could have not been identified by direct experimental methods for analyzing or perturbing the equilibrium of an intrinsically disordered protein ensemble.

  2. Protein Structure Recognition: From Eigenvector Analysis to Structural Threading Method

    Energy Technology Data Exchange (ETDEWEB)

    Cao, Haibo [Iowa State Univ., Ames, IA (United States)

    2003-01-01

    In this work, they try to understand the protein folding problem using pair-wise hydrophobic interaction as the dominant interaction for the protein folding process. They found a strong correlation between amino acid sequences and the corresponding native structure of the protein. Some applications of this correlation discussed in this dissertation include domain partition and a new structural threading method, as well as the performance of this method in the CASP5 competition. In the first part, they give a brief introduction to the protein folding problem. Some essential knowledge and progress from other research groups are discussed. This part includes discussions of interactions among amino acid residues, the lattice HP model, and the designability principle. In the second part, they try to establish the correlation between amino acid sequence and the corresponding native structure of the protein. This correlation was observed in the eigenvector study of the protein contact matrix. They believe the correlation is universal, thus it can be used in automatic partition of protein structures into folding domains. In the third part, they discuss a threading method based on the correlation between amino acid sequences and the dominant eigenvector of the structure contact matrix. A mathematically straightforward iteration scheme provides a self-consistent optimum global sequence-structure alignment. The computational efficiency of this method makes it possible to search whole protein structure databases for structural homology without relying on sequence similarity. The sensitivity and specificity of this method are discussed, along with a case of blind test prediction. In the appendix, they list the overall performance of this threading method in the CASP5 blind test in comparison with other existing approaches.
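
    The eigenvector analysis described in this abstract can be illustrated with a short sketch. The code below is not the dissertation's implementation: it assumes a binary contact matrix built from C-alpha coordinates with a hypothetical 8 Å cutoff, uses a toy 3 x 3 x 3 grid of points in place of a real structure, and uses the per-residue contact count as a stand-in for the sequence-derived hydrophobicity profile that the abstract correlates with the eigenvector.

```python
import itertools

import numpy as np


def contact_matrix(coords, cutoff=8.0):
    """Binary contact matrix: 1 where two residues lie closer than `cutoff` angstroms."""
    coords = np.asarray(coords, dtype=float)
    distances = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    contacts = (distances < cutoff).astype(float)
    np.fill_diagonal(contacts, 0.0)  # a residue is not in contact with itself
    return contacts


def dominant_eigenvector(matrix):
    """Unit eigenvector belonging to the largest eigenvalue of a symmetric matrix."""
    eigenvalues, eigenvectors = np.linalg.eigh(matrix)
    vector = eigenvectors[:, np.argmax(eigenvalues)]
    # Fix the arbitrary sign so that the components are predominantly positive.
    return vector if vector.sum() >= 0 else -vector


# Toy "structure": 27 points on a cubic grid with 5 angstrom spacing (illustrative only).
coords = np.array(list(itertools.product(range(3), repeat=3)), dtype=float) * 5.0
C = contact_matrix(coords)
v = dominant_eigenvector(C)

# Placeholder profile: contacts per residue. A real analysis would use a
# hydrophobicity profile computed from the amino acid sequence instead.
profile = C.sum(axis=1)
r = np.corrcoef(v, profile)[0, 1]
print(f"most buried position: {int(np.argmax(v))}, correlation with profile: {r:.3f}")
```

    In this toy case the largest eigenvector component falls on the central, most buried point, which is the kind of burial signal a threading score could compare against a sequence profile.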

  3. Protein structure recognition: From eigenvector analysis to structural threading method

    Science.gov (United States)

    Cao, Haibo

    In this work, we try to understand the protein folding problem using pair-wise hydrophobic interaction as the dominant interaction for the protein folding process. We found a strong correlation between amino acid sequence and the corresponding native structure of the protein. Some applications of this correlation discussed in this dissertation include domain partition and a new structural threading method, as well as the performance of this method in the CASP5 competition. In the first part, we give a brief introduction to the protein folding problem. Some essential knowledge and progress from other research groups are discussed. This part includes discussions of interactions among amino acid residues, the lattice HP model, and the designability principle. In the second part, we try to establish the correlation between amino acid sequence and the corresponding native structure of the protein. This correlation was observed in our eigenvector study of the protein contact matrix. We believe the correlation is universal, thus it can be used in automatic partition of protein structures into folding domains. In the third part, we discuss a threading method based on the correlation between amino acid sequence and the dominant eigenvector of the structure contact matrix. A mathematically straightforward iteration scheme provides a self-consistent optimum global sequence-structure alignment. The computational efficiency of this method makes it possible to search whole protein structure databases for structural homology without relying on sequence similarity. The sensitivity and specificity of this method are discussed, along with a case of blind test prediction. In the appendix, we list the overall performance of this threading method in the CASP5 blind test in comparison with other existing approaches.

  4. Protein Structure Recognition: From Eigenvector Analysis to Structural Threading Method

    International Nuclear Information System (INIS)

    Haibo Cao

    2003-01-01

    In this work, they try to understand the protein folding problem using pair-wise hydrophobic interaction as the dominant interaction for the protein folding process. They found a strong correlation between amino acid sequences and the corresponding native structure of the protein. Some applications of this correlation discussed in this dissertation include domain partition and a new structural threading method, as well as the performance of this method in the CASP5 competition. In the first part, they give a brief introduction to the protein folding problem. Some essential knowledge and progress from other research groups are discussed. This part includes discussions of interactions among amino acid residues, the lattice HP model, and the designability principle. In the second part, they try to establish the correlation between amino acid sequence and the corresponding native structure of the protein. This correlation was observed in the eigenvector study of the protein contact matrix. They believe the correlation is universal, thus it can be used in automatic partition of protein structures into folding domains. In the third part, they discuss a threading method based on the correlation between amino acid sequences and the dominant eigenvector of the structure contact matrix. A mathematically straightforward iteration scheme provides a self-consistent optimum global sequence-structure alignment. The computational efficiency of this method makes it possible to search whole protein structure databases for structural homology without relying on sequence similarity. The sensitivity and specificity of this method are discussed, along with a case of blind test prediction. In the appendix, they list the overall performance of this threading method in the CASP5 blind test in comparison with other existing approaches.

  5. Structure and non-structure of centrosomal proteins.

    Science.gov (United States)

    Dos Santos, Helena G; Abia, David; Janowski, Robert; Mortuza, Gulnahar; Bertero, Michela G; Boutin, Maïlys; Guarín, Nayibe; Méndez-Giraldez, Raúl; Nuñez, Alfonso; Pedrero, Juan G; Redondo, Pilar; Sanz, María; Speroni, Silvia; Teichert, Florian; Bruix, Marta; Carazo, José M; Gonzalez, Cayetano; Reina, José; Valpuesta, José M; Vernos, Isabelle; Zabala, Juan C; Montoya, Guillermo; Coll, Miquel; Bastolla, Ugo; Serrano, Luis

    2013-01-01

    Here we perform a large-scale study of the structural properties and the expression of proteins that constitute the human centrosome. Centrosomal proteins tend to be larger than generic human proteins (control set), since their genes contain on average more exons (20.3 versus 14.6). They are rich in predicted disordered regions, which cover 57% of their length, compared to 39% in the general human proteome. They also contain several regions that are dually predicted to be disordered and coiled-coil at the same time: 55 proteins (15%) contain disordered and coiled-coil fragments that cover more than 20% of their length. Helices prevail over strands in regions homologous to known structures (47% predicted helical residues against 17% predicted as strands), and even more in the whole centrosomal proteome (52% against 7%), while for control human proteins 34.5% of the residues are predicted as helical and 12.8% are predicted as strands. This difference is mainly due to residues predicted as disordered and helical (30% in centrosomal and 9.4% in control proteins), which may correspond to alpha-helix forming molecular recognition features (α-MoRFs). We performed expression assays for 120 full-length centrosomal proteins and 72 domain constructs that we have predicted to be globular. These full-length proteins are often insoluble: only 39 out of 120 expressed proteins (32%) and 19 out of 72 domains (26%) were soluble. We built or retrieved structural models for 277 out of 361 human proteins whose centrosomal localization has been experimentally verified. We could not find any suitable structural template with more than 20% sequence identity for 84 centrosomal proteins (23%), for which around 74% of the residues are predicted to be disordered or coiled-coils. The three-dimensional models that we built are available at http://ub.cbm.uam.es/centrosome/models/index.php.

  6. Pb(II) and Hg(II) binding to de novo designed proteins studied by $^{204m}$Pb- and $^{199m}$Hg-Perturbed Angular Correlation of γ-rays (PAC) spectroscopy: Clues to heavy metal toxicity

    CERN Multimedia

    2002-01-01

    De novo design of proteins combined with PAC spectroscopy offers a unique and powerful approach to the study of the fundamental chemistry of heavy metal-protein interactions, and thus of the mechanisms underlying heavy metal toxicity. In this project we focus on Pb(II) and Hg(II) binding to designed three-stranded coiled coil proteins with one or two binding sites, mimicking a variety of naturally occurring thiolate-rich metal ion binding sites in proteins. The $^{204m}$Pb- and $^{199m}$Hg-PAC experiments will complement data already recorded with EXAFS, NMR, UV-Vis and CD spectroscopies.

  7. Structural deformation upon protein-protein interaction: a structural alphabet approach.

    Science.gov (United States)

    Martin, Juliette; Regad, Leslie; Lecornet, Hélène; Camproux, Anne-Claude

    2008-02-28

    In a number of protein-protein complexes, the 3D structures of bound and unbound partners significantly differ, supporting the induced fit hypothesis for protein-protein binding. In this study, we explore the induced fit modifications on a set of 124 proteins available in both bound and unbound forms, in terms of local structure. The local structure is described using a structural alphabet of 27 structural letters that allows a detailed description of the backbone. Using a control set to distinguish induced fit from experimental error and natural protein flexibility, we show that the fraction of structural letters modified upon binding is significantly greater than in the control set (36% versus 28%). This proportion is even greater in the interface regions (41%). Interface regions preferentially involve coils. Our analysis further reveals that some structural letters in coil are not favored in the interface. We show that certain structural letters in coil are particularly subject to modifications at the interface, and that the severity of structural change also varies. This information is used to derive a structural letter substitution matrix that summarizes the local structural changes observed in our data set. We also illustrate the usefulness of our approach to identify common binding motifs in unrelated proteins. Our study provides qualitative information about induced fit. These results could be of help for flexible docking.
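
    The two quantities discussed in this abstract, the fraction of structural letters modified upon binding and a letter substitution table, can be sketched in a few lines. This is an illustration under stated assumptions rather than the authors' procedure: the letter strings are invented, only raw substitution counts are tallied (no normalization or log-odds scoring), and the two encodings are assumed to be already aligned position by position.

```python
from collections import Counter


def fraction_modified(unbound, bound):
    """Fraction of positions whose structural letter differs between the two forms."""
    if len(unbound) != len(bound):
        raise ValueError("encodings must have the same length")
    if not unbound:
        raise ValueError("encodings must not be empty")
    changed = sum(a != b for a, b in zip(unbound, bound))
    return changed / len(unbound)


def substitution_counts(pairs):
    """Count (unbound letter, bound letter) pairs over a set of aligned encodings."""
    counts = Counter()
    for unbound, bound in pairs:
        if len(unbound) != len(bound):
            raise ValueError("encodings must have the same length")
        counts.update(zip(unbound, bound))
    return counts


# Hypothetical structural-letter encodings of one protein in both forms.
unbound = "AABCCDDE"
bound = "AABDCDEE"

print(fraction_modified(unbound, bound))  # 2 of 8 positions differ
counts = substitution_counts([(unbound, bound)])
print(counts[("C", "D")], counts[("A", "A")])
```

    Running the same counting on a control set of repeated structures of one form would give the baseline against which the bound-versus-unbound fraction is compared.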

  8. Structural deformation upon protein-protein interaction: A structural alphabet approach

    Directory of Open Access Journals (Sweden)

    Lecornet Hélène

    2008-02-01

    Background: In a number of protein-protein complexes, the 3D structures of bound and unbound partners significantly differ, supporting the induced fit hypothesis for protein-protein binding. Results: In this study, we explore the induced fit modifications on a set of 124 proteins available in both bound and unbound forms, in terms of local structure. The local structure is described using a structural alphabet of 27 structural letters that allows a detailed description of the backbone. Using a control set to distinguish induced fit from experimental error and natural protein flexibility, we show that the fraction of structural letters modified upon binding is significantly greater than in the control set (36% versus 28%). This proportion is even greater in the interface regions (41%). Interface regions preferentially involve coils. Our analysis further reveals that some structural letters in coil are not favored in the interface. We show that certain structural letters in coil are particularly subject to modifications at the interface, and that the severity of structural change also varies. This information is used to derive a structural letter substitution matrix that summarizes the local structural changes observed in our data set. We also illustrate the usefulness of our approach to identify common binding motifs in unrelated proteins. Conclusion: Our study provides qualitative information about induced fit. These results could be of help for flexible docking.

  9. De novo design of RNA-binding proteins with a prion-like domain related to ALS/FTD proteinopathies.

    Science.gov (United States)

    Mitsuhashi, Kana; Ito, Daisuke; Mashima, Kyoko; Oyama, Munenori; Takahashi, Shinichi; Suzuki, Norihiro

    2017-12-04

    Aberrant RNA-binding proteins form the core of the neurodegeneration cascade in spectrums of disease, such as amyotrophic lateral sclerosis (ALS)/frontotemporal dementia (FTD). Six ALS-related molecules, TDP-43, FUS, TAF15, EWSR1, heterogeneous nuclear (hn)RNPA1 and hnRNPA2, are RNA-binding proteins containing candidate mutations identified in ALS patients, and they share several common features, including an aggregation-prone prion-like domain (PrLD) that contains a glycine/serine-tyrosine-glycine/serine (G/S-Y-G/S)-motif-enriched low-complexity sequence and is rich in glutamine and/or asparagine. Additionally, these six molecules are components of RNA granules involved in RNA quality control and become mislocated from the nucleus to form cytoplasmic inclusion bodies (IBs) in the ALS/FTD-affected brain. To reveal the essential mechanisms involved in ALS/FTD-related cytotoxicity associated with RNA-binding proteins containing PrLDs, we designed artificial RNA-binding proteins harboring G/S-Y-G/S-motif repeats with and without enriched glutamine residues and nuclear-import/export-signal sequences and examined their cytotoxicity in vitro. These proteins recapitulated features of ALS-linked molecules, including insoluble aggregation, formation of cytoplasmic IBs and components of RNA granules, and cytotoxicity instigation. These findings indicated that these artificial RNA-binding proteins mimicked features of ALS-linked molecules and allowed the study of mechanisms associated with gain of toxic functions related to ALS/FTD pathogenesis.

  10. Beta-structures in fibrous proteins.

    Science.gov (United States)

    Kajava, Andrey V; Squire, John M; Parry, David A D

    2006-01-01

    The beta-form of protein folding, one of the earliest protein structures to be defined, was originally observed in studies of silks. It was then seen in early studies of synthetic polypeptides and, of course, is now known to be present in a variety of guises as an essential component of globular protein structures. However, in the last decade or so it has become clear that the beta-conformation of chains is present not only in many of the amyloid structures associated with, for example, Alzheimer's Disease, but also in the prion structures associated with the spongiform encephalopathies. Furthermore, X-ray crystallography studies have revealed the high incidence of the beta-fibrous proteins among virulence factors of pathogenic bacteria and viruses. Here we describe the basic forms of the beta-fold, summarize the many different new forms of beta-structural fibrous arrangements that have been discovered, and review advances in structural studies of amyloid and prion fibrils. These and other issues are described in detail in later chapters.

  11. Fibrous Protein Structures: Hierarchy, History and Heroes.

    Science.gov (United States)

    Squire, John M; Parry, David A D

    2017-01-01

    During the 1930s and 1940s the technique of X-ray diffraction was applied widely by William Astbury and his colleagues to a number of naturally-occurring fibrous materials. On the basis of the diffraction patterns obtained, he observed that the structure of each of the fibres was dominated by one of a small number of different types of molecular conformation. One group of fibres, known as the k-m-e-f group of proteins (keratin - myosin - epidermin - fibrinogen), gave rise to diffraction characteristics that became known as the α-pattern. Others, such as those from a number of silks, gave rise to a different pattern - the β-pattern, while connective tissues yielded a third unique set of diffraction characteristics. At the time of Astbury's work, the structures of these materials were unknown, though the spacings of the main X-ray reflections gave an idea of the axial repeats and the lateral packing distances. In a breakthrough in the early 1950s, the basic structures of all of these fibrous proteins were determined. It was found that the long protein chains, composed of strings of amino acids, could be folded up in a systematic manner to generate a limited number of structures that were consistent with the X-ray data. The most important of these were known as the α-helix, the β-sheet, and the collagen triple helix. These studies provided information about the basic building blocks of all proteins, both fibrous and globular. They did not, however, provide detailed information about how these molecules packed together in three-dimensions to generate the fibres found in vivo. A number of possible packing arrangements were subsequently deduced from the X-ray diffraction and other data, but it is only in the last few years, through the continued improvements of electron microscopy, that the packing details within some fibrous proteins can now be seen directly. Here we outline briefly some of the milestones in fibrous protein structure determination, the role of the

  12. Shaking alone induces de novo conversion of recombinant prion proteins to β-sheet rich oligomers and fibrils.

    Directory of Open Access Journals (Sweden)

    Carol L Ladner-Keay

    The formation of β-sheet rich prion oligomers and fibrils from native prion protein (PrP) is thought to be a key step in the development of prion diseases. Many methods are available to convert recombinant prion protein into β-sheet rich fibrils using various chemical denaturants (urea, SDS, GdnHCl), high temperature, phospholipids, or mildly acidic conditions (pH 4). Many of these methods also require shaking or another form of agitation to complete the conversion process. We have identified that shaking alone causes the conversion of recombinant PrP to β-sheet rich oligomers and fibrils at near physiological pH (pH 5.5 to pH 6.2) and temperature. This conversion does not require any denaturant, detergent, or any other chemical cofactor. Interestingly, this conversion does not occur when the water-air interface is eliminated in the shaken sample. We have analyzed shaking-induced conversion using circular dichroism, resolution enhanced native acidic gel electrophoresis (RENAGE), electron microscopy, Fourier transform infrared spectroscopy, thioflavin T fluorescence and proteinase K resistance. Our results show that shaking causes the formation of β-sheet rich oligomers with a population distribution ranging from octamers to dodecamers and that further shaking causes a transition to β-sheet fibrils. In addition, we show that shaking-induced conversion occurs for a wide range of full-length and truncated constructs of mouse, hamster and cervid prion proteins. We propose that this method of conversion provides a robust, reproducible and easily accessible model for scrapie-like amyloid formation, allowing the generation of milligram quantities of physiologically stable β-sheet rich oligomers and fibrils. These results may also have interesting implications regarding our understanding of prion conversion and propagation both within the brain and via techniques such as protein misfolding cyclic amplification (PMCA) and quaking induced conversion (QuIC).

  13. A Kernel for Protein Secondary Structure Prediction

    OpenAIRE

    Guermeur , Yann; Lifchitz , Alain; Vert , Régis

    2004-01-01

    http://mitpress.mit.edu/catalog/item/default.asp?ttype=2&tid=10338&mode=toc; Multi-class support vector machines have already proved efficient in protein secondary structure prediction as ensemble methods, to combine the outputs of sets of classifiers based on different principles. In this chapter, their implementation as basic prediction methods, processing the primary structure or the profile of multiple alignments, is investigated. A kernel devoted to the task is in...

  14. 3D bioprinting of structural proteins.

    Science.gov (United States)

    Włodarczyk-Biegun, Małgorzata K; Del Campo, Aránzazu

    2017-07-01

    3D bioprinting is a booming method to obtain scaffolds of different materials with predesigned and customized morphologies and geometries. In this review we focus on the experimental strategies and recent achievements in the bioprinting of major structural proteins (collagen, silk, fibrin), as a particularly interesting technology to reconstruct the biochemical and biophysical composition and hierarchical morphology of natural scaffolds. The flexibility in molecular design offered by structural proteins, combined with the flexibility in mixing, deposition, and mechanical processing inherent to bioprinting technologies, enables the fabrication of highly functional scaffolds and tissue mimics with a degree of complexity and organization which has only just started to be explored. Here we describe the printing parameters and physical (mechanical) properties of bioinks based on structural proteins, including the biological function of the printed scaffolds. We describe applied printing techniques and cross-linking methods, highlighting the modifications implemented to improve scaffold properties. The used cell types, cell viability, and possible construct applications are also reported. We envision that the application of printing technologies to structural proteins will enable unprecedented control over their supramolecular organization, conferring printed scaffolds biological properties and functions close to natural systems. Copyright © 2017 Elsevier Ltd. All rights reserved.

  15. Functions and structures of eukaryotic recombination proteins

    International Nuclear Information System (INIS)

    Ogawa, Tomoko

    1994-01-01

    We have found that Rad51 and RecA proteins form strikingly similar structures together with dsDNA and ATP. Their right-handed helical nucleoprotein filaments extend the B-form DNA double helix to 1.5 times its length and wind the helix. The similarity and uniqueness of their structures must reflect functional homologies between these proteins. Therefore, it is highly probable that similar recombination proteins are present in various organisms at different evolutionary stages. We have succeeded in cloning RAD51 genes from human, mouse, chicken and fission yeast, and found that the homologues are widely distributed in eukaryotes. The HsRad51 and MmRad51 or ChRad51 proteins consist of 339 amino acids, differing by only 4 or 12 amino acids, respectively, and are highly homologous to both yeast proteins, but less so to Dmc1. All of these proteins are homologous to the region from residues 33 to 240 of RecA, which was named the "homologous core". The homologous core is likely to be responsible for functions common to all of them, such as the formation of the helical nucleoprotein filament that is considered to be involved in homologous pairing in the recombination reaction. The mouse gene is transcribed at a high level in thymus, spleen, testis, and ovary, at a lower level in brain and at a still lower level in some other tissues. It is transcribed efficiently in recombination-active tissues. A clear functional difference of Rad51 homologues from RecA was suggested by the failure of heterologous genes to complement the deficiency of Scrad51 mutants. This failure seems to reflect the absence of a compatible partner, such as ScRad52 protein in the case of ScRad51 protein, between different species. Thus, these discoveries serve as a starting point for understanding fundamental gene targeting in mammalian cells and in gene therapy. (J.P.N.)

  16. Protein structure based prediction of catalytic residues.

    Science.gov (United States)

    Fajardo, J Eduardo; Fiser, Andras

    2013-02-22

    Worldwide structural genomics projects continue to release new protein structures at an unprecedented pace, so far nearly 6000, but only about 60% of these proteins have any sort of functional annotation. We explored a range of features that can be used for the prediction of functional residues given a known three-dimensional structure. These features include various centrality measures of nodes in graphs of interacting residues: closeness, betweenness and page-rank centrality. We also analyzed the distance of functional amino acids to the general center of mass (GCM) of the structure, relative solvent accessibility (RSA), and the use of relative entropy as a measure of sequence conservation. From the selected features, neural networks were trained to identify catalytic residues. We found that using distance to the GCM together with amino acid type provides a good discriminant function, when combined independently with sequence conservation. Using an independent test set of 29 annotated protein structures, the method returned 411 of the initial 9262 residues as the most likely to be involved in function. The output 411 residues contain 70 of the annotated 111 catalytic residues. This represents an approximately 14-fold enrichment of catalytic residues on the entire input set (corresponding to a sensitivity of 63% and a precision of 17%), a performance competitive with that of other state-of-the-art methods. We found that several of the graph-based measures utilize the same underlying feature of protein structures, which can be simply and more effectively captured with the distance to GCM definition. This also has the added advantage of simplicity and easy implementation. Meanwhile, sequence conservation remains by far the most influential feature in identifying functional residues. We also found that due to the rapid changes in size and composition of sequence databases, conservation calculations must be recalibrated for specific reference databases.
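
    The sensitivity, precision and enrichment figures quoted in this abstract follow directly from the four counts it reports. The short calculation below is only a check of that arithmetic; the variable names are ours, and "enrichment" is assumed to mean precision divided by the background fraction of catalytic residues in the test set.

```python
# Counts quoted in the abstract above.
total_residues = 9262   # residues in the 29-structure test set
predicted = 411         # residues returned by the method
true_catalytic = 111    # annotated catalytic residues in the test set
hits = 70               # annotated catalytic residues among the predictions

sensitivity = hits / true_catalytic            # share of catalytic residues recovered
precision = hits / predicted                   # share of predictions that are catalytic
background = true_catalytic / total_residues   # catalytic share expected by chance
enrichment = precision / background            # assumed definition of fold enrichment

print(f"sensitivity = {sensitivity:.1%}")
print(f"precision   = {precision:.1%}")
print(f"enrichment  = {enrichment:.1f}-fold")
```

    Under that assumed definition the counts give roughly 63% sensitivity, 17% precision and a 14-fold enrichment, matching the values stated in the abstract.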

  17. Discrete Haar transform and protein structure.

    Science.gov (United States)

    Morosetti, S

    1997-12-01

    The discrete Haar transform of the sequence of the backbone dihedral angles (phi and psi) was performed over a set of high-resolution X-ray protein structures from the Brookhaven Protein Data Bank. Afterwards, new dihedral angles were calculated by the inverse transform, using a growing number of Haar functions, from the lower to the higher degree. New structures were obtained using these dihedral angles, with standard values for bond lengths and angles, and with omega = 0 degree. The reconstructed structures were compared with the experimental ones, and analyzed by visual inspection and statistical analysis. When half of the Haar coefficients were used, none of the reconstructed structures had yet collapsed into a tertiary fold, but most of the secondary motifs were already realized. These results indicate a substantial separation of structural information in the space of the Haar transform, with the secondary structural information mainly present in the Haar coefficients of lower degrees, and the tertiary information present in the higher-degree coefficients. Because of this separation, the representation of the folded structures in the space of the Haar transform seems a promising candidate to address the problem of premature convergence in genetic algorithms.
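
    The truncation experiment described in this abstract can be illustrated with a minimal sketch of a discrete Haar transform. This is not the author's code: the orthonormal normalization, the coarse-to-fine coefficient ordering (taken here to correspond to "lower to higher degree"), the function names and the phi values are all assumptions made for illustration, and the angles are treated as plain numbers without any handling of periodicity.

```python
import math

def haar_forward(values):
    """Orthonormal discrete Haar transform; length must be a power of two.

    Output order: overall average first, then detail coefficients from the
    coarsest to the finest level.
    """
    n = len(values)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    coeffs = list(values)
    length = n
    while length > 1:
        half = length // 2
        averages = [(coeffs[2 * i] + coeffs[2 * i + 1]) / math.sqrt(2) for i in range(half)]
        details = [(coeffs[2 * i] - coeffs[2 * i + 1]) / math.sqrt(2) for i in range(half)]
        coeffs[:length] = averages + details
        length = half
    return coeffs

def haar_inverse(coeffs):
    """Inverse of haar_forward."""
    values = list(coeffs)
    length = 1
    while length < len(values):
        merged = []
        for a, d in zip(values[:length], values[length:2 * length]):
            merged.append((a + d) / math.sqrt(2))
            merged.append((a - d) / math.sqrt(2))
        values[:2 * length] = merged
        length *= 2
    return values

def keep_lowest(coeffs, keep):
    """Zero every coefficient after the first `keep` (the finer-scale ones)."""
    return coeffs[:keep] + [0.0] * (len(coeffs) - keep)

# Hypothetical phi angles in degrees: a helix-like run followed by a strand-like run.
phi = [-60.0, -62.0, -58.0, -61.0, -120.0, -125.0, -118.0, -122.0]

coeffs = haar_forward(phi)
full = haar_inverse(coeffs)                                # exact round trip
half = haar_inverse(keep_lowest(coeffs, len(coeffs) // 2)) # half of the coefficients
print([round(x, 1) for x in half])
```

    With half of the coefficients kept, each neighbouring pair of angles is replaced by its mean, so the coarse helix-like and strand-like levels survive while the residue-to-residue detail is lost.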

  18. Recognition of functional sites in protein structures.

    Science.gov (United States)

    Shulman-Peleg, Alexandra; Nussinov, Ruth; Wolfson, Haim J

    2004-06-04

    Recognition of regions on the surface of one protein that are similar to a binding site of another is crucial for the prediction of molecular interactions and for functional classifications. We first describe a novel method, SiteEngine, that assumes no sequence or fold similarities and is able to recognize proteins that have similar binding sites and may perform similar functions. We achieve high efficiency and speed by introducing a low-resolution surface representation via chemically important surface points, by hashing triangles of physico-chemical properties and by application of hierarchical scoring schemes for a thorough exploration of global and local similarities. We proceed to rigorously apply this method to functional site recognition in three possible ways: first, we search a given functional site on a large set of complete protein structures. Second, a potential functional site on a protein of interest is compared with known binding sites, to recognize similar features. Third, a complete protein structure is searched for the presence of an a priori unknown functional site, similar to known sites. Our method is robust and efficient enough to allow computationally demanding applications such as the first and the third. From the biological standpoint, the first application may identify secondary binding sites of drugs that may lead to side-effects. The third application finds new potential sites on the protein that may provide targets for drug design. Each of the three applications may aid in assigning a function and in classification of binding patterns. We highlight the advantages and disadvantages of each type of search, provide examples of large-scale searches of the entire Protein Data Bank and make functional predictions.

  19. Automated Protein Structure Modeling with SWISS-MODEL Workspace and the Protein Model Portal

    OpenAIRE

    Bordoli, Lorenza; Schwede, Torsten

    2012-01-01

    Comparative protein structure modeling is a computational approach to build three-dimensional structural models for proteins using experimental structures of related protein family members as templates. Regular blind assessments of modeling accuracy have demonstrated that comparative protein structure modeling is currently the most reliable technique to model protein structures. Homology models are often sufficiently accurate to substitute for experimental structures in a wide variety of appl...

  20. De novo DESIGN AND SYNTHESIS OF AN ICE-BINDING, DENDRIMERIC, POLYPEPTIDE BASED ON INSECT ANTIFREEZE PROTEINS

    Directory of Open Access Journals (Sweden)

    Ricardo Vera Bravo

    2011-12-01

    A new strategy is presented for the design and synthesis of peptides that exhibit ice-binding and antifreeze activity. A pennant-type dendrimer polypeptide scaffold combining an α-helical backbone with four short β-strand branches was synthesized in solid phase using Fmoc chemistry in a divergent approach. The 51-residue dendrimer was characterized by reverse phase high performance liquid chromatography, mass spectrometry and circular dichroism. Each β-strand branch contained three overlapping TXT amino acid repeats, an ice-binding motif found in the ice-binding face of the spruce budworm (Choristoneura fumiferana) and beetle (Tenebrio molitor) antifreeze proteins. Ice crystals in the presence of the polypeptide monomer displayed flat, hexagonal plate morphology, similar to that produced by weakly active antifreeze proteins. An oxidized dimeric form of the dendrimer polypeptide also produced flat hexagonal ice crystals and was capable of inhibiting ice crystal growth upon temperature reduction, a phenomenon termed thermal hysteresis, a defining property of antifreeze proteins. Linkage of the pennant-type dendrimer to a trifunctional cascade-type polypeptide produced a trimeric macromolecule that gave flat hexagonal ice crystals with higher thermal hysteresis activity than the dimer or monomer and an ice crystal burst pattern similar to that produced by samples containing insect antifreeze proteins. This macromolecule was also capable of inhibiting ice recrystallization.

  1. Biophysical characterization of a de novo elastin

    Science.gov (United States)

    Greenland, Kelly Nicole

    Natural human elastin is found in tissue such as the lungs, arteries, and skin. This protein is formed at birth with no mechanism present to repair or supplement the initial quantity formed. As a result, the functionality and durability of elastin's elasticity are critically important. To date, the mechanics of this ability to stretch and recoil are not fully understood. This study utilizes de novo protein design to create a small library of simplistic versions of elastin-like proteins, demonstrate that the elastin-like proteins maintain elastin's functionality, and inquire into its structure using solution nuclear magnetic resonance (NMR). Elastin is formed from cross-linked tropoelastin. Therefore, the first generation of designed proteins consisted of one protein that utilized the homology of interspecies tropoelastin by using three common domains: two hydrophobic domains and one cross-linking domain. Basic modifications were made to open the hydrophobic region and also to make the protein easier to purify and characterize. The designed protein maintained its functionality, self-aggregating as the temperature increased. Uniquely, the protein remained self-aggregated as the temperature returned below the critical transition temperature. Self-aggregation was additionally induced by increasing salt concentrations and by modifying the pH. The protein appeared to have little secondary structure when studied with solution NMR. These results fueled a second generation of designed elastin-like proteins. This generation contained variations designed to study the cross-linking domain, one specific hydrophobic domain, and the effect of the length of the elastin-like protein. The cross-linking domain in one variation has been significantly modified while the flanking hydrophobic domains have remained unchanged. The characterization of this protein will answer questions regarding the specificity of the homologous nature of the cross-linking domain of tropoelastin across species. A second

  2. Structure of Plasmodium falciparum orotate phosphoribosyltransferase with autologous inhibitory protein–protein interactions

    International Nuclear Information System (INIS)

    Kumar, Shiva; Krishnamoorthy, Kalyanaraman; Mudeppa, Devaraja G.; Rathod, Pradipsinh K.

    2015-01-01

    P. falciparum orotate phosphoribosyltransferase, a potential target for antimalarial drugs and a conduit for prodrugs, crystallized as a structure with eight molecules per asymmetric unit that included some unique parasite-specific auto-inhibitory interactions between catalytic dimers. The most severe form of malaria is caused by the obligate parasite Plasmodium falciparum. Orotate phosphoribosyltransferase (OPRTase) is the fifth enzyme in the de novo pyrimidine-synthesis pathway in the parasite, which lacks salvage pathways. Among all of the malaria de novo pyrimidine-biosynthesis enzymes, the structure of P. falciparum OPRTase (PfOPRTase) was the only one unavailable until now. PfOPRTase that could be crystallized was obtained after some low-complexity sequences were removed. Four catalytic dimers were seen in the asymmetric unit (a total of eight polypeptides). In addition to revealing unique amino acids in the PfOPRTase active sites, asymmetric dimers in the larger structure pointed to novel parasite-specific protein–protein interactions that occlude the catalytic active sites. The latter could potentially modulate PfOPRTase activity in parasites and possibly provide new insights for blocking PfOPRTase functions.

  3. Structure of Plasmodium falciparum orotate phosphoribosyltransferase with autologous inhibitory protein–protein interactions

    Energy Technology Data Exchange (ETDEWEB)

    Kumar, Shiva; Krishnamoorthy, Kalyanaraman; Mudeppa, Devaraja G.; Rathod, Pradipsinh K., E-mail: rathod@chem.washington.edu [University of Washington, Seattle, WA 98195 (United States)

    2015-04-21

    P. falciparum orotate phosphoribosyltransferase, a potential target for antimalarial drugs and a conduit for prodrugs, crystallized as a structure with eight molecules per asymmetric unit that included some unique parasite-specific auto-inhibitory interactions between catalytic dimers. The most severe form of malaria is caused by the obligate parasite Plasmodium falciparum. Orotate phosphoribosyltransferase (OPRTase) is the fifth enzyme in the de novo pyrimidine-synthesis pathway in the parasite, which lacks salvage pathways. Among all of the malaria de novo pyrimidine-biosynthesis enzymes, the structure of P. falciparum OPRTase (PfOPRTase) was the only one unavailable until now. PfOPRTase that could be crystallized was obtained after some low-complexity sequences were removed. Four catalytic dimers were seen in the asymmetric unit (a total of eight polypeptides). In addition to revealing unique amino acids in the PfOPRTase active sites, asymmetric dimers in the larger structure pointed to novel parasite-specific protein–protein interactions that occlude the catalytic active sites. The latter could potentially modulate PfOPRTase activity in parasites and possibly provide new insights for blocking PfOPRTase functions.

  4. Design of a minimal protein oligomerization domain by a structural approach.

    Science.gov (United States)

    Burkhard, P; Meier, M; Lustig, A

    2000-12-01

    Because of the simplicity and regularity of the alpha-helical coiled coil relative to other structural motifs, it can be conveniently used to clarify the molecular interactions responsible for protein folding and stability. Here we describe the de novo design and characterization of a two heptad-repeat peptide stabilized by a complex network of inter- and intrahelical salt bridges. Circular dichroism spectroscopy and analytical ultracentrifugation show that this peptide is highly alpha-helical and 100% dimeric under physiological buffer conditions. Interestingly, the peptide was shown to switch its oligomerization state from a dimer to a trimer upon increasing ionic strength. The correctness of the rational design principles used here is supported by details of the atomic structure of the peptide deduced from X-ray crystallography. The structure of the peptide shows that it is not a molten globule but assumes a unique, native-like conformation. This de novo peptide thus represents an attractive model system for the design of a molecular recognition system.

  5. PCNA Structure and Interactions with Partner Proteins

    KAUST Repository

    Oke, Muse; Zaher, Manal S.; Hamdan, Samir

    2018-01-01

    Proliferating cell nuclear antigen (PCNA) consists of three identical monomers that topologically encircle double-stranded DNA. PCNA stimulates the processivity of DNA polymerase δ and, to a lesser extent, the intrinsically highly processive DNA polymerase ε. It also functions as a platform that recruits and coordinates the activities of a large number of DNA processing proteins. Emerging structural and biochemical studies suggest that the nature of PCNA-partner protein interactions is complex. A hydrophobic groove at the front side of PCNA serves as a primary docking site for the consensus PIP box motifs present in many PCNA-binding partners. Sequences that immediately flank the PIP box motif or regions that are distant from it could also interact with the hydrophobic groove and other regions of PCNA. Posttranslational modifications on the backside of PCNA could add another dimension to its interaction with partner proteins. An encounter of PCNA with different DNA structures might also be involved in coordinating its interactions. Finally, the ability of PCNA to bind up to three proteins while topologically linked to DNA suggests that it would be a versatile toolbox in many different DNA processing reactions.

  6. PCNA Structure and Interactions with Partner Proteins

    KAUST Repository

    Oke, Muse

    2018-01-29

    Proliferating cell nuclear antigen (PCNA) consists of three identical monomers that topologically encircle double-stranded DNA. PCNA stimulates the processivity of DNA polymerase δ and, to a lesser extent, the intrinsically highly processive DNA polymerase ε. It also functions as a platform that recruits and coordinates the activities of a large number of DNA processing proteins. Emerging structural and biochemical studies suggest that the nature of PCNA-partner protein interactions is complex. A hydrophobic groove at the front side of PCNA serves as a primary docking site for the consensus PIP box motifs present in many PCNA-binding partners. Sequences that immediately flank the PIP box motif or regions that are distant from it could also interact with the hydrophobic groove and other regions of PCNA. Posttranslational modifications on the backside of PCNA could add another dimension to its interaction with partner proteins. An encounter of PCNA with different DNA structures might also be involved in coordinating its interactions. Finally, the ability of PCNA to bind up to three proteins while topologically linked to DNA suggests that it would be a versatile toolbox in many different DNA processing reactions.

  7. Protein secondary structure: category assignment and predictability

    DEFF Research Database (Denmark)

    Andersen, Claus A.; Bohr, Henrik; Brunak, Søren

    2001-01-01

    In the last decade, the prediction of protein secondary structure has been optimized using essentially one and the same assignment scheme known as DSSP. We present here a different scheme, which is more predictable. This scheme predicts directly the hydrogen bonds, which stabilize the secondary......-forward neural network with one hidden layer on a data set identical to the one used in earlier work....

  8. Protein-mediated surface structuring in biomembranes

    Directory of Open Access Journals (Sweden)

    Maggio B.

    2005-01-01

    The lipids and proteins of biomembranes exhibit highly dissimilar conformations, geometrical shapes, amphipathicity, and thermodynamic properties which constrain their two-dimensional molecular packing, electrostatics, and interaction preferences. This causes inevitable development of large local tensions that frequently relax into phase or compositional immiscibility along lateral and transverse planes of the membrane. On the other hand, these effects constitute the very codes that mediate molecular and structural changes determining and controlling the possibilities for enzymatic activity, apposition and recombination in biomembranes. The presence of proteins constitutes a major perturbing factor for the membrane sculpturing both in terms of its surface topography and dynamics. We will focus on some results from our group within this context and summarize some recent evidence for the active involvement of extrinsic (myelin basic protein), integral (Folch-Lees proteolipid protein) and amphitropic (c-Fos and c-Jun) proteins, as well as a membrane-active amphitropic phosphohydrolytic enzyme (neutral sphingomyelinase), in the process of lateral segregation and dynamics of phase domains, sculpturing of the surface topography, and the bi-directional modulation of the membrane biochemical reactivity.

  9. PROCARB: A Database of Known and Modelled Carbohydrate-Binding Protein Structures with Sequence-Based Prediction Tools

    Directory of Open Access Journals (Sweden)

    Adeel Malik

    2010-01-01

    Understanding of the three-dimensional structures of proteins that interact with carbohydrates covalently (glycoproteins) as well as noncovalently (protein-carbohydrate complexes) is essential to many biological processes and plays a significant role in normal and disease-associated functions. It is important to have a central repository of knowledge available about these protein-carbohydrate complexes as well as preprocessed data of predicted structures. This can be significantly enhanced by de novo tools which can predict carbohydrate-binding sites for proteins in the absence of the structure of an experimentally known binding site. PROCARB is an open-access database comprising three independently working components, namely, (i) Core PROCARB module, consisting of three-dimensional structures of protein-carbohydrate complexes taken from the Protein Data Bank (PDB), (ii) Homology Models module, consisting of manually developed three-dimensional models of N-linked and O-linked glycoproteins of unknown three-dimensional structure, and (iii) CBS-Pred prediction module, consisting of web servers to predict carbohydrate-binding sites using single sequence or server-generated PSSM. Several precomputed structural and functional properties of complexes are also included in the database for quick analysis. In particular, information about function, secondary structure, solvent accessibility, hydrogen bonds, literature references, and so forth, is included. In addition, each protein in the database is mapped to Uniprot, Pfam, PDB, and so forth.

  10. Annotating the protein-RNA interaction sites in proteins using evolutionary information and protein backbone structure.

    Science.gov (United States)

    Li, Tao; Li, Qian-Zhong

    2012-11-07

    RNA-protein interactions play important roles in various biological processes. The precise detection of RNA-protein interaction sites is very important for understanding essential biological processes and annotating the function of the proteins. In this study, based on various features from amino acid sequence and structure, including evolutionary information, solvent accessible surface area and torsion angles (φ, ψ) in the backbone structure of the polypeptide chain, a computational method for predicting RNA-binding sites in proteins is proposed. When the method is applied to predict RNA-binding sites in three datasets (RBP86 containing 86 protein chains, RBP107 containing 107 protein chains and RBP109 containing 109 protein chains), better sensitivities and specificities are obtained compared to previously published methods in five-fold cross-validation tests. To further examine the efficiency of our method, the RBP107 dataset is used as the training set, and the RBP86 and RBP109 datasets are used as independent test sets. In addition, as examples of our prediction, RNA-binding sites in a few proteins are presented. The annotated results are consistent with the PDB annotation. These results show that our method is useful for annotating RNA-binding sites of novel proteins.

  11. Classification of proteins: available structural space for molecular modeling.

    Science.gov (United States)

    Andreeva, Antonina

    2012-01-01

    The wealth of available protein structural data provides an unprecedented opportunity to study and better understand the underlying principles of protein folding and protein structure evolution. A key to achieving this lies in the ability to analyse these data and to organize them in a coherent classification scheme. Over the past years several protein classifications have been developed that aim to group proteins based on their structural relationships. Some of these classification schemes explore the concept of structural neighbourhood (structural continuum), whereas others utilize the notion of protein evolution and thus provide a discrete rather than continuum view of protein structure space. This chapter presents a strategy for classification of proteins with known three-dimensional structure. Steps in the classification process along with basic definitions are introduced. Examples illustrating some fundamental concepts of protein folding and evolution with a special focus on the exceptions to them are presented.

  12. Protein crystal structure analysis using synchrotron radiation at atomic resolution

    International Nuclear Information System (INIS)

    Nonaka, Takamasa

    1999-01-01

    We can now obtain a detailed picture of a protein, allowing the identification of individual atoms, by interpreting the diffraction of X-rays from a protein crystal at atomic resolution, 1.2 Å or better. As of this writing, about 45 unique protein structures beyond 1.2 Å resolution have been deposited in the Protein Data Bank. This review provides a simplified overview of how protein crystallographers use such diffraction data to solve, refine, and validate protein structures. (author)

  13. Predicting Protein Secondary Structure with Markov Models

    DEFF Research Database (Denmark)

    Fischer, Paul; Larsen, Simon; Thomsen, Claus

    2004-01-01

    we are considering here, is to predict the secondary structure from the primary one. To this end we train a Markov model on training data and then use it to classify parts of unknown protein sequences as sheets, helices or coils. We show how to exploit the directional information contained...... in the Markov model for this task. Classifications that are purely based on statistical models might not always be biologically meaningful. We present combinatorial methods to incorporate biological background knowledge to enhance the prediction performance....
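
    The general idea of classifying sequence segments with a trained Markov model can be sketched as follows. This is a generic illustration, not the authors' model: it assumes one first-order Markov chain per class with add-one smoothing, picks the class with the highest log-likelihood, and uses invented toy training segments. It does not use the directional information or the combinatorial methods mentioned in the abstract.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def train_chain(segments, pseudocount=1.0):
    """Estimate first-order transition log-probabilities from example segments."""
    counts = {a: {b: pseudocount for b in AMINO_ACIDS} for a in AMINO_ACIDS}
    for seg in segments:
        for a, b in zip(seg, seg[1:]):
            counts[a][b] += 1
    log_p = {}
    for a, row in counts.items():
        total = sum(row.values())
        log_p[a] = {b: math.log(c / total) for b, c in row.items()}
    return log_p

def log_likelihood(model, segment):
    """Sum of transition log-probabilities along the segment."""
    return sum(model[a][b] for a, b in zip(segment, segment[1:]))

def classify(models, segment):
    """Return the label whose chain gives the segment the highest likelihood."""
    return max(models, key=lambda label: log_likelihood(models[label], segment))

# Invented toy training data, for illustration only.
training = {
    "helix": ["AEELLKKA", "EELLKKLAEE", "ALEEKLKA"],
    "sheet": ["VIVTVYV", "TVTVIYV", "YVIVTVT"],
    "coil": ["GPGSGNG", "PGSGDGP", "GNGPGSG"],
}
models = {label: train_chain(segs) for label, segs in training.items()}

for query in ("EELLKK", "VTVIV", "GSGPG"):
    print(query, "->", classify(models, query))
```

    A realistic predictor would be trained on annotated structures and evaluated on held-out sequences; the toy data here only show how the per-class likelihoods are compared.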

  14. GIS: a comprehensive source for protein structure similarities.

    Science.gov (United States)

    Guerler, Aysam; Knapp, Ernst-Walter

    2010-07-01

    A web service for analysis of protein structures that are sequentially or non-sequentially similar was generated. Recently, the non-sequential structure alignment algorithm GANGSTA+ was introduced. GANGSTA+ can detect non-sequential structural analogs for proteins stated to possess novel folds. Since GANGSTA+ ignores the polypeptide chain connectivity of secondary structure elements (i.e. alpha-helices and beta-strands), it is able to detect structural similarities also between proteins whose sequences were reshuffled during evolution. GANGSTA+ was applied in an all-against-all comparison on the ASTRAL40 database (SCOP version 1.75), which consists of >10,000 protein domains yielding about 55 × 10^6 possible protein structure alignments. Here, we provide the resulting protein structure alignments as a public web-based service, named GANGSTA+ Internet Services (GIS). We also allow browsing of the ASTRAL40 database of protein structures with GANGSTA+ relative to an externally given protein structure using different constraints to select specific results. GIS allows us to analyze protein structure families according to the SCOP classification scheme. Additionally, users can upload their own protein structures for pairwise protein structure comparison, alignment against all protein structures of the ASTRAL40 database (SCOP version 1.75) or symmetry analysis. GIS is publicly available at http://agknapp.chemie.fu-berlin.de/gplus.
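
    The relation between the number of domains and the number of all-against-all alignments quoted in this abstract can be checked with a few lines of arithmetic. The sketch assumes that each unordered pair of distinct domains is counted once, which the abstract does not state explicitly.

```python
import math

def pair_count(n):
    """Number of unordered pairs of distinct items among n."""
    return n * (n - 1) // 2

# Lower bound implied by ">10,000 protein domains".
pairs_at_10000 = pair_count(10_000)

# Smallest domain count whose pair count reaches about 55 million.
target = 55_000_000
domains_needed = math.ceil((1 + math.sqrt(1 + 8 * target)) / 2)

print(pairs_at_10000)               # 49995000
print(domains_needed)               # 10489
print(pair_count(domains_needed))   # 55004316
```

    Under this assumption, roughly 10,500 domains give about 55 million pairwise alignments, which is consistent with the figures in the abstract.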

  15. Automated protein structure modeling with SWISS-MODEL Workspace and the Protein Model Portal.

    Science.gov (United States)

    Bordoli, Lorenza; Schwede, Torsten

    2012-01-01

    Comparative protein structure modeling is a computational approach to build three-dimensional structural models for proteins using experimental structures of related protein family members as templates. Regular blind assessments of modeling accuracy have demonstrated that comparative protein structure modeling is currently the most reliable technique to model protein structures. Homology models are often sufficiently accurate to substitute for experimental structures in a wide variety of applications. Since the usefulness of a model for a specific application is determined by its accuracy, model quality estimation is an essential component of protein structure prediction. Comparative protein modeling has become a routine approach in many areas of life science research since fully automated modeling systems also allow nonexperts to build reliable models. In this chapter, we describe practical approaches for automated protein structure modeling with SWISS-MODEL Workspace and the Protein Model Portal.

  16. Structure based alignment and clustering of proteins (STRALCP)

    Science.gov (United States)

    Zemla, Adam T.; Zhou, Carol E.; Smith, Jason R.; Lam, Marisa W.

    2013-06-18

    Disclosed are computational methods of clustering a set of protein structures based on local and pair-wise global similarity values. Pair-wise local and global similarity values are generated based on pair-wise structural alignments for each protein in the set of protein structures. Initially, the protein structures are clustered based on pair-wise local similarity values. The protein structures are then clustered based on pair-wise global similarity values. For each given cluster both a representative structure and spans of conserved residues are identified. The representative protein structure is used to assign newly-solved protein structures to a group. The spans are used to characterize conservation and assign a "structural footprint" to the cluster.
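
    A two-stage clustering of this general kind might be sketched as below. The abstract does not specify the clustering algorithm, cutoffs or the rule for picking a representative, so everything here is an assumption for illustration: single-linkage grouping above a similarity cutoff, a second pass that splits each local-similarity cluster by global similarity, a representative chosen by highest summed similarity, and invented scores for five hypothetical structures A to E.

```python
def threshold_clusters(items, similarity, cutoff):
    """Single-linkage grouping: items are joined when a pair scores >= cutoff."""
    parent = {item: item for item in items}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for i, a in enumerate(items):
        for b in items[i + 1:]:
            if similarity(a, b) >= cutoff:
                parent[find(a)] = find(b)

    groups = {}
    for item in items:
        groups.setdefault(find(item), []).append(item)
    return sorted(groups.values())

def representative(cluster, similarity):
    """Member with the highest summed similarity to the rest of its cluster."""
    return max(cluster, key=lambda a: sum(similarity(a, b) for b in cluster if b != a))

def lookup(table):
    """Symmetric score lookup; pairs missing from the table score 0."""
    return lambda a, b: table.get(frozenset((a, b)), 0.0)

# Invented pairwise scores (0-100) for hypothetical structures A-E.
local_sim = lookup({frozenset("AB"): 90, frozenset("AC"): 85,
                    frozenset("BC"): 88, frozenset("DE"): 92})
global_sim = lookup({frozenset("AB"): 80, frozenset("AC"): 40,
                     frozenset("BC"): 45, frozenset("DE"): 75})

structures = ["A", "B", "C", "D", "E"]
stage1 = threshold_clusters(structures, local_sim, cutoff=80)   # by local similarity
stage2 = [sub for cluster in stage1
          for sub in threshold_clusters(cluster, global_sim, cutoff=70)]
reps = [representative(cluster, local_sim) for cluster in stage1]

print(stage1)  # [['A', 'B', 'C'], ['D', 'E']]
print(stage2)  # [['A', 'B'], ['C'], ['D', 'E']]
print(reps[0]) # B
```

    In this toy run, C shares local similarity with A and B but is separated from them once global similarity is required.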

  17. Alpha complexes in protein structure prediction

    DEFF Research Database (Denmark)

    Winter, Pawel; Fonseca, Rasmus

    2015-01-01

    Reducing the computational effort and increasing the accuracy of potential energy functions is of utmost importance in modeling biological systems, for instance in protein structure prediction, docking or design. Evaluating interactions between nonbonded atoms is the bottleneck of such computations......-complexes from scratch for every configuration encountered during the search for the native structure would make this approach hopelessly slow. However, it is argued that kinetic α-complexes can be used to reduce the computational effort of determining the potential energy when "moving" from one configuration...... to a neighboring one. As a consequence, relatively expensive (initial) construction of an α-complex is expected to be compensated by subsequent fast kinetic updates during the search process. Computational results presented in this paper are limited. However, they suggest that the applicability of a...

  18. Course 12: Proteins: Structural, Thermodynamic and Kinetic Aspects

    Science.gov (United States)

    Finkelstein, A. V.

    1 Introduction 2 Overview of protein architectures and discussion of physical background of their natural selection 2.1 Protein structures 2.2 Physical selection of protein structures 3 Thermodynamic aspects of protein folding 3.1 Reversible denaturation of protein structures 3.2 What do denatured proteins look like? 3.3 Why denaturation of a globular protein is the first-order phase transition 3.4 "Gap" in energy spectrum: The main characteristic that distinguishes protein chains from random polymers 4 Kinetic aspects of protein folding 4.1 Protein folding in vivo 4.2 Protein folding in vitro (in the test-tube) 4.3 Theory of protein folding rates and solution of the Levinthal paradox

  19. A real-time all-atom structural search engine for proteins.

    Science.gov (United States)

    Gonzalez, Gabriel; Hannigan, Brett; DeGrado, William F

    2014-07-01

    Protein designers use a wide variety of software tools for de novo design, yet their repertoire still lacks a fast and interactive all-atom search engine. To solve this, we have built the Suns program: a real-time, atomic search engine integrated into the PyMOL molecular visualization system. Users build atomic-level structural search queries within PyMOL and receive a stream of search results aligned to their query within a few seconds. This instant feedback cycle enables a new "designability"-inspired approach to protein design where the designer searches for and interactively incorporates native-like fragments from proven protein structures. We demonstrate the use of Suns to interactively build protein motifs, tertiary interactions, and to identify scaffolds compatible with hot-spot residues. The official web site and installer are located at http://www.degradolab.org/suns/ and the source code is hosted at https://github.com/godotgildor/Suns (PyMOL plugin, BSD license), https://github.com/Gabriel439/suns-cmd (command line client, BSD license), and https://github.com/Gabriel439/suns-search (search engine server, GPLv2 license).

  20. Structural determination of intact proteins using mass spectrometry

    Science.gov (United States)

    Kruppa, Gary [San Francisco, CA]; Schoeniger, Joseph S [Oakland, CA]; Young, Malin M [Livermore, CA]

    2008-05-06

    The present invention relates to novel methods of determining the sequence and structure of proteins. Specifically, the present invention allows for the analysis of intact proteins within a mass spectrometer. Therefore, preparatory separations need not be performed prior to introducing a protein sample into the mass spectrometer. Also disclosed herein are new instrumental developments for enhancing the signal from the desired modified proteins, methods for producing controlled protein fragments in the mass spectrometer, eliminating complex microseparations, and protein preparatory chemical steps necessary for cross-linking based protein structure determination. Additionally, the preferred method of the present invention involves the determination of protein structures utilizing a top-down analysis of protein structures to search for covalent modifications. In the preferred method, intact proteins are ionized and fragmented within the mass spectrometer.

  1. Mimicking the action of folding chaperones by Hamiltonian replica-exchange molecular dynamics simulations : Application in the refinement of de novo models

    NARCIS (Netherlands)

    Fan, Hao; Periole, Xavier; Mark, Alan E.

    The efficiency of using a variant of Hamiltonian replica-exchange molecular dynamics (Chaperone H-replica-exchange molecular dynamics [CH-REMD]) for the refinement of protein structural models generated de novo is investigated. In CH-REMD, the interaction between the protein and its environment,

  2. Protein structure similarity from principle component correlation analysis

    Directory of Open Access Journals (Sweden)

    Chou James

    2006-01-01

    Background: Owing to rapid expansion of protein structure databases in recent years, methods of structure comparison are becoming increasingly effective and important in revealing novel information on functional properties of proteins and their roles in the grand scheme of evolutionary biology. Currently, the structural similarity between two proteins is measured by the root-mean-square-deviation (RMSD) in their best-superimposed atomic coordinates. RMSD is the golden rule of measuring structural similarity when the structures are nearly identical; it, however, fails to detect the higher order topological similarities in proteins evolved into different shapes. We propose new algorithms for extracting geometrical invariants of proteins that can be effectively used to identify homologous protein structures or topologies in order to quantify both close and remote structural similarities. Results: We measure structural similarity between proteins by correlating the principle components of their secondary structure interaction matrix. In our approach, the Principle Component Correlation (PCC) analysis, a symmetric interaction matrix for a protein structure is constructed with relationship parameters between secondary elements that can take the form of distance, orientation, or other relevant structural invariants. When using a distance-based construction in the presence or absence of encoded N to C terminal sense, there are strong correlations between the principle components of interaction matrices of structurally or topologically similar proteins. Conclusion: The PCC method is extensively tested for protein structures that belong to the same topological class but are significantly different by RMSD measure. The PCC analysis can also differentiate proteins having similar shapes but different topological arrangements. Additionally, we demonstrate that when using two independently defined interaction matrices, comparison of their maximum
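
    For reference, the RMSD baseline that this record contrasts with PCC can be sketched as follows. This is a minimal illustration that assumes the two coordinate sets have equal length, are matched atom by atom, and have already been optimally superimposed; the superposition step itself is not shown, and the function name is hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two matched, pre-superimposed 3D coordinate lists."""
    if not coords_a or len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Two toy two-atom structures, the second shifted by one unit along z.
model_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
model_b = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(rmsd(model_a, model_b))
```

    Because the measure averages per-atom displacements, it says little about shared topology once structures diverge in shape, which is the limitation the record describes.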

  3. Monolayers of a De Novo Designed 4-Alpha-Helix Bundle Carboprotein and Partial Structures on Au(111)-Surfaces

    DEFF Research Database (Denmark)

    Brask, Jesper; Wackerbarth, Hainer; Jensen, Knud Jørgen

    2002-01-01

    on a galactopyranoside derivative with a thiol anchor aglycon suitable for surface immobilization on gold. The galactopyranoside with thiol anchor and the thiol anchor alone were prepared for comparison. Voltammetry of the three molecules on Au(111) showed reductive desorption peaks caused by monolayer adsorption via...... thiolate-Au bonding. In situ STM of the thiol anchor disclosed an ordered adlayer with clear domains and molecular features. This holds promise, broadly for single-molecule voltammetry and the SPM and scanning tunnelling microscopy (STM) of natural and synthetic proteins....

  4. Protein loop modeling using a new hybrid energy function and its application to modeling in inaccurate structural environments.

    Directory of Open Access Journals (Sweden)

    Hahnbeom Park

    Protein loop modeling is a tool for predicting protein local structures of particular interest, providing opportunities for applications involving protein structure prediction and de novo protein design. Until recently, the majority of loop modeling methods have been developed and tested by reconstructing loops in frameworks of experimentally resolved structures. In many practical applications, however, the protein loops to be modeled are located in inaccurate structural environments. These include loops in model structures, low-resolution experimental structures, or experimental structures of different functional forms. Accordingly, discrepancies in the accuracy of the structural environment assumed in development of the method and that in practical applications present additional challenges to modern loop modeling methods. This study demonstrates a new strategy for employing a hybrid energy function combining physics-based and knowledge-based components to help tackle this challenge. The hybrid energy function is designed to combine the strengths of each energy component, simultaneously maintaining accurate loop structure prediction in a high-resolution framework structure and tolerating minor environmental errors in low-resolution structures. A loop modeling method based on global optimization of this new energy function is tested on loop targets situated in different levels of environmental errors, ranging from experimental structures to structures perturbed in backbone as well as side chains and template-based model structures. The new method performs comparably to force field-based approaches in loop reconstruction in crystal structures and better in loop prediction in inaccurate framework structures. This result suggests that higher-accuracy predictions would be possible for a broader range of applications. The web server for this method is available at http://galaxy.seoklab.org/loop with the PS2 option for the scoring function.

  5. Nonlinear deterministic structures and the randomness of protein sequences

    CERN Document Server

    Huang Yan Zhao

    2003-01-01

    To clarify the randomness of protein sequences, we make a detailed analysis of a set of typical protein sequences representing each structural class, using a nonlinear prediction method. No deterministic structures are found in these protein sequences, which implies that they behave as random sequences. We also give an explanation for the controversial results obtained in previous investigations.

  6. The structure of a cholesterol-trapping protein

    Science.gov (United States)

    Berkeley Lab Science Beat. Contact: Dan Krotz, dakrotz@lbl.gov. Institute researchers determined the three-dimensional structure of a protein that controls cholesterol level in the bloodstream. Knowing the structure of the protein, a cellular receptor that ensnares

  7. STRUCTURAL FEATURES OF PLANT CHITINASES AND CHITIN-BINDING PROTEINS

    NARCIS (Netherlands)

    BEINTEMA, JJ

    1994-01-01

    Structural features of plant chitinases and chitin-binding proteins are discussed. Many of these proteins consist of multiple domains, of which the chitin-binding hevein domain is a predominant one. X-ray and NMR structures of representatives of the major classes of these proteins are available now,

  8. Membrane interaction and secondary structure of de novo designed arginine-and tryptophan peptides with dual function

    KAUST Repository

    Rydberg, Hanna A.

    2012-10-01

    Cell-penetrating peptides and antimicrobial peptides are two classes of positively charged membrane active peptides with several properties in common. The challenge is to combine knowledge about the membrane interaction mechanisms and structural properties of the two classes to design peptides with membrane-specific actions, useful either as transporters of cargo or as antibacterial substances. Membrane active peptides are commonly rich in arginine and tryptophan. We have previously designed a series of arg/trp peptides and investigated how the position and number of tryptophans affect cellular uptake. Here we explore the antimicrobial properties and the interaction with lipid model membranes of these peptides, using minimal inhibitory concentrations assay (MIC), circular dichroism (CD) and linear dichroism (LD). The results show that the arg/trp peptides inhibit the growth of the two gram positive strains Staphylococcus aureus and Staphylococcus pyogenes, with some individual variations depending on the position of the tryptophans. No inhibition of the gram negative strains Proteus mirabilis or Pseudomonas aeruginosa was noticed. CD indicated that when bound to lipid vesicles one of the peptides forms an α-helical like structure, whereas the other five exhibited rather random coiled structures. LD indicated that all six peptides were somehow aligned parallel with the membrane surface. Our results do not reveal any obvious connection between membrane interaction and antimicrobial effect for the studied peptides. By contrast cell-penetrating properties can be coupled to both the secondary structure and the degree of order of the peptides. © 2012 Elsevier Inc.

  9. Functional structural motifs for protein-ligand, protein-protein, and protein-nucleic acid interactions and their connection to supersecondary structures.

    Science.gov (United States)

    Kinjo, Akira R; Nakamura, Haruki

    2013-01-01

    Protein functions are mediated by interactions between proteins and other molecules. One useful approach to analyze protein functions is to compare and classify the structures of interaction interfaces of proteins. Here, we describe the procedures for compiling a database of interface structures and efficiently comparing the interface structures. To do so requires a good understanding of the data structures of the Protein Data Bank (PDB). Therefore, we also provide a detailed account of the PDB exchange dictionary necessary for extracting data that are relevant for analyzing interaction interfaces and secondary structures. We identify recurring structural motifs by classifying similar interface structures, and we define a coarse-grained representation of supersecondary structures (SSS) which represents a sequence of two or three secondary structure elements including their relative orientations as a string of four to seven letters. By examining the correspondence between structural motifs and SSS strings, we show that no SSS string has particularly high propensity to be found at interaction interfaces in general, indicating any SSS can be used as a binding interface. When individual structural motifs are examined, there are some SSS strings that have high propensity for particular groups of structural motifs. In addition, it is shown that while the SSS strings found in particular structural motifs for nonpolymer and protein interfaces are as abundant as in other structural motifs that belong to the same subunit, structural motifs for nucleic acid interfaces exhibit somewhat stronger preference for SSS strings. In regard to protein folds, many motif-specific SSS strings were found across many folds, suggesting that SSS may be a useful description to investigate the universality of ligand binding modes.

  10. CMsearch: simultaneous exploration of protein sequence space and structure space improves not only protein homology detection but also protein structure prediction

    KAUST Repository

    Cui, Xuefeng; Lu, Zhiwu; Wang, Sheng; Jing-Yan Wang, Jim; Gao, Xin

    2016-01-01

    Motivation: Protein homology detection, a fundamental problem in computational biology, is an indispensable step toward predicting protein structures and understanding protein functions. Despite the advances in recent decades on sequence alignment

  11. Protein Function Prediction Based on Sequence and Structure Information

    KAUST Repository

    Smaili, Fatima Z.

    2016-01-01

    operate. In this master thesis project, we worked on inferring protein functions based on the primary protein sequence. In the approach we follow, 3D models are first constructed using I-TASSER. Functions are then deduced by structurally matching

  12. K-nearest uphill clustering in the protein structure space

    KAUST Repository

    Cui, Xuefeng

    2016-08-26

    The protein structure classification problem, which is to assign a protein structure to a cluster of similar proteins, is one of the most fundamental problems in the construction and application of the protein structure space. Early manually curated protein structure classifications (e.g., SCOP and CATH) are very successful, but recently suffer the slow updating problem because of the increased throughput of newly solved protein structures. Thus, fully automatic methods to cluster proteins in the protein structure space have been designed and developed. In this study, we observed that the SCOP superfamilies are highly consistent with clustering trees representing hierarchical clustering procedures, but the tree cutting is very challenging and becomes the bottleneck of clustering accuracy. To overcome this challenge, we proposed a novel density-based K-nearest uphill clustering method that effectively eliminates noisy pairwise protein structure similarities and identifies density peaks as cluster centers. Specifically, the density peaks are identified based on K-nearest uphills (i.e., proteins with higher densities) and K-nearest neighbors. To our knowledge, this is the first attempt to apply and develop density-based clustering methods in the protein structure space. Our results show that our density-based clustering method outperforms the state-of-the-art clustering methods previously applied to the problem. Moreover, we observed that computational methods and human experts could produce highly similar clusters at high precision values, while computational methods also suggest to split some large superfamilies into smaller clusters. © 2016 Elsevier B.V.
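
    The density-peak idea in this record can be sketched in simplified form. This is not the published algorithm: the density estimate (negative mean distance to the K nearest neighbors), the peak rule (no denser point among the K nearest neighbors), the assignment rule (follow the nearest denser neighbor), and all names below are assumptions chosen to keep the example short.

```python
def knn_uphill_clusters(points, k, dist):
    """Label each point with the index of the density peak it climbs to."""
    n = len(points)
    if n < 2 or k < 1:
        raise ValueError("need at least two points and k >= 1")

    neighbours, density = [], []
    for i in range(n):
        ranked = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: dist(points[i], points[j]),
        )
        nearest = ranked[:k]
        neighbours.append(nearest)
        # Assumed density estimate: closer neighbours mean higher density.
        density.append(-sum(dist(points[i], points[j]) for j in nearest) / len(nearest))

    # "Uphill" of i: its nearest K-neighbour with strictly higher density, if any.
    uphill = []
    for i in range(n):
        higher = [j for j in neighbours[i] if density[j] > density[i]]
        uphill.append(
            min(higher, key=lambda j: dist(points[i], points[j])) if higher else None
        )

    # Visit points from dense to sparse so every uphill point is labelled first.
    labels = [None] * n
    for i in sorted(range(n), key=lambda idx: -density[idx]):
        labels[i] = i if uphill[i] is None else labels[uphill[i]]
    return labels


# Toy 1D "structure space" with two well-separated groups.
toy_points = [0, 1, 2, 50, 51, 52]
toy_labels = knn_uphill_clusters(toy_points, 2, lambda a, b: abs(a - b))
print(toy_labels)
```

    Points with no denser point among their K nearest neighbors act as cluster centers, so no tree-cutting threshold is needed in this sketch; tied densities can yield several adjacent centers, which a fuller treatment would have to resolve.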

  13. Use of designed sequences in protein structure recognition.

    Science.gov (United States)

    Kumar, Gayatri; Mudgal, Richa; Srinivasan, Narayanaswamy; Sandhya, Sankaran

    2018-05-09

    Knowledge of the protein structure is a pre-requisite for improved understanding of molecular function. The gap in the sequence-structure space has increased in the post-genomic era. Grouping related protein sequences into families can aid in narrowing the gap. In the Pfam database, structure description is provided for part or full-length proteins of 7726 families. For the remaining 52% of the families, information on 3-D structure is not yet available. We use the computationally designed sequences that are intermediately related to two protein domain families, which are already known to share the same fold. These strategically designed sequences enable detection of distant relationships and here, we have employed them for the purpose of structure recognition of protein families of yet unknown structure. We first measured the success rate of our approach using a dataset of protein families of known fold and achieved a success rate of 88%. Next, for 1392 families of yet unknown structure, we made structural assignments for part/full length of the proteins. Fold association for 423 domains of unknown function (DUFs) are provided as a step towards functional annotation. The results indicate that knowledge-based filling of gaps in protein sequence space is a lucrative approach for structure recognition. Such sequences assist in traversal through protein sequence space and effectively function as 'linkers', where natural linkers between distant proteins are unavailable. This article was reviewed by Oliviero Carugo, Christine Orengo and Srikrishna Subramanian.

  14. Using an alignment of fragment strings for comparing protein structures

    DEFF Research Database (Denmark)

    Friedberg, Iddo; Harder, Tim; Kolodny, Rachel

    2007-01-01

    RESULTS: Here we describe the use of a particular structure fragment library, denoted here as KL-strings, for the 1D representation of protein structure. Using KL-strings, we develop an infrastructure for comparing protein structures with a 1D representation. This study focuses on the added value gained...

  15. Rheology and structure of milk protein gels

    NARCIS (Netherlands)

    Vliet, van T.; Lakemond, C.M.M.; Visschers, R.W.

    2004-01-01

    Recent studies on gel formation and rheology of milk gels are reviewed. A distinction is made between gels formed by aggregated casein, gels of 'pure' whey proteins and gels in which both casein and whey proteins contribute to their properties. For casein-whey protein mixtures, it has been shown

  16. A hidden markov model derived structural alphabet for proteins.

    Science.gov (United States)

    Camproux, A C; Gautier, R; Tufféry, P

    2004-06-04

    Understanding and predicting protein structures depends on the complexity and the accuracy of the models used to represent them. We have set up a hidden Markov model that discretizes protein backbone conformation as series of overlapping fragments (states) of four residues length. This approach learns simultaneously the geometry of the states and their connections. We obtain, using a statistical criterion, an optimal systematic decomposition of the conformational variability of the protein peptidic chain in 27 states with strong connection logic. This result is stable over different protein sets. Our model fits well the previous knowledge related to protein architecture organisation and seems able to grab some subtle details of protein organisation, such as helix sub-level organisation schemes. Taking into account the dependence between the states results in a description of local protein structure of low complexity. On an average, the model makes use of only 8.3 states among 27 to describe each position of a protein structure. Although we use short fragments, the learning process on entire protein conformations captures the logic of the assembly on a larger scale. Using such a model, the structure of proteins can be reconstructed with an average accuracy close to 1.1 Å root-mean-square deviation and for a complexity of only 3. Finally, we also observe that sequence specificity increases with the number of states of the structural alphabet. Such models can constitute a very relevant approach to the analysis of protein architecture in particular for protein structure prediction.

  17. Implementation of a Parallel Protein Structure Alignment Service on Cloud

    Directory of Open Access Journals (Sweden)

    Che-Lun Hung

    2013-01-01

    Protein structure alignment has become an important strategy by which to identify evolutionary relationships between protein sequences. Several alignment tools are currently available for online comparison of protein structures. In this paper, we propose a parallel protein structure alignment service based on the Hadoop distribution framework. This service includes a protein structure alignment algorithm, a refinement algorithm, and a MapReduce programming model. The refinement algorithm refines the result of alignment. To process vast numbers of protein structures in parallel, the alignment and refinement algorithms are implemented using MapReduce. We analyzed and compared the structure alignments produced by different methods using a dataset randomly selected from the PDB database. The experimental results verify that the proposed algorithm refines the resulting alignments more accurately than existing algorithms. Meanwhile, the computational performance of the proposed service is proportional to the number of processors used in our cloud platform.
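
    The map/reduce split mentioned in this record can be illustrated with an in-process analogue. This sketch does not use Hadoop and does not reproduce the authors' alignment or refinement algorithms: `toy_score` is a hypothetical stand-in that counts matching letters in two secondary-structure strings, and the choice to keep only the best target per query in the reduce step is an assumption.

```python
from functools import reduce


def toy_score(a, b):
    """Hypothetical stand-in for a structure alignment score: count of matching positions."""
    return sum(x == y for x, y in zip(a, b))


def map_phase(pair):
    """Map step: score one (query, target) pair independently of all others."""
    (query_id, query), (target_id, target) = pair
    return query_id, (target_id, toy_score(query, target))


def reduce_phase(best, item):
    """Reduce step: keep the highest-scoring target seen so far for each query."""
    query_id, (target_id, score) = item
    if query_id not in best or score > best[query_id][1]:
        best[query_id] = (target_id, score)
    return best


def best_hits(queries, targets):
    pairs = [
        ((qid, q), (tid, t))
        for qid, q in queries.items()
        for tid, t in targets.items()
    ]
    # Each map call is independent, which is what lets a cluster run them in parallel.
    return reduce(reduce_phase, map(map_phase, pairs), {})


hits = best_hits({"q1": "HHEE"}, {"t1": "HHCC", "t2": "HHEC"})
print(hits)
```

    Because every pairwise comparison is independent, the map step is the part that scales with the number of processors, consistent with the scaling behavior the record reports.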

  18. Compare local pocket and global protein structure models by small structure patterns

    KAUST Repository

    Cui, Xuefeng; Kuwahara, Hiroyuki; Li, Shuai Cheng; Gao, Xin

    2015-01-01

    Researchers proposed several criteria to assess the quality of predicted protein structures because it is one of the essential tasks in the Critical Assessment of Techniques for Protein Structure Prediction (CASP) competitions. Popular criteria

  19. PDB2CD visualises dynamics within protein structures.

    Science.gov (United States)

    Janes, Robert W

    2017-10-01

    Proteins tend to have defined conformations, a key factor in enabling their function. Atomic resolution structures of proteins are predominantly obtained by either solution nuclear magnetic resonance (NMR) or crystal structure methods. However, when considering a protein whose structure has been determined by both these approaches, on many occasions, the resultant conformations are subtly different, as illustrated by the examples in this study. The solution NMR approach invariably results in a cluster of structures whose conformations satisfy the distance boundaries imposed by the data collected; it might be argued that this is evidence of the dynamics of proteins when in solution. In crystal structures, the proteins are often in an energy minimum state which can result in an increase in the extent of regular secondary structure present relative to the solution state depicted by NMR, because the more dynamic ends of alpha helices and beta strands can become ordered at the lower temperatures. This study examines a novel way to display the differences in conformations within an NMR ensemble and between these and a crystal structure of a protein. Circular dichroism (CD) spectroscopy can be used to characterise protein structures in solution. Using the new bioinformatics tool, PDB2CD, which generates CD spectra from atomic resolution protein structures, the differences between, and possible dynamic range of, conformations adopted by a protein can be visualised.

  20. DNA mimic proteins: functions, structures, and bioinformatic analysis.

    Science.gov (United States)

    Wang, Hao-Ching; Ho, Chun-Han; Hsu, Kai-Cheng; Yang, Jinn-Moon; Wang, Andrew H-J

    2014-05-13

    DNA mimic proteins have DNA-like negative surface charge distributions, and they function by occupying the DNA binding sites of DNA binding proteins to prevent these sites from being accessed by DNA. DNA mimic proteins control the activities of a variety of DNA binding proteins and are involved in a wide range of cellular mechanisms such as chromatin assembly, DNA repair, transcription regulation, and gene recombination. However, the sequences and structures of DNA mimic proteins are diverse, making them difficult to predict by bioinformatic search. To date, only a few DNA mimic proteins have been reported. These DNA mimics were not found by searching for functional motifs in their sequences but were revealed only by structural analysis of their charge distribution. This review highlights the biological roles and structures of 16 reported DNA mimic proteins. We also discuss approaches that might be used to discover new DNA mimic proteins.

  1. Relation between native ensembles and experimental structures of proteins

    DEFF Research Database (Denmark)

    Best, R. B.; Lindorff-Larsen, Kresten; DePristo, M. A.

    2006-01-01

    Different experimental structures of the same protein or of proteins with high sequence similarity contain many small variations. Here we construct ensembles of "high-sequence similarity Protein Data Bank" (HSP) structures and consider the extent to which such ensembles represent the structural...... Data Bank ensembles; moreover, we show that the effects of uncertainties in structure determination are insufficient to explain the results. These results highlight the importance of accounting for native-state protein dynamics in making comparisons with ensemble-averaged experimental data and suggest...... heterogeneity of the native state in solution. We find that different NMR measurements probing structure and dynamics of given proteins in solution, including order parameters, scalar couplings, and residual dipolar couplings, are remarkably well reproduced by their respective high-sequence similarity Protein...

  2. De novo centriole formation in human cells is error-prone and does not require SAS-6 self-assembly.

    Science.gov (United States)

    Wang, Won-Jing; Acehan, Devrim; Kao, Chien-Han; Jane, Wann-Neng; Uryu, Kunihiro; Tsou, Meng-Fu Bryan

    2015-11-26

    Vertebrate centrioles normally propagate through duplication, but in the absence of preexisting centrioles, de novo synthesis can occur. Consistently, centriole formation is thought to strictly rely on self-assembly, involving self-oligomerization of the centriolar protein SAS-6. Here, through reconstitution of de novo synthesis in human cells, we surprisingly found that normal looking centrioles capable of duplication and ciliation can arise in the absence of SAS-6 self-oligomerization. Moreover, whereas canonically duplicated centrioles always form correctly, de novo centrioles are prone to structural errors, even in the presence of SAS-6 self-oligomerization. These results indicate that centriole biogenesis does not strictly depend on SAS-6 self-assembly, and may require preexisting centrioles to ensure structural accuracy, fundamentally deviating from the current paradigm.

  3. Arginine de novo and nitric oxide production in disease states

    OpenAIRE

    Luiking, Yvette C.; Ten Have, Gabriella A. M.; Wolfe, Robert R.; Deutz, Nicolaas E. P.

    2012-01-01

    Arginine is derived from dietary protein intake, body protein breakdown, or endogenous de novo arginine production. The latter may be linked to the availability of citrulline, which is the immediate precursor of arginine and limiting factor for de novo arginine production. Arginine metabolism is highly compartmentalized due to the expression of the enzymes involved in arginine metabolism in various organs. A small fraction of arginine enters the NO synthase (NOS) pathway. Tetrahydrobiopterin ...

  4. Non-interacting surface solvation and dynamics in protein-protein interactions

    NARCIS (Netherlands)

    Visscher, Koen M.; Kastritis, Panagiotis L.; Bonvin, Alexandre M J J

    2015-01-01

    Protein-protein interactions control a plethora of cellular processes, including cell proliferation, differentiation, apoptosis, and signal transduction. Understanding how and why proteins interact will inevitably lead to novel structure-based drug design methods, as well as design of de novo

  5. Current strategies for protein production and purification enabling membrane protein structural biology.

    Science.gov (United States)

    Pandey, Aditya; Shin, Kyungsoo; Patterson, Robin E; Liu, Xiang-Qin; Rainey, Jan K

    2016-12-01

    Membrane proteins are still heavily under-represented in the protein data bank (PDB), owing to multiple bottlenecks. The typical low abundance of membrane proteins in their natural hosts makes it necessary to overexpress these proteins either in heterologous systems or through in vitro translation/cell-free expression. Heterologous expression of proteins, in turn, leads to multiple obstacles, owing to the unpredictability of compatibility of the target protein for expression in a given host. The highly hydrophobic and (or) amphipathic nature of membrane proteins also leads to challenges in producing a homogeneous, stable, and pure sample for structural studies. Circumventing these hurdles has become possible through the introduction of novel protein production protocols; efficient protein isolation and sample preparation methods; and, improvement in hardware and software for structural characterization. Combined, these advances have made the past 10-15 years very exciting and eventful for the field of membrane protein structural biology, with an exponential growth in the number of solved membrane protein structures. In this review, we focus on both the advances and diversity of protein production and purification methods that have allowed this growth in structural knowledge of membrane proteins through X-ray crystallography, nuclear magnetic resonance (NMR) spectroscopy, and cryo-electron microscopy (cryo-EM).

  6. Predicting nucleic acid binding interfaces from structural models of proteins.

    Science.gov (United States)

    Dror, Iris; Shazman, Shula; Mukherjee, Srayanta; Zhang, Yang; Glaser, Fabian; Mandel-Gutfreund, Yael

    2012-02-01

    The function of DNA- and RNA-binding proteins can be inferred from the characterization and accurate prediction of their binding interfaces. However, the main pitfall of various structure-based methods for predicting nucleic acid binding function is that they are all limited to a relatively small number of proteins for which high-resolution three-dimensional structures are available. In this study, we developed a pipeline for extracting functional electrostatic patches from surfaces of protein structural models, obtained using the I-TASSER protein structure predictor. The largest positive patches are extracted from the protein surface using the patchfinder algorithm. We show that functional electrostatic patches extracted from an ensemble of structural models highly overlap the patches extracted from high-resolution structures. Furthermore, by testing our pipeline on a set of 55 known nucleic acid binding proteins for which I-TASSER produces high-quality models, we show that the method accurately identifies the nucleic acids binding interface on structural models of proteins. Employing a combined patch approach we show that patches extracted from an ensemble of models better predicts the real nucleic acid binding interfaces compared with patches extracted from independent models. Overall, these results suggest that combining information from a collection of low-resolution structural models could be a valuable approach for functional annotation. We suggest that our method will be further applicable for predicting other functional surfaces of proteins with unknown structure. Copyright © 2011 Wiley Periodicals, Inc.

  7. Ion pairs in non-redundant protein structures

    Indian Academy of Sciences (India)

    Ion pairs contribute to several functions including the activity of catalytic triads, fusion of viral membranes, stability in thermophilic proteins and solvent–protein interactions. Furthermore, they have the ability to affect the stability of protein structures and are also a part of the forces that act to hold monomers together.

  8. The structure and function of endophilin proteins

    DEFF Research Database (Denmark)

    Kjaerulff, Ole; Brodin, Lennart; Jung, Anita

    2011-01-01

    Members of the BAR domain protein superfamily are essential elements of cellular traffic. Endophilins are among the best studied BAR domain proteins. They have a prominent function in synaptic vesicle endocytosis (SVE), receptor trafficking and apoptosis, and in other processes that require...

  9. BLAST-based structural annotation of protein residues using Protein Data Bank.

    Science.gov (United States)

    Singh, Harinder; Raghava, Gajendra P S

    2016-01-25

    In the era of next-generation sequencing, in which thousands of genomes have already been sequenced, the size of protein databases is growing at an exponential rate. Structural annotation of these proteins is one of the biggest challenges for the computational biologist. Although it is easy to perform a BLAST search against the Protein Data Bank (PDB), it is difficult for a biologist to annotate protein residues from a BLAST search. A web server, StarPDB, has been developed for structural annotation of a protein based on its similarity with known protein structures. It uses standard BLAST software to perform a similarity search of a query protein against protein structures in the PDB. The server integrates a wide range of modules for assigning different types of annotation, including Secondary-structure, Accessible surface area, Tight-turns, DNA-RNA and Ligand modules. The Secondary-structure module allows users to predict the regular secondary structure state of each residue in a protein. The Accessible surface area module predicts the exposed or buried residues in a protein. The Tight-turns module is designed to predict tight turns, such as beta-turns, in a protein. The DNA-RNA module was developed for predicting DNA- and RNA-interacting residues in a protein. Similarly, the Ligand module allows one to predict ligand-, metal- and nucleotide-interacting residues in a protein. In summary, this manuscript presents a web server for comprehensive annotation of a protein based on similarity search. It integrates a number of visualization tools that help users understand the structure and function of protein residues. The web server is freely available to the scientific community at http://crdd.osdd.net/raghava/starpdb .
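
The idea of annotating query residues from a similar known structure can be sketched as follows. This is only an illustration under stated assumptions, not the StarPDB implementation: the function name `transfer_annotation`, the one-character-per-residue annotation string, and the `?` placeholder for uncovered residues are all hypothetical, and the pairwise alignment is assumed to have been produced already (for example by BLAST).

```python
def transfer_annotation(aligned_query, aligned_hit, hit_annotation, unknown="?"):
    """Copy per-residue labels from an aligned hit onto the query.

    aligned_query / aligned_hit: equal-length alignment rows using '-' for gaps.
    hit_annotation: one label character per (ungapped) hit residue,
                    e.g. 'H' helix, 'E' strand, 'C' coil (hypothetical coding).
    Returns one label per query residue; residues aligned to a gap get `unknown`.
    """
    if len(aligned_query) != len(aligned_hit):
        raise ValueError("alignment rows must have equal length")
    if len(hit_annotation) != len(aligned_hit.replace("-", "")):
        raise ValueError("annotation must cover every hit residue")

    labels = []
    hit_pos = 0  # index into the ungapped hit sequence / annotation
    for q, h in zip(aligned_query, aligned_hit):
        hit_label = None
        if h != "-":
            hit_label = hit_annotation[hit_pos]
            hit_pos += 1
        if q != "-":
            # Query residue present: take the hit's label, or mark as unknown
            labels.append(hit_label if hit_label is not None else unknown)
    return "".join(labels)
```

A call such as `transfer_annotation("MKT-LLVA", "MK-ALLIA", "CCHHHHC")` returns one label for each of the seven query residues, with the residue that faces a gap in the hit left unannotated.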

  10. The contact activation proteins: a structure/function overview

    NARCIS (Netherlands)

    Meijers, J. C.; McMullen, B. A.; Bouma, B. N.

    1992-01-01

    In recent years, extensive knowledge has been obtained on the structure/function relationships of blood coagulation proteins. In this overview, we present recent developments on the structure/function relationships of the contact activation proteins: factor XII, high molecular weight kininogen,

  11. CMsearch: simultaneous exploration of protein sequence space and structure space improves not only protein homology detection but also protein structure prediction

    KAUST Repository

    Cui, Xuefeng

    2016-06-15

    Motivation: Protein homology detection, a fundamental problem in computational biology, is an indispensable step toward predicting protein structures and understanding protein functions. Despite the advances in recent decades on sequence alignment, threading and alignment-free methods, protein homology detection remains a challenging open problem. Recently, network methods that try to find transitive paths in the protein structure space demonstrate the importance of incorporating network information of the structure space. Yet, current methods merge the sequence space and the structure space into a single space, and thus introduce inconsistency in combining different sources of information. Method: We present a novel network-based protein homology detection method, CMsearch, based on cross-modal learning. Instead of exploring a single network built from the mixture of sequence and structure space information, CMsearch builds two separate networks to represent the sequence space and the structure space. It then learns sequence–structure correlation by simultaneously taking sequence information, structure information, sequence space information and structure space information into consideration. Results: We tested CMsearch on two challenging tasks, protein homology detection and protein structure prediction, by querying all 8332 PDB40 proteins. Our results demonstrate that CMsearch is insensitive to the similarity metrics used to define the sequence and the structure spaces. By using HMM–HMM alignment as the sequence similarity metric, CMsearch clearly outperforms state-of-the-art homology detection methods and the CASP-winning template-based protein structure prediction methods.

  12. Prediction of protein–protein interactions: unifying evolution and structure at protein interfaces

    International Nuclear Information System (INIS)

    Tuncbag, Nurcan; Gursoy, Attila; Keskin, Ozlem

    2011-01-01

    The vast majority of the chores in the living cell involve protein–protein interactions. Providing details of protein interactions at the residue level and incorporating them into protein interaction networks are crucial toward the elucidation of a dynamic picture of cells. Despite the rapid increase in the number of structurally known protein complexes, we are still far away from a complete network. Given experimental limitations, computational modeling of protein interactions is a prerequisite to proceed on the way to complete structural networks. In this work, we focus on the question 'how do proteins interact?' rather than 'which proteins interact?' and we review structure-based protein–protein interaction prediction approaches. As a sample approach for modeling protein interactions, PRISM is detailed which combines structural similarity and evolutionary conservation in protein interfaces to infer structures of complexes in the protein interaction network. This will ultimately help us to understand the role of protein interfaces in predicting bound conformations

  13. Structure of synaptophysin: a hexameric MARVEL-domain channel protein.

    Science.gov (United States)

    Arthur, Christopher P; Stowell, Michael H B

    2007-06-01

    Synaptophysin I (SypI) is an archetypal member of the MARVEL-domain family of integral membrane proteins and one of the first synaptic vesicle proteins to be identified and cloned. Almost all MARVEL-domain proteins are involved in membrane apposition and vesicle-trafficking events, but their precise role in these processes is unclear. We have purified mammalian SypI and determined its three-dimensional (3D) structure by using electron microscopy and single-particle 3D reconstruction. The hexameric structure resembles an open basket with a large pore and tenuous interactions within the cytosolic domain. The structure suggests a model for synaptophysin's role in fusion and recycling that is regulated by known interactions with the SNARE machinery. This 3D structure of a MARVEL-domain protein provides a structural foundation for understanding the role of these important proteins in a variety of biological processes.

  14. Sampling Realistic Protein Conformations Using Local Structural Bias

    DEFF Research Database (Denmark)

    Hamelryck, Thomas Wim; Kent, John T.; Krogh, A.

    2006-01-01

    The prediction of protein structure from sequence remains a major unsolved problem in biology. The most successful protein structure prediction methods make use of a divide-and-conquer strategy to attack the problem: a conformational sampling method generates plausible candidate structures, which...... are subsequently accepted or rejected using an energy function. Conceptually, this often corresponds to separating local structural bias from the long-range interactions that stabilize the compact, native state. However, sampling protein conformations that are compatible with the local structural bias encoded...... in a given protein sequence is a long-standing open problem, especially in continuous space. We describe an elegant and mathematically rigorous method to do this, and show that it readily generates native-like protein conformations simply by enforcing compactness. Our results have far-reaching implications...

  15. Rapid and reliable protein structure determination via chemical shift threading.

    Science.gov (United States)

    Hafsa, Noor E; Berjanskii, Mark V; Arndt, David; Wishart, David S

    2018-01-01

    Protein structure determination using nuclear magnetic resonance (NMR) spectroscopy can be both time-consuming and labor intensive. Here we demonstrate how chemical shift threading can permit rapid, robust, and accurate protein structure determination using only chemical shift data. Threading is a relatively old bioinformatics technique that uses a combination of sequence information and predicted (or experimentally acquired) low-resolution structural data to generate high-resolution 3D protein structures. The key motivations behind using NMR chemical shifts for protein threading lie in the fact that they are easy to measure, they are available prior to 3D structure determination, and they contain vital structural information. The method we have developed uses not only sequence and chemical shift similarity but also chemical shift-derived secondary structure, shift-derived super-secondary structure, and shift-derived accessible surface area to generate a high quality protein structure regardless of the sequence similarity (or lack thereof) to a known structure already in the PDB. The method (called E-Thrifty) was found to be very fast (often chemical shift refinement, these results suggest that protein structure determination, using only NMR chemical shifts, is becoming increasingly practical and reliable. E-Thrifty is available as a web server at http://ethrifty.ca .

  16. Is protein structure prediction still an enigma?

    African Journals Online (AJOL)

    2008-12-29

    Dec 29, 2008 ... Computer methods for protein analysis address this problem since they study the .... neighbor methods, molecular dynamic simulation, and approaches .... fuzzy clustering, neural networks, logistic regression, decision tree ...

  17. Solution structure and dynamics of melanoma inhibitory activity protein

    International Nuclear Information System (INIS)

    Lougheed, Julie C.; Domaille, Peter J.; Handel, Tracy M.

    2002-01-01

    Melanoma inhibitory activity (MIA) is a small secreted protein that is implicated in cartilage cell maintenance and melanoma metastasis. It is representative of a recently discovered family of proteins that contain a Src homology 3 (SH3) subdomain. While SH3 domains are normally found in intracellular proteins and mediate protein-protein interactions via recognition of polyproline helices, MIA is a single-domain extracellular protein, and it probably binds to a different class of ligands. Here we report the assignments, solution structure, and dynamics of human MIA determined by heteronuclear NMR methods. The structures were calculated in a semi-automated manner without manual assignment of NOE crosspeaks, and have a backbone rmsd of 0.38 Å over the ordered regions of the protein. The structure consists of an SH3-like subdomain with N- and C-terminal extensions of approximately 20 amino acids each that together form a novel fold. The rmsd between the solution structure and our recently reported crystal structure is 0.86 Å over the ordered regions of the backbone, and the main differences are localized to the most dynamic regions of the protein. The similarity between the NMR and crystal structures supports the use of automated NOE assignments and ambiguous restraints to accelerate the calculation of NMR structures.

  18. Function and structure of GFP-like proteins in the protein data bank.

    Science.gov (United States)

    Ong, Wayne J-H; Alvarez, Samuel; Leroux, Ivan E; Shahid, Ramza S; Samma, Alex A; Peshkepija, Paola; Morgan, Alicia L; Mulcahy, Shawn; Zimmer, Marc

    2011-04-01

    The RCSB Protein Data Bank contains 266 crystal structures of green fluorescent proteins (GFP) and GFP-like proteins. This is the first systematic analysis of all the GFP-like structures in the PDB. We have used the PDB to examine the function of fluorescent proteins (FP) in nature, aspects of excited state proton transfer (ESPT) in FPs, deformation from planarity of the chromophore and chromophore maturation. The conclusions reached in this review are that (1) The lid residues are highly conserved, particularly those on the "top" of the β-barrel. They are important to the function of GFP-like proteins, perhaps in protecting the chromophore or in β-barrel formation. (2) The primary/ancestral function of GFP-like proteins may well be to aid in light induced electron transfer. (3) The structural prerequisites for light activated proton pumps exist in many structures and it's possible that like bioluminescence, proton pumps are secondary functions of GFP-like proteins. (4) In most GFP-like proteins the protein matrix exerts a significant strain on planar chromophores forcing most GFP-like proteins to adopt non-planar chromophores. These chromophoric deviations from planarity play an important role in determining the fluorescence quantum yield. (5) The chemospatial characteristics of the chromophore cavity determine the isomerization state of the chromophore. The cavities of highlighter proteins that can undergo cis/trans isomerization have chemospatial properties that are common to both cis and trans GFP-like proteins.

  19. A protein relational database and protein family knowledge bases to facilitate structure-based design analyses.

    Science.gov (United States)

    Mobilio, Dominick; Walker, Gary; Brooijmans, Natasja; Nilakantan, Ramaswamy; Denny, R Aldrin; Dejoannis, Jason; Feyfant, Eric; Kowticwar, Rupesh K; Mankala, Jyoti; Palli, Satish; Punyamantula, Sairam; Tatipally, Maneesh; John, Reji K; Humblet, Christine

    2010-08-01

    The Protein Data Bank is the most comprehensive source of experimental macromolecular structures. It can, however, be difficult at times to locate relevant structures with the Protein Data Bank search interface. This is particularly true when searching for complexes containing specific interactions between protein and ligand atoms. Moreover, searching within a family of proteins can be tedious. For example, one cannot search for some conserved residue as residue numbers vary across structures. We describe herein three databases, Protein Relational Database, Kinase Knowledge Base, and Matrix Metalloproteinase Knowledge Base, containing protein structures from the Protein Data Bank. In Protein Relational Database, atom-atom distances between protein and ligand have been precalculated allowing for millisecond retrieval based on atom identity and distance constraints. Ring centroids, centroid-centroid and centroid-atom distances and angles have also been included permitting queries for pi-stacking interactions and other structural motifs involving rings. Other geometric features can be searched through the inclusion of residue pair and triplet distances. In Kinase Knowledge Base and Matrix Metalloproteinase Knowledge Base, the catalytic domains have been aligned into common residue numbering schemes. Thus, by searching across Protein Relational Database and Kinase Knowledge Base, one can easily retrieve structures wherein, for example, a ligand of interest is making contact with the gatekeeper residue.
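
The precalculated-distance idea described above can be sketched with an in-memory SQLite table. Everything here is an assumption made for illustration: the `contact` table, its column names, the index, and the toy rows (`toy1` to `toy3`) are hypothetical and do not reflect the actual schema or contents of the Protein Relational Database.

```python
import sqlite3

def build_toy_db():
    """Create an in-memory table of precomputed protein-ligand atom distances."""
    con = sqlite3.connect(":memory:")
    con.execute(
        """CREATE TABLE contact (
               structure_id TEXT,   -- hypothetical structure identifier
               ligand_atom  TEXT,   -- element of the ligand atom
               protein_atom TEXT,   -- element of the protein atom
               residue      TEXT,   -- residue carrying the protein atom
               distance     REAL    -- precomputed distance in angstroms
           )"""
    )
    rows = [  # made-up rows, for illustration only
        ("toy1", "N", "O", "GLU81", 2.9),
        ("toy1", "C", "C", "LEU83", 3.8),
        ("toy2", "N", "O", "ASP86", 4.6),
        ("toy3", "N", "O", "MET120", 3.1),
        ("toy3", "O", "N", "LYS33", 2.8),
    ]
    con.executemany("INSERT INTO contact VALUES (?, ?, ?, ?, ?)", rows)
    # An index on atom identity and distance keeps lookups fast as the table grows
    con.execute(
        "CREATE INDEX idx_contact ON contact (ligand_atom, protein_atom, distance)"
    )
    return con

def find_structures(con, ligand_atom, protein_atom, max_distance):
    """Return structure ids with at least one matching contact within max_distance."""
    cur = con.execute(
        """SELECT DISTINCT structure_id FROM contact
           WHERE ligand_atom = ? AND protein_atom = ? AND distance <= ?
           ORDER BY structure_id""",
        (ligand_atom, protein_atom, max_distance),
    )
    return [row[0] for row in cur]
```

Because the distances are stored rather than computed at query time, a constraint such as "ligand nitrogen within 3.5 Å of a protein oxygen" reduces to an indexed range lookup.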

  20. Tuning structure of oppositely charged nanoparticle and protein complexes

    Energy Technology Data Exchange (ETDEWEB)

    Kumar, Sugam, E-mail: sugam@barc.gov.in; Aswal, V. K., E-mail: sugam@barc.gov.in [Solid State Physics Division, Bhabha Atomic Research Centre, Mumbai-400085 (India); Callow, P. [Institut Laue Langevin, DS/LSS, 6 rue Jules Horowitz, 38042 Grenoble Cedex 9 (France)

    2014-04-24

    Small-angle neutron scattering (SANS) has been used to probe the structures of anionic silica nanoparticles (LS30) and cationic lysozyme protein (M.W. 14.7 kD, I.P. ∼ 11.4) by tuning their interaction through pH variation. Protein adsorption on the nanoparticles is found to increase with pH and is determined by the electrostatic attraction between the two components as well as the repulsion between protein molecules. We show that the strong electrostatic attraction between nanoparticles and protein molecules leads to protein-mediated aggregation of nanoparticles, which is characterized by fractal structures. At pH 5, the protein adsorption gives rise to nanoparticle aggregation having surface fractal morphology with close packing of nanoparticles. The surface fractals transform to open structures of mass fractal morphology at higher pH (7 and 9) on approaching the isoelectric point (I.P.).

  1. Studying Membrane Protein Structure and Function Using Nanodiscs

    DEFF Research Database (Denmark)

    Huda, Pie

    The structure and dynamics of membrane proteins can provide valuable information about general functions, diseases and effects of various drugs. Studying membrane proteins is a challenge as an amphiphilic environment is necessary to stabilise the protein in a functionally and structurally relevant...... form. This is most typically achieved through the use of detergent based reconstitution systems. However, time and again such systems fail to provide a suitable environment causing aggregation and inactivation. Nanodiscs are self-assembled lipoproteins containing two membrane scaffold proteins...... and a lipid bilayer in defined nanometer size, which can act as a stabiliser for membrane proteins. This enables both functional and structural investigation of membrane proteins in a detergent free environment which is closer to the native situation. Understanding the self-assembly of nanodiscs is important...

  2. Exploring protein dynamics space: the dynasome as the missing link between protein structure and function.

    Directory of Open Access Journals (Sweden)

    Ulf Hensen

    Proteins are usually described and classified according to amino acid sequence, structure or function. Here, we develop a minimally biased scheme to compare and classify proteins according to their internal mobility patterns. This approach is based on the notion that proteins not only fold into recurring structural motifs but might also be carrying out only a limited set of recurring mobility motifs. The complete set of these patterns, which we tentatively call the dynasome, spans a multi-dimensional space with axes, the dynasome descriptors, characterizing different aspects of protein dynamics. The unique dynamic fingerprint of each protein is represented as a vector in the dynasome space. The difference between any two vectors, consequently, gives a reliable measure of the difference between the corresponding protein dynamics. We characterize the properties of the dynasome by comparing the dynamics fingerprints obtained from molecular dynamics simulations of 112 proteins, but our approach is, in principle, not restricted to any specific source of data of protein dynamics. We conclude that: (1) the dynasome consists of a continuum of proteins, rather than well-separated classes; (2) for the majority of proteins we observe strong correlations between structure and dynamics; (3) proteins with similar function carry out similar dynamics, which suggests a new method to improve protein function annotation based on protein dynamics.
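
The notion of a fingerprint vector whose difference measures dissimilarity in dynamics can be illustrated with a short sketch. The abstract does not specify the descriptors or the metric, so the choices below are assumptions: each descriptor is z-scored across proteins so that no single axis dominates, and the Euclidean distance is used. The function names are hypothetical.

```python
import math
from statistics import mean, pstdev

def standardise(fingerprints):
    """Z-score each descriptor (column) across all proteins.

    fingerprints: dict mapping protein name -> list of descriptor values,
                  all lists having the same length.
    """
    names = list(fingerprints)
    columns = list(zip(*(fingerprints[n] for n in names)))
    means = [mean(col) for col in columns]
    # Fall back to 1.0 for constant descriptors to avoid dividing by zero
    spreads = [pstdev(col) or 1.0 for col in columns]
    return {
        n: [(v - m) / s for v, m, s in zip(fingerprints[n], means, spreads)]
        for n in names
    }

def dynamics_distance(u, v):
    """Euclidean distance between two standardised fingerprint vectors."""
    return math.dist(u, v)
```

With fingerprints standardised this way, proteins whose descriptor values are close on every axis end up with a small distance, while a protein that differs strongly on even one descriptor is pushed away.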

  3. Host Proteins Determine MRSA Biofilm Structure and Integrity

    DEFF Research Database (Denmark)

    Dreier, Cindy; Nielsen, Astrid; Jørgensen, Nis Pedersen

    Human extracellular matrix (hECM) proteins aid the initial attachment and initiation of an infection by specific binding to bacterial cell surface proteins. However, the importance of hECM proteins in the structure, integrity and antibiotic resilience of a biofilm is unknown. This study aims...... to determine how specific hECM proteins affect S. aureus USA300 JE2 biofilms. Biofilms were grown in the presence of synovial fluid from rheumatoid arthritis patients to mimic in vivo conditions, where bacteria incorporate hECM proteins into the biofilm matrix. Difference in biofilm structure, with and without...... addition of hECM to growth media, was visualized by confocal laser scanning microscopy. Two enzymatic degradation experiments were used to study biofilm matrix composition and importance of hECM proteins: enzymatic removal of specific hECM proteins from growth media, before biofilm formation, and enzymatic...

  4. Integral membrane protein structure determination using pseudocontact shifts

    Energy Technology Data Exchange (ETDEWEB)

    Crick, Duncan J.; Wang, Jue X. [University of Cambridge, Department of Biochemistry (United Kingdom); Graham, Bim; Swarbrick, James D. [Monash University, Monash Institute of Pharmaceutical Sciences (Australia); Mott, Helen R.; Nietlispach, Daniel, E-mail: dn206@cam.ac.uk [University of Cambridge, Department of Biochemistry (United Kingdom)

    2015-04-15

    Obtaining enough experimental restraints can be a limiting factor in the NMR structure determination of larger proteins. This is particularly the case for large assemblies such as membrane proteins that have been solubilized in a membrane-mimicking environment. Whilst in such cases extensive deuteration strategies are regularly utilised with the aim to improve the spectral quality, these schemes often limit the number of NOEs obtainable, making complementary strategies highly beneficial for successful structure elucidation. Recently, lanthanide-induced pseudocontact shifts (PCSs) have been established as a structural tool for globular proteins. Here, we demonstrate that a PCS-based approach can be successfully applied for the structure determination of integral membrane proteins. Using the 7TM α-helical microbial receptor pSRII, we show that PCS-derived restraints from lanthanide binding tags attached to four different positions of the protein facilitate the backbone structure determination when combined with a limited set of NOEs. In contrast, the same set of NOEs fails to determine the correct 3D fold. The latter situation is frequently encountered in polytopical α-helical membrane proteins and a PCS approach is thus suitable even for this particularly challenging class of membrane proteins. The ease of measuring PCSs makes this an attractive route for structure determination of large membrane proteins in general.

  5. Using linear algebra for protein structural comparison and classification.

    Science.gov (United States)

    Gomide, Janaína; Melo-Minardi, Raquel; Dos Santos, Marcos Augusto; Neshich, Goran; Meira, Wagner; Lopes, Júlio César; Santoro, Marcelo

    2009-07-01

    In this article, we describe a novel methodology to extract semantic characteristics from protein structures using linear algebra in order to compose structural signature vectors which may be used efficiently to compare and classify protein structures into fold families. These signatures are built from the pattern of hydrophobic intrachain interactions using Singular Value Decomposition (SVD) and Latent Semantic Indexing (LSI) techniques. Considering proteins as documents and contacts as terms, we have built a retrieval system which is able to find conserved contacts in samples of myoglobin fold family and to retrieve these proteins among proteins of varied folds with precision of up to 80%. The classifier is a web tool available at our laboratory website. Users can search for similar chains from a specific PDB, view and compare their contact maps and browse their structures using a JMol plug-in.

  6. Using linear algebra for protein structural comparison and classification

    Directory of Open Access Journals (Sweden)

    Janaína Gomide

    2009-01-01

    In this article, we describe a novel methodology to extract semantic characteristics from protein structures using linear algebra in order to compose structural signature vectors which may be used efficiently to compare and classify protein structures into fold families. These signatures are built from the pattern of hydrophobic intrachain interactions using Singular Value Decomposition (SVD) and Latent Semantic Indexing (LSI) techniques. Considering proteins as documents and contacts as terms, we have built a retrieval system which is able to find conserved contacts in samples of myoglobin fold family and to retrieve these proteins among proteins of varied folds with precision of up to 80%. The classifier is a web tool available at our laboratory website. Users can search for similar chains from a specific PDB, view and compare their contact maps and browse their structures using a JMol plug-in.

  7. Structural Mass Spectrometry of Proteins Using Hydroxyl Radical Based Protein Footprinting

    OpenAIRE

    Wang, Liwen; Chance, Mark R.

    2011-01-01

    Structural MS is a rapidly growing field with many applications in basic research and pharmaceutical drug development. In this feature article the overall technology is described and several examples of how hydroxyl radical based footprinting MS can be used to map interfaces, evaluate protein structure, and identify ligand dependent conformational changes in proteins are described.

  8. PSPP: a protein structure prediction pipeline for computing clusters.

    Directory of Open Access Journals (Sweden)

    Michael S Lee

    2009-07-01

    Protein structures are critical for understanding the mechanisms of biological systems and, subsequently, for drug and vaccine design. Unfortunately, protein sequence data exceed structural data by a factor of more than 200 to 1. This gap can be partially filled by using computational protein structure prediction. While structure prediction Web servers are a notable option, they often restrict the number of sequence queries and/or provide a limited set of prediction methodologies. Therefore, we present a standalone protein structure prediction software package suitable for high-throughput structural genomic applications that performs all three classes of prediction methodologies: comparative modeling, fold recognition, and ab initio. This software can be deployed on a user's own high-performance computing cluster. The pipeline consists of a Perl core that integrates more than 20 individual software packages and databases, most of which are freely available from other research laboratories. The query protein sequences are first divided into domains either by domain boundary recognition or Bayesian statistics. The structures of the individual domains are then predicted using template-based modeling or ab initio modeling. The predicted models are scored with a statistical potential and an all-atom force field. The top-scoring ab initio models are annotated by structural comparison against the Structural Classification of Proteins (SCOP) fold database. Furthermore, secondary structure, solvent accessibility, transmembrane helices, and structural disorder are predicted. The results are generated in text, tab-delimited, and hypertext markup language (HTML) formats. So far, the pipeline has been used to study viral and bacterial proteomes. The standalone pipeline that we introduce here, unlike protein structure prediction Web servers, allows users to devote their own computing assets to process a potentially unlimited number of queries as well as perform

  9. Bayesian comparison of protein structures using partial Procrustes distance.

    Science.gov (United States)

    Ejlali, Nasim; Faghihi, Mohammad Reza; Sadeghi, Mehdi

    2017-09-26

    An important topic in bioinformatics is protein structure alignment. Some statistical methods have been proposed for this problem, but most of them align two protein structures based on global geometric information without considering the effect of neighbourhood in the structures. In this paper, we provide a Bayesian model to align protein structures by considering the effect of both local and global geometric information of protein structures. Local geometric information is incorporated into the model through the partial Procrustes distance of small substructures. These substructures are composed of β-carbon atoms from the side chains. Parameters are estimated using a Markov chain Monte Carlo (MCMC) approach. We evaluate the performance of our model through some simulation studies. Furthermore, we apply our model to a real dataset and assess the accuracy and convergence rate. Results show that our model is much more efficient than previous approaches.
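
For readers unfamiliar with the term, a Procrustes distance without scaling can be sketched in a few lines. This is a toy under stated assumptions and not the authors' Bayesian model: it works in two dimensions (protein substructures are three-dimensional), it assumes the point correspondence is already known, and "partial Procrustes distance" is taken here to mean the minimum root-sum-of-squares deviation over translations and rotations only. The function name is hypothetical.

```python
import math

def partial_procrustes_distance_2d(a, b):
    """Minimum distance between two matched 2D point sets over rotation
    and translation (no scaling, no reflection).

    a, b: equal-length sequences of (x, y) pairs in corresponding order.
    """
    if len(a) != len(b) or not a:
        raise ValueError("point sets must be non-empty and of equal size")

    # Represent points as complex numbers so a rotation is a multiplication
    z = [complex(x, y) for x, y in a]
    w = [complex(x, y) for x, y in b]

    # Remove translation by centring both configurations
    z_mean = sum(z) / len(z)
    w_mean = sum(w) / len(w)
    z = [p - z_mean for p in z]
    w = [p - w_mean for p in w]

    # For the best rotation angle the cross term equals |sum(conj(w_i) * z_i)|
    cross = sum(q.conjugate() * p for p, q in zip(z, w))
    squared = (
        sum(abs(p) ** 2 for p in z)
        + sum(abs(q) ** 2 for q in w)
        - 2.0 * abs(cross)
    )
    # Clamp tiny negative values caused by floating-point rounding
    return math.sqrt(max(squared, 0.0))
```

Two configurations that differ only by a rigid motion give a distance of essentially zero, whereas a change in shape or size leaves a positive residual because scaling is not optimised away.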

  10. Structural and Functional Annotation of Hypothetical Proteins of Vibrio cholerae O139

    Directory of Open Access Journals (Sweden)

    Md. Saiful Islam

    2015-06-01

    In developing countries, the threat of cholera is a significant health concern wherever water purification and sewage disposal systems are inadequate. Vibrio cholerae is one of the bacteria responsible for cholera. The complete genome sequence of V. cholerae reveals the presence of various genes and hypothetical proteins whose functions are not yet understood. Hence, analyzing and annotating the structure and function of hypothetical proteins is important for understanding V. cholerae. V. cholerae O139 is the most common and pathogenic bacterial strain among the various V. cholerae strains. In this study, the sequences of six hypothetical proteins of V. cholerae O139 have been annotated from NCBI. Various computational tools and databases have been used to determine domain family, protein-protein interaction, solubility of protein, ligand binding sites, etc. The three-dimensional structures of two proteins were modeled and their ligand binding sites were identified. We have found domains and families of only one protein. The analysis revealed that these proteins might have antibiotic resistance activity, DNA breaking-rejoining activity, integrase enzyme activity, restriction endonuclease activity, etc. Structural prediction of these proteins and detection of binding sites from this study would indicate a potential target, aiding docking studies for therapeutic design against cholera.

  11. Structural study of surfactant-dependent interaction with protein

    Energy Technology Data Exchange (ETDEWEB)

    Mehan, Sumit; Aswal, Vinod K., E-mail: vkaswal@barc.gov.in [Solid State Physics Division, Bhabha Atomic Research Centre, Mumbai 400 085 (India); Kohlbrecher, Joachim [Laboratory for Neutron Scattering, Paul Scherrer Institut, CH-5232 PSI Villigen (Switzerland)

    2015-06-24

    Small-angle neutron scattering (SANS) has been used to study the complex structure of anionic BSA protein with three different (cationic DTAB, anionic SDS and non-ionic C12E10) surfactants. These systems form very different surfactant-dependent complexes. We show that the structure of protein-surfactant complex is initiated by the site-specific electrostatic interaction between the components, followed by the hydrophobic interaction at high surfactant concentrations. It is also found that hydrophobic interaction is preferred over the electrostatic interaction in deciding the resultant structure of protein-surfactant complexes.

  12. Integrating protein structures and precomputed genealogies in the Magnum database: Examples with cellular retinoid binding proteins

    Directory of Open Access Journals (Sweden)

    Bradley Michael E

    2006-02-01

    Background: When accurate models for the divergent evolution of protein sequences are integrated with complementary biological information, such as folded protein structures, analyses of the combined data often lead to new hypotheses about molecular physiology. This represents an excellent example of how bioinformatics can be used to guide experimental research. However, progress in this direction has been slowed by the lack of a publicly available resource suitable for general use. Results: The precomputed Magnum database offers a solution to this problem for ca. 1,800 full-length protein families with at least one crystal structure. The Magnum deliverables include (1) multiple sequence alignments, (2) mapping of alignment sites to crystal structure sites, (3) phylogenetic trees, (4) inferred ancestral sequences at internal tree nodes, and (5) amino acid replacements along tree branches. Comprehensive evaluations revealed that the automated procedures used to construct Magnum produced accurate models of how proteins divergently evolve, or genealogies, and correctly integrated these with the structural data. To demonstrate Magnum's capabilities, we asked for amino acid replacements requiring three nucleotide substitutions, located at internal protein structure sites, and occurring on short phylogenetic tree branches. In the cellular retinoid binding protein family a site that potentially modulates ligand binding affinity was discovered. Recruitment of cellular retinol binding protein to function as a lens crystallin in the diurnal gecko afforded another opportunity to showcase the predictive value of a browsable database containing branch replacement patterns integrated with protein structures. Conclusion: We integrated two areas of protein science, evolution and structure, on a large scale and created a precomputed database, known as Magnum, which is the first freely available resource of its kind. Magnum provides evolutionary and structural

  13. Protein Structure Classification and Loop Modeling Using Multiple Ramachandran Distributions

    KAUST Repository

    Najibi, Seyed Morteza

    2017-02-08

    Recently, the study of protein structures using angular representations has attracted much attention among structural biologists. The main challenge is how to efficiently model the continuous conformational space of the protein structures based on the differences and similarities between different Ramachandran plots. Despite the presence of statistical methods for modeling angular data of proteins, there is still a substantial need for more sophisticated and faster statistical tools to model the large-scale circular datasets. To address this need, we have developed a nonparametric method for collective estimation of multiple bivariate density functions for a collection of populations of protein backbone angles. The proposed method takes into account the circular nature of the angular data using trigonometric spline which is more efficient compared to existing methods. This collective density estimation approach is widely applicable when there is a need to estimate multiple density functions from different populations with common features. Moreover, the coefficients of adaptive basis expansion for the fitted densities provide a low-dimensional representation that is useful for visualization, clustering, and classification of the densities. The proposed method provides a novel and unique perspective to two important and challenging problems in protein structure research: structure-based protein classification and angular-sampling-based protein loop structure prediction.

  14. Protein Structure Classification and Loop Modeling Using Multiple Ramachandran Distributions

    KAUST Repository

    Najibi, Seyed Morteza; Maadooliat, Mehdi; Zhou, Lan; Huang, Jianhua Z.; Gao, Xin

    2017-01-01

    Recently, the study of protein structures using angular representations has attracted much attention among structural biologists. The main challenge is how to efficiently model the continuous conformational space of the protein structures based on the differences and similarities between different Ramachandran plots. Despite the presence of statistical methods for modeling angular data of proteins, there is still a substantial need for more sophisticated and faster statistical tools to model the large-scale circular datasets. To address this need, we have developed a nonparametric method for collective estimation of multiple bivariate density functions for a collection of populations of protein backbone angles. The proposed method takes into account the circular nature of the angular data using trigonometric spline which is more efficient compared to existing methods. This collective density estimation approach is widely applicable when there is a need to estimate multiple density functions from different populations with common features. Moreover, the coefficients of adaptive basis expansion for the fitted densities provide a low-dimensional representation that is useful for visualization, clustering, and classification of the densities. The proposed method provides a novel and unique perspective to two important and challenging problems in protein structure research: structure-based protein classification and angular-sampling-based protein loop structure prediction.

  15. 3D complex: a structural classification of protein complexes.

    Directory of Open Access Journals (Sweden)

    Emmanuel D Levy

    2006-11-01

    Most of the proteins in a cell assemble into complexes to carry out their function. It is therefore crucial to understand the physicochemical properties as well as the evolution of interactions between proteins. The Protein Data Bank represents an important source of information for such studies, because more than half of the structures are homo- or heteromeric protein complexes. Here we propose the first hierarchical classification of whole protein complexes of known 3-D structure, based on representing their fundamental structural features as a graph. This classification provides the first overview of all the complexes in the Protein Data Bank and allows nonredundant sets to be derived at different levels of detail. This reveals that between one-half and two-thirds of known structures are multimeric, depending on the level of redundancy accepted. We also analyse the structures in terms of the topological arrangement of their subunits and find that they form a small number of arrangements compared with all theoretically possible ones. This is because most complexes contain four subunits or less, and the large majority are homomeric. In addition, there is a strong tendency for symmetry in complexes, even for heteromeric complexes. Finally, through comparison of Biological Units in the Protein Data Bank with the Protein Quaternary Structure database, we identified many possible errors in quaternary structure assignments. Our classification, available as a database and Web server at http://www.3Dcomplex.org, will be a starting point for future work aimed at understanding the structure and evolution of protein complexes.

  16. Protein structure determination by exhaustive search of Protein Data Bank derived databases.

    Science.gov (United States)

    Stokes-Rees, Ian; Sliz, Piotr

    2010-12-14

    Parallel sequence and structure alignment tools have become ubiquitous and invaluable at all levels in the study of biological systems. We demonstrate the application and utility of this same parallel search paradigm to the process of protein structure determination, benefitting from the large and growing corpus of known structures. Such searches were previously computationally intractable. Through the method of Wide Search Molecular Replacement, developed here, they can be completed in a few hours with the aid of national-scale federated cyberinfrastructure. By dramatically expanding the range of models considered for structure determination, we show that small (less than 12% structural coverage) and low sequence identity (less than 20% identity) template structures can be identified through multidimensional template scoring metrics and used for structure determination. Many new macromolecular complexes can benefit significantly from such a technique due to the lack of known homologous protein folds or sequences. We demonstrate the effectiveness of the method by determining the structure of a full-length p97 homologue from Trichoplusia ni. Example cases with the MHC/T-cell receptor complex and the EmoB protein provide systematic estimates of minimum sequence identity, structure coverage, and structural similarity required for this method to succeed. We describe how this structure-search approach and other novel computationally intensive workflows are made tractable through integration with the US national computational cyberinfrastructure, allowing, for example, rapid processing of the entire Structural Classification of Proteins protein fragment database.

  17. De novo molecular design

    CERN Document Server

    Schneider, Gisbert

    2013-01-01

    Systematically examining current methods and strategies, this ready reference covers a wide range of molecular structures, from organic-chemical drugs to peptides, proteins and nucleic acids, in line with emerging new drug classes derived from biomacromolecules. A leader in the field and one of the pioneers of this young discipline has assembled here the most prominent experts from across the world to provide first-hand knowledge. While most of their methods and examples come from the area of pharmaceutical discovery and development, the approaches are equally applicable for chemical probes an

  18. Topological properties of complex networks in protein structures

    Science.gov (United States)

    Kim, Kyungsik; Jung, Jae-Won; Min, Seungsik

    2014-03-01

    We study topological properties of networks in structural classification of proteins. We model the native-state protein structure as a network made of its constituent amino-acids and their interactions. We treat four structural classes of proteins composed predominantly of α helices and β sheets and consider several proteins from each of these classes whose sizes range from amino acids of the Protein Data Bank. Particularly, we simulate and analyze the network metrics such as the mean degree, the probability distribution of degree, the clustering coefficient, the characteristic path length, the local efficiency, and the cost. This work was supported by the KMAR and DP under Grant WISE project (153-3100-3133-302-350).
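
    The network metrics named in this abstract can be illustrated with a minimal sketch. It assumes a residue contact network built from Cα coordinates with a distance cutoff; the 7.0 Å default, the function names and the toy coordinates are illustrative assumptions, not details taken from the record.

```python
from collections import deque
from itertools import combinations


def contact_network(coords, cutoff=7.0):
    """Link residues i and j when their distance is <= cutoff (assumed value)."""
    adj = {i: set() for i in range(len(coords))}
    for i, j in combinations(range(len(coords)), 2):
        d2 = sum((a - b) ** 2 for a, b in zip(coords[i], coords[j]))
        if d2 <= cutoff ** 2:
            adj[i].add(j)
            adj[j].add(i)
    return adj


def mean_degree(adj):
    """Average number of contacts per residue."""
    return sum(len(nbrs) for nbrs in adj.values()) / len(adj)


def clustering_coefficient(adj):
    """Mean local clustering; nodes with fewer than two neighbours count as 0."""
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue
        links = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
        total += 2.0 * links / (k * (k - 1))
    return total / len(adj)


def characteristic_path_length(adj):
    """Mean shortest-path length over all connected ordered pairs (BFS)."""
    total, pairs = 0, 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0


# Toy chain of four points spaced 3.8 apart (hypothetical coordinates).
toy = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
net = contact_network(toy)
print(mean_degree(net), clustering_coefficient(net), characteristic_path_length(net))
```

    With the default cutoff the toy chain links only sequence neighbours; raising the cutoff to 8.0 also links residues two positions apart and produces non-zero clustering.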

  19. Relationship between Molecular Structure Characteristics of Feed Proteins and Protein In vitro Digestibility and Solubility.

    Science.gov (United States)

    Bai, Mingmei; Qin, Guixin; Sun, Zewei; Long, Guohui

    2016-08-01

    The nutritional value of feed proteins and their utilization by livestock are related not only to the chemical composition but also to the structure of feed proteins, but few studies thus far have investigated the relationship between the structure of feed proteins and their solubility as well as digestibility in monogastric animals. To address this question we analyzed soybean meal, fish meal, corn distiller's dried grains with solubles, corn gluten meal, and feather meal by Fourier transform infrared (FTIR) spectroscopy to determine the protein molecular spectral band characteristics for amides I and II as well as α-helices and β-sheets and their ratios. Protein solubility and in vitro digestibility were measured with the Kjeldahl method using 0.2% KOH solution and the pepsin-pancreatin two-step enzymatic method, respectively. We found that all measured spectral band intensities (height and area) of feed proteins were correlated with their in vitro digestibility and solubility (p≤0.003); moreover, the relatively quantitative amounts of α-helices, random coils, and α-helix to β-sheet ratio in protein secondary structures were positively correlated with protein in vitro digestibility and solubility (p≤0.004). On the other hand, the percentage of β-sheet structures was negatively correlated with protein in vitro digestibility (pdigestibility at 28 h and solubility. Furthermore, the α-helix-to-β-sheet ratio can be used to predict the nutritional value of feed proteins.

  20. Effects of NMR spectral resolution on protein structure calculation.

    Directory of Open Access Journals (Sweden)

    Suhas Tikole

    Adequate digital resolution and signal sensitivity are two critical factors for protein structure determinations by solution NMR spectroscopy. The prime objective for obtaining high digital resolution is to resolve peak overlap, especially in NOESY spectra with thousands of signals where the signal analysis needs to be performed on a large scale. Achieving maximum digital resolution is usually limited by the practically available measurement time. We developed a method utilizing non-uniform sampling for balancing digital resolution and signal sensitivity, and performed a large-scale analysis of the effect of the digital resolution on the accuracy of the resulting protein structures. Structure calculations were performed as a function of digital resolution for about 400 proteins with molecular sizes ranging between 5 and 33 kDa. The structural accuracy was assessed by atomic coordinate RMSD values from the reference structures of the proteins. In addition, we also monitored the number of assigned NOESY cross peaks, the average signal sensitivity, and the chemical shift spectral overlap. We show that high resolution is equally important for proteins of every molecular size. The chemical shift spectral overlap depends strongly on the corresponding spectral digital resolution. Thus, knowing the extent of overlap can be a predictor of the resulting structural accuracy. Our results show that for every molecular size a minimal digital resolution, corresponding to the natural linewidth, needs to be achieved for obtaining the highest accuracy possible for the given protein size using state-of-the-art automated NOESY assignment and structure calculation methods.
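
    The accuracy measure mentioned in this abstract, the atomic coordinate RMSD from a reference structure, can be sketched as follows. This is only a minimal illustration: it assumes the two coordinate sets list the same atoms in the same order and have already been superimposed (practical assessments usually apply an optimal superposition first), and the function name is hypothetical.

```python
import math


def coordinate_rmsd(model, reference):
    """Root-mean-square deviation between two pre-aligned coordinate lists.

    Assumes matching atom order and no further superposition (an assumption
    of this sketch, not a statement about the study's procedure).
    """
    if not model or len(model) != len(reference):
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (a - b) ** 2
        for p, q in zip(model, reference)
        for a, b in zip(p, q)
    )
    return math.sqrt(squared / len(model))


# Two atoms, each displaced by 1 unit along z relative to the reference.
print(coordinate_rmsd([(0, 0, 0), (1, 0, 0)], [(0, 0, 1), (1, 0, 1)]))
```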

  1. Structural and Function Prediction of Musa acuminata subsp. Malaccensis Protein

    Directory of Open Access Journals (Sweden)

    Anum Munir

    2016-03-01

    Hypothetical proteins (HPs) are proteins whose existence has been predicted but whose in vivo function has not been established. Elucidating the structure and function of these HPs may also lead to a better understanding of protein-protein interactions or networks in diverse forms of life. Bananas (Musa acuminata spp.), including sweet and cooking types, are giant perennial monocotyledonous herbs of the order Zingiberales, a sister group to the well-studied Poales, which include oats. Bananas are crucial for food security in many tropical and subtropical countries and are the most popular fruit in industrialized countries. In the present study, a hypothetical protein of M. acuminata (banana) was chosen for analysis and modeling with various bioinformatics tools and databases. According to primary and secondary structure analysis, XP_009393594.1 is a stable hydrophobic protein containing a significant proportion of α-helices. Homology modeling was done using the SWISS-MODEL server, where the templates had low identity with the XP_009393594.1 protein, which indicated the novelty of the protein. An ab initio strategy was used to produce its 3D structure. Several quality assessment and validation parameters indicated that the generated protein model is stable and of fairly good quality. Functional analysis with ProtFun 2.2 and KEGG (KAAS) suggested that the hypothetical protein is a transcription factor with a zinc finger cytoplasmic domain. The protein was found to be vital for the translation process and involved in metabolism, signaling and cellular processes, genetic information processing and zinc ion binding. It is suggested that further experimental validation would help to predict the structures and functions of other uncharacterized proteins of different plants and living organisms.

  2. Structural basis for target protein recognition by the protein disulfide reductase thioredoxin

    DEFF Research Database (Denmark)

    Maeda, Kenji; Hägglund, Per; Finnie, Christine

    2006-01-01

    Thioredoxin is ubiquitous and regulates various target proteins through disulfide bond reduction. We report the structure of thioredoxin (HvTrxh2 from barley) in a reaction intermediate complex with a protein substrate, barley alpha-amylase/subtilisin inhibitor (BASI). The crystal structure...... of this mixed disulfide shows a conserved hydrophobic motif in thioredoxin interacting with a sequence of residues from BASI through van der Waals contacts and backbone-backbone hydrogen bonds. The observed structural complementarity suggests that the recognition of features around protein disulfides plays...... a major role in the specificity and protein disulfide reductase activity of thioredoxin. This novel insight into the function of thioredoxin constitutes a basis for comprehensive understanding of its biological role. Moreover, comparison with structurally related proteins shows that thioredoxin shares...

  3. Structural studies of human glioma pathogenesis-related protein 1

    Energy Technology Data Exchange (ETDEWEB)

    Asojo, Oluwatoyin A., E-mail: oasojo@unmc.edu [College of Medicine, Nebraska Medical Center, Omaha, NE 68198-6495 (United States); Koski, Raymond A.; Bonafé, Nathalie [L2 Diagnostics LLC, 300 George Street, New Haven, CT 06511 (United States); College of Medicine, Nebraska Medical Center, Omaha, NE 68198-6495 (United States)

    2011-10-01

    Structural analysis of a truncated soluble domain of human glioma pathogenesis-related protein 1, a membrane protein implicated in the proliferation of aggressive brain cancer, is presented. Human glioma pathogenesis-related protein 1 (GLIPR1) is a membrane protein that is highly upregulated in brain cancers but is barely detectable in normal brain tissue. GLIPR1 is composed of a signal peptide that directs its secretion, a conserved cysteine-rich CAP (cysteine-rich secretory proteins, antigen 5 and pathogenesis-related 1 proteins) domain and a transmembrane domain. GLIPR1 is currently being investigated as a candidate for prostate cancer gene therapy and for glioblastoma targeted therapy. Crystal structures of a truncated soluble domain of the human GLIPR1 protein (sGLIPR1) solved by molecular replacement using a truncated polyalanine search model of the CAP domain of stecrisp, a snake-venom cysteine-rich secretory protein (CRISP), are presented. The correct molecular-replacement solution could only be obtained by removing all loops from the search model. The native structure was refined to 1.85 Å resolution and that of a Zn²⁺ complex was refined to 2.2 Å resolution. The latter structure revealed that the putative binding cavity coordinates Zn²⁺ similarly to snake-venom CRISPs, which are involved in Zn²⁺-dependent mechanisms of inflammatory modulation. Both sGLIPR1 structures have extensive flexible loop/turn regions and unique charge distributions that were not observed in any of the previously reported CAP protein structures. A model is also proposed for the structure of full-length membrane-bound GLIPR1.

  4. Structure and function of nanoparticle-protein conjugates

    International Nuclear Information System (INIS)

    Aubin-Tam, M-E; Hamad-Schifferli, K

    2008-01-01

    Conjugation of proteins to nanoparticles has numerous applications in sensing, imaging, delivery, catalysis, therapy and control of protein structure and activity. Therefore, characterizing the nanoparticle-protein interface is of great importance. A variety of covalent and non-covalent linking chemistries have been reported for nanoparticle attachment. Site-specific labeling is desirable in order to control the protein orientation on the nanoparticle, which is crucial in many applications such as fluorescence resonance energy transfer. We evaluate methods for successful site-specific attachment. Typically, a specific protein residue is linked directly to the nanoparticle core or to the ligand. As conjugation often affects the protein structure and function, techniques to probe structure and activity are assessed. We also examine how molecular dynamics simulations of conjugates would complete those experimental techniques in order to provide atomistic details on the effect of nanoparticle attachment. Characterization studies of nanoparticle-protein complexes show that the structure and function are influenced by the chemistry of the nanoparticle ligand, the nanoparticle size, the nanoparticle material, the stoichiometry of the conjugates, the labeling site on the protein and the nature of the linkage (covalent versus non-covalent)

  5. Computing a new family of shape descriptors for protein structures

    DEFF Research Database (Denmark)

    Røgen, Peter; Sinclair, Robert

    2003-01-01

    The large-scale 3D structure of a protein can be represented by the polygonal curve through the carbon α atoms of the protein backbone. We introduce an algorithm for computing the average number of times that a given configuration of crossings on such polygonal curves is seen, the average being...

  6. Production in Pichia pastoris of complementary protein-based polymers with heterodimer-forming WW and PPxY domains

    NARCIS (Netherlands)

    Domeradzka, Natalia E.; Werten, Marc W.T.; Vries, de Renko; Wolf, de Frits A.

    2016-01-01

    Background: Specific coupling of de novo designed recombinant protein polymers for the construction of precisely structured nanomaterials is of interest for applications in biomedicine, pharmaceutics and diagnostics. An attractive coupling strategy is to incorporate specifically interacting

  7. Simulation of Protein Structure, Dynamics and Function in Organic Media

    National Research Council Canada - National Science Library

    Daggett, Valerie

    1998-01-01

    The overall goal of our ONR-sponsored research is to pursue realistic molecular modeling studies pertinent to the related properties of protein stability, dynamics, structure, function, and folding in aqueous solution...

  8. Protein structure estimation from NMR data by matrix completion.

    Science.gov (United States)

    Li, Zhicheng; Li, Yang; Lei, Qiang; Zhao, Qing

    2017-09-01

    Knowledge of protein structures is very important to understand their corresponding physical and chemical properties. Nuclear Magnetic Resonance (NMR) spectroscopy is one of the main methods to measure protein structure. In this paper, we propose a two-stage approach to calculate the structure of a protein from a highly incomplete distance matrix, where most data are obtained from NMR. We first randomly "guess" a small part of unobservable distances by utilizing the triangle inequality, which is crucial for the second stage. Then we use matrix completion to calculate the protein structure from the obtained incomplete distance matrix. We apply the accelerated proximal gradient algorithm to solve the corresponding optimization problem. Furthermore, the recovery error of our method is analyzed, and its efficiency is demonstrated by several practical examples.
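
    The first stage described in this abstract, guessing an unobserved distance with the help of the triangle inequality, can be sketched as below. This is a hedged illustration rather than the authors' procedure: the bounds |d_ik - d_kj| <= d_ij <= d_ik + d_kj follow from the triangle inequality, while the uniform sampling inside those bounds, the dictionary layout and all function names are assumptions made for this example.

```python
import random


def triangle_bounds(known, i, j, n):
    """Bound the distance d(i, j) using every third point k with known d(i, k), d(k, j).

    `known` maps index pairs (a, b) with a < b to observed distances.
    Returns (lower, upper); upper is infinity when no such k exists.
    """
    def dist(a, b):
        return known.get((min(a, b), max(a, b)))

    lower, upper = 0.0, float("inf")
    for k in range(n):
        if k in (i, j):
            continue
        d_ik, d_kj = dist(i, k), dist(k, j)
        if d_ik is None or d_kj is None:
            continue
        lower = max(lower, abs(d_ik - d_kj))  # reverse triangle inequality
        upper = min(upper, d_ik + d_kj)       # triangle inequality
    return lower, upper


def guess_distance(known, i, j, n, rng=random):
    """Draw a guess uniformly inside the bounds (sampling rule is an assumption)."""
    lower, upper = triangle_bounds(known, i, j, n)
    if upper == float("inf"):
        return None  # no common neighbour with known distances, so no guess
    return rng.uniform(lower, upper)


# Hypothetical observed distances between atoms 0-1 and 1-2; 0-2 is missing.
observed = {(0, 1): 3.0, (1, 2): 4.0}
print(triangle_bounds(observed, 0, 2, 3))
print(guess_distance(observed, 0, 2, 3))
```

    Guessed entries of this kind would then be added to the incomplete distance matrix before a matrix completion step, which is not shown here.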

  9. Modeling membrane protein structure through site-directed ESR spectroscopy

    NARCIS (Netherlands)

    Kavalenka, A.A.

    2009-01-01

    Site-directed spin labeling (SDSL) electron spin resonance (ESR) spectroscopy is a relatively new biophysical tool for obtaining structural information about proteins. This thesis presents a novel approach, based on powerful spectral analysis techniques (multicomponent spectral

  10. Potato leafroll virus structural proteins manipulate overlapping, yet distinct protein interaction networks during infection.

    Science.gov (United States)

    DeBlasio, Stacy L; Johnson, Richard; Sweeney, Michelle M; Karasev, Alexander; Gray, Stewart M; MacCoss, Michael J; Cilia, Michelle

    2015-06-01

    Potato leafroll virus (PLRV) produces a readthrough protein (RTP) via translational readthrough of the coat protein amber stop codon. The RTP functions as a structural component of the virion and as a nonincorporated protein in concert with numerous insect and plant proteins to regulate virus movement/transmission and tissue tropism. Affinity purification coupled to quantitative MS was used to generate protein interaction networks for a PLRV mutant that is unable to produce the read through domain (RTD) and compared to the known wild-type PLRV protein interaction network. By quantifying differences in the protein interaction networks, we identified four distinct classes of PLRV-plant interactions: those plant and nonstructural viral proteins interacting with assembled coat protein (category I); plant proteins in complex with both coat protein and RTD (category II); plant proteins in complex with the RTD (category III); and plant proteins that had higher affinity for virions lacking the RTD (category IV). Proteins identified as interacting with the RTD are potential candidates for regulating viral processes that are mediated by the RTP such as phloem retention and systemic movement and can potentially be useful targets for the development of strategies to prevent infection and/or viral transmission of Luteoviridae species that infect important crop species. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  11. Binding free energy analysis of protein-protein docking model structures by evERdock.

    Science.gov (United States)

    Takemura, Kazuhiro; Matubayasi, Nobuyuki; Kitao, Akio

    2018-03-14

    To aid the evaluation of protein-protein complex model structures generated by protein docking prediction (decoys), we previously developed a method to calculate the binding free energies for complexes. The method combines a short (2 ns) all-atom molecular dynamics simulation with explicit solvent and solution theory in the energy representation (ER). We showed that this method successfully selected structures similar to the native complex structure (near-native decoys) as the lowest binding free energy structures. In our current work, we applied this method (evERdock) to 100 or 300 model structures of four protein-protein complexes. The crystal structures and the near-native decoys showed the lowest binding free energy of all the examined structures, indicating that evERdock can successfully evaluate decoys. Several decoys that show low interface root-mean-square distance but relatively high binding free energy were also identified. Analysis of the fraction of native contacts, hydrogen bonds, and salt bridges at the protein-protein interface indicated that these decoys were insufficiently optimized at the interface. After optimizing the interactions around the interface by including interfacial water molecules, the binding free energies of these decoys were improved. We also investigated the effect of solute entropy on binding free energy and found that consideration of the entropy term does not necessarily improve the evaluations of decoys using the normal mode analysis for entropy calculation.

  12. Constraining cyclic peptides to mimic protein structure motifs

    DEFF Research Database (Denmark)

    Hill, Timothy A.; Shepherd, Nicholas E.; Diness, Frederik

    2014-01-01

    peptides can have protein-like biological activities and potencies, enabling their uses as biological probes and leads to therapeutics, diagnostics and vaccines. This Review highlights examples of cyclic peptides that mimic three-dimensional structures of strand, turn or helical segments of peptides...... and proteins, and identifies some additional restraints incorporated into natural product cyclic peptides and synthetic macrocyclic peptidomimetics that refine peptide structure and confer biological properties....

  13. Overcoming bottlenecks in the membrane protein structural biology pipeline.

    Science.gov (United States)

    Hardy, David; Bill, Roslyn M; Jawhari, Anass; Rothnie, Alice J

    2016-06-15

    Membrane proteins account for a third of the eukaryotic proteome, but are greatly under-represented in the Protein Data Bank. Unfortunately, recent technological advances in X-ray crystallography and EM cannot account for the poor solubility and stability of membrane protein samples. A limitation of conventional detergent-based methods is that detergent molecules destabilize membrane proteins, leading to their aggregation. The use of orthologues, mutants and fusion tags has helped improve protein stability, but at the expense of not working with the sequence of interest. Novel detergents such as glucose neopentyl glycol (GNG), maltose neopentyl glycol (MNG) and calixarene-based detergents can improve protein stability without compromising their solubilizing properties. Styrene maleic acid lipid particles (SMALPs) focus on retaining the native lipid bilayer of a membrane protein during purification and biophysical analysis. Overcoming bottlenecks in the membrane protein structural biology pipeline, primarily by maintaining protein stability, will facilitate the elucidation of many more membrane protein structures in the near future. © 2016 The Author(s). Published by Portland Press Limited on behalf of the Biochemical Society.

  14. Illuminating structural proteins in viral "dark matter" with metaproteomics.

    Science.gov (United States)

    Brum, Jennifer R; Ignacio-Espinoza, J Cesar; Kim, Eun-Hae; Trubl, Gareth; Jones, Robert M; Roux, Simon; VerBerkmoes, Nathan C; Rich, Virginia I; Sullivan, Matthew B

    2016-03-01

    Viruses are ecologically important, yet environmental virology is limited by dominance of unannotated genomic sequences representing taxonomic and functional "viral dark matter." Although recent analytical advances are rapidly improving taxonomic annotations, identifying functional dark matter remains problematic. Here, we apply paired metaproteomics and dsDNA-targeted metagenomics to identify 1,875 virion-associated proteins from the ocean. Over one-half of these proteins were newly functionally annotated and represent abundant and widespread viral metagenome-derived protein clusters (PCs). One primarily unannotated PC dominated the dataset, but structural modeling and genomic context identified this PC as a previously unidentified capsid protein from multiple uncultivated tailed virus families. Furthermore, four of the five most abundant PCs in the metaproteome represent capsid proteins containing the HK97-like protein fold previously found in many viruses that infect all three domains of life. The dominance of these proteins within our dataset, as well as their global distribution throughout the world's oceans and seas, supports prior hypotheses that this HK97-like protein fold is the most abundant biological structure on Earth. Together, these culture-independent analyses improve virion-associated protein annotations, facilitate the investigation of proteins within natural viral communities, and offer a high-throughput means of illuminating functional viral dark matter.

  15. Functional diversification of structurally alike NLR proteins in plants.

    Science.gov (United States)

    Chakraborty, Joydeep; Jain, Akansha; Mukherjee, Dibya; Ghosh, Suchismita; Das, Sampa

    2018-04-01

    In due course of evolution many pathogens alter their effector molecules to modulate the host plants' metabolism and immune responses triggered upon proper recognition by the intracellular nucleotide-binding oligomerization domain containing leucine-rich repeat (NLR) proteins. Likewise, host plants have also evolved with diversified NLR proteins as a survival strategy to win the battle against pathogen invasion. NLR protein indeed detects pathogen derived effector proteins leading to the activation of defense responses associated with programmed cell death (PCD). In this interactive process, genome structure and plasticity play pivotal role in the development of innate immunity. Despite being quite conserved with similar biological functions in all eukaryotes, the intracellular NLR immune receptor proteins happen to be structurally distinct. Recent studies have made progress in identifying transcriptional regulatory complexes activated by NLR proteins. In this review, we attempt to decipher the intracellular NLR proteins mediated surveillance across the evolutionarily diverse taxa, highlighting some of the recent updates on NLR protein compartmentalization, molecular interactions before and after activation along with insights into the finer role of these receptor proteins to combat invading pathogens upon their recognition. Latest information on NLR sensors, helpers and NLR proteins with integrated domains in the context of plant pathogen interactions are also discussed. Copyright © 2018 Elsevier B.V. All rights reserved.

  16. Combining neural networks for protein secondary structure prediction

    DEFF Research Database (Denmark)

    Riis, Søren Kamaric

    1995-01-01

    In this paper structured neural networks are applied to the problem of predicting the secondary structure of proteins. A hierarchical approach is used where specialized neural networks are designed for each structural class and then combined using another neural network. The submodels are designed...... by using a priori knowledge of the mapping between protein building blocks and the secondary structure and by using weight sharing. Since none of the individual networks have more than 600 adjustable weights over-fitting is avoided. When ensembles of specialized experts are combined the performance...

  17. A generative, probabilistic model of local protein structure

    DEFF Research Database (Denmark)

    Boomsma, Wouter; Mardia, Kanti V.; Taylor, Charles C.

    2008-01-01

    Despite significant progress in recent years, protein structure prediction maintains its status as one of the prime unsolved problems in computational biology. One of the key remaining challenges is an efficient probabilistic exploration of the structural space that correctly reflects the relative...... conformational stabilities. Here, we present a fully probabilistic, continuous model of local protein structure in atomic detail. The generative model makes efficient conformational sampling possible and provides a framework for the rigorous analysis of local sequence-structure correlations in the native state...

  18. Creation and structure determination of an artificial protein with three complete sequence repeats

    Energy Technology Data Exchange (ETDEWEB)

    Adachi, Motoyasu, E-mail: adachi.motoyasu@jaea.go.jp; Shimizu, Rumi; Kuroki, Ryota [Japan Atomic Energy Agency, Shirakatashirane 2-4, Nakagun Tokaimura, Ibaraki 319-1195 (Japan); Blaber, Michael [Japan Atomic Energy Agency, Shirakatashirane 2-4, Nakagun Tokaimura, Ibaraki 319-1195 (Japan); Florida State University, Tallahassee, FL 32306-4300 (United States)

    2013-11-01

    An artificial protein with three complete sequence repeats was created and its structure was determined by X-ray crystallography. The structure showed threefold symmetry even though the chain has amino and carboxy termini. The artificial protein with threefold symmetry may be useful as a scaffold to capture small materials with C3 symmetry. Symfoil-4P is a de novo protein exhibiting the threefold symmetrical β-trefoil fold, designed based on human acidic fibroblast growth factor. The first three asparagine–glycine sequences of Symfoil-4P were replaced with glutamine–glycine (Symfoil-QG) or serine–glycine (Symfoil-SG) sequences to protect against deamidation, and His-Symfoil-II was prepared by introducing a protease digestion site into Symfoil-QG so that Symfoil-II has three complete repeats after removal of the N-terminal histidine tag. The Symfoil-QG, Symfoil-SG and His-Symfoil-II proteins were expressed in Escherichia coli as soluble proteins and purified by nickel affinity chromatography. Symfoil-II was further purified by anion-exchange chromatography after removing the His tag by proteolysis. Both Symfoil-QG and Symfoil-II were crystallized in 0.1 M Tris-HCl buffer (pH 7.0) containing 1.8 M ammonium sulfate as precipitant at 293 K; several crystal forms were observed for Symfoil-QG and Symfoil-II. The Symfoil-QG and Symfoil-II crystals diffracted to maximum resolutions of 1.5 and 1.1 Å, respectively. Symfoil-II without the histidine tag diffracted better than Symfoil-QG with the N-terminal histidine tag. Although the crystal packing of Symfoil-II is slightly different from that of Symfoil-QG and other crystals of Symfoil derivatives having the N-terminal histidine tag, the refined crystal structure of Symfoil-II showed pseudo-threefold symmetry, as expected from other Symfoils. Since the removal of the unstructured N-terminal histidine tag did not affect the threefold structure of Symfoil, the improvement of diffraction quality of Symfoil-II may be caused by molecular characteristics of

  19. SCOWLP classification: Structural comparison and analysis of protein binding regions

    Directory of Open Access Journals (Sweden)

    Anders Gerd

    2008-01-01

    Background: Detailed information about protein interactions is critical for our understanding of the principles governing protein recognition mechanisms. The structures of many proteins have been experimentally determined in complex with different ligands bound either in the same or different binding regions. Thus, the structural interactome requires the development of tools to classify protein binding regions. A proper classification may provide a general view of the regions that a protein uses to bind others and also facilitate a detailed comparative analysis of the interaction information for specific protein binding regions at the atomic level. Such a classification might be of potential use for deciphering protein interaction networks, understanding protein function, and rational engineering and design. Description: Protein binding regions (PBRs) might ideally be described as well-defined, separate regions that share no interacting residues with one another. However, PBRs are often irregular and discontinuous and can share a wide range of interacting residues among them. The criteria used to define an individual binding region can often be arbitrary and may differ from those used for other binding regions within a protein family. Therefore, the rationale behind protein interface classification should aim to fulfil the requirements of the analysis to be performed. We extract detailed interaction information of protein domains, peptides and interfacial solvent from the SCOWLP database and we classify the PBRs of each domain family. For this purpose, we define a similarity index based on the overlap of interacting residues mapped in pair-wise structural alignments. We perform our classification with agglomerative hierarchical clustering using the complete-linkage method. Our classification is calculated at different similarity cut-offs to allow flexibility in the analysis of PBRs, a feature especially interesting for those protein families with conflictive binding regions
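
    The clustering step described in this record can be illustrated with a minimal sketch. The abstract does not give the exact similarity index, so the Jaccard overlap used here is an assumption, as are the function names and the toy residue sets; the sketch only shows how complete linkage behaves at two different cut-offs.

```python
from itertools import combinations

def overlap_similarity(a, b):
    """Jaccard overlap of two sets of aligned residue positions.
    Assumption: the abstract only says the index is based on overlapping residues."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def complete_linkage(regions, cutoff):
    """Agglomerative clustering: two clusters may merge only if their LEAST
    similar cross-cluster pair is still at or above the cut-off."""
    clusters = [[name] for name in sorted(regions)]

    def linkage(c1, c2):
        return min(overlap_similarity(regions[x], regions[y]) for x in c1 for y in c2)

    while len(clusters) > 1:
        # Pick the pair of clusters with the highest complete-linkage similarity.
        i, j = max(combinations(range(len(clusters)), 2),
                   key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        if linkage(clusters[i], clusters[j]) < cutoff:
            break
        clusters[i] = sorted(clusters[i] + clusters[j])
        del clusters[j]  # j > i, so index i is unaffected
    return sorted(clusters)

# Hypothetical binding regions, each a set of aligned residue positions.
toy_regions = {
    "pbr_a": {10, 11, 12, 13},
    "pbr_b": {11, 12, 13, 14},
    "pbr_c": {12, 13, 14, 15},
    "pbr_d": {40, 41, 42},
}
strict = complete_linkage(toy_regions, cutoff=0.5)
loose = complete_linkage(toy_regions, cutoff=0.3)
```

    With the stricter cut-off, pbr_c stays separate because its overlap with pbr_a is too small; lowering the cut-off merges the three overlapping regions, which is the kind of flexibility the record attributes to using several cut-offs.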

  20. Sequential Release of Proteins from Structured Multishell Microcapsules.

    Science.gov (United States)

    Shimanovich, Ulyana; Michaels, Thomas C T; De Genst, Erwin; Matak-Vinkovic, Dijana; Dobson, Christopher M; Knowles, Tuomas P J

    2017-10-09

    In nature, a wide range of functional materials is based on proteins. Increasing attention is also turning to the use of proteins as artificial biomaterials in the form of films, gels, particles, and fibrils that offer great potential for applications in areas ranging from molecular medicine to materials science. To date, however, most such applications have been limited to single component materials despite the fact that their natural analogues are composed of multiple types of proteins with a variety of functionalities that are coassembled in a highly organized manner on the micrometer scale, a process that is currently challenging to achieve in the laboratory. Here, we demonstrate the fabrication of multicomponent protein microcapsules where the different components are positioned in a controlled manner. We use molecular self-assembly to generate multicomponent structures on the nanometer scale and droplet microfluidics to bring together the different components on the micrometer scale. Using this approach, we synthesize a wide range of multiprotein microcapsules containing three well-characterized proteins: glucagon, insulin, and lysozyme. The localization of each protein component in multishell microcapsules has been detected by labeling protein molecules with different fluorophores, and the final three-dimensional microcapsule structure has been resolved by using confocal microscopy together with image analysis techniques. In addition, we show that these structures can be used to tailor the release of such functional proteins in a sequential manner. Moreover, our observations demonstrate that the protein release mechanism from multishell capsules is driven by the kinetic control of mass transport of the cargo and by the dissolution of the shells. The ability to generate artificial materials that incorporate a variety of different proteins with distinct functionalities increases the breadth of the potential applications of artificial protein-based materials

  1. Structuring oil by protein building blocks

    NARCIS (Netherlands)

    Vries, de Auke

    2017-01-01

    Over recent years, structuring of oil into ‘organogels’ or ‘oleogels’ has gained much attention amongst colloid, materials, and food scientists. Potentially, these oleogels could be used as an alternative for saturated and trans fats in food products. To develop oleogels as a

  2. Mass Spectrometry Coupled Experiments and Protein Structure Modeling Methods

    Directory of Open Access Journals (Sweden)

    Lee Sael

    2013-10-01

    With the accumulation of next generation sequencing data, there is increasing interest in the study of intra-species difference in molecular biology, especially in relation to disease analysis. Furthermore, the dynamics of the protein is being identified as a critical factor in its function. Although the accuracy of protein structure prediction methods is high, provided there are structural templates, most methods are still insensitive to amino-acid differences at critical points that may change the overall structure. Also, predicted structures are inherently static and do not provide information about structural change over time. It is challenging to address the sensitivity and the dynamics by computational structure predictions alone. However, with the fast development of diverse mass spectrometry coupled experiments, low-resolution but fast and sensitive structural information can be obtained. This information can then be integrated into the structure prediction process to further improve the sensitivity and address the dynamics of the protein structures. For this purpose, this article focuses on reviewing two aspects: the types of mass spectrometry coupled experiments and structural data that are obtainable through those experiments; and the structure prediction methods that can utilize these data as constraints. Also, a short review of current efforts in integrating experimental data in the structural modeling is provided.

  3. Chaperonin Structure - The Large Multi-Subunit Protein Complex

    Directory of Open Access Journals (Sweden)

    Irena Roterman

    2009-03-01

    The multi-subunit protein structure representing the chaperonin group is analyzed with respect to its hydrophobicity distribution. The proteins of this group assist protein folding supported by ATP. The axially symmetric GroEL structure (two rings of seven units stacked back to back, 524 aa each) and the GroES (single ring of seven units, 97 aa each) polypeptide chains are analyzed using the hydrophobicity distribution expressed as excess/deficiency all over the molecule to search for structure-to-function relationships. The empirically observed distribution of hydrophobic residues is compared with the theoretical one representing the idealized hydrophobic core with hydrophilic residues exposed on the surface. The observed discrepancy between these two distributions seems to be aim-oriented, determining the structure-to-function relation. The hydrophobic force field structure generated by the chaperonin capsule is presented. Its possible influence on substrate folding is suggested.

  4. NMR structural studies of peptides and proteins in membranes

    Energy Technology Data Exchange (ETDEWEB)

    Opella, S J [Pennsylvania Univ., Philadelphia, PA (United States). Dept. of Chemistry

    1994-12-31

    The use of NMR methodology in structural studies is described as applicable to larger proteins, considering that the majority of membrane proteins are constructed from a limited repertoire of structural and dynamic elements. The membrane-associated domains of these proteins are made up of long hydrophobic membrane-spanning helices, shorter amphipathic bridging helices in the plane of the bilayer, connecting loops with varying degrees of mobility, and mobile N- and C-terminal sections. NMR studies have been successful in identifying all of these elements and their orientations relative to each other and the membrane bilayer. 19 refs., 9 figs.

  5. High throughput platforms for structural genomics of integral membrane proteins.

    Science.gov (United States)

    Mancia, Filippo; Love, James

    2011-08-01

    Structural genomics approaches on integral membrane proteins have been postulated for over a decade, yet specific efforts are lagging years behind their soluble counterparts. Indeed, high throughput methodologies for production and characterization of prokaryotic integral membrane proteins are only now emerging, while large-scale efforts for eukaryotic ones are still in their infancy. Presented here is a review of recent literature on ongoing membrane protein structural genomics initiatives, with a focus on those implementing techniques aimed at increasing the rate of success for this class of macromolecules. Copyright © 2011 Elsevier Ltd. All rights reserved.

  6. Mining protein loops using a structural alphabet and statistical exceptionality

    Directory of Open Access Journals (Sweden)

    Martin Juliette

    2010-02-01

    Background: Protein loops encompass 50% of protein residues in available three-dimensional structures. These regions are often involved in protein functions, e.g. binding site, catalytic pocket... However, the description of protein loops with conventional tools is an uneasy task. Regular secondary structures, helices and strands, have been widely studied whereas loops, because they are highly variable in terms of sequence and structure, are difficult to analyze. Due to data sparsity, long loops have rarely been systematically studied. Results: We developed a simple and accurate method that allows the description and analysis of the structures of short and long loops using structural motifs without restriction on loop length. This method is based on the structural alphabet HMM-SA. HMM-SA allows the simplification of a three-dimensional protein structure into a one-dimensional string of states, where each state is a four-residue prototype fragment, called structural letter. The difficult task of the structural grouping of huge data sets is thus easily accomplished by handling structural letter strings as in conventional protein sequence analysis. We systematically extracted all seven-residue fragments in a bank of 93000 protein loops and grouped them according to the structural-letter sequence, named structural word. This approach permits a systematic analysis of loops of all sizes since we consider the structural motifs of seven residues rather than complete loops. We focused the analysis on highly recurrent words of loops (observed more than 30 times). Our study reveals that 73% of loop-lengths are covered by only 3310 highly recurrent structural words out of 28274 observed words. These structural words have low structural variability (mean RMSd of 0.85 Å). As expected, half of these motifs display a flanking-region preference but interestingly, two thirds are shared by short (less than 12 residues) and long loops. Moreover, half of

  7. Mining protein loops using a structural alphabet and statistical exceptionality.

    Science.gov (United States)

    Regad, Leslie; Martin, Juliette; Nuel, Gregory; Camproux, Anne-Claude

    2010-02-04

    Protein loops encompass 50% of protein residues in available three-dimensional structures. These regions are often involved in protein functions, e.g. binding site, catalytic pocket... However, the description of protein loops with conventional tools is an uneasy task. Regular secondary structures, helices and strands, have been widely studied whereas loops, because they are highly variable in terms of sequence and structure, are difficult to analyze. Due to data sparsity, long loops have rarely been systematically studied. We developed a simple and accurate method that allows the description and analysis of the structures of short and long loops using structural motifs without restriction on loop length. This method is based on the structural alphabet HMM-SA. HMM-SA allows the simplification of a three-dimensional protein structure into a one-dimensional string of states, where each state is a four-residue prototype fragment, called structural letter. The difficult task of the structural grouping of huge data sets is thus easily accomplished by handling structural letter strings as in conventional protein sequence analysis. We systematically extracted all seven-residue fragments in a bank of 93000 protein loops and grouped them according to the structural-letter sequence, named structural word. This approach permits a systematic analysis of loops of all sizes since we consider the structural motifs of seven residues rather than complete loops. We focused the analysis on highly recurrent words of loops (observed more than 30 times). Our study reveals that 73% of loop-lengths are covered by only 3310 highly recurrent structural words out of 28274 observed words. These structural words have low structural variability (mean RMSd of 0.85 Å). As expected, half of these motifs display a flanking-region preference but interestingly, two thirds are shared by short (less than 12 residues) and long loops. Moreover, half of recurrent motifs exhibit a significant level of
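
    The word-extraction step in this record can be sketched as follows. Because each structural letter covers four residues, a seven-residue fragment corresponds to four consecutive overlapping letters. The sketch assumes the loops have already been encoded as strings of structural letters; the encoding itself, the function names, and the toy strings are hypothetical and do not come from HMM-SA.

```python
from collections import Counter

# 7 residues span 7 - 4 + 1 = 4 overlapping four-residue letters.
LETTERS_PER_WORD = 4

def structural_words(encoded_loop, size=LETTERS_PER_WORD):
    """Slide a window over one encoded loop and return every structural word."""
    return [encoded_loop[i:i + size] for i in range(len(encoded_loop) - size + 1)]

def recurrent_words(encoded_loops, min_count):
    """Count words over all loops and keep those seen MORE than min_count times."""
    counts = Counter(word for loop in encoded_loops for word in structural_words(loop))
    return {word: n for word, n in counts.items() if n > min_count}

# Hypothetical encoded loops; real input would be HMM-SA letter strings.
toy_loops = ["abcdab", "xabcdz", "abcd"]
frequent = recurrent_words(toy_loops, min_count=2)
```

    Sliding a fixed-size window is what lets loops of any length contribute to the same word table, so short and long loops can be compared motif by motif. The record's own threshold corresponds to min_count=30.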

  8. Structural protein relationships among eastern equine encephalitis viruses.

    Science.gov (United States)

    Strizki, J M; Repik, P M

    1994-11-01

    We have re-evaluated the relationships among the polypeptides of eastern equine encephalitis (EEE) viruses using SDS-PAGE and peptide mapping of individual virion proteins. Four to five distinct polypeptide bands were detected upon SDS-PAGE analysis of viruses: the E1, E2 and C proteins normally associated with alphavirus virions, as well as an additional more rapidly-migrating E2-associated protein and a high M(r) (HMW) protein. In contrast with previous findings by others, the electrophoretic profiles of the virion proteins of EEE viruses displayed a marked correlation with serotype. The protein profiles of the 33 North American (NA)-serotype viruses examined were remarkably homogeneous, with variation detected only in the E1 protein of two isolates. In contrast, considerable heterogeneity was observed in the migration profiles of both the E1 and E2 glycoproteins of the 13 South American (SA)-type viruses examined. Peptide mapping of individual virion proteins using limited proteolysis with Staphylococcus aureus V8 protease confirmed that, in addition to the homogeneity evident among NA-type viruses and relative heterogeneity among SA-type viruses, the E1 and E2 proteins of NA- and SA-serotype viruses exhibited serotype-specific structural variation. The C protein was highly conserved among isolates of both virus serotypes. Endoglycosidase analyses of intact virions did not reveal substantial glycosylation differences between the glycoproteins of NA- and SA-serotype viruses. Both the HMW protein and the E2 protein (doublet) of EEE virus appeared to contain, at least in part, high-mannose type N-linked oligosaccharides. No evidence of O-linked glycans was found on either the E1 or the E2 glycoprotein. Despite the observed structural differences between proteins of NA- and SA-type viruses, Western blot analyses utilizing polyclonal antibodies indicated that immunoreactive epitopes appeared to be conserved.

  9. An Algebro-Topological Description of Protein Domain Structure

    Science.gov (United States)

    Penner, Robert Clark; Knudsen, Michael; Wiuf, Carsten; Andersen, Jørgen Ellegaard

    2011-01-01

    The space of possible protein structures appears vast and continuous, and the relationship between primary, secondary and tertiary structure levels is complex. Protein structure comparison and classification is therefore a difficult but important task since structure is a determinant for molecular interaction and function. We introduce a novel mathematical abstraction based on geometric topology to describe protein domain structure. Using the locations of the backbone atoms and the hydrogen bonds, we build a combinatorial object, a so-called fatgraph. The description is discrete yet gives rise to a 2-dimensional mathematical surface. Thus, each protein domain corresponds to a particular mathematical surface with characteristic topological invariants, such as the genus (number of holes) and the number of boundary components. Both invariants are global fatgraph features reflecting the interconnectivity of the domain by hydrogen bonds. We introduce the notion of robust variables, that is, variables that are robust towards minor changes in the structure/fatgraph, and show that the genus and the number of boundary components are robust. Further, we investigate the distribution of different fatgraph variables and show how only four variables are capable of distinguishing different folds. We use local (secondary) and global (tertiary) fatgraph features to describe domain structures and illustrate that they are useful for classification of domains in CATH. In addition, we combine our method with two other methods, thereby using primary, secondary, and tertiary structure information, and show that we can identify a large percentage of new and unclassified structures in CATH. PMID:21629687
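
    The two invariants named in this record can be computed for a small fatgraph with a short sketch. It assumes a connected, orientable fatgraph given as a rotation system over half-edges ("darts"), and it does not reproduce how the authors build fatgraphs from backbone atoms and hydrogen bonds; the names sigma, alpha and fatgraph_invariants are illustrative. Boundary components are the cycles of the composed permutation, and the genus g follows from the Euler characteristic V - E + B = 2 - 2g.

```python
def cycles(perm):
    """Count the cycles of a permutation given as a dict dart -> dart."""
    seen, count = set(), 0
    for start in perm:
        if start in seen:
            continue
        count += 1
        dart = start
        while dart not in seen:
            seen.add(dart)
            dart = perm[dart]
    return count

def fatgraph_invariants(sigma, alpha):
    """Return (genus, boundary_components) for a connected orientable fatgraph.
    sigma: next dart in the cyclic order around each vertex.
    alpha: fixed-point-free involution pairing the two darts of each edge."""
    vertices = cycles(sigma)
    edges = cycles(alpha)
    # Walking along an edge and then turning at the vertex traces a boundary.
    phi = {dart: sigma[alpha[dart]] for dart in sigma}
    boundaries = cycles(phi)
    genus = (2 - vertices + edges - boundaries) // 2
    return genus, boundaries

# One vertex with four darts in cyclic order 0, 1, 2, 3.
sigma = {0: 1, 1: 2, 2: 3, 3: 0}
crossed = {0: 2, 2: 0, 1: 3, 3: 1}   # interleaved edges
nested = {0: 1, 1: 0, 2: 3, 3: 2}    # non-interleaved edges
torus_like = fatgraph_invariants(sigma, crossed)
planar = fatgraph_invariants(sigma, nested)
```

    The same vertex and edge counts give different surfaces depending only on how the darts are paired, which illustrates why genus and boundary count are global features of the connectivity rather than of the graph's size.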

  10. Linking structural features of protein complexes and biological function.

    Science.gov (United States)

    Sowmya, Gopichandran; Breen, Edmond J; Ranganathan, Shoba

    2015-09-01

    Protein-protein interaction (PPI) establishes the central basis for complex cellular networks in a biological cell. Association of proteins with other proteins occurs at varying affinities, yet with a high degree of specificity. PPIs lead to diverse functionality such as catalysis, regulation, signaling, immunity, and inhibition, playing a crucial role in functional genomics. The molecular principle of such interactions is often elusive in nature. Therefore, a comprehensive analysis of known protein complexes from the Protein Data Bank (PDB) is essential for the characterization of structural interface features to determine structure-function relationship. Thus, we analyzed a nonredundant dataset of 278 heterodimer protein complexes, categorized into major functional classes, for distinguishing features. Interestingly, our analysis has identified five key features (interface area, interface polar residue abundance, hydrogen bonds, solvation free energy gain from interface formation, and binding energy) that are discriminatory among the functional classes using Kruskal-Wallis rank sum test. Significant correlations between these PPI interface features amongst functional categories are also documented. Salt bridges correlate with interface area in regulator-inhibitors (r = 0.75). These representative features have implications for the prediction of potential function of novel protein complexes. The results provide molecular insights for better understanding of PPIs and their relation to biological functions. © 2015 The Protein Society.

  11. A computer graphics program system for protein structure representation.

    Science.gov (United States)

    Ross, A M; Golub, E E

    1988-01-01

    We have developed a computer graphics program system for the schematic representation of several protein secondary structure analysis algorithms. The programs calculate the probability of occurrence of alpha-helix, beta-sheet and beta-turns by the method of Chou and Fasman and assign unique predicted structure to each residue using a novel conflict resolution algorithm based on maximum likelihood. A detailed structure map containing secondary structure, hydrophobicity, sequence identity, sequence numbering and the location of putative N-linked glycosylation sites is then produced. In addition, helical wheel diagrams and hydrophobic moment calculations can be performed to further analyze the properties of selected regions of the sequence. As they require only structure specification as input, the graphics programs can easily be adapted for use with other secondary structure prediction schemes. The use of these programs to analyze protein structure-function relationships is described and evaluated. PMID:2832829
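
    The hydrophobic moment calculation mentioned in this record can be sketched as below. The abstract does not say which hydrophobicity scale or normalization the programs use, so the Kyte-Doolittle scale, the per-residue averaging, and the function name are assumptions; the 100 degree step per residue is the usual value for an ideal alpha-helix.

```python
import math

# Kyte-Doolittle hydropathy values (assumed scale; the record names none).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def hydrophobic_moment(segment, angle_deg=100.0, scale=KYTE_DOOLITTLE):
    """Mean hydrophobic moment per residue of a segment.
    Each residue contributes a vector of length H_i at angle i * angle_deg,
    as on a helical wheel; the moment is the length of the vector sum."""
    delta = math.radians(angle_deg)
    sin_sum = sum(scale[aa] * math.sin(i * delta) for i, aa in enumerate(segment))
    cos_sum = sum(scale[aa] * math.cos(i * delta) for i, aa in enumerate(segment))
    return math.hypot(sin_sum, cos_sum) / len(segment)

# A uniform segment has no preferred face, so its moment vanishes
# over 18 residues (five full helical turns at 100 degrees per residue).
uniform = hydrophobic_moment("L" * 18)
# Alternating residues at 180 degrees per step (a strand-like periodicity).
alternating = hydrophobic_moment("LKLK", angle_deg=180.0)
```

    A large moment indicates that hydrophobic residues cluster on one face of the periodic structure, which is the property a helical wheel diagram shows graphically.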

  12. Crystal structure of Homo sapiens protein LOC79017

    Energy Technology Data Exchange (ETDEWEB)

    Bae, Euiyoung; Bingman, Craig A.; Aceti, David J.; Phillips, Jr., George N. (UW)

    2010-02-08

    LOC79017 (MW 21.0 kDa, residues 1-188) was annotated as a hypothetical protein encoded by Homo sapiens chromosome 7 open reading frame 24. It was selected as a target by the Center for Eukaryotic Structural Genomics (CESG) because it did not share more than 30% sequence identity with any protein for which the three-dimensional structure is known. The biological function of the protein has not been established yet. Parts of LOC79017 were identified as members of uncharacterized Pfam families (residues 1-95 as PB006073 and residues 104-180 as PB031696). BLAST searches revealed homologues of LOC79017 in many eukaryotes, but none of them have been functionally characterized. Here, we report the crystal structure of H. sapiens protein LOC79017 (UniGene code Hs.530024, UniProt code O75223, CESG target number go.35223).

  13. Deprotonated imidodiphosphate in AMPPNP-containing protein structures

    International Nuclear Information System (INIS)

    Dauter, Miroslawa; Dauter, Zbigniew

    2011-01-01

    In certain AMPPNP-containing protein structures, the nitrogen bridging the two terminal phosphate groups can be deprotonated. Many different proteins utilize the chemical energy provided by the cofactor adenosine triphosphate (ATP) for their proper function. A number of structures in the Protein Data Bank (PDB) contain adenosine 5′-(β,γ-imido)triphosphate (AMPPNP), a nonhydrolysable analog of ATP in which the bridging O atom between the two terminal phosphate groups is substituted by the imido function. Under mild conditions imides do not have acidic properties and thus the imide nitrogen should be protonated. However, an analysis of protein structures containing AMPPNP reveals that the imide group is deprotonated in certain complexes if the negative charges of the phosphate moieties in AMPPNP are in part neutralized by coordinating divalent metals or a guanidinium group of an arginine

  14. EVA: continuous automatic evaluation of protein structure prediction servers.

    Science.gov (United States)

    Eyrich, V A; Martí-Renom, M A; Przybylski, D; Madhusudhan, M S; Fiser, A; Pazos, F; Valencia, A; Sali, A; Rost, B

    2001-12-01

    Evaluation of protein structure prediction methods is difficult and time-consuming. Here, we describe EVA, a web server for assessing protein structure prediction methods, in an automated, continuous and large-scale fashion. Currently, EVA evaluates the performance of a variety of prediction methods available through the internet. Every week, the sequences of the latest experimentally determined protein structures are sent to prediction servers, results are collected, performance is evaluated, and a summary is published on the web. EVA has so far collected data for more than 3000 protein chains. These results may provide valuable insight to both developers and users of prediction methods. http://cubic.bioc.columbia.edu/eva. eva@cubic.bioc.columbia.edu

  15. Blind Test of Physics-Based Prediction of Protein Structures

    Science.gov (United States)

    Shell, M. Scott; Ozkan, S. Banu; Voelz, Vincent; Wu, Guohong Albert; Dill, Ken A.

    2009-01-01

    We report here a multiprotein blind test of a computer method to predict native protein structures based solely on an all-atom physics-based force field. We use the AMBER 96 potential function with an implicit (GB/SA) model of solvation, combined with replica-exchange molecular-dynamics simulations. Coarse conformational sampling is performed using the zipping and assembly method (ZAM), an approach that is designed to mimic the putative physical routes of protein folding. ZAM was applied to the folding of six proteins, from 76 to 112 monomers in length, in CASP7, a community-wide blind test of protein structure prediction. Because these predictions have about the same level of accuracy as typical bioinformatics methods, and do not utilize information from databases of known native structures, this work opens up the possibility of predicting the structures of membrane proteins, synthetic peptides, or other foldable polymers, for which there is little prior knowledge of native structures. This approach may also be useful for predicting physical protein folding routes, non-native conformations, and other physical properties from amino acid sequences. PMID:19186130
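
    The replica-exchange step named in this record relies on a standard Metropolis criterion for swapping configurations between two temperatures, accepting a swap with probability min(1, exp[(β_i - β_j)(E_i - E_j)]) where β = 1/(k_B T). The sketch below shows only that criterion; the energies, temperatures, and function name are hypothetical and are not taken from the study's AMBER 96 / GB/SA simulations.

```python
import math

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol*K)

def swap_probability(energy_i, temp_i, energy_j, temp_j):
    """Metropolis acceptance probability for exchanging the configurations
    of replica i (at temp_i) and replica j (at temp_j).
    Energies in kcal/mol, temperatures in kelvin."""
    beta_i = 1.0 / (K_B * temp_i)
    beta_j = 1.0 / (K_B * temp_j)
    delta = (beta_i - beta_j) * (energy_i - energy_j)
    return 1.0 if delta >= 0 else math.exp(delta)

# Hypothetical values: the colder replica currently holds the higher energy,
# so handing it the lower-energy configuration is always accepted.
always = swap_probability(-100.0, 300.0, -120.0, 350.0)
# The reverse situation is accepted only occasionally.
sometimes = swap_probability(-120.0, 300.0, -100.0, 350.0)
```

    In a full simulation this test would be applied periodically to neighbouring temperatures, with a uniform random number compared against the returned probability.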

  16. Relationship between Molecular Structure Characteristics of Feed Proteins and Protein Digestibility and Solubility

    Directory of Open Access Journals (Sweden)

    Mingmei Bai

    2016-08-01

    The nutritional value of feed proteins and their utilization by livestock are related not only to the chemical composition but also to the structure of feed proteins, yet few studies have investigated the relationship between the structure of feed proteins and their solubility and digestibility in monogastric animals. To address this question, we analyzed soybean meal, fish meal, corn distiller’s dried grains with solubles, corn gluten meal, and feather meal by Fourier transform infrared (FTIR) spectroscopy to determine the protein molecular spectral band characteristics for amides I and II as well as α-helices and β-sheets and their ratios. Protein solubility and in vitro digestibility were measured with the Kjeldahl method using 0.2% KOH solution and the pepsin-pancreatin two-step enzymatic method, respectively. We found that all measured spectral band intensities (height and area) of feed proteins were correlated with their in vitro digestibility and solubility (p≤0.003); moreover, the relative amounts of α-helices and random coils and the α-helix to β-sheet ratio in protein secondary structures were positively correlated with protein in vitro digestibility and solubility (p≤0.004). On the other hand, the percentage of β-sheet structures was negatively correlated with protein in vitro digestibility (p<0.001) and solubility (p = 0.002). These results demonstrate that the molecular structure characteristics of feed proteins are closely related to their in vitro digestibility at 28 h and solubility. Furthermore, the α-helix-to-β-sheet ratio can be used to predict the nutritional value of feed proteins.

  17. Characterization of structural proteins of hirame rhabdovirus, HRV

    Science.gov (United States)

    Nishizawa, Toyohiko; Yoshimizu, Mamoru; Winton, James; Ahne, Winfried; Kimura, Takahisa

    1991-01-01

    Structural proteins of hirame rhabdovirus (HRV) were analyzed by SDS-polyacrylamide gel electrophoresis, western blotting, 2-dimensional gel electrophoresis, and Triton X-100 treatment. Purified HRV virions were composed of: polymerase (L), glycoprotein (G), nucleoprotein (N), and 2 matrix proteins (M1 and M2). Based upon their relative mobilities, the estimated molecular weights of the proteins were: L, 156 KDa; G, 68 KDa; N, 46.4 KDa; M1, 26.4 KDa; and M2, 19.9 KDa. The electrophoretic pattern formed by the structural proteins of HRV was clearly different from that formed by pike fry rhabdovirus, spring viremia of carp virus, eel virus of America, and eel virus European X which belong to the Vesiculovirus genus; however, it resembled the pattern formed by structural proteins of viral hemorrhagic septicemia virus (VHSV) and infectious hematopoietic necrosis virus (IHNV) which are members of the Lyssavirus genus. Among HRV, IHNV, and VHSV, differences were observed in the relative mobilities of the G, N, M1, and M2 proteins. Western blot analysis revealed that the G, N, and M2 proteins of HRV shared antigenic determinants with IHNV and VHSV, but not with any of the 4 fish vesiculoviruses tested. Cross-reactions between the M1 proteins of HRV, IHNV, or VHSV were not detected in this assay. Two-dimensional gel electrophoresis was used to show that HRV differed from IHNV or VHSV in the isoelectric point (pI) of the M1 and M2 proteins. In this system, 2 forms of the M1 protein of HRV and IHNV were observed. These subspecies of M1 had the same relative mobility but different pI values. Treatment of purified virions with 2% Triton X-100 in Tris buffer containing NaCl removed the G, M1, and M2 proteins of IHNV, but HRV virions were more stable under these conditions.

  18. Cold-set globular protein gels: Interactions, structure and rheology as a function of protein concentration.

    NARCIS (Netherlands)

    Alting, A.C.; Hamer, R.J.; Kruif, de C.G.

    2003-01-01

    We identified the contribution of covalent and noncovalent interactions to the scaling behavior of the structural and rheological properties in a cold gelling protein system. The system we studied consisted of two types of whey protein aggregates, equal in size but different in the amount of

  19. Identification of structural domains in proteins by a graph heuristic

    NARCIS (Netherlands)

    Wernisch, Lorenz; Hunting, M.M.G.; Wodak, Shoshana J.

    1999-01-01

    A novel automatic procedure for identifying domains from protein atomic coordinates is presented. The procedure, termed STRUDL (STRUctural Domain Limits), does not take into account information on secondary structures and handles any number of domains made up of contiguous or non-contiguous chain

  20. Connecting Protein Structure to Intermolecular Interactions: A Computer Modeling Laboratory

    Science.gov (United States)

    Abualia, Mohammed; Schroeder, Lianne; Garcia, Megan; Daubenmire, Patrick L.; Wink, Donald J.; Clark, Ginevra A.

    2016-01-01

    An understanding of protein folding relies on a solid foundation of a number of critical chemical concepts, such as molecular structure, intra-/intermolecular interactions, and relating structure to function. Recent reports show that students struggle on all levels to achieve these understandings and use them in meaningful ways. Further, several…

  1. Formatt: Correcting protein multiple structural alignments by incorporating sequence alignment

    Directory of Open Access Journals (Sweden)

    Daniels Noah M

    2012-10-01

    Background: The quality of multiple protein structure alignments is usually computed and assessed based on geometric functions of the coordinates of the backbone atoms from the protein chains. These purely geometric methods do not directly utilize protein sequence similarity, and in fact, determining the proper way to incorporate sequence similarity measures into the construction and assessment of protein multiple structure alignments has proved surprisingly difficult. Results: We present Formatt, a multiple structure alignment program based on the purely geometric Matt multiple structure alignment program, that also takes into account sequence similarity when constructing alignments. We show that Formatt outperforms Matt and other popular structure alignment programs on the popular HOMSTRAD benchmark. For the SABMark twilight zone benchmark set that captures more remote homology, Formatt and Matt outperform other programs; depending on choice of embedded sequence aligner, Formatt produces either better sequence and structural alignments with a smaller core size than Matt, or similarly sized alignments with better sequence similarity, for a small cost in average RMSD. Conclusions: Considering sequence information as well as purely geometric information seems to improve quality of multiple structure alignments, though defining what constitutes the best alignment when sequence and structural measures would suggest different alignments remains a difficult open question.

  2. The Protein Model Portal--a comprehensive resource for protein structure and model information.

    Science.gov (United States)

    Haas, Juergen; Roth, Steven; Arnold, Konstantin; Kiefer, Florian; Schmidt, Tobias; Bordoli, Lorenza; Schwede, Torsten

    2013-01-01

    The Protein Model Portal (PMP) has been developed to foster effective use of 3D molecular models in biomedical research by providing convenient and comprehensive access to structural information for proteins. Both experimental structures and theoretical models for a given protein can be searched simultaneously and analyzed for structural variability. By providing a comprehensive view on structural information, PMP offers the opportunity to apply consistent assessment and validation criteria to the complete set of structural models available for proteins. PMP is an open project so that new methods developed by the community can contribute to PMP, for example, new modeling servers for creating homology models and model quality estimation servers for model validation. The accuracy of participating modeling servers is continuously evaluated by the Continuous Automated Model EvaluatiOn (CAMEO) project. The PMP offers a unique interface to visualize structural coverage of a protein combining both theoretical models and experimental structures, allowing straightforward assessment of the model quality and hence their utility. The portal is updated regularly and actively developed to include latest methods in the field of computational structural biology. Database URL: http://www.proteinmodelportal.org.

  3. The Protein Model Portal—a comprehensive resource for protein structure and model information

    Science.gov (United States)

    Haas, Juergen; Roth, Steven; Arnold, Konstantin; Kiefer, Florian; Schmidt, Tobias; Bordoli, Lorenza; Schwede, Torsten

    2013-01-01

    The Protein Model Portal (PMP) has been developed to foster effective use of 3D molecular models in biomedical research by providing convenient and comprehensive access to structural information for proteins. Both experimental structures and theoretical models for a given protein can be searched simultaneously and analyzed for structural variability. By providing a comprehensive view on structural information, PMP offers the opportunity to apply consistent assessment and validation criteria to the complete set of structural models available for proteins. PMP is an open project so that new methods developed by the community can contribute to PMP, for example, new modeling servers for creating homology models and model quality estimation servers for model validation. The accuracy of participating modeling servers is continuously evaluated by the Continuous Automated Model EvaluatiOn (CAMEO) project. The PMP offers a unique interface to visualize structural coverage of a protein combining both theoretical models and experimental structures, allowing straightforward assessment of the model quality and hence their utility. The portal is updated regularly and actively developed to include latest methods in the field of computational structural biology. Database URL: http://www.proteinmodelportal.org. PMID: 23624946

  4. Protein Structural Change Data - PSCDB | LSDB Archive [Life Science Database Archive metadata]

    Lifescience Database Archive (English)

  5. Structure and Dynamic Properties of Membrane Proteins using NMR

    DEFF Research Database (Denmark)

    Rösner, Heike; Kragelund, Birthe

    2012-01-01

    conformational changes. Their structural and functional decoding is challenging and has imposed demanding experimental development. Solution nuclear magnetic resonance (NMR) spectroscopy is one of the techniques providing the capacity to make a significant difference in the deciphering of the membrane protein...... structure-function paradigm. The method has evolved dramatically during the last decade resulting in a plethora of new experiments leading to a significant increase in the scientific repertoire for studying membrane proteins. Besides solving the three-dimensional structures using state-of-the-art approaches......-populated states, this review seeks to introduce the vast possibilities solution NMR can offer to the study of membrane protein structure-function analyses with special focus on applicability. © 2012 American Physiological Society. Compr Physiol 2:1491-1539, 2012....

  6. Perspective: Structural fluctuation of protein and Anfinsen's thermodynamic hypothesis

    Science.gov (United States)

    Hirata, Fumio; Sugita, Masatake; Yoshida, Masasuke; Akasaka, Kazuyuki

    2018-01-01

    The thermodynamic hypothesis, casually referred to as "Anfinsen's dogma," is described theoretically in terms of a concept of the structural fluctuation of protein or the first moment (average structure) and the second moment (variance and covariance) of the structural distribution. The new theoretical concept views the unfolding and refolding processes of protein as a shift of the structural distribution induced by a thermodynamic perturbation, with the variance-covariance matrix varying. Based on the theoretical concept, a method to characterize the mechanism of folding (or unfolding) is proposed. The transition state, if any, between two stable states is interpreted as a gap in the distribution, which is created due to an extensive reorganization of hydrogen bonds among backbone atoms of protein and with water molecules in the course of conformational change. Further perspectives on applying the theory to computer-aided drug design and to materials science are briefly discussed.
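
    The first and second moments named in this abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that each conformation is a flat list of coordinates already expressed in a common frame; the function names and the toy ensemble are hypothetical and are not taken from the paper.

```python
from typing import List

Structure = List[float]  # flattened coordinates (x1, y1, z1, x2, ...), assumed pre-aligned


def mean_structure(ensemble: List[Structure]) -> Structure:
    """First moment: coordinate-wise average over all conformations."""
    n = len(ensemble)
    dim = len(ensemble[0])
    return [sum(s[i] for s in ensemble) / n for i in range(dim)]


def covariance_matrix(ensemble: List[Structure]) -> List[List[float]]:
    """Second moment: variance-covariance matrix of the fluctuations
    about the mean structure (population normalisation, 1/N)."""
    n = len(ensemble)
    mu = mean_structure(ensemble)
    dim = len(mu)
    # Deviation of every conformation from the average structure.
    dev = [[s[i] - mu[i] for i in range(dim)] for s in ensemble]
    return [[sum(d[i] * d[j] for d in dev) / n for j in range(dim)]
            for i in range(dim)]


# Hypothetical toy ensemble: three conformations described by two coordinates.
toy_ensemble = [[0.0, 1.0], [2.0, 1.0], [4.0, 4.0]]
mu = mean_structure(toy_ensemble)
cov = covariance_matrix(toy_ensemble)
```

    In this picture, a thermodynamic perturbation would show up as a change in both `mu` and `cov` between two ensembles; how the paper itself evaluates these quantities is not stated in the abstract.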

  7. Crystal structure of secretory protein Hcp3 from Pseudomonas aeruginosa.

    Science.gov (United States)

    Osipiuk, Jerzy; Xu, Xiaohui; Cui, Hong; Savchenko, Alexei; Edwards, Aled; Joachimiak, Andrzej

    2011-03-01

    The Type VI secretion pathway transports proteins across the cell envelope of Gram-negative bacteria. Pseudomonas aeruginosa, an opportunistic Gram-negative bacterial pathogen infecting humans, uses the type VI secretion pathway to export specific effector proteins crucial for its pathogenesis. The HSI-I virulence locus encodes several proteins that have been proposed to participate in protein transport, including the Hcp1 protein, which forms hexameric rings that assemble into nanotubes in vitro. Two Hcp1 paralogues have been identified in the P. aeruginosa genome, Hcp2 and Hcp3. Here, we present the structure of the Hcp3 protein from P. aeruginosa. The overall structure of the monomer resembles Hcp1 despite the lack of amino-acid sequence similarity between the two proteins. The monomers assemble into hexamers similar to Hcp1. However, instead of forming nanotubes in head-to-tail mode like Hcp1, Hcp3 stacks its rings in head-to-head mode forming double-ring structures.

  8. Structural Elements Regulating AAA+ Protein Quality Control Machines.

    Science.gov (United States)

    Chang, Chiung-Wen; Lee, Sukyeong; Tsai, Francis T F

    2017-01-01

    Members of the ATPases Associated with various cellular Activities (AAA+) superfamily participate in essential and diverse cellular pathways in all kingdoms of life by harnessing the energy of ATP binding and hydrolysis to drive their biological functions. Although most AAA+ proteins share a ring-shaped architecture, AAA+ proteins have evolved distinct structural elements that are fine-tuned to their specific functions. A central question in the field is how ATP binding and hydrolysis are coupled to substrate translocation through the central channel of ring-forming AAA+ proteins. In this mini-review, we will discuss structural elements present in AAA+ proteins involved in protein quality control, drawing similarities to their known role in substrate interaction by AAA+ proteins involved in DNA translocation. Elements to be discussed include the pore loop-1, the Inter-Subunit Signaling (ISS) motif, and the Pre-Sensor I insert (PS-I) motif. Lastly, we will summarize our current understanding of the inter-relationship of those structural elements and propose a model of how ATP binding and hydrolysis might be coupled to polypeptide translocation in protein quality control machines.

  9. Models of protein-ligand crystal structures: trust, but verify.

    Science.gov (United States)

    Deller, Marc C; Rupp, Bernhard

    2015-09-01

    X-ray crystallography provides the most accurate models of protein-ligand structures. These models serve as the foundation of many computational methods including structure prediction, molecular modelling, and structure-based drug design. The success of these computational methods ultimately depends on the quality of the underlying protein-ligand models. X-ray crystallography offers the unparalleled advantage of a clear mathematical formalism relating the experimental data to the protein-ligand model. In the case of X-ray crystallography, the primary experimental evidence is the electron density of the molecules forming the crystal. The first step in the generation of an accurate and precise crystallographic model is the interpretation of the electron density of the crystal, typically carried out by construction of an atomic model. The atomic model must then be validated for fit to the experimental electron density and also for agreement with prior expectations of stereochemistry. Stringent validation of protein-ligand models has become possible as a result of the mandatory deposition of primary diffraction data, and many computational tools are now available to aid in the validation process. Validation of protein-ligand complexes has revealed some instances of overenthusiastic interpretation of ligand density. Fundamental concepts and metrics of protein-ligand quality validation are discussed and we highlight software tools to assist in this process. It is essential that end users select high quality protein-ligand models for their computational and biological studies, and we provide an overview of how this can be achieved.

  10. Rotational order–disorder structure of fluorescent protein FP480

    International Nuclear Information System (INIS)

    Pletnev, Sergei; Morozova, Kateryna S.; Verkhusha, Vladislav V.; Dauter, Zbigniew

    2009-01-01

    An analysis of the rotational order–disorder structure of fluorescent protein FP480 is presented. In the last decade, advances in instrumentation and software development have made crystallography a powerful tool in structural biology. Using this method, structural information can now be acquired from pathological crystals that would have been abandoned in earlier times. In this paper, the order–disorder (OD) structure of fluorescent protein FP480 is discussed. The structure is composed of tetramers with 222 symmetry incorporated into the lattice in two different ways, namely rotated 90° with respect to each other around the crystal c axis, with tetramer axes coincident with crystallographic twofold axes. The random distribution of alternatively oriented tetramers in the crystal creates a rotational OD structure with statistically averaged I422 symmetry, although the presence of very weak and diffuse additional reflections suggests that the randomness is only approximate.

  11. Tertiary alphabet for the observable protein structural universe.

    Science.gov (United States)

    Mackenzie, Craig O; Zhou, Jianfu; Grigoryan, Gevorg

    2016-11-22

    Here, we systematically decompose the known protein structural universe into its basic elements, which we dub tertiary structural motifs (TERMs). A TERM is a compact backbone fragment that captures the secondary, tertiary, and quaternary environments around a given residue, comprising one or more disjoint segments (three on average). We seek the set of universal TERMs that capture all structure in the Protein Data Bank (PDB), finding remarkable degeneracy. Only ∼600 TERMs are sufficient to describe 50% of the PDB at sub-Angstrom resolution. However, more rare geometries also exist, and the overall structural coverage grows logarithmically with the number of TERMs. We go on to show that universal TERMs provide an effective mapping between sequence and structure. We demonstrate that TERM-based statistics alone are sufficient to recapitulate close-to-native sequences given either NMR or X-ray backbones. Furthermore, sequence variability predicted from TERM data agrees closely with evolutionary variation. Finally, locations of TERMs in protein chains can be predicted from sequence alone based on sequence signatures emergent from TERM instances in the PDB. For multisegment motifs, this method identifies spatially adjacent fragments that are not contiguous in sequence, a major bottleneck in structure prediction. Although all TERMs recur in diverse proteins, some appear specialized for certain functions, such as interface formation, metal coordination, or even water binding. Structural biology has benefited greatly from previously observed degeneracies in structure. The decomposition of the known structural universe into a finite set of compact TERMs offers exciting opportunities toward better understanding, design, and prediction of protein structure.

  12. Fragger: a protein fragment picker for structural queries.

    Science.gov (United States)

    Berenger, Francois; Simoncini, David; Voet, Arnout; Shrestha, Rojan; Zhang, Kam Y J

    2017-01-01

    Protein modeling and design activities often require querying the Protein Data Bank (PDB) with a structural fragment, possibly containing gaps. For some applications, it is preferable to work on a specific subset of the PDB or with unpublished structures. These requirements, along with specific user needs, motivated the creation of a new software to manage and query 3D protein fragments. Fragger is a protein fragment picker that allows protein fragment databases to be created and queried. All fragment lengths are supported and any set of PDB files can be used to create a database. Fragger can efficiently search a fragment database with a query fragment and a distance threshold. Matching fragments are ranked by distance to the query. The query fragment can have structural gaps and the allowed amino acid sequences matching a query can be constrained via a regular expression of one-letter amino acid codes. Fragger also incorporates a tool to compute the backbone RMSD of one versus many fragments in high throughput. Fragger should be useful for protein design, loop grafting and related structural bioinformatics tasks.
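
    The query behaviour described in this abstract (a distance threshold, ranking by distance to the query, and a regular expression over one-letter amino acid codes) can be sketched as follows. This is not Fragger's code or interface: the function names, the in-memory database layout and the toy fragments are hypothetical, the distance is a plain coordinate RMSD that assumes all fragments are already in a common frame, and structural gaps are not handled.

```python
import math
import re
from typing import List, Sequence, Tuple

Coord = Tuple[float, float, float]
# Hypothetical record layout: (fragment name, one-letter sequence, backbone coordinates).
Fragment = Tuple[str, str, List[Coord]]


def rmsd(a: Sequence[Coord], b: Sequence[Coord]) -> float:
    """Coordinate RMSD of two equal-length fragments.
    Assumption: no optimal superposition is performed."""
    if len(a) != len(b):
        raise ValueError("fragments must have the same length")
    sq = sum((p - q) ** 2 for u, v in zip(a, b) for p, q in zip(u, v))
    return math.sqrt(sq / len(a))


def query(database: List[Fragment], target: Sequence[Coord],
          threshold: float, seq_pattern: str = ".*") -> List[Tuple[float, str]]:
    """Return (distance, name) pairs within the threshold, closest first."""
    pattern = re.compile(seq_pattern)
    hits = []
    for name, seq, coords in database:
        # Skip fragments of the wrong length or with a disallowed sequence.
        if len(coords) != len(target) or not pattern.fullmatch(seq):
            continue
        d = rmsd(coords, target)
        if d <= threshold:
            hits.append((d, name))
    return sorted(hits)


# Hypothetical three-residue toy fragments.
target = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
db = [
    ("fragA", "GAV", [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]),
    ("fragB", "GLV", [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 1.0)]),
    ("fragC", "PAV", [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.5)]),
    ("fragD", "GAV", [(0.0, 0.0, 0.0), (1.0, 3.0, 0.0), (2.0, 0.0, 0.0)]),
]
constrained = query(db, target, threshold=1.0, seq_pattern="G[AL]V")
unconstrained = query(db, target, threshold=1.0)
```

    Here `constrained` keeps only fragments whose sequence matches `G[AL]V`, while `unconstrained` ranks every fragment that falls within the distance threshold.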

  13. DNA nanotubes for NMR structure determination of membrane proteins.

    Science.gov (United States)

    Bellot, Gaëtan; McClintock, Mark A; Chou, James J; Shih, William M

    2013-04-01

    Finding a way to determine the structures of integral membrane proteins using solution nuclear magnetic resonance (NMR) spectroscopy has proved to be challenging. A residual-dipolar-coupling-based refinement approach can be used to resolve the structure of membrane proteins up to 40 kDa in size, but this requires a detergent-resistant weak-alignment medium, and it has thus far been difficult to obtain such a medium suitable for weak alignment of membrane proteins. We describe here a protocol for robust, large-scale synthesis of detergent-resistant DNA nanotubes that can be assembled into dilute liquid crystals for application as weak-alignment media in solution NMR structure determination of membrane proteins in detergent micelles. The DNA nanotubes are heterodimers of 400-nm-long six-helix bundles, each self-assembled from an M13-based p7308 scaffold strand and >170 short oligonucleotide staple strands. Compatibility with proteins bearing considerable positive charge as well as modulation of molecular alignment, toward collection of linearly independent restraints, can be introduced by reducing the negative charge of DNA nanotubes using counter ions and small DNA-binding molecules. This detergent-resistant liquid-crystal medium offers a number of properties conducive for membrane protein alignment, including high-yield production, thermal stability, buffer compatibility and structural programmability. Production of sufficient nanotubes for four or five NMR experiments can be completed in 1 week by a single individual.

  14. The structure of pyogenecin immunity protein, a novel bacteriocin-like immunity protein from Streptococcus pyogenes.

    Energy Technology Data Exchange (ETDEWEB)

    Chang, C.; Coggill, P.; Bateman, A.; Finn, R.; Cymborowski, M.; Otwinowski, Z.; Minor, W.; Volkart, L.; Joachimiak, A.; Wellcome Trust Sanger Inst.; Univ. of Virginia; UT Southwestern Medical Center

    2009-12-17

    Many Gram-positive lactic acid bacteria (LAB) produce anti-bacterial peptides and small proteins called bacteriocins, which enable them to compete against other bacteria in the environment. These peptides fall structurally into three different classes, I, II, III, with class IIa being pediocin-like single entities and class IIb being two-peptide bacteriocins. Self-protective cognate immunity proteins are usually co-transcribed with these toxins. Several examples of cognates for IIa have already been solved structurally. Streptococcus pyogenes, closely related to LAB, is one of the most common human pathogens, so knowledge of how it competes against other LAB species is likely to prove invaluable. We have solved the crystal structure of the gene-product of locus Spy-2152 from S. pyogenes, (PDB: 2fu2), and found it to comprise an anti-parallel four-helix bundle that is structurally similar to other bacteriocin immunity proteins. Sequence analyses indicate this protein to be a possible immunity protein protective against class IIa or IIb bacteriocins. However, given that S. pyogenes appears to lack any IIa pediocin-like proteins but does possess class IIb bacteriocins, we suggest this protein confers immunity to IIb-like peptides. Combined structural, genomic and proteomic analyses have allowed the identification and in silico characterization of a new putative immunity protein from S. pyogenes, possibly the first structure of an immunity protein protective against potential class IIb two-peptide bacteriocins. We have named the two pairs of putative bacteriocins found in S. pyogenes pyogenecin 1, 2, 3 and 4.

  15. Constraint Logic Programming approach to protein structure prediction

    Directory of Open Access Journals (Sweden)

    Fogolari Federico

    2004-11-01

    Background: The protein structure prediction problem is one of the most challenging problems in biological sciences. Many approaches have been proposed using database information and/or simplified protein models. The protein structure prediction problem can be cast in the form of an optimization problem. Notwithstanding its importance, the problem has very seldom been tackled by Constraint Logic Programming, a declarative programming paradigm suitable for solving combinatorial optimization problems. Results: Constraint Logic Programming techniques have been applied to the protein structure prediction problem on the face-centered cube lattice model. Molecular dynamics techniques, endowed with the notion of constraint, have been also exploited. Even using a very simplified model, Constraint Logic Programming on the face-centered cube lattice model allowed us to obtain acceptable results for a few small proteins. As a test implementation their (known) secondary structure and the presence of disulfide bridges are used as constraints. Simplified structures obtained in this way have been converted to all atom models with plausible structure. Results have been compared with a similar approach using a well-established technique as molecular dynamics. Conclusions: The results obtained on small proteins show that Constraint Logic Programming techniques can be employed for studying protein simplified models, which can be converted into realistic all atom models. The advantage of Constraint Logic Programming over other, much more explored, methodologies, resides in the rapid software prototyping, in the easy way of encoding heuristics, and in exploiting all the advances made in this research area, e.g. in constraint propagation and its use for pruning the huge search space.

  16. Constraint Logic Programming approach to protein structure prediction.

    Science.gov (United States)

    Dal Palù, Alessandro; Dovier, Agostino; Fogolari, Federico

    2004-11-30

    The protein structure prediction problem is one of the most challenging problems in biological sciences. Many approaches have been proposed using database information and/or simplified protein models. The protein structure prediction problem can be cast in the form of an optimization problem. Notwithstanding its importance, the problem has very seldom been tackled by Constraint Logic Programming, a declarative programming paradigm suitable for solving combinatorial optimization problems. Constraint Logic Programming techniques have been applied to the protein structure prediction problem on the face-centered cube lattice model. Molecular dynamics techniques, endowed with the notion of constraint, have been also exploited. Even using a very simplified model, Constraint Logic Programming on the face-centered cube lattice model allowed us to obtain acceptable results for a few small proteins. As a test implementation their (known) secondary structure and the presence of disulfide bridges are used as constraints. Simplified structures obtained in this way have been converted to all atom models with plausible structure. Results have been compared with a similar approach using a well-established technique as molecular dynamics. The results obtained on small proteins show that Constraint Logic Programming techniques can be employed for studying protein simplified models, which can be converted into realistic all atom models. The advantage of Constraint Logic Programming over other, much more explored, methodologies, resides in the rapid software prototyping, in the easy way of encoding heuristics, and in exploiting all the advances made in this research area, e.g. in constraint propagation and its use for pruning the huge search space.
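
    The idea of casting structure prediction on the face-centered cubic lattice as a constrained optimization problem can be made concrete with a small sketch. It uses plain backtracking search in Python rather than Constraint Logic Programming, and it assumes a simple hydrophobic/polar (HP) contact energy that the abstract does not specify; the names `FCC_STEPS`, `energy` and `fold` are hypothetical. Exhaustive search of this kind is only feasible for very short chains.

```python
from itertools import product
from typing import List, Tuple

Point = Tuple[int, int, int]

# The 12 nearest-neighbour vectors of the face-centered cubic lattice:
# two coordinates equal to +1 or -1 and the third equal to 0.
FCC_STEPS: List[Point] = [v for v in product((-1, 0, 1), repeat=3)
                          if sum(abs(c) for c in v) == 2]
FCC_SET = set(FCC_STEPS)


def are_neighbours(p: Point, q: Point) -> bool:
    return (p[0] - q[0], p[1] - q[1], p[2] - q[2]) in FCC_SET


def energy(seq: str, chain: List[Point]) -> int:
    """Assumed HP energy: -1 per pair of non-consecutive 'H' residues in contact."""
    e = 0
    for i in range(len(chain)):
        for j in range(i + 2, len(chain)):
            if seq[i] == "H" and seq[j] == "H" and are_neighbours(chain[i], chain[j]):
                e -= 1
    return e


def fold(seq: str) -> Tuple[int, List[Point]]:
    """Exhaustively search self-avoiding walks and return the lowest energy found."""
    if not seq:
        raise ValueError("sequence must not be empty")
    best: Tuple[int, List[Point]] = (1, [])  # energies are <= 0, so 1 is a sentinel

    def extend(chain: List[Point]) -> None:
        nonlocal best
        if len(chain) == len(seq):
            e = energy(seq, chain)
            if e < best[0]:
                best = (e, list(chain))
            return
        last = chain[-1]
        for dx, dy, dz in FCC_STEPS:
            nxt = (last[0] + dx, last[1] + dy, last[2] + dz)
            if nxt in chain:  # self-avoidance constraint prunes this branch
                continue
            chain.append(nxt)
            extend(chain)
            chain.pop()

    extend([(0, 0, 0)])
    return best


best_energy, best_chain = fold("HHHH")
```

    For four hydrophobic residues the optimum places them on a lattice tetrahedron, so all three non-consecutive pairs are in contact. A constraint solver would add propagation, and further constraints such as secondary structure or disulfide bridges, on top of this kind of search.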

  17. The mTORC1-4E-BP-eIF4E axis controls de novo Bcl6 protein synthesis in T cells and systemic autoimmunity.

    Science.gov (United States)

    Yi, Woelsung; Gupta, Sanjay; Ricker, Edd; Manni, Michela; Jessberger, Rolf; Chinenov, Yurii; Molina, Henrik; Pernis, Alessandra B

    2017-08-15

    Post-transcriptional modifications can control protein abundance, but the extent to which these alterations contribute to the expression of T helper (TH) lineage-defining factors is unknown. Tight regulation of Bcl6 expression, an essential transcription factor for T follicular helper (TFH) cells, is critical as aberrant TFH cell expansion is associated with autoimmune diseases, such as systemic lupus erythematosus (SLE). Here we show that lack of the SLE risk variant Def6 results in deregulation of Bcl6 protein synthesis in T cells as a result of enhanced activation of the mTORC1-4E-BP-eIF4E axis, secondary to aberrant assembly of a raptor-p62-TRAF6 complex. Proteomic analysis reveals that this pathway selectively controls the abundance of a subset of proteins. Rapamycin or raptor deletion ameliorates the aberrant TFH cell expansion in mice lacking Def6. Thus deregulation of mTORC1-dependent pathways controlling protein synthesis can result in T-cell dysfunction, indicating a mechanism by which mTORC1 can promote autoimmunity. Excessive expansion of the T follicular helper (TFH) cell pool is associated with autoimmune disease and Def6 has been identified as an SLE risk variant. Here the authors show that Def6 limits proliferation of TFH cells in mice via alteration of mTORC1 signaling and inhibition of Bcl6 expression.

  18. Structural Basis for Target Protein Recognition by Thioredoxin

    DEFF Research Database (Denmark)

    Maeda, Kenji

    2007-01-01

    Ser) and a mutant of an in vitro substrate alpha-amylase/subtilisin inhibitor (BASI) (Cys144Ser), as a reaction intermediate-mimic of Trx-catalyzed disulfide reduction. The resultant structure showed a sequence of BASI residues along a conserved hydrophobic groove constituted of three loop segments...... of Trx-fold proteins glutaredoxin and glutathione transferase. This study suggests that the features of main chain conformation as well as charge property around disulfide bonds in protein substrates are important factors for interaction with Trx. Moreover, this study describes a detailed structural......Thioredoxin (Trx) is an ubiquitous protein disulfide reductase that possesses two redox active cysteines in the conserved active site sequence motif, Trp-CysN-Gly/Pro-Pro-CysC situated in the so called Trx-fold. The lack of insight into the protein substrate recognition mechanism of Trx has to date...

  19. Fundamental Characteristics of AAA+ Protein Family Structure and Function.

    Science.gov (United States)

    Miller, Justin M; Enemark, Eric J

    2016-01-01

    Many complex cellular events depend on multiprotein complexes known as molecular machines to efficiently couple the energy derived from adenosine triphosphate hydrolysis to the generation of mechanical force. Members of the AAA+ ATPase superfamily (ATPases Associated with various cellular Activities) are critical components of many molecular machines. AAA+ proteins are defined by conserved modules that precisely position the active site elements of two adjacent subunits to catalyze ATP hydrolysis. In many cases, AAA+ proteins form a ring structure that translocates a polymeric substrate through the central channel using specialized loops that project into the central channel. We discuss the major features of AAA+ protein structure and function with an emphasis on pivotal aspects elucidated with archaeal proteins.

  20. A resource for benchmarking the usefulness of protein structure models.

    KAUST Repository

    Carbajo, Daniel

    2012-08-02

    BACKGROUND: Increasingly, biologists and biochemists use computational tools to design experiments to probe the function of proteins and/or to engineer them for a variety of different purposes. The most effective strategies rely on the knowledge of the three-dimensional structure of the protein of interest. However it is often the case that an experimental structure is not available and that models of different quality are used instead. On the other hand, the relationship between the quality of a model and its appropriate use is not easy to derive in general, and so far it has been analyzed in detail only for specific application. RESULTS: This paper describes a database and related software tools that allow testing of a given structure based method on models of a protein representing different levels of accuracy. The comparison of the results of a computational experiment on the experimental structure and on a set of its decoy models will allow developers and users to assess which is the specific threshold of accuracy required to perform the task effectively. CONCLUSIONS: The ModelDB server automatically builds decoy models of different accuracy for a given protein of known structure and provides a set of useful tools for their analysis. Pre-computed data for a non-redundant set of deposited protein structures are available for analysis and download in the ModelDB database. IMPLEMENTATION, AVAILABILITY AND REQUIREMENTS: Project name: A resource for benchmarking the usefulness of protein structure models. Project home page: http://bl210.caspur.it/MODEL-DB/MODEL-DB_web/MODindex.php. Operating system(s): Platform independent. Programming language: Perl-BioPerl (program); mySQL, Perl DBI and DBD modules (database); php, JavaScript, Jmol scripting (web server). Other requirements: Java Runtime Environment v1.4 or later, Perl, BioPerl, CPAN modules, HHsearch, Modeller, LGA, NCBI Blast package, DSSP, Speedfill (Surfnet) and PSAIA. License: Free. Any restrictions to use by

  1. A resource for benchmarking the usefulness of protein structure models.

    Science.gov (United States)

    Carbajo, Daniel; Tramontano, Anna

    2012-08-02

    Increasingly, biologists and biochemists use computational tools to design experiments to probe the function of proteins and/or to engineer them for a variety of different purposes. The most effective strategies rely on the knowledge of the three-dimensional structure of the protein of interest. However it is often the case that an experimental structure is not available and that models of different quality are used instead. On the other hand, the relationship between the quality of a model and its appropriate use is not easy to derive in general, and so far it has been analyzed in detail only for specific application. This paper describes a database and related software tools that allow testing of a given structure based method on models of a protein representing different levels of accuracy. The comparison of the results of a computational experiment on the experimental structure and on a set of its decoy models will allow developers and users to assess which is the specific threshold of accuracy required to perform the task effectively. The ModelDB server automatically builds decoy models of different accuracy for a given protein of known structure and provides a set of useful tools for their analysis. Pre-computed data for a non-redundant set of deposited protein structures are available for analysis and download in the ModelDB database. IMPLEMENTATION, AVAILABILITY AND REQUIREMENTS: Project name: A resource for benchmarking the usefulness of protein structure models. Project home page: http://bl210.caspur.it/MODEL-DB/MODEL-DB_web/MODindex.php. Operating system(s): Platform independent. Programming language: Perl-BioPerl (program); mySQL, Perl DBI and DBD modules (database); php, JavaScript, Jmol scripting (web server). Other requirements: Java Runtime Environment v1.4 or later, Perl, BioPerl, CPAN modules, HHsearch, Modeller, LGA, NCBI Blast package, DSSP, Speedfill (Surfnet) and PSAIA. License: Free. Any restrictions to use by non-academics: No.

  2. A resource for benchmarking the usefulness of protein structure models.

    KAUST Repository

    Carbajo, Daniel; Tramontano, Anna

    2012-01-01

    BACKGROUND: Increasingly, biologists and biochemists use computational tools to design experiments to probe the function of proteins and/or to engineer them for a variety of different purposes. The most effective strategies rely on the knowledge of the three-dimensional structure of the protein of interest. However, it is often the case that an experimental structure is not available and that models of different quality are used instead. On the other hand, the relationship between the quality of a model and its appropriate use is not easy to derive in general, and so far it has been analyzed in detail only for specific applications. RESULTS: This paper describes a database and related software tools that allow testing of a given structure-based method on models of a protein representing different levels of accuracy. The comparison of the results of a computational experiment on the experimental structure and on a set of its decoy models will allow developers and users to assess the specific threshold of accuracy required to perform the task effectively. CONCLUSIONS: The ModelDB server automatically builds decoy models of different accuracy for a given protein of known structure and provides a set of useful tools for their analysis. Pre-computed data for a non-redundant set of deposited protein structures are available for analysis and download in the ModelDB database. IMPLEMENTATION, AVAILABILITY AND REQUIREMENTS: Project name: A resource for benchmarking the usefulness of protein structure models. Project home page: http://bl210.caspur.it/MODEL-DB/MODEL-DB_web/MODindex.php. Operating system(s): Platform independent. Programming language: Perl-BioPerl (program); mySQL, Perl DBI and DBD modules (database); php, JavaScript, Jmol scripting (web server). Other requirements: Java Runtime Environment v1.4 or later, Perl, BioPerl, CPAN modules, HHsearch, Modeller, LGA, NCBI Blast package, DSSP, Speedfill (Surfnet) and PSAIA. License: Free. Any restrictions to use by

  3. Lipid nanotechnologies for structural studies of membrane-associated proteins.

    Science.gov (United States)

    Stoilova-McPhie, Svetla; Grushin, Kirill; Dalm, Daniela; Miller, Jaimy

    2014-11-01

    We present a methodology based on lipid nanotube (LNT) and nanodisk technologies optimized in our laboratory for structural studies of membrane-associated proteins at close to physiological conditions. The application of these lipid nanotechnologies for structure determination by cryo-electron microscopy (cryo-EM) is fundamental for understanding and modulating their function. The LNTs in our studies are single bilayer galactosylceramide based nanotubes of ∼20 nm inner diameter and a few microns in length that self-assemble in aqueous solutions. The lipid nanodisks (NDs) are self-assembled discoid lipid bilayers of ∼10 nm diameter, which are stabilized in aqueous solutions by a belt of amphipathic helical scaffold proteins. By combining LNT and ND technologies, we can examine structurally how the membrane curvature and lipid composition modulate the function of the membrane-associated proteins. As proof of principle, we have engineered these lipid nanotechnologies to mimic the activated platelet's phosphatidylserine-rich membrane and have successfully assembled functional membrane-bound coagulation factor VIII in vitro for structure determination by cryo-EM. The macromolecular organization of the proteins bound to ND and LNT is further defined by fitting the known atomic structures within the calculated three-dimensional maps. The combination of LNT and ND technologies offers a means to control the design and assembly of a wide range of functional membrane-associated proteins and complexes for structural studies by cryo-EM. The presented results confirm the suitability of the developed methodology for studying the functional structure of membrane-associated proteins, such as the coagulation factors, in a close to physiological environment. © 2014 Wiley Periodicals, Inc.

  4. Distance matrix-based approach to protein structure prediction.

    Science.gov (United States)

    Kloczkowski, Andrzej; Jernigan, Robert L; Wu, Zhijun; Song, Guang; Yang, Lei; Kolinski, Andrzej; Pokarowski, Piotr

    2009-03-01

    Much structural information is encoded in the internal distances; a distance matrix-based approach can be used to predict protein structure and dynamics, and for structural refinement. Our approach is based on the square distance matrix $D = [r_{ij}^2]$ containing all square distances between residues in proteins. This distance matrix contains more information than the contact matrix $C$, that has elements of either 0 or 1 depending on whether the distance $r_{ij}$ is greater or less than a cutoff value $r_{\mathrm{cutoff}}$. We have performed spectral decomposition of the distance matrices $D = \sum_k \lambda_k v_k v_k^T$, in terms of eigenvalues $\lambda_k$ and the corresponding eigenvectors $v_k$ and found that it contains at most five nonzero terms. A dominant eigenvector is proportional to $r^2$--the square distance of points from the center of mass, with the next three being the principal components of the system of points. By predicting $r^2$ from the sequence we can approximate a distance matrix of a protein with an expected RMSD value of about 7.3 Å, and by combining it with the prediction of the first principal component we can improve this approximation to 4.0 Å. We can also explain the role of hydrophobic interactions for the protein structure, because r is highly correlated with the hydrophobic profile of the sequence. Moreover, r is highly correlated with several sequence profiles which are useful in protein structure prediction, such as contact number, the residue-wise contact order (RWCO) or mean square fluctuations (i.e. crystallographic temperature factors). We have also shown that the next three components are related to spatial directionality of the secondary structure elements, and they may be also predicted from the sequence, improving overall structure prediction. We have also shown that the large number of available HIV-1 protease structures provides a remarkable sampling of conformations, which can be viewed as direct structural information about the
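
    The statement that the square distance matrix has at most five nonzero spectral terms can be checked numerically, since for a symmetric matrix the number of nonzero eigenvalues equals its rank. The following is a minimal sketch, not the authors' code: the point coordinates are random integers standing in for residue positions, and the function names are illustrative assumptions.

```python
from fractions import Fraction
import random


def squared_distance_matrix(points):
    """Return D with D[i][j] = squared Euclidean distance between points i and j."""
    return [[sum((a - b) ** 2 for a, b in zip(p, q)) for q in points] for p in points]


def matrix_rank(matrix):
    """Exact rank via Gaussian elimination over the rationals (no rounding error)."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    rank = 0
    for col in range(len(rows[0])):
        # Look for a pivot in this column among the rows not yet used.
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for r in range(rank + 1, len(rows)):
            factor = rows[r][col] / rows[rank][col]
            if factor != 0:
                rows[r] = [x - factor * y for x, y in zip(rows[r], rows[rank])]
        rank += 1
    return rank


random.seed(0)
# Hypothetical integer coordinates standing in for residue positions in 3-D.
points = [tuple(random.randint(-20, 20) for _ in range(3)) for _ in range(12)]
D = squared_distance_matrix(points)
rank = matrix_rank(D)
print("rank of the 12 x 12 squared distance matrix:", rank)
```

    The bound follows from writing each entry as $r_{ij}^2 = |x_i|^2 + |x_j|^2 - 2\,x_i \cdot x_j$: the first two terms each contribute a rank-one matrix and the dot-product term contributes rank at most three for points in three dimensions.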

  5. De Novo Glutamine Synthesis

    Science.gov (United States)

    He, Qiao; Shi, Xinchong; Zhang, Linqi; Yi, Chang; Zhang, Xuezhen

    2016-01-01

    Purpose: The aim of this study was to investigate the role of de novo glutamine (Gln) synthesis in the proliferation of C6 glioma cells and its detection with 13N-ammonia. Methods: Chronic Gln-deprived C6 glioma (0.06C6) cells were established. The proliferation rates of C6 and 0.06C6 cells were measured under the conditions of Gln deprivation with or without the addition of ammonia or glutamine synthetase (GS) inhibitor. 13N-ammonia uptake was assessed in C6 cells by gamma counting and in rats with C6 and 0.06C6 xenografts by micro–positron emission tomography (PET) scanning. The expression of GS in C6 cells and xenografts was assessed by Western blotting and immunohistochemistry, respectively. Results: The Gln-deprived C6 cells showed decreased proliferation ability but had a significant increase in GS expression. Furthermore, we found that a low concentration of ammonia was sufficient to maintain the proliferation of Gln-deprived C6 cells, and 13N-ammonia uptake in C6 cells showed a Gln-dependent decrease, whereas inhibition of GS markedly reduced the proliferation of C6 cells as well as the uptake of 13N-ammonia. Additionally, microPET/computed tomography showed that subcutaneous 0.06C6 xenografts had higher 13N-ammonia uptake and GS expression in contrast to C6 xenografts. Conclusion: De novo Gln synthesis through ammonia–glutamate reaction plays an important role in the proliferation of C6 cells. 13N-ammonia can be a potential metabolic PET tracer for Gln-dependent tumors. PMID:27118759

  6. Accurate protein structure modeling using sparse NMR data and homologous structure information.

    Science.gov (United States)

    Thompson, James M; Sgourakis, Nikolaos G; Liu, Gaohua; Rossi, Paolo; Tang, Yuefeng; Mills, Jeffrey L; Szyperski, Thomas; Montelione, Gaetano T; Baker, David

    2012-06-19

    While information from homologous structures plays a central role in X-ray structure determination by molecular replacement, such information is rarely used in NMR structure determination because it can be incorrect, both locally and globally, when evolutionary relationships are inferred incorrectly or there has been considerable evolutionary structural divergence. Here we describe a method that allows robust modeling of protein structures of up to 225 residues by combining ¹H(N), ¹³C, and ¹⁵N backbone and ¹³Cβ chemical shift data, distance restraints derived from homologous structures, and a physically realistic all-atom energy function. Accurate models are distinguished from inaccurate models generated using incorrect sequence alignments by requiring that (i) the all-atom energies of models generated using the restraints are lower than models generated in unrestrained calculations and (ii) the low-energy structures converge to within 2.0 Å backbone rmsd over 75% of the protein. Benchmark calculations on known structures and blind targets show that the method can accurately model protein structures, even with very remote homology information, to a backbone rmsd of 1.2-1.9 Å relative to the conventionally determined NMR ensembles and of 0.9-1.6 Å relative to X-ray structures for well-defined regions of the protein structures. This approach facilitates the accurate modeling of protein structures using backbone chemical shift data without need for side-chain resonance assignments and extensive analysis of NOESY cross-peak assignments.
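
    The convergence criterion in (ii) rests on a backbone rmsd between models. As a rough illustration only, the sketch below computes the rmsd between two coordinate sets that are assumed to be already superimposed and compares it with the 2.0 Å threshold quoted above. The optimal superposition step, the selection of the 75% converged region, the coordinates, and the function names are all assumptions or omissions of this sketch, not part of the published protocol.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two pre-superimposed coordinate lists."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


def is_converged(coords_a, coords_b, cutoff=2.0):
    """True if the two models agree to within `cutoff` angstroms rmsd."""
    return rmsd(coords_a, coords_b) <= cutoff


# Hypothetical backbone atom positions (angstroms); model_b is model_a shifted by 1 A in x.
model_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
model_b = [(1.0, 0.0, 0.0), (4.8, 0.0, 0.0), (8.6, 0.0, 0.0)]
value = rmsd(model_a, model_b)
print(f"rmsd = {value:.2f} A, converged: {is_converged(model_a, model_b)}")
```

    In practice the two structures would first be optimally superimposed (for example with the Kabsch algorithm) before the deviation is measured.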

  7. Phylogenetic and structural analysis of centromeric DNA and kinetochore proteins

    OpenAIRE

    Meraldi, Patrick; McAinsh, Andrew D; Rheinbay, Esther; Sorger, Peter K

    2006-01-01

    Background: Kinetochores are large multi-protein structures that assemble on centromeric DNA (CEN DNA) and mediate the binding of chromosomes to microtubules. Comprising 125 base-pairs of CEN DNA and 70 or more protein components, Saccharomyces cerevisiae kinetochores are among the best understood. In contrast, most fungal, plant and animal cells assemble kinetochores on CENs that are longer and more complex, raising the question of whether kinetochore architecture has been conserved through ...

  8. Predicting protein structures with a multiplayer online game

    OpenAIRE

    Cooper, Seth; Khatib, Firas; Treuille, Adrien; Barbero, Janos; Lee, Jeehyung; Beenen, Michael; Leaver-Fay, Andrew; Baker, David; Popović, Zoran

    2010-01-01

    People exert significant amounts of problem-solving effort playing computer games. Simple image- and text-recognition tasks have been successfully crowd-sourced through games [i, ii, iii], but it is not clear if more complex scientific problems can be similarly solved with human-directed computing. Protein structure prediction is one such problem: locating the biologically relevant native conformation of a protein is a formidable computational challenge given the very large size of the search sp...

  9. Structures and Interactions of Proteins in the Brain

    DEFF Research Database (Denmark)

    Nielsen, Lau Dalby

    The protein low density lipoprotein receptor related protein 1 (LRP1) plays multiple roles in the biology of amyloid β peptide (Aβ) and Alzheimer’s disease. LRP1 is very important for clearance of Aβ both in the brain and by facilitating Aβ export over the blood brain barrier. In spite of the app... ...coding for Arc protein has been domesticated from the same branch of genes that has given rise to retroviruses. We show that, despite the large evolutionary distance between Arc and retroviruses, Arc still self-assembles into higher-order structures that resemble...

  10. Structure and assembly of a paramyxovirus matrix protein.

    Science.gov (United States)

    Battisti, Anthony J; Meng, Geng; Winkler, Dennis C; McGinnes, Lori W; Plevka, Pavel; Steven, Alasdair C; Morrison, Trudy G; Rossmann, Michael G

    2012-08-28

    Many pleomorphic, lipid-enveloped viruses encode matrix proteins that direct their assembly and budding, but the mechanism of this process is unclear. We have combined X-ray crystallography and cryoelectron tomography to show that the matrix protein of Newcastle disease virus, a paramyxovirus and relative of measles virus, forms dimers that assemble into pseudotetrameric arrays that generate the membrane curvature necessary for virus budding. We show that the glycoproteins are anchored in the gaps between the matrix proteins and that the helical nucleocapsids are associated in register with the matrix arrays. About 90% of virions lack matrix arrays, suggesting that, in agreement with previous biological observations, the matrix protein needs to dissociate from the viral membrane during maturation, as is required for fusion and release of the nucleocapsid into the host's cytoplasm. Structure and sequence conservation imply that other paramyxovirus matrix proteins function similarly.

  11. Structure and Modification of Electrode Materials for Protein Electrochemistry.

    Science.gov (United States)

    Jeuken, Lars J C

    The interactions between proteins and electrode surfaces are of fundamental importance in bioelectrochemistry, including photobioelectrochemistry. In order to optimise the interaction between electrode and redox protein, either the electrode or the protein can be engineered, with the former being the more widely adopted approach. This tutorial review provides a basic description of the most commonly used electrode materials in bioelectrochemistry and discusses approaches to modify these surfaces. Carbon, gold and transparent electrodes (e.g. indium tin oxide) are covered, while approaches to form meso- and macroporous structured electrodes are also described. Electrode modifications include the chemical modification with (self-assembled) monolayers and the use of conducting polymers in which the protein is embedded. The proteins themselves can either be in solution, electrostatically adsorbed on the surface or covalently bound to the electrode. Drawbacks and benefits of each material and its modifications are discussed. Where examples exist of applications in photobioelectrochemistry, these are highlighted.

  12. (PS)2: protein structure prediction server version 3.0.

    Science.gov (United States)

    Huang, Tsun-Tsao; Hwang, Jenn-Kang; Chen, Chu-Huang; Chu, Chih-Sheng; Lee, Chi-Wen; Chen, Chih-Chieh

    2015-07-01

    Protein complexes are involved in many biological processes. Examining coupling between subunits of a complex would be useful to understand the molecular basis of protein function. Here, our updated (PS)2 web server predicts the three-dimensional structures of protein complexes based on comparative modeling; furthermore, this server examines the coupling between subunits of the predicted complex by combining structural and evolutionary considerations. The predicted complex structure could be indicated and visualized by Java-based 3D graphics viewers and the structural and evolutionary profiles are shown and compared chain-by-chain. For each subunit, considerations with or without the packing contribution of other subunits cause the differences in similarities between structural and evolutionary profiles, and these differences imply which form, complex or monomeric, is preferred in the biological condition for the subunit. We believe that the (PS)2 server would be a useful tool for biologists who are interested not only in the structures of protein complexes but also in the coupling between subunits of the complexes. The (PS)2 server is freely available at http://ps2v3.life.nctu.edu.tw/. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

  13. Structural Conservation of the Myoviridae Phage Tail Sheath Protein Fold

    Energy Technology Data Exchange (ETDEWEB)

    Aksyuk, Anastasia A.; Kurochkina, Lidia P.; Fokine, Andrei; Forouhar, Farhad; Mesyanzhinov, Vadim V.; Tong, Liang; Rossmann, Michael G. (SOIBC); (Purdue); (Columbia)

    2012-02-21

    Bacteriophage phiKZ is a giant phage that infects Pseudomonas aeruginosa, a human pathogen. The phiKZ virion consists of a 1450 Å diameter icosahedral head and a 2000 Å-long contractile tail. The structure of the whole virus was previously reported, showing that its tail organization in the extended state is similar to the well-studied Myovirus bacteriophage T4 tail. The crystal structure of a tail sheath protein fragment of phiKZ was determined to 2.4 Å resolution. Furthermore, crystal structures of two prophage tail sheath proteins were determined to 1.9 and 3.3 Å resolution. Despite low sequence identity between these proteins, all of these structures have a similar fold. The crystal structure of the phiKZ tail sheath protein has been fitted into cryo-electron-microscopy reconstructions of the extended tail sheath and of a polysheath. The structural rearrangement of the phiKZ tail sheath contraction was found to be similar to that of phage T4.

  14. Structural History of Human SRGAP2 Proteins.

    Science.gov (United States)

    Sporny, Michael; Guez-Haddad, Julia; Kreusch, Annett; Shakartzi, Sivan; Neznansky, Avi; Cross, Alice; Isupov, Michail N; Qualmann, Britta; Kessels, Michael M; Opatowsky, Yarden

    2017-06-01

    In the development of the human brain, human-specific genes are considered to play key roles, conferring its unique advantages and vulnerabilities. At the time of Homo lineage divergence from Australopithecus, SRGAP2C gradually emerged through a process of serial duplications and mutagenesis from ancestral SRGAP2A (3.4-2.4 Ma). Remarkably, ectopic expression of SRGAP2C endows cultured mouse brain cells with human-like characteristics, specifically, increased dendritic spine length and density. To understand the molecular mechanisms underlying this change in neuronal morphology, we determined the structure of SRGAP2A and studied the interplay between SRGAP2A and SRGAP2C. We found that: 1) SRGAP2A homo-dimerizes through a large interface that includes an F-BAR domain, a newly identified F-BAR extension (Fx), and RhoGAP-SH3 domains. 2) SRGAP2A has an unusual inverse geometry, enabling associations with lamellipodia and dendritic spine heads in vivo, and scaffolding of membrane protrusions in cell culture. 3) As a result of the initial partial duplication event (∼3.4 Ma), SRGAP2C carries a defective Fx-domain that severely compromises its solubility and membrane-scaffolding ability. Consistently, SRGAP2A:SRGAP2C hetero-dimers form, but are insoluble, inhibiting SRGAP2A activity. 4) Inactivation of SRGAP2A is sensitive to the level of hetero-dimerization with SRGAP2C. 5) The primal form of SRGAP2C (P-SRGAP2C, existing between ∼3.4 and 2.4 Ma) is less effective in hetero-dimerizing with SRGAP2A than the modern SRGAP2C, which carries several substitutions (from ∼2.4 Ma). Thus, the genetic mutagenesis phase contributed to modulation of SRGAP2A's inhibition of neuronal expansion, by introducing and improving the formation of inactive SRGAP2A:SRGAP2C hetero-dimers, indicating a stepwise involvement of SRGAP2C in human evolutionary history. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  15. Structural basis of protein oxidation resistance: a lysozyme study.

    Directory of Open Access Journals (Sweden)

    Marion Girod

    Accumulation of oxidative damage in proteins correlates with aging since it can cause irreversible and progressive degeneration of almost all cellular functions. Apparently, native protein structures have evolved intrinsic resistance to oxidation since perfectly folded proteins are, by and large, the most robust. Here we explore the structural basis of protein resistance to radiation-induced oxidation using chicken egg white lysozyme in the native and misfolded form. We study the differential resistance to oxidative damage of six different parts of native and misfolded lysozyme by a targeted tandem/mass spectrometry approach of its tryptic fragments. The decay of the amount of each lysozyme fragment with increasing radiation dose is found to be a two-step process, characterized by a double exponential evolution of their amounts: the first one can be largely attributed to oxidation of specific amino acids, while the second one corresponds to further degradation of the protein. By correlating these results to the structural parameters computed from molecular dynamics (MD) simulations, we find the protein parts with increased root-mean-square deviation (RMSD) to be more susceptible to modifications. In addition, involvement of amino acid side-chains in hydrogen bonds has a protective effect against oxidation. Increased exposure to solvent of individual amino acid side chains correlates with high susceptibility to oxidative and other modifications like side chain fragmentation. Generally, while none of the structural parameters alone can account for the fate of peptides during radiation, together they provide an insight into the relationship between protein structure and susceptibility to oxidation.
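
    The "double exponential evolution" of fragment amounts with dose can be illustrated with a generic two-term decay model, amount(dose) = a_fast * exp(-k_fast * dose) + a_slow * exp(-k_slow * dose). This is a sketch under stated assumptions: the functional form is the standard double exponential, but the parameter names and values below are hypothetical and are not taken from the study.

```python
import math


def double_exponential(dose, a_fast, k_fast, a_slow, k_slow):
    """Remaining fragment amount after a given dose: sum of a fast and a slow decay."""
    return a_fast * math.exp(-k_fast * dose) + a_slow * math.exp(-k_slow * dose)


# Hypothetical parameters chosen only for illustration (not fitted to any data).
params = dict(a_fast=0.6, k_fast=0.5, a_slow=0.4, k_slow=0.05)
doses = [0, 1, 2, 5, 10, 20]
amounts = [double_exponential(d, **params) for d in doses]

for d, a in zip(doses, amounts):
    print(f"dose {d:>2}: relative amount {a:.3f}")
```

    With these values the fast term dominates the initial drop and the slow term governs the tail, which is the qualitative two-step behaviour the abstract describes.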

  16. Contingency Table Browser - prediction of early stage protein structure.

    Science.gov (United States)

    Kalinowska, Barbara; Krzykalski, Artur; Roterman, Irena

    2015-01-01

    The Early Stage (ES) intermediate represents the starting structure in protein folding simulations based on the Fuzzy Oil Drop (FOD) model. The accuracy of FOD predictions is greatly dependent on the accuracy of the chosen intermediate. A suitable intermediate can be constructed using the sequence-structure relationship information contained in the so-called contingency table - this table expresses the likelihood of encountering various structural motifs for each tetrapeptide fragment in the amino acid sequence. The limited accuracy with which such structures could previously be predicted provided the motivation for a more in-depth study of the contingency table itself. The Contingency Table Browser is a tool which can visualize, search and analyze the table. Our work presents possible applications of the Contingency Table Browser, among them analysis of specific protein sequences from the point of view of their structural ambiguity.

  17. Electron transfer reactions in structural units of copper proteins

    International Nuclear Information System (INIS)

    Faraggi, M.

    1975-01-01

    In previous pulse radiolysis studies it was suggested that the reduction of the Cu(II) ions in copper proteins by the hydrated electron is a multi-step electron migration process. The technique has been extended to investigate the reduction of some structural units of these proteins. These studies include: the reaction of the hydrated electron with peptides, the reaction of the disulphide bridge with formate radical ion and radicals produced by the reduction of peptides, and the reaction of Cu(II)-peptide complex with $e_{aq}^{-}$ and $\mathrm{CO_2^{-}}$. Using these results the reduction mechanism of copper and other proteins will be discussed. (author)

  18. Three-dimensional protein structure prediction: Methods and computational strategies.

    Science.gov (United States)

    Dorn, Márcio; E Silva, Mariel Barbachan; Buriol, Luciana S; Lamb, Luis C

    2014-10-12

    A long-standing problem in structural bioinformatics is to determine the three-dimensional (3-D) structure of a protein when only a sequence of amino acid residues is given. Many computational methodologies and algorithms have been proposed as a solution to the 3-D Protein Structure Prediction (3-D-PSP) problem. These methods can be divided into four main classes: (a) first principle methods without database information; (b) first principle methods with database information; (c) fold recognition and threading methods; and (d) comparative modeling methods and sequence alignment strategies. Deterministic computational techniques, optimization techniques, data mining and machine learning approaches are typically used in the construction of computational solutions for the PSP problem. Our main goal with this work is to review the methods and computational strategies that are currently used in 3-D protein prediction. Copyright © 2014 Elsevier Ltd. All rights reserved.

  19. PROGRAM SYSTEM AND INFORMATION METADATA BANK OF TERTIARY PROTEIN STRUCTURES

    Directory of Open Access Journals (Sweden)

    T. A. Nikitin

    2013-01-01

    The article deals with the architecture of a metadata storage model for the results of checks on three-dimensional protein structures. A conceptual database model was built. The service and procedure for database updates, as well as data transformation algorithms for protein structures and their quality, are presented. The most important information about entries, and the forms in which it is stored, accessed, and delivered to users, is highlighted. A software suite was developed to implement the functional tasks using the Java programming language in the NetBeans v.7.0 environment, with JQL used to query and interact with the JavaDB database. The service was tested, and the results show that the system is effective at filtering protein structures.

  20. SA-Search: a web tool for protein structure mining based on a Structural Alphabet

    OpenAIRE

    Guyon, Frédéric; Camproux, Anne-Claude; Hochez, Joëlle; Tufféry, Pierre

    2004-01-01

    SA-Search is a web tool that can be used to mine for protein structures and extract structural similarities. It is based on a hidden Markov model derived Structural Alphabet (SA) that allows the compression of three-dimensional (3D) protein conformations into a one-dimensional (1D) representation using a limited number of prototype conformations. Using such a representation, classical methods developed for amino acid sequences can be employed. Currently, SA-Search permits the performance of f...
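
    The idea of compressing a 3D conformation into a 1D string of prototype letters can be sketched as follows. Note the swapped-in technique: SA-Search derives its alphabet from a hidden Markov model, whereas this illustration uses plain nearest-prototype assignment, which ignores dependencies between consecutive letters. The prototype letters, the three-number fragment descriptors, and the function name are all hypothetical.

```python
import math

# Hypothetical prototypes: letter -> descriptor of a short backbone fragment
# (for example, a few inter-residue distances in angstroms).
PROTOTYPES = {
    "a": (5.4, 5.0, 6.2),
    "b": (6.0, 6.5, 9.0),
    "c": (6.7, 7.0, 10.5),
}


def encode(fragment_descriptors, prototypes=PROTOTYPES):
    """Map each fragment descriptor to the letter of its closest prototype."""
    letters = []
    for descriptor in fragment_descriptors:
        best = min(prototypes, key=lambda letter: math.dist(descriptor, prototypes[letter]))
        letters.append(best)
    return "".join(letters)


# Hypothetical descriptors for four consecutive fragments of a structure.
fragments = [(5.5, 5.1, 6.0), (6.6, 6.9, 10.2), (6.1, 6.4, 9.1), (5.3, 4.9, 6.3)]
sa_string = encode(fragments)
print(sa_string)
```

    Once a structure is reduced to such a string, sequence-style tools (alignment, substring search) can be applied to it, which is the point made in the abstract.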

  1. RACK1, A Multifaceted Scaffolding Protein: Structure and Function

    LENUS (Irish Health Repository)

    Adams, David R

    2011-10-06

    The Receptor for Activated C Kinase 1 (RACK1) is a member of the tryptophan-aspartate repeat (WD-repeat) family of proteins and shares significant homology to the β subunit of G-proteins (Gβ). RACK1 adopts a seven-bladed β-propeller structure which facilitates protein binding. RACK1 has a significant role to play in shuttling proteins around the cell, anchoring proteins at particular locations and in stabilising protein activity. It interacts with the ribosomal machinery, with several cell surface receptors and with proteins in the nucleus. As a result, RACK1 is a key mediator of various pathways and contributes to numerous aspects of cellular function. Here, we discuss RACK1 gene and structure and its role in specific signaling pathways, and address how posttranslational modifications facilitate subcellular location and translocation of RACK1. This review condenses several recent studies suggesting a role for RACK1 in physiological processes such as development, cell migration, central nervous system (CNS) function and circadian rhythm as well as reviewing the role of RACK1 in disease.

  2. Protein Function Prediction Based on Sequence and Structure Information

    KAUST Repository

    Smaili, Fatima Z.

    2016-05-25

    The number of available protein sequences in public databases is increasing exponentially. However, a significant fraction of these sequences lack functional annotation, which is essential to our understanding of how biological systems and processes operate. In this master's thesis project, we worked on inferring protein functions based on the primary protein sequence. In the approach we follow, 3D models are first constructed using I-TASSER. Functions are then deduced by structurally matching these predicted models, using global and local similarities, through three independent enzyme commission (EC) and gene ontology (GO) function libraries. The method was tested on 250 “hard” proteins, which lack homologous templates in both structure and function libraries. The results show that this method outperforms the conventional prediction methods based on sequence similarity or threading. Additionally, our method could be improved even further by incorporating protein-protein interaction information. Overall, the method we use provides an efficient approach for automated functional annotation of non-homologous proteins, starting from their sequence.

  3. Functional classification of protein structures by local structure matching in graph representation.

    Science.gov (United States)

    Mills, Caitlyn L; Garg, Rohan; Lee, Joslynn S; Tian, Liang; Suciu, Alexandru; Cooperman, Gene; Beuning, Penny J; Ondrechen, Mary Jo

    2018-03-31

    As a result of high-throughput protein structure initiatives, over 14,400 protein structures have been solved by structural genomics (SG) centers and participating research groups. While the totality of SG data represents a tremendous contribution to genomics and structural biology, reliable functional information for these proteins is generally lacking. Better functional predictions for SG proteins will add substantial value to the structural information already obtained. Our method described herein, Graph Representation of Active Sites for Prediction of Function (GRASP-Func), predicts quickly and accurately the biochemical function of proteins by representing residues at the predicted local active site as graphs rather than in Cartesian coordinates. We compare the GRASP-Func method to our previously reported method, structurally aligned local sites of activity (SALSA), using the ribulose phosphate binding barrel (RPBB), 6-hairpin glycosidase (6-HG), and Concanavalin A-like Lectins/Glucanase (CAL/G) superfamilies as test cases. In each of the superfamilies, SALSA and the much faster method GRASP-Func yield similar correct classification of previously characterized proteins, providing a validated benchmark for the new method. In addition, we analyzed SG proteins using our SALSA and GRASP-Func methods to predict function. Forty-one SG proteins in the RPBB superfamily, nine SG proteins in the 6-HG superfamily, and one SG protein in the CAL/G superfamily were successfully classified into one of the functional families in their respective superfamily by both methods. This improved, faster, validated computational method can yield more reliable predictions of function that can be used for a wide variety of applications by the community. © 2018 The Authors Protein Science published by Wiley Periodicals, Inc. on behalf of The Protein Society.

  4. Protein Secondary Structure Prediction Using Deep Convolutional Neural Fields.

    Science.gov (United States)

    Wang, Sheng; Peng, Jian; Ma, Jianzhu; Xu, Jinbo

    2016-01-11

    Protein secondary structure (SS) prediction is important for studying protein structure and function. When only the sequence (profile) information is used as input feature, currently the best predictors can obtain ~80% Q3 accuracy, which has not been improved in the past decade. Here we present DeepCNF (Deep Convolutional Neural Fields) for protein SS prediction. DeepCNF is a Deep Learning extension of Conditional Neural Fields (CNF), which is an integration of Conditional Random Fields (CRF) and shallow neural networks. DeepCNF can model not only complex sequence-structure relationship by a deep hierarchical architecture, but also interdependency between adjacent SS labels, so it is much more powerful than CNF. Experimental results show that DeepCNF can obtain ~84% Q3 accuracy, ~85% SOV score, and ~72% Q8 accuracy, respectively, on the CASP and CAMEO test proteins, greatly outperforming currently popular predictors. As a general framework, DeepCNF can be used to predict other protein structure properties such as contact number, disorder regions, and solvent accessibility.
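
    Q3 accuracy, the headline metric above, is the fraction of residues whose three-state label (helix H, strand E, coil C) is predicted correctly. A minimal sketch of the calculation follows; the two label strings are made-up examples, not data from the paper, and the function name is an assumption.

```python
def q3_accuracy(predicted, observed):
    """Fraction of residues whose three-state label (H, E or C) matches."""
    if len(predicted) != len(observed):
        raise ValueError("label strings must have the same length")
    if not observed:
        raise ValueError("label strings must not be empty")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)


# Hypothetical labels for a 13-residue segment; they differ at one position.
observed = "CHHHHHCCEEEEC"
predicted = "CHHHHCCCEEEEC"
q3 = q3_accuracy(predicted, observed)
print(f"Q3 = {q3:.3f}")
```

    Q8 accuracy is computed the same way over the eight-state label alphabet, which is why it is typically lower than Q3 for the same predictor.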

  5. Phosphorus Binding Sites in Proteins: Structural Preorganization and Coordination

    DEFF Research Database (Denmark)

    Gruber, Mathias Felix; Greisen, Per Junior; Junker, Märta Caroline

    2014-01-01

    to individual structures that bind to phosphate groups; here, we investigate a total of 8307 structures obtained from the RCSB Protein Data Bank (PDB). An analysis of the binding site amino acid propensities reveals very characteristic first shell residue distributions, which are found to be influenced...... by the characteristics of the phosphorus compound and by the presence of cobound cations. The second shell, which supports the coordinating residues in the first shell, is found to consist mainly of protein backbone groups. Our results show how the second shell residue distribution is dictated mainly by the first shell...

  6. Taking MAD to the extreme: ultrafast protein structure determination

    International Nuclear Information System (INIS)

    Walsh, M.A.; Dementieva, I.; Evans, G.; Sanishvili, R.; Joachimiak, A.

    1999-01-01

    Multiwavelength anomalous diffraction data were measured in 23 min from a 16 kDa selenomethionyl-substituted protein, producing experimental phases to 2.25 Å resolution. The data were collected on a mosaic 3 x 3 charge-coupled device using undulator radiation from the Structural Biology Center 19ID beamline at the Argonne National Laboratory's Advanced Photon Source. The phases were independently obtained semiautomatically by two crystallographic program suites, CCP4 and CNS. The quality and speed of this data acquisition exemplify the opportunities at third-generation synchrotron sources for high-throughput protein crystal structure determination.

  7. Automatic protein structure solution from weak X-ray data

    Science.gov (United States)

    Skubák, Pavol; Pannu, Navraj S.

    2013-11-01

    Determining new protein structures from X-ray diffraction data at low resolution or with a weak anomalous signal is a difficult and often an impossible task. Here we propose a multivariate algorithm that simultaneously combines the structure determination steps. In tests on over 140 real data sets from the protein data bank, we show that this combined approach can automatically build models where current algorithms fail, including an anisotropically diffracting 3.88 Å RNA polymerase II data set. The method seamlessly automates the process, is ideal for non-specialists and provides a mathematical framework for successfully combining various sources of information in image processing.

  8. Prediction of protein-protein interactions in dengue virus coat proteins guided by low resolution cryoEM structures

    Directory of Open Access Journals (Sweden)

    Srinivasan Narayanaswamy

    2010-06-01

    Background: Dengue virus, along with other members of the Flaviviridae family, has reemerged as a deadly human pathogen. Understanding the mechanistic details of these infections can be highly rewarding in developing effective antivirals. During maturation of the virus inside the host cell, the coat proteins E and M undergo conformational changes, altering the morphology of the viral coat. However, due to the low-resolution nature of the available 3-D structures of viral assemblies, the atomic details of these changes are still elusive. Results: In the present analysis, starting from Cα positions of low-resolution cryo-electron microscopic structures, the residue-level details of protein-protein interaction interfaces of dengue virus coat proteins have been predicted. By comparing the preexisting structures of the virus in different phases of its life cycle, the changes taking place in these predicted protein-protein interaction interfaces were followed as a function of the maturation process of the virus. Besides changing the current notion about the presence of only homodimers in the mature viral coat, the present analysis indicated the presence of a proline-rich motif at the protein-protein interaction interface of the coat protein. Investigating the conservation status of these seemingly functionally crucial residues across other members of the Flaviviridae family enabled dissecting common mechanisms used for infection by these viruses. Conclusions: Thus, using a computational approach, the present analysis has provided better insights into the preexisting low-resolution structures of virus assemblies, the findings of which can be used in designing effective antivirals against these deadly human pathogens.

  9. Examining the process of de novo gene birth: an educational primer on "integration of new genes into cellular networks, and their structural maturation".

    Science.gov (United States)

    Frietze, Seth; Leatherman, Judith

    2014-03-01

    New genes that arise from modification of the noncoding portion of a genome rather than being duplicated from parent genes are called de novo genes. These genes, identified by their brief evolution and lack of parent genes, provide an opportunity to study the timeframe in which emerging genes integrate into cellular networks, and how the characteristics of these genes change as they mature into bona fide genes. An article by G. Abrusán provides an opportunity to introduce students to fundamental concepts in evolutionary and comparative genetics and to provide a technical background by which to discuss systems biology approaches when studying the evolutionary process of gene birth. Basic background needed to understand the Abrusán study and details on comparative genomic concepts tailored for a classroom discussion are provided, including discussion questions and a supplemental exercise on navigating a genome database.

  10. Critical Features of Fragment Libraries for Protein Structure Prediction.

    Science.gov (United States)

    Trevizani, Raphael; Custódio, Fábio Lima; Dos Santos, Karina Baptista; Dardenne, Laurent Emmanuel

    2017-01-01

    The use of fragment libraries is a popular approach among protein structure prediction methods and has proven to substantially improve the quality of predicted structures. However, some vital aspects of a fragment library that influence the accuracy of modeling a native structure remain to be determined. This study investigates some of these features. Particularly, we analyze the effect of using secondary structure prediction guiding fragments selection, different fragments sizes and the effect of structural clustering of fragments within libraries. To have a clearer view of how these factors affect protein structure prediction, we isolated the process of model building by fragment assembly from some common limitations associated with prediction methods, e.g., imprecise energy functions and optimization algorithms, by employing an exact structure-based objective function under a greedy algorithm. Our results indicate that shorter fragments reproduce the native structure more accurately than the longer. Libraries composed of multiple fragment lengths generate even better structures, where longer fragments show to be more useful at the beginning of the simulations. The use of many different fragment sizes shows little improvement when compared to predictions carried out with libraries that comprise only three different fragment sizes. Models obtained from libraries built using only sequence similarity are, on average, better than those built with a secondary structure prediction bias. However, we found that the use of secondary structure prediction allows greater reduction of the search space, which is invaluable for prediction methods. The results of this study can be critical guidelines for the use of fragment libraries in protein structure prediction.

  11. Predicting and validating protein interactions using network structure.

    Directory of Open Access Journals (Sweden)

    Pao-Yang Chen

    2008-07-01

    Protein interactions play a vital part in the function of a cell. As experimental techniques for detection and validation of protein interactions are time consuming, there is a need for computational methods for this task. Protein interactions appear to form a network with a relatively high degree of local clustering. In this paper we exploit this clustering by suggesting a score based on triplets of observed protein interactions. The score utilises both protein characteristics and network properties. Our score based on triplets is shown to complement existing techniques for predicting protein interactions, outperforming them on data sets which display a high degree of clustering. The predicted interactions score highly against test measures for accuracy. Compared to a similar score derived from pairwise interactions only, the triplet score displays higher sensitivity and specificity. By looking at specific examples, we show how an experimental set of interactions can be enriched and validated. As part of this work we also examine the effect of different prior databases upon the accuracy of prediction and find that the interactions from the same kingdom give better results than from across kingdoms, suggesting that there may be fundamental differences between the networks. These results all emphasize that network structure is important and helps in the accurate prediction of protein interactions. The protein interaction data set and the program used in our analysis, and a list of predictions and validations, are available at http://www.stats.ox.ac.uk/bioinfo/resources/PredictingInteractions.

  12. Efficient Multicriteria Protein Structure Comparison on Modern Processor Architectures

    Science.gov (United States)

    Manolakos, Elias S.

    2015-01-01

    Fast increasing computational demand for all-to-all protein structures comparison (PSC) is a result of three confounding factors: rapidly expanding structural proteomics databases, high computational complexity of pairwise protein comparison algorithms, and the trend in the domain towards using multiple criteria for protein structures comparison (MCPSC) and combining results. We have developed a software framework that exploits many-core and multicore CPUs to implement efficient parallel MCPSC in modern processors based on three popular PSC methods, namely, TMalign, CE, and USM. We evaluate and compare the performance and efficiency of the two parallel MCPSC implementations using Intel's experimental many-core Single-Chip Cloud Computer (SCC) as well as Intel's Core i7 multicore processor. We show that the 48-core SCC is more efficient than the latest generation Core i7, achieving a speedup factor of 42 (efficiency of 0.9), making many-core processors an exciting emerging technology for large-scale structural proteomics. We compare and contrast the performance of the two processors on several datasets and also show that MCPSC outperforms its component methods in grouping related domains, achieving a high F-measure of 0.91 on the benchmark CK34 dataset. The software implementation for protein structure comparison using the three methods and combined MCPSC, along with the developed underlying rckskel algorithmic skeletons library, is available via GitHub. PMID:26605332
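    The efficiency of 0.9 quoted above is consistent with the usual definition of parallel efficiency, speedup divided by the number of cores. A small worked check follows, assuming that definition is the one the authors used; the function name is hypothetical.

```python
def parallel_efficiency(speedup: float, cores: int) -> float:
    """Parallel efficiency under the usual definition: speedup per core."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    return speedup / cores


# Figures quoted in the abstract: speedup factor of 42 on the 48-core SCC.
efficiency = parallel_efficiency(42, 48)
print(f"efficiency = {efficiency:.3f}")  # rounds to the reported 0.9
```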

  13. Efficient Multicriteria Protein Structure Comparison on Modern Processor Architectures.

    Science.gov (United States)

    Sharma, Anuj; Manolakos, Elias S

    2015-01-01

    Fast increasing computational demand for all-to-all protein structures comparison (PSC) is a result of three confounding factors: rapidly expanding structural proteomics databases, high computational complexity of pairwise protein comparison algorithms, and the trend in the domain towards using multiple criteria for protein structures comparison (MCPSC) and combining results. We have developed a software framework that exploits many-core and multicore CPUs to implement efficient parallel MCPSC in modern processors based on three popular PSC methods, namely, TMalign, CE, and USM. We evaluate and compare the performance and efficiency of the two parallel MCPSC implementations using Intel's experimental many-core Single-Chip Cloud Computer (SCC) as well as Intel's Core i7 multicore processor. We show that the 48-core SCC is more efficient than the latest generation Core i7, achieving a speedup factor of 42 (efficiency of 0.9), making many-core processors an exciting emerging technology for large-scale structural proteomics. We compare and contrast the performance of the two processors on several datasets and also show that MCPSC outperforms its component methods in grouping related domains, achieving a high F-measure of 0.91 on the benchmark CK34 dataset. The software implementation for protein structure comparison using the three methods and combined MCPSC, along with the developed underlying rckskel algorithmic skeletons library, is available via GitHub.

  14. Exploring the universe of protein structures beyond the Protein Data Bank.

    Science.gov (United States)

    Cossio, Pilar; Trovato, Antonio; Pietrucci, Fabio; Seno, Flavio; Maritan, Amos; Laio, Alessandro

    2010-11-04

    It is currently believed that the atlas of existing protein structures is faithfully represented in the Protein Data Bank. However, whether this atlas covers the full universe of all possible protein structures is still a highly debated issue. By using a sophisticated numerical approach, we performed an exhaustive exploration of the conformational space of a 60 amino acid polypeptide chain described with an accurate all-atom interaction potential. We generated a database of around 30,000 compact folds with at least of secondary structure corresponding to local minima of the potential energy. This ensemble plausibly represents the universe of protein folds of similar length; indeed, all the known folds are represented in the set with good accuracy. However, we discover that the known folds form a rather small subset, which cannot be reproduced by choosing random structures in the database. Rather, natural and possible folds differ by the contact order, on average significantly smaller in the former. This suggests the presence of an evolutionary bias, possibly related to kinetic accessibility, towards structures with shorter loops between contacting residues. Beside their conceptual relevance, the new structures open a range of practical applications such as the development of accurate structure prediction strategies, the optimization of force fields, and the identification and design of novel folds.
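    Contact order, the quantity said above to separate natural from merely possible folds, is commonly computed as the average sequence separation of contacting residues, normalised by chain length. A minimal sketch of that relative contact order follows. The abstract does not state which contact definition was used, so the Cα-only contacts and the distance cutoff here are assumptions, and the function name is hypothetical.

```python
import math


def relative_contact_order(coords, cutoff=8.0, min_separation=1):
    """Relative contact order: mean sequence separation of contacts / chain length.

    coords: one (x, y, z) tuple per residue, e.g. C-alpha positions (assumption).
    cutoff: contact distance threshold in angstroms (assumed value).
    """
    n = len(coords)
    total_separation = 0
    n_contacts = 0
    for i in range(n):
        for j in range(i + min_separation, n):
            if math.dist(coords[i], coords[j]) <= cutoff:
                total_separation += j - i
                n_contacts += 1
    if n_contacts == 0:
        return 0.0
    return total_separation / (n * n_contacts)


# Hypothetical 4-residue straight chain with 3.8 A spacing: with a 4.0 A cutoff
# only sequence neighbours are in contact, so every contact has separation 1.
chain = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
print(relative_contact_order(chain, cutoff=4.0))
```

    A fold dominated by contacts between residues that are close in sequence gives a small value, which matches the abstract's description of natural folds having shorter loops between contacting residues.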

  15. Exploring the universe of protein structures beyond the Protein Data Bank.

    Directory of Open Access Journals (Sweden)

    Pilar Cossio

    It is currently believed that the atlas of existing protein structures is faithfully represented in the Protein Data Bank. However, whether this atlas covers the full universe of all possible protein structures is still a highly debated issue. By using a sophisticated numerical approach, we performed an exhaustive exploration of the conformational space of a 60 amino acid polypeptide chain described with an accurate all-atom interaction potential. We generated a database of around 30,000 compact folds with at least of secondary structure corresponding to local minima of the potential energy. This ensemble plausibly represents the universe of protein folds of similar length; indeed, all the known folds are represented in the set with good accuracy. However, we discover that the known folds form a rather small subset, which cannot be reproduced by choosing random structures in the database. Rather, natural and possible folds differ by the contact order, on average significantly smaller in the former. This suggests the presence of an evolutionary bias, possibly related to kinetic accessibility, towards structures with shorter loops between contacting residues. Beside their conceptual relevance, the new structures open a range of practical applications such as the development of accurate structure prediction strategies, the optimization of force fields, and the identification and design of novel folds.

  16. Insulin stimulates phospholipase D-dependent phosphatidylcholine hydrolysis, Rho translocation, de novo phospholipid synthesis, and diacylglycerol/protein kinase C signaling in L6 myotubes.

    Science.gov (United States)

    Standaert, M L; Bandyopadhyay, G; Zhou, X; Galloway, L; Farese, R V

    1996-07-01

    Previous studies have provided conflicting findings on whether insulin activates certain, potentially important, phospholipid signaling systems in skeletal muscle preparations. In particular, insulin effects on the hydrolysis of phosphatidylcholine (PC) and subsequent activation of protein kinase C (PKC) have not been apparent in some studies. Presently, we examined insulin effects on phospholipid signaling systems, diacylglycerol (DAG) production, and PKC translocation/activation in L6 myotubes. We found that insulin provoked rapid increases in phospholipase D (PLD)-dependent hydrolysis of PC, as evidenced by increases in choline release and phosphatidylethanol production in cells incubated in the presence of ethanol. In association with PC-PLD activation, Rho, a small G protein that is known to activate PC-PLD, translocated from the cytosol to the membrane fraction in response to insulin treatment. PC-PLD activation was also accompanied by increases in total DAG production and increases in the translocation of both PKC enzyme activity and DAG-sensitive PKC-alpha, -beta, -delta, and -epsilon from the cytosol to the membrane fraction. A potential role for PKC or a related protein kinase in insulin action was suggested by the finding that RO 31-8220 inhibited both PKC enzyme activity and insulin-stimulated [3H]2-deoxyglucose uptake. Our findings provide the first evidence that insulin stimulates Rho translocation and activates PC-PLD in L6 skeletal muscle cells. Moreover, this signaling system appears to lead to increases in DAG/PKC signaling, which, along with other related signaling factors, may regulate certain metabolic processes, such as glucose transport, in these cells.

  17. Structure of Pfu Pop5, an archaeal RNase P protein.

    Science.gov (United States)

    Wilson, Ross C; Bohlen, Christopher J; Foster, Mark P; Bell, Charles E

    2006-01-24

    We have used NMR spectroscopy and x-ray crystallography to determine the three-dimensional structure of PF1378 (Pfu Pop5), one of four protein subunits of archaeal RNase P that shares a homolog in the eukaryotic enzyme. RNase P is an essential and ubiquitous ribonucleoprotein enzyme required for maturation of tRNA. In bacteria, the enzyme's RNA subunit is responsible for cleaving the single-stranded 5' leader sequence of precursor tRNA molecules (pre-tRNA), whereas the protein subunit assists in substrate binding. Although in bacteria the RNase P holoenzyme consists of one large catalytic RNA and one small protein subunit, in archaea and eukarya the enzyme contains several (≥4) protein subunits, each of which lacks sequence similarity to the bacterial protein. The functional role of the proteins is poorly understood, as is the increased complexity in comparison to the bacterial enzyme. Pfu Pop5 has been directly implicated in catalysis by the observation that it pairs with PF1914 (Pfu Rpp30) to functionally reconstitute the catalytic domain of the RNA subunit. The protein adopts an alpha-beta sandwich fold highly homologous to the single-stranded RNA binding RRM domain. Furthermore, the three-dimensional arrangement of Pfu Pop5's structural elements is remarkably similar to that of the bacterial protein subunit. NMR spectra have been used to map the interaction of Pop5 with Pfu Rpp30. The data presented permit tantalizing hypotheses regarding the role of this protein subunit shared by archaeal and eukaryotic RNase P.

  18. Systematic comparison of crystal and NMR protein structures deposited in the protein data bank.

    Science.gov (United States)

    Sikic, Kresimir; Tomic, Sanja; Carugo, Oliviero

    2010-09-03

    Nearly all the macromolecular three-dimensional structures deposited in the Protein Data Bank were determined by either crystallographic (X-ray) or nuclear magnetic resonance (NMR) spectroscopic methods. This paper reports a systematic comparison of the crystallographic and NMR results deposited in the files of the Protein Data Bank, in order to find out to what extent this information can be aggregated in bioinformatics. A non-redundant data set containing 109 NMR/X-ray structure pairs of nearly identical proteins was derived from the Protein Data Bank. A series of comparisons was performed, focusing on both global features and local details. It was observed that: (1) the RMSD values between NMR and crystal structures range from about 1.5 Å to about 2.5 Å; (2) the correlation between conformational deviations and residue type reveals that hydrophobic amino acids are more similar in crystal and NMR structures than hydrophilic amino acids; (3) the correlation between solvent accessibility of the residues and their conformational variability in solid state and in solution is relatively modest (correlation coefficient = 0.462); (4) beta strands on average match better between NMR and crystal structures than helices and loops; (5) conformational differences between loops are independent of crystal packing interactions in the solid state; (6) side chains buried in the protein interior are very seldom observed to adopt different orientations in the solid state and in solution.
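    The root-mean-square deviation used in point (1) above can be sketched as follows. This is a simplified illustration that assumes the two coordinate sets are already superimposed and matched atom for atom; a real comparison would first apply an optimal superposition, which is omitted here. The function name and coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two matched, pre-superimposed coordinate sets."""
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate sets must contain the same number of atoms")
    if not coords_a:
        raise ValueError("coordinate sets must not be empty")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Hypothetical two-atom example: every atom is displaced by 2 A along z.
model_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
model_b = [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)]
print(f"RMSD = {rmsd(model_a, model_b):.2f} A")
```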

  19. Quality assessment of protein model-structures based on structural and functional similarities.

    Science.gov (United States)

    Konopka, Bogumil M; Nebel, Jean-Christophe; Kotulska, Malgorzata

    2012-09-21

    Experimental determination of protein 3D structures is expensive, time consuming and sometimes impossible. The gap between the number of protein structures deposited in the Worldwide Protein Data Bank and the number of sequenced proteins is constantly widening. Computational modeling is deemed to be one of the ways to deal with the problem. Although protein 3D structure prediction is a difficult task, many tools are available. These tools can model a structure from a sequence or from partial structural information, e.g. contact maps. Consequently, biologists can automatically generate a putative 3D structure model of any protein. However, the main issue becomes evaluation of the model quality, which is one of the most important challenges of structural biology. GOBA (Gene Ontology-Based Assessment) is a novel Protein Model Quality Assessment Program. It estimates the compatibility between a model-structure and its expected function. GOBA is based on the assumption that a high quality model is expected to be structurally similar to proteins functionally similar to the prediction target. Whereas DALI is used to measure structure similarity, protein functional similarity is quantified using the standardized and hierarchical description of proteins provided by Gene Ontology combined with Wang's algorithm for calculating semantic similarity. Two approaches are proposed to express the quality of protein model-structures. One is a single model quality assessment method, the other is its modification, which provides a relative measure of model quality. Exhaustive evaluation is performed on data sets of model-structures submitted to the CASP8 and CASP9 contests. The validation shows that the method is able to discriminate between good and bad model-structures. The best of the tested GOBA scores achieved 0.74 and 0.8 as a mean Pearson correlation to the observed quality of models in our CASP8 and CASP9-based validation sets. GOBA also obtained the best result for two targets of CASP8, and

  20. Improving the accuracy of protein secondary structure prediction using structural alignment

    Directory of Open Access Journals (Sweden)

    Gallin Warren J

    2006-06-01

    Background: The accuracy of protein secondary structure prediction has steadily improved over the past 30 years. Now many secondary structure prediction methods routinely achieve an accuracy (Q3) of about 75%. We believe this accuracy could be further improved by including structure (as opposed to sequence) database comparisons as part of the prediction process. Indeed, given the large size of the Protein Data Bank (>35,000 sequences), the probability of a newly identified sequence having a structural homologue is actually quite high. Results: We have developed a method that performs structure-based sequence alignments as part of the secondary structure prediction process. By mapping the structure of a known homologue (sequence ID >25%) onto the query protein's sequence, it is possible to predict at least a portion of that query protein's secondary structure. By integrating this structural alignment approach with conventional (sequence-based) secondary structure methods and then combining it with a "jury-of-experts" system to generate a consensus result, it is possible to attain very high prediction accuracy. Using a sequence-unique test set of 1644 proteins from EVA, this new method achieves an average Q3 score of 81.3%. Extensive testing indicates this is approximately 4–5% better than any other method currently available. Assessments using non sequence-unique test sets (typical of those used in proteome annotation or structural genomics) indicate that this new method can achieve a Q3 score approaching 88%. Conclusion: By using both sequence and structure databases and by exploiting the latest techniques in machine learning it is possible to routinely predict protein secondary structure with an accuracy well above 80%. A program and web server, called PROTEUS, that performs these secondary structure predictions is accessible at http://wishart.biology.ualberta.ca/proteus. For high throughput or batch sequence analyses, the PROTEUS programs

  1. Structure of haze forming proteins in white wines: Vitis vinifera thaumatin-like proteins.

    Science.gov (United States)

    Marangon, Matteo; Van Sluyter, Steven C; Waters, Elizabeth J; Menz, Robert I

    2014-01-01

    Grape thaumatin-like proteins (TLPs) play roles in plant-pathogen interactions and can cause protein haze in white wine unless removed prior to bottling. Different isoforms of TLPs have different hazing potential and aggregation behavior. Here we present the elucidation of the molecular structures of three grape TLPs that display different hazing potential. The three TLPs have very similar structures despite belonging to two different classes (F2/4JRU is a thaumatin-like protein while I/4L5H and H2/4MBT are VVTL1), and having different unfolding temperatures (56 vs. 62°C), with protein F2/4JRU being heat unstable and forming haze, while I/4L5H does not. These differences in properties are attributable to the conformation of a single loop and the amino acid composition of its flanking regions.

  2. Structure of haze forming proteins in white wines: Vitis vinifera thaumatin-like proteins.

    Directory of Open Access Journals (Sweden)

    Matteo Marangon

    Grape thaumatin-like proteins (TLPs) play roles in plant-pathogen interactions and can cause protein haze in white wine unless removed prior to bottling. Different isoforms of TLPs have different hazing potential and aggregation behavior. Here we present the elucidation of the molecular structures of three grape TLPs that display different hazing potential. The three TLPs have very similar structures despite belonging to two different classes (F2/4JRU is a thaumatin-like protein while I/4L5H and H2/4MBT are VVTL1), and having different unfolding temperatures (56 vs. 62°C), with protein F2/4JRU being heat unstable and forming haze, while I/4L5H does not. These differences in properties are attributable to the conformation of a single loop and the amino acid composition of its flanking regions.

  3. Diversification of Protein Cage Structure Using Circularly Permuted Subunits.

    Science.gov (United States)

    Azuma, Yusuke; Herger, Michael; Hilvert, Donald

    2018-01-17

    Self-assembling protein cages are useful as nanoscale molecular containers for diverse applications in biotechnology and medicine. To expand the utility of such systems, there is considerable interest in customizing the structures of natural cage-forming proteins and designing new ones. Here we report that a circularly permuted variant of lumazine synthase, a cage-forming enzyme from Aquifex aeolicus (AaLS), affords versatile building blocks for the construction of nanocompartments that can be easily produced, tailored, and diversified. The topologically altered protein, cpAaLS, self-assembles into spherical and tubular cage structures with morphologies that can be controlled by the length of the linker connecting the native termini. Moreover, cpAaLS proteins integrate into wild-type and other engineered AaLS assemblies by coproduction in Escherichia coli to form patchwork cages. This coassembly strategy enables encapsulation of guest proteins in the lumen, modification of the exterior through genetic fusion, and tuning of the size and electrostatics of the compartments. This addition to the family of AaLS cages broadens the scope of this system for further applications and highlights the utility of circular permutation as a potentially general strategy for tailoring the properties of cage-forming proteins.
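    At the sequence level, circular permutation as described above amounts to joining the native termini with a linker and opening the chain at a new position. A minimal sketch of that rearrangement follows; the sequence, linker and cut position are hypothetical placeholders and do not correspond to the actual cpAaLS design.

```python
def circular_permutation(sequence: str, new_start: int, linker: str = "") -> str:
    """Return the circularly permuted sequence that begins at index new_start (0-based).

    The native C-terminus is joined to the native N-terminus through `linker`,
    and new termini are created at the chosen cut position.
    """
    if not 0 < new_start < len(sequence):
        raise ValueError("new_start must fall strictly inside the sequence")
    return sequence[new_start:] + linker + sequence[:new_start]


# Hypothetical 8-residue sequence and a short glycine/serine linker.
native = "ACDEFGHK"
permuted = circular_permutation(native, new_start=3, linker="GGS")
print(permuted)
```

    Varying the length of `linker` corresponds to the linker-length parameter that the abstract reports as controlling cage morphology.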

  4. Protein flexibility: coordinate uncertainties and interpretation of structural differences

    Energy Technology Data Exchange (ETDEWEB)

    Rashin, Alexander A., E-mail: alexander-rashin@hotmail.com [BioChemComp Inc., 543 Sagamore Avenue, Teaneck, NJ 07666 (United States); LH Baker Center for Bioinformatics and Department of Biochemistry, Biophysics and Molecular Biology, 112 Office and Lab Building, Iowa State University, Ames, IA 50011-3020 (United States); Rashin, Abraham H. L. [BioChemComp Inc., 543 Sagamore Avenue, Teaneck, NJ 07666 (United States); Rutgers, The State University of New Jersey, 22371 BPO WAY, Piscataway, NJ 08854-8123 (United States); Jernigan, Robert L. [LH Baker Center for Bioinformatics and Department of Biochemistry, Biophysics and Molecular Biology, 112 Office and Lab Building, Iowa State University, Ames, IA 50011-3020 (United States); BioChemComp Inc., 543 Sagamore Avenue, Teaneck, NJ 07666 (United States)

    2009-11-01

    Criteria for the interpretability of coordinate differences and a new method for identifying rigid-body motions and nonrigid deformations in protein conformational changes are developed and applied to functionally induced and crystallization-induced conformational changes. Valid interpretations of conformational movements in protein structures determined by X-ray crystallography require that the movement magnitudes exceed their uncertainty threshold. Here, it is shown that such thresholds can be obtained from the distance difference matrices (DDMs) of 1014 pairs of independently determined structures of bovine ribonuclease A and sperm whale myoglobin, with no explanations provided for reportedly minor coordinate differences. The smallest magnitudes of reportedly functional motions are just above these thresholds. Uncertainty thresholds can provide objective criteria that distinguish between true conformational changes and apparent ‘noise’, showing that some previous interpretations of protein coordinate changes attributed to external conditions or mutations may be doubtful or erroneous. The use of uncertainty thresholds, DDMs, the newly introduced CDDMs (contact distance difference matrices) and a novel simple rotation algorithm allows a more meaningful classification and description of protein motions, distinguishing between various rigid-fragment motions and nonrigid conformational deformations. It is also shown that half of 75 pairs of identical molecules, each from the same asymmetric crystallographic cell, exhibit coordinate differences that range from just outside the coordinate uncertainty threshold to the full magnitude of large functional movements. Thus, crystallization might often induce protein conformational changes that are comparable to those related to or induced by the protein function.
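    A distance difference matrix (DDM) of the kind described above can be sketched as the element-wise difference between the intramolecular distance matrices of two structures, which makes it independent of how the structures are superimposed. The example below is illustrative only; the coordinates and the threshold value are hypothetical and are not the uncertainty thresholds derived in the paper.

```python
import math


def distance_matrix(coords):
    """All pairwise distances within one structure."""
    return [[math.dist(p, q) for q in coords] for p in coords]


def distance_difference_matrix(coords_a, coords_b):
    """DDM[i][j] = d_ij(structure A) - d_ij(structure B)."""
    if len(coords_a) != len(coords_b):
        raise ValueError("structures must have the same number of atoms")
    da, db = distance_matrix(coords_a), distance_matrix(coords_b)
    n = len(coords_a)
    return [[da[i][j] - db[i][j] for j in range(n)] for i in range(n)]


def pairs_above_threshold(ddm, threshold):
    """Atom pairs whose distance change exceeds the given uncertainty threshold."""
    n = len(ddm)
    return [(i, j) for i in range(n) for j in range(i + 1, n)
            if abs(ddm[i][j]) > threshold]


# Hypothetical three-atom structures: atom 2 moves, atoms 0 and 1 stay fixed.
structure_a = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (6.0, 0.0, 0.0)]
structure_b = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (3.0, 4.0, 0.0)]
ddm = distance_difference_matrix(structure_a, structure_b)
print(pairs_above_threshold(ddm, threshold=0.5))  # threshold is an assumed value
```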

  5. Improved hybrid optimization algorithm for 3D protein structure prediction.

    Science.gov (United States)

    Zhou, Changjun; Hou, Caixia; Wei, Xiaopeng; Zhang, Qiang

    2014-07-01

    A new improved hybrid optimization algorithm, PGATS, based on the toy off-lattice model, is presented for three-dimensional protein structure prediction problems. The algorithm combines particle swarm optimization (PSO), the genetic algorithm (GA), and tabu search (TS). Several further improvements are also made: a stochastic disturbance factor is added to the particle swarm optimization to improve its search ability; the crossover and mutation operations of the genetic algorithm are replaced by a kind of random linear method; and the tabu search algorithm is improved by appending a mutation operator. Through this combination of strategies and algorithms, protein structure prediction (PSP) in a 3D off-lattice model is achieved. The PSP problem is NP-hard, but it can be treated as a global optimization problem with multiple extrema and multiple parameters. This is the theoretical principle of the hybrid optimization algorithm proposed in this paper. The algorithm combines local search and global search, which overcomes the shortcomings of any single algorithm and exploits the advantages of each. The method is verified on the commonly used benchmark sequences: Fibonacci sequences and real protein sequences. Experiments show that the proposed method outperforms single algorithms in the accuracy of the calculated protein sequence energy value, and that it is an effective way to predict the structure of proteins.

  6. Reflections on protein splicing: structures, functions and mechanisms

    Science.gov (United States)

    Anraku, Yasuhiro; Satow, Yoshinori

    2009-01-01

    Twenty years ago, evidence that one gene produces two enzymes via protein splicing emerged from structural and expression studies of the VMA1 gene in Saccharomyces cerevisiae. VMA1 consists of a single open reading frame and contains two independent pieces of genetic information: one for Vma1p (a catalytic 70-kDa subunit of the vacuolar H+-ATPase) and one for VDE (a 50-kDa DNA endonuclease), the latter as an in-frame spliced insert in the gene. Protein splicing is a posttranslational cellular process in which an intervening polypeptide, termed the VMA1 intein, is self-catalytically excised from a nascent 120-kDa VMA1 precursor and the two flanking polypeptides, the N- and C-exteins, are ligated to produce the mature Vma1p. Subsequent studies have demonstrated that protein splicing is not unique to the VMA1 precursor and that there are many operons in nature which implement genetic information editing at the protein level. To elucidate its structure-directed chemical mechanisms, a series of biochemical and crystal structural studies has been carried out with the use of various VMA1 recombinants. This article summarizes a VDE-mediated self-catalytic mechanism for protein splicing that is triggered and terminated solely via thiazolidine intermediates with tetrahedral configurations formed within the splicing sites, where proton ingress and egress are driven by balanced protonation and deprotonation. PMID:19907126

  7. Structure and Pathology of Tau Protein in Alzheimer Disease

    Directory of Open Access Journals (Sweden)

    Michala Kolarova

    2012-01-01

    Alzheimer's disease (AD) is the most common type of dementia. In connection with the global trend of prolonging human life and the increasing number of elderly in the population, AD is becoming one of the most serious health and socioeconomic problems of the present. Tau protein promotes assembly and stabilizes microtubules, which contributes to the proper function of neurons. Alterations in the amount or the structure of tau protein can affect its role as a stabilizer of microtubules as well as some of the processes in which it is implicated. The molecular mechanisms governing tau aggregation are mainly represented by several posttranslational modifications that alter its structure and conformational state. Hence, abnormal phosphorylation and truncation of tau protein have gained attention as key mechanisms that turn tau protein into a pathological entity. Evidence of the clinicopathological significance of phosphorylated and truncated tau has been documented during the progression of AD, as has their capacity to exert cytotoxicity when expressed in cell and animal models. This paper describes the normal structure and function of tau protein and its major alterations during its pathological aggregation in AD.

  8. The Structure and Function of Non-Collagenous Bone Proteins

    Science.gov (United States)

    Hook, Magnus

    1997-01-01

    The long-term goal for this program is to determine the structural and functional relationships of bone proteins and proteins that interact with bone. This information will be used to design useful pharmacological compounds that will have a beneficial effect in osteoporotic patients and in the osteoporotic-like effects experienced on long duration space missions. The first phase of this program, funded under a cooperative research agreement with NASA through the Texas Medical Center, aimed to develop powerful recombinant expression systems and purification methods for production of large amounts of target proteins. Proteins expressed in sufficient amount and purity would be characterized by a variety of structural methods, and made available for crystallization studies. In order to increase the likelihood of crystallization and subsequent high resolution solution of structures, we undertook to develop expression of normal and mutant forms of proteins by bacterial and mammalian cells. In addition to the main goals of this program, we would also be able to provide reagents for other related studies, including development of anti-fibrotic and anti-metastatic therapeutics.

  9. Utilizing knowledge base of amino acids structural neighborhoods to predict protein-protein interaction sites.

    Science.gov (United States)

    Jelínek, Jan; Škoda, Petr; Hoksza, David

    2017-12-06

    Protein-protein interactions (PPI) play a key role in an investigation of various biochemical processes, and their identification is thus of great importance. Although computational prediction of which amino acids take part in a PPI has been an active field of research for some time, the quality of in-silico methods is still far from perfect. We have developed a novel prediction method called INSPiRE which benefits from a knowledge base built from data available in Protein Data Bank. All proteins involved in PPIs were converted into labeled graphs with nodes corresponding to amino acids and edges to pairs of neighboring amino acids. A structural neighborhood of each node was then encoded into a bit string and stored in the knowledge base. When predicting PPIs, INSPiRE labels amino acids of unknown proteins as interface or non-interface based on how often their structural neighborhood appears as interface or non-interface in the knowledge base. We evaluated INSPiRE's behavior with respect to different types and sizes of the structural neighborhood. Furthermore, we examined the suitability of several different features for labeling the nodes. Our evaluations showed that INSPiRE clearly outperforms existing methods with respect to Matthews correlation coefficient. In this paper we introduce a new knowledge-based method for identification of protein-protein interaction sites called INSPiRE. Its knowledge base utilizes structural patterns of known interaction sites in the Protein Data Bank which are then used for PPI prediction. Extensive experiments on several well-established datasets show that INSPiRE significantly surpasses existing PPI approaches.
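
The knowledge-base lookup described in the abstract above can be sketched roughly as follows. This is a toy illustration only: the bit strings, the majority-vote rule, and the fallback label are assumptions, since the record does not specify INSPiRE's actual encoding or decision rule.

```python
from collections import Counter, defaultdict

def build_knowledge_base(labelled_neighborhoods):
    """Count how often each neighborhood bit string was seen as interface or not.

    labelled_neighborhoods: iterable of (bit_string, is_interface) pairs.
    """
    kb = defaultdict(Counter)
    for fingerprint, is_interface in labelled_neighborhoods:
        label = "interface" if is_interface else "non_interface"
        kb[fingerprint][label] += 1
    return kb

def predict(kb, fingerprint, default="non_interface"):
    """Label a residue by majority vote over matching knowledge-base entries.

    The majority rule and the default for unseen fingerprints are assumptions.
    """
    counts = kb.get(fingerprint)
    if not counts:
        return default
    if counts["interface"] > counts["non_interface"]:
        return "interface"
    return "non_interface"

# Hypothetical training data: bit strings stand in for encoded neighborhoods.
training = [("1011", True), ("1011", True), ("1011", False), ("0100", False)]
kb = build_knowledge_base(training)
labels = [predict(kb, fp) for fp in ("1011", "0100", "1111")]
```

A real system would need a policy for fingerprints never seen in the knowledge base; the sketch simply falls back to the non-interface label.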

  10. Cloud prediction of protein structure and function with PredictProtein for Debian.

    Science.gov (United States)

    Kaján, László; Yachdav, Guy; Vicedo, Esmeralda; Steinegger, Martin; Mirdita, Milot; Angermüller, Christof; Böhm, Ariane; Domke, Simon; Ertl, Julia; Mertes, Christian; Reisinger, Eva; Staniewski, Cedric; Rost, Burkhard

    2013-01-01

    We report the release of PredictProtein for the Debian operating system and derivatives, such as Ubuntu, Bio-Linux, and Cloud BioLinux. The PredictProtein suite is available as a standard set of open source Debian packages. The release covers the most popular prediction methods from the Rost Lab, including methods for the prediction of secondary structure and solvent accessibility (profphd), nuclear localization signals (predictnls), and intrinsically disordered regions (norsnet). We also present two case studies that successfully utilize PredictProtein packages for high performance computing in the cloud: the first analyzes protein disorder for whole organisms, and the second analyzes the effect of all possible single sequence variants in protein coding regions of the human genome.

  11. Structure modification and functionality of whey proteins: quantitative structure-activity relationship approach.

    Science.gov (United States)

    Nakai, S; Li-Chan, E

    1985-10-01

    According to the original idea of quantitative structure-activity relationship, electric, hydrophobic, and structural parameters should be taken into consideration for elucidating functionality. Changes in these parameters are reflected in the property of protein solubility upon modification of whey proteins by heating. Although solubility is itself a functional property, it has been utilized to explain other functionalities of proteins. However, better correlations were obtained when hydrophobic parameters of the proteins were used in conjunction with solubility. Various treatments reported in the literature were applied to whey protein concentrate in an attempt to obtain whipping and gelling properties similar to those of egg white. Mapping simplex optimization was used to search for the best results. Improvement in whipping properties by pepsin hydrolysis may have been due to higher protein solubility, and good gelling properties resulting from polyphosphate treatment may have been due to an increase in exposable hydrophobicity. However, the results of angel food cake making were still unsatisfactory.

  12. PCI-SS: MISO dynamic nonlinear protein secondary structure prediction

    Directory of Open Access Journals (Sweden)

    Aboul-Magd Mohammed O

    2009-07-01

    Background: Since the function of a protein is largely dictated by its three dimensional configuration, determining a protein's structure is of fundamental importance to biology. Here we report on a novel approach to determining the one dimensional secondary structure of proteins (distinguishing α-helices, β-strands, and non-regular structures) from primary sequence data which makes use of Parallel Cascade Identification (PCI), a powerful technique from the field of nonlinear system identification. Results: Using PSI-BLAST divergent evolutionary profiles as input data, dynamic nonlinear systems are built through a black-box approach to model the process of protein folding. Genetic algorithms (GAs) are applied in order to optimize the architectural parameters of the PCI models. The three-state prediction problem is broken down into a combination of three binary sub-problems and protein structure classifiers are built using 2 layers of PCI classifiers. Careful construction of the optimization, training, and test datasets ensures that no homology exists between any training and testing data. A detailed comparison between PCI and 9 contemporary methods is provided over a set of 125 new protein chains guaranteed to be dissimilar to all training data. Unlike other secondary structure prediction methods, here a web service is developed to provide both human- and machine-readable interfaces to PCI-based protein secondary structure prediction. This server, called PCI-SS, is available at http://bioinf.sce.carleton.ca/PCISS. In addition to a dynamic PHP-generated web interface for humans, a Simple Object Access Protocol (SOAP) interface is added to permit invocation of the PCI-SS service remotely. This machine-readable interface facilitates incorporation of PCI-SS into multi-faceted systems biology analysis pipelines requiring protein secondary structure information, and greatly simplifies high-throughput analyses. XML is used to represent the input

  13. Extreme-Scale De Novo Genome Assembly

    Energy Technology Data Exchange (ETDEWEB)

    Georganas, Evangelos [Intel Corporation, Santa Clara, CA (United States); Hofmeyr, Steven [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Joint Genome Inst.; Egan, Rob [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Computational Research Division; Buluc, Aydin [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Joint Genome Inst.; Oliker, Leonid [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Joint Genome Inst.; Rokhsar, Daniel [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Computational Research Division; Yelick, Katherine [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Joint Genome Inst.

    2017-09-26

    De novo whole genome assembly reconstructs genomic sequence from short, overlapping, and potentially erroneous DNA segments and is one of the most important computations in modern genomics. This work presents HipMer, a high-quality end-to-end de novo assembler designed for extreme scale analysis, via efficient parallelization of the Meraculous code. Genome assembly software has many components, each of which stresses different components of a computer system. This chapter explains the computational challenges involved in each step of the HipMer pipeline, the key distributed data structures, and communication costs in detail. We present performance results of assembling the human genome and the large hexaploid wheat genome on large supercomputers up to tens of thousands of cores.

  14. What determines the structures of native folds of proteins?

    International Nuclear Information System (INIS)

    Trovato, Antonio; Hoang, Trinh X; Banavar, Jayanth R; Maritan, Amos; Seno, Flavio

    2005-01-01

    We review a simple physical model (Hoang et al 2004 Proc. Natl Acad. Sci. USA 101 7960, Banavar et al 2004 Phys. Rev. E at press) which captures the essential physico-chemical ingredients that determine protein structure, such as the inherent anisotropy of a chain molecule, the geometrical and energetic constraints placed by hydrogen bonds, sterics, and hydrophobicity. Within this framework, marginally compact conformations resembling the native state folds of proteins emerge as competing minima in the free energy landscape. Here we demonstrate that a hydrophobic-polar (HP) sequence composed of regularly repeated patterns has as its ground state a β-helical structure remarkably similar to a known architecture in the Protein Data Bank

  15. Ultrafast protein structure-based virtual screening with Panther

    Science.gov (United States)

    Niinivehmas, Sanna P.; Salokas, Kari; Lätti, Sakari; Raunio, Hannu; Pentikäinen, Olli T.

    2015-10-01

    Molecular docking is by far the most common method used in protein structure-based virtual screening. This paper presents Panther, a novel ultrafast multipurpose docking tool. In Panther, a simple shape-electrostatic model of the ligand-binding area of the protein is created by utilizing the protein crystal structure. The features of the possible ligands are then compared to the model by using a similarity search algorithm. On average, one ligand can be processed in a few minutes by using classical docking methods, whereas Panther processes a ligand far more quickly. The Panther protocol can be used in several applications, such as speeding up the early phases of drug discovery projects, reducing the number of failures in the clinical phase of the drug development process, and estimating the environmental toxicity of chemicals. The Panther code is available on our web pages (http://www.jyu.fi/panther) free of charge after registration.

  16. Neutron structure of the hydrophobic plant protein crambin

    International Nuclear Information System (INIS)

    Teeter, M.M.; Kossiakoff, A.A.

    1982-01-01

    Crystals of the small hydrophobic protein crambin have been shown to diffract to a resolution of at least 0.88 Å. This means that crambin presents a rare opportunity to study a protein structure at virtually atomic resolution. The high resolution of the diffraction pattern coupled with the assets of neutron diffraction present the distinct possibility that crambin's analysis may surpass that of any other protein system in degree and accuracy of detail. The neutron crambin structure is currently being refined at 1.50 Å (44.9% of the data to 1.2 Å has also been included). It is expected that a nominal resolution of 1.0 Å can be achieved. 15 references, 6 figures, 2 tables

  17. Structure and Function of Caltrin (calcium transport inhibitor) Proteins

    Directory of Open Access Journals (Sweden)

    Ernesto Javier Grasso

    2017-12-01

    Caltrin (calcium transport inhibitor) is a family of small and basic proteins of the mammalian seminal plasma which bind to sperm cells during ejaculation and inhibit the extracellular Ca2+ uptake, preventing premature acrosomal exocytosis and hyperactivation when sperm cells ascend through the female reproductive tract. The binding of caltrin proteins to specific areas of the sperm surface suggests the existence of caltrin receptors, or precise protein-phospholipid arrangements in the sperm membrane, distributed in the regions where Ca2+ influx may take place. However, the molecular mechanisms of recognition and interaction between caltrin and spermatozoa have not been elucidated. Therefore, the aim of this article is to describe in depth the known structural features and functional properties of caltrin proteins, to find out how they may possibly interact with the sperm membranes to control the intracellular signaling that triggers physiological events required for fertilization.

  18. Structure and assembly of scalable porous protein cages

    NARCIS (Netherlands)

    Sasaki, Eita; Böhringer, Daniel; van de Waterbeemd, Michiel; Leibundgut, Marc; Zschoche, Reinhard; Heck, Albert J R; Ban, Nenad; Hilvert, Donald

    2017-01-01

    Proteins that self-assemble into regular shell-like polyhedra are useful, both in nature and in the laboratory, as molecular containers. Here we describe cryo-electron microscopy (EM) structures of two versatile encapsulation systems that exploit engineered electrostatic interactions for cargo

  19. Progression of 3D Protein Structure and Dynamics Measurements

    Science.gov (United States)

    Sato-Tomita, Ayana; Sekiguchi, Hiroshi; Sasaki, Yuji C.

    2018-06-01

    New measurement methodologies have begun to be proposed with the recent progress in the life sciences. Here, we introduce two new methodologies, X-ray fluorescence holography for protein structural analysis and diffracted X-ray tracking (DXT), to observe the dynamic behaviors of individual single molecules.

  20. Flow-induced structuring of dense protein dispersions

    NARCIS (Netherlands)

    Manski, J.M.

    2007-01-01

    Both health and sustainability are drivers for the increased interest in the creation of novel foods comprising a high protein content. The key challenge is the formation of an attractive, stable and palatable food texture, which is mainly determined by the food structure. In this research, new

  1. Crystal structure of human protein kinase CK2

    DEFF Research Database (Denmark)

    Niefind, K; Guerra, B; Ermakowa, I

    2001-01-01

    The crystal structure of a fully active form of human protein kinase CK2 (casein kinase 2) consisting of two C-terminally truncated catalytic and two regulatory subunits has been determined at 3.1 Å resolution. In the CK2 complex the regulatory subunits form a stable dimer linking the two catalytic subunits, which make no direct contact with one another. Each catalytic subunit interacts with both regulatory chains, predominantly via an extended C-terminal tail of the regulatory subunit. The CK2 structure is consistent with its constitutive activity and with a flexible role of the regulatory subunit as a docking partner for various protein kinases. Furthermore it shows an inter-domain mobility in the catalytic subunit known to be functionally important in protein kinases and detected here for the first time directly within one crystal structure.

  2. Protein mechanics: a route from structure to function

    Indian Academy of Sciences (India)

    PRAKASH KUMAR

    and how fast individual amino acid side chains change their conformational ... within the overall protein structure, we could simply analyze the fluctuations of the mean ... value simply acts as an overall scale factor on the final results). In this case .... database (Porter et al 2004) or in an earlier elastic network study (Yang and ...

  3. Correlated mutations in protein sequences: Phylogenetic and structural effects

    Energy Technology Data Exchange (ETDEWEB)

    Lapedes, A.S. [Los Alamos National Lab., NM (United States). Theoretical Div.]|[Santa Fe Inst., NM (United States); Giraud, B.G. [C.E.N. Saclay, Gif/Yvette (France). Service Physique Theorique; Liu, L.C. [Los Alamos National Lab., NM (United States). Theoretical Div.; Stormo, G.D. [Univ. of Colorado, Boulder, CO (United States). Dept. of Molecular, Cellular and Developmental Biology

    1998-12-01

    Covariation analysis of sets of aligned sequences for RNA molecules is relatively successful in elucidating RNA secondary structure, as well as some aspects of tertiary structure. Covariation analysis of sets of aligned sequences for protein molecules is successful in certain instances in elucidating certain structural and functional links, but in general, pairs of sites displaying highly covarying mutations in protein sequences do not necessarily correspond to sites that are spatially close in the protein structure. In this paper the authors identify two reasons why naive use of covariation analysis for protein sequences fails to reliably indicate sequence positions that are spatially proximate. The first reason involves the bias introduced in calculation of covariation measures due to the fact that biological sequences are generally related by a non-trivial phylogenetic tree. The authors present a null-model approach to solve this problem. The second reason involves linked chains of covariation which can result in pairs of sites displaying significant covariation even though they are not spatially proximate. They present a maximum entropy solution to this classic problem of causation versus correlation. The methodologies are validated in simulation.
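
As a point of reference for the "naive" covariation analysis discussed above, a minimal sketch of one common covariation measure, the mutual information between two alignment columns, is given here. The abstract does not name the measure the authors use, so mutual information is an assumption, and the four-sequence alignment is made up. The sketch includes neither the phylogenetic null model nor the maximum entropy correction that the paper proposes.

```python
import math
from collections import Counter

def mutual_information(col_i, col_j):
    """Mutual information (in bits) between two aligned sequence columns."""
    if len(col_i) != len(col_j):
        raise ValueError("columns must come from the same alignment")
    n = len(col_i)
    freq_i = Counter(col_i)
    freq_j = Counter(col_j)
    freq_ij = Counter(zip(col_i, col_j))
    mi = 0.0
    for (a, b), count in freq_ij.items():
        p_ab = count / n
        p_a = freq_i[a] / n
        p_b = freq_j[b] / n
        mi += p_ab * math.log2(p_ab / (p_a * p_b))
    return mi

# Hypothetical alignment: columns 0 and 1 vary together, column 2 independently.
alignment = ["AKL", "AKV", "GEL", "GEV"]
columns = list(zip(*alignment))
mi_01 = mutual_information(columns[0], columns[1])
mi_02 = mutual_information(columns[0], columns[2])
```

A high value for a column pair indicates covariation only; as the abstract stresses, it does not by itself show that the two sites are spatially close.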

  4. A probabilistic fragment-based protein structure prediction algorithm.

    Directory of Open Access Journals (Sweden)

    David Simoncini

    Conformational sampling is one of the bottlenecks in fragment-based protein structure prediction approaches. They generally start with a coarse-grained optimization where mainchain atoms and centroids of side chains are considered, followed by a fine-grained optimization with an all-atom representation of proteins. It is during this coarse-grained phase that fragment-based methods sample intensely the conformational space. If the native-like region is sampled more, the accuracy of the final all-atom predictions may be improved accordingly. In this work we present EdaFold, a new method for fragment-based protein structure prediction based on an Estimation of Distribution Algorithm. Fragment-based approaches build protein models by assembling short fragments from known protein structures. Whereas the probability mass functions over the fragment libraries are uniform in the usual case, we propose an algorithm that learns from previously generated decoys and steers the search toward native-like regions. A comparison with the Rosetta AbInitio protocol shows that EdaFold is able to generate models with lower energies and to enhance the percentage of near-native coarse-grained decoys on a benchmark of [Formula: see text] proteins. The best coarse-grained models produced by both methods were refined into all-atom models and used in molecular replacement. All-atom decoys produced out of EdaFold's decoy set reach high enough accuracy to solve the crystallographic phase problem by molecular replacement for some test proteins. EdaFold showed a higher success rate in molecular replacement when compared to Rosetta. Our study suggests that improving low resolution coarse-grained decoys allows computational methods to avoid subsequent sampling issues during all-atom refinement and to produce better all-atom models. EdaFold can be downloaded from http://www.riken.jp/zhangiru/software.html [corrected].
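
The idea of learning fragment probabilities from earlier decoys can be sketched as a generic estimation-of-distribution update. This is not EdaFold's actual update rule, which the abstract does not give: the elite fraction, the smoothing weight, and the data layout below are all assumptions chosen for illustration.

```python
def update_fragment_probabilities(probs, decoys, energies,
                                  keep_fraction=0.5, smoothing=0.1):
    """Re-estimate per-position fragment probabilities from low-energy decoys.

    probs:    probs[pos][k] is the current probability of fragment k at pos.
    decoys:   decoys[d][pos] is the fragment index decoy d used at pos.
    energies: energies[d] is the score of decoy d (lower is better).
    """
    ranked = sorted(range(len(decoys)), key=lambda d: energies[d])
    n_keep = max(1, int(len(decoys) * keep_fraction))
    elite = [decoys[d] for d in ranked[:n_keep]]
    new_probs = []
    for pos, old in enumerate(probs):
        counts = [0] * len(old)
        for decoy in elite:
            counts[decoy[pos]] += 1
        freq = [c / len(elite) for c in counts]
        # Blend with the previous distribution so no fragment drops to zero.
        new_probs.append([(1 - smoothing) * f + smoothing * o
                          for f, o in zip(freq, old)])
    return new_probs

# Hypothetical run: two positions, two candidate fragments each, uniform start.
uniform = [[0.5, 0.5], [0.5, 0.5]]
decoys = [[0, 1], [0, 0], [1, 1], [1, 0]]
energies = [-10.0, -8.0, -2.0, -1.0]
updated = update_fragment_probabilities(uniform, decoys, energies)
```

After one update, fragment 0 at the first position is strongly favored because both low-energy decoys used it, while the second position stays uniform because the elite decoys disagree there.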

  5. Solution structure of the human signaling protein RACK1

    Directory of Open Access Journals (Sweden)

    Papa Priscila F

    2010-06-01

    Background: The adaptor protein RACK1 (receptor of activated kinase 1) was originally identified as an anchoring protein for protein kinase C. RACK1 is a 36 kDa protein, and is composed of seven WD repeats which mediate its protein-protein interactions. RACK1 is ubiquitously expressed and has been implicated in diverse cellular processes, including protein translation regulation, neuropathological processes, cellular stress, and tissue development. Results: In this study we performed a biophysical analysis of human RACK1 with the aim of obtaining low resolution structural information. Small angle X-ray scattering (SAXS) experiments demonstrated that human RACK1 is globular and monomeric in solution and its low resolution structure is strikingly similar to that of a homology model previously calculated by us and to the crystallographic structure of RACK1 isoform A from Arabidopsis thaliana. Both sedimentation velocity and sedimentation equilibrium analytical ultracentrifugation techniques showed that RACK1 is predominantly a monomer of around 37 kDa in solution, but also presents small amounts of oligomeric species. Moreover, hydrodynamic data suggested that RACK1 has a slightly asymmetric shape. The interaction of RACK1 and Ki-1/57 was tested by sedimentation equilibrium. The results suggested that the association between RACK1 and Ki-1/57(122-413) follows a stoichiometry of 1:1. The binding constant (KB) observed for the RACK1-Ki-1/57(122-413) interaction was around (1.5 ± 0.2) × 10⁶ M⁻¹ and resulted in a dissociation constant (KD) of (0.7 ± 0.1) × 10⁻⁶ M. Moreover, the fluorescence data also suggest that the interaction may occur in a cooperative fashion. Conclusion: Our SAXS and analytical ultracentrifugation experiments indicated that RACK1 is predominantly a monomer in solution. RACK1 and Ki-1/57(122-413) interact strongly under the tested conditions.
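
The two constants quoted above are reciprocals of one another for a 1:1 association, which can be checked with a short calculation. The sketch reads the reported binding constant as (1.5 ± 0.2) × 10^6 M^-1; the first-order error propagation is an assumption about how the uncertainty was carried over, not something stated in the record.

```python
def dissociation_constant(kb, kb_err):
    """Convert a 1:1 binding constant (M^-1) to a dissociation constant (M).

    Uses KD = 1 / KB and first-order propagation: the relative error of KD
    equals the relative error of KB.
    """
    kd = 1.0 / kb
    kd_err = kd * (kb_err / kb)
    return kd, kd_err

kd, kd_err = dissociation_constant(1.5e6, 0.2e6)
# Express in units of 10^-6 M, rounded to one decimal as in the abstract.
kd_micromolar = round(kd * 1e6, 1)
kd_err_micromolar = round(kd_err * 1e6, 1)
```

Rounded to one decimal place, the result agrees with the reported KD of (0.7 ± 0.1) × 10^-6 M.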

  6. Ab initio protein structure assembly using continuous structure fragments and optimized knowledge-based force field.

    Science.gov (United States)

    Xu, Dong; Zhang, Yang

    2012-07-01

    Ab initio protein folding is one of the major unsolved problems in computational biology owing to the difficulties in force field design and conformational search. We developed a novel program, QUARK, for template-free protein structure prediction. Query sequences are first broken into fragments of 1-20 residues where multiple fragment structures are retrieved at each position from unrelated experimental structures. Full-length structure models are then assembled from fragments using replica-exchange Monte Carlo simulations, which are guided by a composite knowledge-based force field. A number of novel energy terms and Monte Carlo movements are introduced and the particular contributions to enhancing the efficiency of both force field and search engine are analyzed in detail. QUARK prediction procedure is depicted and tested on the structure modeling of 145 nonhomologous proteins. Although no global templates are used and all fragments from experimental structures with template modeling score >0.5 are excluded, QUARK can successfully construct 3D models of correct folds in one-third cases of short proteins up to 100 residues. In the ninth community-wide Critical Assessment of protein Structure Prediction experiment, QUARK server outperformed the second and third best servers by 18 and 47% based on the cumulative Z-score of global distance test-total scores in the FM category. Although ab initio protein folding remains a significant challenge, these data demonstrate new progress toward the solution of the most important problem in the field. Copyright © 2012 Wiley Periodicals, Inc.
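
The replica-exchange Monte Carlo search mentioned above relies on occasionally swapping conformations between replicas running at different temperatures. A minimal sketch of the standard Metropolis swap criterion follows; the energies, temperatures, and the unit Boltzmann constant are illustrative assumptions, and nothing here reproduces QUARK's force field or its move set.

```python
import math

def swap_probability(energy_i, energy_j, temp_i, temp_j, k_b=1.0):
    """Acceptance probability for exchanging the states of replicas i and j.

    Standard criterion: min(1, exp((beta_i - beta_j) * (E_i - E_j))),
    with beta = 1 / (k_b * T).
    """
    beta_i = 1.0 / (k_b * temp_i)
    beta_j = 1.0 / (k_b * temp_j)
    delta = (beta_i - beta_j) * (energy_i - energy_j)
    return 1.0 if delta >= 0 else math.exp(delta)

# The cold replica (T=1) holds the higher energy: the swap is always accepted.
p_favourable = swap_probability(-5.0, -10.0, temp_i=1.0, temp_j=2.0)
# The cold replica already holds the lower energy: the swap is rarely accepted.
p_unfavourable = swap_probability(-10.0, -5.0, temp_i=1.0, temp_j=2.0)
```

Swaps let low-energy conformations migrate toward the low-temperature replicas while the high-temperature replicas keep crossing energy barriers.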

  7. Improved protein surface comparison and application to low-resolution protein structure data

    Directory of Open Access Journals (Sweden)

    Kihara Daisuke

    2010-12-01

    Background: Recent advancements of experimental techniques for determining protein tertiary structures raise significant challenges for protein bioinformatics. With the number of known structures of unknown function expanding at a rapid pace, an urgent task is to provide reliable clues to their biological function on a large scale. Conventional approaches for structure comparison are not suitable for a real-time database search due to their slow speed. Moreover, a new challenge has arisen from recent techniques such as electron microscopy (EM), which provide low-resolution structure data. Previously, we have introduced a method for protein surface shape representation using the 3D Zernike descriptors (3DZDs). The 3DZD enables fast structure database searches, taking advantage of its rotation invariance and compact representation. Search results for protein surfaces represented with the 3DZD have shown good agreement with the existing structure classifications, but some discrepancies were also observed. Results: Three surface representations are examined: backbone atoms, the originally devised all-atom surface representation, and the combination of the all-atom surface with the backbone representation. All representations are encoded with the 3DZD. Also, we have investigated the applicability of the 3DZD for searching protein EM density maps of varying resolutions. The surface representations are evaluated on structure retrieval using two existing classifications, SCOP and the CE-based classification. Conclusions: Overall, the 3DZDs representing backbone atoms show better retrieval performance than the original all-atom surface representation. The performance further improved when the two representations are combined. Moreover, we observed that the 3DZD is also powerful in comparing low-resolution structures obtained by electron microscopy.

  8. Improved protein surface comparison and application to low-resolution protein structure data.

    Science.gov (United States)

    Sael, Lee; Kihara, Daisuke

    2010-12-14

    Recent advancements of experimental techniques for determining protein tertiary structures raise significant challenges for protein bioinformatics. With the number of known structures of unknown function expanding at a rapid pace, an urgent task is to provide reliable clues to their biological function on a large scale. Conventional approaches for structure comparison are not suitable for a real-time database search due to their slow speed. Moreover, a new challenge has arisen from recent techniques such as electron microscopy (EM), which provide low-resolution structure data. Previously, we have introduced a method for protein surface shape representation using the 3D Zernike descriptors (3DZDs). The 3DZD enables fast structure database searches, taking advantage of its rotation invariance and compact representation. Search results for protein surfaces represented with the 3DZD have shown good agreement with the existing structure classifications, but some discrepancies were also observed. Three surface representations are examined: backbone atoms, the originally devised all-atom surface representation, and the combination of the all-atom surface with the backbone representation. All representations are encoded with the 3DZD. Also, we have investigated the applicability of the 3DZD for searching protein EM density maps of varying resolutions. The surface representations are evaluated on structure retrieval using two existing classifications, SCOP and the CE-based classification. Overall, the 3DZDs representing backbone atoms show better retrieval performance than the original all-atom surface representation. The performance further improved when the two representations are combined. Moreover, we observed that the 3DZD is also powerful in comparing low-resolution structures obtained by electron microscopy.

  9. Thermal green protein, an extremely stable, nonaggregating fluorescent protein created by structure-guided surface engineering.

    Science.gov (United States)

    Close, Devin W; Paul, Craig Don; Langan, Patricia S; Wilce, Matthew C J; Traore, Daouda A K; Halfmann, Randal; Rocha, Reginaldo C; Waldo, Geoffery S; Payne, Riley J; Rucker, Joseph B; Prescott, Mark; Bradbury, Andrew R M

    2015-07-01

    In this article, we describe the engineering and X-ray crystal structure of Thermal Green Protein (TGP), an extremely stable, highly soluble, non-aggregating green fluorescent protein. TGP is a soluble variant of the fluorescent protein eCGP123, which despite being highly stable, has proven to be aggregation-prone. The X-ray crystal structure of eCGP123, also determined within the context of this paper, was used to carry out rational surface engineering to improve its solubility, leading to TGP. The approach involved simultaneously eliminating crystal lattice contacts while increasing the overall negative charge of the protein. Despite intentional disruption of lattice contacts and introduction of high entropy glutamate side chains, TGP crystallized readily in a number of different conditions and the X-ray crystal structure of TGP was determined to 1.9 Å resolution. The structural reasons for the enhanced stability of TGP and eCGP123 are discussed. We demonstrate the utility of using TGP as a fusion partner in various assays and significantly, in amyloid assays in which the standard fluorescent protein, EGFP, is undesirable because of aberrant oligomerization. © 2014 Wiley Periodicals, Inc.

  10. Identification of similar regions of protein structures using integrated sequence and structure analysis tools

    Directory of Open Access Journals (Sweden)

    Heiland Randy

    2006-03-01

    Background: Understanding protein function from its structure is a challenging problem. Sequence-based approaches for finding homology have broad use for annotation of both structure and function. 3D structural information on protein domains and their interactions provides a view of structure-function relationships that is complementary to sequence information. We have developed a web site http://www.sblest.org/ and an API of web services that enable users to submit protein structures and identify statistically significant neighbors and the underlying structural environments that produce that match, using a suite of sequence and structure analysis tools. To do this, we have integrated S-BLEST, PSI-BLAST and HMMer based superfamily predictions to give a unique integrated view of the prediction of SCOP superfamilies, EC number, and GO term, as well as identification of the protein structural environments that are associated with that prediction. Additionally, we have extended UCSF Chimera and PyMOL to support our web services, so that users can characterize their own proteins of interest. Results: Users are able to submit their own queries or use a structure already in the PDB. Currently the databases that a user can query include the popular structural datasets ASTRAL 40 v1.69, ASTRAL 95 v1.69, CLUSTER50, CLUSTER70 and CLUSTER90 and PDBSELECT25. The results can be downloaded directly from the site and include function prediction, analysis of the most conserved environments and automated annotation of query proteins. These results reflect the hits found with PSI-BLAST and HMMer as well as with S-BLEST. We have evaluated how well annotation transfer can be performed on SCOP IDs, Gene Ontology (GO) IDs and EC numbers. The method is very efficient and totally automated, generally taking around fifteen minutes for a 400 residue protein. Conclusion: With structural genomics initiatives determining structures with little, if any, functional characterization

  11. Protein kinase CK2 in health and disease: Protein kinase CK2: from structures to insights

    DEFF Research Database (Denmark)

    Niefind, K; Raaf, J; Issinger, Olaf-Georg

    2009-01-01

    ...the critical region of CK2alpha recruitment is pre-formed in the unbound state. In CK2alpha the activation segment - a key element of protein kinase regulation - adapts invariably the typical conformation of the active enzymes. Recent structures of human CK2alpha revealed a surprising plasticity in the ATP... Within the last decade, 40 crystal structures corresponding to protein kinase CK2 (former name 'casein kinase 2'), to its catalytic subunit CK2alpha and to its regulatory subunit CK2beta were published. Together they provide a valuable, yet by far not complete basis to rationalize the biochemical...

  12. 3DProIN: Protein-Protein Interaction Networks and Structure Visualization.

    Science.gov (United States)

    Li, Hui; Liu, Chunmei

    2014-06-14

    3DProIN is a computational tool to visualize protein-protein interaction networks in both two-dimensional (2D) and three-dimensional (3D) views. It models protein-protein interactions in a graph and explores the biologically relevant features of the tertiary structures of each protein in the network. Properties such as color, shape and name of each node (protein) of the network can be edited in either 2D or 3D views. 3DProIN is implemented using 3D Java and C programming languages. An internet crawl technique is also used to parse dynamically retrieved protein interactions from the Protein Data Bank (PDB). It is a Java applet component that is embedded in the web page, and it can be used on different platforms including Linux, Mac and Windows using web browsers such as Firefox, Internet Explorer, Chrome and Safari. It was also converted into a Mac app and submitted to the App Store as a free app. Mac users can also download the app from our website. 3DProIN is available for academic research at http://bicompute.appspot.com.

  13. Compare local pocket and global protein structure models by small structure patterns

    KAUST Repository

    Cui, Xuefeng

    2015-09-09

    Researchers have proposed several criteria to assess the quality of predicted protein structures because this is one of the essential tasks in the Critical Assessment of Techniques for Protein Structure Prediction (CASP) competitions. Popular criteria include root mean squared deviation (RMSD), MaxSub score, TM-score, GDT-TS and GDT-HA scores. All these criteria require calculation of rigid transformations to superimpose the predicted protein structure on the native protein structure. Yet how to obtain the rigid transformations is either unknown or of high time complexity, and hence heuristic algorithms were proposed. In this work, we carefully design various small structure patterns, including ones specifically tuned for local pockets. Such structure patterns are biologically meaningful, and address the issue that existing methods rely on a sufficient number of backbone residue fragments. We sample the rigid transformations from these small structure patterns, and the optimal superpositions yielded by these small structures are refined and reported. As a result, among 11,669 pairs of predicted and native local protein pocket models from the CASP10 dataset, the GDT-TS scores calculated by our method are significantly higher than those calculated by LGA. Moreover, our program is computationally much more efficient. Source codes and executables are publicly available at http://www.cbrc.kaust.edu.sa/prosta/
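
    To make the GDT-TS criterion named above concrete, here is a minimal sketch of how the score is commonly defined: the average, over distance cutoffs of 1, 2, 4 and 8 Å, of the fraction of residues whose predicted Cα lies within the cutoff of its native position. It assumes the two structures are already superimposed by a single fixed transformation; it does not search for the best superposition per cutoff, which is the hard part addressed by the paper and by LGA. The function name and toy coordinates are hypothetical.

```python
import math

def gdt_ts(pred, native, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """GDT-TS on a 0-100 scale for two ALREADY SUPERIMPOSED structures.

    pred, native: equal-length lists of (x, y, z) C-alpha coordinates in Angstrom.
    Simplification: one fixed superposition is used for every cutoff.
    """
    if len(pred) != len(native) or not pred:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    # Per-residue deviation between predicted and native positions.
    dists = [math.dist(p, q) for p, q in zip(pred, native)]
    # Fraction of residues within each cutoff, then the mean over cutoffs.
    fractions = [sum(d <= c for d in dists) / len(dists) for c in cutoffs]
    return 100.0 * sum(fractions) / len(cutoffs)

# Toy example: four residues deviating by 0.5, 1.5, 3.0 and 9.0 Angstrom.
native = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
pred = [(0.5, 0.0, 0.0), (3.8, 1.5, 0.0), (7.6, 0.0, 3.0), (11.4, 9.0, 0.0)]
print(gdt_ts(pred, native))  # 56.25
```

    Because the score depends on the superposition, a better transformation can only raise it, which is consistent with the abstract's report of higher GDT-TS values than LGA on the same model pairs.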

  14. Analyzing the simplicial decomposition of spatial protein structures

    Directory of Open Access Journals (Sweden)

    Szabadka Zoltán

    2008-02-01

    Background: The fast-growing Protein Data Bank today contains the three-dimensional description of more than 45000 protein and nucleic-acid structures. The large majority of the data in the PDB were measured by X-ray crystallography by thousands of researchers in millions of work-hours. Unfortunately, many structural errors, bad labels, missing atoms, and falsely identified chains and groups make the automated processing of this treasury of structural biological data difficult. Results: After performing a rigorous re-structuring of the whole PDB on a graph-theoretical basis, we created the RS-PDB (Rich-Structure PDB) database. Using this cleaned and repaired database, we defined simplicial complexes on the heavy atoms of the PDB, and analyzed the tetrahedra for geometric properties. Conclusion: We have found surprisingly characteristic differences between simplices with atomic vertices of different types, and between the atomic neighborhoods – described also by simplices – of different ligand atoms in proteins.

  15. Optimal neural networks for protein-structure prediction

    International Nuclear Information System (INIS)

    Head-Gordon, T.; Stillinger, F.H.

    1993-01-01

    The successful application of neural-network algorithms for prediction of protein structure is stymied by three problem areas: the sparsity of the database of known protein structures, poorly devised network architectures which make the input-output mapping opaque, and a global optimization problem in the multiple-minima space of the network variables. We present a simplified polypeptide model residing in two dimensions with only two amino-acid types, A and B, which allows the determination of the global energy structure for all possible sequences of pentamer, hexamer, and heptamer lengths. This model simplicity allows us to compile a complete structural database and to devise neural networks that reproduce the tertiary structure of all sequences with absolute accuracy and with the smallest number of network variables. These optimal networks reveal that the three problem areas are convoluted, but that thoughtful network designs can actually deconvolute these detrimental traits to provide network algorithms that genuinely impact on the ability of the network to generalize or learn the desired mappings. Furthermore, the two-dimensional polypeptide model shows sufficient chemical complexity so that transfer of neural-network technology to more realistic three-dimensional proteins is evident

  16. Water polygons in high-resolution protein crystal structures.

    Science.gov (United States)

    Lee, Jonas; Kim, Sung-Hou

    2009-07-01

    We have analyzed the interstitial water (ISW) structures in 1500 protein crystal structures deposited in the Protein Data Bank that have resolution better than 1.5 Å and less than 90% sequence similarity with each other. We observed varieties of polygonal water structures composed of three to eight water molecules. These polygons may represent the time- and space-averaged structures of "stable" water oligomers present in liquid water, and their presence as well as relative population may be relevant in understanding physical properties of liquid water at a given temperature. On average, 13% of ISWs are localized enough to be visible by X-ray diffraction. Of those, an average of 78% are water molecules in the first water layer on the protein surface. Of the localized ISWs beyond the first layer, almost half form water polygons such as trigons and tetragons, as well as the expected pentagons, hexagons, higher polygons, partial dodecahedrons, and disordered networks. Most of the octagons and nonagons are formed by fusion of smaller polygons. The trigons are most commonly observed. We suggest that our observation provides an experimental basis for including these water polygon structures in correlating and predicting various water properties in the liquid state.

  17. De novo characterisation of the greenlip abalone transcriptome (Haliotis laevigata) with a focus on the heat shock protein 70 (HSP70) family.

    Science.gov (United States)

    Shiel, Brett P; Hall, Nathan E; Cooke, Ira R; Robinson, Nicholas A; Strugnell, Jan M

    2015-02-01

    Abalone (Haliotis) are economically important molluscs for fisheries and aquaculture industries worldwide. Despite this, genomic resources for abalone and molluscs are still limited. Here we present a description and functional annotation of the greenlip abalone (Haliotis laevigata) transcriptome. We present a focused analysis on the heat shock protein 70 (HSP70) family of genes with putative functions affecting temperature stress and immunity. A total of ~38 million paired end Illumina reads were obtained, resulting in a Trinity assembly of 222,172 contigs with minimum length of 200 base pairs and maximum length of 33 kilobases. The 20,702 contigs were annotated with gene descriptions by BLAST. We created a program to maximise the number of functionally annotated genes, and over 10,000 contigs were assigned Gene ontologies (GO terms). By using CateGOrizer, immunity related GO terms for stressors such as heat, hypoxia, oxidative stress and wounding received the highest counts. Twenty-six contigs with homology to the HSP70 family of genes were identified. Ninety-one putative single-nucleotide polymorphisms were observed in the abalone HSP70 contigs. Eleven of these were considered non-synonymous. The annotated transcriptome described in this study will be a useful basis for future work investigating the genetic response of abalone to stress.

  18. Structuring detergents for extracting and stabilizing functional membrane proteins.

    Directory of Open Access Journals (Sweden)

    Rima Matar-Merheb

    BACKGROUND: Membrane proteins are privileged pharmaceutical targets for which the development of structure-based drug design is challenging. One underlying reason is the fact that detergents do not stabilize membrane domains as efficiently as natural lipids in membranes, often leading to a partial to complete loss of activity/stability during protein extraction and purification and preventing crystallization in an active conformation. METHODOLOGY/PRINCIPAL FINDINGS: Anionic calix[4]arene based detergents (C4Cn, n=1-12) were designed to structure the membrane domains through hydrophobic interactions and a network of salt bridges with the basic residues found at the cytosol-membrane interface of membrane proteins. These compounds behave as surfactants, forming micelles of 5-24 nm, with the critical micellar concentration (CMC) being, as expected, sensitive to pH and ranging from 0.05 to 1.5 mM. The interaction of these molecules with the basic amino acids was confirmed by both 1H NMR titration and surface tension titration experiments. They extract membrane proteins from different origins, behaving as mild detergents and leading to partial extraction in some cases. They also retain protein functionality, as shown for BmrA (Bacillus multidrug resistance ATP protein), a membrane multidrug-transporting ATPase, which is particularly sensitive to detergent extraction. These new detergents allow BmrA to bind daunorubicin with a Kd of 12 µM, a value similar to that observed after purification using dodecyl maltoside (DDM). They preserve the ATPase activity of BmrA (which resets the protein to its initial state after drug efflux) much more efficiently than SDS (sodium dodecyl sulphate), FC12 (Foscholine 12) or DDM. They also maintain the C4Cn-extracted protein in a functional state upon detergent exchange with FC12. Finally, they promote 3D-crystallization of the membrane protein. CONCLUSION/SIGNIFICANCE: These compounds seem promising to extract in a functional state

  19. Structural interface parameters are discriminatory in recognising near-native poses of protein-protein interactions.

    Directory of Open Access Journals (Sweden)

    Sony Malhotra

    Interactions at the molecular level in the cellular environment play a crucial role in maintaining the physiological functioning of the cell. These molecular interactions exist at varied levels, viz. protein-protein interactions, protein-nucleic acid interactions and protein-small molecule interactions. These interactions and their mechanisms are currently areas of intensive study. Molecular interactions can also be studied computationally using the approach known as molecular docking. Molecular docking employs search algorithms to predict the possible conformations for interacting partners and then calculates interaction energies. However, docking proposes a number of solutions as different docked poses, and identifying the native (or near-native) structures from the pool of these docked poses is a serious challenge. Here, we propose a rigorous scoring scheme called DockScore which can be used to rank the docked poses and identify the best docked pose out of the many proposed by the docking algorithm employed. The scoring identifies the optimal interactions between the two protein partners utilising various features of the putative interface like area, short contacts, conservation, spatial clustering and the presence of positively charged and hydrophobic residues. DockScore was first trained on a set of 30 protein-protein complexes to determine the weights for different parameters. Subsequently, we tested the scoring scheme on 30 different protein-protein complexes, and native or near-native structures were assigned the top rank from a pool of docked poses in 26 of the tested cases. We tested the ability of DockScore to discriminate likely dimer interactions that differ substantially within a homologous family and also demonstrate that DockScore can distinguish the correct pose for all 10 recent CAPRI targets.

  20. Modulating nanoparticle superlattice structure using proteins with tunable bond distributions

    International Nuclear Information System (INIS)

    McMillan, Janet R.; Brodin, Jeffrey D.; Millan, Jaime A.; Lee, Byeongdu; Olvera de la Cruz, Monica; Mirkin, Chad A.

    2017-01-01

    Here, we investigate the use of proteins with tunable DNA modification distributions to modulate nanoparticle superlattice structure. Using Beta-galactosidase (βgal) as a model system, we have employed the orthogonal chemical reactivities of surface amines and thiols to synthesize protein-DNA conjugates with 36 evenly distributed or 8 specifically positioned oligonucleotides. When these conjugates are assembled into crystalline superlattices with AuNPs, we find that the distribution of DNA modifications modulates the favored structure: βgal with uniformly distributed DNA bonding elements results in body-centered cubic crystals, whereas DNA functionalization of cysteines results in AB2 packing. We probe the role of protein oligonucleotide number and conjugate size in this observation, which reveals the importance of oligonucleotide distribution and number in the observed assembly behavior. These results indicate that proteins with defined DNA-modification patterns are powerful tools to control nanoparticle superlattice architecture, and establish the importance of oligonucleotide distribution in the assembly behavior of protein-DNA conjugates.

  1. A resource for benchmarking the usefulness of protein structure models

    Directory of Open Access Journals (Sweden)

    Carbajo Daniel

    2012-08-01

    Background: Increasingly, biologists and biochemists use computational tools to design experiments to probe the function of proteins and/or to engineer them for a variety of different purposes. The most effective strategies rely on knowledge of the three-dimensional structure of the protein of interest. However, it is often the case that an experimental structure is not available and that models of different quality are used instead. On the other hand, the relationship between the quality of a model and its appropriate use is not easy to derive in general, and so far it has been analyzed in detail only for specific applications. Results: This paper describes a database and related software tools that allow testing of a given structure-based method on models of a protein representing different levels of accuracy. The comparison of the results of a computational experiment on the experimental structure and on a set of its decoy models will allow developers and users to assess the specific threshold of accuracy required to perform the task effectively. Conclusions: The ModelDB server automatically builds decoy models of different accuracy for a given protein of known structure and provides a set of useful tools for their analysis. Pre-computed data for a non-redundant set of deposited protein structures are available for analysis and download in the ModelDB database. Implementation, availability and requirements: Project name: A resource for benchmarking the usefulness of protein structure models. Project home page: http://bl210.caspur.it/MODEL-DB/MODEL-DB_web/MODindex.php. Operating system(s): Platform independent. Programming language: Perl-BioPerl (program); mySQL, Perl DBI and DBD modules (database); php, JavaScript, Jmol scripting (web server). Other requirements: Java Runtime Environment v1.4 or later, Perl, BioPerl, CPAN modules, HHsearch, Modeller, LGA, NCBI Blast package, DSSP, Speedfill (Surfnet) and PSAIA. License: Free.
Any

  2. Structure and assembly of scalable porous protein cages

    Science.gov (United States)

    Sasaki, Eita; Böhringer, Daniel; van de Waterbeemd, Michiel; Leibundgut, Marc; Zschoche, Reinhard; Heck, Albert J. R.; Ban, Nenad; Hilvert, Donald

    2017-03-01

    Proteins that self-assemble into regular shell-like polyhedra are useful, both in nature and in the laboratory, as molecular containers. Here we describe cryo-electron microscopy (EM) structures of two versatile encapsulation systems that exploit engineered electrostatic interactions for cargo loading. We show that increasing the number of negative charges on the lumenal surface of lumazine synthase, a protein that naturally assembles into a ~1-MDa dodecahedron composed of 12 pentamers, induces stepwise expansion of the native protein shell, giving rise to thermostable ~3-MDa and ~6-MDa assemblies containing 180 and 360 subunits, respectively. Remarkably, these expanded particles assume unprecedented tetrahedrally and icosahedrally symmetric structures constructed entirely from pentameric units. Large keyhole-shaped pores in the shell, not present in the wild-type capsid, enable diffusion-limited encapsulation of complementarily charged guests. The structures of these supercharged assemblies demonstrate how programmed electrostatic effects can be effectively harnessed to tailor the architecture and properties of protein cages.

  3. Cluster protein structures using recurrence quantification analysis on coordinates of alpha-carbon atoms of proteins

    International Nuclear Information System (INIS)

    Zhou Yu; Yu Zuguo; Anh, Vo

    2007-01-01

    The 3-dimensional coordinates of alpha-carbon atoms of proteins are used to distinguish the protein structural classes based on recurrence quantification analysis (RQA). We consider two independent variables from RQA of coordinates of alpha-carbon atoms, %determ1 and %determ2, which were defined by Webber et al. [C.L. Webber Jr., A. Giuliani, J.P. Zbilut, A. Colosimo, Proteins Struct. Funct. Genet. 44 (2001) 292]. The variable %determ2 is used to define two new variables, %determ2_1 and %determ2_2. Then the three variables %determ1, %determ2_1 and %determ2_2 are used to construct a 3-dimensional variable space. Each protein is represented by a point in this variable space. The points corresponding to proteins from the α, β, α+β and α/β structural classes fall into different areas of this variable space. In order to give a quantitative assessment of our clustering on the selected proteins, Fisher's discriminant algorithm is used. Numerical results indicate that the discriminant accuracies are very high and satisfactory.

  4. Prediction of protein-protein interaction sites in sequences and 3D structures by random forests.

    Directory of Open Access Journals (Sweden)

    Mile Sikić

    2009-01-01

    Identifying interaction sites in proteins provides important clues to the function of a protein and is becoming increasingly relevant in topics such as systems biology and drug discovery. Although there are numerous papers on the prediction of interaction sites using information derived from structure, there are only a few case reports on the prediction of interaction residues based solely on protein sequence. Here, a sliding window approach is combined with the Random Forests method to predict protein interaction sites using (i) a combination of sequence- and structure-derived parameters and (ii) sequence information alone. For sequence-based prediction we achieved a precision of 84% with a 26% recall and an F-measure of 40%. When combined with structural information, the prediction performance changes to a precision of 76% and a recall of 38% with an F-measure of 51%. We also present an attempt to rationalize the sliding window size and demonstrate that a nine-residue window is the most suitable for predictor construction. Finally, we demonstrate the applicability of our prediction methods by modeling the Ras-Raf complex using predicted interaction sites as target binding interfaces. Our results suggest that it is possible to predict protein interaction sites with quite a high accuracy using only sequence information.
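
    The reported F-measures can be checked against the reported precision and recall, assuming the F-measure meant here is the usual balanced F1 score (the harmonic mean of precision and recall); the abstract does not state the formula, so that is an assumption. A short sketch:

```python
def f_measure(precision, recall):
    """Balanced F-measure (F1): harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Values quoted in the abstract, expressed as fractions.
sequence_only = f_measure(0.84, 0.26)
sequence_plus_structure = f_measure(0.76, 0.38)

print(round(sequence_only, 2))            # 0.4
print(round(sequence_plus_structure, 2))  # 0.51
```

    Both values agree with the 40% and 51% quoted above. They also show that adding structural information trades some precision for recall while raising the overall F-measure.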

  5. Defining an essence of structure determining residue contacts in proteins.

    Science.gov (United States)

    Sathyapriya, R; Duarte, Jose M; Stehr, Henning; Filippis, Ioannis; Lappe, Michael

    2009-12-01

    The network of native non-covalent residue contacts determines the three-dimensional structure of a protein. However, not all contacts are of equal structural significance, and little knowledge exists about a minimal, yet sufficient, subset required to define the global features of a protein. Characterisation of this "structural essence" has remained elusive so far: no algorithmic strategy has been devised to date that could outperform a random selection in terms of 3D reconstruction accuracy (measured as the Cα RMSD). Identifying the number and nature of essential native contacts is not only of theoretical interest (i.e., for design of advanced statistical potentials); such a subset of spatial constraints is also very useful in a number of novel experimental methods (like EPR) which rely heavily on constraint-based protein modelling. To derive accurate three-dimensional models from distance constraints, we implemented a reconstruction pipeline using distance geometry. We selected a test set of 12 protein structures from the four major SCOP fold classes and performed our reconstruction analysis. As a reference set, a series of random subsets (ranging from 10% to 90% of native contacts) is generated for each protein, and the reconstruction accuracy is computed for each subset. We have developed a rational strategy, termed "cone-peeling", that combines sequence features and network descriptors to select minimal subsets that outperform the reference sets. We present, for the first time, a rational strategy to derive a structural essence of residue contacts and provide an estimate of the size of this minimal subset. Our algorithm computes sparse subsets capable of determining the tertiary structure at approximately 4.8 Å Cα RMSD with as little as 8% of the native contacts (Cα-Cα and Cβ-Cβ). At the same time, a randomly chosen subset of native contacts needs about twice as many contacts to reach the same level of accuracy.
This "structural essence" opens new avenues in the
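
    The random reference sets described above can be illustrated with a short sketch. It covers only the baseline (native contacts from Cα coordinates, then a random fraction of them), not the cone-peeling selection or the distance-geometry reconstruction. The 8 Å cutoff, the minimum sequence separation of 3 and the function names are assumptions made for illustration; the abstract does not give the contact definition.

```python
import math
import random

def native_contacts(ca_coords, cutoff=8.0, min_seq_sep=3):
    """Residue pairs (i, j) whose C-alpha atoms lie within `cutoff` Angstrom.

    cutoff and min_seq_sep are illustrative assumptions, not values from the paper.
    """
    n = len(ca_coords)
    return [
        (i, j)
        for i in range(n)
        for j in range(i + min_seq_sep, n)
        if math.dist(ca_coords[i], ca_coords[j]) <= cutoff
    ]

def random_subset(contacts, fraction, seed=0):
    """Random reference subset containing roughly `fraction` of the contacts."""
    k = max(1, int(fraction * len(contacts)))
    return sorted(random.Random(seed).sample(contacts, k))

# Toy hairpin: two antiparallel strands of four residues each, 5 Angstrom apart.
hairpin = [(3.8 * i, 0.0, 0.0) for i in range(4)] + \
          [(3.8 * i, 5.0, 0.0) for i in reversed(range(4))]

contacts = native_contacts(hairpin)
print(len(contacts))                 # 7
print(random_subset(contacts, 0.5))  # 3 of the 7 contacts, chosen at random
```

    In the study, each such subset would then be passed to the reconstruction pipeline and scored by Cα RMSD against the native structure.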

  6. Integrating population variation and protein structural analysis to improve clinical interpretation of missense variation: application to the WD40 domain.

    Science.gov (United States)

    Laskowski, Roman A; Tyagi, Nidhi; Johnson, Diana; Joss, Shelagh; Kinning, Esther; McWilliam, Catherine; Splitt, Miranda; Thornton, Janet M; Firth, Helen V; Wright, Caroline F

    2016-03-01

    We present a generic, multidisciplinary approach for improving our understanding of novel missense variants in recently discovered disease genes exhibiting genetic heterogeneity, by combining clinical and population genetics with protein structural analysis. Using six new de novo missense diagnoses in TBL1XR1 from the Deciphering Developmental Disorders study, together with population variation data, we show that the β-propeller structure of the ubiquitous WD40 domain provides a convincing way to discriminate between pathogenic and benign variation. Children with likely pathogenic mutations in this gene have severely delayed language development, often accompanied by intellectual disability, autism, dysmorphology and gastrointestinal problems. Amino acids affected by likely pathogenic missense mutations are either crucial for the stability of the fold, forming part of a highly conserved symmetrically repeating hydrogen-bonded tetrad, or located at the top face of the β-propeller, where 'hotspot' residues affect the binding of β-catenin to the TBLR1 protein. In contrast, those altered by population variation are significantly less likely to be spatially clustered towards the top face or to be at buried or highly conserved residues. This result is useful not only for interpreting benign and pathogenic missense variants in this gene, but also in other WD40 domains, many of which are associated with disease. © The Author 2016. Published by Oxford University Press.

  7. Predicting protein structures with a multiplayer online game.

    Science.gov (United States)

    Cooper, Seth; Khatib, Firas; Treuille, Adrien; Barbero, Janos; Lee, Jeehyung; Beenen, Michael; Leaver-Fay, Andrew; Baker, David; Popović, Zoran; Players, Foldit

    2010-08-05

    People exert large amounts of problem-solving effort playing computer games. Simple image- and text-recognition tasks have been successfully 'crowd-sourced' through games, but it is not clear if more complex scientific problems can be solved with human-directed computing. Protein structure prediction is one such problem: locating the biologically relevant native conformation of a protein is a formidable computational challenge given the very large size of the search space. Here we describe Foldit, a multiplayer online game that engages non-scientists in solving hard prediction problems. Foldit players interact with protein structures using direct manipulation tools and user-friendly versions of algorithms from the Rosetta structure prediction methodology, while they compete and collaborate to optimize the computed energy. We show that top-ranked Foldit players excel at solving challenging structure refinement problems in which substantial backbone rearrangements are necessary to achieve the burial of hydrophobic residues. Players working collaboratively develop a rich assortment of new strategies and algorithms; unlike computational approaches, they explore not only the conformational space but also the space of possible search strategies. The integration of human visual problem-solving and strategy development capabilities with traditional computational algorithms through interactive multiplayer games is a powerful new approach to solving computationally-limited scientific problems.

  8. Identification of structural protein-protein interactions of herpes simplex virus type 1.

    Science.gov (United States)

    Lee, Jin H; Vittone, Valerio; Diefenbach, Eve; Cunningham, Anthony L; Diefenbach, Russell J

    2008-09-01

    In this study we have defined protein-protein interactions between the structural proteins of herpes simplex virus type 1 (HSV-1) using a LexA yeast two-hybrid system. The majority of the capsid, tegument and envelope proteins of HSV-1 were screened in a matrix approach. A total of 40 binary interactions were detected including 9 out of 10 previously identified tegument-tegument interactions (Vittone, V., Diefenbach, E., Triffett, D., Douglas, M.W., Cunningham, A.L., and Diefenbach, R.J., 2005. Determination of interactions between tegument proteins of herpes simplex virus type 1. J. Virol. 79, 9566-9571). A total of 12 interactions involving the capsid protein pUL35 (VP26) and 11 interactions involving the tegument protein pUL46 (VP11/12) were identified. The most significant novel interactions detected in this study, which are likely to play a role in viral assembly, include pUL35-pUL37 (capsid-tegument), pUL46-pUL37 (tegument-tegument) and pUL49 (VP22)-pUS9 (tegument-envelope). This information will provide further insights into the pathways of HSV-1 assembly and the identified interactions are potential targets for new antiviral drugs.

  9. The E4 protein; structure, function and patterns of expression

    Energy Technology Data Exchange (ETDEWEB)

    Doorbar, John, E-mail: jdoorba@nimr.mrc.ac.uk

    2013-10-15

    The papillomavirus E4 open reading frame (ORF) is contained within the E2 ORF, with the primary E4 gene-product (E1^E4) being translated from a spliced mRNA that includes the E1 initiation codon and adjacent sequences. E4 is located centrally within the E2 gene, in a region that encodes the E2 protein's flexible hinge domain. Although a number of minor E4 transcripts have been reported, it is the product of the abundant E1^E4 mRNA that has been most extensively analysed. During the papillomavirus life cycle, the E1^E4 gene products generally become detectable at the onset of vegetative viral genome amplification as the late stages of infection begin. E4 contributes to genome amplification success and virus synthesis, with its high level of expression suggesting additional roles in virus release and/or transmission. In general, E4 is easily visualised in biopsy material by immunostaining, and can be detected in lesions caused by diverse papillomavirus types, including those of dogs, rabbits and cattle as well as humans. The E4 protein can serve as a biomarker of active virus infection, and in the case of high-risk human types also disease severity. In some cutaneous lesions, E4 can be expressed at higher levels than the virion coat proteins, and can account for as much as 30% of total lesional protein content. The E4 proteins of the Beta, Gamma and Mu HPV types assemble into distinctive cytoplasmic, and sometimes nuclear, inclusion granules. In general, the E4 proteins are expressed before L2 and L1, with their structure and function being modified, first by kinases as the infected cell progresses through the S and G2 cell cycle phases, but also by proteases as the cell exits the cell cycle and undergoes true terminal differentiation. The kinases that regulate E4 also affect other viral proteins simultaneously, and include protein kinase A, Cyclin-dependent kinase, members of the MAP Kinase family and protein kinase C. For HPV16 E1{sup

  10. Ranking beta sheet topologies with applications to protein structure prediction

    DEFF Research Database (Denmark)

    Fonseca, Rasmus; Helles, Glennie; Winter, Pawel

    2011-01-01

    One reason why ab initio protein structure predictors do not perform very well is their inability to reliably identify long-range interactions between amino acids. To achieve reliable long-range interactions, all potential pairings of β-strands (β-topologies) of a given protein are enumerated......, including the native β-topology. Two very different β-topology scoring methods from the literature are then used to rank all potential β-topologies. This has not previously been attempted for any scoring method. The main result of this paper is a justification that one of the scoring methods, in particular......, consistently top-ranks native β-topologies. Since the number of potential β-topologies grows exponentially with the number of β-strands, it is unrealistic to expect that all potential β-topologies can be enumerated for large proteins. The second result of this paper is an enumeration scheme of a subset of β-topologies...
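
    The growth in the number of candidate β-topologies noted in this abstract can be illustrated with a small counting sketch. It is not the enumeration scheme of the paper: the model below is a deliberately simplified assumption in which a topology is any set of strand pairs where each strand has at most two partners and each pair is either parallel or antiparallel. The function name count_topologies is hypothetical.

```python
from itertools import combinations


def count_topologies(n_strands: int) -> int:
    """Count strand-pairing sets under a simplified, assumed model.

    Assumptions (not taken from the paper): every strand may pair with at
    most two partners, and every pair is either parallel or antiparallel.
    The empty pairing is included in the count.
    """
    pairs = list(combinations(range(n_strands), 2))
    degree = [0] * n_strands

    def extend(idx: int) -> int:
        if idx == len(pairs):
            return 1
        # Option 1: leave this pair of strands unpaired.
        total = extend(idx + 1)
        i, j = pairs[idx]
        # Option 2: pair them, if both strands still have a free edge.
        if degree[i] < 2 and degree[j] < 2:
            degree[i] += 1
            degree[j] += 1
            total += 2 * extend(idx + 1)  # two possible orientations
            degree[i] -= 1
            degree[j] -= 1
        return total

    return extend(0)


if __name__ == "__main__":
    for n in range(2, 7):
        print(n, count_topologies(n))
```

    Even under these restrictive assumptions the count rises steeply with each added strand, which is why exhaustive enumeration is only practical for small proteins.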

  11. Innate Immune Evasion Mediated by Flaviviridae Non-Structural Proteins.

    Science.gov (United States)

    Chen, Shun; Wu, Zhen; Wang, Mingshu; Cheng, Anchun

    2017-10-07

    Flaviviridae-caused diseases are a critical, emerging public health problem worldwide. Flaviviridae infections usually cause severe, acute or chronic diseases, such as liver damage and liver cancer resulting from a hepatitis C virus (HCV) infection and high fever and shock caused by yellow fever. Many researchers worldwide are investigating the mechanisms by which Flaviviridae cause severe diseases. Flaviviridae can interfere with the host's innate immunity to achieve their purpose of proliferation. For instance, dengue virus (DENV) NS2A, NS2B3, NS4A, NS4B and NS5; HCV NS2, NS3, NS3/4A, NS4B and NS5A; and West Nile virus (WNV) NS1 and NS4B proteins are involved in immune evasion. This review discusses the interplay between viral non-structural Flaviviridae proteins and relevant host proteins, which leads to the suppression of the host's innate antiviral immunity.

  12. Structural characterization of Mumps virus fusion protein core

    International Nuclear Information System (INIS)

    Liu Yueyong; Xu Yanhui; Lou Zhiyong; Zhu Jieqing; Hu Xuebo; Gao, George F.; Qiu Bingsheng; Rao Zihe; Tien, Po

    2006-01-01

    The fusion proteins of enveloped viruses mediating the fusion between the viral and cellular membranes comprise two discontinuous heptad repeat (HR) domains located at the ectodomain of the enveloped glycoproteins. The crystal structure of the fusion protein core of Mumps virus (MuV) was determined at 2.2 Å resolution. The complex is a six-helix bundle in which three HR1 peptides form a central highly hydrophobic coiled-coil and three HR2 peptides pack against the hydrophobic grooves on the surface of the central coiled-coil in an oblique antiparallel manner. The fusion core of MuV, like those of simian virus 5 and human respiratory syncytial virus, forms typical 3-4-4-4-3 spacing. The similar characterization in HR1 regions, as well as the existence of an O-X-O motif in extended regions of the HR2 helix, suggests a basic rule for the formation of the fusion core of viral fusion proteins.

  13. Malfolded protein structure and proteostasis in lung diseases.

    Science.gov (United States)

    Balch, William E; Sznajder, Jacob I; Budinger, Scott; Finley, Daniel; Laposky, Aaron D; Cuervo, Ana Maria; Benjamin, Ivor J; Barreiro, Esther; Morimoto, Richard I; Postow, Lisa; Weissman, Allan M; Gail, Dorothy; Banks-Schlegel, Susan; Croxton, Thomas; Gan, Weiniu

    2014-01-01

    Recent discoveries indicate that disorders of protein folding and degradation play a particularly important role in the development of lung diseases and their associated complications. The overarching purpose of the National Heart, Lung, and Blood Institute workshop on "Malformed Protein Structure and Proteostasis in Lung Diseases" was to identify mechanistic and clinical research opportunities indicated by these recent discoveries in proteostasis science that will advance our molecular understanding of lung pathobiology and facilitate the development of new diagnostic and therapeutic strategies for the prevention and treatment of lung disease. The workshop's discussion focused on identifying gaps in scientific knowledge with respect to proteostasis and lung disease, discussing new research advances and opportunities in protein folding science, and highlighting novel technologies with potential therapeutic applications for diagnosis and treatment.

  14. Structural Isosteres of Phosphate Groups in the Protein Data Bank.

    Science.gov (United States)

    Zhang, Yuezhou; Borrel, Alexandre; Ghemtio, Leo; Regad, Leslie; Boije Af Gennäs, Gustav; Camproux, Anne-Claude; Yli-Kauhaluoma, Jari; Xhaard, Henri

    2017-03-27

    We developed a computational workflow to mine the Protein Data Bank for isosteric replacements that exist in different binding site environments but have not necessarily been identified and exploited in compound design. Taking phosphate groups as examples, the workflow was used to construct 157 data sets, each composed of a reference protein complexed with AMP, ADP, ATP, or pyrophosphate as well as other ligands. Phosphate binding sites appear to have a high hydration content and large size, resulting in U-shaped bioactive conformations recurrently found across unrelated protein families. A total of 16 413 replacements were extracted, filtered for a significant structural overlap on phosphate groups, and sorted according to their SMILES codes. In addition to the classical isosteres of phosphate, such as carboxylate, sulfone, or sulfonamide, unexpected replacements that do not conserve charge or polarity, such as aryl, aliphatic, or positively charged groups, were found.

  15. Dengue Virus Non-structural Protein 1 Modulates Infectious Particle Production via Interaction with the Structural Proteins.

    Directory of Open Access Journals (Sweden)

    Pietro Scaturro

    Non-structural protein 1 (NS1) is one of the most enigmatic proteins of the Dengue virus (DENV), playing distinct functions in immune evasion, pathogenesis and viral replication. The recently reported crystal structure of DENV NS1 revealed its peculiar three-dimensional fold; however, detailed information on NS1 function at different steps of the viral replication cycle is still missing. By using the recently reported crystal structure, as well as amino acid sequence conservation, as a guide for a comprehensive site-directed mutagenesis study, we discovered that in addition to being essential for RNA replication, DENV NS1 is also critically required for the production of infectious virus particles. Taking advantage of a trans-complementation approach based on fully functional epitope-tagged NS1 variants, we identified previously unreported interactions between NS1 and the structural proteins Envelope (E) and precursor Membrane (prM). Interestingly, coimmunoprecipitation revealed an additional association with capsid, arguing that NS1 interacts via the structural glycoproteins with DENV particles. Results obtained with mutations residing either in the NS1 Wing domain or in the β-ladder domain suggest that NS1 might have two distinct functions in the assembly of DENV particles. By using a trans-complementation approach with a C-terminally KDEL-tagged ER-resident NS1, we demonstrate that the secretion of NS1 is dispensable for both RNA replication and infectious particle production. In conclusion, our results provide an extensive genetic map of NS1 determinants essential for viral RNA replication and identify a novel role of NS1 in virion production that is mediated via interaction with the structural proteins. These studies extend the list of NS1 functions and argue for a central role in coordinating replication and assembly/release of infectious DENV particles.

  16. Membrane protein structure determination by SAD, SIR, or SIRAS phasing in serial femtosecond crystallography using an iododetergent

    Science.gov (United States)

    Nakane, Takanori; Hanashima, Shinya; Suzuki, Mamoru; Saiki, Haruka; Hayashi, Taichi; Kakinouchi, Keisuke; Sugiyama, Shigeru; Kawatake, Satoshi; Matsuoka, Shigeru; Matsumori, Nobuaki; Nango, Eriko; Kobayashi, Jun; Shimamura, Tatsuro; Kimura, Kanako; Mori, Chihiro; Kunishima, Naoki; Sugahara, Michihiro; Takakyu, Yoko; Inoue, Shigeyuki; Masuda, Tetsuya; Hosaka, Toshiaki; Tono, Kensuke; Joti, Yasumasa; Kameshima, Takashi; Hatsui, Takaki; Inoue, Tsuyoshi; Nureki, Osamu; Iwata, So; Murata, Michio; Mizohata, Eiichi

    2016-01-01

    The 3D structure determination of biological macromolecules by X-ray crystallography suffers from a phase problem: to perform Fourier transformation to calculate real space density maps, both intensities and phases of structure factors are necessary; however, measured diffraction patterns give only intensities. Although serial femtosecond crystallography (SFX) using X-ray free electron lasers (XFELs) has been steadily developed since 2009, experimental phasing still remains challenging. Here, using 7.0-keV (1.771 Å) X-ray pulses from the SPring-8 Angstrom Compact Free Electron Laser (SACLA), iodine single-wavelength anomalous diffraction (SAD), single isomorphous replacement (SIR), and single isomorphous replacement with anomalous scattering (SIRAS) phasing were performed in an SFX regime for a model membrane protein bacteriorhodopsin (bR). The crystals grown in bicelles were derivatized with an iodine-labeled detergent heavy-atom additive 13a (HAD13a), which contains the magic triangle, I3C head group with three iodine atoms. The alkyl tail was essential for binding of the detergent to the surface of bR. Strong anomalous and isomorphous difference signals from HAD13a enabled successful phasing using reflections up to 2.1-Å resolution from only 3,000 and 4,000 indexed images from native and derivative crystals, respectively. When more images were merged, structure solution was possible with data truncated at 3.3-Å resolution, which is the lowest resolution among the reported cases of SFX phasing. Moreover, preliminary SFX experiment showed that HAD13a successfully derivatized the G protein-coupled A2a adenosine receptor crystallized in lipidic cubic phases. These results pave the way for de novo structure determination of membrane proteins, which often diffract poorly, even with the brightest XFEL beams. PMID:27799539
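
    The phase problem described at the start of this abstract can be shown with a one-dimensional toy model. The sketch below is an illustration only, not part of the SFX workflow: it uses a made-up eight-point "density" and a plain discrete Fourier transform from the Python standard library. Two different densities give identical amplitudes, and inverting the amplitudes without phases does not return the original density.

```python
import cmath


def dft(rho):
    """Discrete Fourier transform: density samples -> structure factors."""
    n = len(rho)
    return [
        sum(rho[x] * cmath.exp(-2j * cmath.pi * h * x / n) for x in range(n))
        for h in range(n)
    ]


def inverse_dft(factors):
    """Inverse transform: structure factors -> real-space density."""
    n = len(factors)
    return [
        sum(factors[h] * cmath.exp(2j * cmath.pi * h * x / n) for h in range(n)).real / n
        for x in range(n)
    ]


# Hypothetical 1D "electron density" and a circularly shifted copy of it.
density = [0.0, 3.0, 1.0, 0.0, 0.0, 2.0, 0.0, 0.0]
shifted = density[3:] + density[:3]

F = dft(density)
G = dft(shifted)

# A detector records only intensities |F|^2, and these agree for both densities.
amplitudes_match = all(abs(abs(a) - abs(b)) < 1e-9 for a, b in zip(F, G))

# With amplitudes and phases, the density is recovered.
recovered = inverse_dft(F)

# With amplitudes only (all phases set to zero), it is not.
amplitude_only = inverse_dft([abs(a) for a in F])

if __name__ == "__main__":
    print("same amplitudes:", amplitudes_match)
    print("recovered:      ", [round(v, 3) for v in recovered])
    print("amplitudes only:", [round(v, 3) for v in amplitude_only])
```

    Experimental phasing methods such as SAD, SIR and SIRAS exist to supply the phase information that the amplitudes alone cannot.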

  17. Structure of the ordered hydration of amino acids in proteins: analysis of crystal structures

    Energy Technology Data Exchange (ETDEWEB)

    Biedermannová, Lada, E-mail: lada.biedermannova@ibt.cas.cz; Schneider, Bohdan [Institute of Biotechnology CAS, Videnska 1083, 142 20 Prague (Czech Republic)

    2015-10-27

    The hydration of protein crystal structures was studied at the level of individual amino acids. The dependence of the number of water molecules and their preferred spatial localization on various parameters, such as solvent accessibility, secondary structure and side-chain conformation, was determined. Crystallography provides unique information about the arrangement of water molecules near protein surfaces. Using a nonredundant set of 2818 protein crystal structures with a resolution of better than 1.8 Å, the extent and structure of the hydration shell of all 20 standard amino-acid residues were analyzed as a function of the residue conformation, secondary structure and solvent accessibility. The results show how hydration depends on the amino-acid conformation and the environment in which it occurs. After conformational clustering of individual residues, the density distribution of water molecules was compiled and the preferred hydration sites were determined as maxima in the pseudo-electron-density representation of water distributions. Many hydration sites interact with both main-chain and side-chain amino-acid atoms, and several occurrences of hydration sites with less canonical contacts, such as carbon–donor hydrogen bonds, OH–π interactions and off-plane interactions with aromatic heteroatoms, are also reported. Information about the location and relative importance of the empirically determined preferred hydration sites in proteins has applications in improving the current methods of hydration-site prediction in molecular replacement, ab initio protein structure prediction and the set-up of molecular-dynamics simulations.

  18. Protein Data Bank (PDB): The Single Global Macromolecular Structure Archive.

    Science.gov (United States)

    Burley, Stephen K; Berman, Helen M; Kleywegt, Gerard J; Markley, John L; Nakamura, Haruki; Velankar, Sameer

    2017-01-01

    The Protein Data Bank (PDB)--the single global repository of experimentally determined 3D structures of biological macromolecules and their complexes--was established in 1971, becoming the first open-access digital resource in the biological sciences. The PDB archive currently houses ~130,000 entries (May 2017). It is managed by the Worldwide Protein Data Bank organization (wwPDB; wwpdb.org), which includes the RCSB Protein Data Bank (RCSB PDB; rcsb.org), the Protein Data Bank Japan (PDBj; pdbj.org), the Protein Data Bank in Europe (PDBe; pdbe.org), and BioMagResBank (BMRB; www.bmrb.wisc.edu). The four wwPDB partners operate a unified global software system that enforces community-agreed data standards and supports data Deposition, Biocuration, and Validation of ~11,000 new PDB entries annually (deposit.wwpdb.org). The RCSB PDB currently acts as the archive keeper, ensuring disaster recovery of PDB data and coordinating weekly updates. wwPDB partners disseminate the same archival data from multiple FTP sites, while operating complementary websites that provide their own views of PDB data with selected value-added information and links to related data resources. At present, the PDB archives experimental data, associated metadata, and 3D-atomic level structural models derived from three well-established methods: crystallography, nuclear magnetic resonance spectroscopy (NMR), and electron microscopy (3DEM). wwPDB partners are working closely with experts in related experimental areas (small-angle scattering, chemical cross-linking/mass spectrometry, Förster resonance energy transfer or FRET, etc.) to establish a federation of data resources that will support sustainable archiving and validation of 3D structural models and experimental data derived from integrative or hybrid methods.

  19. NMR Structure of the Myristylated Feline Immunodeficiency Virus Matrix Protein

    Directory of Open Access Journals (Sweden)

    Lola A. Brown

    2015-04-01

    Membrane targeting by the Gag proteins of the human immunodeficiency viruses (HIV types-1 and -2) is mediated by Gag’s N-terminally myristylated matrix (MA) domain and is dependent on cellular phosphatidylinositol-4,5-bisphosphate [PI(4,5)P2]. To determine if other lentiviruses employ a similar membrane targeting mechanism, we initiated studies of the feline immunodeficiency virus (FIV), a widespread feline pathogen with potential utility for development of human therapeutics. Bacterial co-translational myristylation was facilitated by mutation of two amino acids near the amino-terminus of the protein (Q5A/G6S; myrMAQ5A/G6S). These substitutions did not affect virus assembly or release from transfected cells. NMR studies revealed that the myristyl group is buried within a hydrophobic pocket in a manner that is structurally similar to that observed for the myristylated HIV-1 protein. Comparisons with a recent crystal structure of the unmyristylated FIV protein [myr(-)MA] indicate that only small changes in helix orientation are required to accommodate the sequestered myr group. Depletion of PI(4,5)P2 from the plasma membrane of FIV-infected CRFK cells inhibited production of FIV particles, indicating that, like HIV, FIV hijacks the PI(4,5)P2 cellular signaling system to direct intracellular Gag trafficking during virus assembly.

  20. NMR structure of the myristylated feline immunodeficiency virus matrix protein.

    Science.gov (United States)

    Brown, Lola A; Cox, Cassiah; Baptiste, Janae; Summers, Holly; Button, Ryan; Bahlow, Kennedy; Spurrier, Vaughn; Kyser, Jenna; Luttge, Benjamin G; Kuo, Lillian; Freed, Eric O; Summers, Michael F

    2015-04-30

    Membrane targeting by the Gag proteins of the human immunodeficiency viruses (HIV types-1 and -2) is mediated by Gag's N-terminally myristylated matrix (MA) domain and is dependent on cellular phosphatidylinositol-4,5-bisphosphate [PI(4,5)P2]. To determine if other lentiviruses employ a similar membrane targeting mechanism, we initiated studies of the feline immunodeficiency virus (FIV), a widespread feline pathogen with potential utility for development of human therapeutics. Bacterial co-translational myristylation was facilitated by mutation of two amino acids near the amino-terminus of the protein (Q5A/G6S; myrMAQ5A/G6S). These substitutions did not affect virus assembly or release from transfected cells. NMR studies revealed that the myristyl group is buried within a hydrophobic pocket in a manner that is structurally similar to that observed for the myristylated HIV-1 protein. Comparisons with a recent crystal structure of the unmyristylated FIV protein [myr(-)MA] indicate that only small changes in helix orientation are required to accommodate the sequestered myr group. Depletion of PI(4,5)P2 from the plasma membrane of FIV-infected CRFK cells inhibited production of FIV particles, indicating that, like HIV, FIV hijacks the PI(4,5)P2 cellular signaling system to direct intracellular Gag trafficking during virus assembly.

  1. EDM-DEDM and protein crystal structure solution.

    Science.gov (United States)

    Caliandro, Rocco; Carrozzini, Benedetta; Cascarano, Giovanni Luca; Giacovazzo, Carmelo; Mazzone, Anna Maria; Siliqi, Dritan

    2009-05-01

    Electron-density modification (EDM) procedures are the classical tool for driving model phases closer to those of the target structure. They are often combined with automated model-building programs to provide a correct protein model. The task is not always performed, mostly because of the large initial phase error. A recently proposed procedure combined EDM with DEDM (difference electron-density modification); the method was applied to the refinement of phases obtained by molecular replacement, ab initio or SAD phasing [Caliandro, Carrozzini, Cascarano, Giacovazzo, Mazzone & Siliqi (2009), Acta Cryst. D65, 249-256] and was more effective in improving phases than EDM alone. In this paper, a novel fully automated protocol for protein structure refinement based on the iterative application of automated model-building programs combined with the additional power derived from the EDM-DEDM algorithm is presented. The cyclic procedure was successfully tested on challenging cases for which all other approaches had failed.

  2. Structure determination of T-cell protein-tyrosine phosphatase

    DEFF Research Database (Denmark)

    Iversen, L.F.; Møller, K. B.; Pedersen, A.K.

    2002-01-01

    Protein-tyrosine phosphatase 1B (PTP1B) has recently received much attention as a potential drug target in type 2 diabetes. This has in particular been spurred by the finding that PTP1B knockout mice show increased insulin sensitivity and resistance to diet-induced obesity. Surprisingly, the highly...... homologous T cell protein-tyrosine phosphatase (TC-PTP) has received much less attention, and no x-ray structure has been provided. We have previously co-crystallized PTP1B with a number of low molecular weight inhibitors that inhibit TC-PTP with similar efficiency. Unexpectedly, we were not able to co...... the high degree of functional and structural similarity between TC-PTP and PTP1B, we have been able to identify areas close to the active site that might be addressed to develop selective inhibitors of each enzyme....

  3. Structural basis for precursor protein-directed ribosomal peptide macrocyclization

    Science.gov (United States)

    Li, Kunhua; Condurso, Heather L.; Li, Gengnan; Ding, Yousong; Bruner, Steven D.

    2016-01-01

    Macrocyclization is a common feature of natural product biosynthetic pathways including the diverse family of ribosomal peptides. Microviridins are architecturally complex cyanobacterial ribosomal peptides whose members target proteases with potent reversible inhibition. The product structure is constructed by three macrocyclizations catalyzed sequentially by two members of the ATP-grasp family, a unique strategy for ribosomal peptide macrocyclization. Here, we describe the detailed structural basis for the enzyme-catalyzed macrocyclizations in the microviridin J pathway of Microcystis aeruginosa. The macrocyclases, MdnC and MdnB, interact with a conserved α-helix of the precursor peptide using a novel precursor peptide recognition mechanism. The results provide insight into the unique protein/protein interactions key to the chemistry, suggest an origin of the natural combinatorial synthesis of microviridin peptides and provide a framework for future engineering efforts to generate designed compounds. PMID:27669417

  4. Structural basis for precursor protein-directed ribosomal peptide macrocyclization.

    Science.gov (United States)

    Li, Kunhua; Condurso, Heather L; Li, Gengnan; Ding, Yousong; Bruner, Steven D

    2016-11-01

    Macrocyclization is a common feature of natural product biosynthetic pathways including the diverse family of ribosomal peptides. Microviridins are architecturally complex cyanobacterial ribosomal peptides that target proteases with potent reversible inhibition. The product structure is constructed via three macrocyclizations catalyzed sequentially by two members of the ATP-grasp family, a unique strategy for ribosomal peptide macrocyclization. Here we describe in detail the structural basis for the enzyme-catalyzed macrocyclizations in the microviridin J pathway of Microcystis aeruginosa. The macrocyclases MdnC and MdnB interact with a conserved α-helix of the precursor peptide using a novel precursor-peptide recognition mechanism. The results provide insight into the unique protein-protein interactions that are key to the chemistry, suggest an origin for the natural combinatorial synthesis of microviridin peptides, and provide a framework for future engineering efforts to generate designed compounds.

  5. Cascaded bidirectional recurrent neural networks for protein secondary structure prediction.

    Science.gov (United States)

    Chen, Jinmiao; Chaudhari, Narendra

    2007-01-01

    Protein secondary structure (PSS) prediction is an important topic in bioinformatics. Our study on a large set of non-homologous proteins shows that long-range interactions commonly exist and negatively affect PSS prediction. Besides, we also reveal strong correlations between secondary structure (SS) elements. In order to take into account the long-range interactions and SS-SS correlations, we propose a novel prediction system based on cascaded bidirectional recurrent neural network (BRNN). We compare the cascaded BRNN against another two BRNN architectures, namely the original BRNN architecture used for speech recognition as well as Pollastri's BRNN that was proposed for PSS prediction. Our cascaded BRNN achieves an overall three-state accuracy Q3 of 74.38%, and reaches a high Segment OVerlap (SOV) of 66.0455. It outperforms the original BRNN and Pollastri's BRNN in both Q3 and SOV. Specifically, it improves the SOV score by 4-6%.
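
    For readers unfamiliar with the Q3 measure quoted in this abstract, a minimal sketch follows. Q3 is the percentage of residues whose three-state label (helix H, strand E, coil C) is predicted correctly. The function name q3_accuracy and the two example strings are hypothetical and are not data from the study; the more involved SOV score is not reproduced here.

```python
def q3_accuracy(predicted: str, observed: str) -> float:
    """Percentage of residues with a correctly predicted H/E/C state."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("sequences must not be empty")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)


# Hypothetical 15-residue example: two of the states are mispredicted.
observed = "CCHHHHHCCEEEECC"
predicted = "CCHHHHCCCEEEHCC"

if __name__ == "__main__":
    print(f"Q3 = {q3_accuracy(predicted, observed):.2f}%")
```

    Because Q3 scores each residue independently, it does not reward getting whole segments right, which is the gap that segment-based measures such as SOV are meant to fill.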

  6. A Structural Perspective on the Modulation of Protein-Protein Interactions with Small Molecules.

    Science.gov (United States)

    Demirel, Habibe Cansu; Dogan, Tunca; Tuncbag, Nurcan

    2018-05-31

    Protein-protein interactions (PPIs) are key components in many cellular processes including signaling pathways, enzymatic reactions and epigenetic regulation. Abnormal interactions of some proteins may be pathogenic and cause various disorders including cancer and neurodegenerative diseases. Although inhibiting PPIs with small molecules is a challenging task, it has gained increasing interest because of its strong potential for drug discovery and design. Knowledge of the interface, as well as of the structural and chemical characteristics of PPIs and their roles in cellular pathways, is necessary for the rational design of small molecules that modulate PPIs. In this study, we review the recent progress in the field and detail the physicochemical properties of PPIs including binding hot spots with a focus on structural methods. Then, we review recent approaches for structural prediction of PPIs. Finally, we revisit the concept of targeting PPIs from a systems biology perspective and we refer to the non-structural approaches, usually employed when structural information is not available.

  7. Engineering and introduction of de novo disulphide bridges in ...

    Indian Academy of Sciences (India)

    The engineering of de novo disulphide bridges has been explored as a means to increase the thermal stability of enzymes in the rational method of protein engineering. In this study, Disulphide by Design software, homology modelling and molecular dynamics simulations were used to select appropriate amino acid pairs for ...

  8. De novo mutation in the dopamine transporter gene associates dopamine dysfunction with autism spectrum disorder.

    Science.gov (United States)

    Hamilton, P J; Campbell, N G; Sharma, S; Erreger, K; Herborg Hansen, F; Saunders, C; Belovich, A N; Sahai, M A; Cook, E H; Gether, U; McHaourab, H S; Matthies, H J G; Sutcliffe, J S; Galli, A

    2013-12-01

    De novo genetic variation is an important class of risk factors for autism spectrum disorder (ASD). Recently, whole-exome sequencing of ASD families has identified a novel de novo missense mutation in the human dopamine (DA) transporter (hDAT) gene, which results in a Thr to Met substitution at site 356 (hDAT T356M). The dopamine transporter (DAT) is a presynaptic membrane protein that regulates dopaminergic tone in the central nervous system by mediating the high-affinity reuptake of synaptically released DA, making it a crucial regulator of DA homeostasis. Here, we report the first functional, structural and behavioral characterization of an ASD-associated de novo mutation in the hDAT. We demonstrate that the hDAT T356M displays anomalous function, characterized as a persistent reverse transport of DA (substrate efflux). Importantly, in the bacterial homolog leucine transporter, substitution of A289 (the homologous site to T356) with a Met promotes an outward-facing conformation upon substrate binding. In the substrate-bound state, an outward-facing transporter conformation is required for substrate efflux. In Drosophila melanogaster, the expression of hDAT T356M in DA neurons-lacking Drosophila DAT leads to hyperlocomotion, a trait associated with DA dysfunction and ASD. Taken together, our findings demonstrate that alterations in DA homeostasis, mediated by aberrant DAT function, may confer risk for ASD and related neuropsychiatric conditions.

  9. Minor snake venom proteins: Structure, function and potential applications.

    Science.gov (United States)

    Boldrini-França, Johara; Cologna, Camila Takeno; Pucca, Manuela Berto; Bordon, Karla de Castro Figueiredo; Amorim, Fernanda Gobbi; Anjolette, Fernando Antonio Pino; Cordeiro, Francielle Almeida; Wiezel, Gisele Adriano; Cerni, Felipe Augusto; Pinheiro-Junior, Ernesto Lopes; Shibao, Priscila Yumi Tanaka; Ferreira, Isabela Gobbo; de Oliveira, Isadora Sousa; Cardoso, Iara Aimê; Arantes, Eliane Candiani

    2017-04-01

    Snake venoms present a great diversity of pharmacologically active compounds that may be applied as research and biotechnological tools, as well as in drug development and diagnostic tests for certain diseases. The most abundant toxins have been extensively studied in the last decades and some of them have already been used for different purposes. Nevertheless, most of the minor snake venom protein classes remain poorly explored, despite their potential application in diverse areas. The main difficulty in studying these proteins lies in the impossibility of obtaining sufficient amounts of them for a comprehensive investigation. The advent of more sensitive techniques in the last few years allowed the discovery of new venom components and the in-depth study of some already known minor proteins. This review summarizes information regarding some structural and functional aspects of low-abundance snake venom protein classes, such as growth factors, hyaluronidases, cysteine-rich secretory proteins, nucleases and nucleotidases, cobra venom factors, vespryns, protease inhibitors and antimicrobial peptides, among others. Some potential applications of these molecules are discussed herein in order to encourage researchers to explore the full venom repertoire and to discover new molecules or applications for the already known venom components.

  10. Validation of Structures in the Protein Data Bank.

    Science.gov (United States)

    Gore, Swanand; Sanz García, Eduardo; Hendrickx, Pieter M S; Gutmanas, Aleksandras; Westbrook, John D; Yang, Huanwang; Feng, Zukang; Baskaran, Kumaran; Berrisford, John M; Hudson, Brian P; Ikegawa, Yasuyo; Kobayashi, Naohiro; Lawson, Catherine L; Mading, Steve; Mak, Lora; Mukhopadhyay, Abhik; Oldfield, Thomas J; Patwardhan, Ardan; Peisach, Ezra; Sahni, Gaurav; Sekharan, Monica R; Sen, Sanchayita; Shao, Chenghua; Smart, Oliver S; Ulrich, Eldon L; Yamashita, Reiko; Quesada, Martha; Young, Jasmine Y; Nakamura, Haruki; Markley, John L; Berman, Helen M; Burley, Stephen K; Velankar, Sameer; Kleywegt, Gerard J

    2017-12-05

    The Worldwide PDB recently launched a deposition, biocuration, and validation tool: OneDep. At various stages of OneDep data processing, validation reports for three-dimensional structures of biological macromolecules are produced. These reports are based on recommendations of expert task forces representing crystallography, nuclear magnetic resonance, and cryoelectron microscopy communities. The reports provide useful metrics with which depositors can evaluate the quality of the experimental data, the structural model, and the fit between them. The validation module is also available as a stand-alone web server and as a programmatically accessible web service. A growing number of journals require the official wwPDB validation reports (produced at biocuration) to accompany manuscripts describing macromolecular structures. Upon public release of the structure, the validation report becomes part of the public PDB archive. Geometric quality scores for proteins in the PDB archive have improved over the past decade. Copyright © 2017 The Authors. Published by Elsevier Ltd. All rights reserved.

  11. Cancer3D: understanding cancer mutations through protein structures.

    Science.gov (United States)

    Porta-Pardo, Eduard; Hrabe, Thomas; Godzik, Adam

    2015-01-01

    The new era of cancer genomics is providing us with extensive knowledge of mutations and other alterations in cancer. The Cancer3D database at http://www.cancer3d.org gives an open and user-friendly way to analyze cancer missense mutations in the context of structures of proteins in which they are found. The database also helps users analyze the distribution patterns of the mutations as well as their relationship to changes in drug activity through two algorithms: e-Driver and e-Drug. These algorithms use knowledge of modular structure of genes and proteins to separately study each region. This approach allows users to find novel candidate driver regions or drug biomarkers that cannot be found when similar analyses are done on the whole-gene level. The Cancer3D database provides access to the results of such analyses based on data from The Cancer Genome Atlas (TCGA) and the Cancer Cell Line Encyclopedia (CCLE). In addition, it displays mutations from over 14,700 proteins mapped to more than 24,300 structures from PDB. This helps users visualize the distribution of mutations and identify novel three-dimensional patterns in their distribution. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.

  12. Structure-Energy Relationships of Halogen Bonds in Proteins.

    Science.gov (United States)

    Scholfield, Matthew R; Ford, Melissa Coates; Carlsson, Anna-Carin C; Butta, Hawera; Mehl, Ryan A; Ho, P Shing

    2017-06-06

    The structures and stabilities of proteins are defined by a series of weak noncovalent electrostatic, van der Waals, and hydrogen bond (HB) interactions. In this study, we have designed and engineered halogen bonds (XBs) site-specifically to study their structure-energy relationship in a model protein, T4 lysozyme. The evidence for XBs is the displacement of the aromatic side chain toward an oxygen acceptor, at distances that are equal to or less than the sums of their respective van der Waals radii, when the hydroxyl substituent of the wild-type tyrosine is replaced by a halogen. In addition, thermal melting studies show that the iodine XB rescues the stabilization energy from an otherwise destabilizing substitution (at an equivalent noninteracting site), indicating that the interaction is also present in solution. Quantum chemical calculations show that the XB complements an HB at this site and that solvent structure must also be considered in trying to design molecular interactions such as XBs into biological systems. A bromine substitution also shows displacement of the side chain, but the distances and geometries do not indicate formation of an XB. Thus, we have dissected the contributions from various noncovalent interactions of halogens introduced into proteins, to drive the application of XBs, particularly in biomolecular design.

  13. New protein structures provide an updated understanding of phenylketonuria.

    Science.gov (United States)

    Jaffe, Eileen K

    2017-08-01

    Phenylketonuria (PKU) and less severe hyperphenylalaninemia (HPA) constitute the most common inborn error of amino acid metabolism, and are most often caused by defects in phenylalanine hydroxylase (PAH) function resulting in accumulation of Phe to neurotoxic levels. Despite the success of dietary intervention in preventing permanent neurological damage, individuals living with PKU clamor for additional non-dietary therapies. The bulk of disease-associated mutations are PAH missense variants, which occur throughout the entire 452 amino acid human PAH protein. While some disease-associated mutations affect protein structure (e.g. truncations) and others encode catalytically dead variants, most have been viewed as defective in protein folding/stability. Here we refine this view to address how PKU-associated missense variants can perturb the equilibrium among alternate native PAH structures (resting-state PAH and activated PAH), thus shifting the tipping point of this equilibrium to a neurotoxic Phe concentration. This refined view of PKU introduces opportunities for the design or discovery of therapeutic pharmacological chaperones that can help restore the tipping point to healthy Phe levels and how such a therapeutic might work with or without the inhibitory pharmacological chaperone BH4. Dysregulation of an equilibrium of architecturally distinct native PAH structures departs from the concept of "misfolding", provides an updated understanding of PKU, and presents an enhanced foundation for understanding genotype/phenotype relationships. Copyright © 2017 Elsevier Inc. All rights reserved.

  14. Parallel protein secondary structure prediction based on neural networks.

    Science.gov (United States)

    Zhong, Wei; Altun, Gulsah; Tian, Xinmin; Harrison, Robert; Tai, Phang C; Pan, Yi

    2004-01-01

    Protein secondary structure prediction has a fundamental influence on today's bioinformatics research. In this work, binary and tertiary classifiers of protein secondary structure prediction are implemented on the Denoeux belief neural network (DBNN) architecture. Hydrophobicity matrix, orthogonal matrix, BLOSUM62 and PSSM (position-specific scoring matrix) are tested separately as the encoding schemes for DBNN. The experimental results contribute to the design of new encoding schemes. A new binary classifier for Helix versus not Helix (~H) for DBNN achieves a prediction accuracy of 87% when PSSM is used for the input profile. The performance of the DBNN binary classifier is comparable to that of the best other prediction methods. The good test results for binary classifiers open a new approach for protein structure prediction with neural networks. Because training the neural networks is time-consuming, Pthread and OpenMP are employed to parallelize DBNN on the hyperthreading-enabled Intel architecture. Speedup for 16 Pthreads is 4.9 and speedup for 16 OpenMP threads is 4 on the 4-processor shared-memory architecture. The speedups of both OpenMP and Pthread are superior to those of other research. With the new parallel training algorithm, thousands of amino acids can be processed in a reasonable amount of time. Our research also shows that hyperthreading technology for Intel architecture is efficient for parallel biological algorithms.
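
    The "orthogonal matrix" encoding named in the abstract is commonly understood as a one-hot code per residue. The following is a minimal sketch of such an encoding over a sliding window; the window size of 13, the function name, and the zero-padding at the chain termini are assumptions made for illustration and are not taken from the record.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}

def orthogonal_encode(sequence, window=13):
    """One-hot encode every residue together with its window neighbours.

    Hypothetical helper: window size and padding are illustrative choices.
    Window slots that fall outside the chain, or that hold a non-standard
    letter, stay all-zero.
    """
    half = window // 2
    n = len(sequence)
    features = np.zeros((n, window * len(AMINO_ACIDS)))
    for center in range(n):
        for offset in range(-half, half + 1):
            pos = center + offset
            if 0 <= pos < n and sequence[pos] in INDEX:
                slot = offset + half  # position inside the window
                col = slot * len(AMINO_ACIDS) + INDEX[sequence[pos]]
                features[center, col] = 1.0
    return features
```

    Each row is the input vector for predicting the state of the central residue; swapping the one-hot columns for BLOSUM62 or PSSM columns would give the other encodings the abstract lists.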

  15. Quantification of Protein Hydration, Glass Transitions, and Structural Relaxations of Aqueous Protein and Carbohydrate-Protein Systems.

    Science.gov (United States)

    Roos, Yrjö H; Potes, Naritchaya

    2015-06-11

    Water distribution and miscibility of carbohydrate and protein components in biological materials and their structural contributions in concentrated solids are poorly understood. In the present study, structural relaxations and a glass transition of protein hydration water and antiplasticization of the hydration water at low temperatures were measured using dynamic mechanical analysis (DMA) and differential scanning calorimetry (DSC) for bovine whey protein (BWP), aqueous glucose-fructose (GF), and their mixture. Thermal transitions of α-lactalbumin and β-lactoglobulin components of BWP included water-content-dependent endothermic but reversible dehydration and denaturation, and exothermic and irreversible aggregation. An α-relaxation assigned to hydration water in BWP appeared at water-content-dependent temperatures and increased to over the range of 150-200 K at decreasing water content and in the presence of GF. Two separate glass transitions and individual fractions of unfrozen water of ternary GF-BWP-water systems contributed to uncoupled α-relaxations, suggesting different roles of protein hydration water and carbohydrate vitrification in concentrated solids during freezing and dehydration. Hydration water in the BWP fraction of GF-BWP systems was derived from equilibrium water sorption and glass transition data of the GF fraction, which gave a significant universal method to quantify (i) protein hydration water and (ii) the unfrozen water in protein-carbohydrate systems for such applications as cryopreservation, freezing, lyophilization, and dehydration of biological materials. A ternary supplemented phase diagram (state diagram) established for the GF-BWP-water system can be used for the analysis of the water distribution across carbohydrate and protein components in such applications.

  16. Automatic structure classification of small proteins using random forest

    Directory of Open Access Journals (Sweden)

    Hirst Jonathan D

    2010-07-01

    Full Text Available. Background: Random forest, an ensemble-based supervised machine learning algorithm, is used to predict the SCOP structural classification for a target structure, based on the similarity of its structural descriptors to those of a template structure with an equal number of secondary structure elements (SSEs). An initial assessment of random forest is carried out for domains consisting of three SSEs. The usability of random forest in classifying larger domains is demonstrated by applying it to domains consisting of four, five and six SSEs. Results: Random forest, trained on SCOP version 1.69, achieves a predictive accuracy of up to 94% on an independent and non-overlapping test set derived from SCOP version 1.73. For classification to the SCOP Class, Fold, Super-family or Family levels, the predictive quality of the model in terms of Matthews correlation coefficient (MCC) ranged from 0.61 to 0.83. As the number of constituent SSEs increases, the MCC for classification to different structural levels decreases. Conclusions: The utility of random forest in classifying domains from the place-holder classes of SCOP to the true Class, Fold, Super-family or Family levels is demonstrated. Issues such as introduction of a new structural level in SCOP and the merger of singleton levels can also be addressed using random forest. A real-world scenario is mimicked by predicting the classification for those protein structures from the PDB, which are yet to be assigned to the SCOP classification hierarchy.
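
    For readers unfamiliar with the Matthews correlation coefficient quoted above, here is a minimal sketch of its binary (one class versus the rest) form. How the record's authors aggregated MCC over SCOP levels is not stated in the abstract, so the one-vs-rest framing and the zero-denominator convention are assumptions.

```python
import math

def matthews_corrcoef(tp, tn, fp, fn):
    """Binary MCC from confusion-matrix counts.

    Returns 0.0 when any marginal total is zero (a common convention,
    assumed here), since the coefficient is otherwise undefined.
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return numerator / denominator if denominator else 0.0
```

    MCC ranges from -1 (total disagreement) through 0 (no better than chance) to +1 (perfect prediction), which makes it more informative than raw accuracy when class sizes are unbalanced.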

  17. I-TASSER server for protein 3D structure prediction

    Directory of Open Access Journals (Sweden)

    Zhang Yang

    2008-01-01

    Full Text Available. Background: Prediction of 3-dimensional protein structures from amino acid sequences represents one of the most important problems in computational structural biology. The community-wide Critical Assessment of Structure Prediction (CASP) experiments have been designed to obtain an objective assessment of the state-of-the-art of the field, where I-TASSER was ranked as the best method in the server section of the recent 7th CASP experiment. Our laboratory has since then received numerous requests about the public availability of the I-TASSER algorithm and the usage of the I-TASSER predictions. Results: An on-line version of I-TASSER is developed at the KU Center for Bioinformatics which has generated protein structure predictions for thousands of modeling requests from more than 35 countries. A scoring function (C-score) based on the relative clustering structural density and the consensus significance score of multiple threading templates is introduced to estimate the accuracy of the I-TASSER predictions. A large-scale benchmark test demonstrates a strong correlation between the C-score and the TM-score (a structural similarity measurement with values in [0, 1]) of the first models with a correlation coefficient of 0.91. Using a C-score cutoff > -1.5 for the models of correct topology, both false positive and false negative rates are below 0.1. Combining C-score and protein length, the accuracy of the I-TASSER models can be predicted with an average error of 0.08 for TM-score and 2 Å for RMSD. Conclusion: The I-TASSER server has been developed to generate automated full-length 3D protein structural predictions where the benchmarked scoring system helps users to obtain quantitative assessments of the I-TASSER models. The output of the I-TASSER server for each query includes up to five full-length models, the confidence score, the estimated TM-score and RMSD, and the standard deviation of the estimations.
The I-TASSER server is freely available
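
    The TM-score mentioned in this record can be illustrated with a short sketch. It evaluates the standard TM-score sum for one fixed superposition, given the distances between aligned residue pairs; the full measure takes the maximum over superpositions, which is omitted here. The function names are hypothetical, and the sketch assumes a target longer than 15 residues so that the length-dependent scale d0 is well defined.

```python
def tm_d0(target_length):
    """Length-dependent distance scale d0 (in angstroms) of the TM-score."""
    if target_length <= 15:
        raise ValueError("this sketch assumes target_length > 15")
    return 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8

def tm_score(distances, target_length):
    """TM-score contribution for ONE fixed superposition.

    distances: distances (angstroms) between aligned residue pairs.
    Unaligned target residues simply contribute nothing to the sum.
    """
    d0 = tm_d0(target_length)
    total = sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances)
    return total / target_length
```

    Because the sum is normalized by the target length rather than by the number of aligned pairs, a model that matches only half of the target perfectly scores 0.5, which is why values fall in [0, 1].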

  18. Protein Design Using Unnatural Amino Acids

    Science.gov (United States)

    Bilgiçer, Basar; Kumar, Krishna

    2003-11-01

    With the increasing availability of whole organism genome sequences, understanding protein structure and function is of capital importance. Recent developments in the methodology of incorporation of unnatural amino acids into proteins allow the exploration of proteins at a very detailed level. Furthermore, de novo design of novel protein structures and function is feasible with unprecedented sophistication. Using examples from the literature, this article describes the available methods for unnatural amino acid incorporation and highlights some recent applications including the design of hyperstable protein folds.

  19. Building alternate protein structures using the elastic network model.

    Science.gov (United States)

    Yang, Qingyi; Sharp, Kim A

    2009-02-15

    We describe a method for efficiently generating ensembles of alternate, all-atom protein structures that (a) differ significantly from the starting structure, (b) have good stereochemistry (bonded geometry), and (c) have good steric properties (absence of atomic overlap). The method uses reconstruction from a series of backbone framework structures that are obtained from a modified elastic network model (ENM) by perturbation along low-frequency normal modes. To ensure good quality backbone frameworks, the single force parameter ENM is modified by introducing two more force parameters to characterize the interaction between the consecutive carbon alphas and those within the same secondary structure domain. The relative stiffness of the three parameters is parameterized to reproduce B-factors, while maintaining good bonded geometry. After parameterization, violations of experimental Calpha-Calpha distances and Calpha-Calpha-Calpha pseudo angles along the backbone are reduced to less than 1%. Simultaneously, the average B-factor correlation coefficient improves to R = 0.77. Two applications illustrate the potential of the approach. (1) 102,051 protein backbones spanning a conformational space of 15 Å root mean square deviation were generated from 148 nonredundant proteins in the PDB database, and all-atom models with minimal bonded and nonbonded violations were produced from this ensemble of backbone structures using the SCWRL side chain building program. (2) Improved backbone templates for homology modeling. Fifteen query sequences were each modeled on two targets. For each of the 30 target frameworks, dozens of improved templates could be produced. In all cases, improved full atom homology models resulted, of which 50% could be identified blind using the D-Fire statistical potential. (c) 2008 Wiley-Liss, Inc.
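
    The idea of perturbing a backbone along low-frequency normal modes can be sketched with the plain single-parameter anisotropic network model. This is not the three-parameter model described in the record: the uniform spring constant, the 15 Å cutoff, and the function names are assumptions chosen for illustration.

```python
import numpy as np

def anm_hessian(coords, cutoff=15.0, gamma=1.0):
    """Hessian of a single-parameter elastic network over C-alpha coordinates.

    coords: (N, 3) array. Every pair closer than `cutoff` is joined by a
    spring of stiffness `gamma` (both values are illustrative assumptions).
    """
    n = len(coords)
    hessian = np.zeros((3 * n, 3 * n))
    for i in range(n):
        for j in range(i + 1, n):
            diff = coords[j] - coords[i]
            dist2 = diff @ diff
            if dist2 > cutoff ** 2:
                continue
            block = -gamma * np.outer(diff, diff) / dist2
            hessian[3*i:3*i+3, 3*j:3*j+3] = block
            hessian[3*j:3*j+3, 3*i:3*i+3] = block
            # Diagonal blocks accumulate the negated off-diagonal blocks.
            hessian[3*i:3*i+3, 3*i:3*i+3] -= block
            hessian[3*j:3*j+3, 3*j:3*j+3] -= block
    return hessian

def perturb_along_mode(coords, mode_index=0, amplitude=1.0, cutoff=15.0):
    """Displace coords along the mode_index-th lowest non-rigid normal mode."""
    _, eigenvectors = np.linalg.eigh(anm_hessian(coords, cutoff))
    # eigh sorts eigenvalues ascending; the first six modes are the
    # zero-frequency rigid-body translations and rotations, so skip them.
    mode = eigenvectors[:, 6 + mode_index].reshape(-1, 3)
    return coords + amplitude * mode
```

    Scanning `amplitude` over a range of positive and negative values for several low-index modes yields a family of backbone frameworks of the kind the abstract describes, onto which side chains would then be rebuilt by a separate tool.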

  20. Effects of dietary energy density and digestible protein:energy ratio on de novo lipid synthesis from dietary protein in gilthead sea bream (Sparus aurata) quantified with stable isotopes

    DEFF Research Database (Denmark)

    Ekmann, Kim Schøn; Dalsgaard, Anne Johanne Tang; Holm, Jørgen

    2013-01-01

    to trace the metabolic fate of dietary protein, 1·8% fishmeal was replaced with isotope-labelled whole protein (>98% 13C). The experiment was divided into a growth period lasting 89 d, growing fish from approximately 140 to 350 g, followed by a 3 d period feeding isotope-enriched diets. Isotope ratio MS...

  1. Structural fragment clustering reveals novel structural and functional motifs in α-helical transmembrane proteins

    Directory of Open Access Journals (Sweden)

    Vassilev Boris

    2010-04-01

    Full Text Available. Background: A large proportion of an organism's genome encodes membrane proteins. Membrane proteins are important for many cellular processes, and several diseases can be linked to mutations in them. With the tremendous growth of sequence data, there is an increasing need to reliably identify membrane proteins from sequence, to functionally annotate them, and to correctly predict their topology. Results: We introduce a technique called structural fragment clustering, which learns sequential motifs from 3D structural fragments. From over 500,000 fragments, we obtain 213 statistically significant, non-redundant, and novel motifs that are highly specific to α-helical transmembrane proteins. From these 213 motifs, 58 of them were assigned to function and checked in the scientific literature for a biological assessment. Seventy percent of the motifs are found in co-factor, ligand, and ion binding sites, 30% at protein interaction interfaces, and 12% bind specific lipids such as glycerol or cardiolipins. The vast majority of motifs (94%) appear across evolutionarily unrelated families, highlighting the modularity of functional design in membrane proteins. We describe three novel motifs in detail: (1) a dimer interface motif found in voltage-gated chloride channels, (2) a proton transfer motif found in heme-copper oxidases, and (3) a convergently evolved interface helix motif found in an aspartate symporter, a serine protease, and cytochrome b. Conclusions: Our findings suggest that functional modules exist in membrane proteins, and that they occur in completely different evolutionary contexts and cover different binding sites. Structural fragment clustering allows us to link sequence motifs to function through clusters of structural fragments. The sequence motifs can be applied to identify and characterize membrane proteins in novel genomes.

  2. From Extraction of Local Structures of Protein Energy Landscapes to Improved Decoy Selection in Template-Free Protein Structure Prediction.

    Science.gov (United States)

    Akhter, Nasrin; Shehu, Amarda

    2018-01-19

    Due to the essential role that the three-dimensional conformation of a protein plays in regulating interactions with molecular partners, wet and dry laboratories seek biologically-active conformations of a protein to decode its function. Computational approaches are gaining prominence due to the labor and cost demands of wet laboratory investigations. Template-free methods can now compute thousands of conformations known as decoys, but selecting native conformations from the generated decoys remains challenging. Repeatedly, research has shown that the protein energy functions whose minima are sought in the generation of decoys are unreliable indicators of nativeness. The prevalent approach ignores energy altogether and clusters decoys by conformational similarity. Complementary recent efforts design protein-specific scoring functions or train machine learning models on labeled decoys. In this paper, we show that an informative consideration of energy can be carried out under the energy landscape view. Specifically, we leverage local structures known as basins in the energy landscape probed by a template-free method. We propose and compare various strategies of basin-based decoy selection that we demonstrate are superior to clustering-based strategies. The presented results point to further directions of research for improving decoy selection, including the ability to properly consider the multiplicity of native conformations of proteins.

  3. From Extraction of Local Structures of Protein Energy Landscapes to Improved Decoy Selection in Template-Free Protein Structure Prediction

    Directory of Open Access Journals (Sweden)

    Nasrin Akhter

    2018-01-01

    Full Text Available Due to the essential role that the three-dimensional conformation of a protein plays in regulating interactions with molecular partners, wet and dry laboratories seek biologically-active conformations of a protein to decode its function. Computational approaches are gaining prominence due to the labor and cost demands of wet laboratory investigations. Template-free methods can now compute thousands of conformations known as decoys, but selecting native conformations from the generated decoys remains challenging. Repeatedly, research has shown that the protein energy functions whose minima are sought in the generation of decoys are unreliable indicators of nativeness. The prevalent approach ignores energy altogether and clusters decoys by conformational similarity. Complementary recent efforts design protein-specific scoring functions or train machine learning models on labeled decoys. In this paper, we show that an informative consideration of energy can be carried out under the energy landscape view. Specifically, we leverage local structures known as basins in the energy landscape probed by a template-free method. We propose and compare various strategies of basin-based decoy selection that we demonstrate are superior to clustering-based strategies. The presented results point to further directions of research for improving decoy selection, including the ability to properly consider the multiplicity of native conformations of proteins.

  4. Crystal structure of the Japanese encephalitis virus envelope protein.

    Science.gov (United States)

    Luca, Vincent C; AbiMansour, Jad; Nelson, Christopher A; Fremont, Daved H

    2012-02-01

    Japanese encephalitis virus (JEV) is the leading global cause of viral encephalitis. The JEV envelope protein (E) facilitates cellular attachment and membrane fusion and is the primary target of neutralizing antibodies. We have determined the 2.1-Å resolution crystal structure of the JEV E ectodomain refolded from bacterial inclusion bodies. The E protein possesses the three domains characteristic of flavivirus envelopes and epitope mapping of neutralizing antibodies onto the structure reveals determinants that correspond to the domain I lateral ridge, fusion loop, domain III lateral ridge, and domain I-II hinge. While monomeric in solution, JEV E assembles as an antiparallel dimer in the crystal lattice organized in a highly similar fashion as seen in cryo-electron microscopy models of mature flavivirus virions. The dimer interface, however, is remarkably small and lacks many of the domain II contacts observed in other flavivirus E homodimers. In addition, uniquely conserved histidines within the JEV serocomplex suggest that pH-mediated structural transitions may be aided by lateral interactions outside the dimer interface in the icosahedral virion. Our results suggest that variation in dimer structure and stability may significantly influence the assembly, receptor interaction, and uncoating of virions.

  5. A use of Ramachandran potentials in protein solution structure determinations

    International Nuclear Information System (INIS)

    Bertini, Ivano; Cavallaro, Gabriele; Luchinat, Claudio; Poli, Irene

    2003-01-01

    A strategy is developed to use database-derived φ-ψ constraints during simulated annealing procedures for protein solution structure determination in order to improve the Ramachandran plot statistics, while maintaining the agreement with the experimental constraints as the sole criterion for the selection of the family. The procedure, fully automated, consists of two consecutive simulated annealing runs. In the first run, the database-derived φ-ψ constraints are enforced for all amino acids (except prolines and glycines). A family of structures is then selected on the basis of the lowest violations of the experimental constraints only, and the φ-ψ values for each residue are examined. In the second and final run, the database-derived φ-ψ constraints are enforced only for those residues which in the first run have ended in one and the same favored φ-ψ region. For residues which are either spread over different favored regions or concentrated in disallowed regions, the constraints are not enforced. The final family is then selected, after the second run, again only based on the agreement with the experimental constraints. This automated approach was implemented in DYANA and was tested on as many as 12 proteins, including some containing paramagnetic metals, whose structures had been previously solved in our laboratory. The quality of the structures, and of Ramachandran plot statistics in particular, was notably improved while preserving the agreement with the experimental constraints.
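
    The residue-selection step between the two annealing runs can be sketched as follows. The rectangular region boundaries below are crude placeholders invented for illustration; the record's database-derived φ-ψ regions are not given in the abstract, and the function and argument names are likewise hypothetical.

```python
def classify(phi, psi):
    """Assign (phi, psi) in degrees to a favored region, or None.

    PLACEHOLDER boxes only: these are not the database-derived regions
    used in the record.
    """
    if -160 <= phi <= -20 and -120 <= psi <= 50:
        return "alpha"
    if -180 <= phi <= -45 and (psi >= 90 or psi <= -150):
        return "beta"
    if 20 <= phi <= 120 and -30 <= psi <= 90:
        return "left"
    return None

def residues_to_constrain(family_angles, skip=("PRO", "GLY")):
    """Pick residues whose angles agree on one favored region across the family.

    family_angles: {residue_id: (residue_name, [(phi, psi), ...])}, with one
    (phi, psi) pair per structure in the family selected after the first run.
    Returns {residue_id: region} for residues to constrain in the second run.
    """
    selected = {}
    for res_id, (name, angles) in family_angles.items():
        if name in skip:
            continue
        regions = {classify(phi, psi) for phi, psi in angles}
        # Enforce only if every structure lands in the same favored region.
        if len(regions) == 1 and None not in regions:
            selected[res_id] = regions.pop()
    return selected
```

    Residues spread over several regions, or sitting in disallowed space, are left out of the returned mapping, mirroring the rule that their constraints are not enforced in the second run.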

  6. [Structure and evolution of the eukaryotic FANCJ-like proteins].

    Science.gov (United States)

    Wuhe, Jike; Zefeng, Wu; Sanhong, Fan; Xuguang, Xi

    2015-02-01

    The FANCJ-like protein family is a class of ATP-dependent helicases that can catalytically unwind duplex DNA along the 5'-3' direction. It is involved in the processes of DNA damage repair, homologous recombination and G-quadruplex DNA unwinding, and plays a critical role in maintaining genome integrity. In this study, we systematically analyzed FANCJ-like proteins from 47 eukaryotic species and discussed their sequence diversity, origin and evolution, motif organization patterns and spatial structure differences. Four members of the FANCJ-like proteins, XPD, CHL1, RTEL1 and FANCJ, were found in eukaryotes, but some of them were absent in most fungi and some insects. For example, the Zygomycota fungi lost RTEL1, Basidiomycota and Ascomycota fungi lost RTEL1 and FANCJ, and Dipteran insects lost FANCJ. FANCJ-like proteins contain the canonical motor domains HD1 and HD2, and the HD1 domain further integrates three unique domains: Fe-S, Arch and Extra-D. The Fe-S and Arch domains are relatively conserved in all members of the family, but the Extra-D domain is lost in XPD and differs among the remaining members. There are 7, 10 and 2 specific motifs found in the three unique domains respectively, while 5 and 12 specific motifs are found in the HD1 and HD2 domains in addition to the conserved motifs reported previously. By analyzing the arrangement pattern of these specific motifs, we found that RTEL1 and FANCJ are closer to each other and share two specific motifs, Vb2 and Vc, in the HD2 domain, which are likely related to their G-quadruplex DNA unwinding activity. The evolutionary evidence showed that FANCJ-like proteins originated from a helicase with an HD1 domain into which extra Fe-S and Arch domains had been inserted. Through three successive gene duplication events followed by specialization, eukaryotes finally acquired the current four members of the FANCJ-like proteins.

  7. Hydrogen atoms in protein structures: high-resolution X-ray diffraction structure of the DFPase

    Science.gov (United States)

    2013-01-01

    Background: Hydrogen atoms represent about half of the total number of atoms in proteins and are often involved in substrate recognition and catalysis. Unfortunately, X-ray protein crystallography at usual resolution cannot directly determine their positions, mainly because light atoms contribute only weakly to diffraction. However, sub-ångström diffraction data, careful modeling and a proper refinement strategy can allow the positioning of a significant part of the hydrogen atoms. Results: A comprehensive study on the X-ray structure of the diisopropyl-fluorophosphatase (DFPase) was performed, and the hydrogen atoms were modeled, including those of solvent molecules. This model was compared to the available neutron structure of DFPase, and differences in the protein and the active site solvation were noticed. Conclusions: A further examination of the DFPase X-ray structure provides substantial evidence for the presence of an activated water molecule, which may constitute an interesting piece of information with regard to the enzymatic hydrolysis mechanism. PMID:23915572

  8. A computational tool to predict the evolutionarily conserved protein-protein interaction hot-spot residues from the structure of the unbound protein.

    Science.gov (United States)

    Agrawal, Neeraj J; Helk, Bernhard; Trout, Bernhardt L

    2014-01-21

    Identifying hot-spot residues - residues that are critical to protein-protein binding - can help to elucidate a protein's function and assist in designing therapeutic molecules to target those residues. We present a novel computational tool, termed spatial-interaction-map (SIM), to predict the hot-spot residues of an evolutionarily conserved protein-protein interaction from the structure of an unbound protein alone. SIM can predict the protein hot-spot residues with an accuracy of 36-57%. Thus, the SIM tool can be used to predict the yet unknown hot-spot residues for many proteins for which the structures of the protein-protein complexes are not available, thereby providing a clue to their functions and an opportunity to design therapeutic molecules to target these proteins. Copyright © 2013 Federation of European Biochemical Societies. Published by Elsevier B.V. All rights reserved.

  9. WildSpan: mining structured motifs from protein sequences

    Directory of Open Access Journals (Sweden)

    Chen Chien-Yu

    2011-03-01

    Full Text Available. Background: Automatic extraction of motifs from biological sequences is an important research problem in the study of molecular biology. For proteins, it is desired to discover sequence motifs containing a large number of wildcard symbols, as the residues associated with functional sites are usually largely separated in sequences. Discovering such patterns is time-consuming because abundant combinations exist when long gaps (a gap consists of one or more successive wildcards) are considered. Mining algorithms often employ constraints to narrow down the search space in order to increase efficiency. However, improper constraint models might degrade the sensitivity and specificity of the motifs discovered by computational methods. We previously proposed a new constraint model to handle large wildcard regions for discovering functional motifs of proteins. The patterns that satisfy the proposed constraint model are called W-patterns. A W-pattern is a structured motif that groups motif symbols into pattern blocks interleaved with large irregular gaps. Considering large gaps reflects the fact that functional residues are not always from a single region of protein sequences, and restricting motif symbols into clusters corresponds to the observation that short motifs are frequently present within protein families. To efficiently discover W-patterns for large-scale sequence annotation and function prediction, this paper first formally introduces the problem to solve and proposes an algorithm named WildSpan (sequential pattern mining across large wildcard regions) that incorporates several pruning strategies to largely reduce the mining cost. Results: WildSpan is shown to efficiently find W-patterns containing conserved residues that are far separated in sequences. We conducted experiments with two mining strategies, protein-based and family-based mining, to evaluate the usefulness of W-patterns and performance of WildSpan.
The protein-based mining mode
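
The idea of a W-pattern described in the abstract above (short pattern blocks separated by large, variable-length gaps) can be illustrated by matching one such pattern against a sequence. The following is only a sketch: the block notation (`x` for a single wildcard), the gap ranges, the function name `w_pattern_to_regex`, and the example sequences are all hypothetical, and this is not the WildSpan mining algorithm itself, which discovers such patterns rather than matching a given one.

```python
import re

def w_pattern_to_regex(blocks, gaps):
    """Compile a block/gap motif into a regular expression.

    blocks: list of strings; 'x' stands for one wildcard residue (assumed notation).
    gaps:   list of (min_len, max_len) tuples, one between each pair of blocks.
    """
    if len(gaps) != len(blocks) - 1:
        raise ValueError("need exactly one gap between consecutive blocks")
    parts = [blocks[0].replace("x", ".")]
    for (lo, hi), block in zip(gaps, blocks[1:]):
        parts.append(".{%d,%d}" % (lo, hi))    # large irregular gap
        parts.append(block.replace("x", "."))  # next conserved block
    return re.compile("".join(parts))

# Hypothetical motif: block "CxxC", then a gap of 5 to 20 residues, then block "HxH".
pattern = w_pattern_to_regex(["CxxC", "HxH"], [(5, 20)])
hit = pattern.search("MK" + "CAAC" + "G" * 8 + "HPH" + "LL")
miss = pattern.search("CAACGGHPH")  # gap of only 2 residues, too short
print(hit.start() if hit else None, miss)
```

Expressing the gap as a bounded range rather than a fixed length is what lets one pattern cover family members whose conserved blocks are separated by different distances.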

  10. Crystal Structure of a Lipid G Protein-Coupled Receptor

    Energy Technology Data Exchange (ETDEWEB)

    Hanson, Michael A; Roth, Christopher B; Jo, Euijung; Griffith, Mark T; Scott, Fiona L; Reinhart, Greg; Desale, Hans; Clemons, Bryan; Cahalan, Stuart M; Schuerer, Stephan C; Sanna, M Germana; Han, Gye Won; Kuhn, Peter; Rosen, Hugh; Stevens, Raymond C [Scripps; (Receptos)

    2012-03-01

    The lyso-phospholipid sphingosine 1-phosphate modulates lymphocyte trafficking, endothelial development and integrity, heart rate, and vascular tone and maturation by activating G protein-coupled sphingosine 1-phosphate receptors. Here, we present the crystal structure of the sphingosine 1-phosphate receptor 1 fused to T4-lysozyme (S1P1-T4L) in complex with an antagonist sphingolipid mimic. Extracellular access to the binding pocket is occluded by the amino terminus and extracellular loops of the receptor. Access is gained by ligands entering laterally between helices I and VII within the transmembrane region of the receptor. This structure, along with mutagenesis, agonist structure-activity relationship data, and modeling, provides a detailed view of the molecular recognition and requirement for hydrophobic volume that activates S1P1, resulting in the modulation of immune and stromal cell responses.

  11. A periodic table of coiled-coil protein structures.

    Science.gov (United States)

    Moutevelis, Efrosini; Woolfson, Derek N

    2009-01-23

    Coiled coils are protein structure domains with two or more alpha-helices packed together via interlacing of side chains known as knob-into-hole packing. We analysed and classified a large set of coiled-coil structures using a combination of automated and manual methods. This led to a systematic classification that we termed a "periodic table of coiled coils," which we have made available at http://coiledcoils.chm.bris.ac.uk/ccplus/search/periodic_table. In this table, coiled-coil assemblies are arranged in columns with increasing numbers of alpha-helices and in rows of increased complexity. The table provides a framework for understanding possibilities in and limits on coiled-coil structures and a basis for future prediction, engineering and design studies.

  12. [Structure analysis of disease-related proteins using vibrational spectroscopy].

    Science.gov (United States)

    Hiramatsu, Hirotsugu

    2014-01-01

    Analyses of the structure and properties of identified pathogenic proteins are important for elucidating the molecular basis of diseases and in drug discovery research. Vibrational spectroscopy has advantages over other techniques in terms of sensitivity of detection of structural changes. Spectral analysis, however, is complicated because the spectrum involves a substantial amount of information. This article includes examples of structural analysis of disease-related proteins using vibrational spectroscopy in combination with additional techniques that facilitate data acquisition and analysis. Residue-specific conformation analysis of an amyloid fibril was conducted using IR absorption spectroscopy in combination with ¹³C-isotope labeling, linear dichroism measurement, and analysis of amide I band features. We reveal a pH-dependent property of the interacting segment of an amyloidogenic protein, β2-microglobulin, which causes dialysis-related amyloidosis. We also reveal the molecular mechanisms underlying pH-dependent sugar-binding activity of human galectin-1, which is involved in cell adhesion, using spectroscopic techniques including UV resonance Raman spectroscopy. The decreased activity at acidic pH was attributed to a conformational change in the sugar-binding pocket caused by protonation of His52 (pKa 6.3) and the cation-π interaction between Trp68 and the protonated His44 (pKa 5.7). In addition, we show that the peak positions of the Raman bands of the C4=C5 stretching mode at approximately 1600 cm⁻¹ and the Nπ-C2-Nτ bending mode at approximately 1405 cm⁻¹ serve as markers of the His side-chain structure. The Raman signal was enhanced 12-fold using a vertical flow apparatus.

  13. A Survey of Protein Structures from Archaeal Viruses

    Directory of Open Access Journals (Sweden)

    Nikki Dellas

    2013-01-01

    Full Text Available Viruses that infect the third domain of life, Archaea, are a newly emerging field of interest. To date, all characterized archaeal viruses infect archaea that thrive in extreme conditions, such as halophilic, hyperthermophilic, and methanogenic environments. Viruses in general, especially those replicating in extreme environments, contain highly mosaic genomes with open reading frames (ORFs) whose sequences are often dissimilar to all other known ORFs. It has been estimated that approximately 85% of virally encoded ORFs do not match known sequences in the nucleic acid databases, and this percentage is even higher for archaeal viruses (typically 90%–100%). This statistic suggests that either virus genomes represent a larger segment of sequence space and/or that viruses encode genes of novel fold and/or function. Because the overall three-dimensional fold of a protein evolves more slowly than its sequence, efforts have been geared toward structural characterization of proteins encoded by archaeal viruses in order to gain insight into their potential functions. In this short review, we provide multiple examples where structural characterization of archaeal viral proteins has indeed provided significant functional and evolutionary insight.

  14. Protein 8-class secondary structure prediction using conditional neural fields.

    Science.gov (United States)

    Wang, Zhiyong; Zhao, Feng; Peng, Jian; Xu, Jinbo

    2011-10-01

    Compared with the protein 3-class secondary structure (SS) prediction, the 8-class prediction gains less attention and is also much more challenging, especially for proteins with few sequence homologs. This paper presents a new probabilistic method for 8-class SS prediction using conditional neural fields (CNFs), a recently invented probabilistic graphical model. This CNF method not only models the complex relationship between sequence features and SS, but also exploits the interdependency among SS types of adjacent residues. In addition to sequence profiles, our method also makes use of non-evolutionary information for SS prediction. Tested on the CB513 and RS126 data sets, our method achieves Q8 accuracy of 64.9 and 64.7%, respectively, which are much better than the SSpro8 web server (51.0 and 48.0%, respectively). Our method can also be used to predict other structure properties (e.g. solvent accessibility) of a protein or the SS of RNA. Copyright © 2011 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
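
The Q8 figures quoted in the abstract above are per-residue accuracies over eight secondary-structure states. As a minimal sketch of how such a score is usually computed (the function name and the two label strings below are hypothetical, and the abstract does not describe the authors' evaluation code), one can count the residues whose predicted label equals the observed label:

```python
def q8_accuracy(predicted: str, observed: str) -> float:
    """Percentage of residues whose predicted 8-state label matches the observed one."""
    if len(predicted) != len(observed):
        raise ValueError("label strings must have equal length")
    if not observed:
        raise ValueError("label strings must not be empty")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * matches / len(observed)

# Hypothetical DSSP-style labels: H, G, I, E, B, T, S, and '-' for everything else.
observed = "HHHHTT-EEEE"
predicted = "HHHGTT-EEE-"
print(round(q8_accuracy(predicted, observed), 1))  # 9 of 11 residues agree
```

Under this definition, confusing two helix types (H versus G) counts as an error, which is one reason 8-class scores are lower than 3-class scores on the same predictions.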

  15. The RCSB protein data bank: integrative view of protein, gene and 3D structural information.

    Science.gov (United States)

    Rose, Peter W; Prlić, Andreas; Altunkaya, Ali; Bi, Chunxiao; Bradley, Anthony R; Christie, Cole H; Costanzo, Luigi Di; Duarte, Jose M; Dutta, Shuchismita; Feng, Zukang; Green, Rachel Kramer; Goodsell, David S; Hudson, Brian; Kalro, Tara; Lowe, Robert; Peisach, Ezra; Randle, Christopher; Rose, Alexander S; Shao, Chenghua; Tao, Yi-Ping; Valasatava, Yana; Voigt, Maria; Westbrook, John D; Woo, Jesse; Yang, Huangwang; Young, Jasmine Y; Zardecki, Christine; Berman, Helen M; Burley, Stephen K

    2017-01-04

    The Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB, http://rcsb.org), the US data center for the global PDB archive, makes PDB data freely available to all users, from structural biologists to computational biologists and beyond. New tools and resources have been added to the RCSB PDB web portal in support of a 'Structural View of Biology.' Recent developments have improved the user experience, including the high-speed NGL Viewer that provides 3D molecular visualization in any web browser, improved support for data file download and enhanced organization of website pages for query, reporting and individual structure exploration. Structure validation information is now visible for all archival entries. PDB data have been integrated with external biological resources, including chromosomal position within the human genome; protein modifications; and metabolic pathways. PDB-101 educational materials have been reorganized into a searchable website and expanded to include new features such as the Geis Digital Archive. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

  16. Structure refinement of flexible proteins using dipolar couplings: Application to the protein p8MTCP1

    International Nuclear Information System (INIS)

    Demene, Helene; Ducat, Thierry; Barthe, Philippe; Delsuc, Marc-Andre; Roumestand, Christian

    2002-01-01

    The present study deals with the relevance of using mobility-averaged dipolar couplings for the structure refinement of flexible proteins. The 68-residue protein p8MTCP1 has been chosen as model for this study. Its solution state consists mainly of three α-helices. The two N-terminal helices are strapped in a well-determined α-hairpin, whereas, due to an intrinsic mobility, the position of the third helix is less well defined in the NMR structure. To further characterize the degrees of freedom of this helix, we have measured the dipolar coupling constants in the backbone of p8MTCP1 in a bicellar medium. We show here that including D_HN^dip dipolar couplings in the structure calculation protocol improves the structure of the α-hairpin but not the positioning of the third helix. This is due to the motional averaging of the dipolar couplings measured in the last helix. Performing two calculations with different force constants for the dipolar restraints highlights the inconstancy of these mobility-averaged dipolar couplings. Alternatively, prior to any structure calculations, comparing the values of the dipolar couplings measured in helix III to values back-calculated from an ideal helix demonstrates that they are atypical for a helix. This can be partly attributed to mobility effects since the inclusion of the 15N relaxation derived order parameter allows for a better fit

  17. Bluetongue virus non-structural protein 1 is a positive regulator of viral protein synthesis

    Directory of Open Access Journals (Sweden)

    Boyce Mark

    2012-08-01

    Full Text Available Abstract Background Bluetongue virus (BTV) is a double-stranded RNA (dsRNA) virus of the Reoviridae family, which encodes its genes in ten linear dsRNA segments. BTV mRNAs are synthesised by the viral RNA-dependent RNA polymerase (RdRp) as exact plus sense copies of the genome segments. Infection of mammalian cells with BTV rapidly replaces cellular protein synthesis with viral protein synthesis, but the regulation of viral gene expression in the Orbivirus genus has not been investigated. Results Using an mRNA reporter system based on genome segment 10 of BTV fused with GFP we identify the protein characteristic of this genus, non-structural protein 1 (NS1) as sufficient to upregulate translation. The wider applicability of this phenomenon among the viral genes is demonstrated using the untranslated regions (UTRs) of BTV genome segments flanking the quantifiable Renilla luciferase ORF in chimeric mRNAs. The UTRs of viral mRNAs are shown to be determinants of the amount of protein synthesised, with the pre-expression of NS1 increasing the quantity in each case. The increased expression induced by pre-expression of NS1 is confirmed in virus infected cells by generating a replicating virus which expresses the reporter fused with genome segment 10, using reverse genetics. Moreover, NS1-mediated upregulation of expression is restricted to mRNAs which lack the cellular 3′ poly(A) sequence identifying the 3′ end as a necessary determinant in specifically increasing the translation of viral mRNA in the presence of cellular mRNA. Conclusions NS1 is identified as a positive regulator of viral protein synthesis. We propose a model of translational regulation where NS1 upregulates the synthesis of viral proteins, including itself, and creates a positive feedback loop of NS1 expression, which rapidly increases the expression of all the viral proteins. The efficient translation of viral reporter mRNAs among cellular mRNAs can account for the observed

  18. Bluetongue virus non-structural protein 1 is a positive regulator of viral protein synthesis.

    Science.gov (United States)

    Boyce, Mark; Celma, Cristina C P; Roy, Polly

    2012-08-29

    Bluetongue virus (BTV) is a double-stranded RNA (dsRNA) virus of the Reoviridae family, which encodes its genes in ten linear dsRNA segments. BTV mRNAs are synthesised by the viral RNA-dependent RNA polymerase (RdRp) as exact plus sense copies of the genome segments. Infection of mammalian cells with BTV rapidly replaces cellular protein synthesis with viral protein synthesis, but the regulation of viral gene expression in the Orbivirus genus has not been investigated. Using an mRNA reporter system based on genome segment 10 of BTV fused with GFP we identify the protein characteristic of this genus, non-structural protein 1 (NS1) as sufficient to upregulate translation. The wider applicability of this phenomenon among the viral genes is demonstrated using the untranslated regions (UTRs) of BTV genome segments flanking the quantifiable Renilla luciferase ORF in chimeric mRNAs. The UTRs of viral mRNAs are shown to be determinants of the amount of protein synthesised, with the pre-expression of NS1 increasing the quantity in each case. The increased expression induced by pre-expression of NS1 is confirmed in virus infected cells by generating a replicating virus which expresses the reporter fused with genome segment 10, using reverse genetics. Moreover, NS1-mediated upregulation of expression is restricted to mRNAs which lack the cellular 3' poly(A) sequence identifying the 3' end as a necessary determinant in specifically increasing the translation of viral mRNA in the presence of cellular mRNA. NS1 is identified as a positive regulator of viral protein synthesis. We propose a model of translational regulation where NS1 upregulates the synthesis of viral proteins, including itself, and creates a positive feedback loop of NS1 expression, which rapidly increases the expression of all the viral proteins. The efficient translation of viral reporter mRNAs among cellular mRNAs can account for the observed replacement of cellular protein synthesis with viral protein

  19. Ice cream structure modification by ice-binding proteins.

    Science.gov (United States)

    Kaleda, Aleksei; Tsanev, Robert; Klesment, Tiina; Vilu, Raivo; Laos, Katrin

    2018-04-25

    Ice-binding proteins (IBPs), also known as antifreeze proteins, were added to ice cream to investigate their effect on structure and texture. Ice recrystallization inhibition was assessed in the ice cream mixes using a novel accelerated microscope assay and the ice cream microstructure was studied using an ice crystal dispersion method. It was found that adding recombinantly produced fish type III IBPs at a concentration of 3 mg·L⁻¹ made ice cream hard and crystalline with improved shape preservation during melting. Ice creams made with IBPs (both from winter rye, and type III IBP) had aggregates of ice crystals that entrapped pockets of the ice cream mixture in a rigid network. Larger individual ice crystals and no entrapment were observed in control ice creams. Based on these results, a model of ice crystal aggregate formation in the presence of IBPs was proposed. Copyright © 2017 Elsevier Ltd. All rights reserved.

  20. Determination of structural fluctuations of proteins from structure-based calculations of residual dipolar couplings

    International Nuclear Information System (INIS)

    Montalvao, Rinaldo W.; De Simone, Alfonso; Vendruscolo, Michele

    2012-01-01

    Residual dipolar couplings (RDCs) have the potential of providing detailed information about the conformational fluctuations of proteins. It is very challenging, however, to extract such information because of the complex relationship between RDCs and protein structures. A promising approach to decode this relationship involves structure-based calculations of the alignment tensors of protein conformations. By implementing this strategy to generate structural restraints in molecular dynamics simulations we show that it is possible to extract effectively the information provided by RDCs about the conformational fluctuations in the native states of proteins. The approach that we present can be used in a wide range of alignment media, including Pf1, charged bicelles and gels. The accuracy of the method is demonstrated by the analysis of the Q factors for RDCs not used as restraints in the calculations, which are significantly lower than those corresponding to existing high-resolution structures and structural ensembles, hence showing that we capture effectively the contributions to RDCs from conformational fluctuations.
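
The abstract above judges accuracy by the Q factors of RDCs left out of the restraints. One commonly used definition is the root-mean-square difference between observed and back-calculated couplings divided by the root-mean-square of the observed couplings; the abstract does not state which normalization the authors used, so the sketch below is an assumption, and the function name and coupling values are hypothetical.

```python
import math

def rdc_q_factor(observed, calculated):
    """Q = rms(D_obs - D_calc) / rms(D_obs); lower values mean better agreement."""
    if len(observed) != len(calculated) or not observed:
        raise ValueError("need two non-empty lists of equal length")
    squared_error = sum((o - c) ** 2 for o, c in zip(observed, calculated))
    squared_signal = sum(o ** 2 for o in observed)
    if squared_signal == 0:
        raise ValueError("observed couplings are all zero")
    # The 1/N factors of the two rms values cancel, so plain sums suffice.
    return math.sqrt(squared_error / squared_signal)

# Hypothetical couplings in Hz for three bond vectors.
d_obs = [10.0, -5.0, 3.0]
d_calc = [9.0, -4.0, 3.0]
print(round(rdc_q_factor(d_obs, d_calc), 3))
```

Evaluating this on couplings that were withheld from the restraints, as the abstract describes, is a form of cross-validation: a low Q then reflects predictive agreement rather than fitting.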

  1. Protein NMR Structures Refined with Rosetta Have Higher Accuracy Relative to Corresponding X-ray Crystal Structures

    Science.gov (United States)

    2014-01-01

    We have found that refinement of protein NMR structures using Rosetta with experimental NMR restraints yields more accurate protein NMR structures than those that have been deposited in the PDB using standard refinement protocols. Using 40 pairs of NMR and X-ray crystal structures determined by the Northeast Structural Genomics Consortium, for proteins ranging in size from 5–22 kDa, restrained Rosetta refined structures fit better to the raw experimental data, are in better agreement with their X-ray counterparts, and have better phasing power compared to conventionally determined NMR structures. For 37 proteins for which NMR ensembles were available and which had similar structures in solution and in the crystal, all of the restrained Rosetta refined NMR structures were sufficiently accurate to be used for solving the corresponding X-ray crystal structures by molecular replacement. The protocol for restrained refinement of protein NMR structures was also compared with restrained CS-Rosetta calculations. For proteins smaller than 10 kDa, restrained CS-Rosetta, starting from extended conformations, provides slightly more accurate structures, while for proteins in the size range of 10–25 kDa the less CPU intensive restrained Rosetta refinement protocols provided equally or more accurate structures. The restrained Rosetta protocols described here can improve the accuracy of protein NMR structures and should find broad and general use in studies of protein structure and function. PMID:24392845

  2. NMR structure of the protein NP-247299.1: comparison with the crystal structure

    International Nuclear Information System (INIS)

    Jaudzems, Kristaps; Geralt, Michael; Serrano, Pedro; Mohanty, Biswaranjan; Horst, Reto; Pedrini, Bill; Elsliger, Marc-André; Wilson, Ian A.; Wüthrich, Kurt

    2010-01-01

    Comparison of the NMR and crystal structures of a protein determined using largely automated methods has enabled the interpretation of local differences in the highly similar structures. These differences are found in segments of higher B values in the crystal and correlate with dynamic processes on the NMR chemical shift timescale observed in solution. The NMR structure of the protein NP-247299.1 in solution at 313 K has been determined and is compared with the X-ray crystal structure, which was also solved in the Joint Center for Structural Genomics (JCSG) at 100 K and at 1.7 Å resolution. Both structures were obtained using the current largely automated crystallographic and solution NMR methods used by the JCSG. This paper assesses the accuracy and precision of the results from these recently established automated approaches, aiming for quantitative statements about the location of structure variations that may arise from either one of the methods used or from the different environments in solution and in the crystal. To evaluate the possible impact of the different software used for the crystallographic and the NMR structure determinations and analysis, the concept is introduced of reference structures, which are computed using the NMR software with input of upper-limit distance constraints derived from the molecular models representing the results of the two structure determinations. The use of this new approach is explored to quantify global differences that arise from the different methods of structure determination and analysis versus those that represent interesting local variations or dynamics. The near-identity of the protein core in the NMR and crystal structures thus provided a basis for the identification of complementary information from the two different methods. 
It was thus observed that locally increased crystallographic B values correlate with dynamic structural polymorphisms in solution, including that the solution state of the protein involves

  3. Biophysical and structural considerations for protein sequence evolution

    Directory of Open Access Journals (Sweden)

    Grahnen Johan A

    2011-12-01

    Full Text Available Abstract Background Protein sequence evolution is constrained by the biophysics of folding and function, causing interdependence between interacting sites in the sequence. However, current site-independent models of sequence evolution do not take this into account. Recent attempts to integrate the influence of structure and biophysics into phylogenetic models via statistical/informational approaches have not resulted in expected improvements in model performance. This suggests that further innovations are needed for progress in this field. Results Here we develop a coarse-grained physics-based model of protein folding and binding function, and compare it to a popular informational model. We find that both models violate the assumption of the native sequence being close to a thermodynamic optimum, causing directional selection away from the native state. Sampling and simulation show that the physics-based model is more specific for fold-defining interactions that vary less among residue type. The informational model diffuses further in sequence space with fewer barriers and tends to provide less support for an invariant sites model, although amino acid substitutions are generally conservative. Both approaches produce sequences with natural features like dN/dS Conclusions Simple coarse-grained models of protein folding can describe some natural features of evolving proteins but are currently not accurate enough to use in evolutionary inference. This is partly due to improper packing of the hydrophobic core. We suggest possible improvements on the representation of structure, folding energy, and binding function, as regards both native and non-native conformations, and describe a large number of possible applications for such a model.

  4. The construction of an amino acid network for understanding protein structure and function.

    Science.gov (United States)

    Yan, Wenying; Zhou, Jianhong; Sun, Maomin; Chen, Jiajia; Hu, Guang; Shen, Bairong

    2014-06-01

    Amino acid networks (AANs) are undirected networks consisting of amino acid residues and their interactions in three-dimensional protein structures. The analysis of AANs provides novel insight into protein science, and several common amino acid network properties have revealed diverse classes of proteins. In this review, we first summarize methods for the construction and characterization of AANs. We then compare software tools for the construction and analysis of AANs. Finally, we review the application of AANs for understanding protein structure and function, including the identification of functional residues, the prediction of protein folding, analyzing protein stability and protein-protein interactions, and for understanding communication within and between proteins.
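
The abstract above defines an AAN as an undirected network of residues and their interactions in a 3D structure. A minimal sketch of one common construction, assuming one representative atom per residue (for example Cα) and a distance cutoff, is given below. The 7.0 Å cutoff, the function name `build_aan`, and the toy coordinates are assumptions for illustration; the review compares several construction methods and does not prescribe this one.

```python
import math
from itertools import combinations

def build_aan(coords, cutoff=7.0):
    """Build an undirected residue contact network.

    coords: dict mapping residue id -> (x, y, z) of one representative atom, in Å.
    cutoff: two residues are linked when their distance is <= cutoff (assumed value).
    Returns a dict mapping each residue id to the set of its neighbours.
    """
    network = {res: set() for res in coords}
    for a, b in combinations(coords, 2):
        if math.dist(coords[a], coords[b]) <= cutoff:
            network[a].add(b)
            network[b].add(a)
    return network

# Toy chain of four residues spaced 3.8 Å apart along the x axis.
toy_coords = {i: (3.8 * (i - 1), 0.0, 0.0) for i in range(1, 5)}
aan = build_aan(toy_coords)
degrees = {res: len(neighbours) for res, neighbours in aan.items()}
print(degrees)
```

Node degree, as computed here, is the simplest of the network properties such analyses rely on; measures like betweenness or clustering coefficient are computed from the same adjacency structure.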

  5. Some Recent Developments in Structure and Glassy Behavior of Proteins

    Science.gov (United States)

    Hu, Chin-Kun

    2012-02-01

    We have used ARVO, developed by us, to find that the ratio of volume to surface area of proteins in the Protein Data Bank is distributed within a very narrow range [1]. Such a result is useful for the determination of protein 3D structures. It has been widely known that a spin glass model can be used to understand the slow relaxation behavior of a glass at low temperatures [2]. We have used molecular dynamics and simple models of polymer chains to study relaxation and aggregation of proteins under various conditions and found that polymer chains with neighboring monomers connected by rigid bonds can relax very slowly and show glassy behavior [3]. We have also found that native collagen fibrils show glassy behavior at room temperature [4]. The results of [3] and [4] on the glassy behavior of polymers or proteins are useful for understanding the mechanism by which a biological system remains in a non-equilibrium state, including the ancient seed [5], which can remain in a non-equilibrium state for a very long time. [1] M.-C. Wu, M. S. Li, W.-J. Ma, M. Kouza, and C.-K. Hu, EPL, in press (2011); [2] C. Dasgupta, S.-K. Ma, and C.-K. Hu, Phys. Rev. B 20, 3837-3849 (1979); [3] W.-J. Ma and C.-K. Hu, J. Phys. Soc. Japan 79, 024005, 024006, 054001, and 104002 (2010), C.-K. Hu and W.-J. Ma, Prog. Theor. Phys. Supp. 184, 369 (2010); [4] S. G. Gevorkian, A. E. Allahverdyan, D. S. Gevorgyan and C.-K. Hu, EPL 95, 23001 (2011); [5] S. Sallon, et al., Science 320, 1464 (2008).

  6. Heparan sulfate proteoglycans: structure, protein interactions and cell signaling

    Directory of Open Access Journals (Sweden)

    Juliana L. Dreyfuss

    2009-09-01

    Full Text Available Heparan sulfate proteoglycans are ubiquitously found at the cell surface and extracellular matrix in all animal species. This review will focus on the structural characteristics of the heparan sulfate proteoglycans related to protein interactions leading to cell signaling. The heparan sulfate chains, due to their vast structural diversity, are able to bind and interact with a wide variety of proteins, such as growth factors, chemokines, morphogens, extracellular matrix components, and enzymes, among others. There is a specificity directing the interactions of heparan sulfates and target proteins, regarding both the fine structure of the polysaccharide chain as well as precise protein motifs. Heparan sulfates play a role in cellular signaling either as receptor or co-receptor for different ligands, and the activation of downstream pathways is related to phosphorylation of different cytosolic proteins either directly or involving cytoskeleton interactions leading to gene regulation. The role of the heparan sulfate proteoglycans in cellular signaling and endocytic uptake pathways is also discussed.

  7. Validation of Molecular Dynamics Simulations for Prediction of Three-Dimensional Structures of Small Proteins.

    Science.gov (United States)

    Kato, Koichi; Nakayoshi, Tomoki; Fukuyoshi, Shuichi; Kurimoto, Eiji; Oda, Akifumi

    2017-10-12

    Although various higher-order protein structure prediction methods have been developed, almost all of them were developed based on the three-dimensional (3D) structure information of known proteins. Here we predicted the short protein structures by molecular dynamics (MD) simulations in which only Newton's equations of motion were used and 3D structural information of known proteins was not required. To evaluate the ability of MD simulation to predict protein structures, we calculated seven short test proteins (10-46 residues) in the denatured state and compared their predicted and experimental structures. The predicted structure for Trp-cage (20 residues) was close to the experimental structure by 200-ns MD simulation. For proteins shorter or longer than Trp-cage, root-mean-square deviation values were larger than those for Trp-cage. However, secondary structures could be reproduced by MD simulations for proteins with 10-34 residues. Simulations by replica exchange MD were performed, but the results were similar to those from normal MD simulations. These results suggest that normal MD simulations can roughly predict short protein structures and 200-ns simulations are frequently sufficient for estimating the secondary structures of proteins (approximately 20 residues). Structure prediction methods using only fundamental physical laws are useful for investigating non-natural proteins, such as primitive proteins and artificial proteins for peptide-based drug delivery systems.

  8. Cross-over between discrete and continuous protein structure space: insights into automatic classification and networks of protein structures.

    Directory of Open Access Journals (Sweden)

    Alberto Pascual-García

    2009-03-01

    Full Text Available Structural classifications of proteins assume the existence of the fold, which is an intrinsic equivalence class of protein domains. Here, we test in which conditions such an equivalence class is compatible with objective similarity measures. We base our analysis on the transitive property of the equivalence relationship, requiring that similarity of A with B and B with C implies that A and C are also similar. Divergent gene evolution leads us to expect that the transitive property should approximately hold. However, if protein domains are a combination of recurrent short polypeptide fragments, as proposed by several authors, then similarity of partial fragments may violate the transitive property, favouring the continuous view of the protein structure space. We propose a measure to quantify the violations of the transitive property when a clustering algorithm joins elements into clusters, and we find out that such violations present a well defined and detectable cross-over point, from an approximately transitive regime at high structure similarity to a regime with large transitivity violations and large differences in length at low similarity. We argue that protein structure space is discrete and hierarchic classification is justified up to this cross-over point, whereas at lower similarities the structure space is continuous and it should be represented as a network. We have tested the qualitative behaviour of this measure, varying all the choices involved in the automatic classification procedure, i.e., domain decomposition, alignment algorithm, similarity score, and clustering algorithm, and we have found out that this behaviour is quite robust. The final classification depends on the chosen algorithms. We used the values of the clustering coefficient and the transitivity violations to select the optimal choices among those that we tested. 
Interestingly, this criterion also favours the agreement between automatic and expert classifications
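
The transitivity requirement discussed in the abstract above (similarity of A with B and of B with C should imply similarity of A with C) can be checked directly on a table of pairwise scores. The sketch below simply counts violating triples at a given similarity threshold. It is an illustration of the property only, not the clustering-based measure the authors propose, and the function name, scores, and thresholds are hypothetical.

```python
from itertools import combinations

def transitivity_violations(items, similarity, threshold):
    """Count triples (A, B, C) where A~B and B~C hold but A~C fails.

    similarity: dict mapping frozenset({x, y}) -> score for every pair of items.
    Two items count as similar when their score is >= threshold.
    """
    def similar(x, y):
        return similarity[frozenset((x, y))] >= threshold

    violations = 0
    for middle in items:
        others = [x for x in items if x != middle]
        for a, c in combinations(others, 2):
            if similar(a, middle) and similar(middle, c) and not similar(a, c):
                violations += 1
    return violations

# Hypothetical scores: A and C each resemble B but not each other.
scores = {
    frozenset(("A", "B")): 0.9,
    frozenset(("B", "C")): 0.8,
    frozenset(("A", "C")): 0.2,
}
strict = transitivity_violations(["A", "B", "C"], scores, threshold=0.5)
loose = transitivity_violations(["A", "B", "C"], scores, threshold=0.1)
print(strict, loose)
```

Sweeping the threshold and watching where the violation count starts to grow is one simple way to look for the kind of cross-over between a discrete and a continuous regime that the abstract describes.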

  9. Protein structural changes during processing of vegetable feed ingredients used in swine diets

    NARCIS (Netherlands)

    Salazar-Villanea, S.; Hendriks, W.H.; Bruininx, E.M.A.M.; Gruppen, H.; Poel, Van Der A.F.B.

    2016-01-01

    Protein structure influences the accessibility of enzymes for digestion. The proportion of intramolecular β-sheets in the secondary structure of native proteins has been related to a decrease in protein digestibility. Changes to proteins that can be considered positive (for example, denaturation

  10. Cleft analysis of Zika virus non-structural protein 1

    Institute of Scientific and Technical Information of China (English)

    Somsri Wiwanitkit; Viroj Wiwanitkit

    2017-01-01

    The non-structural protein 1 (NS1) is an important molecule of viruses in the flavivirus group, including Zika virus. Recently, the NS1 of Zika virus was discovered. There is still no complete information on the molecular interactions of NS1 of Zika virus, which could provide clues to its pathogenesis and guide further drug search. Here the authors report the cleft analysis of NS1 of Zika virus; the result can be useful for the future development of good diagnostic tools and antiviral drug finding for the management of Zika virus.

  11. Cleft analysis of Zika virus non-structural protein 1

    Directory of Open Access Journals (Sweden)

    Somsri Wiwanitkit

    2017-08-01

    The non-structural protein 1 (NS1) is an important molecule of viruses in the flavivirus group, including Zika virus. Recently, the NS1 of Zika virus was discovered. There is still no complete information on the molecular interactions of NS1 of Zika virus, which could provide clues to its pathogenesis and guide further drug search. Here the authors report the cleft analysis of NS1 of Zika virus; the result can be useful for the future development of good diagnostic tools and antiviral drug finding for the management of Zika virus.

  12. Cleft analysis of Zika virus non-structural protein 1

    OpenAIRE

    Somsri Wiwanitkit; Viroj Wiwanitkit

    2017-01-01

    The non-structural protein 1 (NS1) is an important molecule of viruses in the flavivirus group, including Zika virus. Recently, the NS1 of Zika virus was discovered. There is still no complete information on the molecular interactions of NS1 of Zika virus, which could provide clues to its pathogenesis and guide further drug search. Here the authors report the cleft analysis of NS1 of Zika virus; the result can be useful for the future development of good diagnostic tools and antiviral drug fin...

  13. Structural studies on proton/protonation of the protein molecule

    International Nuclear Information System (INIS)

    Morimoto, Yukio; Kida, Akiko; Chatake, Toshiyuki; Yamaguchi, Hiroshi; Hosokawa, Keiichi; Murakami, Takuto; Umino, Masaaki; Tanaka, Ichiro; Hisatome, Ichiro; Yanagisawa, Yasutake; Fujiwara, Satoshi; Hidaka, Yuji; Shimamoto, Shigeru; Fujiwara, Mitsutoshi; Nakanishi, Takeyoshi

    2015-01-01

    This paper reports three studies involved in the analysis of protons and protonation at physiologically active sites in protein molecules. (1) 'Elucidation of the higher-order structure formation and activity performing mechanism of yeast proteasome.' With the aim of application to anti-cancer drugs, this study performed shape analysis of the total structure of the 26S proteasome using small-angle X-ray scattering to clarify the complex form in which controlling elements are bound to both ends of the 20S catalytic body, and analyzed the structure of the complex between the active sites of 20S and an inhibitor (drug). (2) 'Basic study on the neutron experiment of biomolecules such as physiologically active substances derived from Natto-bacteria.' This study conducted the purification, crystallization, and X-ray analysis of nattokinase; the high-grade purification and solution experiment of vitamin K2 (menaquinone-7); and a Z-DNA crystal structure study related to the neutron crystal analysis of DNA as another biomolecule structure study. (3) 'Functional evaluation on digestive enzymes derived from Nephila clavata.' As an Alzheimer's disease-related amyloid fibril formation model, this study elucidated the fibrosis and fiber-forming mechanism of the traction fiber of Nephila clavata, and carried out the functional analysis of its degrading enzyme. (A.O.)

  14. Structure-based inference of molecular functions of proteins of unknown function from Berkeley Structural Genomics Center

    Energy Technology Data Exchange (ETDEWEB)

    Kim, Sung-Hou; Shin, Dong Hae; Hou, Jingtong; Chandonia, John-Marc; Das, Debanu; Choi, In-Geol; Kim, Rosalind; Kim, Sung-Hou

    2007-09-02

    Advances in sequence genomics have resulted in an accumulation of a huge number of protein sequences derived from genome sequences. However, the functions of a large portion of them cannot be inferred based on the current methods of sequence homology detection to proteins of known functions. Three-dimensional structure can have an important impact in providing inference of molecular function (physical and chemical function) of a protein of unknown function. Structural genomics centers worldwide have been determining many 3-D structures of the proteins of unknown functions, and possible molecular functions of them have been inferred based on their structures. Combined with bioinformatics and enzymatic assay tools, the successful acceleration of the process of protein structure determination through high throughput pipelines enables the rapid functional annotation of a large fraction of hypothetical proteins. We present a brief summary of the process we used at the Berkeley Structural Genomics Center to infer molecular functions of proteins of unknown function.

  15. Phylogenetic continuum indicates "galaxies" in the protein universe: preliminary results on the natural group structures of proteins.

    Science.gov (United States)

    Ladunga, I

    1992-04-01

    The markedly nonuniform, even systematic distribution of sequences in the protein "universe" has been analyzed by methods of protein taxonomy. Mapping of the natural hierarchical system of proteins has revealed some dense cores, i.e., well-defined clusterings of proteins that seem to be natural structural groupings, possibly seeds for a future protein taxonomy. The aim was not to force proteins into more or less man-made categories by discriminant analysis, but to find structurally similar groups, possibly of common evolutionary origin. Single-valued distance measures between pairs of superfamilies from the Protein Identification Resource were defined by two χ²-like methods on tripeptide frequencies and the variable-length subsequence identity method derived from dot-matrix comparisons. Distance matrices were processed by several methods of cluster analysis to detect phylogenetic continuum between highly divergent proteins. Only well-defined clusters characterized by relatively unique structural, intracellular environmental, organismal, and functional attribute states were selected as major protein groups, including subsets of viral and Escherichia coli proteins, hormones, inhibitors, plant, ribosomal, serum and structural proteins, amino acid synthases, and clusters dominated by certain oxidoreductases and apolar and DNA-associated enzymes. The limited repertoire of functional patterns due to small genome size, the high rate of recombination, specific features of the bacterial membranes, or of the virus cycle canalize certain proteins of viruses and Gram-negative bacteria, respectively, to organismal groups.
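
    The abstract does not state the exact form of its chi-square-like distances, so the sketch below is an assumption: it uses one common symmetric chi-square form on normalized tripeptide frequencies to show how a single-valued distance between two sequences could be obtained. The function names are hypothetical.

```python
from collections import Counter

def tripeptide_counts(seq):
    """Count overlapping tripeptides (3-residue windows) in a sequence."""
    return Counter(seq[i:i + 3] for i in range(len(seq) - 2))

def chi2_like_distance(seq_a, seq_b):
    """Symmetric chi-square-like distance between tripeptide frequency profiles.

    Sums (fa - fb)^2 / (fa + fb) over every tripeptide seen in either
    sequence, where fa and fb are relative frequencies. The value is 0 for
    identical profiles and 2 for profiles with no tripeptide in common.
    This formula is an illustrative choice, not taken from the paper.
    """
    ca, cb = tripeptide_counts(seq_a), tripeptide_counts(seq_b)
    na, nb = sum(ca.values()), sum(cb.values())
    if na == 0 or nb == 0:
        raise ValueError("sequences must have at least three residues")
    dist = 0.0
    for tri in ca.keys() | cb.keys():
        fa, fb = ca[tri] / na, cb[tri] / nb
        dist += (fa - fb) ** 2 / (fa + fb)
    return dist

print(chi2_like_distance("ACDEFGHIK", "ACDEFGHIK"))
print(chi2_like_distance("AAAA", "CCCC"))
```

    A matrix of such pairwise distances is the kind of input that standard cluster-analysis methods take.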

  16. Structure of the ordered hydration of amino acids in proteins: analysis of crystal structures

    Czech Academy of Sciences Publication Activity Database

    Biedermannová, Lada; Schneider, Bohdan

    2015-01-01

    Vol. 71, No. 11 (2015), pp. 2178-2202 ISSN 1399-0047 Institutional support: RVO:86652036 Keywords: protein hydration * structural biology * X-ray crystallography Subject RIV: EB - Genetics; Molecular Biology Impact factor: 2.674, year: 2014

  17. Structural and Biochemical Studies of LysM Proteins

    DEFF Research Database (Denmark)

    Wong, Mei Mei Jaslyn Elizabeth

    2017-01-01

    . Most of the signalling components in the Nod factor signalling pathway have been identified through genetic approaches. The current symbiosis signalling model, however, lacks components that could link Nod factor perception at the plasma membrane to downstream responses, such as calcium influx and perinuclear calcium...... involved in peptidoglycan hydrolysis; the Cell Wall Lytic enzyme associated with cell Separation (CwlS) from Bacillus subtilis, and P60_Tth from Thermus thermophilus. Biochemical studies conducted on purified CwlS showed that multiple LysM modules function cooperatively to bind N-acetylglucosamine (NAG......-induced intermolecular dimerization was observed in the co-crystal structure of P60_2LysM and NAG6. Until today, this is the only structural evidence illustrating intermolecular dimerization of LysM proteins. Intermolecular dimerization of plant LysM receptor kinases (RK) has been proposed as a mechanism...

  18. Confocal imaging of protein distributions in porous silicon optical structures

    International Nuclear Information System (INIS)

    De Stefano, Luca; D'Auria, Sabato

    2007-01-01

    The performance of porous silicon optical biosensors depends strongly on the arrangement of the biological probes in their sponge-like structures: it is well known that in this case the sensing species do not fill the pores but instead cover their internal surface. In this paper, the direct imaging of labelled proteins in different porous silicon structures using a confocal laser microscope is reported. The distribution of the biological matter in the nanostructured material follows a Gaussian behaviour, which is typical of the diffusion process in porous media, but with substantial differences between a porous silicon monolayer and a multilayer such as a Bragg mirror. Even if semi-quantitative, the results can be very useful in the design of porous silicon based biosensing devices.

  19. Hantaviral proteins: structure, functions and role in hantavirus infection

    Directory of Open Access Journals (Sweden)

    Musalwa Muyangwa

    2015-11-01

    Hantaviruses are members of the family Bunyaviridae that are naturally maintained in the populations of small mammals, mostly rodents. Most of these viruses can easily infect humans through contact with aerosols or dust generated by contaminated animal waste products. Depending on the particular hantavirus involved, human infection could result in either Hemorrhagic Fever with Renal Syndrome (HFRS) or Hantavirus Cardiopulmonary Syndrome (HCPS). In the past few years, clinical cases of hantavirus-caused diseases have been on the rise. Understanding the structure of the hantavirus genome and the functions of the key viral proteins is critical for research on therapeutic agents. This paper gives a brief overview of the current knowledge on the structure and properties of the hantavirus nucleoprotein and the glycoproteins.

  20. An automated approach to network features of protein structure ensembles

    Science.gov (United States)

    Bhattacharyya, Moitrayee; Bhat, Chanda R; Vishveshwara, Saraswathi

    2013-01-01

    Network theory applied to protein structures provides insights into numerous problems of biological relevance. The explosion in structural data available from PDB and simulations establishes a need to introduce an efficient standalone program that assembles network concepts/parameters under one hood in an automated manner. Herein, we discuss the development/application of an exhaustive, user-friendly, standalone program package named PSN-Ensemble, which can handle structural ensembles generated through molecular dynamics (MD) simulation/NMR studies or from multiple X-ray structures. The novelty in network construction lies in the explicit consideration of side-chain interactions among amino acids. The program evaluates network parameters dealing with topological organization and long-range allosteric communication. The introduction of a flexible weighting scheme in terms of residue pairwise cross-correlation/interaction energy in PSN-Ensemble brings dynamical/chemical knowledge into the network representation. Also, the results are mapped on a graphical display of the structure, allowing easy access to network analysis for the general biological community. The potential of PSN-Ensemble for examining structural ensembles is exemplified using MD trajectories of a ubiquitin-conjugating enzyme (UbcH5b). Furthermore, insights derived from network parameters evaluated using PSN-Ensemble for single static structures of active/inactive states of β2-adrenergic receptor and the ternary tRNA complexes of tyrosyl tRNA synthetases (from organisms across kingdoms) are discussed. PSN-Ensemble is freely available from http://vishgraph.mbu.iisc.ernet.in/PSN-Ensemble/psn_index.html. PMID:23934896
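
    As a rough illustration of what a protein structure network is, the sketch below builds a residue contact graph from atomic coordinates using a plain distance cutoff. This is a deliberately simplified stand-in and not the PSN-Ensemble algorithm or its interface: the cutoff value, the sequence-separation filter, and the function names are all assumptions made for the example.

```python
import math
from itertools import combinations

def contact_network(residues, cutoff=4.5, min_separation=2):
    """Build an undirected residue contact network.

    residues: list where entry i holds the (x, y, z) coordinates of the
    atoms of residue i. Two residues are linked if any pair of their atoms
    lies within `cutoff` (same units as the coordinates) and they are at
    least `min_separation` positions apart in sequence.
    Returns a set of (i, j) index pairs with i < j.
    """
    edges = set()
    for i, j in combinations(range(len(residues)), 2):
        if j - i < min_separation:
            continue  # skip trivially adjacent residues
        if any(math.dist(a, b) <= cutoff
               for a in residues[i] for b in residues[j]):
            edges.add((i, j))
    return edges

def degrees(n_residues, edges):
    """Number of contacts per residue, a basic network parameter."""
    deg = [0] * n_residues
    for i, j in edges:
        deg[i] += 1
        deg[j] += 1
    return deg

# Hypothetical one-atom-per-residue toy coordinates.
toy = [[(0.0, 0.0, 0.0)], [(3.0, 0.0, 0.0)], [(4.0, 0.0, 0.0)], [(20.0, 0.0, 0.0)]]
edges = contact_network(toy)
print(edges, degrees(len(toy), edges))
```

    For an ensemble, the same construction could be repeated per frame and the edges weighted by how often each contact occurs, which is one simple way to carry dynamical information into the graph.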

  1. Protein Machineries Involved in the Attachment of Heme to Cytochrome c: Protein Structures and Molecular Mechanisms

    Directory of Open Access Journals (Sweden)

    Carlo Travaglini-Allocatelli

    2013-01-01

    Cytochromes c (Cyt c) are ubiquitous heme-containing proteins, mainly involved in electron transfer processes, whose structure and functions have been and still are intensely studied. Surprisingly, the molecular mechanism whereby the heme group is covalently attached to the apoprotein (apoCyt) in the cell is still largely unknown. This posttranslational process, known as Cyt c biogenesis or Cyt c maturation, ensures the stereospecific formation of the thioether bonds between the heme vinyl groups and the cysteine thiols of the apoCyt heme binding motif. To accomplish this task, prokaryotic and eukaryotic cells have evolved distinctive protein machineries composed of different proteins. In this review, the structural and functional properties of the main maturation apparatuses found in gram-negative and gram-positive bacteria and in the mitochondria of eukaryotic cells will be presented, dissecting the Cyt c maturation process into three functional steps: (i) heme translocation and delivery, (ii) apoCyt thioreductive pathway, and (iii) apoCyt chaperoning and heme ligation. Moreover, current hypotheses and open questions about the molecular mechanisms of each of the three steps will be discussed, with special attention to System I, the maturation apparatus found in gram-negative bacteria.

  2. Structure-sequence based analysis for identification of conserved regions in proteins

    Science.gov (United States)

    Zemla, Adam T; Zhou, Carol E; Lam, Marisa W; Smith, Jason R; Pardes, Elizabeth

    2013-05-28

    Disclosed are computational methods, and associated hardware and software products, for scoring conservation in a protein structure based on a computationally identified family or cluster of protein structures. A method of computationally identifying a family or cluster of protein structures is also disclosed herein.

  3. MolTalk--a programming library for protein structures and structure analysis.

    Science.gov (United States)

    Diemand, Alexander V; Scheib, Holger

    2004-04-19

    Two of the mostly unsolved but increasingly urgent problems for modern biologists are a) to quickly and easily analyse protein structures and b) to comprehensively mine the wealth of information, which is distributed along with the 3D co-ordinates by the Protein Data Bank (PDB). Tools which address this issue need to be highly flexible and powerful but at the same time must be freely available and easy to learn. We present MolTalk, an elaborate programming language, which consists of the programming library libmoltalk implemented in Objective-C and the Smalltalk-based interpreter MolTalk. MolTalk combines the advantages of an easy to learn and programmable procedural scripting with the flexibility and power of a full programming language. An overview of currently available applications of MolTalk is given and with PDBChainSaw one such application is described in more detail. PDBChainSaw is a MolTalk-based parser and information extraction utility of PDB files. Weekly updates of the PDB are synchronised with PDBChainSaw and are available for free download from the MolTalk project page http://www.moltalk.org following the link to PDBChainSaw. For each chain in a protein structure, PDBChainSaw extracts the sequence from its co-ordinates and provides additional information from the PDB-file header section, such as scientific organism, compound name, and EC code. MolTalk provides a rich set of methods to analyse and even modify experimentally determined or modelled protein structures. These methods vary in complexity and are thus suitable for beginners and advanced programmers alike. We envision MolTalk to be most valuable in the following applications: 1) To analyse protein structures repetitively in large-scale, i.e. to benchmark protein structure prediction methods or to evaluate structural models. The quality of the resulting 3D-models can be assessed by e.g. calculating a Ramachandran-Sasisekharan plot. 2) To quickly retrieve information for (a limited number of

  4. MolTalk – a programming library for protein structures and structure analysis

    Science.gov (United States)

    Diemand, Alexander V; Scheib, Holger

    2004-01-01

    Background Two of the mostly unsolved but increasingly urgent problems for modern biologists are a) to quickly and easily analyse