WorldWideScience

Sample records for template-based protein structure

  1. Memoir: template-based structure prediction for membrane proteins.

    Science.gov (United States)

    Ebejer, Jean-Paul; Hill, Jamie R; Kelm, Sebastian; Shi, Jiye; Deane, Charlotte M

    2013-07-01

    Membrane proteins are estimated to be the targets of 50% of drugs that are currently in development, yet we have few membrane protein crystal structures. As a result, for a membrane protein of interest, the much-needed structural information usually comes from a homology model. Current homology modelling software is optimized for globular proteins, and ignores the constraints that the membrane is known to place on protein structure. Our Memoir server produces homology models using alignment and coordinate generation software that has been designed specifically for transmembrane proteins. Memoir is easy to use, with the only inputs being a structural template and the sequence that is to be modelled. We provide a video tutorial and a guide to assessing model quality. Supporting data aid manual refinement of the models. These data include a set of alternative conformations for each modelled loop, and a multiple sequence alignment that incorporates the query and template. Memoir works with both α-helical and β-barrel types of membrane proteins and is freely available at http://opig.stats.ox.ac.uk/webapps/memoir.

  2. Effect of Using Suboptimal Alignments in Template-Based Protein Structure Prediction

    Science.gov (United States)

    Chen, Hao; Kihara, Daisuke

    2010-01-01

    Computational protein structure prediction remains a challenging task in protein bioinformatics. In recent years, the importance of template-based structure prediction has been increasing due to the growing number of protein structures solved by the structural genomics projects. To capitalize on the significant efforts and investments made in the structural genomics projects, it is urgent to establish effective ways to use the solved structures as templates by developing methods for exploiting remotely related proteins that cannot be simply identified by homology. In this work, we examine the effect of employing suboptimal alignments in template-based protein structure prediction. We showed that suboptimal alignments are often more accurate than the optimal one, and such accurate suboptimal alignments can occur even at a very low rank of the alignment score. Suboptimal alignments contain a significant number of correct amino acid residue contacts. Moreover, suboptimal alignments can improve template-based models when used as input to Modeller. Finally, we employ suboptimal alignments for handling a contact potential in a probabilistic way in a threading program, SUPRB. The probabilistic contacts strategy outperforms the partly thawed approach, which only uses the optimal alignment in defining residue contacts, and also the reranking strategy, which uses the contact potential in reranking alignments. The comparison with existing methods in the template-recognition test shows that SUPRB is very competitive and outperforms existing methods. PMID:21058297

  3. A Template-Based Protein Structure Reconstruction Method Using Deep Autoencoder Learning.

    Science.gov (United States)

    Li, Haiou; Lyu, Qiang; Cheng, Jianlin

    2016-12-01

    Protein structure prediction is an important problem in computational biology, and is widely applied to various biomedical problems such as protein function study, protein design, and drug design. In this work, we developed a novel deep learning approach based on a deeply stacked denoising autoencoder for protein structure reconstruction. We applied our approach to a template-based protein structure prediction using only the 3D structural coordinates of homologous template proteins as input. The templates were identified for a target protein by a PSI-BLAST search. 3DRobot (a program that automatically generates diverse and well-packed protein structure decoys) was used to generate initial decoy models for the target from the templates. A stacked denoising autoencoder was trained on the decoys to obtain a deep learning model for the target protein. The trained deep model was then used to reconstruct the final structural model for the target sequence. With target proteins that have highly similar template proteins as benchmarks, the GDT-TS score of the predicted structures is greater than 0.7, suggesting that the deep autoencoder is a promising method for protein structure reconstruction.
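    The GDT-TS figure quoted above can be illustrated with a short Python sketch. It is a simplified approximation, not the authors' evaluation code: it assumes the model and native Cα coordinates are already superimposed and in matching residue order, whereas the standard GDT-TS procedure searches over many superpositions for each cutoff, so this version can only underestimate the true score. The function name `gdt_ts` and its parameters are hypothetical.

```python
import math

def gdt_ts(model_ca, native_ca, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Approximate GDT-TS on a 0-1 scale for one fixed superposition.

    model_ca, native_ca: lists of (x, y, z) tuples in the same residue order
    (assumption: the structures have already been superimposed).
    """
    if len(model_ca) != len(native_ca) or not model_ca:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    # Per-residue C-alpha deviation in angstroms.
    distances = [math.dist(m, n) for m, n in zip(model_ca, native_ca)]
    # Fraction of residues under each distance cutoff.
    fractions = [sum(d <= c for d in distances) / len(distances) for c in cutoffs]
    # GDT-TS is the mean of the four fractions.
    return sum(fractions) / len(cutoffs)
```

    On this scale, a score above 0.7 means that, averaged over the four cutoffs, more than 70% of residues fall within the cutoff distance.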

  4. Template-based quaternary structure prediction of proteins using enhanced profile-profile alignments.

    Science.gov (United States)

    Nakamura, Tsukasa; Oda, Toshiyuki; Fukasawa, Yoshinori; Tomii, Kentaro

    2017-11-27

    Proteins often exist in their multimeric forms when they function as so-called biological assemblies consisting of the specific number and arrangement of protein subunits. Consequently, elucidating biological assemblies is necessary to improve understanding of protein function. Template-Based Modeling (TBM), based on known protein structures, has been used widely for protein structure prediction. Indeed, TBM has become an increasingly useful approach in recent years because of the increased amounts of information related to protein amino acid sequences and three-dimensional structures. An apparently similar situation exists for biological assembly structure prediction as protein complex structures in the PDB increase, although the inference of biological assemblies is not a trivial task. Many methods using TBM, including ours, have been developed for protein structure prediction. Using enhanced profile-profile alignments, we participated in the 12th Community Wide Experiment on the Critical Assessment of Techniques for Protein Structure Prediction (CASP12), as the FONT team (Group # 480). Herein, we present experimental procedures and results of retrospective analyses using our approach for the Quaternary Structure Prediction category of CASP12. We performed profile-profile alignments of several types, based on FORTE, our profile-profile alignment algorithm, to identify suitable templates. Results show that these alignment results enable us to find templates in almost all possible cases. Moreover, we have come to understand the necessity of developing a model selection method that provides improved accuracy. Results also demonstrate that, to some extent, finding templates of protein complexes is useful even for MEDIUM and HARD assembly prediction. © 2017 The Authors Proteins: Structure, Function and Bioinformatics Published by Wiley Periodicals, Inc.

  5. Template-based prediction of protein function.

    Science.gov (United States)

    Petrey, Donald; Chen, T Scott; Deng, Lei; Garzon, Jose Ignacio; Hwang, Howook; Lasso, Gorka; Lee, Hunjoong; Silkov, Antonina; Honig, Barry

    2015-06-01

    We discuss recent approaches for structure-based protein function annotation. We focus on template-based methods where the function of a query protein is deduced from that of a template for which both the structure and function are known. We describe the different ways of identifying a template. These are typically based on sequence analysis but new methods based on purely structural similarity are also being developed that allow function annotation based on structural relationships that cannot be recognized by sequence. The growing number of available structures of known function, improved homology modeling techniques and new developments in the use of structure allow template-based methods to be applied on a proteome-wide scale and in many different biological contexts. This progress significantly expands the range of applicability of structural information in function annotation to a level that previously was only achievable by sequence comparison. Copyright © 2015 Elsevier Ltd. All rights reserved.

  6. Template-based modeling and ab initio refinement of protein oligomer structures using GALAXY in CAPRI round 30.

    Science.gov (United States)

    Lee, Hasup; Baek, Minkyung; Lee, Gyu Rie; Park, Sangwoo; Seok, Chaok

    2017-03-01

    Many proteins function as homo- or hetero-oligomers; therefore, attempts to understand and regulate protein functions require knowledge of protein oligomer structures. The number of available experimental protein structures is increasing, and oligomer structures can be predicted using the experimental structures of related proteins as templates. However, template-based models may have errors due to sequence differences between the target and template proteins, which can lead to functional differences. Such structural differences may be predicted by loop modeling of local regions or refinement of the overall structure. In CAPRI (Critical Assessment of PRotein Interactions) round 30, we used recently developed features of the GALAXY protein modeling package, including template-based structure prediction, loop modeling, model refinement, and protein-protein docking to predict protein complex structures from amino acid sequences. Out of the 25 CAPRI targets, medium and acceptable quality models were obtained for 14 and 1 target(s), respectively, for which proper oligomer or monomer templates could be detected. Symmetric interface loop modeling on oligomer model structures successfully improved model quality, while loop modeling on monomer model structures failed. Overall refinement of the predicted oligomer structures consistently improved the model quality, in particular in interface contacts. Proteins 2017; 85:399-407. © 2016 Wiley Periodicals, Inc.

  7. LoopIng: a template-based tool for predicting the structure of protein loops.

    KAUST Repository

    Messih, Mario Abdel

    2015-08-06

    Predicting the structure of protein loops is very challenging, mainly because they are not necessarily subject to strong evolutionary pressure. This implies that, unlike the rest of the protein, standard homology modeling techniques are not very effective in modeling their structure. However, loops are often involved in protein function, hence inferring their structure is important for predicting protein structure as well as function. We describe a method, LoopIng, based on the Random Forest automated learning technique, which, given a target loop, selects a structural template for it from a database of loop candidates. Compared to the most recently available methods, LoopIng is able to achieve similar accuracy for short loops (4-10 residues) and significant enhancements for long loops (11-20 residues). The quality of the predictions is robust to errors that unavoidably affect the stem regions when these are modeled. The method returns a confidence score for the predicted template loops and has the advantage of being very fast (on average: 1 min/loop). LoopIng is available at www.biocomputing.it/looping (contact: anna.tramontano@uniroma1.it). Supplementary data are available at Bioinformatics online.

  8. Interplay of I-TASSER and QUARK for template-based and ab initio protein structure prediction in CASP10.

    Science.gov (United States)

    Zhang, Yang

    2014-02-01

    We develop and test a new pipeline in CASP10 to predict protein structures based on an interplay of I-TASSER and QUARK for both free-modeling (FM) and template-based modeling (TBM) targets. The most noteworthy observation is that sorting through the threading template pool using the QUARK-based ab initio models as probes allows the detection of distant-homology templates which might be ignored by the traditional sequence profile-based threading alignment algorithms. Further template assembly refinement by I-TASSER resulted in successful folding of two medium-sized FM targets with >150 residues. For TBM, the multiple threading alignments from LOMETS are, for the first time, incorporated into the ab initio QUARK simulations, which were further refined by I-TASSER assembly refinement. Compared with the traditional threading assembly refinement procedures, the inclusion of the threading-constrained ab initio folding models can consistently improve the quality of the full-length models as assessed by the GDT-HA and hydrogen-bonding scores. Despite the success, significant challenges still exist in domain boundary prediction and consistent folding of medium-size proteins (especially beta-proteins) for nonhomologous targets. Further developments of sensitive fold-recognition and ab initio folding methods are critical for solving these problems. Copyright © 2013 Wiley Periodicals, Inc.

  9. Template-based protein structure prediction in CASP11 and retrospect of I-TASSER in the last decade.

    Science.gov (United States)

    Yang, Jianyi; Zhang, Wenxuan; He, Baoji; Walker, Sara Elizabeth; Zhang, Hongjiu; Govindarajoo, Brandon; Virtanen, Jouko; Xue, Zhidong; Shen, Hong-Bin; Zhang, Yang

    2016-09-01

    We report the structure prediction results of a new composite pipeline for template-based modeling (TBM) in the 11th CASP experiment. Starting from multiple structure templates identified by LOMETS-based meta-threading programs, the QUARK ab initio folding program is extended to generate initial full-length models under strong constraints from template alignments. The final atomic models are then constructed by I-TASSER-based fragment reassembly simulations, followed by the fragment-guided molecular dynamics simulation and the MQAP-based model selection. It was found that the inclusion of QUARK-TBM simulations as an intermediate modeling step could help improve the quality of the I-TASSER models for both Easy and Hard TBM targets. Overall, the average TM-score of the first I-TASSER model is 12% higher than that of the best LOMETS templates, with the RMSD in the same threading-aligned regions reduced from 5.8 to 4.7 Å. Nevertheless, for nearly 18% of TBM domains the templates were deteriorated by the structure assembly pipeline, which may be attributed to the errors of secondary structure and domain orientation predictions that propagate through and degrade the procedures of template identification and final model selections. To examine the record of progress, we made a retrospective report of the I-TASSER pipeline in the last five CASP experiments (CASP7-11). The data show no clear progress of the LOMETS threading programs over PSI-BLAST, but obvious progress on structural improvement relative to threading templates was witnessed in recent CASP experiments, which is probably attributable to the integration of the extended ab initio folding simulation with the threading assembly pipeline and the introduction of atomic-level structure refinements following the reduced modeling simulations. Proteins 2016; 84(Suppl 1):233-246. © 2015 Wiley Periodicals, Inc.

  10. Massive integration of diverse protein quality assessment methods to improve template based modeling in CASP11.

    Science.gov (United States)

    Cao, Renzhi; Bhattacharya, Debswapna; Adhikari, Badri; Li, Jilong; Cheng, Jianlin

    2016-09-01

    Model evaluation and selection is an important step and a big challenge in template-based protein structure prediction. Individual model quality assessment methods designed for recognizing some specific properties of protein structures often fail to consistently select good models from a model pool because of their limitations. Therefore, combining multiple complementary quality assessment methods is useful for improving model ranking and consequently tertiary structure prediction. Here, we report the performance and analysis of our human tertiary structure predictor (MULTICOM) based on the massive integration of 14 diverse complementary quality assessment methods that was successfully benchmarked in the 11th Critical Assessment of Techniques for Protein Structure Prediction (CASP11). The predictions of MULTICOM for 39 template-based domains were rigorously assessed by six scoring metrics covering global topology of Cα trace, local all-atom fitness, side chain quality, and physical reasonableness of the model. The results show that the massive integration of complementary, diverse single-model and multi-model quality assessment methods can effectively leverage the strength of single-model methods in distinguishing quality variation among similar good models and the advantage of multi-model quality assessment methods of identifying reasonable average-quality models. The overall excellent performance of the MULTICOM predictor demonstrates that integrating a large number of model quality assessment methods in conjunction with model clustering is a useful approach to improve the accuracy, diversity, and consequently robustness of template-based protein structure prediction. Proteins 2016; 84(Suppl 1):247-259. © 2015 Wiley Periodicals, Inc.

  11. Assessing the accuracy of template-based structure prediction metaservers by comparison with structural genomics structures.

    Science.gov (United States)

    Gront, Dominik; Grabowski, Marek; Zimmerman, Matthew D; Raynor, John; Tkaczuk, Karolina L; Minor, Wladek

    2012-12-01

    The explosion of the size of the universe of known protein sequences has stimulated two complementary approaches to structural mapping of these sequences: theoretical structure prediction and experimental determination by structural genomics (SG). In this work, we assess the accuracy of structure prediction by two automated template-based structure prediction metaservers (genesilico.pl and bioinfo.pl) by measuring the structural similarity of the predicted models to corresponding experimental models determined a posteriori. Of 199 targets chosen from SG programs, the metaservers predicted the structures of about a fourth of them "correctly." (In this case, "correct" was defined as placing more than 70 % of the alpha carbon atoms in the model within 2 Å of the experimentally determined positions.) Almost all of the targets that could be modeled to this accuracy were those with an available template in the Protein Data Bank (PDB) with more than 25 % sequence identity. The majority of those SG targets with lower sequence identity to structures in the PDB were not predicted by the metaservers with this accuracy. We also compared metaserver results to CASP8 results, finding that the models obtained by participants in the CASP competition were significantly better than those produced by the metaservers.
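    The "correct" criterion quoted in this abstract (more than 70% of alpha carbon atoms within 2 Å of the experimental positions) can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the model and experimental Cα coordinates are taken to be already superimposed and listed in matching residue order, and the function names `fraction_within` and `is_correct` are hypothetical rather than taken from the study.

```python
import math

def fraction_within(model_ca, exp_ca, cutoff=2.0):
    """Fraction of model C-alpha atoms within `cutoff` angstroms of experiment.

    Assumes both inputs are lists of (x, y, z) tuples, already superimposed
    and in the same residue order.
    """
    if len(model_ca) != len(exp_ca) or not model_ca:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    close = sum(1 for m, e in zip(model_ca, exp_ca) if math.dist(m, e) <= cutoff)
    return close / len(model_ca)

def is_correct(model_ca, exp_ca, cutoff=2.0, min_fraction=0.70):
    # "Correct" in the abstract's sense: more than 70% of atoms within 2 angstroms.
    return fraction_within(model_ca, exp_ca, cutoff) > min_fraction
```

    The threshold is a strict inequality here because the abstract says "more than 70%"; how atoms lying exactly at 2 Å were counted is not stated, so including them is an assumption.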

  12. Relative Packing Groups in Template-Based Structure Prediction: Cooperative Effects of True Positive Constraints

    Science.gov (United States)

    Day, Ryan; Qu, Xiaotao; Swanson, Rosemarie; Bohannan, Zach; Bliss, Robert

    2011-01-01

    Most current template-based structure prediction methods concentrate on finding the correct backbone conformation and then packing sidechains within that backbone. Our packing-based method derives distance constraints from conserved relative packing groups (RPGs). In our refinement approach, the RPGs provide a level of resolution that restrains global topology while allowing conformational sampling. In this study, we test our template-based structure prediction method using 51 prediction units from CASP7 experiments. RPG-based constraints are able to substantially improve approximately two-thirds of starting templates. Upon deeper investigation, we find that true positive spatial constraints, especially those non-local in sequence, derived from the RPGs were important to building nearer native models. Surprisingly, the fraction of incorrect or false positive constraints does not strongly influence the quality of the final candidate. This result indicates that our RPG-based true positive constraints sample the self-consistent, cooperative interactions of the native structure. The lack of such reinforcing cooperativity explains the weaker effect of false positive constraints. Generally, these findings are encouraging indications that RPGs will improve template-based structure prediction. PMID:21210729

  13. Accuracy of protein-protein binding sites in high-throughput template-based modeling.

    Directory of Open Access Journals (Sweden)

    Petras J Kundrotas

    2010-04-01

    The accuracy of protein structures, particularly their binding sites, is essential for the success of modeling protein complexes. Computationally inexpensive methodology is required for genome-wide modeling of such structures. For systematic evaluation of potential accuracy in high-throughput modeling of binding sites, a statistical analysis of target-template sequence alignments was performed for a representative set of protein complexes. For most of the complexes, alignments containing all residues of the interface were found. The full interface alignments were obtained even in the case of poor alignments where a relatively small part of the target sequence (as low as 40%) aligned to the template sequence, with a low overall alignment identity (<30%). Although such poor overall alignments might be considered inadequate for modeling of whole proteins, the alignment of the interfaces was strong enough for docking. In the set of homology models built on these alignments, one third of those ranked 1 by a simple sequence identity criterion had RMSD < 5 Å, the accuracy suitable for low-resolution template-free docking. Such models corresponded to multi-domain target proteins, whereas for single-domain proteins the best models had 5 Å [...] structure-alignment methods. Overall, approximately 50% of complexes with the interfaces modeled by high-throughput techniques had accuracy suitable for meaningful docking experiments. This percentage will grow with the increasing availability of co-crystallized protein-protein complexes.

  14. Novel nanoarray structures formed by template based approach: Characterization and electrochemistry

    Science.gov (United States)

    Alhoshan, Mansour Saleh

    Several different methods have been developed to form nanomaterials. The methods employed depend on the desired properties and applications. One of the broadest and most important synthetic approaches for nanomaterials is based on templates. Templates provide a predetermined configuration or cast to guide the formation of nanomaterials with the desired morphology. They provide a very rich method to fabricate nanomaterials with a wide range of different morphologies and tunable sizes. After a materials formation reaction, the template can be sacrificially removed, leaving behind the final product that replicates the morphology of the original template. The synthetic methods based on templates overcome a weakness of other synthesis methods by providing good control of the final morphology of the produced nanomaterials. In addition, the methods are very general with respect to the types of materials that may be prepared. The main focus in this thesis is to produce highly ordered novel structures and arrays at nano/micro scales that are of electrochemical interest with good control of size and shape. Template-based approaches were used here to fabricate nano/micro tubes, rods, wires and porous films. In this approach, we combined electroless deposition and electrodeposition reactions with various templates, so that once the template is removed, the desired structures are revealed. Among the templates that were used are track-etched polycarbonate membrane, anodized aluminum oxide membrane and silica colloidal spheres. Electroless deposition and electrodeposition within the template to form nanomaterials is a very attractive approach because it can be carried out under conditions mild enough to avoid damage to either the desired materials or the template used. This approach makes possible the formation of a wide range of nanomaterials that may be useful technologically for catalytic, electronic, and energy storage applications. Examples of the nanostructures that were

  15. DNABind: a hybrid algorithm for structure-based prediction of DNA-binding residues by combining machine learning- and template-based approaches.

    Science.gov (United States)

    Liu, Rong; Hu, Jianjun

    2013-11-01

    Accurate prediction of DNA-binding residues has become a problem of increasing importance in structural bioinformatics. Here, we presented DNABind, a novel hybrid algorithm for identifying these crucial residues by exploiting the complementarity between machine learning- and template-based methods. Our machine learning-based method was based on the probabilistic combination of a structure-based and a sequence-based predictor, both of which were implemented using support vector machine algorithms. The former included our well-designed structural features, such as solvent accessibility, local geometry, topological features, and relative positions, which can effectively quantify the difference between DNA-binding and nonbinding residues. The latter combined evolutionary conservation features with three other sequence attributes. Our template-based method depended on structural alignment and utilized the template structure from known protein-DNA complexes to infer DNA-binding residues. We showed that the template method had excellent performance when reliable templates were found for the query proteins but tended to be strongly influenced by the template quality as well as the conformational changes upon DNA binding. In contrast, the machine learning approach yielded better performance when high-quality templates were not available (about 1/3 of cases in our dataset) or the query protein was subject to intensive transformation changes upon DNA binding. Our extensive experiments indicated that the hybrid approach can distinctly improve the performance of the individual methods for both bound and unbound structures. DNABind also significantly outperformed the state-of-the-art algorithms by around 10% in terms of Matthews's correlation coefficient. The proposed methodology could also have wide application in various protein functional site annotations. DNABind is freely available at http://mleg.cse.sc.edu/DNABind/. Copyright © 2013 Wiley Periodicals, Inc.
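    For readers unfamiliar with the Matthews correlation coefficient used as the comparison metric above, the standard formula can be sketched in Python. This is a generic illustration, not DNABind code: the function name `mcc` is hypothetical, and returning 0.0 when the denominator vanishes is a common convention adopted here as an assumption.

```python
import math

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    tp/fn: binding residues predicted as binding / non-binding;
    tn/fp: non-binding residues predicted as non-binding / binding.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Assumed convention: undefined MCC (an empty row or column) is reported as 0.
        return 0.0
    return (tp * tn - fp * fn) / denom
```

    The coefficient ranges from -1 (total disagreement) through 0 (no better than chance) to +1 (perfect prediction), which makes it suitable for residue-level data sets in which one class may be much rarer than the other.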

  16. SnapDock: template-based docking by Geometric Hashing.

    Science.gov (United States)

    Estrin, Michael; Wolfson, Haim J

    2017-07-15

    A highly efficient template-based protein-protein docking algorithm, nicknamed SnapDock, is presented. It employs a Geometric Hashing-based structural alignment scheme to align the target proteins to the interfaces of non-redundant protein-protein interface libraries. Docking of a pair of proteins utilizing the 22 600 interface PIFACE library is performed in [...]. Contact: [...]gmail.com or wolfson@tau.ac.il.

  17. Prediction of homoprotein and heteroprotein complexes by protein docking and template-based modeling : A CASP-CAPRI experiment

    NARCIS (Netherlands)

    Lensink, Marc F.; Velankar, Sameer; Kryshtafovych, Andriy; Huang, Shen You; Schneidman-Duhovny, Dina; Sali, Andrej; Segura, Joan; Fernandez-Fuentes, Narcis; Viswanath, Shruthi; Elber, Ron; Grudinin, Sergei; Popov, Petr; Neveu, Emilie; Lee, Hasup; Baek, Minkyung; Park, Sangwoo; Heo, Lim; Rie Lee, Gyu; Seok, Chaok; Qin, Sanbo; Zhou, Huan Xiang; Ritchie, David W.; Maigret, Bernard; Devignes, Marie Dominique; Ghoorah, Anisah; Torchala, Mieczyslaw; Chaleil, Raphaël A G; Bates, Paul A.; Ben-Zeev, Efrat; Eisenstein, Miriam; Negi, Surendra S.; Weng, Zhiping; Vreven, Thom; Pierce, Brian G.; Borrman, Tyler M.; Yu, Jinchao; Ochsenbein, Françoise; Guerois, Raphaël; Vangone, Anna; Garcia Lopes Maia Rodrigues, João; van Zundert, Gydo; Nellen, Mehdi; Xue, Li; Karaca, Ezgi; Melquiond, Adrien S J; Visscher, Koen; Kastritis, Panagiotis L.; Bonvin, Alexandre M J J; Xu, Xianjin; Qiu, Liming; Yan, Chengfei; Li, Jilong; Ma, Zhiwei; Cheng, Jianlin; Zou, Xiaoqin; Shen, Yang; Peterson, Lenna X.; Kim, Hyung Rae; Roy, Amit; Han, Xusi; Esquivel-Rodriguez, Juan; Kihara, Daisuke; Yu, Xiaofeng; Bruce, Neil J.; Fuller, Jonathan C.; Wade, Rebecca C.; Anishchenko, Ivan; Kundrotas, Petras J.; Vakser, Ilya A.; Imai, Kenichiro; Yamada, Kazunori; Oda, Toshiyuki; Nakamura, Tsukasa; Tomii, Kentaro; Pallara, Chiara; Romero-Durana, Miguel; Jiménez-García, Brian; Moal, Iain H.; Fernández-Recio, Juan; Joung, Jong Young; Kim, Jong Yun; Joo, Keehyoung; Lee, Jooyoung; Kozakov, Dima; Vajda, Sandor; Mottarella, Scott; Hall, David R.; Beglov, Dmitri; Mamonov, Artem; Xia, Bing; Bohnuud, Tanggis; Del Carpio, Carlos A.; Ichiishi, Eichiro; Marze, Nicholas; Kuroda, Daisuke; Roy Burman, Shourya S.; Gray, Jeffrey J.; Chermak, Edrisse; Cavallo, Luigi; Oliva, Romina; Tovchigrechko, Andrey; Wodak, Shoshana J.

    2016-01-01

    We present the results for CAPRI Round 30, the first joint CASP-CAPRI experiment, which brought together experts from the protein structure prediction and protein-protein docking communities. The Round comprised 25 targets from amongst those submitted for the CASP11 prediction experiment of 2014.

  18. Prediction of homoprotein and heteroprotein complexes by protein docking and template-based modeling: A CASP-CAPRI experiment

    KAUST Repository

    Lensink, Marc F.

    2016-04-28

    We present the results for CAPRI Round 30, the first joint CASP-CAPRI experiment, which brought together experts from the protein structure prediction and protein-protein docking communities. The Round comprised 25 targets from amongst those submitted for the CASP11 prediction experiment of 2014. The targets included mostly homodimers, a few homotetramers, and two heterodimers, and comprised protein chains that could readily be modeled using templates from the Protein Data Bank. On average 24 CAPRI groups and 7 CASP groups submitted docking predictions for each target, and 12 CAPRI groups per target participated in the CAPRI scoring experiment. In total more than 9500 models were assessed against the 3D structures of the corresponding target complexes. Results show that the prediction of homodimer assemblies by homology modeling techniques and docking calculations is quite successful for targets featuring large enough subunit interfaces to represent stable associations. Targets with ambiguous or inaccurate oligomeric state assignments, often featuring crystal contact-sized interfaces, represented a confounding factor. For those, a much poorer prediction performance was achieved, while nonetheless often providing helpful clues on the correct oligomeric state of the protein. The prediction performance was very poor for genuine tetrameric targets, where the inaccuracy of the homology-built subunit models and the smaller pair-wise interfaces severely limited the ability to derive the correct assembly mode. Our analysis also shows that docking procedures tend to perform better than standard homology modeling techniques and that highly accurate models of the protein components are not always required to identify their association modes with acceptable accuracy. © 2016 Wiley Periodicals, Inc.

  19. Protein Structure

    Science.gov (United States)

    Asmus, Elaine Garbarino

    2007-01-01

    Individual students model specific amino acids and then, through dehydration synthesis, a class of students models a protein. The students clearly learn amino acid structure, primary, secondary, tertiary, and quaternary structure in proteins and the nature of the bonds maintaining a protein's shape. This activity is fun, concrete, inexpensive and…

  20. RBO Aleph: leveraging novel information sources for protein structure prediction.

    Science.gov (United States)

    Mabrouk, Mahmoud; Putz, Ines; Werner, Tim; Schneider, Michael; Neeb, Moritz; Bartels, Philipp; Brock, Oliver

    2015-07-01

    RBO Aleph is a novel protein structure prediction web server for template-based modeling, protein contact prediction and ab initio structure prediction. The server has a strong emphasis on modeling difficult protein targets for which templates cannot be detected. RBO Aleph's unique features are (i) the use of combined evolutionary and physicochemical information to perform residue-residue contact prediction and (ii) leveraging this contact information effectively in conformational space search. RBO Aleph emerged as one of the leading approaches to ab initio protein structure prediction and contact prediction during the most recent Critical Assessment of Protein Structure Prediction experiment (CASP11, 2014). In addition to RBO Aleph's main focus on ab initio modeling, the server also provides state-of-the-art template-based modeling services. Based on template availability, RBO Aleph switches automatically between template-based modeling and ab initio prediction based on the target protein sequence, facilitating use especially for non-expert users. The RBO Aleph web server offers a range of tools for visualization and data analysis, such as the visualization of predicted models, predicted contacts and the estimated prediction error along the model's backbone. The server is accessible at http://compbio.robotics.tu-berlin.de/rbo_aleph/. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

  1. Template-based modeling of a psychrophilic lipase: conformational changes, novel structural features and its application in predicting the enantioselectivity of lipase catalyzed transesterification of secondary alcohols.

    Science.gov (United States)

    Xu, Tao; Gao, Bei; Zhang, Lujia; Lin, Jingpin; Wang, Xuedong; Wei, Dongzhi

    2010-12-01

    In order to fully explore the structure-function relationship of a Proteus lipase (LipK107) that was screened from the soil in our previous study, we have modeled the three-dimensional (3-D) structures of the enzyme in its active and inactive conformations on the basis of crystal structures of Burkholderia glumae and Pseudomonas aeruginosa lipases in the present study. Both homology models suggested that LipK107 possessed a catalytic triad (Ser79-Asp232-His254), an oxyanion hole (Leu13 and Gln80) which was used to stabilize the reaction tetrahedral intermediates, and a lid substructure that controlled the access of the substrate to the active site. The existence of the lid was further verified by carrying out the interfacial activation experiment. The conformational change of LipK107 which was caused by lid opening action was predicted by superimposing the two theoretical models for the first time. Finally, both 3-D structures were used to predict the enantioselectivity of LipK107 when the enzyme was used to catalyze the resolution of racemic 1-phenylethanol. The lid-open model of LipK107 identified the R-enantiomer as the preferred enantiomer, while the lid-closed model showed that the S-enantiomer was more favored. However, only the lid-open conformational model could lead to predictions that agreed with the subsequent experimental results of the actual biocatalytic reaction of 1-phenylethanol. Crown Copyright © 2010. Published by Elsevier B.V. All rights reserved.

  2. Ab initio and template-based prediction of multi-class distance maps by two-dimensional recursive neural networks

    Directory of Open Access Journals (Sweden)

    Martin Alberto JM

    2009-01-01

    Full Text Available Abstract Background Prediction of protein structures from their sequences is still one of the open grand challenges of computational biology. Some approaches to protein structure prediction, especially ab initio ones, rely to some extent on the prediction of residue contact maps. Residue contact map predictions have been assessed at the CASP competition for several years now. Although it has been shown that exact contact maps generally yield correct three-dimensional structures, this is true only at a relatively low resolution (3–4 Å from the native structure). Another known weakness of contact maps is that they are generally predicted ab initio, that is, not exploiting information about potential homologues of known structure. Results We introduce a new class of distance restraints for protein structures: multi-class distance maps. We show that Cα trace reconstructions based on 4-class native maps are significantly better than those from residue contact maps. We then build two predictors of 4-class maps based on recursive neural networks: one ab initio, or relying on the sequence and on evolutionary information; one template-based, or in which homology information to known structures is provided as a further input. We show that virtually any level of sequence similarity to structural templates (down to less than 10%) yields more accurate 4-class maps than the ab initio predictor. We show that template-based predictions by recursive neural networks are consistently better than the best template and than a number of combinations of the best available templates. We also extract binary residue contact maps at an 8 Å threshold (as per CASP assessment) from the 4-class predictors and show that the template-based version is also more accurate than the best template and consistently better than the ab initio one, down to very low levels of sequence identity to structural templates. Furthermore, we test both ab-initio and template-based 8
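
    The difference between a binary contact map and a multi-class distance map can be illustrated with a minimal Python sketch. The 8 Å contact threshold comes from the abstract above; the class boundaries (8, 13 and 19 Å) and all function names are assumptions chosen for illustration, not the authors' published definitions.

```python
import math
from bisect import bisect_right

# Hypothetical class boundaries in angstroms (assumed, not from the abstract).
# They split distances into four classes: <8, 8-13, 13-19, >=19.
CLASS_BOUNDS = (8.0, 13.0, 19.0)


def multiclass_map(ca_coords, bounds=CLASS_BOUNDS):
    """Return an N x N matrix of distance-class indices (0 = closest class)."""
    n = len(ca_coords)
    return [
        [bisect_right(bounds, math.dist(ca_coords[i], ca_coords[j])) for j in range(n)]
        for i in range(n)
    ]


def contact_map(ca_coords, threshold=8.0):
    """Return an N x N boolean matrix: True where two C-alpha atoms are in contact."""
    n = len(ca_coords)
    return [
        [math.dist(ca_coords[i], ca_coords[j]) < threshold for j in range(n)]
        for i in range(n)
    ]


if __name__ == "__main__":
    # Four made-up C-alpha positions placed along the x axis.
    coords = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (15.0, 0.0, 0.0), (40.0, 0.0, 0.0)]
    for row in multiclass_map(coords):
        print(row)
```

    With these assumed boundaries, class 0 of the multi-class map coincides with the binary contact map, which mirrors how the abstract describes extracting 8 Å contact maps from the 4-class predictors.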

  3. Protein structure prediction using a docking-based hierarchical folding scheme.

    Science.gov (United States)

    Kifer, Ilona; Nussinov, Ruth; Wolfson, Haim J

    2011-06-01

    The pathways by which proteins fold into their specific native structure are still an unsolved mystery. Currently, many methods for protein structure prediction are available, and most of them tackle the problem by relying on the vast amounts of data collected from known protein structures. These methods are often not concerned with the route the protein follows to reach its final fold. This work is based on the premise that proteins fold in a hierarchical manner. We present FOBIA, an automated method for predicting a protein structure. FOBIA consists of two main stages: the first finds matches between parts of the target sequence and independently folding structural units using profile-profile comparison. The second assembles these units into a 3D structure by searching and ranking their possible orientations toward each other using a docking-based approach. We have previously reported an application of an initial version of this strategy to homology-based targets. Since then we have considerably enhanced our method's abilities to allow it to address the more difficult template-based target category. This allows us to now apply FOBIA to the template-based targets of CASP8 and to show that it is both very efficient and promising. Our method can provide an alternative for template-based structure prediction, and in particular, the docking-based ranking technique presented here can be incorporated into any profile-profile comparison based method. Copyright © 2011 Wiley-Liss, Inc.

  4. The MULTICOM toolbox for protein structure prediction

    Directory of Open Access Journals (Sweden)

    Cheng Jianlin

    2012-04-01

    Full Text Available Abstract Background As genome sequencing is becoming routine in biomedical research, the total number of protein sequences is increasing exponentially, recently reaching over 108 million. However, only a tiny portion of these proteins (i.e. ~75,000 or Results To meet the need, we have developed a comprehensive MULTICOM toolbox consisting of a set of protein structure and structural feature prediction tools. These tools include secondary structure prediction, solvent accessibility prediction, disorder region prediction, domain boundary prediction, contact map prediction, disulfide bond prediction, beta-sheet topology prediction, fold recognition, multiple template combination and alignment, template-based tertiary structure modeling, protein model quality assessment, and mutation stability prediction. Conclusions These tools have been rigorously tested by many users in the last several years and/or during the last three rounds of the Critical Assessment of Techniques for Protein Structure Prediction (CASP7-9, from 2006 to 2010), achieving state-of-the-art or near performance. In order to facilitate bioinformatics research and technological development in the field, we have made the MULTICOM toolbox freely available as web services and/or software packages for academic use and scientific research. It is available at http://sysbio.rnet.missouri.edu/multicom_toolbox/.

  5. CMsearch: simultaneous exploration of protein sequence space and structure space improves not only protein homology detection but also protein structure prediction

    KAUST Repository

    Cui, Xuefeng

    2016-06-15

    Motivation: Protein homology detection, a fundamental problem in computational biology, is an indispensable step toward predicting protein structures and understanding protein functions. Despite the advances in recent decades on sequence alignment, threading and alignment-free methods, protein homology detection remains a challenging open problem. Recently, network methods that try to find transitive paths in the protein structure space demonstrate the importance of incorporating network information of the structure space. Yet, current methods merge the sequence space and the structure space into a single space, and thus introduce inconsistency in combining different sources of information. Method: We present a novel network-based protein homology detection method, CMsearch, based on cross-modal learning. Instead of exploring a single network built from the mixture of sequence and structure space information, CMsearch builds two separate networks to represent the sequence space and the structure space. It then learns sequence–structure correlation by simultaneously taking sequence information, structure information, sequence space information and structure space information into consideration. Results: We tested CMsearch on two challenging tasks, protein homology detection and protein structure prediction, by querying all 8332 PDB40 proteins. Our results demonstrate that CMsearch is insensitive to the similarity metrics used to define the sequence and the structure spaces. By using HMM–HMM alignment as the sequence similarity metric, CMsearch clearly outperforms state-of-the-art homology detection methods and the CASP-winning template-based protein structure prediction methods.

  6. Assessing ligand efficiencies using template-based molecular ...

    Indian Academy of Sciences (India)

    Statistical modelling using artificial neural network (ANN: R² = 0.922) and multiple linear regression method (MLR: R² = 0.851) showed good correlation between the biological activity, binding affinity, and different ligand efficiencies of the compounds, which suggest the robustness of the template-based binding ...

  7. GalaxyHomomer: a web server for protein homo-oligomer structure prediction from a monomer sequence or structure.

    Science.gov (United States)

    Baek, Minkyung; Park, Taeyong; Heo, Lim; Park, Chiwook; Seok, Chaok

    2017-04-06

    Homo-oligomerization of proteins is abundant in nature, and is often intimately related with the physiological functions of proteins, such as in metabolism, signal transduction or immunity. Information on the homo-oligomer structure is therefore important to obtain a molecular-level understanding of protein functions and their regulation. Currently available web servers predict protein homo-oligomer structures either by template-based modeling using homo-oligomer templates selected from the protein structure database or by ab initio docking of monomer structures resolved by experiment or predicted by computation. The GalaxyHomomer server, freely accessible at http://galaxy.seoklab.org/homomer, carries out template-based modeling, ab initio docking or both depending on the availability of proper oligomer templates. It also incorporates recently developed model refinement methods that can consistently improve model quality. Moreover, the server provides additional options that can be chosen by the user depending on the availability of information on the monomer structure, oligomeric state and locations of unreliable/flexible loops or termini. The performance of the server was better than or comparable to that of other available methods when tested on benchmark sets and in a recent CASP performed in a blind fashion. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  8. PSPP: a protein structure prediction pipeline for computing clusters.

    Directory of Open Access Journals (Sweden)

    Michael S Lee

    Full Text Available BACKGROUND: Protein structures are critical for understanding the mechanisms of biological systems and, subsequently, for drug and vaccine design. Unfortunately, protein sequence data exceed structural data by a factor of more than 200 to 1. This gap can be partially filled by using computational protein structure prediction. While structure prediction Web servers are a notable option, they often restrict the number of sequence queries and/or provide a limited set of prediction methodologies. Therefore, we present a standalone protein structure prediction software package suitable for high-throughput structural genomic applications that performs all three classes of prediction methodologies: comparative modeling, fold recognition, and ab initio. This software can be deployed on a user's own high-performance computing cluster. METHODOLOGY/PRINCIPAL FINDINGS: The pipeline consists of a Perl core that integrates more than 20 individual software packages and databases, most of which are freely available from other research laboratories. The query protein sequences are first divided into domains either by domain boundary recognition or Bayesian statistics. The structures of the individual domains are then predicted using template-based modeling or ab initio modeling. The predicted models are scored with a statistical potential and an all-atom force field. The top-scoring ab initio models are annotated by structural comparison against the Structural Classification of Proteins (SCOP) fold database. Furthermore, secondary structure, solvent accessibility, transmembrane helices, and structural disorder are predicted. The results are generated in text, tab-delimited, and hypertext markup language (HTML) formats. So far, the pipeline has been used to study viral and bacterial proteomes.
CONCLUSIONS: The standalone pipeline that we introduce here, unlike protein structure prediction Web servers, allows users to devote their own computing assets to process a

  9. FALCON@home: a high-throughput protein structure prediction server based on remote homologue recognition.

    Science.gov (United States)

    Wang, Chao; Zhang, Haicang; Zheng, Wei-Mou; Xu, Dong; Zhu, Jianwei; Wang, Bing; Ning, Kang; Sun, Shiwei; Li, Shuai Cheng; Bu, Dongbo

    2016-02-01

    Protein structure prediction approaches can be categorized into template-based modeling (including homology modeling and threading) and free modeling. However, the existing threading tools perform poorly on remote homologous proteins. Thus, improving fold recognition for remote homologous proteins remains a challenge. Besides, proteome-wide structure prediction poses another challenge of increasing prediction throughput. In this study, we presented FALCON@home as a protein structure prediction server focusing on remote homologue identification. The design of FALCON@home is based on the observation that a structural template, especially for remote homologous proteins, consists of conserved regions interwoven with highly variable regions. The highly variable regions lead to vague alignments in threading approaches. Thus, FALCON@home first extracts conserved regions from each template and then aligns a query protein with conserved regions only rather than the full-length template directly. This helps avoid the vague alignments rooted in highly variable regions, improving remote homologue identification. We implemented FALCON@home using the Berkeley Open Infrastructure for Network Computing (BOINC) volunteer computing protocol. With computation power donated from over 20,000 volunteer CPUs, FALCON@home shows a throughput as high as processing of over 1000 proteins per day. In the Critical Assessment of protein Structure Prediction (CASP11), the FALCON@home-based prediction was ranked 12th in the template-based modeling category. As an application, the structures of 880 mouse mitochondria proteins were predicted, which revealed the significant correlation between protein half-lives and protein structural factors. FALCON@home is freely available at http://protein.ict.ac.cn/FALCON/. Contact: shuaicli@cityu.edu.hk, dbu@ict.ac.cn. Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press. All rights reserved. For

  10. CMsearch: simultaneous exploration of protein sequence space and structure space improves not only protein homology detection but also protein structure prediction.

    Science.gov (United States)

    Cui, Xuefeng; Lu, Zhiwu; Wang, Sheng; Jing-Yan Wang, Jim; Gao, Xin

    2016-06-15

    Protein homology detection, a fundamental problem in computational biology, is an indispensable step toward predicting protein structures and understanding protein functions. Despite the advances in recent decades on sequence alignment, threading and alignment-free methods, protein homology detection remains a challenging open problem. Recently, network methods that try to find transitive paths in the protein structure space demonstrate the importance of incorporating network information of the structure space. Yet, current methods merge the sequence space and the structure space into a single space, and thus introduce inconsistency in combining different sources of information. We present a novel network-based protein homology detection method, CMsearch, based on cross-modal learning. Instead of exploring a single network built from the mixture of sequence and structure space information, CMsearch builds two separate networks to represent the sequence space and the structure space. It then learns sequence-structure correlation by simultaneously taking sequence information, structure information, sequence space information and structure space information into consideration. We tested CMsearch on two challenging tasks, protein homology detection and protein structure prediction, by querying all 8332 PDB40 proteins. Our results demonstrate that CMsearch is insensitive to the similarity metrics used to define the sequence and the structure spaces. By using HMM-HMM alignment as the sequence similarity metric, CMsearch clearly outperforms state-of-the-art homology detection methods and the CASP-winning template-based protein structure prediction methods. Our program is freely available for download from http://sfb.kaust.edu.sa/Pages/Software.aspx. Contact: xin.gao@kaust.edu.sa. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.

  11. Template based parallel checkpointing in a massively parallel computer system

    Science.gov (United States)

    Archer, Charles Jens [Rochester, MN]; Inglett, Todd Alan [Rochester, MN]

    2009-01-13

    A method and apparatus for a template based parallel checkpoint save for a massively parallel super computer system using a parallel variation of the rsync protocol, and network broadcast. In preferred embodiments, the checkpoint data for each node is compared to a template checkpoint file that resides in the storage and that was previously produced. Embodiments herein greatly decrease the amount of data that must be transmitted and stored for faster checkpointing and increased efficiency of the computer system. Embodiments are directed to a parallel computer system with nodes arranged in a cluster with a high speed interconnect that can perform broadcast communication. The checkpoint contains a set of actual small data blocks with their corresponding checksums from all nodes in the system. The data blocks may be compressed using conventional non-lossy data compression algorithms to further reduce the overall checkpoint size.
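
    The idea of storing only the blocks that differ from a template checkpoint can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the patented method: it compares fixed-offset blocks with SHA-256 checksums, whereas rsync proper also uses a rolling checksum to find shifted blocks, and the block size and all function names here are hypothetical.

```python
import hashlib

BLOCK_SIZE = 4  # deliberately tiny so the example is easy to follow (assumed value)


def split_blocks(data, size=BLOCK_SIZE):
    """Cut a byte string into consecutive fixed-size blocks."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def checksum(block):
    """Checksum of one block; SHA-256 is an arbitrary choice for this sketch."""
    return hashlib.sha256(block).hexdigest()


def delta_against_template(node_data, template_checksums, size=BLOCK_SIZE):
    """Return {block_index: block} for node blocks that differ from the template."""
    delta = {}
    for index, block in enumerate(split_blocks(node_data, size)):
        if index >= len(template_checksums) or checksum(block) != template_checksums[index]:
            delta[index] = block
    return delta


def restore(template_data, delta, length, size=BLOCK_SIZE):
    """Rebuild a node's checkpoint from the template plus its stored delta."""
    blocks = split_blocks(template_data, size)
    for index in sorted(delta):
        if index < len(blocks):
            blocks[index] = delta[index]
        else:
            blocks.append(delta[index])
    return b"".join(blocks)[:length]


if __name__ == "__main__":
    template = b"AAAABBBBCCCC"
    node = b"AAAAXXBBCCCCDD"
    sums = [checksum(b) for b in split_blocks(template)]
    delta = delta_against_template(node, sums)
    print(sorted(delta))  # indices of the blocks that must be stored for this node
    print(restore(template, delta, len(node)) == node)
```

    Only the differing blocks and the node's checkpoint length need to be saved per node, which is the source of the reduction in transmitted and stored data that the abstract describes.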

  12. Protein Modelling: What Happened to the “Protein Structure Gap”?

    Science.gov (United States)

    Schwede, Torsten

    2013-01-01

    Computational modeling and prediction of three-dimensional macromolecular structures and complexes from their sequence has been a long standing vision in structural biology as it holds the promise to bypass part of the laborious process of experimental structure solution. Over the last two decades, a paradigm shift has occurred: starting from a situation where the “structure knowledge gap” between the huge number of protein sequences and small number of known structures has hampered the widespread use of structure-based approaches in life science research, today some form of structural information – either experimental or computational – is available for the majority of amino acids encoded by common model organism genomes. Template based homology modeling techniques have matured to a point where they are now routinely used to complement experimental techniques. With the scientific focus of interest moving towards larger macromolecular complexes and dynamic networks of interactions, the integration of computational modeling methods with low-resolution experimental techniques allows studying large and complex molecular machines. Computational modeling and prediction techniques are still facing a number of challenges which hamper the more widespread use by the non-expert scientist. For example, it is often difficult to convey the underlying assumptions of a computational technique, as well as the expected accuracy and structural variability of a specific model. However, these aspects are crucial to understand the limitations of a model, and to decide which interpretations and conclusions can be supported. PMID:24010712

  13. Template-based education toolkit for mobile platforms

    Science.gov (United States)

    Golagani, Santosh Chandana; Esfahanian, Moosa; Akopian, David

    2012-02-01

    Nowadays mobile phones are the most widely used portable devices, and they evolve very fast, adding new features and improving user experiences. The latest generation of hand-held devices, called smartphones, is equipped with superior memory, cameras and rich multimedia features, empowering people to use their mobile phones not only as a communication tool but also for entertainment purposes. With many young students showing interest in learning mobile application development, one should introduce novel learning methods which may adapt to fast technology changes and introduce students to application development. Mobile phones have become a common device, and the engineering community incorporates phones in various solutions. To overcome the limitations of conventional undergraduate electrical engineering (EE) education, this paper explores the concept of template-based education in mobile phone programming. The concept is based on developing small exercise templates which students can manipulate and revise for a quick hands-on introduction to application development and integration. The Android platform is used as a popular open source environment for application development. The exercises relate to image processing topics typically studied by many students. The goal is to enable conventional course enhancements by incorporating in them short hands-on learning modules.

  14. Structure-based de novo prediction of zinc-binding sites in proteins of unknown function.

    Science.gov (United States)

    Zhao, Wei; Xu, Meng; Liang, Zhi; Ding, Bo; Niu, Liwen; Liu, Haiyan; Teng, Maikun

    2011-05-01

    Zinc-binding proteins are the most abundant metallo-proteins in the Protein Data Bank (PDB). Accurate prediction of zinc-binding sites in proteins of unknown function may provide important clues for the inference of protein function. As zinc binding is often associated with characteristic 3D arrangements of zinc ligand residues, its prediction may benefit from using not only the sequence information but also the structure information of proteins. In this work, we present a structure-based method, TEMSP (3D TEmplate-based Metal Site Prediction), to predict zinc-binding sites. TEMSP significantly improves over previously reported best methods in predicting as many as possible true ligand residues for zinc with minimum overpredictions: if only those results in which all zinc ligand residues have been correctly predicted are defined as true positives, our method improves sensitivity from less than 30% to above 60%, and selectivity from around 25% to 80%. These results are for predictions based on apo state structures. In addition, the method can predict the zinc-bound local structures reliably, generating predictions useful for function inference. We applied TEMSP to 1888 protein structures of the 'Unknown Function' class in the PDB database. A number of zinc-binding sites have been discovered de novo, i.e. based solely on the protein structures. Using the predicted local structures of these sites, possible functional roles were analyzed. TEMSP is freely available from http://netalign.ustc.edu.cn/temsp/.

  15. Modeling complexes of modeled proteins.

    Science.gov (United States)

    Anishchenko, Ivan; Kundrotas, Petras J; Vakser, Ilya A

    2017-03-01

    Structural characterization of proteins is essential for understanding life processes at the molecular level. However, only a fraction of known proteins have experimentally determined structures. This fraction is even smaller for protein-protein complexes. Thus, structural modeling of protein-protein interactions (docking) primarily has to rely on modeled structures of the individual proteins, which typically are less accurate than the experimentally determined ones. Such "double" modeling is the Grand Challenge of structural reconstruction of the interactome. Yet it remains so far largely untested in a systematic way. We present a comprehensive validation of template-based and free docking on a set of 165 complexes, where each protein model has six levels of structural accuracy, from 1 to 6 Å Cα RMSD. Many template-based docking predictions fall into the acceptable quality category, according to the CAPRI criteria, even for highly inaccurate proteins (5-6 Å RMSD), although the number of such models (and, consequently, the docking success rate) drops significantly for models with RMSD > 4 Å. The results show that the existing docking methodologies can be successfully applied to protein models with a broad range of structural accuracy, and that template-based docking is much less sensitive to inaccuracies of protein models than free docking. Proteins 2017; 85:470-478. © 2016 Wiley Periodicals, Inc.

  16. Structural Genomics of Protein Phosphatases

    Energy Technology Data Exchange (ETDEWEB)

    Almo, S.; Bonanno, J.; Sauder, J.; Emtage, S.; Dilorenzo, T.; Malashkevich, V.; Wasserman, S.; Swaminathan, S.; Eswaramoorthy, S.; et al

    2007-01-01

    The New York SGX Research Center for Structural Genomics (NYSGXRC) of the NIGMS Protein Structure Initiative (PSI) has applied its high-throughput X-ray crystallographic structure determination platform to systematic studies of all human protein phosphatases and protein phosphatases from biomedically-relevant pathogens. To date, the NYSGXRC has determined structures of 21 distinct protein phosphatases: 14 from human, 2 from mouse, 2 from the pathogen Toxoplasma gondii, 1 from Trypanosoma brucei, the parasite responsible for African sleeping sickness, and 2 from the principal mosquito vector of malaria in Africa, Anopheles gambiae. These structures provide insights into both normal and pathophysiologic processes, including transcriptional regulation, regulation of major signaling pathways, neural development, and type 1 diabetes. In conjunction with the contributions of other international structural genomics consortia, these efforts promise to provide an unprecedented database and materials repository for structure-guided experimental and computational discovery of inhibitors for all classes of protein phosphatases.

  17. Structural genomics of human proteins.

    Science.gov (United States)

    Osman, Khan Tanjid; Edwards, Aled

    2014-01-01

    Structural genomics efforts focused on the human proteome have had three aims: to understand the structural and functional variations within protein families; to understand the structural basis of disease and genetic variation; and to determine the structures of human integral membrane proteins. The overarching theme is to advance the understanding of human health and to provide a structural platform to aid in the development of therapeutics. A decade or more of work in this field has identified optimal experimental strategies that can be used to expedite expression and crystallization of human proteins, and we provide some guidance to this end.

  18. Magic Numbers in Protein Structures

    DEFF Research Database (Denmark)

    Lindgård, Per-Anker; Bohr, Henrik

    1996-01-01

    A homology measure for protein fold classes has been constructed by locally projecting consecutive secondary structures onto a lattice. Taking into account hydrophobic forces we have found a mechanism for formation of domains containing magic numbers of secondary structures and multipla...... of these domains. We have performed a statistical analysis of available protein structures and found agreement with the predicted preferred abundances. Furthermore, a connection between sequence information and fold classes is established in terms of hinge forces between the structural elements....

  19. Update on protein structure prediction

    DEFF Research Database (Denmark)

    Hubbard, T; Tramontano, A; Barton, G

    1996-01-01

    Computational tools for protein structure prediction are of great interest to molecular, structural and theoretical biologists due to a rapidly increasing number of protein sequences with no known structure. In October 1995, a workshop was held at IRBM to predict as much as possible about a number...... of proteins of biological interest using ab initio prediction of fold recognition methods. 112 protein sequences were collected via an open invitation for target submissions. 17 were selected for prediction during the workshop and for 11 of these a prediction of some reliability could be made. We believe...

  20. Detecting internally symmetric protein structures

    Directory of Open Access Journals (Sweden)

    Basner Jodi

    2010-06-01

    Full Text Available Abstract Background Many functional proteins have a symmetric structure. Most of these are multimeric complexes, which are made of non-symmetric monomers arranged in a symmetric manner. However, there are also a large number of proteins that have a symmetric structure in the monomeric state. These internally symmetric proteins are interesting objects from the point of view of their folding, function, and evolution. Most algorithms that detect the internally symmetric proteins depend on finding repeating units of similar structure and do not use the symmetry information. Results We describe a new method, called SymD, for detecting symmetric protein structures. The SymD procedure works by comparing the structure to its own copy after the copy is circularly permuted by all possible number of residues. The procedure is relatively insensitive to symmetry-breaking insertions and deletions and amplifies positive signals from symmetry. It finds 70% to 80% of the TIM barrel fold domains in the ASTRAL 40 domain database and 100% of the beta-propellers as symmetric. More globally, 10% to 15% of the proteins in the ASTRAL 40 domain database may be considered symmetric according to this procedure depending on the precise cutoff value used to measure the degree of perfection of the symmetry. Symmetrical proteins occur in all structural classes and can have a closed, circular structure, a cylindrical barrel-like structure, or an open, helical structure. Conclusions SymD is a sensitive procedure for detecting internally symmetric protein structures. Using this procedure, we estimate that 10% to 15% of the known protein domains may be considered symmetric. We also report an initial, overall view of the types of symmetries and symmetric folds that occur in the protein domain structure universe.
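
    The core idea of comparing a structure with a circularly permuted copy of itself can be illustrated on a toy example. The Python sketch below is not SymD: SymD aligns three-dimensional structures, whereas this sketch only compares a string of secondary-structure labels position by position, and the scoring and all function names are assumptions made for illustration.

```python
def circular_shift(items, k):
    """Return a copy of the sequence rotated left by k positions."""
    k %= len(items)
    return items[k:] + items[:k]


def self_similarity(items, k):
    """Fraction of positions that match after a circular shift by k."""
    shifted = circular_shift(items, k)
    matches = sum(a == b for a, b in zip(items, shifted))
    return matches / len(items)


def best_symmetry_shift(items):
    """Return (shift, score) for the best non-trivial shift; ties go to the smallest shift."""
    n = len(items)
    best = max(range(1, n), key=lambda k: (self_similarity(items, k), -k))
    return best, self_similarity(items, best)


if __name__ == "__main__":
    # Made-up label strings: E = strand, H = helix, L = loop.
    repeated = "EHEHEHEH"    # four identical strand-helix units
    irregular = "EEHHHLEH"   # no repeating unit
    print(best_symmetry_shift(repeated))
    print(best_symmetry_shift(irregular))
```

    A perfectly repeating string scores 1.0 at a shift equal to its repeat length, while an irregular string never reaches 1.0; a cutoff on such a score plays the role of the "degree of perfection" threshold mentioned in the abstract.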

  1. Immunophenotype Discovery, Hierarchical Organization, and Template-based Classification of Flow Cytometry Samples

    Directory of Open Access Journals (Sweden)

    Ariful Azad

    2016-08-01

    Full Text Available We describe algorithms for discovering immunophenotypes from large collections of flow cytometry (FC) samples, and using them to organize the samples into a hierarchy based on phenotypic similarity. The hierarchical organization is helpful for effective and robust cytometry data mining, including the creation of collections of cell populations characteristic of different classes of samples, robust classification, and anomaly detection. We summarize a set of samples belonging to a biological class or category with a statistically derived template for the class. Whereas individual samples are represented in terms of their cell populations (clusters), a template consists of generic meta-populations (a group of homogeneous cell populations obtained from the samples in a class) that describe key phenotypes shared among all those samples. We organize an FC data collection in a hierarchical data structure that supports the identification of immunophenotypes relevant to clinical diagnosis. A robust template-based classification scheme is also developed, but our primary focus is on the discovery of phenotypic signatures and inter-sample relationships in an FC data collection. This collective analysis approach is more efficient and robust since templates describe phenotypic signatures common to cell populations in several samples, while ignoring noise and small sample-specific variations. We have applied the template-based scheme to analyze several data sets, including one representing a healthy immune system, and one of Acute Myeloid Leukemia (AML) samples. The last task is challenging due to the phenotypic heterogeneity of the several subtypes of AML. However, we identified thirteen immunophenotypes corresponding to subtypes of AML, and were able to distinguish Acute Promyelocytic Leukemia from other subtypes of AML.

  2. Template-based and free modeling of I-TASSER and QUARK pipelines using predicted contact maps in CASP12.

    Science.gov (United States)

    Zhang, Chengxin; Mortuza, S M; He, Baoji; Wang, Yanting; Zhang, Yang

    2018-03-01

    We develop two complementary pipelines, "Zhang-Server" and "QUARK", based on I-TASSER and QUARK pipelines for template-based modeling (TBM) and free modeling (FM), and test them in the CASP12 experiment. The combination of I-TASSER and QUARK successfully folds three medium-size FM targets that have more than 150 residues, even though the interplay between the two pipelines still awaits further optimization. Newly developed sequence-based contact prediction by NeBcon plays a critical role to enhance the quality of models, particularly for FM targets, by the new pipelines. The inclusion of NeBcon predicted contacts as restraints in the QUARK simulations results in an average TM-score of 0.41 for the best in top five predicted models, which is 37% higher than that by the QUARK simulations without contacts. In particular, there are seven targets that are converted from non-foldable to foldable (TM-score >0.5) due to the use of contact restraints in the simulations. Another additional feature in the current pipelines is the local structure quality prediction by ResQ, which provides a robust residue-level modeling error estimation. Despite the success, significant challenges still remain in ab initio modeling of multi-domain proteins and folding of β-proteins with complicated topologies bound by long-range strand-strand interactions. Improvements on domain boundary and long-range contact prediction, as well as optimal use of the predicted contacts and multiple threading alignments, are critical to address these issues seen in the CASP12 experiment. © 2017 Wiley Periodicals, Inc.
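
    The TM-score thresholds quoted above can be made concrete with a short sketch. The formula below is the standard TM-score definition, which is not spelled out in the abstract; the function names are hypothetical, and the sketch assumes the per-residue distances come from an already chosen superposition (the full TM-score takes the maximum over superpositions).

```python
def tm_d0(target_length):
    """Length-dependent distance scale d0 = 1.24 * (L - 15)^(1/3) - 1.8."""
    if target_length <= 15:
        raise ValueError("d0 is not defined for chains of 15 residues or fewer")
    d0 = 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8
    if d0 <= 0:
        raise ValueError("chain too short for a positive d0 in this sketch")
    return d0

def tm_score(distances, target_length):
    """TM-score for one fixed superposition.

    `distances` holds the distance (in angstroms) between each aligned
    residue pair; unaligned target residues simply contribute nothing.
    """
    d0 = tm_d0(target_length)
    return sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances) / target_length

if __name__ == "__main__":
    length = 150
    print(tm_score([0.0] * length, length))            # identical structures
    print(tm_score([tm_d0(length)] * length, length))  # every pair at distance d0
```

    With every aligned pair exactly d0 apart, each term contributes 0.5, which shows why 0.5 is a natural boundary for calling a target "foldable". As a rough reading of the reported numbers, an average of 0.41 that is 37% higher than the contact-free runs implies a baseline of about 0.30 (0.41 / 1.37).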

  3. Compact Structure Patterns in Proteins.

    Science.gov (United States)

    Chitturi, Bhadrachalam; Shi, Shuoyong; Kinch, Lisa N; Grishin, Nick V

    2016-10-23

    Globular proteins typically fold into tightly packed arrays of regular secondary structures. We developed a model to approximate the compact parallel and antiparallel arrangement of α-helices and β-strands, enumerated all possible topologies formed by up to five secondary structural elements (SSEs), searched for their occurrence in spatial structures of proteins, and documented their frequencies of occurrence in the PDB. The enumeration model grows larger super-secondary structure patterns (SSPs) by combining pairs of smaller patterns, a process that approximates a potential path of protein fold evolution. The most prevalent SSPs are typically present in superfolds such as the Rossmann-like fold, the ferredoxin-like fold, and the Greek key motif, whereas the less frequent SSPs often possess uncommon structure features such as split β-sheets, left-handed connections, and crossing loops. This complete SSP enumeration model, for the first time, allows us to investigate which theoretically possible SSPs are not observed in available protein structures. All SSPs with up to four SSEs occurred in proteins. However, among the SSPs with five SSEs, approximately 20% (218) are absent from existing folds. Of these unobserved SSPs, 80% contain two or more uncommon structure features. To facilitate future efforts in protein structure classification, engineering, and design, we provide the resulting patterns and their frequency of occurrence in proteins at: http://prodata.swmed.edu/ssps/. Copyright © 2016. Published by Elsevier Ltd.

  4. Julius – a template based supplementary electronic health record system

    Directory of Open Access Journals (Sweden)

    Klein Gunnar O

    2007-05-01

    Full Text Available Abstract Background: EHR systems are widely used in hospitals and primary care centres but it is usually difficult to share information and to collect patient data for clinical research. This is partly due to the different proprietary information models and inconsistent data quality. Our objective was to provide a more flexible solution enabling the clinicians to define which data are to be recorded and shared for both routine documentation and clinical studies. It should be possible to reuse the data through a common set of variable definitions providing a consistent nomenclature and validation of data. Another objective was that it should be possible to use the templates for data entry and presentation in combination with the existing EHR systems. Methods: We have designed and developed a template-based system (called Julius) that was integrated with existing EHR systems. The system is driven by the medical domain knowledge defined by clinicians in the form of templates and variable definitions stored in a common data repository. The system architecture consists of three layers. The presentation layer is purely web-based, which facilitates integration with existing EHR products. The domain layer consists of the template design system, a variable/clinical concept definition system, and the transformation and validation logic, all implemented in Java. The data source layer utilizes an object relational mapping tool and a relational database. Results: The Julius system has been implemented, tested and deployed to three health care units in Stockholm, Sweden. The initial responses from the pilot users were positive. The template system facilitates patient data collection in many ways. The experience of using the template system suggests that enabling the clinicians to be in control of the system is a good way to add supplementary functionality to the present EHR systems. Conclusion: The approach of the template system in combination with various local EHR

  5. DOMAC: an accurate, hybrid protein domain prediction server

    OpenAIRE

    Cheng, Jianlin

    2007-01-01

    Protein domain prediction is important for protein structure prediction, structure determination, function annotation, mutagenesis analysis and protein engineering. Here we describe an accurate protein domain prediction server (DOMAC) combining both template-based and ab initio methods. The preliminary version of the server was ranked among the top domain prediction servers in the seventh edition of Critical Assessment of Techniques for Protein Structure Prediction (CASP7), 2006. DOMAC server...

  6. The protein structure initiative structural genomics knowledgebase.

    Science.gov (United States)

    Berman, Helen M; Westbrook, John D; Gabanyi, Margaret J; Tao, Wendy; Shah, Raship; Kouranov, Andrei; Schwede, Torsten; Arnold, Konstantin; Kiefer, Florian; Bordoli, Lorenza; Kopp, Jürgen; Podvinec, Michael; Adams, Paul D; Carter, Lester G; Minor, Wladek; Nair, Rajesh; La Baer, Joshua

    2009-01-01

    The Protein Structure Initiative Structural Genomics Knowledgebase (PSI SGKB, http://kb.psi-structuralgenomics.org) has been created to turn the products of the PSI structural genomics effort into knowledge that can be used by the biological research community to understand living systems and disease. This resource provides central access to structures in the Protein Data Bank (PDB), along with functional annotations, associated homology models, worldwide protein target tracking information, available protocols and the potential to obtain DNA materials for many of the targets. It also offers the ability to search all of the structural and methodological publications and the innovative technologies that were catalyzed by the PSI's high-throughput research efforts. In collaboration with the Nature Publishing Group, the PSI SGKB provides a research library, editorials about new research advances, news and an events calendar to present a broader view of structural biology and structural genomics. By making these resources freely available, the PSI SGKB serves as a bridge to connect the structural biology and the greater biomedical communities.

  7. Template-based combinatorial enumeration of virtual compound libraries for lipids

    Directory of Open Access Journals (Sweden)

    Sud Manish

    2012-09-01

    Full Text Available Abstract A variety of software packages are available for the combinatorial enumeration of virtual libraries for small molecules, starting from specifications of core scaffolds with attachment points and lists of R-groups as SMILES or SD files. Although SD files include atomic coordinates for core scaffolds and R-groups, it is not possible to control the 2-dimensional (2D) layout of the enumerated structures generated for virtual compound libraries because different packages generate different 2D representations for the same structure. We have developed a software package called LipidMapsTools for the template-based combinatorial enumeration of virtual compound libraries for lipids. Virtual libraries are enumerated for the specified lipid abbreviations using matching lists of pre-defined templates and chain abbreviations, instead of core scaffolds and lists of R-groups provided by the user. 2D structures of the enumerated lipids are drawn in a specific and consistent fashion adhering to the framework for representing lipid structures proposed by the LIPID MAPS consortium. LipidMapsTools is lightweight, relatively fast and contains no external dependencies. It is an open source package and freely available under the terms of the modified BSD license.
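
    Enumerating a library from pre-defined templates and chain abbreviations amounts to taking a Cartesian product over the chain positions of each template. The sketch below illustrates only that combinatorial step; it is not LipidMapsTools code, it draws no 2D structures, and the `TEMPLATES` table, the chain list and the abbreviation format are assumptions made for illustration.

```python
from itertools import product

# Hypothetical template table: lipid class abbreviation -> number of chain positions.
TEMPLATES = {"PC": 2, "PE": 2, "TG": 3}

# Hypothetical chain abbreviations (carbons:double bonds, optional geometry).
CHAINS = ["16:0", "18:0", "18:1(9Z)"]

def enumerate_lipids(template, chains):
    """Return every abbreviation obtained by filling each chain position
    of `template` with each entry of `chains` (order matters)."""
    positions = TEMPLATES[template]
    return [f"{template}({'/'.join(combo)})"
            for combo in product(chains, repeat=positions)]

if __name__ == "__main__":
    library = enumerate_lipids("PC", CHAINS)
    print(len(library), library[:3])
```

    The library size grows as the number of chains raised to the number of positions, so three chains give 9 two-chain lipids and 27 three-chain lipids.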

  8. Structure Prediction of Membrane Proteins

    Science.gov (United States)

    Hu, Xiche

    Membrane proteins play a central role in many cellular and physiological processes. It is estimated that integral membrane proteins make up about 20-30% of the proteome (Krogh et al., 2001b; Stevens and Arkin, 2000; von Heijne, 1999). They are essential mediators of material and information transfer across cell membranes. Their functions include active and passive transport of molecules into and out of cells and organelles; transduction of energy among various forms (light, electrical, and chemical energy); as well as reception and transduction of chemical and electrical signals across membranes (Avdonin, 2005; Bockaert et al., 2002; Pahl, 1999; Rehling et al., 2004; Stack et al., 1995). Identifying these transmembrane (TM) proteins and deciphering their molecular mechanisms, then, is of great importance, particularly as applied to biomedicine. Membrane proteins are the targets of a large number of pharmacologically and toxicologically active substances, and are directly involved in their uptake, metabolism, and clearance (Bettler et al., 1998; Cohen, 2002; Heusser and Jardieu, 1997; Tibes et al., 2005; Xu et al., 2005). Despite the importance of membrane proteins, the knowledge of their high-resolution structures and mechanisms of action has lagged far behind in comparison to that of water-soluble proteins: less than 1% of all three-dimensional structures deposited in the Protein Data Bank are of membrane proteins. This unfortunate disparity stems from difficulties in overexpression and the crystallization of membrane proteins (Grisshammer and Tate, 1995; Michel, 1991).

  9. Protein Structure Refinement by Optimization

    DEFF Research Database (Denmark)

    Carlsen, Martin

    million sequences that are known. Determining the protein structure from its sequence of amino acids is therefore a major problem in computational structural biology and is referred to as the protein folding problem. The folding problem is solved using de novo methods or comparative methods depending...... on whether the three-dimensional structure of a homologous sequence is known. Whether or not a protein model can be used for industrial purposes depends on the quality of the predicted structure. A model can be used to design a drug when the quality is high. The overall goal of this project is to assess...... that correlates maximally to a native-decoy distance. The main contribution of this thesis is methods developed for analyzing the performance of metrically trained knowledge-based potentials and for optimizing their performance while making them less dependent on the decoy set used to define them. We focus...

  10. Protein interfacial structure and nanotoxicology

    Science.gov (United States)

    White, John W.; Perriman, Adam W.; McGillivray, Duncan J.; Lin, Jhih-Min

    2009-02-01

    Here we briefly recapitulate the use of X-ray and neutron reflectometry at the air-water interface to find protein structures and thermodynamics at interfaces and test a possibility for understanding those interactions between nanoparticles and proteins which lead to nanoparticle toxicology through entry into living cells. Stable monomolecular protein films have been made at the air-water interface and, with a specially designed vessel, the substrate changed from that on which the air-water interfacial film was deposited. This procedure allows interactions, both chemical and physical, between introduced species and the monomolecular film to be studied by reflectometry. The method is briefly illustrated here with some new results on protein-protein interaction between β-casein and κ-casein at the air-water interface using X-rays. These two proteins are an essential component of the structure of milk. In the experiments reported, specific and directional interactions appear to cause different interfacial structures if, first, a β-casein monolayer is attacked by a κ-casein solution compared to the reverse. The additional contrast associated with neutrons will be an advantage here. We then show the first results of experiments on the interaction of a β-casein monolayer with a nanoparticle titanium oxide sol, foreshadowing the study of the nanoparticle "corona" thought to be important for nanoparticle-cell wall penetration.

  11. How Permissive Are Protein Structures?

    Science.gov (United States)

    Hecht, Michael

    2000-03-01

    How permissive are protein structures? Can folded structures be isolated from combinatorial libraries of de novo amino acid sequences? We have developed an approach that benefits from the diversity of combinatorial methods, while simultaneously incorporating key design features that favor desired structures and properties. Our strategy is based on the premise that the locations of polar and nonpolar residues must be specified explicitly, but their precise identities can be varied extensively. Thus, the strategy uses a "binary code" that specifies only whether a given position is hydrophobic or hydrophilic. Since the precise identity of each polar or nonpolar residue is not specified, the binary code strategy facilitates the design and construction of libraries with enormous combinatorial diversity. Experiments will be presented describing binary code libraries of proteins designed for (i) structure (alpha-helical and beta-sheet); (ii) cofactor binding; (iii) catalytic activity; and (iv) assembly into fibrils resembling the amyloid found in neurodegenerative diseases.
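
    The binary code strategy can be sketched as a pattern of polar (P) and nonpolar (N) positions, each filled from a residue set. The two residue sets and the example pattern below are assumptions for illustration; the abstract does not list them, and the function names are hypothetical.

```python
import random

# Assumed residue sets for illustration; the abstract does not specify them.
NONPOLAR = "FLIMV"
POLAR = "DEHKNQ"

def residues_for(symbol):
    """Map a binary-pattern symbol to its allowed residues."""
    if symbol == "N":
        return NONPOLAR
    if symbol == "P":
        return POLAR
    raise ValueError(f"unknown pattern symbol: {symbol!r}")

def library_size(pattern):
    """Number of distinct sequences compatible with the binary pattern."""
    size = 1
    for symbol in pattern:
        size *= len(residues_for(symbol))
    return size

def sample_sequence(pattern, rng):
    """Draw one sequence that obeys the polar/nonpolar pattern."""
    return "".join(rng.choice(residues_for(symbol)) for symbol in pattern)

if __name__ == "__main__":
    pattern = "PNPPNNP"  # hypothetical seven-residue pattern
    print(library_size(pattern))
    print(sample_sequence(pattern, random.Random(0)))
```

    Even this seven-position pattern admits 6^4 x 5^3 = 162,000 sequences under the assumed residue sets, which shows how quickly the combinatorial diversity grows while the polar/nonpolar periodicity stays fixed.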

  12. Structural entanglements in protein complexes

    Science.gov (United States)

    Zhao, Yani; Chwastyk, Mateusz; Cieplak, Marek

    2017-06-01

    We consider multi-chain protein native structures and propose a criterion that determines whether two chains in the system are entangled or not. The criterion is based on the behavior observed by pulling at both termini of each chain simultaneously in the two chains. We have identified about 900 entangled systems in the Protein Data Bank and provided a more detailed analysis for several of them. We argue that entanglement enhances the thermodynamic stability of the system but it may have other functions: burying the hydrophobic residues at the interface and increasing the DNA or RNA binding area. We also study the folding and stretching properties of the knotted dimeric proteins MJ0366, YibK, and bacteriophytochrome. These proteins have been studied theoretically in their monomeric versions so far. The dimers are seen to separate on stretching through the tensile mechanism and the characteristic unraveling force depends on the pulling direction.

  13. Evaluation of template-based models in CASP8 with standard measures

    KAUST Repository

    Cozzetto, Domenico

    2009-01-01

    The strategy for evaluating template-based models submitted to CASP has continuously evolved from CASP1 to CASP5, leading to a standard procedure that has been used in all subsequent editions. The established approach includes methods for calculating the quality of each individual model, for assigning scores based on the distribution of the results for each target and for computing the statistical significance of the differences in scores between prediction methods. These data are made available to the assessor of the template-based modeling category, who uses them as a starting point for further evaluations and analyses. This article describes the detailed workflow of the procedure, provides justifications for a number of choices that are customarily made for CASP data evaluation, and reports the results of the analysis of template-based predictions at CASP8.

  14. Synthesis of copper telluride nanowires using template-based ...

    Indian Academy of Sciences (India)

    Structural characteristics were examined using X-ray diffraction and scanning electron microscopy, which confirm the formation of CuTe nanowires. Investigation for chemical sensing was carried out using air and chloroform, acetone, ethanol, glycerol, distilled water as liquids having dielectric constants 1, 4.81, 8.93, 21, ...

  15. Algorithms for Protein Structure Prediction

    DEFF Research Database (Denmark)

    Paluszewski, Martin

    ) is more robust than standard Monte Carlo search. In the second approach for reconstruction of Cα-traces, an exact branch and bound algorithm has been developed [67, 65]. The model is discrete and makes use of secondary structure predictions, HSE, CN and radius of gyration. We show how to compute good lower......The problem of predicting the three-dimensional structure of a protein given its amino acid sequence is one of the most important open problems in bioinformatics. One of the carbon atoms in amino acids is the Cα-atom and the overall structure of a protein is often represented by a so-called Cα......-trace. Here we present three different approaches for reconstruction of Cα-traces from predictable measures. In our first approach [63, 62], the Cα-trace is positioned on a lattice and a tabu-search algorithm is applied to find minimum energy structures. The energy function is based on half-sphere-exposure (HSE...

  16. Fractal aspects of calcium binding protein structures

    Energy Technology Data Exchange (ETDEWEB)

    Isvoran, Adriana [West University of Timisoara, Department of Chemistry, Pestalozzi 16, 300115 Timisoara (Romania)], E-mail: aisvoran@cbg.uvt.ro; Pitulice, Laura [West University of Timisoara, Department of Chemistry, Pestalozzi 16, 300115 Timisoara (Romania); Craescu, Constantin T. [INSERM U759/Institute Curie-Recherche, Centre Universitaire Paris-Sud, Batiment 112, 91405 Orsay (France); Chiriac, Adrian [West University of Timisoara, Department of Chemistry, Pestalozzi 16, 300115 Timisoara (Romania)

    2008-03-15

    The structures of EF-hand calcium binding proteins may be classified into two distinct groups: extended and compact structures. In this paper we studied 20 different structures of calcium binding proteins using the fractal analysis. Nine structures show extended shapes, one is semi-compact and the other 10 have compact shapes. Our study reveals different fractal characteristics for protein backbones belonging to different structural classes and these observations may be correlated to the physicochemical forces governing the protein folding.

  17. Relationship between protein structure and geometrical constraints

    DEFF Research Database (Denmark)

    Lund, Ole; Hansen, Jan; Brunak, Søren

    1996-01-01

    We evaluate to what extent the structure of proteins can be deduced from incomplete knowledge of disulfide bridges, surface assignments, secondary structure assignments, and additional distance constraints. A cost function taking such constraints into account was used to obtain protein structures...... using a simple minimization algorithm. For small proteins, the approximate structure could be obtained using one additional distance constraint for each amino acid in the protein. We also studied the effect of using predicted secondary structure and surface assignments. The constraints used...

  18. Relationship between protein structure and geometrical constraints

    DEFF Research Database (Denmark)

    Lund, Ole; Hansen, Jan; Brunak, Søren

    1996-01-01

    We evaluate to what extent the structure of proteins can be deduced from incomplete knowledge of disulfide bridges, surface assignments, secondary structure assignments, and additional distance constraints. A cost function taking such constraints into account was used to obtain protein structures...... using a simple minimization algorithm. For small proteins, the approximate structure could be obtained using one additional distance constraint for each amino acid in the protein. We also studied the effect of using predicted secondary structure and surface assignments. The constraints used...

  19. Introduction to Protein Structure through Genetic Diseases

    Science.gov (United States)

    Schneider, Tanya L.; Linton, Brian R.

    2008-01-01

    An illuminating way to learn about protein function is to explore high-resolution protein structures. Analysis of the proteins involved in genetic diseases has been used to introduce students to protein structure and the role that individual mutations can play in the onset of disease. Known mutations can be correlated to changes in protein…

  20. Relationship between protein structure and geometrical constraints.

    OpenAIRE

    Lund, O.; Hansen, J.; Brunak, S.; Bohr, J.

    1996-01-01

    We evaluate to what extent the structure of proteins can be deduced from incomplete knowledge of disulfide bridges, surface assignments, secondary structure assignments, and additional distance constraints. A cost function taking such constraints into account was used to obtain protein structures using a simple minimization algorithm. For small proteins, the approximate structure could be obtained using one additional distance constraint for each amino acid in the protein. We also studied the...
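
    A cost function built from distance constraints can be minimized with very little machinery. The sketch below is a simplified illustration, not the authors' cost function: it keeps only pairwise distance terms (no disulfide, surface or secondary-structure terms), and plain gradient descent is used as a stand-in for the "simple minimization algorithm", whose details the abstract does not give. The step size and iteration count are arbitrary choices.

```python
import math

def cost(coords, constraints):
    """Sum of squared violations of the target distances.

    `constraints` is a list of (i, j, target_distance) tuples."""
    return sum((math.dist(coords[i], coords[j]) - d) ** 2
               for i, j, d in constraints)

def minimize(coords, constraints, step=0.05, iterations=1000):
    """Plain gradient descent on the distance-constraint cost."""
    coords = [list(p) for p in coords]
    for _ in range(iterations):
        grads = [[0.0, 0.0, 0.0] for _ in coords]
        for i, j, d in constraints:
            r = math.dist(coords[i], coords[j])
            if r == 0.0:
                continue  # direction undefined; skip this term for this step
            factor = 2.0 * (r - d) / r
            for k in range(3):
                g = factor * (coords[i][k] - coords[j][k])
                grads[i][k] += g
                grads[j][k] -= g
        for p, g in zip(coords, grads):
            for k in range(3):
                p[k] -= step * g[k]
    return coords

if __name__ == "__main__":
    # Three points that should end up 3, 4 and 5 units apart.
    start = [(0.0, 0.0, 0.0), (2.5, 0.3, 0.0), (2.6, 3.5, 0.0)]
    constraints = [(0, 1, 3.0), (1, 2, 4.0), (0, 2, 5.0)]
    final = minimize(start, constraints)
    print(cost(start, constraints), cost(final, constraints))
```

    The result is determined only up to rotation, translation and reflection, since the cost depends on distances alone; additional constraint types would be needed to fix chirality.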

  1. An Interactive Introduction to Protein Structure

    Science.gov (United States)

    Lee, W. Theodore

    2004-01-01

    To improve student understanding of protein structure and the significance of noncovalent interactions in protein structure and function, students are assigned a project to write a paper complemented with computer-generated images. The assignment provides an opportunity for students to select a protein structure that is of interest and detail…

  2. Template-Based de Novo Design for Type II Kinase Inhibitors and Its Extended Application to Acetylcholinesterase Inhibitors

    Directory of Open Access Journals (Sweden)

    Bo-Han Su

    2013-10-01

    Full Text Available There is a compelling need to discover type II inhibitors targeting the unique DFG-out inactive kinase conformation since they are likely to possess greater potency and selectivity relative to traditional type I inhibitors. Using a known inhibitor, such as a currently available and approved drug or inhibitor, as a template to design new drugs via computational de novo design is helpful when working with known ligand-receptor interactions. This study proposes a new template-based de novo design protocol to discover new inhibitors that preserve and also optimize the binding interactions of the type II kinase template. First, sorafenib (Nexavar®) and nilotinib (Tasigna®), two type II inhibitors with different ligand-receptor interactions, were selected as the template compounds. The five-step protocol can reassemble each drug from a large fragment library. Our procedure demonstrates that the selected template compounds can be successfully reassembled while the key ligand-receptor interactions are preserved. Furthermore, to demonstrate that the algorithm is able to construct more potent compounds, we considered kinase inhibitors and another protein dataset, acetylcholinesterase (AChE) inhibitors. The de novo optimization was initiated using a template compound possessing a less than optimal activity from a series of aminoisoquinoline and TAK-285 derivatives inhibiting type II kinases, and E2020 derivatives inhibiting AChE, respectively. Three compounds with greater potency than the template compound were discovered that were also included in the original congeneric series. This template-based lead optimization protocol with the fragment library can help to design compounds with preferred binding interactions of known inhibitors automatically and further optimize the compounds in the binding pockets.

  3. Neural Networks for protein Structure Prediction

    DEFF Research Database (Denmark)

    Bohr, Henrik

    1998-01-01

    This is a review about neural network applications in bioinformatics. Especially the applications to protein structure prediction, e.g. prediction of secondary structures, prediction of surface structure, fold class recognition and prediction of the 3-dimensional structure of protein backbones...

  4. Sucralose Destabilization of Protein Structure

    Science.gov (United States)

    Cho, Inha; Chen, Lee; Shukla, Nimesh; Othon, Christina

    2015-03-01

    Sucralose is a commonly employed artificial sweetener. Sucralose behaves very differently than its natural disaccharide counterpart, sucrose, in terms of its interaction with biomolecules. The presence of sucralose in solution is found to destabilize the native structure of the globular protein Bovine Serum Albumin (BSA). The melting temperature decreases as a linear function of sucralose concentration. We correlate this destabilization with the increased polarity of the sucralose molecule as compared to sucrose. The strongly polar nature is observed as a large dielectric friction exerted on the excited state rotational diffusion of tryptophan using time-resolved fluorescence anisotropy. Tryptophan exhibits rotational diffusion proportional to the measured bulk viscosity for sucrose solutions over a wide range of concentrations, consistent with a Stokes-Einstein diffusional model. For sucralose solutions however, the diffusion is linearly dependent with the concentration, strongly diverging from the viscosity predictions. The polar nature of sucralose causes a dramatically different interaction with biomolecules than natural disaccharide molecules. Connecticut Space Grant Consortium.

  5. Structure-based barcoding of proteins.

    Science.gov (United States)

    Metri, Rahul; Jerath, Gaurav; Kailas, Govind; Gacche, Nitin; Pal, Adityabarna; Ramakrishnan, Vibin

    2014-01-01

    A reduced representation in the format of a barcode has been developed to provide an overview of the topological nature of a given protein structure from a 3D coordinate file. The molecular structure of a protein coordinate file from the Protein Data Bank is first expressed in terms of an alpha-numero code and further converted to a barcode image. The barcode representation can be used to compare and contrast different proteins based on their structure. The utility of this method has been exemplified by comparing structural barcodes of proteins that belong to the same fold family, and across different folds. In addition to this, we have attempted to provide an illustration of (i) the structural changes often seen in a given protein molecule upon interaction with ligands and (ii) modifications in the overall topology of a given protein during evolution. The program is fully downloadable from the website http://www.iitg.ac.in/probar/. © 2013 The Protein Society.

  6. The Formation of Protein Structure

    DEFF Research Database (Denmark)

    Bohr, Jakob; Bohr, Henrik; Brunak, Søren

    1996-01-01

    Dynamically induced curvature owing to long-range excitations along the backbones of protein molecules with non-linear elastic properties may control the folding of proteins.

  7. Gauge symmetries and structure of proteins

    Directory of Open Access Journals (Sweden)

    Molochkov Alexander

    2017-01-01

    Full Text Available We discuss the gauge field theory approach to the study of protein structure, which provides a natural way to introduce collective degrees of freedom and nonlinear topological structures. The local symmetry of proteins and its breaking in the medium are considered, which allows one to derive an Abelian Higgs model of the protein backbone, whose correct folding is defined by gauge symmetry breaking due to hydrophobic forces. Within this model, the structure of the protein backbone is defined by a superposition of one-dimensional topological solitons (kinks), which allows one to reproduce the three-dimensional structure of the protein backbone with a precision of up to 1 Å and to predict its dynamics.

  8. Protein loop modeling using a new hybrid energy function and its application to modeling in inaccurate structural environments.

    Science.gov (United States)

    Park, Hahnbeom; Lee, Gyu Rie; Heo, Lim; Seok, Chaok

    2014-01-01

    Protein loop modeling is a tool for predicting protein local structures of particular interest, providing opportunities for applications involving protein structure prediction and de novo protein design. Until recently, the majority of loop modeling methods have been developed and tested by reconstructing loops in frameworks of experimentally resolved structures. In many practical applications, however, the protein loops to be modeled are located in inaccurate structural environments. These include loops in model structures, low-resolution experimental structures, or experimental structures of different functional forms. Accordingly, discrepancies in the accuracy of the structural environment assumed in development of the method and that in practical applications present additional challenges to modern loop modeling methods. This study demonstrates a new strategy for employing a hybrid energy function combining physics-based and knowledge-based components to help tackle this challenge. The hybrid energy function is designed to combine the strengths of each energy component, simultaneously maintaining accurate loop structure prediction in a high-resolution framework structure and tolerating minor environmental errors in low-resolution structures. A loop modeling method based on global optimization of this new energy function is tested on loop targets situated in different levels of environmental errors, ranging from experimental structures to structures perturbed in backbone as well as side chains and template-based model structures. The new method performs comparably to force field-based approaches in loop reconstruction in crystal structures and better in loop prediction in inaccurate framework structures. This result suggests that higher-accuracy predictions would be possible for a broader range of applications. The web server for this method is available at http://galaxy.seoklab.org/loop with the PS2 option for the scoring function.

  9. Protein loop modeling using a new hybrid energy function and its application to modeling in inaccurate structural environments.

    Directory of Open Access Journals (Sweden)

    Hahnbeom Park

    Full Text Available Protein loop modeling is a tool for predicting protein local structures of particular interest, providing opportunities for applications involving protein structure prediction and de novo protein design. Until recently, the majority of loop modeling methods have been developed and tested by reconstructing loops in frameworks of experimentally resolved structures. In many practical applications, however, the protein loops to be modeled are located in inaccurate structural environments. These include loops in model structures, low-resolution experimental structures, or experimental structures of different functional forms. Accordingly, discrepancies in the accuracy of the structural environment assumed in development of the method and that in practical applications present additional challenges to modern loop modeling methods. This study demonstrates a new strategy for employing a hybrid energy function combining physics-based and knowledge-based components to help tackle this challenge. The hybrid energy function is designed to combine the strengths of each energy component, simultaneously maintaining accurate loop structure prediction in a high-resolution framework structure and tolerating minor environmental errors in low-resolution structures. A loop modeling method based on global optimization of this new energy function is tested on loop targets situated in different levels of environmental errors, ranging from experimental structures to structures perturbed in backbone as well as side chains and template-based model structures. The new method performs comparably to force field-based approaches in loop reconstruction in crystal structures and better in loop prediction in inaccurate framework structures. This result suggests that higher-accuracy predictions would be possible for a broader range of applications. The web server for this method is available at http://galaxy.seoklab.org/loop with the PS2 option for the scoring function.

  10. Constrained Peptides as Miniature Protein Structures

    Science.gov (United States)

    Yin, Hang

    2012-01-01

    This paper discusses the recent developments of protein engineering using both covalent and noncovalent bonds to constrain peptides, forcing them into designed protein secondary structures. These constrained peptides subsequently can be used as peptidomimetics for biological functions such as regulations of protein-protein interactions. PMID:25969758

  11. Improving the physical realism and structural accuracy of protein models by a two-step atomic-level energy minimization.

    Science.gov (United States)

    Xu, Dong; Zhang, Yang

    2011-11-16

    Most protein structural prediction algorithms assemble structures as reduced models that represent amino acids by a reduced number of atoms to speed up the conformational search. Building accurate full-atom models from these reduced models is a necessary step toward a detailed function analysis. However, it is difficult to ensure that the atomic models retain the desired global topology while maintaining a sound local atomic geometry because the reduced models often have unphysical local distortions. To address this issue, we developed a new program, called ModRefiner, to construct and refine protein structures from Cα traces based on a two-step, atomic-level energy minimization. The main-chain structures are first constructed from initial Cα traces and the side-chain rotamers are then refined together with the backbone atoms with the use of a composite physics- and knowledge-based force field. We tested the method by performing an atomic structure refinement of 261 proteins with the initial models constructed from both ab initio and template-based structure assemblies. Compared with other state-of-art programs, ModRefiner shows improvements in both global and local structures, which have more accurate side-chain positions, better hydrogen-bonding networks, and fewer atomic overlaps. ModRefiner is freely available at http://zhanglab.ccmb.med.umich.edu/ModRefiner. Copyright © 2011 Biophysical Society. Published by Elsevier Inc. All rights reserved.

  12. Structuring high-protein foods

    NARCIS (Netherlands)

    Purwanti, N.

    2012-01-01

    Increased protein consumption gives rise to various health benefits. High-protein intake can lead to muscle development, body weight control and suppression of sarcopenia progression. However, increasing the protein content in food products leads to textural changes over time. These changes result

  13. The interface of protein structure, protein biophysics, and molecular evolution

    Science.gov (United States)

    Liberles, David A; Teichmann, Sarah A; Bahar, Ivet; Bastolla, Ugo; Bloom, Jesse; Bornberg-Bauer, Erich; Colwell, Lucy J; de Koning, A P Jason; Dokholyan, Nikolay V; Echave, Julian; Elofsson, Arne; Gerloff, Dietlind L; Goldstein, Richard A; Grahnen, Johan A; Holder, Mark T; Lakner, Clemens; Lartillot, Nicholas; Lovell, Simon C; Naylor, Gavin; Perica, Tina; Pollock, David D; Pupko, Tal; Regan, Lynne; Roger, Andrew; Rubinstein, Nimrod; Shakhnovich, Eugene; Sjölander, Kimmen; Sunyaev, Shamil; Teufel, Ashley I; Thorne, Jeffrey L; Thornton, Joseph W; Weinreich, Daniel M; Whelan, Simon

    2012-01-01

    The interface of protein structural biology, protein biophysics, molecular evolution, and molecular population genetics forms the foundations for a mechanistic understanding of many aspects of protein biochemistry. Current efforts in interdisciplinary protein modeling are in their infancy and the state of the art of such models is described. Beyond the relationship between amino acid substitution and static protein structure, protein function, and corresponding organismal fitness, other considerations are also discussed. More complex mutational processes such as insertion and deletion and domain rearrangements and even circular permutations should be evaluated. The role of intrinsically disordered proteins is still controversial, but may be increasingly important to consider. Protein geometry and protein dynamics as a deviation from static considerations of protein structure are also important. Protein expression level is known to be a major determinant of evolutionary rate and several considerations including selection at the mRNA level and the role of interaction specificity are discussed. Lastly, the relationship between modeling and needed high-throughput experimental data as well as experimental examination of protein evolution using ancestral sequence resurrection and in vitro biochemistry are presented, towards an aim of ultimately generating better models for biological inference and prediction. PMID:22528593

  14. Determining and visualizing flexibility in protein structures.

    Science.gov (United States)

    Scott, Walter R P; Straus, Suzana K

    2015-05-01

    How to compare the structures of an ensemble of protein conformations is a fundamental problem in structural biology. As has been previously observed, the widely used RMSD measure due to Kabsch, in which a rigid-body superposition minimizing the least-squares positional deviations is performed, has its drawbacks when comparing and visualizing a set of flexible protein structures. Here, we develop a method, fleximatch, of protein structure comparison that takes flexibility into account. Based on a distance matrix measure of flexibility, a weighted superposition of distance matrices rather than of atomic coordinates is performed. Subsequently, this allows a consistent determination of (a) a superposition of structures for visualization, (b) a partitioning of the protein structure into rigid molecular components (core atoms), and (c) an atomic mobility measure. The method is suitable for highlighting both particularly flexible and rigid parts of a protein from structures derived from NMR, X-ray diffraction or molecular simulation. © 2015 Wiley Periodicals, Inc.
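
    The idea of comparing distance matrices rather than superposed coordinates can be illustrated with a small sketch. This is not the fleximatch algorithm itself: the function names and the unweighted standard-deviation measure below are assumptions chosen for illustration. It only shows that inter-atomic distances can be compared across conformers without any rigid-body superposition.

```python
import math
from itertools import combinations
from statistics import pstdev


def distance_matrix(coords):
    """Return {(i, j): distance} for every atom pair with i < j."""
    return {
        (i, j): math.dist(coords[i], coords[j])
        for i, j in combinations(range(len(coords)), 2)
    }


def pair_flexibility(ensemble):
    """Standard deviation of each inter-atomic distance across conformers.

    Hypothetical flexibility measure for illustration only; pairs whose
    distance never changes (rigid pairs) score 0.0.
    """
    matrices = [distance_matrix(conformer) for conformer in ensemble]
    return {
        pair: pstdev([m[pair] for m in matrices])
        for pair in matrices[0]
    }


# Toy ensemble: atoms 0 and 1 stay fixed, atom 2 swings around atom 1.
ensemble = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0)],
]
flex = pair_flexibility(ensemble)
for pair, value in sorted(flex.items()):
    print(pair, round(value, 3))
```

    In this toy case only the 0-2 distance varies, so atoms 0-1 and 1-2 would be grouped as rigid pairs while the 0-2 pair is flagged as flexible.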

  15. Performance of protein-structure predictions with the physics-based UNRES force field in CASP11.

    Science.gov (United States)

    Krupa, Paweł; Mozolewska, Magdalena A; Wiśniewska, Marta; Yin, Yanping; He, Yi; Sieradzan, Adam K; Ganzynkowicz, Robert; Lipska, Agnieszka G; Karczyńska, Agnieszka; Ślusarz, Magdalena; Ślusarz, Rafał; Giełdoń, Artur; Czaplewski, Cezary; Jagieła, Dawid; Zaborowski, Bartłomiej; Scheraga, Harold A; Liwo, Adam

    2016-11-01

    Participating as the Cornell-Gdansk group, we have used our physics-based coarse-grained UNited RESidue (UNRES) force field to predict protein structure in the 11th Community Wide Experiment on the Critical Assessment of Techniques for Protein Structure Prediction (CASP11). Our methodology involved extensive multiplexed replica exchange simulations of the target proteins with a recently improved UNRES force field to provide better reproductions of the local structures of polypeptide chains. All simulations were started from fully extended polypeptide chains, and no external information was included in the simulation process except for weak restraints on secondary structure to enable us to finish each prediction within the allowed 3-week time window. Because of simplified UNRES representation of polypeptide chains, use of enhanced sampling methods, code optimization and parallelization and sufficient computational resources, we were able to treat, for the first time, all 55 human prediction targets with sizes from 44 to 595 amino acid residues, the average size being 251 residues. Complete structures of six single-domain proteins were predicted accurately, with the highest accuracy being attained for the T0769, for which the CαRMSD was 3.8 Å for 97 residues of the experimental structure. Correct structures were also predicted for 13 domains of multi-domain proteins with accuracy comparable to that of the best template-based modeling methods. With further improvements of the UNRES force field that are now underway, our physics-based coarse-grained approach to protein-structure prediction will eventually reach global prediction capacity and, consequently, reliability in simulating protein structure and dynamics that are important in biochemical processes. Freely available on the web at http://www.unres.pl/ CONTACT: has5@cornell.edu. © The Author 2016. Published by Oxford University Press. All rights reserved. 

  16. PSAIA – Protein Structure and Interaction Analyzer

    Directory of Open Access Journals (Sweden)

    Vlahoviček Kristian

    2008-04-01

    Full Text Available Background: PSAIA (Protein Structure and Interaction Analyzer) was developed to compute geometric parameters for large sets of protein structures in order to predict and investigate protein-protein interaction sites. Results: In addition to most relevant established algorithms, PSAIA offers a new method, PIADA (Protein Interaction Atom Distance Algorithm), for the determination of residue interaction pairs. We found that PIADA produced more satisfactory results than comparable algorithms implemented in PSAIA. Particular advantages of PSAIA include its capacity to combine different methods to detect the locations and types of interactions between residues and its ability, without any further automation steps, to handle large numbers of protein structures and complexes. Generally, the integration of a variety of methods enables PSAIA to offer easier automation of analysis and greater reliability of results. PSAIA can be used either via a graphical user interface or from the command line. Results are generated in either tabular or XML format. Conclusion: In a straightforward fashion and for large sets of protein structures, PSAIA enables the calculation of protein geometric parameters and the determination of location and type for protein-protein interaction sites. XML-formatted output enables easy conversion of results to various formats suitable for statistical analysis. Results from smaller data sets demonstrated the influence of geometry on protein interaction sites. Comprehensive analysis of properties of large data sets led to new information useful in the prediction of protein-protein interaction sites.

  17. Understanding Protein-Protein Interactions Using Local Structural Features

    DEFF Research Database (Denmark)

    Planas-Iglesias, Joan; Bonet, Jaume; García-García, Javier

    2013-01-01

    Protein-protein interactions (PPIs) play a relevant role among the different functions of a cell. Identifying the PPI network of a given organism (interactome) is useful to shed light on the key molecular mechanisms within a biological system. In this work, we show the role of structural features...... (loops and domains) to comprehend the molecular mechanisms of PPIs. A paradox in protein-protein binding is to explain how the unbound proteins of a binary complex recognize each other among a large population within a cell and how they find their best docking interface in a short timescale. We use...... interacting and non-interacting protein pairs to classify the structural features that sustain the binding (or non-binding) behavior. Our study indicates that not only the interacting region but also the rest of the protein surface are important for the interaction fate. The interpretation...

  18. Structural genomics on membrane proteins: mini review.

    Science.gov (United States)

    Lundstrom, K

    2004-08-01

    Structural genomics, structure-based analysis of gene products, has so far mainly concentrated on soluble proteins because of their less demanding requirements for overexpression, purification and crystallisation compared to membrane proteins. This so-called "low-hanging fruit" approach has generated more than 25,000 structures deposited in databases. In contrast, the substantially more complex membrane proteins, in relation to all steps from overexpression to high-resolution structure determination, represent less than 1% of available crystal structures. This is in sharp contrast to the importance of this type of proteins, particularly G protein-coupled receptors (GPCRs), as today 60-70% of the current drug targets are based on membrane proteins. The key to improved success with membrane protein structural elucidation is technology development. The most efficient approach constitutes parallel studies on a large number of targets and evaluation of various systems for expression. Next, high throughput format solubilisation and refolding screening methods for a wide range of detergents and additives in numerous concentrations should be established. Today, several networks dealing with structural genomics approaches of membrane proteins have been initiated, among them the Membrane Protein Network (MePNet) programme that deals with the pharmaceutically important mammalian GPCRs. In MePNet, three overexpression systems have been employed for the evaluation of 101 GPCRs, which has generated large quantities of numerous recombinant GPCRs, compatible for structural biology applications.

  19. Structural classification of proteins and structural genomics: new insights into protein folding and evolution.

    Science.gov (United States)

    Andreeva, Antonina; Murzin, Alexey G

    2010-10-01

    During the past decade, the Protein Structure Initiative (PSI) centres have become major contributors of new families, superfamilies and folds to the Structural Classification of Proteins (SCOP) database. The PSI results have increased the diversity of protein structural space and accelerated our understanding of it. This review article surveys a selection of protein structures determined by the Joint Center for Structural Genomics (JCSG). It presents previously undescribed β-sheet architectures such as the double barrel and spiral β-roll and discusses new examples of unusual topologies and peculiar structural features observed in proteins characterized by the JCSG and other Structural Genomics centres.

  20. Structural characterization of proteins using residue environments.

    Science.gov (United States)

    Mooney, Sean D; Liang, Mike Hsin-Ping; DeConde, Rob; Altman, Russ B

    2005-12-01

    A primary challenge for structural genomics is the automated functional characterization of protein structures. We have developed a sequence-independent method called S-BLEST (Structure-Based Local Environment Search Tool) for the annotation of previously uncharacterized protein structures. S-BLEST encodes the local environment of an amino acid as a vector of structural property values. It has been applied to all amino acids in a nonredundant database of protein structures to generate a searchable structural resource. Given a query amino acid from an experimentally determined or modeled structure, S-BLEST quickly identifies similar amino acid environments using a K-nearest neighbor search. In addition, the method gives an estimation of the statistical significance of each result. We validated S-BLEST on X-ray crystal structures from the ASTRAL 40 nonredundant dataset. We then applied it to 86 crystallographically determined proteins in the protein data bank (PDB) with unknown function and with no significant sequence neighbors in the PDB. S-BLEST was able to associate 20 proteins with at least one local structural neighbor and identify the amino acid environments that are most similar between those neighbors. Proteins 2005. 2005 Wiley-Liss, Inc.
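
    A K-nearest-neighbor search over environment vectors, as described above, can be sketched as follows. This is a minimal illustration and not the S-BLEST implementation: the residue labels, the three-component vectors, and the use of plain Euclidean distance are all assumptions.

```python
import heapq
import math


def k_nearest(query, database, k=3):
    """Return the k (label, vector) entries closest to `query`.

    `database` maps a residue label to its environment vector.
    Euclidean distance is an assumption made for this sketch.
    """
    return heapq.nsmallest(
        k, database.items(), key=lambda item: math.dist(query, item[1])
    )


# Hypothetical environment vectors (values are invented for illustration).
database = {
    "protA:Ser10": (0.10, 0.90, 0.30),
    "protB:Ser42": (0.20, 0.80, 0.30),
    "protC:Leu7": (0.90, 0.10, 0.50),
}
query = (0.10, 0.90, 0.35)
hits = [label for label, _ in k_nearest(query, database, k=2)]
print(hits)
```

    A linear scan like this is adequate for a toy database; a structural resource covering every residue of a nonredundant structure set would more plausibly rely on a spatial index.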

  1. Near-native Protein Structure Simulation

    Directory of Open Access Journals (Sweden)

    Stefka Fidanova

    2007-10-01

    Full Text Available The protein folding problem is a fundamental problem in computational molecular biology and biochemical physics. The high-resolution 3D structure of a protein is the key to understanding and manipulating its biochemical and cellular functions. All information necessary to fold a protein to its native structure is contained in its amino-acid sequence. A protein's structure could be calculated from knowledge of its sequence and our understanding of sequence-structure relationships. Various optimization methods have been applied to formulations of the folding problem. There are two main approaches. One is based on properties of homologous proteins. The other is based on reduced models of protein structure such as the hydrophobic-polar (HP) protein model. The folding problem is then defined as an optimization problem. It is a hard optimization problem, and most authors apply Monte Carlo or metaheuristic methods to solve it. In this paper another approach is used. The HP model is used to explain the structures of protein conformations observed by biologists, and the correspondence between the primary and tertiary structures of proteins is studied.
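
    To make the HP model concrete, the sketch below scores a conformation on a 2D square lattice using the standard HP energy: minus one for every pair of H residues that are lattice neighbors but not consecutive in the chain. The function name, the 2D lattice, and the example sequences are assumptions for illustration; the paper's own formulation may differ.

```python
def hp_energy(sequence, path):
    """Energy of an HP chain placed on a 2D square lattice.

    sequence: string of 'H' (hydrophobic) and 'P' (polar) residues.
    path: list of (x, y) lattice sites, one per residue, in chain order.
    Returns -(number of H-H contacts between non-consecutive residues).
    """
    if len(sequence) != len(path) or len(set(path)) != len(path):
        raise ValueError("path must be self-avoiding and match the sequence")
    index_at = {site: i for i, site in enumerate(path)}
    contacts = 0
    for i, (x, y) in enumerate(path):
        if sequence[i] != "H":
            continue
        # Look only right and up so each neighboring pair is counted once.
        for neighbor in ((x + 1, y), (x, y + 1)):
            j = index_at.get(neighbor)
            if j is not None and sequence[j] == "H" and abs(i - j) > 1:
                contacts += 1
    return -contacts


square = [(0, 0), (1, 0), (1, 1), (0, 1)]    # chain folded into a square
extended = [(0, 0), (1, 0), (2, 0), (3, 0)]  # fully extended chain
print(hp_energy("HPPH", square), hp_energy("HPPH", extended))
```

    An optimizer for the folding problem would search over self-avoiding paths for the one that minimizes this energy.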

  2. A new protein structure representation for efficient protein function prediction.

    Science.gov (United States)

    Maghawry, Huda A; Mostafa, Mostafa G M; Gharib, Tarek F

    2014-12-01

    One of the challenging problems in bioinformatics is the prediction of protein function. Protein function is the main key that can be used to classify different proteins. Protein function can be inferred experimentally with very small throughput or computationally with very high throughput. Computational methods are sequence based or structure based. Structure-based methods produce more accurate protein function prediction. In this article, we propose a new protein structure representation for efficient protein function prediction. The representation is based on three-dimensional patterns of protein residues. In the analysis, we used protein function based on enzyme activity through six mechanistically diverse enzyme superfamilies: amidohydrolase, crotonase, haloacid dehalogenase, isoprenoid synthase type I, and vicinal oxygen chelate. We applied three different classification methods, naïve Bayes, k-nearest neighbors, and random forest, to predict the enzyme superfamily of a given protein. The prediction accuracy using the proposed representation outperforms a recently introduced representation method that is based only on the distance patterns. The results show that the proposed representation achieved prediction accuracy up to 98%, with improvement of about 10% on average.

  3. Alpha complexes in protein structure prediction

    DEFF Research Database (Denmark)

    Winter, Pawel; Fonseca, Rasmus

    2015-01-01

    Reducing the computational effort and increasing the accuracy of potential energy functions is of utmost importance in modeling biological systems, for instance in protein structure prediction, docking or design. Evaluating interactions between nonbonded atoms is the bottleneck of such computations......-complexes and kinetic α-complexes in protein related problems (e.g., protein structure prediction and protein-ligand docking) deserves further investigation.)......-complexes from scratch for every configuration encountered during the search for the native structure would make this approach hopelessly slow. However, it is argued that kinetic α-complexes can be used to reduce the computational effort of determining the potential energy when "moving" from one configuration...

  4. Refinement of protein structures in explicit solvent

    NARCIS (Netherlands)

    Linge, J.P.; Williams, M.A.; Spronk, C.A.E.M.; Bonvin, A.M.J.J.; Nilges, M.

    2003-01-01

    We present a CPU efficient protocol for refinement of protein structures in a thin layer of explicit solvent and energy parameters with completely revised dihedral angle terms. Our approach is suitable for protein structures determined by theoretical (e.g., homology modeling or threading) or

  5. Validation-driven protein-structure improvement

    NARCIS (Netherlands)

    Touw, W.G.

    2016-01-01

    High-quality protein structure models are essential for many Life Science applications, such as protein engineering, molecular dynamics, drug design, and homology modelling. The WHAT_CHECK model validation project and the PDB_REDO model optimisation project have shown that many structure models in

  6. Template-based Quality Assessment of the Doppler Ultrasound Signal for Fetal Monitoring

    Directory of Open Access Journals (Sweden)

    Camilo E. Valderrama

    2017-07-01

    Full Text Available One-dimensional Doppler Ultrasound (DUS) is a low cost method for fetal auscultation. However, accuracy of any metrics derived from the DUS signals depends on their quality, which relies heavily on operator skills. In low resource settings, where skill levels are sparse, it is important for the device to provide real time signal quality feedback to allow the re-recording of data. Retrospectively, signal quality assessment can help remove low quality recordings when processing large amounts of data. To this end, we proposed a novel template-based method to assess DUS signal quality. Data used in this study were collected from 17 pregnant women using a low-cost transducer connected to a smart phone. Recordings were split into 1990 segments of 3.75 s duration, and hand labeled for quality by three independent annotators. The proposed template-based method uses Empirical Mode Decomposition (EMD) to allow detection of the fetal heart beats and segmentation into short, time-aligned temporal windows. Templates were derived for each 15 s window of the recordings. The DUS signal quality index (SQI) was calculated by correlating the segments in each window with the corresponding running template using four different pre-processing steps: (i) no additional preprocessing, (ii) linear resampling of each beat, (iii) dynamic time warping (DTW) of each beat and (iv) weighted DTW of each beat. The template-based SQIs were combined with additional features based on sample entropy and power spectral density. To assess the performance of the method, the dataset was split into training and test subsets. The training set was used to obtain the best combination of features for predicting the DUS quality using cross validation, and the test set was used to estimate the classification accuracy using bootstrap resampling. A median out-of-sample classification accuracy on the test set of 85.8% was found using three features: template-based SQI, sample entropy and the relative
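
    The core of a template-based quality index, correlating each beat with a template built from the same window, can be sketched as below. This is a simplified illustration corresponding roughly to the "no additional preprocessing" variant: it assumes the beats are already detected and cut to equal length, uses a plain average as the template, and omits EMD, running templates, resampling and DTW. All names are hypothetical.

```python
import math


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if constant)."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    denom = math.sqrt(var_a * var_b)
    return cov / denom if denom else 0.0


def template_sqi(beats):
    """Mean correlation between each beat and the average beat (template)."""
    n = len(beats[0])
    template = [sum(beat[i] for beat in beats) / len(beats) for i in range(n)]
    return sum(pearson(beat, template) for beat in beats) / len(beats)


# Toy windows: three identical beats versus three inconsistent ones.
clean = [[0, 1, 0, -1], [0, 1, 0, -1], [0, 1, 0, -1]]
noisy = [[0, 1, 0, -1], [1, 0, -1, 0], [0, -1, 0, 1]]
clean_sqi, noisy_sqi = template_sqi(clean), template_sqi(noisy)
print(round(clean_sqi, 3), round(noisy_sqi, 3))
```

    Consistent beats correlate strongly with their own average and push the index toward 1, while inconsistent beats pull it down, which is what makes the value usable as a quality feature.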

  7. Predicting template-based catalysis rates in a simple catalytic reaction model.

    Science.gov (United States)

    Hordijk, Wim; Steel, Mike

    2012-02-21

    We show that in a particular model of catalytic reaction systems, known as the binary polymer model, there is a mathematical concordance between two versions of the model: (1) random catalysis and (2) template-based catalysis. In particular, we derive an analytical calculation that allows us to accurately predict the (observed) required level of catalysis in one version of the model from that in the other version, for a given probability of having self-sustaining autocatalytic sets exist in instances of both model versions. This provides a tractable connection between two models that have been investigated in theoretical origin-of-life studies. Copyright © 2011 Elsevier Ltd. All rights reserved.

  8. Evaluating protein structures determined by structural genomics consortia.

    Science.gov (United States)

    Bhattacharya, Aneerban; Tejero, Roberto; Montelione, Gaetano T

    2007-03-01

    Structural genomics projects are providing large quantities of new 3D structural data for proteins. To monitor the quality of these data, we have developed the protein structure validation software suite (PSVS), for assessment of protein structures generated by NMR or X-ray crystallographic methods. PSVS is broadly applicable for structure quality assessment in structural biology projects. The software integrates under a single interface analyses from several widely-used structure quality evaluation tools, including PROCHECK (Laskowski et al., J Appl Crystallog 1993;26:283-291), MolProbity (Lovell et al., Proteins 2003;50:437-450), Verify3D (Luthy et al., Nature 1992;356:83-85), ProsaII (Sippl, Proteins 1993;17: 355-362), the PDB validation software, and various structure-validation tools developed in our own laboratory. PSVS provides standard constraint analyses, statistics on goodness-of-fit between structures and experimental data, and knowledge-based structure quality scores in standardized format suitable for database integration. The analysis provides both global and site-specific measures of protein structure quality. Global quality measures are reported as Z scores, based on calibration with a set of high-resolution X-ray crystal structures. PSVS is particularly useful in assessing protein structures determined by NMR methods, but is also valuable for assessing X-ray crystal structures or homology models. Using these tools, we assessed protein structures generated by the Northeast Structural Genomics Consortium and other international structural genomics projects, over a 5-year period. Protein structures produced from structural genomics projects exhibit quality score distributions similar to those of structures produced in traditional structural biology projects during the same time period. However, while some NMR structures have structure quality scores similar to those seen in higher-resolution X-ray crystal structures, the majority of NMR structures

  9. Template-based data entry for general description in medical records and data transfer to data warehouse for analysis.

    Science.gov (United States)

    Matsumura, Yasushi; Kuwata, Shigeki; Yamamoto, Yuichiro; Izumi, Kazunori; Okada, Yasushi; Hazumi, Michihiro; Yoshimoto, Sachiko; Mineno, Takahiro; Nagahama, Munetoshi; Fujii, Ayumi; Takeda, Hiroshi

    2007-01-01

    General descriptions in medical records are so diverse that they are usually entered as free text into an electronic medical record, and the resulting data analysis is often difficult. We developed and implemented a template-based data entry module and data analyzing system for general descriptions. We developed a template with tree structure, whose content master and entered patient's data are simultaneously expressed by XML. The entered structured data is converted to narrative form for easy reading. This module was implemented in the EMR system, and is used in 35 hospitals as of October, 2006. So far, 3725 templates (3242 concepts) have been produced. The data in XML and narrative text data are stored in the EMR database. The XML data are retrieved, and then patient's data are extracted, to be stored in the data ware-house (DWH). We developed a search assisting system that enables users to find objective data from the DWH without requiring complicated SQL. By using this method, general descriptions in medical records can be structured and made available for clinical research.

  10. Protein Structure Determination Using Chemical Shifts

    DEFF Research Database (Denmark)

    Christensen, Anders Steen

    In this thesis, a method for protein structure determination using chemical shifts is presented. The method is implemented in the open source PHAISTOS protein simulation framework. The method combines sampling from a generative model with a coarse-grained force field and an energy function that includes...... chemical shifts. The method is benchmarked on folding simulations of five small proteins. In four cases the resulting structures are in excellent agreement with experimental data; the fifth case fails, likely due to inaccuracies in the energy function. For the Chymotrypsin Inhibitor protein, a structure...... is determined using only chemical shifts recorded and assigned through automated processes. The Cα-RMSD of this structure to the experimental X-ray structure is 1.1 Å. Additionally, the method is combined with very sparse NOE-restraints and evolutionary distance restraints and tested on several protein structures >100...

  11. Website on Protein Interaction and Protein Structure Related Work

    Science.gov (United States)

    Samanta, Manoj; Liang, Shoudan; Biegel, Bryan (Technical Monitor)

    2003-01-01

    In today's world, three seemingly diverse fields (computer information technology, nanotechnology and biotechnology) are joining forces to enlarge our scientific knowledge and solve complex technological problems. Our group is dedicated to conducting theoretical research exploring the challenges in this area. The major areas of research include: 1) Yeast Protein Interactions; 2) Protein Structures; and 3) Current Transport through Small Molecules.

  12. Optimized null model for protein structure networks.

    Science.gov (United States)

    Milenković, Tijana; Filippis, Ioannis; Lappe, Michael; Przulj, Natasa

    2009-06-26

    Much attention has recently been given to the statistical significance of topological features observed in biological networks. Here, we consider residue interaction graphs (RIGs) as network representations of protein structures with residues as nodes and inter-residue interactions as edges. Degree-preserving randomized models have been widely used for this purpose in biomolecular networks. However, such a single summary statistic of a network may not be detailed enough to capture the complex topological characteristics of protein structures and their network counterparts. Here, we investigate a variety of topological properties of RIGs to find a well fitting network null model for them. The RIGs are derived from a structurally diverse protein data set at various distance cut-offs and for different groups of interacting atoms. We compare the network structure of RIGs to several random graph models. We show that 3-dimensional geometric random graphs, that model spatial relationships between objects, provide the best fit to RIGs. We investigate the relationship between the strength of the fit and various protein structural features. We show that the fit depends on protein size, structural class, and thermostability, but not on quaternary structure. We apply our model to the identification of significantly over-represented structural building blocks, i.e., network motifs, in protein structure networks. As expected, choosing geometric graphs as a null model results in the most specific identification of motifs. Our geometric random graph model may facilitate further graph-based studies of protein conformation space and have important implications for protein structure comparison and prediction. The choice of a well-fitting null model is crucial for finding structural motifs that play an important role in protein folding, stability and function. To our knowledge, this is the first study that addresses the challenge of finding an optimized null model for RIGs, by

  13. Optimized null model for protein structure networks.

    Directory of Open Access Journals (Sweden)

    Tijana Milenković

    Much attention has recently been given to the statistical significance of topological features observed in biological networks. Here, we consider residue interaction graphs (RIGs) as network representations of protein structures with residues as nodes and inter-residue interactions as edges. Degree-preserving randomized models have been widely used for this purpose in biomolecular networks. However, such a single summary statistic of a network may not be detailed enough to capture the complex topological characteristics of protein structures and their network counterparts. Here, we investigate a variety of topological properties of RIGs to find a well fitting network null model for them. The RIGs are derived from a structurally diverse protein data set at various distance cut-offs and for different groups of interacting atoms. We compare the network structure of RIGs to several random graph models. We show that 3-dimensional geometric random graphs, that model spatial relationships between objects, provide the best fit to RIGs. We investigate the relationship between the strength of the fit and various protein structural features. We show that the fit depends on protein size, structural class, and thermostability, but not on quaternary structure. We apply our model to the identification of significantly over-represented structural building blocks, i.e., network motifs, in protein structure networks. As expected, choosing geometric graphs as a null model results in the most specific identification of motifs. Our geometric random graph model may facilitate further graph-based studies of protein conformation space and have important implications for protein structure comparison and prediction. The choice of a well-fitting null model is crucial for finding structural motifs that play an important role in protein folding, stability and function. To our knowledge, this is the first study that addresses the challenge of finding an optimized null model

  14. Protein Structure Determination Using Protein Threading and Sparse NMR Data

    Energy Technology Data Exchange (ETDEWEB)

    Crawford, O.H.; Einstein, J.R.; Xu, D.; Xu, Y.

    1999-11-14

    It is well known that the NMR method for protein structure determination applies to small proteins and that its effectiveness decreases very rapidly as the molecular weight increases beyond about 30 kD. We have recently developed a method for protein structure determination that can fully utilize partial NMR data as calculation constraints. The core of the method is a threading algorithm that guarantees to find a globally optimal alignment between a query sequence and a template structure, under distance constraints specified by NMR/NOE data. Our preliminary tests have demonstrated that a small number of NMR/NOE distance restraints can significantly improve threading performance in both fold recognition and threading-alignment accuracy, and can possibly extend threading's scope of applicability from structural homologs to structural analogs. An accurate backbone structure generated by NMR-constrained threading can then provide a significant amount of structural information, equivalent to that provided by the NMR method with many NMR/NOE restraints; and hence can greatly reduce the amount of NMR data typically required for accurate structure determination. Our preliminary study suggests that a small number of NMR/NOE restraints may suffice to determine adequately the all-atom structure when those restraints are incorporated in a procedure combining threading, modeling of loops and sidechains, and molecular dynamics simulation. Potentially, this new technique can expand NMR's capability to larger proteins.

  15. The Identification and Tracking of Uterine Contractions Using Template Based Cross-Correlation.

    Science.gov (United States)

    McDonald, Sarah C; Brooker, Graham; Phipps, Hala; Hyett, Jon

    2017-06-28

    The purpose of this paper is to outline a novel method of using template based cross-correlation to identify and track uterine contractions during labour. A purpose built six-channel Electromyography (EMG) device was used to collect data from consenting women during labour and birth. A range of templates were constructed for the purpose of identifying and tracking uterine activity when cross-correlated with the EMG signal. Peak finding techniques were applied on the cross-correlated result to simplify and automate the identification and tracking of contractions. The EMG data showed a unique pattern when a woman was contracting with key features of the contraction signal remaining consistent and identifiable across subjects. Contraction profiles across subjects were automatically identified using template based cross-correlation. Synthetic templates from a rectangular function with a duration of between 5 and 10 s performed best at identifying and tracking uterine activity across subjects. The successful application of this technique provides opportunity for both simple and accurate real-time analysis of contraction data while enabling investigations into the application of techniques such as machine learning which could enable automated learning from contraction data as part of real-time monitoring and post analysis.
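
    The idea of sliding a rectangular template over a signal and picking peaks in the cross-correlation can be sketched as follows. This is only a minimal illustration: the signal values, the 1 Hz sampling rate, the 6 s template and the threshold are hypothetical and are not taken from the paper, and the authors' actual processing pipeline is not described in the abstract.

```python
def cross_correlate(signal, template):
    """Sliding dot product of the template over the signal (full overlaps only)."""
    n, m = len(signal), len(template)
    return [sum(signal[i + j] * template[j] for j in range(m))
            for i in range(n - m + 1)]


def find_peaks(values, threshold):
    """Indices of local maxima whose value exceeds the threshold."""
    peaks = []
    for i in range(1, len(values) - 1):
        if (values[i] > threshold
                and values[i] > values[i - 1]
                and values[i] >= values[i + 1]):
            peaks.append(i)
    return peaks


# Hypothetical rectified EMG envelope at 1 Hz: two 6 s bursts of activity.
signal = [0.0] * 10 + [1.0] * 6 + [0.0] * 10 + [1.0] * 6 + [0.0] * 10
template = [1.0] * 6  # synthetic rectangular template, 6 s long (assumed)

scores = cross_correlate(signal, template)
peaks = find_peaks(scores, threshold=4.0)  # threshold is an assumed value
print(peaks)
```

    Each reported index is the start of the window where the template overlaps a burst most strongly, which is how a contraction onset could be located in this toy setting.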

  16. Protein NMR structures refined without NOE data.

    Science.gov (United States)

    Ryu, Hyojung; Kim, Tae-Rae; Ahn, SeonJoo; Ji, Sunyoung; Lee, Jinhyuk

    2014-01-01

    The refinement of low-quality structures is an important challenge in protein structure prediction. Many studies have been conducted on protein structure refinement; the refinement of structures derived from NMR spectroscopy has been especially intensively studied. In this study, we generated flat-bottom distance potential instead of NOE data because NOE data have ambiguity and uncertainty. The potential was derived from distance information from given structures and prevented structural dislocation during the refinement process. A simulated annealing protocol was used to minimize the potential energy of the structure. The protocol was tested on 134 NMR structures in the Protein Data Bank (PDB) that also have X-ray structures. Among them, 50 structures were used as a training set to find the optimal "width" parameter in the flat-bottom distance potential functions. In the validation set (the other 84 structures), most of the 12 quality assessment scores of the refined structures were significantly improved (total score increased from 1.215 to 2.044). Moreover, the secondary structure similarity of the refined structure was improved over that of the original structure. Finally, we demonstrate that the combination of two energy potentials, statistical torsion angle potential (STAP) and the flat-bottom distance potential, can drive the refinement of NMR structures.
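
    A flat-bottom distance potential is commonly written as a restraint that costs nothing while a distance stays within a tolerance of its reference value and grows outside that window. The sketch below assumes harmonic walls and a force constant `k`; the abstract does not give the functional form used by the authors, so both are illustrative assumptions, as are the function and parameter names.

```python
def flat_bottom_energy(d, d_ref, width, k=1.0):
    """Restraint energy for a distance d (assumed harmonic walls).

    The energy is zero while d lies within `width` of the reference
    distance d_ref, and rises quadratically outside that window.
    """
    lower, upper = d_ref - width, d_ref + width
    if d < lower:
        return k * (lower - d) ** 2
    if d > upper:
        return k * (d - upper) ** 2
    return 0.0


# Distances inside the flat region are not penalized; those outside are.
for d in (3.5, 5.0, 5.9, 7.0):
    print(d, flat_bottom_energy(d, d_ref=5.0, width=1.0))
```

    In this form the "width" parameter mentioned in the abstract controls how far a distance may drift from the starting structure before the restraint starts to pull it back.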

  17. Membrane protein structure determination in membrana.

    Science.gov (United States)

    Ding, Yi; Yao, Yong; Marassi, Francesca M

    2013-09-17

    The two principal components of biological membranes, the lipid bilayer and the proteins integrated within it, have coevolved for specific functions that mediate the interactions of cells with their environment. Molecular structures can provide very significant insights about protein function. In the case of membrane proteins, the physical and chemical properties of lipids and proteins are highly interdependent; therefore structure determination should include the membrane environment. Considering the membrane alongside the protein eliminates the possibility that crystal contacts or detergent molecules could distort protein structure, dynamics, and function and enables ligand binding studies to be performed in a natural setting. Solid-state NMR spectroscopy is compatible with three-dimensional structure determination of membrane proteins in phospholipid bilayer membranes under physiological conditions and has played an important role in elucidating the physical and chemical properties of biological membranes, providing key information about the structure and dynamics of the phospholipid components. Recently, developments in the recombinant expression of membrane proteins, sample preparation, pulse sequences for high-resolution spectroscopy, radio frequency probes, high-field magnets, and computational methods have enabled a number of membrane protein structures to be determined in lipid bilayer membranes. In this Account, we illustrate solid-state NMR methods with examples from two bacterial outer membrane proteins (OmpX and Ail) that form integral membrane β-barrels. The ability to measure orientation-dependent frequencies in the solid-state NMR spectra of membrane-embedded proteins provides the foundation for a powerful approach to structure determination based primarily on orientation restraints. 
Orientation restraints are particularly useful for NMR structural studies of membrane proteins because they provide information about both three-dimensional structure

  18. Lessons from making the Structural Classification of Proteins (SCOP) and their implications for protein structure modelling.

    Science.gov (United States)

    Andreeva, Antonina

    2016-06-15

    The Structural Classification of Proteins (SCOP) database has facilitated the development of many tools and algorithms and it has been successfully used in protein structure prediction and large-scale genome annotations. During the development of SCOP, numerous exceptions were found to topological rules, along with complex evolutionary scenarios and peculiarities in proteins including the ability to fold into alternative structures. This article reviews cases of structural variations observed for individual proteins and among groups of homologues, knowledge of which is essential for protein structure modelling. © 2016 The Author(s). published by Portland Press Limited on behalf of the Biochemical Society.

  19. Secondary structure and rigidity in model proteins.

    Science.gov (United States)

    Perticaroli, Stefania; Nickels, Jonathan D; Ehlers, Georg; O'Neill, Hugh; Zhang, Qui; Sokolov, Alexei P

    2013-10-28

    There is tremendous interest in understanding the role that secondary structure plays in the rigidity and dynamics of proteins. In this work we analyze nanomechanical properties of proteins chosen to represent different secondary structures: α-helices (myoglobin and bovine serum albumin), β-barrels (green fluorescent protein), and α + β + loop structures (lysozyme). Our experimental results show that in these model proteins, the β motif is a stiffer structural unit than the α-helix in both dry and hydrated states. This difference appears not only in the rigidity of the protein, but also in the amplitude of fast picosecond fluctuations. Moreover, we show that for these examples the secondary structure correlates with the temperature- and hydration-induced changes in the protein dynamics and rigidity. Analysis also suggests a connection between the length of the secondary structure (α-helices) and the low-frequency vibrational mode, the so-called boson peak. The presented results suggest an intimate connection of dynamics and rigidity with the protein secondary structure.

  20. Predicting Protein Secondary Structure with Markov Models

    DEFF Research Database (Denmark)

    Fischer, Paul; Larsen, Simon; Thomsen, Claus

    2004-01-01

    The primary structure of a protein is the sequence of its amino acids. The secondary structure describes structural properties of the molecule such as which parts of it form sheets, helices or coils. Spatial and other properties are described by the higher order structures. The classification task we are considering here is to predict the secondary structure from the primary one. To this end we train a Markov model on training data and then use it to classify parts of unknown protein sequences as sheets, helices or coils. We show how to exploit the directional information contained...

  1. Structural symmetry and protein function.

    Science.gov (United States)

    Goodsell, D S; Olson, A J

    2000-01-01

    The majority of soluble and membrane-bound proteins in modern cells are symmetrical oligomeric complexes with two or more subunits. The evolutionary selection of symmetrical oligomeric complexes is driven by functional, genetic, and physicochemical needs. Large proteins are selected for specific morphological functions, such as formation of rings, containers, and filaments, and for cooperative functions, such as allosteric regulation and multivalent binding. Large proteins are also more stable against denaturation and have a reduced surface area exposed to solvent when compared with many individual, smaller proteins. Large proteins are constructed as oligomers for reasons of error control in synthesis, coding efficiency, and regulation of assembly. Symmetrical oligomers are favored because of stability and finite control of assembly. Several functions limit symmetry, such as interaction with DNA or membranes, and directional motion. Symmetry is broken or modified in many forms: quasisymmetry, in which identical subunits adopt similar but different conformations; pleomorphism, in which identical subunits form different complexes; pseudosymmetry, in which different molecules form approximately symmetrical complexes; and symmetry mismatch, in which oligomers of different symmetries interact along their respective symmetry axes. Asymmetry is also observed at several levels. Nearly all complexes show local asymmetry at the level of side chain conformation. Several complexes have reciprocating mechanisms in which the complex is asymmetric, but, over time, all subunits cycle through the same set of conformations. Global asymmetry is only rarely observed. Evolution of oligomeric complexes may favor the formation of dimers over complexes with higher cyclic symmetry, through a mechanism of prepositioned pairs of interacting residues. 
However, examples have been found for all of the crystallographic point groups, demonstrating that functional need can drive the evolution of

  2. Efficient protein structure search using indexing methods.

    Science.gov (United States)

    Kim, Sungchul; Sael, Lee; Yu, Hwanjo

    2013-01-01

    Understanding functions of proteins is one of the most important challenges in many studies of biological processes. The function of a protein can be predicted by analyzing the functions of structurally similar proteins, so finding structurally similar proteins accurately and efficiently from a large set of proteins is crucial. A protein structure can be represented as a vector by the 3D-Zernike Descriptor (3DZD), which compactly represents the surface shape of the protein tertiary structure. This simplified representation accelerates the searching process. However, computing the similarity of two protein structures is still computationally expensive, so it is hard to efficiently process many simultaneous requests for structurally similar protein search. This paper proposes indexing techniques which substantially reduce the search time to find structurally similar proteins. In particular, we first exploit two indexing techniques, i.e., iDistance and iKernel, on the 3DZDs. After that, we extend the techniques to further improve the search speed for protein structures. The extended indexing techniques build and utilize a reduced index constructed from the first few attributes of the 3DZDs of protein structures. To retrieve the top-k similar structures, the top-10 × k similar structures are first found using the reduced index, and the top-k structures are selected among them. We also modify the indexing techniques to support θ-based nearest neighbor search, which returns data points at a distance less than θ from the query point. The results show that both iDistance and iKernel significantly enhance the searching speed. In top-k nearest neighbor search, the searching time is reduced by 69.6%, 77%, 77.4% and 87.9% using iDistance, iKernel, the extended iDistance, and the extended iKernel, respectively. In θ-based nearest neighbor search, the searching time is reduced by 80%, 81%, 95.6% and 95.6% using iDistance, iKernel, the extended iDistance, and the extended iKernel, respectively.
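
    The two-stage retrieval described above (shortlist 10 × k candidates on the first few attributes, then pick the top k using the full vectors) can be sketched as below. This is a simplified illustration: a plain linear scan stands in for the iDistance and iKernel indexes, which are not reproduced here, and the function names, the `n_reduced` parameter and the toy vectors are assumptions.

```python
import heapq
import math


def distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def top_k_two_stage(query, database, k, n_reduced=2, factor=10):
    """Return indices of the k nearest vectors using a reduced pre-filter.

    Stage 1 ranks every entry by distance on the first n_reduced
    attributes only and keeps the best factor * k candidates.
    Stage 2 re-ranks those candidates with the full vectors.
    """
    candidates = heapq.nsmallest(
        factor * k,
        range(len(database)),
        key=lambda i: distance(query[:n_reduced], database[i][:n_reduced]),
    )
    return heapq.nsmallest(
        k, candidates, key=lambda i: distance(query, database[i])
    )


# Toy descriptors standing in for 3DZD vectors (hypothetical values).
database = [[0, 0, 0], [1, 0, 0], [0, 0, 5], [3, 3, 3]]
result = top_k_two_stage([0, 0, 0], database, k=2)
print(result)
```

    The distance on a prefix of the attributes never exceeds the full Euclidean distance, which is why a cheap shortlist on a few attributes is a plausible filter; the shortlist is still a heuristic and may miss a true neighbor when it is smaller than the database.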

  3. Simultaneous determination of protein structure and dynamics

    DEFF Research Database (Denmark)

    Lindorff-Larsen, Kresten; Best, Robert B.; DePristo, M. A.

    2005-01-01

    We present a protocol for the experimental determination of ensembles of protein conformations that represent simultaneously the native structure and its associated dynamics. The procedure combines the strengths of nuclear magnetic resonance spectroscopy (for obtaining experimental information at the atomic level about the structural and dynamical features of proteins) with the ability of molecular dynamics simulations to explore a wide range of protein conformations. We illustrate the method for human ubiquitin in solution and find that there is considerable conformational heterogeneity throughout the protein structure. The interior atoms of the protein are tightly packed in each individual conformation that contributes to the ensemble but their overall behaviour can be described as having a significant degree of liquid-like character. The protocol is completely general and should lead to significant...

  4. Datamining protein structure databanks for crystallization patterns of proteins.

    Science.gov (United States)

    Valafar, Homayoun; Prestegard, James H; Valafar, Faramarz

    2002-12-01

    A study of 345 protein structures selected among 1,500 structures determined by nuclear magnetic resonance (NMR) methods, revealed useful correlations between crystallization properties and several parameters for the studied proteins. NMR methods of structure determination do not require the growth of protein crystals, and hence allow comparison of properties of proteins that have or have not been the subject of crystallographic approaches. One- and two-dimensional statistical analyses of the data confirmed a hypothesized relation between the size of the molecule and its crystallization potential. Furthermore, two-dimensional Bayesian analysis revealed a significant relationship between relative ratio of different secondary structures and the likelihood of success for crystallization trials. The most immediate result is an apparent correlation of crystallization potential with protein size. Further analysis of the data revealed a relationship between the unstructured fraction of proteins and the success of its crystallization. Utilization of Bayesian analysis on the latter correlation resulted in a prediction performance of about 64%, whereas a two-dimensional Bayesian analysis succeeded with a performance of about 75%.

  5. Protein structural similarity search by Ramachandran codes

    Directory of Open Access Journals (Sweden)

    Chang Chih-Hung

    2007-08-01

    Background: Protein structural data has increased exponentially, such that fast and accurate tools are needed for structure similarity search. To improve the search speed, several methods have been designed to reduce three-dimensional protein structures to one-dimensional text strings that are then analyzed by traditional sequence alignment methods; however, the accuracy is usually sacrificed and the speed is still unable to match sequence similarity search tools. Here, we aimed to improve the linear encoding methodology and develop efficient search tools that can rapidly retrieve structural homologs from large protein databases. Results: We propose a new linear encoding method, SARST (Structural similarity search Aided by Ramachandran Sequential Transformation). SARST transforms protein structures into text strings through a Ramachandran map organized by nearest-neighbor clustering and uses a regenerative approach to produce substitution matrices. Then, classical sequence similarity search methods can be applied to the structural similarity search. Its accuracy is similar to that of Combinatorial Extension (CE) and it works over 243,000 times faster, searching 34,000 proteins in 0.34 sec with a 3.2-GHz CPU. SARST provides statistically meaningful expectation values to assess the retrieved information. It has been implemented as a web service and a stand-alone Java program that is able to run on many different platforms. Conclusion: As a database search method, SARST can rapidly distinguish high from low similarities and efficiently retrieve homologous structures. It demonstrates that the easily accessible linear encoding methodology has the potential to serve as a foundation for efficient protein structural similarity search tools. These search tools are expected to be applicable to automated and high-throughput functional annotations or predictions for the ever increasing number of published protein structures in this post-genomic era.
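
    The linear encoding step, turning backbone torsion angles into a text string, can be illustrated with a toy nearest-center assignment on the Ramachandran map. The three cluster centers and letters below are rough, hypothetical stand-ins chosen for illustration; they are not SARST's actual map, alphabet or substitution matrices.

```python
def angle_diff(a, b):
    """Smallest absolute difference between two angles in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


# Hypothetical (phi, psi) cluster centers: helix-like, strand-like, left-handed.
CENTERS = {
    "H": (-60.0, -45.0),
    "E": (-120.0, 130.0),
    "L": (60.0, 45.0),
}


def encode(torsions):
    """Map each (phi, psi) pair to the letter of its nearest center."""
    letters = []
    for phi, psi in torsions:
        nearest = min(
            CENTERS,
            key=lambda c: angle_diff(phi, CENTERS[c][0]) ** 2
            + angle_diff(psi, CENTERS[c][1]) ** 2,
        )
        letters.append(nearest)
    return "".join(letters)


# Hypothetical torsion angles for five consecutive residues.
code = encode([(-57, -47), (-63, -41), (-119, 113), (-130, 140), (55, 40)])
print(code)
```

    Once structures are strings of this kind, ordinary sequence search tools can compare them, which is the property the abstract relies on for speed.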

  6. Dispersing hydrophilic nanoparticles in hydrophobic polymers: HDPE/ZnO nanocomposites by a novel template-based approach

    Directory of Open Access Journals (Sweden)

    G. Filippone

    2014-05-01

    The efficiency of a novel template-based approach for the dispersion of hydrophilic nanoparticles within hydrophobic polymer matrices is investigated. The procedure envisages the permeation of a well dispersed nanoparticle suspension inside a micro-porous matrix, obtained through selective extraction of a sacrificial phase from a finely interpenetrated co-continuous polymer blend. Specifically, a blend of high density polyethylene (HDPE) and polyethylene oxide (PEO) at 50/50 wt% is prepared by melt mixing. The addition of small amounts of organo-clay promotes the necessary refinement of the blend morphology. Once the PEO is removed, the micro-porous HDPE matrix is dipped in a colloidal suspension of zinc oxide nanoparticles which exhibits low interfacial tension with HDPE. A system prepared by traditional melt mixing is used as reference. Melt- and solid-state viscoelastic measurements reveal a good quality of the filler dispersion despite the uneven distribution on the micro-scale. The latter can be exploited to minimize the filler content needed to attain a given improvement of the material properties or to design nano-structured polymer composites.

  7. Proteins with Novel Structure, Function and Dynamics

    Science.gov (United States)

    Pohorille, Andrew

    2014-01-01

    Recently, a small enzyme that ligates two RNA fragments with the rate of 10(exp 6) above background was evolved in vitro (Seelig and Szostak, Nature 448:828-831, 2007). This enzyme does not resemble any contemporary protein (Chao et al., Nature Chem. Biol. 9:81-83, 2013). It consists of a dynamic, catalytic loop, a small, rigid core containing two zinc ions coordinated by neighboring amino acids, and two highly flexible tails that might be unimportant for protein function. In contrast to other proteins, this enzyme does not contain ordered secondary structure elements, such as alpha-helix or beta-sheet. The loop is kept together by just two interactions of a charged residue and a histidine with a zinc ion, which they coordinate on the opposite side of the loop. Such structure appears to be very fragile. Surprisingly, computer simulations indicate otherwise. As the coordinating, charged residue is mutated to alanine, another, nearby charged residue takes its place, thus keeping the structure nearly intact. If this residue is also substituted by alanine a salt bridge involving two other, charged residues on the opposite sides of the loop keeps the loop in place. These adjustments are facilitated by high flexibility of the protein. Computational predictions have been confirmed experimentally, as both mutants retain full activity and overall structure. These results challenge our notions about what is required for protein activity and about the relationship between protein dynamics, stability and robustness. We hypothesize that small, highly dynamic proteins could be both active and fault tolerant in ways that many other proteins are not, i.e. they can adjust to retain their structure and activity even if subjected to mutations in structurally critical regions. This opens the doors for designing proteins with novel functions, structures and dynamics that have not been yet considered.

  8. Use of Restraints from Consensus Fragments of Multiple Server Models To Enhance Protein-Structure Prediction Capability of the UNRES Force Field.

    Science.gov (United States)

    Mozolewska, Magdalena A; Krupa, Paweł; Zaborowski, Bartłomiej; Liwo, Adam; Lee, Jooyoung; Joo, Keehyoung; Czaplewski, Cezary

    2016-11-28

    Recently, we developed a new approach to protein-structure prediction, which combines template-based modeling with the physics-based coarse-grained UNited RESidue (UNRES) force field. In this approach, restrained multiplexed replica exchange molecular dynamics simulations with UNRES, with the Cα-distance and virtual-bond-dihedral-angle restraints derived from knowledge-based models are carried out. In this work, we report a test of this approach in the 11th Community Wide Experiment on the Critical Assessment of Techniques for Protein Structure Prediction (CASP11), in which we used the template-based models from early-stage predictions by the LEE group CASP11 server (group 038, called "nns"), and further improvement of the method. The quality of the models obtained in CASP11 was better than that resulting from unrestrained UNRES simulations; however, the obtained models were generally worse than the final nns models. Calculations with the final nns models, performed after CASP11, resulted in substantial improvement, especially for multi-domain proteins. Based on these results, we modified the procedure by deriving restraints from models from multiple servers, in this study the four top-performing servers in CASP11 (nns, BAKER-ROSETTASERVER, Zhang-server, and QUARK), and implementing either all restraints or only the restraints on the fragments that appear similar in the majority of models (the consensus fragments), outlier models discarded. Tests with 29 CASP11 human-prediction targets with length less than 400 amino-acid residues demonstrated that the consensus-fragment approach gave better results, i.e., lower α-carbon root-mean-square deviation from the experimental structures, higher template modeling score, and global distance test total score values than the best of the parent server models. Apart from global improvement (repacking and improving the orientation of domains and other substructures), improvement was also reached for template-based modeling

  9. Is protein structure prediction still an enigma?

    African Journals Online (AJOL)

    STORAGESEVER

    2008-12-29

    Dec 29, 2008 ... Proteins are large molecules indispensable for the existence and proper functioning of biological organisms. They perform a wide array of functions including catalysis, structure formation, transport, body defense, etc. Understanding the functions of proteins is a fundamental problem in the discovery of.

  10. PEGylated nanoparticles: protein corona and secondary structure

    Science.gov (United States)

    Runa, Sabiha; Hill, Alexandra; Cochran, Victoria L.; Payne, Christine K.

    2014-09-01

    Nanoparticles have important biological and biomedical applications ranging from drug and gene delivery to biosensing. In the presence of extracellular proteins, a "corona" of proteins adsorbs on the surface of the nanoparticles, altering their interaction with cells, including immune cells. Nanoparticles are often functionalized with polyethylene glycol (PEG) to reduce this non-specific adsorption of proteins. To understand the change in protein corona that occurs following PEGylation, we first quantified the adsorption of blood serum proteins on bare and PEGylated gold nanoparticles using gel electrophoresis. We find a threefold decrease in the amount of protein adsorbed on PEGylated gold nanoparticles compared to the bare gold nanoparticles, showing that PEG reduces, but does not prevent, corona formation. To determine if the secondary structure of corona proteins was altered upon adsorption onto the bare and PEGylated gold nanoparticles, we use CD spectroscopy to characterize the secondary structure of bovine serum albumin following incubation with the nanoparticles. Our results show no significant change in protein secondary structure following incubation with bare or PEGylated nanoparticles. Further examination of the secondary structure of bovine serum albumin, α2-macroglobulin, and transferrin in the presence of free PEG showed similar results. These findings provide important insights for the use of PEGylated gold nanoparticles under physiological conditions.

  11. Extracting knowledge from protein structure geometry

    DEFF Research Database (Denmark)

    Røgen, Peter; Koehl, Patrice

    2013-01-01

    Protein structure prediction techniques proceed in two steps, namely the generation of many structural models for the protein of interest, followed by an evaluation of all these models to identify those that are native-like. In theory, the second step is easy, as native structures correspond to minima of their free energy surfaces. It is well known however that the situation is more complicated as the current force fields used for molecular simulations fail to distinguish native states from misfolded structures. In an attempt to solve this problem, we follow an alternate approach and derive a new...

  12. A 'periodic table' for protein structures.

    Science.gov (United States)

    Taylor, William R

    2002-04-11

    Current structural genomics programs aim systematically to determine the structures of all proteins coded in both human and other genomes, providing a complete picture of the number and variety of protein structures that exist. In the past, estimates have been made on the basis of the incomplete sample of structures currently known. These estimates have varied greatly (between 1,000 and 10,000; see for example refs 1 and 2), partly because of limited sample size but also owing to the difficulties of distinguishing one structure from another. This distinction is usually topological, based on the fold of the protein; however, in strict topological terms (neglecting to consider intra-chain cross-links), protein chains are open strings and hence are all identical. To avoid this trivial result, topologies are determined by considering secondary links in the form of intra-chain hydrogen bonds (secondary structure) and tertiary links formed by the packing of secondary structures. However, small additions to or loss of structure can make large changes to these perceived topologies and such subjective solutions are neither robust nor amenable to automation. Here I formalize both secondary and tertiary links to allow the rigorous and automatic definition of protein topology.

  13. Bayesian segmentation of protein secondary structure.

    Science.gov (United States)

    Schmidler, S C; Liu, J S; Brutlag, D L

    2000-01-01

    We present a novel method for predicting the secondary structure of a protein from its amino acid sequence. Most existing methods predict each position in turn based on a local window of residues, sliding this window along the length of the sequence. In contrast, we develop a probabilistic model of protein sequence/structure relationships in terms of structural segments, and formulate secondary structure prediction as a general Bayesian inference problem. A distinctive feature of our approach is the ability to develop explicit probabilistic models for alpha-helices, beta-strands, and other classes of secondary structure, incorporating experimentally and empirically observed aspects of protein structure such as helical capping signals, side chain correlations, and segment length distributions. Our model is Markovian in the segments, permitting efficient exact calculation of the posterior probability distribution over all possible segmentations of the sequence using dynamic programming. The optimal segmentation is computed and compared to a predictor based on marginal posterior modes, and the latter is shown to provide significant improvement in predictive accuracy. The marginalization procedure provides exact secondary structure probabilities at each sequence position, which are shown to be reliable estimates of prediction uncertainty. We apply this model to a database of 452 nonhomologous structures, achieving accuracies as high as the best currently available methods. We conclude by discussing an extension of this framework to model nonlocal interactions in protein structures, providing a possible direction for future improvements in secondary structure prediction accuracy.

  14. Foldit Standalone: a video game-derived protein structure manipulation interface using Rosetta.

    Science.gov (United States)

    Kleffner, Robert; Flatten, Jeff; Leaver-Fay, Andrew; Baker, David; Siegel, Justin B; Khatib, Firas; Cooper, Seth

    2017-09-01

    Foldit Standalone is an interactive graphical interface to the Rosetta molecular modeling package. In contrast to most command-line or batch interactions with Rosetta, Foldit Standalone is designed to allow easy, real-time, direct manipulation of protein structures, while also giving access to the extensive power of Rosetta computations. Derived from the user interface of the scientific discovery game Foldit (itself based on Rosetta), Foldit Standalone has added more advanced features and removed the competitive game elements. Foldit Standalone was built from the ground up with a custom rendering and event engine, configurable visualizations and interactions driven by Rosetta. Foldit Standalone contains, among other features: electron density and contact map visualizations, multiple sequence alignment tools for template-based modeling, rigid body transformation controls, RosettaScripts support and an embedded Lua interpreter. Foldit Standalone is available for download at https://fold.it/standalone, under the Rosetta license, which is free for academic and non-profit users. It is implemented in cross-platform C++ and binary executables are available for Windows, macOS and Linux. Contact: scooper@ccs.neu.edu.

  15. Human cancer protein-protein interaction network: a structural perspective.

    Directory of Open Access Journals (Sweden)

    Gozde Kar

    2009-12-01

    Protein-protein interaction networks provide a global picture of cellular function and biological processes. Some proteins act as hub proteins, highly connected to others, whereas some others have few interactions. The dysfunction of some interactions causes many diseases, including cancer. Proteins interact through their interfaces. Therefore, studying the interface properties of cancer-related proteins will help explain their role in the interaction networks. Similar or overlapping binding sites should be used repeatedly in single interface hub proteins, making them promiscuous. Alternatively, multi-interface hub proteins make use of several distinct binding sites to bind to different partners. We propose a methodology to integrate protein interfaces into cancer interaction networks (ciSPIN, cancer structural protein interface network). The interactions in the human protein interaction network are replaced by interfaces, coming from either known or predicted complexes. We provide a detailed analysis of cancer related human protein-protein interfaces and the topological properties of the cancer network. The results reveal that cancer-related proteins have smaller, more planar, more charged and less hydrophobic binding sites than non-cancer proteins, which may indicate low affinity and high specificity of the cancer-related interactions. We also classified the genes in ciSPIN according to phenotypes. Within phenotypes, for breast cancer, colorectal cancer and leukemia, interface properties were found to be discriminating from non-cancer interfaces with an accuracy of 71%, 67%, 61%, respectively. In addition, cancer-related proteins tend to interact with their partners through distinct interfaces, corresponding mostly to multi-interface hubs, which comprise 56% of cancer-related proteins, and constituting the nodes with higher essentiality in the network (76%). We illustrate the interface related affinity properties of two cancer-related hub

  16. De novo membrane protein structure prediction.

    Science.gov (United States)

    Nugent, Timothy

    2015-01-01

    Recent advances in identifying residue-residue contacts from large multiple sequence alignments have enabled impressive gains to be made in the field of protein structure prediction. In this chapter, we discuss these advances and provide a step-by-step guide to applying the latest tools to the de novo modelling of alpha-helical transmembrane proteins. As a practical example, we demonstrate the process of building an accurate 3D model of a G protein-coupled receptor, correctly orientated in the membrane, using only its primary protein sequence.

  17. Protein Structure Recognition: From Eigenvector Analysis to Structural Threading Method

    Energy Technology Data Exchange (ETDEWEB)

    Cao, Haibo [Iowa State Univ., Ames, IA (United States)

    2003-01-01

    In this work, they try to understand the protein folding problem using pair-wise hydrophobic interaction as the dominant interaction for the protein folding process. They found a strong correlation between amino acid sequences and the corresponding native structure of the protein. Some applications of this correlation discussed in this dissertation include domain partition and a new structural threading method, as well as the performance of this method in the CASP5 competition. In the first part, they give a brief introduction to the protein folding problem. Some essential knowledge and progress from other research groups are discussed. This part includes discussions of interactions among amino acid residues, lattice HP model, and the design ability principle. In the second part, they try to establish the correlation between amino acid sequence and the corresponding native structure of the protein. This correlation was observed in the eigenvector study of the protein contact matrix. They believe the correlation is universal, thus it can be used in automatic partition of protein structures into folding domains. In the third part, they discuss a threading method based on the correlation between amino acid sequences and the dominant eigenvector of the structure contact matrix. A mathematically straightforward iteration scheme provides a self-consistent optimum global sequence-structure alignment. The computational efficiency of this method makes it possible to search whole protein structure databases for structural homology without relying on sequence similarity. The sensitivity and specificity of this method are discussed, along with a case of blind test prediction. In the appendix, they list the overall performance of this threading method in the CASP5 blind test in comparison with other existing approaches.
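
    The abstract above relies on the dominant eigenvector of a protein contact matrix. As a rough illustration of that quantity only (not the author's threading code), the sketch below computes it by power iteration for a small hypothetical contact map. The function name `dominant_eigenvector`, the five-residue contact map and the identity shift used to stabilise the iteration are all assumptions made for this example.

```python
def dominant_eigenvector(matrix, iterations=200, tol=1e-12):
    """Power iteration for a symmetric, non-negative contact matrix.

    Iterates on (M + I) instead of M: the eigenvectors are unchanged,
    but the shift avoids oscillation when the contact graph is bipartite.
    """
    n = len(matrix)
    v = [1.0 / n ** 0.5] * n  # uniform, unit-length starting vector
    for _ in range(iterations):
        w = [v[i] + sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = sum(x * x for x in w) ** 0.5
        w = [x / norm for x in w]
        if max(abs(a - b) for a, b in zip(v, w)) < tol:
            return w
        v = w
    return v

# Hypothetical 5-residue contact map: residue 2 touches every other residue.
contacts = [
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [1, 1, 0, 1, 1],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
]
vec = dominant_eigenvector(contacts)
hub = max(range(len(vec)), key=lambda i: vec[i])
print(hub, [round(x, 4) for x in vec])
```

    In this toy map the largest component falls on the most connected residue; how such components are compared with a sequence profile in the actual threading method is not specified in the abstract and is not attempted here.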

  18. On characterization of anisotropic plant protein structures

    NARCIS (Netherlands)

    Krintiras, G.A.; Göbel, J.; Bouwman, W.G.; Goot, van der A.J.; Stefanidis, G.D.

    2014-01-01

    In this paper, a set of complementary techniques was used to characterize surface and bulk structures of an anisotropic Soy Protein Isolate (SPI)–vital wheat gluten blend after it was subjected to heat and simple shear flow in a Couette Cell. The structured biopolymer blend can form a basis for a

  19. The Protein Data Bank and structural genomics

    OpenAIRE

    Westbrook, John; Feng, Zukang; Chen, Li; Yang, Huanwang; Berman, Helen M.

    2003-01-01

    The Protein Data Bank (PDB; http://www.pdb.org/) continues to be actively involved in various aspects of the informatics of structural genomics projects—developing and maintaining the Target Registration Database (TargetDB), organizing data dictionaries that will define the specification for the exchange and deposition of data with the structural genomics centers and creating software tools to capture data from standard structure determination applications.

  20. The Protein Data Bank and structural genomics.

    Science.gov (United States)

    Westbrook, John; Feng, Zukang; Chen, Li; Yang, Huanwang; Berman, Helen M

    2003-01-01

    The Protein Data Bank (PDB; http://www.pdb.org/) continues to be actively involved in various aspects of the informatics of structural genomics projects--developing and maintaining the Target Registration Database (TargetDB), organizing data dictionaries that will define the specification for the exchange and deposition of data with the structural genomics centers and creating software tools to capture data from standard structure determination applications.

  1. Structure-guided deimmunization of therapeutic proteins.

    Science.gov (United States)

    Parker, Andrew S; Choi, Yoonjoo; Griswold, Karl E; Bailey-Kellogg, Chris

    2013-02-01

    Therapeutic proteins continue to yield revolutionary new treatments for a growing spectrum of human disease, but the development of these powerful drugs requires solving a unique set of challenges. For instance, it is increasingly apparent that mitigating potential anti-therapeutic immune responses, driven by molecular recognition of a therapeutic protein's peptide fragments, may be best accomplished early in the drug development process. One may eliminate immunogenic peptide fragments by mutating the cognate amino acid sequences, but deimmunizing mutations are constrained by the need for a folded, stable, and functional protein structure. These two concerns may be competing, as the mutations that are best at reducing immunogenicity often involve amino acids that are substantially different physicochemically. We develop a novel approach, called EpiSweep, that simultaneously optimizes both concerns. Our algorithm identifies sets of mutations making such Pareto optimal trade-offs between structure and immunogenicity, embodied by a molecular mechanics energy function and a T-cell epitope predictor, respectively. EpiSweep integrates structure-based protein design, sequence-based protein deimmunization, and algorithms for finding the Pareto frontier of a design space. While structure-based protein design is NP-hard, we employ integer programming techniques that are efficient in practice. Furthermore, EpiSweep only invokes the optimizer once per identified Pareto optimal design. We show that EpiSweep designs of regions of the therapeutics erythropoietin and staphylokinase are predicted to outperform previous experimental efforts. We also demonstrate EpiSweep's capacity for deimmunization of the entire proteins, case analyses involving dozens of predicted epitopes, and tens of thousands of unique side-chain interactions. Ultimately, EpiSweep is a powerful protein design tool that guides the protein engineer toward the most promising immunotolerant biotherapeutic
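
    The notion of a Pareto optimal trade-off used in the abstract above can be illustrated with a brute-force filter over a handful of candidate designs. This is a sketch under stated assumptions, not EpiSweep itself: the abstract says EpiSweep uses integer programming, whereas this example simply enumerates a made-up list. The function name `pareto_front`, the design names and all scores are hypothetical, and both objectives are assumed to be minimised.

```python
def pareto_front(designs):
    """Keep designs that no other design beats on both objectives.

    Each design is (name, energy, epitope_score); lower is better for both.
    A design is dominated if another is no worse on both objectives and
    strictly better on at least one.
    """
    front = []
    for name, energy, epitopes in designs:
        dominated = any(
            e2 <= energy and s2 <= epitopes and (e2 < energy or s2 < epitopes)
            for _, e2, s2 in designs
        )
        if not dominated:
            front.append((name, energy, epitopes))
    # Sort by energy so the trade-off curve reads from most to least stable.
    return sorted(front, key=lambda d: d[1])

# Hypothetical candidates: (name, energy score, predicted epitope count).
candidates = [
    ("wt", -10.0, 8),
    ("m1", -9.0, 5),
    ("m2", -9.5, 6),
    ("m3", -8.0, 6),  # dominated by m1: worse energy and more epitopes
    ("m4", -7.0, 2),
]
front = pareto_front(candidates)
print([name for name, _, _ in front])
```

    Enumeration like this is quadratic in the number of designs and only practical for small lists; it is meant to show what the frontier is, not how to search a full design space.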

  2. Structural deformation upon protein-protein interaction: A structural alphabet approach

    Science.gov (United States)

    Martin, Juliette; Regad, Leslie; Lecornet, Hélène; Camproux, Anne-Claude

    2008-01-01

    Background: In a number of protein-protein complexes, the 3D structures of bound and unbound partners significantly differ, supporting the induced fit hypothesis for protein-protein binding. Results: In this study, we explore the induced fit modifications on a set of 124 proteins available in both bound and unbound forms, in terms of local structure. The local structure is described using a structural alphabet of 27 structural letters that allows a detailed description of the backbone. Using a control set to distinguish induced fit from experimental error and natural protein flexibility, we show that the fraction of structural letters modified upon binding is significantly greater than in the control set (36% versus 28%). This proportion is even greater in the interface regions (41%). Interface regions preferentially involve coils. Our analysis further reveals that some structural letters in coil are not favored in the interface. We show that certain structural letters in coil are particularly subject to modifications at the interface, and that the severity of structural change also varies. This information is used to derive a structural letter substitution matrix that summarizes the local structural changes observed in our data set. We also illustrate the usefulness of our approach to identify common binding motifs in unrelated proteins. Conclusion: Our study provides qualitative information about induced fit. These results could be of help for flexible docking. PMID:18307769

  3. Template-based automatic breast segmentation on MRI by excluding the chest region.

    Science.gov (United States)

    Lin, Muqing; Chen, Jeon-Hor; Wang, Xiaoyong; Chan, Siwa; Chen, Siping; Su, Min-Ying

    2013-12-01

    Methods for quantification of breast density on MRI using semiautomatic approaches are commonly used. In this study, the authors report on a fully automatic chest template-based method. Nonfat-suppressed breast MR images from 31 healthy women were analyzed. Among them, one case was randomly selected and used as the template, and the remaining 30 cases were used for testing. Unlike most model-based breast segmentation methods that use the breast region as the template, the chest body region on a middle slice was used as the template. Within the chest template, three body landmarks (thoracic spine and bilateral boundary of the pectoral muscle) were identified for performing the initial V-shape cut to determine the posterior lateral boundary of the breast. The chest template was mapped to each subject's image space to obtain a subject-specific chest model for exclusion. On the remaining image, the chest wall muscle was identified and excluded to obtain clean breast segmentation. The chest and muscle boundaries determined on the middle slice were used as the reference for the segmentation of adjacent slices, and the process continued superiorly and inferiorly until all 3D slices were segmented. The segmentation results were evaluated by an experienced radiologist to mark voxels that were wrongly included or excluded for error analysis. The breast volumes measured by the proposed algorithm were very close to the radiologist's corrected volumes, showing a % difference ranging from 0.01% to 3.04% in 30 tested subjects with a mean of 0.86% ± 0.72%. The total error was calculated by adding the inclusion and the exclusion errors (so they did not cancel each other out), which ranged from 0.05% to 6.75% with a mean of 3.05% ± 1.93%. The fibroglandular tissue segmented within the breast region determined by the algorithm and the radiologist were also very close, showing a % difference ranging from 0.02% to 2.52% with a mean of 1.03% ± 1.03%. The total error by adding the

  4. Utilization of Protein Crystal Structures in Industry

    Science.gov (United States)

    Ishikawa, Kohki

    In industry, protein crystallography is used mainly in two technologies. One is structure-based drug design, and the other is structure-based enzyme engineering. Some successful cases together with recent advances are presented in this article. The cases include the development of an anti-influenza drug, and the introduction of engineered acid phosphatase to the manufacturing process of nucleotides used as umami seasoning.

  5. Dodecylamine Template-Based Hexagonal Mesoporous Silica (HMS) as a Carrier for Improved Oral Delivery of Fenofibrate.

    Science.gov (United States)

    Jadhav, Nitin V; Vavia, Pradeep R

    2017-10-01

    The aim of the present investigation was the preparation of dodecylamine template-based hexagonal mesoporous silica (HMS) as a carrier for a poorly water-soluble drug (fenofibrate). HMS material has distinctive characteristics such as easy synthesis, high surface area and wormhole pores. These characteristics make it highly suitable as a carrier in drug delivery systems. HMS was prepared by a pH- and temperature-independent process. Fenofibrate was loaded into the HMS by the solvent immersion method using an organic solvent. The BET surface area of HMS was evaluated by nitrogen adsorption/desorption analysis. HMS and drug-loaded HMS were characterized by differential scanning calorimetry (DSC), X-ray powder diffraction (XRPD), Fourier transform infrared spectroscopy (FTIR), scanning electron microscopy (SEM), transmission electron microscopy (TEM) and contact angle study. The HMS-based system was also evaluated in vitro and in vivo in comparison with the plain drug. The BET surface area of HMS was found to be 974 m²/g with a narrow pore size average of 2.6 nm. The DSC and XRD study confirmed the amorphization of the drug within the HMS. The SEM and TEM study showed the morphological features of HMS and revealed the wormhole porous structure. The contact angle study showed improvement in the aqueous wetting property of the drug within the HMS (contact angle 46°). The in vitro drug release study showed a remarkable dissolution enhancement in the HMS-based system as compared to the plain drug. The in vivo pharmacodynamic study (hyperlipidaemia model) showed that the HMS-based formulation significantly improved the bioavailability of fenofibrate. Thus, HMS has properties that make it a potential carrier for the delivery of poorly water-soluble drugs.

  6. Structural Changes of Malt Proteins During Boiling

    Directory of Open Access Journals (Sweden)

    2009-03-01

    Changes in the physicochemical properties and structure of proteins derived from two malt varieties (Baudin and Guangmai) during wort boiling were investigated by differential scanning calorimetry, SDS-PAGE, two-dimensional electrophoresis, gel filtration chromatography and circular dichroism spectroscopy. The results showed that both protein content and amino acid composition changed only slightly during boiling, and that boiling might cause a gradual unfolding of protein structures, as indicated by the decrease in surface hydrophobicity, free sulfhydryl content and enthalpy value, as well as reduced α-helix contents and markedly increased random coil contents. It was also found that the major component of both worts was a boiling-resistant protein with a molecular mass of 40 kDa, and that according to the two-dimensional electrophoresis and SE-HPLC analyses, a small amount of soluble aggregates might be formed via hydrophobic interactions. It was thus concluded that changes of protein structure caused by boiling that might influence beer quality are largely independent of malt variety.

  7. Structural mechanisms of nonplanar hemes in proteins

    Energy Technology Data Exchange (ETDEWEB)

    Shelnutt, J.A.

    1997-05-01

    The objective is to assess the occurrence of nonplanar distortions of hemes and other tetrapyrroles in proteins and to determine the biological function of these distortions. Recently, these distortions were found by us to be conserved among proteins belonging to a functional class. Conservation of the conformation of the heme indicates a possible functional role. Researchers have suggested possible mechanisms by which heme distortions might influence biological properties; however, no heme distortion has yet been shown conclusively to participate in a structural mechanism of hemoprotein function. The specific aims of the proposed work are: (1) to characterize and quantify the distortions of the hemes in all of the more than 300 hemoprotein X-ray crystal structures in terms of displacements along the lowest-frequency normal coordinates, (2) to determine the structural features of the protein component that generate and control these nonplanar distortions by using spectroscopic studies and molecular-mechanics calculations for the native proteins, their mutants and heme-peptide fragments, and model porphyrins, (3) to determine spectroscopic markers for the various types of distortion, and, finally, (4) to discover the functional significance of the nonplanar distortions by correlating function with porphyrin conformation for proteins and model porphyrins.

  8. Intertwined associations in structures of homooligomeric proteins.

    Science.gov (United States)

    Mackinnon, Stephen S; Malevanets, Anatoly; Wodak, Shoshana J

    2013-04-02

    Intertwined homo-oligomers are complexes comprising identical protein subunits, where small segments or compact protein substructures (domains) are exchanged between the subunits. Using a formal definition of intertwined homo-oligomers, we survey the Protein Data Bank for all such complexes. Results show that intertwining occurs in 13,442 (24%) of all surveyed structures. A majority (∼72%) exchanges one contiguous chain segment of varying length. Another ∼10% exchange structural domains, and the remaining ∼20% display complex intertwining topologies. Smaller proteins are more often intertwined, and intertwining is dominant in solution homodimers. These findings and analyses of various properties of the major category of intertwined complexes, their interfaces and quaternary context, support the physiological role of intertwining in promoting homooligomer stability. Furthermore, the number of different intertwining modes observed in families of related proteins is limited, and likely specific to the protein fold. These findings yield unique insights into the role of intertwining in homomeric association.

  9. The hydration structure of DNA and proteins

    Energy Technology Data Exchange (ETDEWEB)

    Niimura, Nobuo [Japan Atomic Energy Research Inst., Tokai, Ibaraki (Japan). Tokai Research Establishment

    2002-03-01

    Water-soluble proteins are surrounded by water molecules, and the water molecules mediate biological processes such as protein folding, enzymatic reactions, and molecular recognition via hydrogen bonds, electrostatic interactions and van der Waals interactions. It is essential to know structural information such as the orientation and dynamical behavior of water molecules, including hydrogen atoms, in order to characterize these interactions. Neutron analysis can determine the positions of hydrogen atoms at medium resolution in protein crystallography (d_min ≈ 2.0 Å). Recently we have constructed a high-resolution neutron diffractometer (BIX) dedicated to biological macromolecules. Using this diffractometer, high-resolution (1.5 or 1.6 Å) neutron structure analyses of sperm whale myoglobin, a wild-type rubredoxin from Pyrococcus furiosus, and the rubredoxin mutant have been successfully carried out, and their hydration structures, including hydrogen atoms, have been observed. Hydrogen atoms in the water molecule can be clearly identified in two boomerang-shaped water molecules, and the formation of hydrogen bonds between the two water molecules can be clearly recognized. It has been concluded that the hydration structure observed by high-resolution neutron protein crystallography shows where a water molecule is located, how it binds to neighboring atoms, and how it behaves. (M.Suetake)

  10. Template-based self-assembly for silicon chips and 01005 surface-mount components

    Science.gov (United States)

    Hoo, J. H.; Park, Kwang Soon; Baskaran, Rajashree; Böhringer, Karl F.

    2014-04-01

    We present template-based microscale self-assembly as a technique that promotes the electronics industry's initiative towards functional diversification and function densification, demonstrating that our process can improve existing assembly and packaging techniques, and also enable possibilities restricted by current industry methodologies. We first present foundational work that performs part (370 × 370 × 150 µm³) delivery to receptor sites (20 × 10 array) with a stochastic batch delivery process that completes within tens of seconds. The delivery mechanism is statistically characterized and a chemical kinetics inspired model is developed. Based on this understanding, repeatable and programmable 100% yield assembly is achieved in open-loop and feedback-based configurations. The established methodology is adapted to deliver and assemble standard 01005 format (0.016″ × 0.008″, 0.4 mm × 0.2 mm) monolithic ceramic capacitors and thin-film resistors onto silicon substrates. This process is CMOS compatible and is competitive with capacitors and resistors fabricated through standard foundry processes.

  11. A template-based computerized instruction entry system helps the communication between doctors and nurses.

    Science.gov (United States)

    Takeda, Toshihiro; Mihara, Naoki; Nakagawa, Rie; Manabe, Shiro; Shimai, Yoshie; Teramoto, Kei; Matsumura, Yasushi

    2015-01-01

    In a hospital, doctors and nurses share roles in treating admitted patients. Communication between them is necessary, and communication errors are a problem for medical safety. In Japan, verbal instruction is prohibited and doctors write their instructions on paper instruction slips. However, because it is difficult to ascertain the revision history and the active instructions on instruction slips, human errors can occur. We developed a template-based computerized instruction entry system to reduce ward workloads and contribute to medical safety. Templates enable us to input instructions easily and standardize the descriptions of instructions. By standardizing and combining the instructions into one template per instruction item, the system can prevent instructions from overlapping. We created sets of templates (e.g., admission set, preoperative set), so that doctors could enter their instructions easily. Instructions entered via any of the sets can be subdivided into separate items by the system before being submitted, and can also be changed on a per-item basis. The instructions are displayed in calendar form, which shows instruction changes and the currently active instructions. We prepared 382 standardized instruction templates. In our system, 66% of instructions were entered via templates, and 34% were entered as free-text comments. Our system prevents communication errors between medical staff.

  12. QMEANclust: estimation of protein model quality by combining a composite scoring function with structural density information

    Directory of Open Access Journals (Sweden)

    Schwede Torsten

    2009-05-01

    Background The selection of the most accurate protein model from a set of alternatives is a crucial step in protein structure prediction both in template-based and ab initio approaches. Scoring functions have been developed which can either return a quality estimate for a single model or derive a score from the information contained in the ensemble of models for a given sequence. Local structural features occurring more frequently in the ensemble have a greater probability of being correct. Within the context of the CASP experiment, these so-called consensus methods have been shown to perform considerably better in selecting good candidate models, but tend to fail if the best models are far from the dominant structural cluster. In this paper we show that model selection can be improved if both approaches are combined by pre-filtering the models used during the calculation of the structural consensus. Results Our recently published QMEAN composite scoring function has been improved by including an all-atom interaction potential term. The preliminary model ranking based on the new QMEAN score is used to select a subset of reliable models against which the structural consensus score is calculated. This scoring function, called QMEANclust, achieves a correlation coefficient of predicted quality score and GDT_TS of 0.9 averaged over the 98 CASP7 targets and performs significantly better in selecting good models from the ensemble of server models than any other group participating in the quality estimation category of CASP7. Both scoring functions are also benchmarked on the MOULDER test set consisting of 20 target proteins each with 300 alternative models generated by MODELLER. QMEAN outperforms all other tested scoring functions operating on individual models, while the consensus method QMEANclust only works properly on decoy sets containing a certain fraction of near-native conformations. We also present a local version of QMEAN for the per

  13. Structured illumination microscopy using photoswitchable fluorescent proteins

    Science.gov (United States)

    Hirvonen, Liisa; Mandula, Ondrej; Wicker, Kai; Heintzmann, Rainer

    2008-02-01

    In fluorescence microscopy the lateral resolution is limited to about 200 nm because of diffraction. Resolution improvement by a factor of two can be achieved using structured illumination, where a fine grating is projected onto the sample, and the final image is reconstructed from a set of images taken at different grating positions. Further resolution improvement can be achieved by saturating the transitions involved in fluorescence emission. Recently discovered photoswitchable proteins undergo transitions that are saturable at low illumination intensity. Combining this concept with structured illumination, theoretically unlimited resolution can be achieved, where the smallest resolvable distance will be determined by signal-to-noise ratio. This work focuses on the use of the photoswitchable protein Dronpa with structured illumination to achieve nanometre scale resolution in fixed cells.

  14. Predicting protein structure classes from function predictions

    DEFF Research Database (Denmark)

    Sommer, I.; Rahnenfuhrer, J.; de Lichtenberg, Ulrik

    2004-01-01

    We introduce a new approach to using the information contained in sequence-to-function prediction data in order to recognize protein template classes, a critical step in predicting protein structure. The data on which our method is based comprise probabilities of functional categories; for given...... query sequences these probabilities are obtained by a neural net that has previously been trained on a variety of functionally important features. On a training set of sequences we assess the relevance of individual functional categories for identifying a given structural family. Using a combination...... of the most relevant categories, the likelihood of a query sequence to belong to a specific family can be estimated. Results: The performance of the method is evaluated using cross-validation. For a fixed structural family and for every sequence, a score is calculated that measures the evidence for family...

  15. Trends in structural coverage of the protein universe and the impact of the Protein Structure Initiative

    Science.gov (United States)

    Khafizov, Kamil; Madrid-Aliste, Carlos; Almo, Steven C.; Fiser, Andras

    2014-01-01

    The exponential growth of protein sequence data provides an ever-expanding body of unannotated and misannotated proteins. The National Institutes of Health-supported Protein Structure Initiative and related worldwide structural genomics efforts facilitate functional annotation of proteins through structural characterization. Recently there have been profound changes in the taxonomic composition of sequence databases, which are effectively redefining the scope and contribution of these large-scale structure-based efforts. The faster-growing bacterial genomic entries have overtaken the eukaryotic entries over the last 5 y, but also have become more redundant. Despite the enormous increase in the number of sequences, the overall structural coverage of proteins—including proteins for which reliable homology models can be generated—on the residue level has increased from 30% to 40% over the last 10 y. Structural genomics efforts contributed ∼50% of this new structural coverage, despite determining only ∼10% of all new structures. Based on current trends, it is expected that ∼55% structural coverage (the level required for significant functional insight) will be achieved within 15 y, whereas without structural genomics efforts, realizing this goal will take approximately twice as long. PMID:24567391

  16. Protein Structure and the Sequential Structure of mRNA

    DEFF Research Database (Denmark)

    Brunak, Søren; Engelbrecht, Jacob

    1996-01-01

    protein. The degeneracy of the genetic code allows for a biased selection of codons which may control the translational rate of the ribosome, and may thus in vivo have a catalyzing effect on the folding of the polypeptide chain. A complete search for GenBank nucleotide sequences coding for structural...

  17. Photoinduced structural changes to protein kinase A

    Science.gov (United States)

    Rozinek, Sarah C.; Thomas, Robert J.; Brancaleon, Lorenzo

    2014-03-01

    The importance of porphyrins in organisms is underscored by the ubiquitous biological and biochemical functions that are mediated by these compounds and by their potential biomedical and biotechnological applications. Protoporphyrin IX (PPIX) is the precursor to heme and has biomedical applications such as its use as a photosensitizer in phototherapy and photodetection of cancer. Among other applications, our group has demonstrated that low-irradiance exposure to laser irradiation of PPIX, Fe-PPIX, or meso-tetrakis(4-sulfonatophenyl)porphyrin (TSPP) non-covalently docked to a protein causes conformational changes in the polypeptide. Such an approach can have remarkable consequences in the study of protein structure/function relationships and can be used to prompt non-native protein properties. Therefore we have investigated protein kinase A (PKA), a more relevant protein model towards the photo-treatment of cancer. PKA's enzymatic functions are regulated by the presence of cyclic adenosine monophosphate for intracellular signal transduction involved in, among other things, stimulation of transcription, tumorigenesis in Carney complex and migration of breast carcinoma cells. Since phosphorylation is a necessary step in some cancers and inflammatory diseases, inhibiting the protein kinase, and therefore phosphorylation, may serve to treat these diseases. Changes in absorption, steady-state fluorescence, and fluorescence lifetime indicate: 1) both TSPP and PPIX non-covalently bind to PKA where they maintain photoreactivity; 2) absorptive photoproduct formation occurs only when PKA is bound to TSPP and irradiated; and 3) PKA undergoes secondary structural changes after irradiation with either porphyrin bound. These photoinduced changes could affect the protein's enzymatic and signaling capabilities.

  18. The structure and function of endophilin proteins

    DEFF Research Database (Denmark)

    Kjaerulff, Ole; Brodin, Lennart; Jung, Anita

    2011-01-01

    Members of the BAR domain protein superfamily are essential elements of cellular traffic. Endophilins are among the best studied BAR domain proteins. They have a prominent function in synaptic vesicle endocytosis (SVE), receptor trafficking and apoptosis, and in other processes that require remodeling of the membrane structure. Here, we discuss the role of endophilins in these processes and summarize novel insights into the molecular aspects of endophilin function. Also, we discuss phosphorylation of endophilins and how this and other mechanisms may contribute to disease.

  19. Electronic structure of bacterial surface protein layers

    Science.gov (United States)

    Maslyuk, Volodymyr V.; Mertig, Ingrid; Bredow, Thomas; Mertig, Michael; Vyalikh, Denis V.; Molodtsov, Serguei L.

    2008-01-01

    We report an approach for the calculation of the electronic density of states of the dried two-dimensional crystalline surface protein layer (S layer) of the bacterium Bacillus sphaericus NCTC 9602. The proposed model is based on the consideration of individual amino acids in the corresponding conformation of the peptide chain which additively contribute to the electronic structure of the entire protein complex. The derived results agree well with the experimental data obtained by means of photoemission (PE), resonant PE, and near-edge x-ray absorption spectroscopy.

  20. Template-based automatic breast segmentation on MRI by excluding the chest region

    Energy Technology Data Exchange (ETDEWEB)

    Lin, Muqing [Tu and Yuen Center for Functional Onco-Imaging, Department of Radiological Sciences, University of California, Irvine, California 92697-5020 and National-Regional Key Technology Engineering Laboratory for Medical Ultrasound, Guangdong Key Laboratory for Biomedical Measurements and Ultrasound Imaging, Department of Biomedical Engineering, School of Medicine, Shenzhen University, 518060 China (China); Chen, Jeon-Hor [Tu and Yuen Center for Functional Onco-Imaging, Department of Radiological Sciences, University of California, Irvine, California 92697-5020 and Department of Radiology, E-Da Hospital and I-Shou University, Kaohsiung 82445, Taiwan (China); Wang, Xiaoyong; Su, Min-Ying, E-mail: msu@uci.edu [Tu and Yuen Center for Functional Onco-Imaging, Department of Radiological Sciences, University of California, Irvine, California 92697-5020 (United States); Chan, Siwa [Department of Radiology, Taichung Veterans General Hospital, Taichung 40407, Taiwan (China); Chen, Siping [National-Regional Key Technology Engineering Laboratory for Medical Ultrasound, Guangdong Key Laboratory for Biomedical Measurements and Ultrasound Imaging, Department of Biomedical Engineering, School of Medicine, Shenzhen University, 518060 China (China)

    2013-12-15

    Purpose: Methods for quantification of breast density on MRI using semiautomatic approaches are commonly used. In this study, the authors report on a fully automatic chest template-based method. Methods: Nonfat-suppressed breast MR images from 31 healthy women were analyzed. Among them, one case was randomly selected and used as the template, and the remaining 30 cases were used for testing. Unlike most model-based breast segmentation methods that use the breast region as the template, the chest body region on a middle slice was used as the template. Within the chest template, three body landmarks (thoracic spine and bilateral boundary of the pectoral muscle) were identified for performing the initial V-shape cut to determine the posterior lateral boundary of the breast. The chest template was mapped to each subject's image space to obtain a subject-specific chest model for exclusion. On the remaining image, the chest wall muscle was identified and excluded to obtain clean breast segmentation. The chest and muscle boundaries determined on the middle slice were used as the reference for the segmentation of adjacent slices, and the process continued superiorly and inferiorly until all 3D slices were segmented. The segmentation results were evaluated by an experienced radiologist to mark voxels that were wrongly included or excluded for error analysis. Results: The breast volumes measured by the proposed algorithm were very close to the radiologist's corrected volumes, showing a % difference ranging from 0.01% to 3.04% in 30 tested subjects with a mean of 0.86% ± 0.72%. The total error was calculated by adding the inclusion and the exclusion errors (so they did not cancel each other out), which ranged from 0.05% to 6.75% with a mean of 3.05% ± 1.93%. The fibroglandular tissue segmented within the breast region determined by the algorithm and the radiologist were also very close, showing a % difference ranging from 0.02% to 2.52% with a mean of 1.03% ± 1

  1. DTI template-based estimation of cardiac fiber orientations from 3D ultrasound.

    Science.gov (United States)

    Qin, Xulei; Fei, Baowei

    2015-06-01

    Cardiac muscle fibers directly affect the mechanical, physiological, and pathological properties of the heart. Patient-specific quantification of cardiac fiber orientations is an important but difficult problem in cardiac imaging research. In this study, the authors proposed a cardiac fiber orientation estimation method based on three-dimensional (3D) ultrasound images and a cardiac fiber template that was obtained from magnetic resonance diffusion tensor imaging (DTI). A DTI template-based framework was developed to estimate cardiac fiber orientations from 3D ultrasound images using an animal model. It estimated the cardiac fiber orientations of the target heart by deforming the fiber orientations of the template heart, based on the deformation field of the registration between the ultrasound geometry of the target heart and the MRI geometry of the template heart. In the experiments, the animal hearts were imaged by high-frequency ultrasound, T1-weighted MRI, and high-resolution DTI. The proposed method was evaluated by four different parameters: Dice similarity coefficient (DSC), target errors, acute angle error (AAE), and inclination angle error (IAE). Its ability to estimate cardiac fiber orientations was first validated on a public database. Then, the performance of the proposed method on 3D ultrasound data was evaluated on an acquired database. Their average values were 95.4% ± 2.0% for the DSC of geometric registrations, 21.0° ± 0.76° for AAE, and 19.4° ± 1.2° for IAE of fiber orientation estimations. Furthermore, the feasibility of this framework was also demonstrated on 3D ultrasound images of a beating heart. The proposed framework demonstrated the feasibility of using 3D ultrasound imaging to estimate cardiac fiber orientation of in vivo beating hearts; further improvements could contribute to understanding the dynamic mechanism of the beating heart and have the potential to help diagnosis and therapy of heart disease.
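
    The Dice similarity coefficient named in the abstract above is a standard overlap measure, 2|A ∩ B| / (|A| + |B|). The sketch below is only an illustration of that formula on flattened binary masks; the function name and the convention for two empty masks are assumptions, not taken from the paper.

```python
def dice_coefficient(mask_a, mask_b):
    """Dice overlap between two binary masks given as flat sequences of 0/1."""
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must contain the same number of voxels")
    size_a = sum(1 for a in mask_a if a)          # |A|
    size_b = sum(1 for b in mask_b if b)          # |B|
    overlap = sum(1 for a, b in zip(mask_a, mask_b) if a and b)  # |A ∩ B|
    if size_a + size_b == 0:
        # Assumed convention: two empty masks agree perfectly.
        return 1.0
    return 2.0 * overlap / (size_a + size_b)
```

    A value of 1.0 means the two masks coincide; 0.0 means they share no voxel.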

  2. Protein Structure By FTIR Self-Deconvolution

    Science.gov (United States)

    Byler, D. Michael; Susi, Heino

    1985-12-01

    Fourier self-deconvolution was applied to the peptide-carbonyl stretching vibration (amide I mode) of more than 20 globular proteins in deuterium oxide solution. This band, which usually exhibits little discernible fine structure, was thereby resolved into three to nine components. The individual components were assigned to protein segments consisting of extended chains, helices, and various turns and bends. The areas of the components were evaluated by Gauss-Newton iterative curve fitting with the assumption of Gaussian band shapes. Quantitative estimations regarding secondary structure were made by calculating the sum of the areas of the components associated with a particular conformation as a fraction of the total amide I band area. The results for helix content and for extended chain content are in good agreement with literature values obtained from X-ray data.

  3. Protein Structure Prediction with Evolutionary Algorithms

    Energy Technology Data Exchange (ETDEWEB)

    Hart, W.E.; Krasnogor, N.; Pelta, D.A.; Smith, J.

    1999-02-08

    Evolutionary algorithms have been successfully applied to a variety of molecular structure prediction problems. In this paper we reconsider the design of genetic algorithms that have been applied to a simple protein structure prediction problem. Our analysis considers the impact of several algorithmic factors for this problem: the conformational representation, the energy formulation and the way in which infeasible conformations are penalized. Further, we empirically evaluated the impact of these factors on a small set of polymer sequences. Our analysis leads to specific recommendations for both GAs as well as other heuristic methods for solving PSP on the HP model.
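
    For readers unfamiliar with the HP model mentioned above, the following is a minimal sketch of its usual energy function on a 2D square lattice: each pair of hydrophobic (H) residues that are lattice neighbours but not chain neighbours contributes -1. The overlap penalty is an illustrative assumption standing in for "the way in which infeasible conformations are penalized"; the report's actual energy formulation and penalty scheme are not given in the abstract.

```python
def hp_energy(sequence, coords, overlap_penalty=10):
    """Energy of an HP-model chain on a 2D square lattice.

    sequence: string of 'H' (hydrophobic) and 'P' (polar) residues.
    coords:   one (x, y) lattice position per residue, in chain order.
    overlap_penalty: hypothetical cost per pair of residues sharing a site.
    """
    coords = [tuple(c) for c in coords]
    if len(sequence) != len(coords):
        raise ValueError("need exactly one lattice position per residue")

    def lattice_distance(p, q):
        return abs(p[0] - q[0]) + abs(p[1] - q[1])

    # Consecutive residues must occupy adjacent lattice sites.
    for p, q in zip(coords, coords[1:]):
        if lattice_distance(p, q) != 1:
            raise ValueError("chain is not connected on the lattice")

    contacts = 0
    overlaps = 0
    n = len(coords)
    for i in range(n):
        for j in range(i + 2, n):  # skip chain neighbours
            d = lattice_distance(coords[i], coords[j])
            if d == 0:
                overlaps += 1      # infeasible: two residues on one site
            elif d == 1 and sequence[i] == "H" and sequence[j] == "H":
                contacts += 1      # favourable H-H contact
    return -contacts + overlap_penalty * overlaps
```

    For example, the four-residue chain HPPH folded around a unit square brings its two H residues into contact, while the same chain laid out straight has no contacts.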

  4. Structural Simulation of MHC-peptide Interactions using T-cell Epitope in Iron-acquisition Protein of N. meningitides for Vaccine Design

    Directory of Open Access Journals (Sweden)

    Namrata Mishra

    2010-12-01

    The present work uses a structural simulation approach to identify the potential target vaccine candidates or T cell epitopes (antigenic regions that can activate a T cell response) in two iron acquisition proteins from Neisseria. An iron regulated outer membrane protein frpB: extracellular, [NMB1988], and a Major ferric Iron-binding protein fbpA: periplasmic, [NMB0634] critical for the survival of the pathogen in the host were used. Ten novel promiscuous epitopes from the two iron acquisition proteins were identified using a bioinformatics interface. Of these epitopes, 630VQKAVGSIL638 present on frpB with high binding affinity for allele HLA*DR1 was identified with an anchor position at P2, an aliphatic residue at P4 and glycine at P6, making it a potential quality choice for linking peptide-loaded MHC dynamics to T-cell activation and vaccine constructs. The feasibility and structural binding of the predicted peptide to the respective HLA allele was investigated by molecular modeling and template-based structural simulation. The conformational properties of the linear peptide were investigated by molecular dynamics using the GROMOS96 package and Swiss PDB viewer.

  5. Are specialized servers better at predicting protein structures than ...

    African Journals Online (AJOL)

    This research study answers the question of which technology is the best for predicting protein structures. Stand-alone software depends only on protein structure prediction algorithms, while web servers consult a number of other sources such as meta servers and protein data banks to produce a protein structure achieved ...

  6. Protein-mediated surface structuring in biomembranes

    Directory of Open Access Journals (Sweden)

    Maggio B.

    2005-01-01

    The lipids and proteins of biomembranes exhibit highly dissimilar conformations, geometrical shapes, amphipathicity, and thermodynamic properties which constrain their two-dimensional molecular packing, electrostatics, and interaction preferences. This causes inevitable development of large local tensions that frequently relax into phase or compositional immiscibility along lateral and transverse planes of the membrane. On the other hand, these effects constitute the very codes that mediate molecular and structural changes determining and controlling the possibilities for enzymatic activity, apposition and recombination in biomembranes. The presence of proteins constitutes a major perturbing factor for the membrane sculpturing both in terms of its surface topography and dynamics. We will focus on some results from our group within this context and summarize some recent evidence for the active involvement of extrinsic (myelin basic protein), integral (Folch-Lees proteolipid protein) and amphitropic (c-Fos and c-Jun) proteins, as well as a membrane-active amphitropic phosphohydrolytic enzyme (neutral sphingomyelinase), in the process of lateral segregation and dynamics of phase domains, sculpturing of the surface topography, and the bi-directional modulation of the membrane biochemical reactivity.

  7. Hydrophobic interactions of sucralose with protein structures.

    Science.gov (United States)

    Shukla, Nimesh; Pomarico, Enrico; Hecht, Cody J S; Taylor, Erika A; Chergui, Majed; Othon, Christina M

    2018-02-01

    Sucralose is a commonly employed artificial sweetener that appears to destabilize protein native structures. This is in direct contrast to the bio-preservative nature of its natural counterpart, sucrose, which enhances the stability of biomolecules against environmental stress. We have further explored the molecular interactions of sucralose as compared to sucrose to illuminate the origin of the differences in their bio-preservative efficacy. We show that the mode of interactions of sucralose and sucrose in bulk solution differ subtly through the use of hydration dynamics measurement and computational simulation. Sucralose does not appear to disturb the native state of proteins for moderate concentrations (sucralose appears to differ in its interactions with protein leading to the reduction of native state stability. This difference in interaction appears weak. We explored the difference in the preferential exclusion model using time-resolved spectroscopic techniques and observed that both molecules appear to be effective reducers of bulk hydration dynamics. However, the chlorination of sucralose appears to slightly enhance the hydrophobicity of the molecule, which reduces the preferential exclusion of sucralose from the protein-water interface. The weak interaction of sucralose with hydrophobic pockets on the protein surface differs from the behavior of sucrose. We experimentally followed up upon the extent of this weak interaction using isothermal titration calorimetry (ITC) measurements. We propose this as a possible origin for the difference in their bio-preservative properties. Copyright © 2017 Elsevier Inc. All rights reserved.

  8. Protein structure quality assessment based on the distance profiles of consecutive backbone Cα atoms [v3; ref status: indexed, http://f1000r.es/2kg]

    Directory of Open Access Journals (Sweden)

    Sandeep Chakraborty

    2013-12-01

    Predicting the three-dimensional native state structure of a protein from its primary sequence is an unsolved grand challenge in molecular biology. Two main computational approaches have evolved to obtain the structure from the protein sequence - ab initio/de novo methods and template-based modeling - both of which typically generate multiple possible native state structures. Model quality assessment programs (MQAP) validate these predicted structures in order to identify the correct native state structure. Here, we propose a MQAP for assessing the quality of protein structures based on the distances of consecutive Cα atoms. We hypothesize that the root-mean-square deviation of the distance of consecutive Cα atoms (RDCC) from the ideal value of 3.8 Å, derived from a statistical analysis of high quality protein structures (top100H database), is minimized in native structures. Based on tests with the top100H set, we propose a RDCC cutoff value of 0.012 Å, above which a structure can be filtered out as a non-native structure. We applied the RDCC discriminator on decoy sets from the Decoys 'R' Us database to show that the native structures in all decoy sets tested have RDCC below the 0.012 Å cutoff. While most decoy sets were either indistinguishable using this discriminator or had very few violations, all the decoy structures in the fisa decoy set were discriminated by applying the RDCC criterion. This highlights the physical non-viability of the fisa decoy set, and possible issues in benchmarking other methods using this set. The source code and manual are made available at https://github.com/sanchak/mqap and permanently available on 10.5281/zenodo.7134.
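
    The RDCC measure described in this abstract can be sketched directly from its wording: take the distance between each pair of consecutive Cα atoms, subtract the ideal 3.8 Å, and form the root mean square of those deviations. The code below is one plausible reading of that description, not the authors' implementation (which is in the repository cited above); the function names are hypothetical, and only the 3.8 Å ideal and the 0.012 Å cutoff come from the abstract.

```python
import math

IDEAL_CA_DISTANCE = 3.8   # Å, ideal consecutive Cα-Cα distance (from the abstract)
RDCC_CUTOFF = 0.012       # Å, proposed filter threshold (from the abstract)


def rdcc(ca_coords, ideal=IDEAL_CA_DISTANCE):
    """Root-mean-square deviation of consecutive Cα-Cα distances from `ideal`.

    ca_coords: (x, y, z) positions of the Cα atoms of one chain, in order.
    """
    if len(ca_coords) < 2:
        raise ValueError("need at least two Cα atoms")
    squared_deviations = [
        (math.dist(p, q) - ideal) ** 2
        for p, q in zip(ca_coords, ca_coords[1:])
    ]
    return math.sqrt(sum(squared_deviations) / len(squared_deviations))


def passes_rdcc_filter(ca_coords, cutoff=RDCC_CUTOFF):
    """True if the chain would not be filtered out as non-native by the cutoff."""
    return rdcc(ca_coords) <= cutoff
```

    In practice the coordinates would be read from the CA records of a PDB file, one chain at a time, since the distance across a chain break is not a consecutive Cα distance.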

  9. Protein structure quality assessment based on the distance profiles of consecutive backbone Cα atoms [v1; ref status: indexed, http://f1000r.es/1tg]

    Directory of Open Access Journals (Sweden)

    Sandeep Chakraborty

    2013-10-01

    Predicting the three-dimensional native state structure of a protein from its primary sequence is an unsolved grand challenge in molecular biology. Two main computational approaches have evolved to obtain the structure from the protein sequence - ab initio/de novo methods and template-based modeling - both of which typically generate multiple possible native state structures. Model quality assessment programs (MQAP) validate these predicted structures in order to identify the correct native state structure. Here, we propose a MQAP for assessing the quality of protein structures based on the distances of consecutive Cα atoms. We hypothesize that the root-mean-square deviation of the distance of consecutive Cα atoms (RDCC) from the ideal value of 3.8 Å, derived from a statistical analysis of high quality protein structures (top100H database), is minimized in native structures. Based on tests with the top100H set, we propose a RDCC cutoff value of 0.012 Å, above which a structure can be filtered out as a non-native structure. We applied the RDCC discriminator on decoy sets from the Decoys 'R' Us database to show that the native structures in all decoy sets tested have RDCC below the 0.012 Å cutoff. While most decoy sets were either indistinguishable using this discriminator or had very few violations, all the decoy structures in the fisa decoy set were discriminated by applying the RDCC criterion. This highlights the physical non-viability of the fisa decoy set, and possible issues in benchmarking other methods using this set. The source code and manual are made available at https://github.com/sanchak/mqap and permanently available on 10.5281/zenodo.7134.

  10. PSIbase: a database of Protein Structural Interactome map (PSIMAP).

    Science.gov (United States)

    Gong, Sungsam; Yoon, Giseok; Jang, Insoo; Bolser, Dan; Dafas, Panos; Schroeder, Michael; Choi, Hansol; Cho, Yoobok; Han, Kyungsook; Lee, Sunghoon; Choi, Hwanho; Lappe, Michael; Holm, Liisa; Kim, Sangsoo; Oh, Donghoon; Bhak, Jonghwa

    2005-05-15

    Protein Structural Interactome map (PSIMAP) is a global interaction map that describes domain-domain and protein-protein interaction information for known Protein Data Bank structures. It calculates the Euclidean distance to determine interactions between possible pairs of structural domains in proteins. PSIbase is a database and file server for protein structural interaction information calculated by the PSIMAP algorithm. PSIbase also provides an easy-to-use protein domain assignment module, interaction navigation and visual tools. Users can retrieve possible interaction partners of their proteins of interests if a significant homology assignment is made with their query sequences. http://psimap.org and http://psibase.kaist.ac.kr/
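
    The abstract says only that PSIMAP uses Euclidean distance to decide whether two structural domains interact. A minimal sketch of a distance-based contact test is given below; the 5 Å distance cutoff, the minimum contact count and the function names are illustrative assumptions and should not be read as PSIMAP's actual parameters or code.

```python
import math


def count_contacts(domain_a, domain_b, distance_cutoff=5.0):
    """Number of atom pairs (one atom from each domain) within the cutoff.

    domain_a, domain_b: lists of (x, y, z) atom coordinates in Å.
    distance_cutoff:    assumed threshold in Å, for illustration only.
    """
    return sum(
        1
        for p in domain_a
        for q in domain_b
        if math.dist(p, q) <= distance_cutoff
    )


def domains_interact(domain_a, domain_b, distance_cutoff=5.0, min_contacts=5):
    """Call two domains interacting if they share at least `min_contacts` close pairs."""
    return count_contacts(domain_a, domain_b, distance_cutoff) >= min_contacts
```

    The all-pairs loop is quadratic in the number of atoms; for whole-database scans a spatial grid or k-d tree would normally replace it.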

  11. Protein mechanics: a route from structure to function

    Indian Academy of Sciences (India)

    PRAKASH KUMAR

    Why do proteins have such varied and complicated structures and how are these structures related to the functions that each protein must perform? Almost 50 years after the first protein structures were solved (Kendrew et al 1958; Perutz 1960), these questions are still very much part of molecular biology. While structures ...

  12. Quaternion maps of global protein structure.

    Science.gov (United States)

    Hanson, Andrew J; Thakur, Sidharth

    2012-09-01

    The geometric structures of proteins are vital to the understanding of biochemical interactions. However, there is much yet to be understood about the spatial arrangements of the chains of amino acids making up any given protein. In particular, while conventional analysis tools like the Ramachandran plot supply some insight into the local relative orientation of pairs of amino acid residues, they provide little information about the global relative orientations of large groups of residues. We apply quaternion maps to families of coordinate frames defined naturally by amino acid residue structures as a way to expose global spatial relationships among residues within proteins. The resulting visualizations enable comparisons of absolute orientations as well as relative orientations, and thus generalize the framework of the Ramachandran plot. There are a variety of possible quaternion frames and visual representation strategies that can be chosen, and very complex quaternion maps can result. Just as Ramachandran plots are useful for addressing particular questions and not others, quaternion tools have characteristic domains of relevance. In particular, quaternion maps show great potential for answering specific questions about global residue alignment in crystallographic data and statistical orientation properties in Nuclear Magnetic Resonance (NMR) data that are very difficult to treat by other methods. Copyright © 2012 Elsevier Inc. All rights reserved.

  13. Automated template-based brain localization and extraction for fetal brain MRI reconstruction.

    Science.gov (United States)

    Tourbier, Sébastien; Velasco-Annis, Clemente; Taimouri, Vahid; Hagmann, Patric; Meuli, Reto; Warfield, Simon K; Bach Cuadra, Meritxell; Gholipour, Ali

    2017-07-15

    Most fetal brain MRI reconstruction algorithms rely only on brain tissue-relevant voxels of low-resolution (LR) images to enhance the quality of inter-slice motion correction and image reconstruction. Consequently the fetal brain needs to be localized and extracted as a first step, which is usually a laborious and time consuming manual or semi-automatic task. We have proposed in this work to use age-matched template images as prior knowledge to automatize brain localization and extraction. This has been achieved through a novel automatic brain localization and extraction method based on robust template-to-slice block matching and deformable slice-to-template registration. Our template-based approach has also enabled the reconstruction of fetal brain images in standard radiological anatomical planes in a common coordinate space. We have integrated this approach into our new reconstruction pipeline that involves intensity normalization, inter-slice motion correction, and super-resolution (SR) reconstruction. To this end we have adopted a novel approach based on projection of every slice of the LR brain masks into the template space using a fusion strategy. This has enabled the refinement of brain masks in the LR images at each motion correction iteration. The overall brain localization and extraction algorithm has been shown to produce brain masks that are very close to manually drawn brain masks, showing an average Dice overlap measure of 94.5%. We have also demonstrated that adopting a slice-to-template registration and propagation of the brain mask slice-by-slice leads to a significant improvement in brain extraction performance compared to global rigid brain extraction and consequently in the quality of the final reconstructed images. Ratings performed by two expert observers show that the proposed pipeline can achieve similar reconstruction quality to reference reconstruction based on manual slice-by-slice brain extraction. The proposed brain mask refinement and

  14. General overview on structure prediction of twilight-zone proteins.

    Science.gov (United States)

    Khor, Bee Yin; Tye, Gee Jun; Lim, Theam Soon; Choong, Yee Siew

    2015-09-04

    Protein structure prediction from amino acid sequence has been one of the most challenging aspects in computational structural biology despite significant progress in recent years shown by critical assessment of protein structure prediction (CASP) experiments. When experimentally determined structures are unavailable, the predictive structures may serve as starting points to study a protein. If the target protein consists of homologous region, high-resolution (typically protein (also known as twilight-zone protein, sequence identity with available templates is less than 30%), the protein structure prediction has to be initiated from scratch. Traditionally, twilight-zone proteins can be predicted via threading or ab initio method. Based on the current trend, combination of different methods brings an improved success in the prediction of twilight-zone proteins. In this mini review, the methods, progresses and challenges for the prediction of twilight-zone proteins were discussed.

  15. Synonymous codon usage in different protein secondary structural ...

    Indian Academy of Sciences (India)

    PRAKASH KUMAR

    2007-06-21

    The relationship between the synonymous codon usage and different protein secondary structural classes was investigated using 401 Homo sapiens proteins extracted from Protein Data Bank (PDB). A simple Chi-square ...

  16. Towards optimal alignment of protein structure distance matrices

    NARCIS (Netherlands)

    I. Wohlers (Inken); F.S. Domingues; G.W. Klau (Gunnar)

    2010-01-01

    MOTIVATION: Structural alignments of proteins are important for identification of structural similarities, homology detection and functional annotation. The structural alignment problem is well studied and computationally difficult. Many different scoring schemes for structural

  17. Structure based alignment and clustering of proteins (STRALCP)

    Science.gov (United States)

    Zemla, Adam T.; Zhou, Carol E.; Smith, Jason R.; Lam, Marisa W.

    2013-06-18

    Disclosed are computational methods of clustering a set of protein structures based on local and pair-wise global similarity values. Pair-wise local and global similarity values are generated based on pair-wise structural alignments for each protein in the set of protein structures. Initially, the protein structures are clustered based on pair-wise local similarity values. The protein structures are then clustered based on pair-wise global similarity values. For each given cluster both a representative structure and spans of conserved residues are identified. The representative protein structure is used to assign newly-solved protein structures to a group. The spans are used to characterize conservation and assign a "structural footprint" to the cluster.

  18. NMR Studies of Protein Structure and Dynamics

    Science.gov (United States)

    Li, Xiang

    Available from UMI in association with The British Library. Requires signed TDF. This thesis describes applications of 2D homonuclear NMR techniques to the study of protein structure and dynamics in solution. The sequential assignments for the 36-residue bovine Pancreatic Polypeptide (bPP) are reported. The secondary and tertiary structure of bPP in solution has been determined from experimental NMR data. bPP has a well defined C-terminal alpha-helix and a rather ordered conformation in the N-terminal region. The two segments are joined by a turn which is poorly defined. Both the N- and the C-terminus are highly disordered. The mean solution structure of bPP is remarkably similar to the crystal structure of avian Pancreatic Polypeptide (aPP). The average conformations of most side-chains from the alpha-helix of bPP in solution are closely similar to those of aPP in the crystalline state. A large number of side-chains of bPP, however, show significant conformational averaging in solution. The 89-residue kringle domain of urokinase from both human and recombinant sources has been investigated. Sequential assignments based primarily on the recombinant sample and the determination of secondary structure are presented. Two helices have been identified; one of these corresponds to that reported for t-PA kringle 2, but does not exist in other kringles with known structures. The second helix is thus far unique to the urokinase kringle. Three antiparallel beta-sheets and three tight turns have also been identified. The tertiary fold of the molecule conforms broadly to that found for other kringles. Three regions in the urokinase kringle exhibit high local mobility; one of these, the Pro56-Pro62 segment, forms part of the proposed binding site. The other two mobile regions are the N- and C-termini which are likely to form the interfaces between the kringle and the other two domains (EGF and protease) in urokinase. The differential dynamic behaviours of the kringle and

  19. Quantum chemical studies of protein structure

    Science.gov (United States)

    Oldfield, Eric

    2004-01-01

    Quantum chemical methods now permit the prediction of many spectroscopic observables in proteins and related model systems, in addition to electrostatic properties, which are found to be in excellent accord with those determined from experiment. I discuss the developments over the past decade in these areas, including predictions of nuclear magnetic resonance chemical shifts, chemical shielding tensors, scalar couplings and hyperfine (contact) shifts, the isomer shifts and quadrupole splittings in Mössbauer spectroscopy, molecular energies and conformations, as well as a range of electrostatic properties, such as charge densities, the curvatures, Laplacians and Hessians of the charge density, electrostatic potentials, electric field gradients and electrostatic field effects. The availability of structure/spectroscopic correlations from quantum chemistry provides a basis for using numerous spectroscopic observables in determining aspects of protein structure, in determining electrostatic properties which are not readily accessible from experiment, as well as giving additional confidence in the use of these techniques to investigate questions about chemical bonding and chemical reactions. PMID:16147526

  20. Protein flexibility in the light of structural alphabets

    Directory of Open Access Journals (Sweden)

    Pierrick Craveur

    2015-05-01

    Protein structures are valuable tools to understand protein function. Nonetheless, proteins are often considered as rigid macromolecules while their structures exhibit specific flexibility, which is essential to complete their functions. Analyses of protein structures and dynamics are often performed with a simplified three-state description, i.e. the classical secondary structures. A more precise and complete description of protein backbone conformation can be obtained using libraries of small protein fragments that are able to approximate every part of protein structures. These libraries, called structural alphabets (SAs), have been widely used in the structure analysis field, from definition of ligand binding sites to superimposition of protein structures. SAs are also well suited to analyze the dynamics of protein structures. Here, we review innovative approaches that investigate protein flexibility based on SA descriptions. Coupled to various sources of experimental data (e.g. B-factors) and computational methodology (e.g. molecular dynamics simulation), SAs turn out to be powerful tools to analyze protein dynamics, e.g. to examine allosteric mechanisms in large sets of structures in complexes, or to identify order/disorder transitions. SAs were also shown to be quite efficient at predicting protein flexibility from amino-acid sequence. Finally, in this review, we exemplify the interest of SAs for studying flexibility with different cases of proteins implicated in pathologies and diseases.

  1. Computational Methods for Protein Structure Prediction and Modeling Volume 2: Structure Prediction

    CERN Document Server

    Xu, Ying; Liang, Jie

    2007-01-01

    Volume 2 of this two-volume sequence focuses on protein structure prediction and includes protein threading, de novo methods, applications to membrane proteins and protein complexes, structure-based drug design, as well as structure prediction as a systems problem. A series of appendices review the biological and chemical basics related to protein structure, computer science for structural informatics, and prerequisite mathematics and statistics.

  2. Structure-based druggability assessment of the mammalian structural proteome with inclusion of light protein flexibility.

    Directory of Open Access Journals (Sweden)

    Kathryn A Loving

    2014-07-01

    Advances reported over the last few years and the increasing availability of protein crystal structure data have greatly improved structure-based druggability approaches. However, in practice, nearly all druggability estimation methods are applied to protein crystal structures as rigid proteins, with protein flexibility often not directly addressed. The inclusion of protein flexibility is important in correctly identifying the druggability of pockets that would be missed by methods based solely on the rigid crystal structure. These include cryptic pockets and flexible pockets often found at protein-protein interaction interfaces. Here, we apply an approach that uses protein modeling in concert with druggability estimation to account for light protein backbone movement and protein side-chain flexibility in protein binding sites. We assess the advantages and limitations of this approach on widely-used protein druggability sets. Applying the approach to all mammalian protein crystal structures in the PDB results in identification of 69 proteins with potential druggable cryptic pockets.

  3. Structure-based druggability assessment of the mammalian structural proteome with inclusion of light protein flexibility.

    Science.gov (United States)

    Loving, Kathryn A; Lin, Andy; Cheng, Alan C

    2014-07-01

    Advances reported over the last few years and the increasing availability of protein crystal structure data have greatly improved structure-based druggability approaches. However, in practice, nearly all druggability estimation methods are applied to protein crystal structures as rigid proteins, with protein flexibility often not directly addressed. The inclusion of protein flexibility is important in correctly identifying the druggability of pockets that would be missed by methods based solely on the rigid crystal structure. These include cryptic pockets and flexible pockets often found at protein-protein interaction interfaces. Here, we apply an approach that uses protein modeling in concert with druggability estimation to account for light protein backbone movement and protein side-chain flexibility in protein binding sites. We assess the advantages and limitations of this approach on widely-used protein druggability sets. Applying the approach to all mammalian protein crystal structures in the PDB results in identification of 69 proteins with potential druggable cryptic pockets.

  4. The utility of comparative models and the local model quality for protein crystal structure determination by Molecular Replacement

    Directory of Open Access Journals (Sweden)

    Pawlowski Marcin

    2012-11-01

    Background: Computational models of protein structures have proved useful as search models in Molecular Replacement (MR), a common method to solve the phase problem faced by macromolecular crystallography. The success of MR depends on the accuracy of a search model. Unfortunately, this parameter remains unknown until the final structure of the target protein is determined. During the last few years, several Model Quality Assessment Programs (MQAPs) that predict the local accuracy of theoretical models have been developed. In this article, we analyze whether the application of MQAPs improves the utility of theoretical models in MR. Results: For our dataset of 615 search models, the real local accuracy of a model increases the MR success ratio by 101% compared to corresponding polyalanine templates. In contrast, when local model quality is not utilized in MR, the computational models solved only 4.5% more MR searches than polyalanine templates. For the same dataset of 615 models, a workflow combining MR with the predicted local accuracy of a model found 45% more correct solutions than polyalanine templates. To predict such accuracy, MetaMQAPclust, a “clustering MQAP”, was used. Conclusions: Using comparative models only marginally increases the MR success ratio in comparison to polyalanine structures of templates. However, the situation changes dramatically once comparative models are used together with their predicted local accuracy. A new functionality was added to the GeneSilico Fold Prediction Metaserver in order to build models that are more useful for MR searches. Additionally, we have developed a simple method, AmIgoMR (Am I good for MR?), to predict if an MR search with a template-based model for a given template is likely to find the correct solution.

  5. Searching protein 3-D structures for optimal structure alignment using intelligent algorithms and data structures.

    Science.gov (United States)

    Novosád, Tomáš; Snášel, Václav; Abraham, Ajith; Yang, Jack Y

    2010-11-01

    In this paper, we present a novel algorithm for measuring protein similarity based on their 3-D structure (protein tertiary structure). The algorithm uses a suffix tree for discovering common parts of the main chains of all proteins appearing in the current Research Collaboratory for Structural Bioinformatics Protein Data Bank (PDB). By identifying these common parts, we build a vector model and use some classical information retrieval (IR) algorithms based on the vector model to measure the all-to-all similarity between proteins. For the calculation of protein similarity, we use the term frequency × inverse document frequency (tf × idf) term weighting scheme and the cosine similarity measure. The goal of this paper is to introduce a new protein similarity metric based on suffix trees and IR methods. The whole current PDB database was used to demonstrate the very good time complexity of the algorithm as well as its high precision. We have chosen the structural classification of proteins (SCOP) database for verification of the precision of our algorithm because it is maintained primarily by humans. A further success of this approach would be the ability to determine the SCOP categories of proteins not included in the latest version of the SCOP database (v. 1.75) with nearly 100% precision.
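
    The tf × idf weighting and cosine similarity named in this abstract can be illustrated with a minimal sketch. It assumes each protein has already been reduced to a list of fragment labels; the labels `f1` to `f6`, the toy data, and the function names are hypothetical, and the suffix-tree step that would produce real fragments is not shown.

```python
import math
from collections import Counter


def tfidf_vectors(documents):
    """Return one sparse {term: weight} vector per document (tf x idf)."""
    n_docs = len(documents)
    # Document frequency: in how many documents does each term occur?
    df = Counter(term for doc in documents for term in set(doc))
    vectors = []
    for doc in documents:
        tf = Counter(doc)
        vectors.append(
            {t: (c / len(doc)) * math.log(n_docs / df[t]) for t, c in tf.items()}
        )
    return vectors


def cosine_similarity(u, v):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical toy "proteins": each is a bag of shared main-chain fragment labels.
proteins = [
    ["f1", "f2", "f3", "f2"],
    ["f2", "f3", "f4"],
    ["f5", "f6"],
]
vectors = tfidf_vectors(proteins)
sim_01 = cosine_similarity(vectors[0], vectors[1])  # share f2 and f3
sim_02 = cosine_similarity(vectors[0], vectors[2])  # share nothing
```

    Storing the vectors as dictionaries keeps them sparse, which matters when the vocabulary of fragments is much larger than the number of fragments found in any one protein.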

  6. Structural determination of intact proteins using mass spectrometry

    Science.gov (United States)

    Kruppa, Gary [San Francisco, CA]; Schoeniger, Joseph S [Oakland, CA]; Young, Malin M [Livermore, CA]

    2008-05-06

    The present invention relates to novel methods of determining the sequence and structure of proteins. Specifically, the present invention allows for the analysis of intact proteins within a mass spectrometer. Therefore, preparatory separations need not be performed prior to introducing a protein sample into the mass spectrometer. Also disclosed herein are new instrumental developments for enhancing the signal from the desired modified proteins, methods for producing controlled protein fragments in the mass spectrometer, eliminating complex microseparations, and protein preparatory chemical steps necessary for cross-linking based protein structure determination. Additionally, the preferred method of the present invention involves the determination of protein structures utilizing a top-down analysis of protein structures to search for covalent modifications. In the preferred method, intact proteins are ionized and fragmented within the mass spectrometer.

  7. Protein structure prediction using basin-hopping

    Science.gov (United States)

    Prentiss, Michael C.; Wales, David J.; Wolynes, Peter G.

    2008-06-01

    Associative memory Hamiltonian structure prediction potentials are not overly rugged, thereby suggesting their landscapes are like those of actual proteins. In the present contribution we show how basin-hopping global optimization can identify low-lying minima for the corresponding mildly frustrated energy landscapes. For small systems the basin-hopping algorithm succeeds in locating both lower minima and conformations closer to the experimental structure than does molecular dynamics with simulated annealing. For large systems the efficiency of basin-hopping decreases for our initial implementation, where the steps consist of random perturbations to the Cartesian coordinates. We implemented umbrella sampling using basin-hopping to further confirm when the global minima are reached. We have also improved the energy surface by employing bioinformatic techniques for reducing the roughness or variance of the energy surface. Finally, the basin-hopping calculations have guided improvements in the excluded volume of the Hamiltonian, producing better structures. These results suggest a novel and transferable optimization scheme for future energy function development.
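
    The basin-hopping scheme described in this abstract (random perturbation of the coordinates, local minimization, then an accept/reject test) can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the Rastrigin function stands in for the associative memory Hamiltonian, plain gradient descent stands in for the local minimizer, and the step size, temperature, and all function names are hypothetical choices.

```python
import math
import random


def rastrigin(x):
    """Toy rugged energy surface (assumption: stands in for a real potential)."""
    return 10.0 * len(x) + sum(
        xi * xi - 10.0 * math.cos(2.0 * math.pi * xi) for xi in x
    )


def rastrigin_gradient(x):
    return [2.0 * xi + 20.0 * math.pi * math.sin(2.0 * math.pi * xi) for xi in x]


def local_minimize(gradient, x, learning_rate=0.002, n_iter=500):
    """Fixed-step gradient descent; slides x to the bottom of its current basin."""
    x = list(x)
    for _ in range(n_iter):
        g = gradient(x)
        x = [xi - learning_rate * gi for xi, gi in zip(x, g)]
    return x


def basin_hopping(energy, gradient, x0, n_steps, step_size, temperature, rng):
    current = local_minimize(gradient, x0)
    current_e = energy(current)
    best, best_e = current, current_e
    for _ in range(n_steps):
        # Random perturbation of the Cartesian coordinates, then quench.
        trial = [xi + rng.uniform(-step_size, step_size) for xi in current]
        trial = local_minimize(gradient, trial)
        trial_e = energy(trial)
        # Metropolis criterion applied to the energies of the minima.
        if trial_e <= current_e or rng.random() < math.exp(
            -(trial_e - current_e) / temperature
        ):
            current, current_e = trial, trial_e
            if current_e < best_e:
                best, best_e = current, current_e
    return best, best_e


rng = random.Random(0)
start = [2.2, -1.7]
start_minimum = local_minimize(rastrigin_gradient, start)
best_x, best_energy = basin_hopping(
    rastrigin, rastrigin_gradient, start,
    n_steps=200, step_size=1.0, temperature=1.0, rng=rng,
)
```

    Because every trial point is quenched before it is compared, the Metropolis test acts on a staircase of local-minimum energies rather than on the raw surface, which is what lets the walk cross barriers between basins.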

  8. Membrane protein structures without crystals, by single particle electron cryomicroscopy.

    Science.gov (United States)

    Vinothkumar, Kutti R

    2015-08-01

    It is an exciting period in membrane protein structural biology with a number of medically important protein structures determined at a rapid pace. However, two major hurdles still remain in the structural biology of membrane proteins. One is the inability to obtain large amounts of protein for crystallization and the other is the failure to get well-diffracting crystals. With single particle electron cryomicroscopy, both these problems can be overcome and high-resolution structures of membrane proteins and other labile protein complexes can be obtained with very little protein and without the need for crystals. In this review, I highlight recent advances in electron microscopy, detectors and software, which have allowed determination of medium to high-resolution structures of membrane proteins and complexes that have been difficult to study by other structural biological techniques. Copyright © 2015 The Authors. Published by Elsevier Ltd. All rights reserved.

  9. Protein structure similarity from principle component correlation analysis

    Directory of Open Access Journals (Sweden)

    Chou James

    2006-01-01

    Background: Owing to rapid expansion of protein structure databases in recent years, methods of structure comparison are becoming increasingly effective and important in revealing novel information on functional properties of proteins and their roles in the grand scheme of evolutionary biology. Currently, the structural similarity between two proteins is measured by the root-mean-square-deviation (RMSD) in their best-superimposed atomic coordinates. RMSD is the golden rule of measuring structural similarity when the structures are nearly identical; it, however, fails to detect the higher order topological similarities in proteins evolved into different shapes. We propose new algorithms for extracting geometrical invariants of proteins that can be effectively used to identify homologous protein structures or topologies in order to quantify both close and remote structural similarities. Results: We measure structural similarity between proteins by correlating the principle components of their secondary structure interaction matrix. In our approach, the Principle Component Correlation (PCC) analysis, a symmetric interaction matrix for a protein structure is constructed with relationship parameters between secondary elements that can take the form of distance, orientation, or other relevant structural invariants. When using a distance-based construction in the presence or absence of encoded N to C terminal sense, there are strong correlations between the principle components of interaction matrices of structurally or topologically similar proteins. Conclusion: The PCC method is extensively tested for protein structures that belong to the same topological class but are significantly different by RMSD measure. The PCC analysis can also differentiate proteins having similar shapes but different topological arrangements. Additionally, we demonstrate that when using two independently defined interaction matrices, comparison of their maximum

  10. 3-Dimensional Protein Structure of Influenza

    Science.gov (United States)

    2004-01-01

    The loss of productivity due to flu is staggering. Costs range as high as $20 billion a year. High mutation rates of the flu virus have hindered development of new drugs or vaccines. The secret lies in a small molecule which is attached to the host cell's surface. Each flu virus, no matter what strain, must remove this small molecule to escape the host cell to spread infection. Using data from space and earth grown crystals, researchers from the Center of Macromolecular Crystallography (CMC) are designing drugs to bind with this protein's active site. This lock and key fit reduces the spread of flu in the body by blocking its escape route. In collaboration with its corporate partner, the CMC has refined drug structure in preparation for clinical trials. Tested and approved relief is expected to reach drugstores by year 2004.

  11. Automating Embedded Analysis Capabilities and Managing Software Complexity in Multiphysics Simulation, Part I: Template-Based Generic Programming

    Directory of Open Access Journals (Sweden)

    Roger P. Pawlowski

    2012-01-01

    An approach for incorporating embedded simulation and analysis capabilities in complex simulation codes through template-based generic programming is presented. This approach relies on templating and operator overloading within the C++ language to transform a given calculation into one that can compute a variety of additional quantities that are necessary for many state-of-the-art simulation and analysis algorithms. An approach for incorporating these ideas into complex simulation codes through general graph-based assembly is also presented. These ideas have been implemented within a set of packages in the Trilinos framework and are demonstrated on a simple problem from chemical engineering.
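
    The core idea, writing a calculation once and letting operator overloading turn it into one that also yields derivatives, can be sketched in a few lines. The article's setting is C++ templates within Trilinos; the Python version below is only an analogy that relies on duck typing instead of templates, and the `Dual` class and `residual` function are hypothetical names, not part of any Trilinos package.

```python
class Dual:
    """A value paired with its derivative (forward-mode differentiation)."""

    def __init__(self, value, deriv=0.0):
        self.value = value
        self.deriv = deriv

    @staticmethod
    def _lift(other):
        # Treat plain numbers as constants, i.e. with zero derivative.
        return other if isinstance(other, Dual) else Dual(other, 0.0)

    def __add__(self, other):
        other = Dual._lift(other)
        return Dual(self.value + other.value, self.deriv + other.deriv)

    __radd__ = __add__

    def __sub__(self, other):
        other = Dual._lift(other)
        return Dual(self.value - other.value, self.deriv - other.deriv)

    def __rsub__(self, other):
        return Dual._lift(other) - self

    def __mul__(self, other):
        other = Dual._lift(other)
        # Product rule.
        return Dual(
            self.value * other.value,
            self.deriv * other.value + self.value * other.deriv,
        )

    __rmul__ = __mul__


def residual(x, k):
    """Generic calculation, written once and unaware of the scalar type."""
    return k * x * x - 2.0 * x + 1.0


plain = residual(3.0, 0.5)               # ordinary floating-point evaluation
seeded = residual(Dual(3.0, 1.0), 0.5)   # same code, now also carries d/dx
```

    Here `plain` is the residual value and `seeded.deriv` is its derivative with respect to `x`; the body of `residual` did not change between the two calls, which is the property the template-based approach exploits.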

  12. Sampling Realistic Protein Conformations Using Local Structural Bias

    DEFF Research Database (Denmark)

    Hamelryck, Thomas Wim; Kent, John T.; Krogh, A.

    2006-01-01

    The prediction of protein structure from sequence remains a major unsolved problem in biology. The most successful protein structure prediction methods make use of a divide-and-conquer strategy to attack the problem: a conformational sampling method generates plausible candidate structures, which...... are subsequently accepted or rejected using an energy function. Conceptually, this often corresponds to separating local structural bias from the long-range interactions that stabilize the compact, native state. However, sampling protein conformations that are compatible with the local structural bias encoded...... for protein structure prediction, determination, simulation, and design....

  13. T-Coffee: a web server for the multiple sequence alignment of protein and RNA sequences using structural information and homology extension.

    Science.gov (United States)

    Di Tommaso, Paolo; Moretti, Sebastien; Xenarios, Ioannis; Orobitg, Miquel; Montanyola, Alberto; Chang, Jia-Ming; Taly, Jean-François; Notredame, Cedric

    2011-07-01

    This article introduces a new interface for T-Coffee, a consistency-based multiple sequence alignment program. This interface provides easy and intuitive access to the most popular functionality of the package. This includes the default T-Coffee mode for protein and nucleic acid sequences, the M-Coffee mode that allows combining the output of any other aligners, and template-based modes of T-Coffee that deliver high accuracy alignments while using structural or homology derived templates. The three available template modes are Expresso for the alignment of proteins with a known 3D structure, R-Coffee to align RNA sequences with conserved secondary structures and PSI-Coffee to accurately align distantly related sequences using homology extension. The new server benefits from recent improvements of the T-Coffee algorithm and can align up to 150 sequences as long as 10,000 residues and is available from both http://www.tcoffee.org and its main mirror http://tcoffee.crg.cat.

  14. PredictProtein--an open resource for online prediction of protein structural and functional features

    NARCIS (Netherlands)

    Yachdav, G.; Kloppmann, E.; Kajan, L.; Hecht, M.; Goldberg, T.; Hamp, T.; Honigschmid, P.; Schafferhans, A.; Roos, M.; Bernhofer, M.; Richter, L.; Ashkenazy, H.; Punta, M.; Schlessinger, A.; Bromberg, Y.; Schneider, R.; Vriend, G.; Sander, C.; Ben-Tal, N.; Rost, B.

    2014-01-01

    PredictProtein is a meta-service for sequence analysis that has been predicting structural and functional features of proteins since 1992. Queried with a protein sequence it returns: multiple sequence alignments, predicted aspects of structure (secondary structure, solvent accessibility,

  15. Protein folding, protein structure and the origin of life: Theoretical methods and solutions of dynamical problems

    Science.gov (United States)

    Weaver, D. L.

    1982-01-01

    Theoretical methods and solutions of the dynamics of protein folding, protein aggregation, protein structure, and the origin of life are discussed. The elements of a dynamic model representing the initial stages of protein folding are presented. The calculation and experimental determination of the model parameters are discussed. The use of computer simulation for modeling protein folding is considered.

  16. STRUCTURAL FEATURES OF PLANT CHITINASES AND CHITIN-BINDING PROTEINS

    NARCIS (Netherlands)

    BEINTEMA, JJ

    1994-01-01

    Structural features of plant chitinases and chitin-binding proteins are discussed. Many of these proteins consist of multiple domains, of which the chitin-binding hevein domain is a predominant one. X-ray and NMR structures of representatives of the major classes of these proteins are available now,

  17. Studying Membrane Protein Structure and Function Using Nanodiscs

    DEFF Research Database (Denmark)

    Huda, Pie

    The structure and dynamics of membrane proteins can provide valuable information about general functions, diseases and effects of various drugs. Studying membrane proteins is a challenge as an amphiphilic environment is necessary to stabilise the protein in a functionally and structurally relevan...

  18. Nonlinear deterministic structures and the randomness of protein sequences

    CERN Document Server

    Huang Yan Zhao

    2003-01-01

    To clarify the randomness of protein sequences, we make a detailed analysis of a set of typical protein sequences representing each structural class using a nonlinear prediction method. No deterministic structures are found in these protein sequences, which implies that they behave as random sequences. We also give an explanation for the controversial results obtained in previous investigations.

  19. Prediction of protein folding rates from simplified secondary structure alphabet.

    Science.gov (United States)

    Huang, Jitao T; Wang, Titi; Huang, Shanran R; Li, Xin

    2015-10-21

    Protein folding is a very complicated and highly cooperative dynamic process. However, the folding kinetics is likely to depend more on a few key structural features. Here we find that secondary structures can determine the folding rates only of large, multi-state folding proteins and fail to predict those of small, two-state proteins. The importance of secondary structures for protein folding is ordered as: extended β strand > α helix > bend > turn > undefined secondary structure > 3₁₀ helix > isolated β strand > π helix. Only the first three secondary structures, extended β strand, α helix and bend, can achieve a good correlation with folding rates. This suggests that the rate-limiting step of protein folding would depend upon the formation of regular secondary structures and the buckling of the chain. The reduced secondary structure alphabet provides a simplified description for machine learning applications in protein design. Copyright © 2015 Elsevier Ltd. All rights reserved.

  20. Gaia: automated quality assessment of protein structure models.

    Science.gov (United States)

    Kota, Pradeep; Ding, Feng; Ramachandran, Srinivas; Dokholyan, Nikolay V

    2011-08-15

    Increasing use of structural modeling for understanding structure-function relationships in proteins has led to the need to ensure that the protein models being used are of acceptable quality. Quality of a given protein structure can be assessed by comparing various intrinsic structural properties of the protein to those observed in high-resolution protein structures. In this study, we present tools to compare a given structure to high-resolution crystal structures. We assess packing by calculating the total void volume, the percentage of unsatisfied hydrogen bonds, the number of steric clashes and the scaling of the accessible surface area. We assess covalent geometry by determining bond lengths, angles, dihedrals and rotamers. The statistical parameters for the above measures, obtained from high-resolution crystal structures, enable us to provide a quality score that points to specific areas where a given protein structural model needs improvement. We provide these tools that appraise protein structures in the form of a web server, Gaia (http://chiron.dokhlab.org). Gaia evaluates the packing and covalent geometry of a given protein structure and provides quantitative comparison of the given structure to high-resolution crystal structures. Contact: dokh@unc.edu. Supplementary data are available at Bioinformatics online.

  1. SCPC: a method to structurally compare protein complexes.

    Science.gov (United States)

    Koike, Ryotaro; Ota, Motonori

    2012-02-01

    Protein-protein interactions play vital functional roles in various biological phenomena. Physical contacts between proteins have been revealed using experimental approaches that have solved the structures of protein complexes at atomic resolution. To examine the huge number of protein complexes available in the Protein Data Bank, an efficient automated method that compares protein complexes is required. We have developed Structural Comparison of Protein Complexes (SCPC), a novel method to structurally compare protein complexes. SCPC compares the spatial arrangements of subunits in a complex with those in another complex using secondary structure elements. Similar substructures are detected in two protein complexes and the similarity is scored. SCPC was applied to dimers, homo-oligomers and haemoglobins. SCPC properly estimated structural similarities between the dimers examined as well as an existing method, MM-align. Conserved substructures were detected in a homo-tetramer and a homo-hexamer composed of homologous proteins. Classification of quaternary structures of haemoglobins using SCPC was consistent with the conventional classification. The results demonstrate that SCPC is a valuable tool to investigate the structures of protein complexes. SCPC is available at http://idp1.force.cs.is.nagoya-u.ac.jp/scpc/. Contact: rkoike@is.nagoya-u.ac.jp. Supplementary data are available at Bioinformatics online.

  2. K-nearest uphill clustering in the protein structure space

    KAUST Repository

    Cui, Xuefeng

    2016-08-26

    The protein structure classification problem, which is to assign a protein structure to a cluster of similar proteins, is one of the most fundamental problems in the construction and application of the protein structure space. Early manually curated protein structure classifications (e.g., SCOP and CATH) have been very successful, but have recently suffered from slow updating because of the increased throughput of newly solved protein structures. Thus, fully automatic methods to cluster proteins in the protein structure space have been designed and developed. In this study, we observed that the SCOP superfamilies are highly consistent with clustering trees representing hierarchical clustering procedures, but the tree cutting is very challenging and becomes the bottleneck of clustering accuracy. To overcome this challenge, we proposed a novel density-based K-nearest uphill clustering method that effectively eliminates noisy pairwise protein structure similarities and identifies density peaks as cluster centers. Specifically, the density peaks are identified based on K-nearest uphills (i.e., proteins with higher densities) and K-nearest neighbors. To our knowledge, this is the first attempt to apply and develop density-based clustering methods in the protein structure space. Our results show that our density-based clustering method outperforms the state-of-the-art clustering methods previously applied to the problem. Moreover, we observed that computational methods and human experts could produce highly similar clusters at high precision values, while computational methods also suggest splitting some large superfamilies into smaller clusters. © 2016 Elsevier B.V.
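
    A rough sketch of density-peak clustering in the spirit of this abstract follows. The abstract does not give the algorithm's details, so everything here is an assumption made for illustration: density is taken as the inverse mean distance to the K nearest neighbors, an "uphill" is a denser point among those neighbors, and a point with no uphill is treated as a cluster center. The function name and the one-dimensional toy data are hypothetical.

```python
def knn_uphill_clusters(dist, k):
    """Label each point with the index of the density peak it climbs to.

    dist is a symmetric matrix (list of lists) of pairwise distances.
    """
    n = len(dist)
    k = max(1, min(k, n - 1))
    # K nearest neighbors of every point, closest first.
    neighbors = [
        sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])[:k]
        for i in range(n)
    ]
    # Assumed density estimate: inverse of the mean distance to the K neighbors.
    density = [
        1.0 / (1e-12 + sum(dist[i][j] for j in neighbors[i]) / k)
        for i in range(n)
    ]
    parent = []
    for i in range(n):
        # Uphills: neighbors that are strictly denser than point i.
        uphill = [j for j in neighbors[i] if density[j] > density[i]]
        # No uphill among the neighbors means point i is a density peak.
        parent.append(min(uphill, key=lambda j: dist[i][j]) if uphill else i)

    def peak_of(i):
        # Density strictly increases along parent links, so this terminates.
        while parent[i] != i:
            i = parent[i]
        return i

    return [peak_of(i) for i in range(n)]


# Hypothetical toy data: two well-separated groups on a line.
points = [0, 1, 2, 50, 51, 52]
distances = [[abs(a - b) for b in points] for a in points]
labels = knn_uphill_clusters(distances, k=2)
```

    Restricting the uphill search to the K nearest neighbors is what discards long-range, noisy similarities in this sketch: a point never links to a denser point outside its own neighborhood.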

  3. Structural and Function Prediction of Musa acuminata subsp. Malaccensis Protein

    National Research Council Canada - National Science Library

    Anum Munir; Azhar Mehmood; Shumaila Azam

    2016-01-01

    ... built up. Illustrating the structural and functional privileged insights of these HPs might likewise prompt a superior comprehension of the protein-protein associations or networks in diverse types of life. Bananas (Musa acuminata spp...

  4. TSTMP: target selection for structural genomics of human transmembrane proteins.

    Science.gov (United States)

    Varga, Julia; Dobson, László; Reményi, István; Tusnády, Gábor E

    2017-01-04

    The TSTMP database is designed to help the target selection of human transmembrane proteins for structural genomics projects and structure modeling studies. Currently, there are only 60 known 3D structures among the polytopic human transmembrane proteins and about a further 600 could be modeled using existing structures. Although there are a great number of human transmembrane protein structures left to be determined, surprisingly only a small fraction of these proteins have 'selected' (or above) status according to the current version of the TargetDB/TargetTrack database. This figure is even worse regarding those transmembrane proteins that would contribute the most to the structural coverage of the human transmembrane proteome. The database was built by sorting out proteins from the human transmembrane proteome with known structure and searching for suitable model structures for the remaining proteins by combining the results of a state-of-the-art transmembrane specific fold recognition algorithm and a sequence similarity search algorithm. Proteins were searched for homologues among the human transmembrane proteins in order to select targets whose successful structure determination would lead to the best structural coverage of the human transmembrane proteome. The pipeline constructed for creating the TSTMP database ensures that the database is kept up-to-date. The database is available at http://tstmp.enzim.ttk.mta.hu. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

  5. Rheology and structure of milk protein gels

    NARCIS (Netherlands)

    Vliet, van T.; Lakemond, C.M.M.; Visschers, R.W.

    2004-01-01

    Recent studies on gel formation and rheology of milk gels are reviewed. A distinction is made between gels formed by aggregated casein, gels of 'pure' whey proteins and gels in which both casein and whey proteins contribute to their properties. For casein-whey protein mixtures, it has been shown

  6. Implementation of a Parallel Protein Structure Alignment Service on Cloud

    Science.gov (United States)

    Hung, Che-Lun; Lin, Yaw-Ling

    2013-01-01

    Protein structure alignment has become an important strategy by which to identify evolutionary relationships between protein sequences. Several alignment tools are currently available for online comparison of protein structures. In this paper, we propose a parallel protein structure alignment service based on the Hadoop distribution framework. This service includes a protein structure alignment algorithm, a refinement algorithm, and a MapReduce programming model. The refinement algorithm refines the result of alignment. To process vast numbers of protein structures in parallel, the alignment and refinement algorithms are implemented using MapReduce. We analyzed and compared the structure alignments produced by different methods using a dataset randomly selected from the PDB database. The experimental results verify that the proposed algorithm refines the resulting alignments more accurately than existing algorithms. Meanwhile, the computational performance of the proposed service is proportional to the number of processors used in our cloud platform. PMID:23671842
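
    The MapReduce pattern this abstract relies on can be emulated in a single process to show how pairwise comparisons are split into map, shuffle, and reduce stages. This is only a sketch: it does not use Hadoop, the scoring function is a placeholder rather than a structure alignment algorithm, and the structure identifiers and secondary-structure strings are invented for the example.

```python
from collections import defaultdict


def toy_score(a, b):
    """Placeholder similarity: fraction of positions with the same label.

    Assumption: stands in for a real structure alignment and refinement step.
    """
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / max(len(a), len(b))


def map_phase(pair):
    """Map: score one (query, target) pair and emit a record keyed by query."""
    (query_id, query), (target_id, target) = pair
    yield query_id, (target_id, toy_score(query, target))


def shuffle(records):
    """Shuffle: group the emitted values by key, as the framework would."""
    grouped = defaultdict(list)
    for key, value in records:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: keep the best-scoring target for each query."""
    return key, max(values, key=lambda v: v[1])


# Hypothetical inputs: secondary-structure strings keyed by structure id.
structures = {"A": "HHHEEE", "B": "HHHEEC", "C": "CCCCCC"}
pairs = [
    ((q, structures[q]), (t, structures[t]))
    for q in structures
    for t in structures
    if q != t
]
mapped = [record for pair in pairs for record in map_phase(pair)]
best = dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())
```

    Each call to `map_phase` depends only on its own pair, so on a real cluster the pairs can be distributed across workers with no coordination until the shuffle.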

  7. Implementation of a Parallel Protein Structure Alignment Service on Cloud

    Directory of Open Access Journals (Sweden)

    Che-Lun Hung

    2013-01-01

    Protein structure alignment has become an important strategy by which to identify evolutionary relationships between protein sequences. Several alignment tools are currently available for online comparison of protein structures. In this paper, we propose a parallel protein structure alignment service based on the Hadoop distribution framework. This service includes a protein structure alignment algorithm, a refinement algorithm, and a MapReduce programming model. The refinement algorithm refines the result of alignment. To process vast numbers of protein structures in parallel, the alignment and refinement algorithms are implemented using MapReduce. We analyzed and compared the structure alignments produced by different methods using a dataset randomly selected from the PDB database. The experimental results verify that the proposed algorithm refines the resulting alignments more accurately than existing algorithms. Meanwhile, the computational performance of the proposed service is proportional to the number of processors used in our cloud platform.

  8. Using linear algebra for protein structural comparison and classification

    OpenAIRE

    Janaína Gomide; Raquel Melo-Minardi; Marcos Augusto dos Santos; Goran Neshich; Wagner Meira Jr.; Júlio César Lopes; Marcelo Santoro

    2009-01-01

    In this article, we describe a novel methodology to extract semantic characteristics from protein structures using linear algebra in order to compose structural signature vectors which may be used efficiently to compare and classify protein structures into fold families. These signatures are built from the pattern of hydrophobic intrachain interactions using Singular Value Decomposition (SVD) and Latent Semantic Indexing (LSI) techniques. Considering proteins as documents and contacts as term...

  9. Protein Structure and Function Prediction Using I-TASSER.

    Science.gov (United States)

    Yang, Jianyi; Zhang, Yang

    2015-12-17

    I-TASSER is a hierarchical protocol for automated protein structure prediction and structure-based function annotation. Starting from the amino acid sequence of target proteins, I-TASSER first generates full-length atomic structural models from multiple threading alignments and iterative structural assembly simulations followed by atomic-level structure refinement. The biological functions of the protein, including ligand-binding sites, enzyme commission number, and gene ontology terms, are then inferred from known protein function databases based on sequence and structure profile comparisons. I-TASSER is freely available as both an on-line server and a stand-alone package. This unit describes how to use the I-TASSER protocol to generate structure and function prediction and how to interpret the prediction results, as well as alternative approaches for further improving the I-TASSER modeling quality for distant-homologous and multi-domain protein targets. Copyright © 2015 John Wiley & Sons, Inc.

  10. Protein structural modularity and robustness are associated with evolvability.

    Science.gov (United States)

    Rorick, Mary M; Wagner, Günter P

    2011-01-01

    Theory suggests that biological modularity and robustness allow for maintenance of fitness under mutational change, and when this change is adaptive, for evolvability. Empirical demonstrations that these traits promote evolvability in nature remain scant however. This is in part because modularity, robustness, and evolvability are difficult to define and measure in real biological systems. Here, we address whether structural modularity and/or robustness confer evolvability at the level of proteins by looking for associations between indices of protein structural modularity, structural robustness, and evolvability. We propose a novel index for protein structural modularity: the number of regular secondary structure elements (helices and strands) divided by the number of residues in the structure. We index protein evolvability as the proportion of sites with evidence of being under positive selection multiplied by the average rate of adaptive evolution at these sites, and we measure this as an average over a phylogeny of 25 mammalian species. We use contact density as an index of protein designability, and thus, structural robustness. We find that protein evolvability is positively associated with structural modularity as well as structural robustness and that the effect of structural modularity on evolvability is independent of the structural robustness index. We interpret these associations to be the result of reduced constraints on amino acid substitutions in highly modular and robust protein structures, which results in faster adaptation through natural selection.
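
    The two indices defined in this abstract are simple ratios, so a short sketch may help make them concrete. The Python below is only an illustration under stated assumptions: the function names, argument names, and example numbers are hypothetical and do not come from the paper, and the detection of positively selected sites and the estimation of adaptive rates are taken as given inputs.

```python
def modularity_index(n_helices, n_strands, n_residues):
    """Regular secondary structure elements (helices + strands) per residue."""
    if n_residues <= 0:
        raise ValueError("n_residues must be positive")
    return (n_helices + n_strands) / n_residues


def evolvability_index(n_selected_sites, n_sites, mean_adaptive_rate):
    """Proportion of sites under positive selection times their mean adaptive rate."""
    if n_sites <= 0:
        raise ValueError("n_sites must be positive")
    return (n_selected_sites / n_sites) * mean_adaptive_rate


if __name__ == "__main__":
    # Hypothetical protein: 4 helices, 6 strands, 200 residues,
    # 10 sites with evidence of positive selection, mean adaptive rate 2.0.
    print(modularity_index(4, 6, 200))
    print(evolvability_index(10, 200, 2.0))
```

    In the study itself the evolvability index is averaged over a phylogeny of 25 mammalian species; that averaging step is not reproduced here.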

  11. Synthesis of Foam-Shaped Nanoporous Zeolite Material: A Simple Template-Based Method

    Science.gov (United States)

    Saini, Vipin K.; Pires, Joao

    2012-01-01

    Nanoporous zeolite foam is an interesting crystalline material with an open-cell microcellular structure, similar to polyurethane foam (PUF). The aluminosilicate structure of this material has a large surface area, extended porosity, and mechanical strength. Owing to these properties, this material is suitable for industrial applications such as…

  12. PSI-2: structural genomics to cover protein domain family space.

    Science.gov (United States)

    Dessailly, Benoît H; Nair, Rajesh; Jaroszewski, Lukasz; Fajardo, J Eduardo; Kouranov, Andrei; Lee, David; Fiser, Andras; Godzik, Adam; Rost, Burkhard; Orengo, Christine

    2009-06-10

    One major objective of structural genomics efforts, including the NIH-funded Protein Structure Initiative (PSI), has been to increase the structural coverage of protein sequence space. Here, we present the target selection strategy used during the second phase of PSI (PSI-2). This strategy, jointly devised by the bioinformatics groups associated with the PSI-2 large-scale production centers, targets representatives from large, structurally uncharacterized protein domain families, and from structurally uncharacterized subfamilies in very large and diverse families with incomplete structural coverage. These very large families are extremely diverse both structurally and functionally, and are highly overrepresented in known proteomes. On the basis of several metrics, we then discuss to what extent PSI-2, during its first 3 years, has increased the structural coverage of genomes, and contributed structural and functional novelty. Together, the results presented here suggest that PSI-2 is successfully meeting its objectives and provides useful insights into structural and functional space.

  13. A New Hidden Markov Model for Protein Quality Assessment Using Compatibility Between Protein Sequence and Structure.

    Science.gov (United States)

    He, Zhiquan; Ma, Wenji; Zhang, Jingfen; Xu, Dong

    2015-03-25

    Protein structure Quality Assessment (QA) is an essential component in protein structure prediction and analysis. The relationship between protein sequence and structure often serves as a basis for protein structure QA. In this work, we developed a new Hidden Markov Model (HMM) to assess the compatibility of protein sequence and structure for capturing their complex relationship. More specifically, the emission of the HMM consists of protein local structures in angular space, secondary structures, and sequence profiles. This model has two capabilities: (1) encoding local structure of each position by jointly considering sequence and structure information, and (2) assigning a global score to estimate the overall quality of a predicted structure, as well as local scores to assess the quality of specific regions of a structure, which provides useful guidance for targeted structure refinement. We compared the HMM model to the state-of-the-art single-structure quality assessment methods OPUSCA, DFIRE, GOAP, and RW in protein structure selection. Computational results showed that our new score, HMM.Z, can achieve better overall selection performance on the benchmark datasets.

  14. Predicting nucleic acid binding interfaces from structural models of proteins

    Science.gov (United States)

    Dror, Iris; Shazman, Shula; Mukherjee, Srayanta; Zhang, Yang; Glaser, Fabian; Mandel-Gutfreund, Yael

    2011-01-01

    The function of DNA- and RNA-binding proteins can be inferred from the characterization and accurate prediction of their binding interfaces. However, the main pitfall of various structure-based methods for predicting nucleic acid binding function is that they are all limited to a relatively small number of proteins for which high-resolution three-dimensional structures are available. In this study, we developed a pipeline for extracting functional electrostatic patches from surfaces of protein structural models, obtained using the I-TASSER protein structure predictor. The largest positive patches are extracted from the protein surface using the patchfinder algorithm. We show that functional electrostatic patches extracted from an ensemble of structural models highly overlap the patches extracted from high-resolution structures. Furthermore, by testing our pipeline on a set of 55 known nucleic acid binding proteins for which I-TASSER produces high-quality models, we show that the method accurately identifies the nucleic acid binding interface on structural models of proteins. Employing a combined patch approach, we show that patches extracted from an ensemble of models better predict the real nucleic acid binding interfaces compared to patches extracted from independent models. Overall, these results suggest that combining information from a collection of low-resolution structural models could be a valuable approach for functional annotation. We suggest that our method will be further applicable for predicting other functional surfaces of proteins with unknown structure. PMID:22086767

  15. Predicting nucleic acid binding interfaces from structural models of proteins.

    Science.gov (United States)

    Dror, Iris; Shazman, Shula; Mukherjee, Srayanta; Zhang, Yang; Glaser, Fabian; Mandel-Gutfreund, Yael

    2012-02-01

    The function of DNA- and RNA-binding proteins can be inferred from the characterization and accurate prediction of their binding interfaces. However, the main pitfall of various structure-based methods for predicting nucleic acid binding function is that they are all limited to a relatively small number of proteins for which high-resolution three-dimensional structures are available. In this study, we developed a pipeline for extracting functional electrostatic patches from surfaces of protein structural models, obtained using the I-TASSER protein structure predictor. The largest positive patches are extracted from the protein surface using the patchfinder algorithm. We show that functional electrostatic patches extracted from an ensemble of structural models highly overlap the patches extracted from high-resolution structures. Furthermore, by testing our pipeline on a set of 55 known nucleic acid binding proteins for which I-TASSER produces high-quality models, we show that the method accurately identifies the nucleic acid binding interface on structural models of proteins. Employing a combined patch approach, we show that patches extracted from an ensemble of models better predict the real nucleic acid binding interfaces compared with patches extracted from independent models. Overall, these results suggest that combining information from a collection of low-resolution structural models could be a valuable approach for functional annotation. We suggest that our method will be further applicable for predicting other functional surfaces of proteins with unknown structure. Copyright © 2011 Wiley Periodicals, Inc.

  16. Structural Studies of G Protein-Coupled Receptors.

    Science.gov (United States)

    Zhang, Dandan; Zhao, Qiang; Wu, Beili

    2015-10-01

    G protein-coupled receptors (GPCRs) constitute the largest and the most physiologically important membrane protein family that recognizes a variety of environmental stimuli, and are drug targets in the treatment of numerous diseases. Recent progress in GPCR structural studies has shed light on the molecular mechanisms of GPCR ligand recognition, activation and allosteric modulation, as well as the structural basis of GPCR dimerization. In this review, we discuss the structural features of GPCRs and structural insights into different aspects of GPCR biological functions.

  17. Emerging Methods for Structural Analysis of Protein Aggregation.

    Science.gov (United States)

    Khan, Eshan; Mishra, Subodh K; Kumar, Amit

    2017-01-01

    Protein misfolding and aggregation is a key attribute of different neurodegenerative diseases. Misfolded and aggregated proteins are intrinsically disordered and rule out structure-based drug design. The comprehensive characterization of misfolded proteins and the associated aggregation pathway is a prerequisite for developing therapeutics for neurodegenerative diseases caused by protein aggregation. Visible protein aggregates used to be the final stage during the aggregation mechanism. The structural analysis of intermediate steps in such protein aggregates will help us to discern the conformational role and subsequently involved pathways. The structural analysis of protein aggregation using various biophysical methods may aid in improving therapeutics for neurodegenerative diseases related to protein misfolding and aggregation. In this mini review, we have summarized different spectroscopic methods such as fluorescence spectroscopy, circular dichroism (CD), nuclear magnetic resonance (NMR) spectroscopy, Fourier transform infrared spectroscopy (FTIR), and Raman spectroscopy for structural analysis of protein aggregation. We believe that understanding the invisible intermediates of misfolded proteins and the key steps involved in protein aggregation mechanisms may advance the therapeutic approaches for targeting neurological diseases that are caused by misfolded proteins. Copyright© Bentham Science Publishers.

  18. Host Proteins Determine MRSA Biofilm Structure and Integrity

    DEFF Research Database (Denmark)

    Dreier, Cindy; Nielsen, Astrid; Jørgensen, Nis Pedersen

    Human extracellular matrix (hECM) proteins aid the initial attachment and initiation of an infection by specific binding to bacterial cell surface proteins. However, the importance of hECM proteins in the structure, integrity and antibiotic resilience of a biofilm is unknown. This study aims to det...

  19. Protein folds and families: sequence and structure alignments.

    Science.gov (United States)

    Holm, L; Sander, C

    1999-01-01

    Dali and HSSP are derived databases organizing protein space in the structurally known regions. We use an automatic structure alignment program (Dali) for the classification of all known 3D structures based on all-against-all comparison of 3D structures in the Protein Data Bank. The HSSP database associates 1D sequences with known 3D structures using a position-weighted dynamic programming method for sequence profile alignment (MaxHom). As a result, the HSSP database not only provides aligned sequence families, but also implies secondary and tertiary structures covering 36% of all sequences in Swiss-Prot. The structure classification by Dali and the sequence families in HSSP can be browsed jointly from a web interface providing a rich network of links between neighbours in fold space, between domains and proteins, and between structures and sequences. In particular, this results in a database of explicit multiple alignments of protein families in the twilight zone of sequence similarity. The organization of protein structures and families provides a map of the currently known regions of the protein universe that is useful for the analysis of folding principles, for the evolutionary unification of protein families and for maximizing the information return from experimental structure determination. The databases are available from http://www.embl-ebi.ac.uk/dali/

  20. DOCK/PIERR: web server for structure prediction of protein-protein complexes.

    Science.gov (United States)

    Viswanath, Shruthi; Ravikant, D V S; Elber, Ron

    2014-01-01

    In protein docking we aim to find the structure of the complex formed when two proteins interact. Protein-protein interactions are crucial for cell function. Here we discuss the usage of DOCK/PIERR. In DOCK/PIERR, a uniformly discrete sampling of orientations of one protein with respect to the other is scored, followed by clustering, refinement, and reranking of structures. The novelty of this method lies in the scoring functions used. These are obtained by examining hundreds of millions of correctly and incorrectly docked structures, using an algorithm based on mathematical programming, with provable convergence properties.

  1. Current strategies for protein production and purification enabling membrane protein structural biology.

    Science.gov (United States)

    Pandey, Aditya; Shin, Kyungsoo; Patterson, Robin E; Liu, Xiang-Qin; Rainey, Jan K

    2016-12-01

    Membrane proteins are still heavily under-represented in the protein data bank (PDB), owing to multiple bottlenecks. The typical low abundance of membrane proteins in their natural hosts makes it necessary to overexpress these proteins either in heterologous systems or through in vitro translation/cell-free expression. Heterologous expression of proteins, in turn, leads to multiple obstacles, owing to the unpredictability of compatibility of the target protein for expression in a given host. The highly hydrophobic and (or) amphipathic nature of membrane proteins also leads to challenges in producing a homogeneous, stable, and pure sample for structural studies. Circumventing these hurdles has become possible through the introduction of novel protein production protocols; efficient protein isolation and sample preparation methods; and, improvement in hardware and software for structural characterization. Combined, these advances have made the past 10-15 years very exciting and eventful for the field of membrane protein structural biology, with an exponential growth in the number of solved membrane protein structures. In this review, we focus on both the advances and diversity of protein production and purification methods that have allowed this growth in structural knowledge of membrane proteins through X-ray crystallography, nuclear magnetic resonance (NMR) spectroscopy, and cryo-electron microscopy (cryo-EM).

  2. Structural footprinting in protein structure comparison: the impact of structural fragments

    Directory of Open Access Journals (Sweden)

    Wilbur W John

    2007-08-01

    Background: One approach for speeding up protein structure comparison is the projection approach, where a protein structure is mapped to a high-dimensional vector and structural similarity is approximated by distance between the corresponding vectors. Structural footprinting methods are projection methods that employ the same general technique to produce the mapping: first select a representative set of structural fragments as models and then map a protein structure to a vector in which each dimension corresponds to a particular model and "counts" the number of times the model appears in the structure. The main difference between any two structural footprinting methods is in the set of models they use; in fact a large number of methods can be generated by varying the type of structural fragments used and the amount of detail in their representation. How do these choices affect the ability of the method to detect various types of structural similarity? Results: To answer this question we benchmarked three structural footprinting methods that vary significantly in their selection of models against the CATH database. In the first set of experiments we compared the methods' ability to detect structural similarity characteristic of evolutionarily related structures, i.e., structures within the same CATH superfamily. In the second set of experiments we tested the methods' agreement with the boundaries imposed by classification groups at the Class, Architecture, and Fold levels of the CATH hierarchy. Conclusion: In both experiments we found that the method which uses secondary structure information has the best performance on average, but no one method performs consistently the best across all groups at a given classification level. We also found that combining the methods' outputs significantly improves the performance. Moreover, our new techniques to measure and visualize the methods' agreement with the CATH hierarchy, including the
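
    The projection idea described in this abstract (map each structure to a vector of model counts, then compare vectors) can be sketched in a few lines. The Python below is a minimal illustration, not the authors' implementation. It assumes each structural fragment has already been reduced to a short numeric descriptor, and it counts a model as "appearing" whenever it is the nearest model to a fragment. The descriptor format, the nearest-model rule, and the names footprint and footprint_distance are all assumptions made for this example.

```python
import math


def footprint(fragments, models):
    """Map a structure (a list of fragment descriptors) to a count vector.

    Dimension i counts the fragments whose nearest model is models[i].
    """
    counts = [0] * len(models)
    for frag in fragments:
        nearest = min(range(len(models)),
                      key=lambda i: math.dist(frag, models[i]))
        counts[nearest] += 1
    return counts


def footprint_distance(fp_a, fp_b):
    """Approximate structural dissimilarity by Euclidean distance between footprints."""
    return math.dist(fp_a, fp_b)


if __name__ == "__main__":
    # Two hypothetical models and two toy structures with 2-D fragment descriptors.
    models = [(0.0, 0.0), (1.0, 1.0)]
    structure_a = [(0.1, 0.0), (0.9, 1.0), (1.0, 0.8)]
    structure_b = [(0.0, 0.1), (0.2, 0.0), (1.0, 1.0)]
    fp_a = footprint(structure_a, models)
    fp_b = footprint(structure_b, models)
    print(fp_a, fp_b, footprint_distance(fp_a, fp_b))
```

    Because the footprints are fixed-length vectors, comparing two structures costs time proportional to the number of models rather than to the size of the structures, which is the speed-up the projection approach aims for.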

  3. Bayesian inference of protein structure from chemical shift data

    DEFF Research Database (Denmark)

    Bratholm, Lars Andersen; Christensen, Anders Steen; Hamelryck, Thomas Wim

    2015-01-01

    Protein chemical shifts are routinely used to augment molecular mechanics force fields in protein structure simulations, with weights of the chemical shift restraints determined empirically. These weights, however, might not be an optimal descriptor of a given protein structure and predictive model...... Monte Carlo simulations of three small proteins (ENHD, Protein G and the SMN Tudor Domain) using the PROFASI force field and the chemical shift predictor CamShift. Using a clustering-criterion for identifying the best structure, together with the addition of a solvent exposure scoring term......, result in overall better convergence to the native fold, suggesting that both types of distribution might be useful in different aspects of the protein structure prediction....

  4. Tactile Teaching: Exploring Protein Structure/Function Using Physical Models

    Science.gov (United States)

    Herman, Tim; Morris, Jennifer; Colton, Shannon; Batiza, Ann; Patrick, Michael; Franzen, Margaret; Goodsell, David S.

    2006-01-01

    The technology now exists to construct physical models of proteins based on atomic coordinates of solved structures. We review here our recent experiences in using physical models to teach concepts of protein structure and function at both the high school and the undergraduate levels. At the high school level, physical models are used in a…

  5. The contact activation proteins: a structure/function overview

    NARCIS (Netherlands)

    Meijers, J. C.; McMullen, B. A.; Bouma, B. N.

    1992-01-01

    In recent years, extensive knowledge has been obtained on the structure/function relationships of blood coagulation proteins. In this overview, we present recent developments on the structure/function relationships of the contact activation proteins: factor XII, high molecular weight kininogen,

  6. Functional differentiation of proteins: implications for structural genomics.

    Science.gov (United States)

    Friedberg, Iddo; Godzik, Adam

    2007-04-01

    Structural genomics is a broad initiative of various centers aiming to provide complete coverage of protein structure space. Because it is not feasible to experimentally determine the structures of all proteins, it is generally agreed that the only viable strategy to achieve such coverage is to carefully select specific proteins (targets), determine their structure experimentally, and then use comparative modeling techniques to model the rest. Here we suggest that structural genomics centers refine the structure-driven approach in target selection by adopting function-based criteria. We suggest targeting functionally divergent superfamilies within a given structural fold so that each function receives a structural characterization. We have developed a method to do so, and an itemized survey of several functionally rich folds shows that they are only partially functionally characterized. We call upon structural genomics centers to consider this approach and upon computational biologists to further develop function-based targeting methods.

  7. Using an alignment of fragment strings for comparing protein structures

    DEFF Research Database (Denmark)

    Friedberg, Iddo; Harder, Tim; Kolodny, Rachel

    2007-01-01

    MOTIVATION: Most methods that are used to compare protein structures use three-dimensional (3D) structural information. At the same time, it has been shown that a 1D string representation of local protein structure retains a degree of structural information. This type of representation can be a powerful tool for protein structure comparison and classification, given the arsenal of sequence comparison tools developed by computational biology. However, in order to do so, there is a need to first understand how much information is contained in various possible 1D representations of protein structure... The results of this study have immediate applications towards fast structure recognition, and for fold prediction and classification.
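
    To illustrate how a 1D string representation lets standard sequence-comparison tools be reused for structures, the sketch below scores two fragment strings with a plain Needleman-Wunsch global alignment. This is a generic textbook technique used as a stand-in, not the scoring scheme of the paper. The fragment alphabet (one letter per local-structure class) and the match, mismatch, and gap values are hypothetical.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch global alignment score of two fragment strings."""
    # prev[j] holds the best score for aligning a[:i-1] with b[:j].
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    # Hypothetical fragment strings: one letter per local-structure class.
    print(global_alignment_score("HHHEEE", "HHHEEE"))
    print(global_alignment_score("HHHEEE", "HHEEE"))
```

    A richer substitution matrix over fragment letters would replace the single match and mismatch values in practice; the flat scores here only keep the example short.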

  8. Elliptical structure of phospholipid bilayer nanodiscs encapsulated by scaffold proteins

    DEFF Research Database (Denmark)

    Skar-Gislinge, Nicholas; Simonsen, Jens Bæk; Mortensen, Kell

    2010-01-01

    Phospholipid bilayers host and support the function of membrane proteins and may be stabilized in disc-like nanostructures, allowing for unprecedented solution studies of the assembly, structure, and function of membrane proteins (Bayburt et al. Nano Lett. 2002, 2, 853-856). Based on small-angle ...... the experimental scattering profile from nanodiscs. The model paves the way for future detailed structural studies of functional membrane proteins encapsulated in nanodiscs....

  9. Structure of synaptophysin: a hexameric MARVEL-domain channel protein.

    Science.gov (United States)

    Arthur, Christopher P; Stowell, Michael H B

    2007-06-01

    Synaptophysin I (SypI) is an archetypal member of the MARVEL-domain family of integral membrane proteins and one of the first synaptic vesicle proteins to be identified and cloned. Almost all MARVEL-domain proteins are involved in membrane apposition and vesicle-trafficking events, but their precise role in these processes is unclear. We have purified mammalian SypI and determined its three-dimensional (3D) structure by using electron microscopy and single-particle 3D reconstruction. The hexameric structure resembles an open basket with a large pore and tenuous interactions within the cytosolic domain. The structure suggests a model for Synaptophysin's role in fusion and recycling that is regulated by known interactions with the SNARE machinery. This 3D structure of a MARVEL-domain protein provides a structural foundation for understanding the role of these important proteins in a variety of biological processes.

  10. The Structural Characterization of Tumor Fusion Genes and Proteins.

    Science.gov (United States)

    Wang, Dandan; Li, Daixi; Qin, Guangrong; Zhang, Wen; Ouyang, Jian; Zhang, Menghuan; Xie, Lu

    2015-01-01

    Chromosomal translocation, which generates fusion proteins in blood tumor or solid tumor, is considered as one of the major causes leading to cancer. Recent studies suggested that the disordered fragments in a fusion protein might contribute to its carcinogenicity. Here, we investigated the sequence feature near the breakpoints in the fusion partner genes, the structure features of breakpoints in fusion proteins, and the posttranslational modification preference in the fusion proteins. Results show that the breakpoints in the fusion partner genes have both sequence preference and structural preference. At the sequence level, nucleotide combination AG is preferred before the breakpoint and GG is preferred at the breakpoint. At the structural level, the breakpoints in the fusion proteins prefer to be located in the disordered regions. Further analysis suggests the phosphorylation sites at serine, threonine, and the methylation sites at arginine are enriched in disordered regions of the fusion proteins. Using EML4-ALK as an example, we further explained how the fusion protein leads to the protein disorder and contributes to its carcinogenicity. The sequence and structural features of the fusion proteins may help the scientific community to predict novel breakpoints in fusion genes and better understand the structure and function of fusion proteins.

  11. The Structural Characterization of Tumor Fusion Genes and Proteins

    OpenAIRE

    Wang, Dandan; Li, Daixi; Qin, Guangrong; Zhang, Wen; Ouyang, Jian; Zhang, Menghuan; Xie, Lu

    2015-01-01

    Chromosomal translocation, which generates fusion proteins in blood tumor or solid tumor, is considered as one of the major causes leading to cancer. Recent studies suggested that the disordered fragments in a fusion protein might contribute to its carcinogenicity. Here, we investigated the sequence feature near the breakpoints in the fusion partner genes, the structure features of breakpoints in fusion proteins, and the posttranslational modification preference in the fusion proteins. Result...

  12. Structural genomics plucks high-hanging membrane proteins.

    Science.gov (United States)

    Kloppmann, Edda; Punta, Marco; Rost, Burkhard

    2012-06-01

    Recent years have seen the establishment of structural genomics centers that explicitly target integral membrane proteins. Here, we review the advances in targeting these extremely high-hanging fruits of structural biology in high-throughput mode. We observe that the experimental determination of high-resolution structures of integral membrane proteins is increasingly successful both in terms of getting structures and of covering important protein families, for example, from Pfam. Structural genomics has begun to contribute significantly toward this progress. An important component of this contribution is the set up of robotic pipelines that generate a wealth of experimental data for membrane proteins. We argue that prediction methods for the identification of membrane regions and for the comparison of membrane proteins largely suffice to meet the challenges of target selection for structural genomics of membrane proteins. In contrast, we need better methods to prioritize the most promising members in a family of closely related proteins and to annotate protein function from sequence and structure in absence of homology. Copyright © 2012 Elsevier Ltd. All rights reserved.

  13. Continuum secondary structure captures protein flexibility

    DEFF Research Database (Denmark)

    Anderson, C.A.F.; Palmer, A.G.; Brunak, Søren

    2002-01-01

    with different hydrogen bond thresholds. The final continuous assignment for a single NMR model successfully reflected the structural variations observed between all NMR models in the ensemble. The structural variations between NMR models were verified to correlate with thermal motion; these variations were...... captured by the continuous assignments. Because the continuous assignment reproduces the structural variation between many NMR models from one single model, functionally important variation can be extracted from a single X-ray structure. Thus, continuous assignments of secondary structure may affect future...

  14. Protein structure quality assessment based on the distance profiles of consecutive backbone Cα atoms [v2; ref status: indexed, http://f1000r.es/2d2]

    Directory of Open Access Journals (Sweden)

    Sandeep Chakraborty

    2013-11-01

    Predicting the three-dimensional native state structure of a protein from its primary sequence is an unsolved grand challenge in molecular biology. Two main computational approaches have evolved to obtain the structure from the protein sequence, ab initio/de novo methods and template-based modeling, both of which typically generate multiple possible native state structures. Model quality assessment programs (MQAPs) validate these predicted structures in order to identify the correct native state structure. Here, we propose an MQAP for assessing the quality of protein structures based on the distances of consecutive Cα atoms. We hypothesize that the root-mean-square deviation of the distance of consecutive Cα atoms (RDCC) from the ideal value of 3.8 Å, derived from a statistical analysis of high quality protein structures (top100H database), is minimized in native structures. Based on tests with the top100H set, we propose an RDCC cutoff value of 0.012 Å, above which a structure can be filtered out as a non-native structure. We applied the RDCC discriminator on decoy sets from the Decoys 'R' Us database to show that the native structures in all decoy sets tested have RDCC below the 0.012 Å cutoff. While most decoy sets were either indistinguishable using this discriminator or had very few violations, all the decoy structures in the fisa decoy set were discriminated by applying the RDCC criterion. This highlights the physical non-viability of the fisa decoy set, and possible issues in benchmarking other methods using this set. The source code and manual are available at https://github.com/sanchak/mqap and permanently available at DOI 10.5281/zenodo.7134.
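
    The RDCC criterion lends itself to a short worked example. The Python below is a minimal sketch under stated assumptions, not the code from the authors' repository. It reads RDCC as the plain root-mean-square of (d - 3.8 Å) over all consecutive Cα pairs, takes the Cα coordinates as an ordered list of (x, y, z) tuples in Å, and ignores chain breaks. The abstract does not spell out the exact normalization, so this reading is an assumption, as are the function names.

```python
import math

IDEAL_CA_DISTANCE = 3.8  # Å, ideal consecutive Cα distance quoted in the abstract
RDCC_CUTOFF = 0.012      # Å, cutoff quoted in the abstract


def rdcc(ca_coords):
    """RMS deviation of consecutive Cα-Cα distances from the ideal value."""
    if len(ca_coords) < 2:
        raise ValueError("need at least two Cα atoms")
    deviations = [math.dist(p, q) - IDEAL_CA_DISTANCE
                  for p, q in zip(ca_coords, ca_coords[1:])]
    return math.sqrt(sum(d * d for d in deviations) / len(deviations))


def passes_rdcc_filter(ca_coords, cutoff=RDCC_CUTOFF):
    """True if the structure is not filtered out as non-native by the cutoff."""
    return rdcc(ca_coords) <= cutoff


if __name__ == "__main__":
    # Toy chains along the x-axis: ideal 3.8 Å spacing vs. stretched 4.0 Å spacing.
    ideal = [(3.8 * i, 0.0, 0.0) for i in range(5)]
    stretched = [(4.0 * i, 0.0, 0.0) for i in range(5)]
    print(passes_rdcc_filter(ideal), passes_rdcc_filter(stretched))
```

    With real structures the coordinates would come from the CA records of a PDB file, and any chain break would need to be excluded before the consecutive distances are computed.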

  15. Structural Aspects of GPCR-G Protein Coupling.

    Science.gov (United States)

    Chung, Ka Young

    2013-09-01

    G protein-coupled receptors (GPCRs) are membrane receptors; approximately 40% of drugs on the market target GPCRs. A precise understanding of the activation mechanism of GPCRs would facilitate the development of more effective and less toxic drugs. Heterotrimeric G proteins are important molecular switches in GPCR-mediated signal transduction. An agonist-activated receptor interacts with specific sites on G proteins and promotes the release of GDP from the Gα subunit. Because of the important biological role of the GPCR-G protein coupling, conformational changes in the G protein upon receptor coupling have been of great interest. One of the most important questions was the interface between the GPCR and G proteins and the structural mechanism of GPCR-induced G protein activation. A number of biochemical and biophysical studies have been performed since the late 80s to address these questions; there was a significant breakthrough in 2011 when the crystal structure of a GPCR-G protein complex was solved. This review discusses the structural aspects of GPCR-G protein coupling by comparing the results of previous biochemical and biophysical studies to the GPCR-G protein crystal structure.

  16. Computational protein design quantifies structural constraints on amino acid covariation.

    Directory of Open Access Journals (Sweden)

    Noah Ollikainen

    Amino acid covariation, where the identities of amino acids at different sequence positions are correlated, is a hallmark of naturally occurring proteins. This covariation can arise from multiple factors, including selective pressures for maintaining protein structure, requirements imposed by a specific function, or from phylogenetic sampling bias. Here we employed flexible backbone computational protein design to quantify the extent to which protein structure has constrained amino acid covariation for 40 diverse protein domains. We find significant similarities between the amino acid covariation in alignments of natural protein sequences and sequences optimized for their structures by computational protein design methods. These results indicate that the structural constraints imposed by protein architecture play a dominant role in shaping amino acid covariation and that computational protein design methods can capture these effects. We also find that the similarity between natural and designed covariation is sensitive to the magnitude and mechanism of backbone flexibility used in computational protein design. Our results thus highlight the necessity of including backbone flexibility to correctly model precise details of correlated amino acid changes and give insights into the pressures underlying these correlations.

  17. Using linear algebra for protein structural comparison and classification

    Science.gov (United States)

    2009-01-01

    In this article, we describe a novel methodology to extract semantic characteristics from protein structures using linear algebra in order to compose structural signature vectors which may be used efficiently to compare and classify protein structures into fold families. These signatures are built from the pattern of hydrophobic intrachain interactions using Singular Value Decomposition (SVD) and Latent Semantic Indexing (LSI) techniques. Considering proteins as documents and contacts as terms, we have built a retrieval system which is able to find conserved contacts in samples of myoglobin fold family and to retrieve these proteins among proteins of varied folds with precision of up to 80%. The classifier is a web tool available at our laboratory website. Users can search for similar chains from a specific PDB, view and compare their contact maps and browse their structures using a JMol plug-in. PMID:21637532

  18. Constraining cyclic peptides to mimic protein structure motifs

    DEFF Research Database (Denmark)

    Hill, Timothy A.; Shepherd, Nicholas E.; Diness, Frederik

    2014-01-01

    Many proteins exert their biological activities through small exposed surface regions called epitopes that are folded peptides of well-defined three-dimensional structures. Short synthetic peptide sequences corresponding to these bioactive protein surfaces do not form thermodynamically stable...... protein-like structures in water. However, short peptides can be induced to fold into protein-like bioactive conformations (strands, helices, turns) by cyclization, in conjunction with the use of other molecular constraints, that helps to fine-tune three-dimensional structure. Such constrained cyclic...... peptides can have protein-like biological activities and potencies, enabling their uses as biological probes and leads to therapeutics, diagnostics and vaccines. This Review highlights examples of cyclic peptides that mimic three-dimensional structures of strand, turn or helical segments of peptides...

  19. Using linear algebra for protein structural comparison and classification

    Directory of Open Access Journals (Sweden)

    Janaína Gomide

    2009-01-01

    In this article, we describe a novel methodology to extract semantic characteristics from protein structures using linear algebra in order to compose structural signature vectors which may be used efficiently to compare and classify protein structures into fold families. These signatures are built from the pattern of hydrophobic intrachain interactions using Singular Value Decomposition (SVD) and Latent Semantic Indexing (LSI) techniques. Considering proteins as documents and contacts as terms, we have built a retrieval system which is able to find conserved contacts in samples of myoglobin fold family and to retrieve these proteins among proteins of varied folds with precision of up to 80%. The classifier is a web tool available at our laboratory website. Users can search for similar chains from a specific PDB, view and compare their contact maps and browse their structures using a JMol plug-in.

  20. Using linear algebra for protein structural comparison and classification.

    Science.gov (United States)

    Gomide, Janaína; Melo-Minardi, Raquel; Dos Santos, Marcos Augusto; Neshich, Goran; Meira, Wagner; Lopes, Júlio César; Santoro, Marcelo

    2009-07-01

    In this article, we describe a novel methodology to extract semantic characteristics from protein structures using linear algebra in order to compose structural signature vectors which may be used efficiently to compare and classify protein structures into fold families. These signatures are built from the pattern of hydrophobic intrachain interactions using Singular Value Decomposition (SVD) and Latent Semantic Indexing (LSI) techniques. Considering proteins as documents and contacts as terms, we have built a retrieval system which is able to find conserved contacts in samples of myoglobin fold family and to retrieve these proteins among proteins of varied folds with precision of up to 80%. The classifier is a web tool available at our laboratory website. Users can search for similar chains from a specific PDB, view and compare their contact maps and browse their structures using a JMol plug-in.

  1. Structural and Energetic Characterization of the Ankyrin Repeat Protein Family.

    Directory of Open Access Journals (Sweden)

    R Gonzalo Parra

    2015-12-01

    Ankyrin repeat containing proteins are one of the most abundant solenoid folds. Usually implicated in specific protein-protein interactions, these proteins are readily amenable for design, with promising biotechnological and biomedical applications. Studying repeat protein families presents technical challenges due to the high sequence divergence among the repeating units. We developed and applied a systematic method to consistently identify and annotate the structural repetitions over the members of the complete Ankyrin Repeat Protein Family, with increased sensitivity over previous studies. We statistically characterized the number of repeats, the folding of the repeat-arrays, their structural variations, insertions and deletions. An energetic analysis of the local frustration patterns reveals the basic features underlying fold stability and its relation to the functional binding regions. We found a strong linear correlation between the conservation of the energetic features in the repeat arrays and their sequence variations, and discuss new insights into the organization and function of these ubiquitous proteins.

  2. Integral membrane protein structure determination using pseudocontact shifts.

    Science.gov (United States)

    Crick, Duncan J; Wang, Jue X; Graham, Bim; Swarbrick, James D; Mott, Helen R; Nietlispach, Daniel

    2015-04-01

    Obtaining enough experimental restraints can be a limiting factor in the NMR structure determination of larger proteins. This is particularly the case for large assemblies such as membrane proteins that have been solubilized in a membrane-mimicking environment. Whilst in such cases extensive deuteration strategies are regularly utilised with the aim to improve the spectral quality, these schemes often limit the number of NOEs obtainable, making complementary strategies highly beneficial for successful structure elucidation. Recently, lanthanide-induced pseudocontact shifts (PCSs) have been established as a structural tool for globular proteins. Here, we demonstrate that a PCS-based approach can be successfully applied for the structure determination of integral membrane proteins. Using the 7TM α-helical microbial receptor pSRII, we show that PCS-derived restraints from lanthanide binding tags attached to four different positions of the protein facilitate the backbone structure determination when combined with a limited set of NOEs. In contrast, the same set of NOEs fails to determine the correct 3D fold. The latter situation is frequently encountered in polytopical α-helical membrane proteins and a PCS approach is thus suitable even for this particularly challenging class of membrane proteins. The ease of measuring PCSs makes this an attractive route for structure determination of large membrane proteins in general.

  3. Integral membrane protein structure determination using pseudocontact shifts

    Energy Technology Data Exchange (ETDEWEB)

    Crick, Duncan J.; Wang, Jue X. [University of Cambridge, Department of Biochemistry (United Kingdom); Graham, Bim; Swarbrick, James D. [Monash University, Monash Institute of Pharmaceutical Sciences (Australia); Mott, Helen R.; Nietlispach, Daniel, E-mail: dn206@cam.ac.uk [University of Cambridge, Department of Biochemistry (United Kingdom)

    2015-04-15

    Obtaining enough experimental restraints can be a limiting factor in the NMR structure determination of larger proteins. This is particularly the case for large assemblies such as membrane proteins that have been solubilized in a membrane-mimicking environment. Whilst in such cases extensive deuteration strategies are regularly utilised with the aim to improve the spectral quality, these schemes often limit the number of NOEs obtainable, making complementary strategies highly beneficial for successful structure elucidation. Recently, lanthanide-induced pseudocontact shifts (PCSs) have been established as a structural tool for globular proteins. Here, we demonstrate that a PCS-based approach can be successfully applied for the structure determination of integral membrane proteins. Using the 7TM α-helical microbial receptor pSRII, we show that PCS-derived restraints from lanthanide binding tags attached to four different positions of the protein facilitate the backbone structure determination when combined with a limited set of NOEs. In contrast, the same set of NOEs fails to determine the correct 3D fold. The latter situation is frequently encountered in polytopical α-helical membrane proteins and a PCS approach is thus suitable even for this particularly challenging class of membrane proteins. The ease of measuring PCSs makes this an attractive route for structure determination of large membrane proteins in general.

  4. High-Throughput Characterization of Intrinsic Disorder in Proteins from the Protein Structure Initiative

    Science.gov (United States)

    Johnson, Derrick E.; Xue, Bin; Sickmeier, Megan D.; Meng, Jingwei; Cortese, Marc S.; Oldfield, Christopher J.; Le Gall, Tanguy; Dunker, A. Keith; Uversky, Vladimir N.

    2012-01-01

    The identification of intrinsically disordered proteins (IDPs) among the targets that fail to form satisfactory crystal structures in the Protein Structure Initiative represents a key to reducing the costs and time for determining three-dimensional structures of proteins. To help in this endeavor, several Protein Structure Initiative Centers were asked to send samples of both crystallizable proteins and proteins that failed to crystallize. The abundance of intrinsic disorder in these proteins was evaluated via computational analysis using Predictors of Natural Disordered Regions (PONDR®) and the potential cleavage sites and corresponding fragments were determined. Then, the target proteins were analyzed for intrinsic disorder by their resistance to limited proteolysis. The rates of tryptic digestion of sample target proteins were compared to those of lysozyme/myoglobin, apo-myoglobin and α-casein as standards of ordered, partially disordered and completely disordered proteins, respectively. At the next stage, the protein samples were subjected to both far-UV and near-UV circular dichroism (CD) analysis. For most of the samples, a good agreement between CD data, predictions of disorder and the rates of limited tryptic digestion was established. Further experimentation is being performed on a smaller subset of these samples in order to obtain more detailed information on the ordered/disordered nature of the proteins. PMID:22651963

  5. Protein Evolution along Phylogenetic Histories under Structurally Constrained Substitution Models

    Science.gov (United States)

    Arenas, Miguel; Dos Santos, Helena G.; Posada, David; Bastolla, Ugo

    2017-01-01

    Motivation: Models of molecular evolution aim at describing the evolutionary processes at the molecular level. However, current models rarely incorporate information from protein structure. Conversely, structure-based models of protein evolution have not been commonly applied to simulate sequence evolution in a phylogenetic framework and they often ignore relevant evolutionary processes such as recombination. A simulation evolutionary framework that integrates substitution models that account for protein structure stability should be able to generate more realistic in silico evolved proteins for a variety of purposes. Results: We developed a method to simulate protein evolution that combines models of protein folding stability, such that the fitness depends on the stability of the native state both with respect to unfolding and misfolding, with phylogenetic histories that can be either specified by the user or simulated with the coalescent under complex evolutionary scenarios including recombination, demographics and migration. We have implemented this framework in a computer program called ProteinEvolver. Remarkably, comparing these models with empirical amino acid replacement models, we found that the former produce amino acid distributions closer to distributions observed in real protein families, and proteins that are predicted to be more stable. Therefore, we conclude that evolutionary models that consider protein stability and realistic evolutionary histories constitute a better approximation of the real evolutionary process. Availability: ProteinEvolver is written in C, can run in parallel, and is freely available from http://code.google.com/p/proteinevolver/. PMID:24037213

  6. Validation of protein structure models using network similarity score.

    Science.gov (United States)

    Ghosh, Sambit; Gadiyaram, Vasundhara; Vishveshwara, Saraswathi

    2017-09-01

    Accurate structural validation of proteins is of extreme importance in studies like protein structure prediction, analysis of molecular dynamic simulation trajectories and finding subtle changes in very similar structures. The benchmarks for today's structure validation are scoring methods like global distance test-total structure (GDT-TS), TM-score and root mean square deviations (RMSD). However, there is a lack of methods that look at both the protein backbone and side-chain structures at the global connectivity level and provide information about the differences in connectivity. To address this gap, a graph spectral based method (NSS-network similarity score), which has been recently developed to rigorously compare networks in diverse fields, is adopted to compare protein structures both at the backbone and at the side-chain noncovalent connectivity levels. In this study, we validate the performance of NSS by investigating protein structures from X-ray structures, modeling (including CASP models), and molecular dynamics simulations. Further, we systematically identify the local and the global regions of the structures contributing to the difference in NSS, through the components of the score, a feature unique to this spectral based scoring scheme. It is demonstrated that the method can quantify subtle differences in connectivity compared to a reference protein structure and can form a robust basis for protein structure comparison. Additionally, we have also introduced a network-based method to analyze fluctuations in side chain interactions (edge-weights) in an ensemble of structures, which can be a useful tool for the analysis of MD trajectories. © 2017 Wiley Periodicals, Inc.

  7. Protein contact maps: A binary depiction of protein 3D structures

    Science.gov (United States)

    Emerson, Isaac Arnold; Amala, Arumugam

    2017-01-01

    In recent years, there has been considerable interest in examining the structure and dynamics of complex networks. Proteins in 3D space may also be considered complex systems that emerge through the interactions of their constituent amino acids. This representation provides a powerful framework for uncovering the general organizing principles of protein contact networks. Here we review protein contact maps in terms of protein structure prediction and analysis. In addition, we discuss the various computational techniques for predicting protein contact maps and the tools for visualizing them.
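
    As an illustration of the binary depiction this record refers to, a contact map can be built by thresholding pairwise residue distances. The sketch below is an assumption-laden example rather than anything taken from the review: it uses Cα coordinates only, and the 8 Å cutoff, the `min_separation` parameter and the function name `contact_map` are illustrative choices.

```python
import math


def contact_map(ca_coords, cutoff=8.0, min_separation=1):
    """Return a symmetric 0/1 matrix marking residue pairs within `cutoff` Å.

    ca_coords is a sequence of (x, y, z) Cα coordinates in chain order.
    Pairs closer than `min_separation` positions apart in sequence
    (including each residue with itself) are left as 0.
    """
    n = len(ca_coords)
    cmap = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + min_separation, n):
            if math.dist(ca_coords[i], ca_coords[j]) <= cutoff:
                cmap[i][j] = cmap[j][i] = 1
    return cmap


# Four residues spaced 3.8 Å apart on a straight line.
toy_chain = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
for row in contact_map(toy_chain):
    print(row)
```

    Raising `min_separation` is one way to suppress trivial contacts between sequence neighbours so that only longer-range contacts remain in the map.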

  8. A 9-state hidden Markov model using protein secondary structure information for protein fold recognition.

    Science.gov (United States)

    Lee, Sun Young; Lee, Jong Yun; Jung, Kwang Su; Ryu, Keun Ho

    2009-06-01

    In protein fold recognition, the main disadvantage of hidden Markov models (HMMs) is the employment of large-scale model architectures which require large data sets and high computational resources for training. Also, HMMs must consider sequential information about secondary structures of proteins, to improve prediction performance and reduce model parameters. Therefore, we propose a novel method for protein fold recognition based on a hidden Markov model, called a 9-state HMM. The method can (i) reduce the number of states using secondary structure information about proteins for each fold and (ii) recognize protein folds more accurately than other HMMs.

  9. Fusion proteins as alternate crystallization paths to difficult structure problems

    Science.gov (United States)

    Carter, Daniel C.; Rueker, Florian; Ho, Joseph X.; Lim, Kap; Keeling, Kim; Gilliland, Gary; Ji, Xinhua

    1994-01-01

    The three-dimensional structure of a peptide fusion product with glutathione transferase from Schistosoma japonicum (SjGST) has been solved by crystallographic methods to 2.5 Å resolution. Peptides or proteins can be fused to SjGST and expressed in a plasmid for rapid synthesis in Escherichia coli. Fusion proteins created by this commercial method can be purified rapidly by chromatography on immobilized glutathione. The potential utility of using SjGST fusion proteins as alternate paths to the crystallization and structure determination of proteins is demonstrated.

  10. Structural properties of proteins specific to the myelin sheath.

    Science.gov (United States)

    Kursula, P

    2008-02-01

    The myelin sheath is an insulating membrane layer surrounding myelinated axons in vertebrates, which is formed when the plasma membrane of an oligodendrocyte or a Schwann cell wraps itself around the axon. A large fraction of the total protein in this membrane layer is comprised of only a small number of individual proteins, which have certain intriguing structural properties. The myelin proteins are implicated in a number of neurological diseases, including, for example, autoimmune diseases and peripheral neuropathies. In this review, the structural properties of a number of myelin-specific proteins are described.

  11. Exploring Protein Dynamics Space: The Dynasome as the Missing Link between Protein Structure and Function

    Science.gov (United States)

    Hensen, Ulf; Meyer, Tim; Haas, Jürgen; Rex, René; Vriend, Gert; Grubmüller, Helmut

    2012-01-01

    Proteins are usually described and classified according to amino acid sequence, structure or function. Here, we develop a minimally biased scheme to compare and classify proteins according to their internal mobility patterns. This approach is based on the notion that proteins not only fold into recurring structural motifs but might also be carrying out only a limited set of recurring mobility motifs. The complete set of these patterns, which we tentatively call the dynasome, spans a multi-dimensional space with axes, the dynasome descriptors, characterizing different aspects of protein dynamics. The unique dynamic fingerprint of each protein is represented as a vector in the dynasome space. The difference between any two vectors, consequently, gives a reliable measure of the difference between the corresponding protein dynamics. We characterize the properties of the dynasome by comparing the dynamics fingerprints obtained from molecular dynamics simulations of 112 proteins, but our approach is, in principle, not restricted to any specific source of data of protein dynamics. We conclude that: (1) the dynasome consists of a continuum of proteins, rather than well separated classes; (2) for the majority of proteins we observe strong correlations between structure and dynamics; (3) proteins with similar function carry out similar dynamics, which suggests a new method to improve protein function annotation based on protein dynamics. PMID:22606222

  12. Expression screening, protein purification and NMR analysis of human protein domains for structural genomics

    NARCIS (Netherlands)

    Folkers, G.E.|info:eu-repo/dai/nl/162277202; van Buuren, B.N.M.; Kaptein, R.|info:eu-repo/dai/nl/074334603

    2004-01-01

    Structural genomics, the determination of protein structures on a genome-wide scale, is still in its infancy for eukaryotes due to the number and size of their genes. Low protein expression and solubility of eukaryotic gene products are the major bottlenecks in high-throughput (HTP) recombinant

  13. Analysis of the interface variability in NMR structure ensembles of protein-protein complexes

    NARCIS (Netherlands)

    Calvanese, Luisa; D'Auria, Gabriella; Vangone, Anna; Falcigno, Lucia; Oliva, Romina

    NMR structures consist of ensembles of conformers, all satisfying the experimental restraints, which exhibit a certain degree of structural variability. We analyzed here the interface in NMR ensembles of protein-protein heterodimeric complexes and found it to span a wide range of different

  14. How Do Rab Proteins Determine Golgi Structure?

    Science.gov (United States)

    Liu, Shijie; Storrie, Brian

    2015-01-01

    Rab proteins, small GTPases, are key regulators of mammalian Golgi apparatus organization. Based on the effect of Rab activation state, Rab proteins fall into two functional classes. In Class 1, inactivation induces Golgi ribbon fragmentation and/or redistribution of Golgi enzymes to the ER, while overexpression of wild type or activation has little, if any, effect on Golgi ribbon organization. In Class 2, the reverse is true. We give emphasis to Rab6, the most abundant Golgi-associated Rab protein. Rab6 depletion in HeLa cells causes an increase in Golgi cisternal number, longer, more continuous cisternae, and a pronounced accumulation of vesicles; the effect of Rab6 on Golgi ribbon organization is probably through regulation of vesicle transport. In effector studies, motor proteins and their regulators are found to be key Rab6 effectors. A related Rab, Rab41, affects Golgi ribbon organization in a contrasting manner. The balance between minus- and plus-end directed motor recruitment may well be the major Rab-dependent factor in Golgi ribbon organization. PMID:25708460

  15. Relationship between Molecular Structure Characteristics of Feed Proteins and Protein In vitro Digestibility and Solubility

    OpenAIRE

    Bai, Mingmei; Qin, Guixin; Sun, Zewei; Long, Guohui

    2015-01-01

    The nutritional value of feed proteins and their utilization by livestock are related not only to the chemical composition but also to the structure of feed proteins, but few studies thus far have investigated the relationship between the structure of feed proteins and their solubility as well as digestibility in monogastric animals. To address this question we analyzed soybean meal, fish meal, corn distiller’s dried grains with solubles, corn gluten meal, and feather meal by Fourier transfor...

  16. Simultaneous prediction of protein secondary structure and transmembrane spans.

    Science.gov (United States)

    Leman, Julia Koehler; Mueller, Ralf; Karakas, Mert; Woetzel, Nils; Meiler, Jens

    2013-07-01

    Prediction of transmembrane spans and secondary structure from the protein sequence is generally the first step in the structural characterization of (membrane) proteins. The preferences of a stretch of amino acids in a protein to form secondary structure and to be placed in the membrane are correlated. Nevertheless, current methods predict either secondary structure or individual transmembrane states. We introduce a method that simultaneously predicts the secondary structure and transmembrane spans from the protein sequence. This approach not only eliminates the necessity to create a consensus prediction from possibly contradicting outputs of several predictors but also bears the potential to predict conformational switches, i.e., sequence regions that have a high probability to change, for example, from a coil conformation in solution to an α-helical transmembrane state. An artificial neural network was trained on databases of 177 membrane proteins and 6048 soluble proteins. The output is a 3 × 3 dimensional probability matrix for each residue in the sequence that combines three secondary structure types (helix, strand, coil) and three environment types (membrane core, interface, solution). The prediction accuracies are 70.3% for nine possible states, 73.2% for three-state secondary structure prediction, and 94.8% for three-state transmembrane span prediction. These accuracies are comparable to state-of-the-art predictors of secondary structure (e.g., Psipred) or transmembrane placement (e.g., OCTOPUS). The method is available as a web server and for download at www.meilerlab.org. Copyright © 2013 Wiley Periodicals, Inc.
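
    The per-residue 3 × 3 probability matrix mentioned in this record relates the nine-state prediction to the two three-state predictions. The sketch below shows one plausible way to read such a matrix, under assumptions not stated in the abstract: rows index secondary structure, columns index environment, the nine entries sum to 1, and the three-state predictions are obtained by summing rows or columns. The function names and the example values are hypothetical and are not output of the published method.

```python
import math

SS_TYPES = ("helix", "strand", "coil")
ENV_TYPES = ("membrane core", "interface", "solution")


def marginals(joint):
    """Collapse a 3x3 joint probability matrix into two 3-state distributions.

    joint[i][j] is taken to be P(secondary structure = SS_TYPES[i],
    environment = ENV_TYPES[j]) for a single residue.
    """
    total = sum(sum(row) for row in joint)
    if not math.isclose(total, 1.0, abs_tol=1e-6):
        raise ValueError("joint probabilities must sum to 1")
    ss = {name: sum(joint[i]) for i, name in enumerate(SS_TYPES)}
    env = {
        name: sum(joint[i][j] for i in range(len(SS_TYPES)))
        for j, name in enumerate(ENV_TYPES)
    }
    return ss, env


def most_likely_state(joint):
    """Return the (secondary structure, environment) pair with highest probability."""
    cells = [(i, j) for i in range(len(SS_TYPES)) for j in range(len(ENV_TYPES))]
    i, j = max(cells, key=lambda ij: joint[ij[0]][ij[1]])
    return SS_TYPES[i], ENV_TYPES[j]


# Made-up matrix for one residue, dominated by a helix in the membrane core.
example = [
    [0.50, 0.10, 0.05],  # helix
    [0.05, 0.05, 0.05],  # strand
    [0.05, 0.05, 0.10],  # coil
]
ss_probs, env_probs = marginals(example)
print(most_likely_state(example))
print(ss_probs)
print(env_probs)
```

    Keeping the joint matrix rather than only the two marginals is what allows a single residue to carry appreciable probability for both a coil in solution and a helix in the membrane, which is how the abstract describes candidate conformational switches.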

  17. [Structural biology of post-translational modifications of proteins].

    Science.gov (United States)

    Kato, Koichi

    2012-01-01

    A majority of proteins encoded in genomes of limited size are post-translationally diversified by covalent modifications such as glycosylation and ubiquitination. Although recent advances in structural proteomics have enabled high-throughput structure determination of proteins, structural analyses of post-translationally modified proteins remain challenging because of the lack of appropriate determination methods. Therefore, we developed methodologies for characterizing the post-translational modifications of proteins from the structural viewpoint, focusing especially on glycosylation and ubiquitination. For instance, we established a systematic method for structural glycomics to address broader issues, including glycosylation profiling and 3D structure analyses of glycoproteins. Our stable-isotope-assisted NMR techniques in conjunction with X-ray crystallographic approach provide valuable information at the atomic level on conformations, dynamics, and interactions of glycoproteins such as antibody and proteins involved in the ubiquitin-proteasome system. These studies provide the structural basis for improved efficacy of therapeutic antibodies on defucosylation of their Fc glycans and mechanistic insights into ubiquitination reactions in glycoprotein-fate determination in cells. These approaches will allow new possibilities for structural studies on post-translationally modified proteins of clinical, pathological, and pharmaceutical interests.

  18. Large cryptic internal sequence repeats in protein structures from ...

    Indian Academy of Sciences (India)

    Prakash

    [Sarani R, Udayaprakash N A, Subashini R, Mridula P, Yamane T and Sekar K 2009 Large cryptic internal sequence repeats in protein structures from Homo sapiens; J. Biosci. 34 103–112]. Keywords. Propensity; structure–function correlation; human genome; structural plasticity; three-dimensional structure; identical and.

  19. Protein dynamics derived from clusters of crystal structures

    NARCIS (Netherlands)

    van Aalten, D.M.F.; Conn, D.A.; de Groot, B.L.; Berendsen, H.J.C.; Findlay, J.B.C.; Amadei, A

    1997-01-01

    A method is presented to mathematically extract concerted structural transitions in proteins from collections of crystal structures. The ''essential dynamics'' procedure is used to filter out small-amplitude fluctuations from such a set of structures; the remaining large conformational changes

  20. Protein sequence and structure relationship ARMA spectral analysis: application to membrane proteins.

    Science.gov (United States)

    Sun, S; Parthasarathy, R

    1994-06-01

    If it is assumed that the primary sequence determines the three-dimensional folded structure of a protein, then the regular folding patterns, such as alpha-helix, beta-sheet, and other ordered patterns in the three-dimensional structure must correspond to the periodic distribution of the physical properties of the amino acids along the primary sequence. An AutoRegressive Moving Average (ARMA) model method of spectral analysis is applied to analyze protein sequences represented by the hydrophobicity of their amino acids. The results for several membrane proteins of known structures indicate that the periodic distribution of hydrophobicity of the primary sequence is closely related to the regular folding patterns in a protein's three-dimensional structure. We also applied the method to the transmembrane regions of acetylcholine receptor alpha subunit and Shaker potassium channel for which no atomic resolution structure is available. This work is an extension of our analysis of globular proteins by a similar method.

  1. Coverage of protein sequence space by current structural genomics targets.

    Science.gov (United States)

    O'Toole, Nicholas; Raymond, Stéphane; Cygler, Miroslaw

    2003-01-01

    By its purest definition the ultimate goal of structural genomics (SG) is the determination of the structures of all proteins encoded by genomes. Most of these will be obtained by homology modeling using the structures of a set of target proteins for experimental determination. Thanks to the open exchange of SG target information, we are able to analyze the sequences of the current target list to evaluate the extent of its coverage of protein sequence space. The presence of homologous sequences currently either in the Protein Data Bank (PDB) or among SG targets has been determined for each of the protein sequences in several organisms. In this way we are able to evaluate the coverage by existing or targeted structural data for the non-membranous parts of entire proteomes. For small bacterial proteomes such as that of H. influenzae almost all proteins have homologous sequences among SG targets or in the PDB. There is significantly lower coverage for more complex organisms, such as C. elegans. We have mapped the SG target list onto the ProtoMap clustering of protein sequences. Clusters occupied by SG targets represent over 150,000 protein sequences, which is approximately 44% of the total protein sequences classified by ProtoMap. The mapping of SG targets also enables an evaluation of the degree of overlap within the target list. An SG target typically occupies a ProtoMap cluster with more than six other homologous targets.

  2. Structural and Functional Annotation of Hypothetical Proteins of O139

    Directory of Open Access Journals (Sweden)

    Md. Saiful Islam

    2015-06-01

    In developing countries the threat of cholera is a significant health concern whenever water purification and sewage disposal systems are inadequate. Vibrio cholerae is one of the bacteria responsible for cholera. The complete genome sequence of V. cholerae reveals the presence of various genes and hypothetical proteins whose functions are not yet understood. Hence analyzing and annotating the structure and function of hypothetical proteins is important for understanding V. cholerae. V. cholerae O139 is the most common and pathogenic bacterial strain among various V. cholerae strains. In this study, the sequences of six hypothetical proteins of V. cholerae O139, obtained from NCBI, have been annotated. Various computational tools and databases have been used to determine domain family, protein-protein interaction, solubility of protein, ligand binding sites, etc. The three-dimensional structures of two proteins were modeled and their ligand binding sites were identified. We have found domains and families for only one protein. The analysis revealed that these proteins might have antibiotic resistance activity, DNA breaking-rejoining activity, integrase enzyme activity, restriction endonuclease activity, etc. Structural prediction of these proteins and detection of binding sites in this study may indicate potential targets and aid docking studies for therapeutic design against cholera.

  3. Structural changes in gluten protein structure after addition of emulsifier. A Raman spectroscopy study

    Science.gov (United States)

    Ferrer, Evelina G.; Gómez, Analía V.; Añón, María C.; Puppo, María C.

    2011-06-01

    A food protein product, gluten protein, was chemically modified with varying levels of sodium stearoyl lactylate (SSL), and the extent of the modifications (secondary and tertiary structures) of this protein was analyzed by Raman spectroscopy. Analysis of the Amide I band showed an increase in its intensity, mainly after the addition of 0.25% SSL to wheat flour to produce modified gluten protein, pointing to the formation of a more ordered structure. Side chain vibrations also confirmed the observed changes.

  4. Protein Structure Classification and Loop Modeling Using Multiple Ramachandran Distributions

    Directory of Open Access Journals (Sweden)

    Seyed Morteza Najibi

    2017-01-01

    Recently, the study of protein structures using angular representations has attracted much attention among structural biologists. The main challenge is how to efficiently model the continuous conformational space of the protein structures based on the differences and similarities between different Ramachandran plots. Despite the presence of statistical methods for modeling angular data of proteins, there is still a substantial need for more sophisticated and faster statistical tools to model the large-scale circular datasets. To address this need, we have developed a nonparametric method for collective estimation of multiple bivariate density functions for a collection of populations of protein backbone angles. The proposed method takes into account the circular nature of the angular data using trigonometric spline which is more efficient compared to existing methods. This collective density estimation approach is widely applicable when there is a need to estimate multiple density functions from different populations with common features. Moreover, the coefficients of adaptive basis expansion for the fitted densities provide a low-dimensional representation that is useful for visualization, clustering, and classification of the densities. The proposed method provides a novel and unique perspective to two important and challenging problems in protein structure research: structure-based protein classification and angular-sampling-based protein loop structure prediction.

  5. Protein Structure Classification and Loop Modeling Using Multiple Ramachandran Distributions

    KAUST Repository

    Najibi, Seyed Morteza

    2017-02-08

    Recently, the study of protein structures using angular representations has attracted much attention among structural biologists. The main challenge is how to efficiently model the continuous conformational space of the protein structures based on the differences and similarities between different Ramachandran plots. Despite the presence of statistical methods for modeling angular data of proteins, there is still a substantial need for more sophisticated and faster statistical tools to model the large-scale circular datasets. To address this need, we have developed a nonparametric method for collective estimation of multiple bivariate density functions for a collection of populations of protein backbone angles. The proposed method takes into account the circular nature of the angular data using trigonometric spline which is more efficient compared to existing methods. This collective density estimation approach is widely applicable when there is a need to estimate multiple density functions from different populations with common features. Moreover, the coefficients of adaptive basis expansion for the fitted densities provide a low-dimensional representation that is useful for visualization, clustering, and classification of the densities. The proposed method provides a novel and unique perspective to two important and challenging problems in protein structure research: structure-based protein classification and angular-sampling-based protein loop structure prediction.

  6. Protein Structure Classification and Loop Modeling Using Multiple Ramachandran Distributions.

    Science.gov (United States)

    Najibi, Seyed Morteza; Maadooliat, Mehdi; Zhou, Lan; Huang, Jianhua Z; Gao, Xin

    2017-01-01

    Recently, the study of protein structures using angular representations has attracted much attention among structural biologists. The main challenge is how to efficiently model the continuous conformational space of the protein structures based on the differences and similarities between different Ramachandran plots. Despite the presence of statistical methods for modeling angular data of proteins, there is still a substantial need for more sophisticated and faster statistical tools to model the large-scale circular datasets. To address this need, we have developed a nonparametric method for collective estimation of multiple bivariate density functions for a collection of populations of protein backbone angles. The proposed method takes into account the circular nature of the angular data using trigonometric spline which is more efficient compared to existing methods. This collective density estimation approach is widely applicable when there is a need to estimate multiple density functions from different populations with common features. Moreover, the coefficients of adaptive basis expansion for the fitted densities provide a low-dimensional representation that is useful for visualization, clustering, and classification of the densities. The proposed method provides a novel and unique perspective to two important and challenging problems in protein structure research: structure-based protein classification and angular-sampling-based protein loop structure prediction.
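
    The backbone angles that populate a Ramachandran plot are dihedral (torsion) angles defined by four consecutive atoms. The sketch below shows the standard four-point dihedral calculation in plain Python; it is a generic illustration, not part of the authors' density estimation method, and the function names are hypothetical. For phi the four atoms would be C(i-1), N(i), CA(i), C(i); for psi they would be N(i), CA(i), C(i), N(i+1).

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def dihedral_degrees(p0, p1, p2, p3):
    """Torsion angle about the p1-p2 bond, in degrees within [-180, 180]."""
    b0, b1, b2 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1 = _cross(b0, b1)  # normal of the plane through p0, p1, p2
    n2 = _cross(b1, b2)  # normal of the plane through p1, p2, p3
    b1_len = math.sqrt(_dot(b1, b1))
    y = _dot(_cross(n1, n2), b1) / b1_len
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))
```

    Because the result is circular (-180 and 180 degrees describe the same conformation), any density estimate built on such angles has to respect that wrap-around, which is the motivation the abstract gives for trigonometric splines.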

  7. Genome Pool Strategy for Structural Coverage of Protein Families

    Science.gov (United States)

    Jaroszewski, Lukasz; Slabinski, Lukasz; Wooley, John; Deacon, Ashley M.; Lesley, Scott A.; Wilson, Ian A.; Godzik, Adam

    2010-01-01

    As noticed by generations of structural biologists, closely homologous proteins may have substantially different crystallization properties and propensities. These observations can be used to systematically introduce additional dimensionality into crystallization trials by targeting homologous proteins from multiple genomes in a “genome pool” strategy. Through extensive use of our recently introduced “crystallization feasibility score” (Slabinski et al., 2007a), we can explain that the genome pool strategy works well because the crystallization feasibility scores are surprisingly broad within families of homologous proteins, with most families containing a range of optimal to very difficult targets. We also show that some families can be regarded as relatively “easy”, where a significant number of proteins are predicted to have optimal crystallization features, and others are “very difficult”, where almost none are predicted to result in a crystal structure. Thus, such variable distributions of “crystallizability” preferences lead to uneven structural coverage of known families, with “easier” or “optimal” families having several times more solved structures than “very difficult” ones. Nevertheless, this latter category can be successfully targeted by increasing the number of genomes that are used to select targets from a given family. On average, adding 10 new genomes to the “genome pool” provides more promising targets for 7 “very difficult” families. In contrast, our crystallization feasibility score does not indicate that any specific microbial genomes can be readily classified as “easier” or “very difficult” with respect to providing suitable candidates for crystallization and structure determination. Finally, our analyses show that specific physicochemical properties of the protein sequence favor successful outcomes for structure determination and, hence, the group of proteins with known 3D

  8. Rapid and reliable protein structure determination via chemical shift threading.

    Science.gov (United States)

    Hafsa, Noor E; Berjanskii, Mark V; Arndt, David; Wishart, David S

    2017-12-01

    Protein structure determination using nuclear magnetic resonance (NMR) spectroscopy can be both time-consuming and labor intensive. Here we demonstrate how chemical shift threading can permit rapid, robust, and accurate protein structure determination using only chemical shift data. Threading is a relatively old bioinformatics technique that uses a combination of sequence information and predicted (or experimentally acquired) low-resolution structural data to generate high-resolution 3D protein structures. The key motivations behind using NMR chemical shifts for protein threading lie in the fact that they are easy to measure, they are available prior to 3D structure determination, and they contain vital structural information. The method we have developed uses not only sequence and chemical shift similarity but also chemical shift-derived secondary structure, shift-derived super-secondary structure, and shift-derived accessible surface area to generate a high quality protein structure regardless of the sequence similarity (or lack thereof) to a known structure already in the PDB. The method (called E-Thrifty) was found to be very fast (often structure) and to significantly outperform other shift-based or threading-based structure determination methods (in terms of top template model accuracy), with an average TM-score performance of 0.68 (vs. 0.50-0.62 for other methods). Coupled with recent developments in chemical shift refinement, these results suggest that protein structure determination, using only NMR chemical shifts, is becoming increasingly practical and reliable. E-Thrifty is available as a web server at http://ethrifty.ca.
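
    For readers unfamiliar with the TM-score quoted above, the following sketch evaluates the commonly used TM-score formula for one fixed superposition, given the distances between aligned residue pairs. It does not perform the superposition search that the full score maximises over, and it is not taken from E-Thrifty. The function names and the 0.5 Å floor on d0 for short chains are assumptions of this sketch.

```python
def tm_d0(target_length):
    """Length-dependent distance scale d0 (in angstroms) used by the TM-score."""
    if target_length <= 15:
        return 0.5  # assumed floor; the cube-root formula is not meaningful here
    return max(0.5, 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8)


def tm_score(distances, target_length):
    """TM-score for one superposition.

    distances: distance in angstroms for each aligned residue pair.
    target_length: number of residues in the target (the normaliser), so
    unaligned target residues lower the score.
    """
    d0 = tm_d0(target_length)
    return sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances) / target_length
```

    Scores range from 0 to 1, and a value of about 0.5 or higher is generally read as the model sharing the fold of the reference, which gives some context for the 0.68 average reported in the abstract.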

  9. Kinetics of protein adsorption on gold nanoparticle with variable protein structure and nanoparticle size

    Science.gov (United States)

    Khan, S.; Gupta, A.; Verma, N. C.; Nandi, C. K.

    2015-10-01

    The spontaneous protein adsorption on nanomaterial surfaces and the formation of a protein corona around nanoparticles are poorly understood physical phenomena, with high biological relevance. The complexity arises mainly due to the poor knowledge of the structural orientation of the adsorbed proteins onto the nanoparticle surface and difficulties in correlating the protein nanoparticle interaction to the protein corona in real time scale. Here, we provide quantitative insights into the kinetics, number, and binding orientation of a few common blood proteins when they interact with citrate and cetyltriethylammoniumbromide stabilized spherical gold nanoparticles with variable sizes. The kinetics of the protein adsorption was studied experimentally by monitoring the change in hydrodynamic diameter and zeta potential of the nanoparticle-protein complex. To understand the competitive binding of human serum albumin and hemoglobin, time dependent fluorescence quenching was studied using dual fluorophore tags. We have performed molecular docking of three different proteins—human serum albumin, bovine serum albumin, and hemoglobin—on different nanoparticle surfaces to elucidate the possible structural orientation of the adsorbed protein. Our data show that the growth kinetics of a protein corona is exclusively dependent on both protein structure and surface chemistry of the nanoparticles. The study quantitatively suggests that a general physical law of protein adsorption is unlikely to exist as the interaction is unique and specific for a given pair.

  10. A novel method to compare protein structures using local descriptors

    Directory of Open Access Journals (Sweden)

    Daniluk Paweł

    2011-08-01

    Background: Protein structure comparison is one of the most widely performed tasks in bioinformatics. However, currently used methods have problems with the so-called "difficult similarities", including considerable shifts and distortions of structure, sequential swaps and circular permutations. There is a demand for efficient and automated systems capable of overcoming these difficulties, which may lead to the discovery of previously unknown structural relationships. Results: We present a novel method for protein structure comparison based on the formalism of local descriptors of protein structure - DEscriptor Defined Alignment (DEDAL). Local similarities identified by pairs of similar descriptors are extended into global structural alignments. We demonstrate the method's capability by aligning structures in difficult benchmark sets: curated alignments in the SISYPHUS database, as well as SISY and RIPC sets, including non-sequential and non-rigid-body alignments. On the most difficult RIPC set of sequence alignment pairs the method achieves an accuracy of 77% (the second best method tested achieves 60% accuracy). Conclusions: DEDAL is fast enough to be used in whole proteome applications, and by lowering the threshold of detectable structure similarity it may shed additional light on molecular evolution processes. It is well suited to improving automatic classification of structure domains, helping analyze protein fold space, or to improving protein classification schemes. DEDAL is available online at http://bioexploratorium.pl/EP/DEDAL.

  11. Columba: an integrated database of proteins, structures, and annotations

    Directory of Open Access Journals (Sweden)

    Preissner Robert

    2005-03-01

    Background: Structural and functional research often requires the computation of sets of protein structures based on certain properties of the proteins, such as sequence features, fold classification, or functional annotation. Compiling such sets using current web resources is tedious because the necessary data are spread over many different databases. To facilitate this task, we have created COLUMBA, an integrated database of annotations of protein structures. Description: COLUMBA currently integrates twelve different databases, including PDB, KEGG, Swiss-Prot, CATH, SCOP, the Gene Ontology, and ENZYME. The database can be searched using either keyword search or data source-specific web forms. Users can thus quickly select and download PDB entries that, for instance, participate in a particular pathway, are classified as containing a certain CATH architecture, are annotated as having a certain molecular function in the Gene Ontology, and whose structures have a resolution under a defined threshold. The results of queries are provided in both machine-readable extensible markup language and human-readable format. The structures themselves can be viewed interactively on the web. Conclusion: The COLUMBA database facilitates the creation of protein structure data sets for many structure-based studies. It allows queries to be combined across a number of structure-related databases not covered by other projects at present. Thus, information on both many and few protein structures can be used efficiently. The web interface for COLUMBA is available at http://www.columba-db.de.

  12. Bayesian model of protein primary sequence for secondary structure prediction.

    Directory of Open Access Journals (Sweden)

    Qiwei Li

    Determining the primary structure (i.e., amino acid sequence) of a protein has become cheaper, faster, and more accurate. Higher order protein structure provides insight into a protein's function in the cell. Understanding a protein's secondary structure is a first step towards this goal. Therefore, a number of computational prediction methods have been developed to predict secondary structure from just the primary amino acid sequence. The most successful methods use machine learning approaches that are quite accurate, but do not directly incorporate structural information. As a step towards improving secondary structure reduction given the primary structure, we propose a Bayesian model based on the knob-socket model of protein packing in secondary structure. The method considers the packing influence of residues on the secondary structure determination, including those packed close in space but distant in sequence. By performing an assessment of our method on 2 test sets we show how incorporation of multiple sequence alignment data, similarly to PSIPRED, provides balance and improves the accuracy of the predictions. Software implementing the methods is provided as a web application and a stand-alone implementation.

  13. Template-based preparation of free-standing semiconducting polymeric nanorod arrays on conductive substrates.

    Science.gov (United States)

    Haberkorn, Niko; Weber, Stefan A L; Berger, Rüdiger; Theato, Patrick

    2010-06-01

    We describe the synthesis and characterization of a cross-linkable siloxane-derivatized tetraphenylbenzidine (DTMS-TPD), which was used for the fabrication of semiconducting highly ordered nanorod arrays on conductive indium tin oxide or Pt-coated substrates. The stepwise process allows the fabrication of macroscopic areas of well-ordered free-standing nanorod arrays, which feature a high resistance against organic solvents, semiconducting properties and a good adhesion to the substrate. Thin films of the TPD derivative with good hole-conducting properties could be prepared by cross-linking and covalently attaching to hydroxylated substrates utilizing an initiator-free thermal curing at 160 degrees C. The nanorod arrays composed of cross-linked DTMS-TPD were fabricated by an anodic aluminum oxide (AAO) template approach. Furthermore, the nanorod arrays were investigated by a recently introduced method that allows local conductivity to be probed on fragile structures. It revealed that more than 98% of the nanorods exhibit electrical conductance and consequently feature a good electrical contact to the substrate. The prepared nanorod arrays have the potential to find application in the fabrication of multilayered device architectures for building well-ordered bulk-heterojunction solar cells.

  14. Exploring the effects of sparse restraints on protein structure prediction.

    Science.gov (United States)

    Mandalaparthy, Varun; Sanaboyana, Venkata Ramana; Rafalia, Hitesh; Gosavi, Shachi

    2017-12-03

    One of the main barriers to accurate computational protein structure prediction is searching the vast space of protein conformations. Distance restraints or inter-residue contacts have been used to reduce this search space, easing the discovery of the correct folded state. It has been suggested that about 1 contact for every 12 residues may be sufficient to predict structure at fold level accuracy. Here, we use coarse-grained structure-based models in conjunction with molecular dynamics simulations to examine this empirical prediction. We generate sparse contact maps for 15 proteins of varying sequence lengths and topologies and find that given perfect secondary-structural information, a small fraction of the native contact map (5%-10%) suffices to fold proteins to their correct native states. We also find that different sparse maps are not equivalent and we make several observations about the type of maps that are successful at such structure prediction. Long range contacts are found to encode more information than shorter range ones, especially for α and αβ-proteins. However, this distinction reduces for β-proteins. Choosing contacts that are a consensus from successful maps gives predictive sparse maps as does choosing contacts that are well spread out over the protein structure. Additionally, the folding of proteins can also be used to choose predictive sparse maps. Overall, we conclude that structure-based models can be used to understand the efficacy of structure-prediction restraints and could, in future, be tuned to include specific force-field interactions, secondary structure errors and noise in the sparse maps. © 2017 Wiley Periodicals, Inc.
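
    The notion of a sparse contact map can be made concrete with a short sketch. It lists residue pairs whose C-alpha atoms lie within a cutoff and then keeps a random fraction of them. The 8 Å cutoff, the minimum sequence separation of 3, the uniform random sampling, and the function names are all assumptions for illustration; the abstract does not state how the authors defined or selected contacts.

```python
import math
import random


def native_contacts(ca_coords, cutoff=8.0, min_separation=3):
    """Residue index pairs (i, j) whose C-alpha atoms are within `cutoff` angstroms."""
    contacts = []
    for i in range(len(ca_coords)):
        for j in range(i + min_separation, len(ca_coords)):
            if math.dist(ca_coords[i], ca_coords[j]) <= cutoff:
                contacts.append((i, j))
    return contacts


def sparse_map(contacts, fraction, seed=0):
    """Keep a random `fraction` of the contacts (at least one, if any exist)."""
    if not contacts:
        return []
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    keep = max(1, round(fraction * len(contacts)))
    return sorted(rng.sample(contacts, keep))
```

    With `fraction` set between 0.05 and 0.10 this mimics the 5%-10% regime discussed in the abstract; the rule of thumb of one contact per 12 residues would instead fix the number kept at roughly the chain length divided by 12.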

  15. Integrating protein structures and precomputed genealogies in the Magnum database: Examples with cellular retinoid binding proteins

    Directory of Open Access Journals (Sweden)

    Bradley Michael E

    2006-02-01

    Background: When accurate models for the divergent evolution of protein sequences are integrated with complementary biological information, such as folded protein structures, analyses of the combined data often lead to new hypotheses about molecular physiology. This represents an excellent example of how bioinformatics can be used to guide experimental research. However, progress in this direction has been slowed by the lack of a publicly available resource suitable for general use. Results: The precomputed Magnum database offers a solution to this problem for ca. 1,800 full-length protein families with at least one crystal structure. The Magnum deliverables include (1) multiple sequence alignments, (2) mapping of alignment sites to crystal structure sites, (3) phylogenetic trees, (4) inferred ancestral sequences at internal tree nodes, and (5) amino acid replacements along tree branches. Comprehensive evaluations revealed that the automated procedures used to construct Magnum produced accurate models of how proteins divergently evolve, or genealogies, and correctly integrated these with the structural data. To demonstrate Magnum's capabilities, we asked for amino acid replacements requiring three nucleotide substitutions, located at internal protein structure sites, and occurring on short phylogenetic tree branches. In the cellular retinoid binding protein family a site that potentially modulates ligand binding affinity was discovered. Recruitment of cellular retinol binding protein to function as a lens crystallin in the diurnal gecko afforded another opportunity to showcase the predictive value of a browsable database containing branch replacement patterns integrated with protein structures. Conclusion: We integrated two areas of protein science, evolution and structure, on a large scale and created a precomputed database, known as Magnum, which is the first freely available resource of its kind. Magnum provides evolutionary and structural

  16. Effects of NMR spectral resolution on protein structure calculation.

    Directory of Open Access Journals (Sweden)

    Suhas Tikole

    Adequate digital resolution and signal sensitivity are two critical factors for protein structure determinations by solution NMR spectroscopy. The prime objective for obtaining high digital resolution is to resolve peak overlap, especially in NOESY spectra with thousands of signals where the signal analysis needs to be performed on a large scale. Achieving maximum digital resolution is usually limited by the practically available measurement time. We developed a method utilizing non-uniform sampling for balancing digital resolution and signal sensitivity, and performed a large-scale analysis of the effect of the digital resolution on the accuracy of the resulting protein structures. Structure calculations were performed as a function of digital resolution for about 400 proteins with molecular sizes ranging between 5 and 33 kDa. The structural accuracy was assessed by atomic coordinate RMSD values from the reference structures of the proteins. In addition, we also monitored the number of assigned NOESY cross peaks, the average signal sensitivity, and the chemical shift spectral overlap. We show that high resolution is equally important for proteins of every molecular size. The chemical shift spectral overlap depends strongly on the corresponding spectral digital resolution. Thus, knowing the extent of overlap can be a predictor of the resulting structural accuracy. Our results show that for every molecular size a minimal digital resolution, corresponding to the natural linewidth, needs to be achieved for obtaining the highest accuracy possible for the given protein size using state-of-the-art automated NOESY assignment and structure calculation methods.
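
    The accuracy measure named in this abstract, the atomic coordinate RMSD, can be written in a few lines. The sketch assumes the two structures contain the same atoms in the same order and have already been superposed; the abstract does not describe the superposition step, so it is left out here, and the function name is hypothetical.

```python
import math


def coordinate_rmsd(model, reference):
    """Root-mean-square deviation (same units as the input) between matched atoms.

    model, reference: equal-length sequences of (x, y, z) tuples, assumed to be
    already superposed onto a common frame.
    """
    if len(model) != len(reference) or not model:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared = sum(math.dist(a, b) ** 2 for a, b in zip(model, reference))
    return math.sqrt(squared / len(model))
```

    In practice an optimal rigid-body superposition (for example by the Kabsch algorithm) would be applied first, otherwise the value mixes structural differences with differences in placement.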

  17. Automatic classification of protein structure by using Gauss integrals

    DEFF Research Database (Denmark)

    Røgen, Peter; Fain, B.

    2003-01-01

    We introduce a method of looking at, analyzing, and comparing protein structures. The topology of a protein is captured by 30 numbers inspired by Vassiliev knot invariants. To illustrate the simplicity and power of this topological approach, we construct a measure (scaled Gauss metric, SGM) of similarity of protein shapes. Under this metric, protein chains naturally separate into fold clusters. We use SGM to construct an automatic classification procedure for the CATH2.4 database. The method is very fast because it requires neither alignment of the chains nor any chain-chain comparison. It also... dimensions, show the relative locations of the major structural classes, and "zoom into" the space of proteins to show architecture, topology, and fold clusters. The existence of a simple measure of a protein fold computed from the chain path will have a major impact on automatic fold classification.

  18. Using circular dichroism spectra to estimate protein secondary structure

    Science.gov (United States)

    Greenfield, Norma J.

    2009-01-01

    Circular dichroism (CD) is an excellent tool for rapid determination of the secondary structure and folding properties of proteins that have been obtained using recombinant techniques or purified from tissues. The most widely used applications of protein CD are to determine whether an expressed, purified protein is folded, or if a mutation affects its conformation or stability. In addition, it can be used to study protein interactions. This protocol details the basic steps of obtaining and interpreting CD data and methods for analyzing spectra to estimate the secondary structural composition of proteins. CD has the advantage that measurements may be made on multiple samples containing 20 µg or less of proteins in physiological buffers in a few hours. However, it does not give the residue-specific information that can be obtained by X-ray crystallography or NMR. PMID:17406547

  19. Structure, Function, and Evolution of Coronavirus Spike Proteins.

    Science.gov (United States)

    Li, Fang

    2016-09-29

    The coronavirus spike protein is a multifunctional molecular machine that mediates coronavirus entry into host cells. It first binds to a receptor on the host cell surface through its S1 subunit and then fuses viral and host membranes through its S2 subunit. Two domains in S1 from different coronaviruses recognize a variety of host receptors, leading to viral attachment. The spike protein exists in two structurally distinct conformations, prefusion and postfusion. The transition from prefusion to postfusion conformation of the spike protein must be triggered, leading to membrane fusion. This article reviews current knowledge about the structures and functions of coronavirus spike proteins, illustrating how the two S1 domains recognize different receptors and how the spike proteins are regulated to undergo conformational transitions. I further discuss the evolution of these two critical functions of coronavirus spike proteins, receptor recognition and membrane fusion, in the context of the corresponding functions from other viruses and host cells.

  20. Kinetic study and growth behavior of template-based electrodeposited platinum nanotubes controlled by overpotential

    Energy Technology Data Exchange (ETDEWEB)

    Yousefi, E. [Department of Materials Science and Engineering, Sharif University of Technology, Azadi Ave., P.O.Box 11155-9466, Tehran (Iran, Islamic Republic of); Dolati, A., E-mail: dolati@sharif.edu [Department of Materials Science and Engineering, Sharif University of Technology, Azadi Ave., P.O.Box 11155-9466, Tehran (Iran, Islamic Republic of); Imanieh, I. [Department of Materials Science and Engineering, Sharif University of Technology, Azadi Ave., P.O.Box 11155-9466, Tehran (Iran, Islamic Republic of); Yashiro, H.; Kure-Chu, S.-Z. [Department of Chemistry and Bioengineering, Faculty of Engineering, Iwate University, 4-3-5 Ueda, Morioka, Iwate, 020-8551 (Japan)

    2017-02-01

    Platinum nanotubes (PtNTs) are fabricated by potentiostatic electrodeposition at various overpotentials (−200 up to −400 mV versus SCE) in polycarbonate templates (PCTs) with pore diameter of 200 nm in a solution containing 5 mM H₂PtCl₆ and 0.1 M H₂SO₄. The synthesized PtNTs are characterized by field emission scanning electron microscopy (FE-SEM) and transmission electron microscopy (TEM). The electrochemical growth mechanism within nanoscopic pores and the relationship between morphological variations and kinetic parameters are investigated for the first time. It is shown that a more porous nanotube structure forms at high overpotentials, possibly due to preferential nucleation. The kinetics of the electrodeposition process is studied by electrochemical techniques such as voltammetry and chronoamperometry. The linear diffusion coefficient at the early stage of the deposition and the radial diffusion coefficients in the steady-state regime are calculated as D = 8.39 × 10⁻⁵ and 2.33–13.26 × 10⁻⁸ cm²/s, respectively. The synthesized PtNT electrode is tested as an electrocatalyst for hydrogen peroxide oxidation in phosphate buffer solution (PBS) and shows a sensitivity as high as 2.89 mA per 1 μM, which indicates its enlarged electrochemical surface area. - Highlights: • PtNT is electrodeposited in a 3-aminopropyltrimethoxysilane-modified PCT. • The electrochemical growth mechanism within nanoscopic pores is discussed. • The kinetics of PtNT electrodeposition is studied based on models for UME arrays. • Relationship between morphological variations vs. kinetic parameters is studied.

  1. Template-based CTA X-ray angio rigid registration of coronary arteries in frequency domain

    Science.gov (United States)

    Aksoy, Timur; Demirci, Stefanie; Degertekin, Muzaffer; Navab, Nassir; Unal, Gozde

    2013-03-01

    This study performs 3D to 2D rigid registration of segmented pre-operative CTA coronary arteries with a single segmented intra-operative X-ray Angio frame in both frequency and spatial domains for real-time Angiography interventions by C-arm fluoroscopy. Most of the work on rigid registration in the literature requires a close initialization of poses and/or positions because of the abundance of local minima and the high complexity that search algorithms face. This study avoids such setbacks by transforming the projections into the translation-invariant Fourier domain for estimating the 3D pose. First, template DRRs as candidate poses of 3D vessels of segmented CTA are produced by rotating the camera (image intensifier) around the DICOM angle values with a wide range as in the C-arm setup. We have compared the 3D poses of template DRRs with the real X-ray after equalizing the scales (due to disparities in focal length distances) in 3 domains, namely Fourier magnitude, Fourier phase and Fourier polar. The best pose candidate was chosen by one of the highest similarity measures returned by the methods in these domains. It has been noted in the literature that these methods are robust against noise and occlusion, which was also validated by our results. Translation of the volume was then recovered by distance-map based BFGS optimization, well suited to the convex structure of our objective function, which has no local minima thanks to the distance maps. Final results were evaluated in 2D projection space rather than with actual values in 3D due to the lack of ground truth and the ill-posedness of the problem, which we intend to address in the future.
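
    The translation invariance that the abstract relies on is a general property of the Fourier transform: shifting a signal changes only the phase of its spectrum, not the magnitude. The sketch below demonstrates this on a one-dimensional toy signal with a direct DFT and a circular shift. It is a simplified illustration, not the authors' 2D registration pipeline, and the function names are hypothetical.

```python
import cmath


def dft_magnitudes(signal):
    """Magnitude of each discrete Fourier coefficient (direct O(n^2) DFT)."""
    n = len(signal)
    return [
        abs(sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                for t, x in enumerate(signal)))
        for k in range(n)
    ]


def circular_shift(signal, offset):
    """Shift a list to the right by `offset` samples with wrap-around."""
    offset %= len(signal)
    return signal[-offset:] + signal[:-offset]
```

    Comparing `dft_magnitudes(profile)` with `dft_magnitudes(circular_shift(profile, 3))` gives the same values up to rounding, which is why a pose search on magnitude spectra can ignore the unknown translation and recover it in a later step.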

  2. Structural and Function Prediction of Musa acuminata subsp. Malaccensis Protein

    Directory of Open Access Journals (Sweden)

    Anum Munir

    2016-03-01

    Hypothetical proteins (HPs) are proteins whose presence has been anticipated but whose in vivo function has not been established. Illustrating the structural and functional insights of these HPs might likewise prompt a better comprehension of the protein-protein associations or networks in diverse types of life. Bananas (Musa acuminata spp.), including sweet and cooking types, are giant perennial monocotyledonous herbs of the order Zingiberales, a sister group to the well-studied Poales, which include oats. Bananas are crucial for food security in numerous tropical and subtropical nations and are the most popular fruit in industrialized nations. In the present study, a hypothetical protein of M. acuminata (banana) was chosen for analysis and modeling with various bioinformatics tools and databases. According to primary and secondary structure analysis, XP_009393594.1 is a stable hydrophobic protein containing a significant proportion of α-helices. Homology modeling was done using the SWISS-MODEL server, where the template identity with the XP_009393594.1 protein was low, which indicated the novelty of our protein. An ab initio strategy was used to produce its 3D structure. Several quality assessment and validation parameters showed the generated protein model to be stable and of fairly good quality. Functional analysis, completed with ProtFun 2.2 and KEGG (KAAS), suggested that the hypothetical protein is a transcription factor with a cytoplasmic zinc finger domain. The protein was observed to be vital for the translation process and involved in metabolism, signaling and cellular processes, genetic information processing and zinc ion binding. It is suggested that further experimental validation would help to anticipate the structures and functions of other uncharacterized proteins of different plants and organisms.

  3. Structural and functional properties of hemp seed protein products.

    Science.gov (United States)

    Malomo, Sunday A; He, Rong; Aluko, Rotimi E

    2014-08-01

    The effects of pH and protein concentration on some structural and functional properties of hemp seed protein isolate (HPI, 84.15% protein content) and defatted hemp seed protein meal (HPM, 44.32% protein content) were determined. The HPI had minimum protein solubility (PS) at pH 4.0, which increased as pH was decreased or increased. In contrast, the HPM had minimum PS at pH 3.0, which increased at higher pH values. Gel electrophoresis showed that some of the high molecular weight proteins (>45 kDa) present in HPM were not well extracted by the alkali and were absent or present in low ratio in the HPI polypeptide profile. The amino acid composition showed that the isolation process increased the Arg/Lys ratio of HPI (5.52%) when compared to HPM (3.35%). Intrinsic fluorescence and circular dichroism data indicate that the HPI proteins had a well-defined structure at pH 3.0, which was lost as pH value increased. The differences in structural conformation of HPI at different pH values were reflected as better foaming capacity at pH 3.0 when compared to pH 5.0, 7.0, and 9.0. At 10 and 25 mg/mL protein concentrations, emulsions formed by the HPM had smaller oil droplet sizes (higher quality), when compared to the HPI-formed emulsions. In contrast at 50 mg/mL protein concentration, the HPI-formed emulsions had smaller oil droplet sizes (except at pH 3.0). We conclude that the functional properties of hemp seed protein products are dependent on structural conformations as well as protein concentration and pH. © 2014 Institute of Food Technologists®

  4. Analysis on sliding helices and strands in protein structural ...

    Indian Academy of Sciences (India)

    PRAKASH KUMAR

    2007-06-16

    Holm ... enable identification of conserved core of a protein fold it is not clear if the quality of .... Percentage of pairs of secondary structural elements for various SCOP classes (a) alpha helices (b) beta strands. Number of pairs.

  5. A physical approach to protein structure prediction: CASP4 results

    Energy Technology Data Exchange (ETDEWEB)

    Crivelli, Silvia; Eskow, Elizabeth; Bader, Brett; Lamberti, Vincent; Byrd, Richard; Schnabel, Robert; Head-Gordon, Teresa

    2001-02-27

    We describe our global optimization method called Stochastic Perturbation with Soft Constraints (SPSC), which uses information from known proteins to predict secondary structure, but not in the tertiary structure predictions or in generating the terms of the physics-based energy function. Our approach is also characterized by the use of an all atom energy function that includes a novel hydrophobic solvation function derived from experiments that shows promising ability for energy discrimination against misfolded structures. We present the results obtained using our SPSC method and energy function for blind prediction in the 4th Critical Assessment of Techniques for Protein Structure Prediction (CASP4) competition, and show that our approach is more effective on targets for which less information from known proteins is available. In fact our SPSC method produced the best prediction for one of the most difficult targets of the competition, a new fold protein of 240 amino acids.

  6. Determination of Structures of Proteins in Solution using Nuclear ...

    Indian Academy of Sciences (India)

    Determination of Structures of Proteins in Solution using Nuclear Magnetic Resonance. Siddhartha P Sarma. Resonance – Journal of Science Education, Research News, Volume 8, Issue 8, August 2003, pp 86-99 ...

  7. An object-oriented database for protein structure analysis.

    Science.gov (United States)

    Gray, P M; Paton, N W; Kemp, G J; Fothergill, J E

    1990-03-01

    An object-oriented database system has been developed which is being used to store protein structure data. The database can be queried using the logic programming language Prolog or the query language Daplex. Queries retrieve information by navigating through a network of objects which represent the primary, secondary and tertiary structures of proteins. Routines written in both Prolog and Daplex can integrate complex calculations with the retrieval of data from the database, and can also be stored in the database for sharing among users. Thus object-oriented databases are better suited to prototyping applications and answering complex queries about protein structure than relational databases. This system has been used to find loops of varying length and anchor positions when modelling homologous protein structures.

  8. Structural Aspects of Protein-Metal Recognition and Discrimination

    National Research Council Canada - National Science Library

    Christianson, David

    1998-01-01

    This project utilized the zinc enzyme human carbonic anhydrase II (CAII) as a paradigm for dissecting and understanding the structural basis of protein-transition metal recognition and discrimination...

  9. Neural Network Algorithm for Prediction of Secondary Protein Structure

    National Research Council Canada - National Science Library

    Zikrija Avdagic; Elvir Purisevic; Emir Buza; Zlatan Coralic

    2009-01-01

    .... In this paper we describe the method and results of using CB513 as a dataset suitable for development of artificial neural network algorithms for prediction of secondary protein structure with MATLAB...

  10. Protein Structural Perturbation and Aggregation on Homogeneous Surfaces

    National Research Council Canada - National Science Library

    Sethuraman, Ananthakrishnan; Belfort, Georges

    2005-01-01

    We have demonstrated that globular proteins, such as hen egg lysozyme in phosphate buffered saline at room temperature, lose native structural stability and activity when adsorbed onto well-defined...

  11. Structural studies of human glioma pathogenesis-related protein 1

    Energy Technology Data Exchange (ETDEWEB)

    Asojo, Oluwatoyin A., E-mail: oasojo@unmc.edu [College of Medicine, Nebraska Medical Center, Omaha, NE 68198-6495 (United States); Koski, Raymond A.; Bonafé, Nathalie [L2 Diagnostics LLC, 300 George Street, New Haven, CT 06511 (United States); College of Medicine, Nebraska Medical Center, Omaha, NE 68198-6495 (United States)

    2011-10-01

    Structural analysis of a truncated soluble domain of human glioma pathogenesis-related protein 1, a membrane protein implicated in the proliferation of aggressive brain cancer, is presented. Human glioma pathogenesis-related protein 1 (GLIPR1) is a membrane protein that is highly upregulated in brain cancers but is barely detectable in normal brain tissue. GLIPR1 is composed of a signal peptide that directs its secretion, a conserved cysteine-rich CAP (cysteine-rich secretory proteins, antigen 5 and pathogenesis-related 1 proteins) domain and a transmembrane domain. GLIPR1 is currently being investigated as a candidate for prostate cancer gene therapy and for glioblastoma targeted therapy. Crystal structures of a truncated soluble domain of the human GLIPR1 protein (sGLIPR1) solved by molecular replacement using a truncated polyalanine search model of the CAP domain of stecrisp, a snake-venom cysteine-rich secretory protein (CRISP), are presented. The correct molecular-replacement solution could only be obtained by removing all loops from the search model. The native structure was refined to 1.85 Å resolution and that of a Zn²⁺ complex was refined to 2.2 Å resolution. The latter structure revealed that the putative binding cavity coordinates Zn²⁺ similarly to snake-venom CRISPs, which are involved in Zn²⁺-dependent mechanisms of inflammatory modulation. Both sGLIPR1 structures have extensive flexible loop/turn regions and unique charge distributions that were not observed in any of the previously reported CAP protein structures. A model is also proposed for the structure of full-length membrane-bound GLIPR1.

  12. Structural Transitions and Aggregation in Amyloidogenic Proteins

    Science.gov (United States)

    Steckmann, Timothy; Chapagain, Prem; Gerstman, Bernard; Computational and Theoretical Biophysics Group at Florida International University Team

    2014-03-01

    Amyloid fibrils are a common component in many debilitating human neurological diseases such as Alzheimer's and Parkinson's. A detailed molecular-level understanding of the formation process of amyloid fibrils is crucial for developing methods to slow down or prevent these horrific diseases. Alpha-helix to beta-sheet structural transformation is commonly observed in the process of fibril formation. We performed replica-exchange molecular dynamics simulations of structural transformations in an engineered model peptide cc-beta. Several sets of simulations with different number of cc-beta monomers were considered. Conversion of alpha-helix monomers to beta strands and the aggregation of beta strand monomers into sheets were analyzed as a function of the system size. Hydrogen bond analysis was performed and the beta-aggregate structures were characterized by a nematic order parameter.

  13. Relationship between Molecular Structure Characteristics of Feed Proteins and Protein In vitro Digestibility and Solubility.

    Science.gov (United States)

    Bai, Mingmei; Qin, Guixin; Sun, Zewei; Long, Guohui

    2016-08-01

    The nutritional value of feed proteins and their utilization by livestock are related not only to the chemical composition but also to the structure of feed proteins, but few studies thus far have investigated the relationship between the structure of feed proteins and their solubility and digestibility in monogastric animals. To address this question, we analyzed soybean meal, fish meal, corn distiller's dried grains with solubles, corn gluten meal, and feather meal by Fourier transform infrared (FTIR) spectroscopy to determine the protein molecular spectral band characteristics for amides I and II as well as α-helices and β-sheets and their ratios. Protein solubility and in vitro digestibility were measured with the Kjeldahl method using 0.2% KOH solution and the pepsin-pancreatin two-step enzymatic method, respectively. We found that all measured spectral band intensities (height and area) of feed proteins were correlated with their in vitro digestibility and solubility (p≤0.003); moreover, the relative amounts of α-helices and random coils, and the α-helix to β-sheet ratio, in protein secondary structures were positively correlated with protein in vitro digestibility and solubility (p≤0.004). On the other hand, the percentage of β-sheet structures was negatively correlated with protein in vitro digestibility (p... proteins are closely related to their in vitro digestibility at 28 h and solubility. Furthermore, the α-helix-to-β-sheet ratio can be used to predict the nutritional value of feed proteins.

  14. Are specialized web servers better at predicting protein structures ...

    African Journals Online (AJOL)

    RABAIL HAFEEZ (0973106)

    2012-07-03

    Jul 3, 2012 ... process of protein structure prediction. We gave the insulin sequence as input in all the stand alone software and saved the models produced as PDB files on my computer. Our next step was to use the insulin protein sequence as an input for all the web servers chosen for this research. All the web servers ...

  15. Is protein structure prediction still an enigma? | Sobha | African ...

    African Journals Online (AJOL)

    Proteins are large molecules indispensable for the existence and proper functioning of biological organisms. They perform a wide array of functions including catalysis, structure formation, transport, body defense, etc. Understanding the functions of proteins is a fundamental problem in the discovery of drugs to treat various ...

  16. Computing a new family of shape descriptors for protein structures

    DEFF Research Database (Denmark)

    Røgen, Peter; Sinclair, Robert

    2003-01-01

    The large-scale 3D structure of a protein can be represented by the polygonal curve through the carbon α atoms of the protein backbone. We introduce an algorithm for computing the average number of times that a given configuration of crossings on such polygonal curves is seen, the average being...

  17. Reuse of structural domain–domain interactions in protein networks

    Science.gov (United States)

    Schuster-Böckler, Benjamin; Bateman, Alex

    2007-01-01

    Background Protein interactions are thought to be largely mediated by interactions between structural domains. Databases such as iPfam relate interactions in protein structures to known domain families. Here, we investigate how the domain interactions from the iPfam database are distributed in protein interactions taken from the HPRD, MPact, BioGRID, DIP and IntAct databases. Results We find that known structural domain interactions can only explain a subset of 4–19% of the available protein interactions, nevertheless this fraction is still significantly bigger than expected by chance. There is a correlation between the frequency of a domain interaction and the connectivity of the proteins it occurs in. Furthermore, a large proportion of protein interactions can be attributed to a small number of domain interactions. We conclude that many, but not all, domain interactions constitute reusable modules of molecular recognition. A substantial proportion of domain interactions are conserved between E. coli, S. cerevisiae and H. sapiens. These domains are related to essential cellular functions, suggesting that many domain interactions were already present in the last universal common ancestor. Conclusion Our results support the concept of domain interactions as reusable, conserved building blocks of protein interactions, but also highlight the limitations currently imposed by the small number of available protein structures. PMID:17640363

  18. Self-consistent field approach to protein structure and stability

    NARCIS (Netherlands)

    Dimitrov, R.A.

    1999-01-01

    The organization of the thesis is as follows: after a short introduction (chapter 1), chapter 2 presents a review of the basic physical principles that govern protein structure and focuses on the thermodynamics as well as kinetics of protein folding and unfolding. Then chapter 3 starts with a

  19. Combining neural networks for protein secondary structure prediction

    DEFF Research Database (Denmark)

    Riis, Søren Kamaric

    1995-01-01

    In this paper structured neural networks are applied to the problem of predicting the secondary structure of proteins. A hierarchical approach is used where specialized neural networks are designed for each structural class and then combined using another neural network. The submodels are designed...... by using a priori knowledge of the mapping between protein building blocks and the secondary structure and by using weight sharing. Since none of the individual networks have more than 600 adjustable weights over-fitting is avoided. When ensembles of specialized experts are combined the performance...

  20. A generative, probabilistic model of local protein structure

    DEFF Research Database (Denmark)

    Boomsma, Wouter; Mardia, Kanti V.; Taylor, Charles C.

    2008-01-01

    Despite significant progress in recent years, protein structure prediction maintains its status as one of the prime unsolved problems in computational biology. One of the key remaining challenges is an efficient probabilistic exploration of the structural space that correctly reflects the relative...... conformational stabilities. Here, we present a fully probabilistic, continuous model of local protein structure in atomic detail. The generative model makes efficient conformational sampling possible and provides a framework for the rigorous analysis of local sequence-structure correlations in the native state...

  1. Sequence and structural features of binding site residues in protein-protein complexes: comparison with protein-nucleic acid complexes

    Directory of Open Access Journals (Sweden)

    Selvaraj S

    2011-10-01

    Abstract Background Protein-protein interactions are important for several cellular processes. Understanding the mechanism of protein-protein recognition and predicting the binding sites in protein-protein complexes are long-standing goals in molecular and computational biology. Methods We have developed an energy-based approach for identifying the binding site residues in protein-protein complexes. The binding site residues have been analyzed with sequence- and structure-based parameters such as binding propensity, neighboring residues in the vicinity of binding sites, conservation score and conformational switching. Results We observed that the binding propensities of amino acid residues are specific for protein-protein complexes. Further, typical dipeptides and tripeptides showed high preference for binding, which is unique to protein-protein complexes. Most of the binding site residues are highly conserved among homologous sequences. Our analysis showed that 7% of residues changed their conformations upon protein-protein complex formation, with 9.2% and 6.6% changing in the binding and non-binding sites, respectively. Specifically, the residues Glu, Lys, Leu and Ser changed their conformation from coil to helix/strand and from helix to coil/strand. Leu, Ser, Thr and Val prefer to change their conformation from strand to coil/helix. Conclusions The results obtained in this study will be helpful for understanding and predicting the binding sites in protein-protein complexes.

  2. Automatic classification of protein structures relying on similarities between alignments

    Directory of Open Access Journals (Sweden)

    Santini Guillaume

    2012-09-01

    Abstract Background Identification of protein structural cores requires isolation of sets of proteins that all share the same subset of structural motifs. In the context of an ever-growing number of available 3D protein structures, standard and automatic clustering algorithms require adaptations so as to allow for efficient identification of such sets of proteins. Results A pair of 3D structures is deemed similar or not according to the local similarities of their matching substructures in a structural alignment. This binary relation can be represented in a graph of similarities where a node represents a 3D protein structure and an edge states that two 3D protein structures are similar. Therefore, classifying proteins into structural families can be viewed as a graph clustering task. Unfortunately, because such a graph encodes only pairwise similarity information, clustering algorithms may include in the same cluster a subset of 3D structures that do not share a common substructure. In order to overcome this drawback, we first define a ternary similarity on a triple of 3D structures as a constraint to be satisfied by the graph of similarities. Such a ternary constraint takes into account similarities between pairwise alignments, so as to ensure that the three protein structures involved do have some common substructure. We propose a modification algorithm that eliminates edges from the original graph of similarities and gives a reduced graph in which no ternary constraints are violated. Our approach is thus first to build a graph of similarities, then to reduce the graph according to the modification algorithm, and finally to apply a standard graph clustering algorithm to the reduced graph. This method was used for classifying ASTRAL-40 non-redundant protein domains, identifying significant pairwise similarities with Yakusa, a program devised for rapid 3D structure alignments. Conclusions We show that filtering

  3. Structuring oil by protein building blocks

    NARCIS (Netherlands)

    Vries, de Auke

    2017-01-01

    In recent years, structuring of oil into ‘organogels’ or ‘oleogels’ has gained much attention amongst colloid, material, and food scientists. Potentially, these oleogels could be used as an alternative to saturated and trans fats in food products. To develop

  4. Protein secondary structure: category assignment and predictability

    DEFF Research Database (Denmark)

    Andersen, Claus A.; Bohr, Henrik; Brunak, Søren

    2001-01-01

    structures. Single sequence prediction of the new three category assignment gives an overall prediction improvement of 3.1% and 5.1%, compared to the DSSP assignment and schemes where the helix category consists of α-helix and 3₁₀-helix, respectively. These results were achieved using a standard feed-forward...

  5. Anatomically anchored template-based level set segmentation: application to quadriceps muscles in MR images from the Osteoarthritis Initiative.

    Science.gov (United States)

    Prescott, Jeffrey W; Best, Thomas M; Swanson, Mark S; Haq, Furqan; Jackson, Rebecca D; Gurcan, Metin N

    2011-02-01

    In this paper, we present a semi-automated segmentation method for magnetic resonance images of the quadriceps muscles. Our method uses an anatomically anchored, template-based initialization of the level set-based segmentation approach. The method only requires the input of a single point from the user inside the rectus femoris. The templates are quantitatively selected from a set of images based on modes in the patient population, namely, sex and body type. For a given image to be segmented, a template is selected based on the smallest Kullback-Leibler divergence between the histograms of that image and the set of templates. The chosen template is then employed as an initialization for a level set segmentation, which captures individual anatomical variations in the image to be segmented. Images from 103 subjects were analyzed using the developed method. The algorithm was trained on a randomly selected subset of 50 subjects (25 men and 25 women) and tested on the remaining 53 subjects. The performance of the algorithm on the test set was compared against the ground truth using the Zijdenbos similarity index (ZSI). The average ZSI means and standard deviations against two different manual readers were as follows: rectus femoris, 0.78 ± 0.12; vastus intermedius, 0.79 ± 0.10; vastus lateralis, 0.82 ± 0.08; and vastus medialis, 0.69 ± 0.16.
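    The template-selection and evaluation steps described in this abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' code: histograms are plain lists of bin counts, the divergence is computed as KL(image ‖ template) (the abstract does not say which direction was used), a small `eps` smooths empty bins, and the template names are hypothetical. The Zijdenbos similarity index is computed as 2|A ∩ B| / (|A| + |B|) on sets of pixel coordinates.

```python
import math


def normalize(hist, eps=1e-9):
    """Turn raw bin counts into a probability distribution; eps avoids log(0)."""
    total = sum(hist) + eps * len(hist)
    return [(h + eps) / total for h in hist]


def kl_divergence(p_counts, q_counts):
    """Kullback-Leibler divergence KL(p || q) between two histograms."""
    p = normalize(p_counts)
    q = normalize(q_counts)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


def select_template(image_hist, template_hists):
    """Pick the template whose histogram has the smallest divergence from the image."""
    return min(
        template_hists,
        key=lambda name: kl_divergence(image_hist, template_hists[name]),
    )


def zijdenbos_index(seg_a, seg_b):
    """ZSI = 2|A ∩ B| / (|A| + |B|) for two sets of pixel coordinates."""
    if not seg_a and not seg_b:
        return 1.0
    return 2 * len(seg_a & seg_b) / (len(seg_a) + len(seg_b))


if __name__ == "__main__":
    # Hypothetical intensity histograms for two population modes.
    templates = {
        "template_1": [10, 30, 40, 20],
        "template_2": [40, 30, 20, 10],
    }
    image = [12, 28, 41, 19]
    print(select_template(image, templates))

    automatic = {(0, 0), (0, 1), (1, 1)}
    manual = {(0, 1), (1, 1), (2, 2)}
    print(round(zijdenbos_index(automatic, manual), 3))
```

    In the method described above, the selected template would then initialize the level set segmentation; that step is outside the scope of this sketch.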

  6. Context- and Template-Based Compression for Efficient Management of Data Models in Resource-Constrained Systems.

    Science.gov (United States)

    Macho, Jorge Berzosa; Montón, Luis Gardeazabal; Rodriguez, Roberto Cortiñas

    2017-08-01

    The Cyber Physical Systems (CPS) paradigm is based on the deployment of interconnected heterogeneous devices and systems, so interoperability is at the heart of any CPS architecture design. In this sense, the adoption of standard and generic data formats for data representation and communication, e.g., XML or JSON, effectively addresses the interoperability problem among heterogeneous systems. Nevertheless, the verbosity of those standard data formats usually demands system resources that may overload the resource-constrained devices typically deployed in CPS. In this work we present Context- and Template-based Compression (CTC), a data compression approach targeted at resource-constrained devices, which reduces the resources needed to transmit, store and process data models. Additionally, we provide a benchmark evaluation and comparison with current implementations of the Efficient XML Interchange (EXI) processor, which is promoted by the World Wide Web Consortium (W3C) and is currently the most prominent XML compression mechanism. The results of the evaluation show that CTC outperforms EXI implementations in terms of memory usage and speed, while keeping similar compression rates. In conclusion, CTC is shown to be a good candidate for managing standard data model representation formats in CPS composed of resource-constrained devices.

  7. Structural protein 4.1 is located in mammalian centrosomes

    Energy Technology Data Exchange (ETDEWEB)

    Krauss, S.W.; Chasis, J.A.; Rogers, C.; Mohandas, N.; Krockmalnic, G.; Penman, S.

    1997-07-01

    Structural protein 4.1 was first characterized as an important 80-kDa protein in the mature red cell membrane skeleton. It is now known to be a member of a family of protein isoforms detected at diverse intracellular sites in many nucleated mammalian cells. We recently reported that protein 4.1 isoforms are present at interphase in nuclear matrix and are rearranged during the cell cycle. Here we report that protein 4.1 epitopes are present in centrosomes of human and murine cells and are detected by using affinity-purified antibodies specific for 80-kDa red cell 4.1 and for 4.1 peptides. Immunofluorescence, by both conventional and confocal microscopy, showed that protein 4.1 epitopes localized in the pericentriolar region. Protein 4.1 epitopes remained in centrosomes after extraction of cells with detergent, salt, and DNase. Higher resolution electron microscopy of detergent-extracted cell whole mounts showed centrosomal protein 4.1 epitopes distributed along centriolar cylinders and on pericentriolar fibers, at least some of which constitute the filamentous network surrounding each centriole. Double-label electron microscopy showed that protein 4.1 epitopes were predominantly localized in regions also occupied by epitopes for centrosome-specific autoimmune serum 5051 but were not found on microtubules. Our results suggest that protein 4.1 is an integral component of centrosome structure, in which it may play an important role in centrosome function during cell division and organization of cellular architecture.

  8. Illuminating structural proteins in viral "dark matter" with metaproteomics.

    Science.gov (United States)

    Brum, Jennifer R; Ignacio-Espinoza, J Cesar; Kim, Eun-Hae; Trubl, Gareth; Jones, Robert M; Roux, Simon; VerBerkmoes, Nathan C; Rich, Virginia I; Sullivan, Matthew B

    2016-03-01

    Viruses are ecologically important, yet environmental virology is limited by dominance of unannotated genomic sequences representing taxonomic and functional "viral dark matter." Although recent analytical advances are rapidly improving taxonomic annotations, identifying functional dark matter remains problematic. Here, we apply paired metaproteomics and dsDNA-targeted metagenomics to identify 1,875 virion-associated proteins from the ocean. Over one-half of these proteins were newly functionally annotated and represent abundant and widespread viral metagenome-derived protein clusters (PCs). One primarily unannotated PC dominated the dataset, but structural modeling and genomic context identified this PC as a previously unidentified capsid protein from multiple uncultivated tailed virus families. Furthermore, four of the five most abundant PCs in the metaproteome represent capsid proteins containing the HK97-like protein fold previously found in many viruses that infect all three domains of life. The dominance of these proteins within our dataset, as well as their global distribution throughout the world's oceans and seas, supports prior hypotheses that this HK97-like protein fold is the most abundant biological structure on Earth. Together, these culture-independent analyses improve virion-associated protein annotations, facilitate the investigation of proteins within natural viral communities, and offer a high-throughput means of illuminating functional viral dark matter.

  9. Structural predictions of neurobiologically relevant G-protein coupled receptors and intrinsically disordered proteins.

    Science.gov (United States)

    Rossetti, Giulia; Dibenedetto, Domenica; Calandrini, Vania; Giorgetti, Alejandro; Carloni, Paolo

    2015-09-15

    G protein-coupled receptors (GPCRs) and intrinsically disordered proteins (IDPs) are key players in neuronal function and dysfunction. Unfortunately, their structural characterization is lacking in most cases. On the one hand, no experimental structure has been determined for the two largest GPCR subfamilies, both key proteins in neuronal pathways. These are the odorant (450 members out of 900 human GPCRs) and the bitter taste receptor (25 members) subfamilies. On the other hand, the structural characterization of IDPs is also highly non-trivial. They exist as dynamic, highly flexible structural ensembles that undergo conformational conversions on a wide range of timescales, spanning from picoseconds to milliseconds. Computational methods may be of great help in characterizing these neuronal proteins. Here we review recent progress from our lab and other groups in developing and applying in silico methods for structural predictions of these highly relevant, fascinating and challenging systems. Copyright © 2015 Elsevier Inc. All rights reserved.

  10. Chaperonin Structure - The Large Multi-Subunit Protein Complex

    Directory of Open Access Journals (Sweden)

    Irena Roterman

    2009-03-01

    The multi-subunit protein structure representing the chaperonin group is analyzed with respect to its hydrophobicity distribution. The proteins of this group assist protein folding supported by ATP. The GroEL structure, with its specific axial symmetry (two rings of seven units stacked back to back, 524 aa each), and the GroES (single ring of seven units, 97 aa each) polypeptide chains are analyzed using the hydrophobicity distribution, expressed as excess/deficiency all over the molecule, to search for structure-to-function relationships. The empirically observed distribution of hydrophobic residues is compared with the theoretical one representing an idealized hydrophobic core with hydrophilic residues exposed on the surface. The observed discrepancy between these two distributions seems to be aim-oriented, determining the structure-to-function relation. The hydrophobic force field structure generated by the chaperonin capsule is presented. Its possible influence on substrate folding is suggested.

  11. Mining protein loops using a structural alphabet and statistical exceptionality

    Directory of Open Access Journals (Sweden)

    Martin Juliette

    2010-02-01

    Background: Protein loops encompass 50% of protein residues in available three-dimensional structures. These regions are often involved in protein functions, e.g. binding site, catalytic pocket... However, describing protein loops with conventional tools is a difficult task. Regular secondary structures, helices and strands, have been widely studied, whereas loops, because they are highly variable in terms of sequence and structure, are difficult to analyze. Due to data sparsity, long loops have rarely been systematically studied. Results: We developed a simple and accurate method that allows the description and analysis of the structures of short and long loops using structural motifs without restriction on loop length. This method is based on the structural alphabet HMM-SA. HMM-SA allows the simplification of a three-dimensional protein structure into a one-dimensional string of states, where each state is a four-residue prototype fragment, called a structural letter. The difficult task of structurally grouping huge data sets is thus easily accomplished by handling structural letter strings as in conventional protein sequence analysis. We systematically extracted all seven-residue fragments in a bank of 93000 protein loops and grouped them according to the structural-letter sequence, named a structural word. This approach permits a systematic analysis of loops of all sizes since we consider the structural motifs of seven residues rather than complete loops. We focused the analysis on highly recurrent words of loops (observed more than 30 times). Our study reveals that 73% of loop-lengths are covered by only 3310 highly recurrent structural words out of 28274 observed words. These structural words have low structural variability (mean RMSd of 0.85 Å). As expected, half of these motifs display a flanking-region preference but interestingly, two thirds are shared by short (less than 12 residues) and long loops. Moreover, half of

  12. Sequential Release of Proteins from Structured Multishell Microcapsules.

    Science.gov (United States)

    Shimanovich, Ulyana; Michaels, Thomas C T; De Genst, Erwin; Matak-Vinkovic, Dijana; Dobson, Christopher M; Knowles, Tuomas P J

    2017-10-09

    In nature, a wide range of functional materials is based on proteins. Increasing attention is also turning to the use of proteins as artificial biomaterials in the form of films, gels, particles, and fibrils that offer great potential for applications in areas ranging from molecular medicine to materials science. To date, however, most such applications have been limited to single component materials despite the fact that their natural analogues are composed of multiple types of proteins with a variety of functionalities that are coassembled in a highly organized manner on the micrometer scale, a process that is currently challenging to achieve in the laboratory. Here, we demonstrate the fabrication of multicomponent protein microcapsules where the different components are positioned in a controlled manner. We use molecular self-assembly to generate multicomponent structures on the nanometer scale and droplet microfluidics to bring together the different components on the micrometer scale. Using this approach, we synthesize a wide range of multiprotein microcapsules containing three well-characterized proteins: glucagon, insulin, and lysozyme. The localization of each protein component in multishell microcapsules has been detected by labeling protein molecules with different fluorophores, and the final three-dimensional microcapsule structure has been resolved by using confocal microscopy together with image analysis techniques. In addition, we show that these structures can be used to tailor the release of such functional proteins in a sequential manner. Moreover, our observations demonstrate that the protein release mechanism from multishell capsules is driven by the kinetic control of mass transport of the cargo and by the dissolution of the shells. The ability to generate artificial materials that incorporate a variety of different proteins with distinct functionalities increases the breadth of the potential applications of artificial protein-based materials

  13. SCOWLP classification: Structural comparison and analysis of protein binding regions

    Directory of Open Access Journals (Sweden)

    Anders Gerd

    2008-01-01

    Background: Detailed information about protein interactions is critical for our understanding of the principles governing protein recognition mechanisms. The structures of many proteins have been experimentally determined in complex with different ligands bound either in the same or different binding regions. Thus, the structural interactome requires the development of tools to classify protein binding regions. A proper classification may provide a general view of the regions that a protein uses to bind others and also facilitate a detailed comparative analysis of the interaction information for specific protein binding regions at the atomic level. Such a classification might be of potential use for deciphering protein interaction networks, understanding protein function, and rational engineering and design. Description: Protein binding regions (PBRs) might ideally be described as well-defined, separated regions that share no interacting residues with one another. However, PBRs are often irregular and discontinuous, and can share a wide range of interacting residues among them. The criteria used to define an individual binding region can often be arbitrary and may differ from other binding regions within a protein family. Therefore, the rationale behind protein interface classification should aim to fulfil the requirements of the analysis to be performed. We extract detailed interaction information of protein domains, peptides and interfacial solvent from the SCOWLP database and we classify the PBRs of each domain family. For this purpose, we define a similarity index based on the overlap of interacting residues mapped in pair-wise structural alignments. We perform our classification with agglomerative hierarchical clustering using the complete-linkage method. Our classification is calculated at different similarity cut-offs to allow flexibility in the analysis of PBRs, a feature especially interesting for those protein families with conflictive binding regions
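
    The clustering step described in this abstract can be sketched as follows. This is a minimal illustration only: the abstract does not give the formula of the similarity index, so a Jaccard overlap of interacting-residue positions is assumed here, and the region names, residue positions and cut-off values are hypothetical.

```python
from itertools import combinations


def similarity(a, b):
    """Assumed similarity index: Jaccard overlap of two sets of residue positions."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def complete_linkage(regions, cutoff):
    """Merge clusters while the least similar cross-cluster pair stays >= cutoff."""
    clusters = [[name] for name in regions]
    while len(clusters) > 1:
        best_pair, best_sim = None, cutoff
        for i, j in combinations(range(len(clusters)), 2):
            # Complete linkage: cluster similarity is the minimum pairwise similarity.
            sim = min(similarity(regions[x], regions[y])
                      for x in clusters[i] for y in clusters[j])
            if sim >= best_sim:
                best_pair, best_sim = (i, j), sim
        if best_pair is None:
            break  # no pair of clusters reaches the cut-off
        i, j = best_pair
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return sorted(sorted(c) for c in clusters)


# Hypothetical binding regions: residue positions mapped onto a common alignment.
regions = {
    "pbr_a": {10, 11, 12, 13},
    "pbr_b": {11, 12, 13, 14},
    "pbr_c": {40, 41, 42},
    "pbr_d": {41, 42, 43},
}
strict = complete_linkage(regions, cutoff=0.7)  # four separate regions
loose = complete_linkage(regions, cutoff=0.4)   # two merged regions
```

    Running the same data at two cut-offs shows the flexibility the abstract mentions: a strict cut-off keeps overlapping regions apart, while a looser one merges them.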

  14. High throughput platforms for structural genomics of integral membrane proteins.

    Science.gov (United States)

    Mancia, Filippo; Love, James

    2011-08-01

    Structural genomics approaches on integral membrane proteins have been postulated for over a decade, yet specific efforts are lagging years behind their soluble counterparts. Indeed, high throughput methodologies for production and characterization of prokaryotic integral membrane proteins are only now emerging, while large-scale efforts for eukaryotic ones are still in their infancy. Presented here is a review of recent literature on ongoing structural genomics initiatives for membrane proteins, with a focus on those implementing techniques aimed at increasing our rate of success for this class of macromolecules. Copyright © 2011 Elsevier Ltd. All rights reserved.

  15. Fourier Analysis of Conservation Patterns in Protein Secondary Structure.

    Science.gov (United States)

    Palaniappan, Ashok; Jakobsson, Eric

    2017-01-01

    Residue conservation is a common observation in alignments of protein families, underscoring positions important in protein structure and function. Though many methods measure the level of conservation of particular residue positions, currently we do not have a way to study spatial oscillations occurring in protein conservation patterns. It is known that hydrophobicity shows spatial oscillations in proteins, which are characterized by computing the hydrophobic moment of the protein domains. Here, we advance the study of moments of conservation of protein families to determine whether there might exist spatial asymmetry in the conservation patterns of regular secondary structures. Analogous to the hydrophobic moment, the conservation moment is defined as the modulus of the Fourier transform of the conservation function of an alignment of related proteins, where the conservation function is the vector of conservation values at each column of the alignment. The profile of the conservation moment is useful in ascertaining any periodicity of conservation, which might correlate with the period of the secondary structure. To demonstrate the concept, conservation in the family of potassium ion channel proteins was analyzed using moments. It was shown that the pore helix of the potassium channel showed oscillations in the moment of conservation matching the period of the α-helix. This implied that one side of the pore helix was evolutionarily conserved in contrast to its opposite side. In addition, the method of conservation moments correctly identified the disposition of the voltage sensor of voltage-gated potassium channels to form a 3₁₀ helix in the membrane.
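
    The definition above can be illustrated with a short sketch. It assumes the conservation function is a plain list of per-column scores and evaluates a single Fourier component at a chosen period; the function name, the synthetic profile and the comparison period of 2 residues (roughly that of a β-strand) are illustrative assumptions, not taken from the paper.

```python
import cmath
import math


def conservation_moment(conservation, period):
    """Modulus of the Fourier component of a conservation profile at a period in residues."""
    omega = 2.0 * math.pi / period  # angular step per residue
    total = sum(c * cmath.exp(1j * omega * n) for n, c in enumerate(conservation))
    return abs(total)


# Synthetic profile: conservation oscillates with the alpha-helix period of 3.6 residues
# over 18 positions, i.e. exactly five helical turns.
helix_period = 3.6
profile = [0.5 + 0.5 * math.cos(2.0 * math.pi * n / helix_period) for n in range(18)]

at_helix = conservation_moment(profile, helix_period)  # strong signal at the helix period
at_strand = conservation_moment(profile, 2.0)          # essentially no signal at period 2
```

    Scanning the period over a range of values and plotting the result would give the kind of moment profile the abstract describes, with a peak where conservation is periodic.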

  16. Protein structure as a means to triage proposed PTM sites.

    Science.gov (United States)

    Vandermarliere, Elien; Martens, Lennart

    2013-03-01

    PTMs such as phosphorylation are often important actors in protein regulation and recognition. These functions require the modification to be both visible and accessible to other proteins, that is, located at the surface of the protein. Currently, many repositories provide information on PTMs but structural information is often lacking. This study, which focuses on phosphorylation sites available in UniProtKB/Swiss-Prot, illustrates that most phosphorylation sites are indeed found at the surface of the protein, but that some sites are found buried in the core of the protein. Several of these identified buried phosphorylation sites can easily become accessible upon small conformational changes while others would require the whole protein to unfold and are hence most unlikely modification sites. Subsequent analysis of phosphorylation sites available in PRIDE demonstrates that taking the structure of the protein into account would be a good guide in the identification of the actual phosphorylated positions in phosphoproteomics experiments. This analysis illustrates that care must be taken when simply accepting the position of a PTM without first analyzing its position within the protein structure if the latter is available. © 2013 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  17. Proteins at flowing interfaces: From understanding structure to treating disease

    Science.gov (United States)

    Posada, David; Young, James; Hirsa, Amir

    2012-11-01

    The field of soft matter offers vast opportunities for scientific and technological developments, with many challenges that need to be addressed by various disciplines. Fluid dynamics has a tremendous potential for greater impact, from broadening fundamental understanding to treating disease. Here we demonstrate the use of fluid dynamics in two biotechnology problems involving proteins at the air/water interface: a) 2-Dimensional protein crystallization and b) amyloid fibril formation. Protein crystallization is usually the most challenging step in X-ray diffraction analysis of protein structure. Recently it was demonstrated that flow can induce 2-D protein crystallization at conditions under which quiescent systems do not form crystals. A different form of protein structuring, namely amyloid fibrillization, is also of interest due to its association with several neurodegenerative diseases such as Alzheimer's and Parkinson's disease. Protein denaturation, which is the root of the fibrillization process, is also a significant concern in biotherapeutics production. Both problems are studied by using shearing free-surface flows in simple geometries. The common finding is that flow can significantly enhance the growth of protein structures.

  18. Water Determines the Structure and Dynamics of Proteins.

    Science.gov (United States)

    Bellissent-Funel, Marie-Claire; Hassanali, Ali; Havenith, Martina; Henchman, Richard; Pohl, Peter; Sterpone, Fabio; van der Spoel, David; Xu, Yao; Garcia, Angel E

    2016-07-13

    Water is an essential participant in the stability, structure, dynamics, and function of proteins and other biomolecules. Thermodynamically, changes in the aqueous environment affect the stability of biomolecules. Structurally, water participates chemically in the catalytic function of proteins and nucleic acids and physically in the collapse of the protein chain during folding through hydrophobic collapse and mediates binding through the hydrogen bond in complex formation. Water is a partner that slaves the dynamics of proteins, and water interaction with proteins affects their dynamics. Here we provide a review of the experimental and computational advances over the past decade in understanding the role of water in the dynamics, structure, and function of proteins. We focus on the combination of X-ray and neutron crystallography, NMR, terahertz spectroscopy, mass spectroscopy, thermodynamics, and computer simulations to reveal how water assists proteins in their function. The recent advances in computer simulations and the enhanced sensitivity of experimental tools promise major advances in the understanding of protein dynamics, and water surely will be a protagonist.

  19. Cavities and atomic packing in protein structures and interfaces.

    Directory of Open Access Journals (Sweden)

    Shrihari Sonavane

    2008-09-01

    A comparative analysis of cavities enclosed in the tertiary structure of proteins and in interfaces formed by the interaction of two protein subunits in obligate and non-obligate categories (represented by homodimeric molecules and heterocomplexes, respectively) is presented. The total volume of cavities increases with the size of the protein (or the interface), though the exact relationship may vary in different cases. Likewise, for individual cavities there is also a quantitative dependence of the volume on the number of atoms (or residues) lining the cavity. The larger cavities tend to be less spherical, solvated, and the interfaces are enriched in these. On average 15 Å³ of cavity volume is found to accommodate a single water, with another 40-45 Å³ needed for each additional solvent molecule. Polar atoms/residues have a higher propensity to line solvated cavities. Relative to the frequency of occurrence in the whole structure (or interface), residues in beta-strands are found more often lining the cavities, and those in turns and loops the least. Any depression in one chain not complemented by a protrusion in the other results in a cavity in the protein-protein interface. Through the use of the Voronoi volume, the packing of residues involved in protein-protein interaction has been compared to that in the protein interior. For a comparable number of atoms the interface has about twice the number of cavities relative to the tertiary structure.

  20. Structures of multidomain proteins adsorbed on hydrophobic interaction chromatography surfaces.

    Science.gov (United States)

    Gospodarek, Adrian M; Sun, Weitong; O'Connell, John P; Fernandez, Erik J

    2014-12-05

    In hydrophobic interaction chromatography (HIC), interactions between buried hydrophobic residues and HIC surfaces can cause conformational changes that interfere with separations and cause yield losses. This paper extends our previous investigations of protein unfolding in HIC chromatography by identifying protein structures on HIC surfaces under denaturing conditions and relating them to solution behavior. The thermal unfolding of three model multidomain proteins on three HIC surfaces of differing hydrophobicities was investigated with hydrogen exchange mass spectrometry (HXMS). The data were analyzed to obtain unfolding rates and Gibbs free energies for unfolding of adsorbed proteins. The melting temperatures of the proteins were lowered, but by different amounts, on the different surfaces. In addition, the structures of the proteins on the chromatographic surfaces were similar to the partially unfolded structures produced in the absence of a surface by temperature as well as by chemical denaturants. Finally, it was found that patterns of residue exposure to solvent on different surfaces at different temperatures can be largely superimposed. These findings suggest that protein unfolding on various HIC surfaces might be quantitatively related to protein unfolding in solution and that details of surface unfolding behavior might be generalized. Copyright © 2014 Elsevier B.V. All rights reserved.

  1. Ranking beta sheet topologies with applications to protein structure prediction

    DEFF Research Database (Denmark)

    Fonseca, Rasmus; Helles, Glennie; Winter, Pawel

    2011-01-01

    One reason why ab initio protein structure predictors do not perform very well is their inability to reliably identify long-range interactions between amino acids. To achieve reliable long-range interactions, all potential pairings of β-strands (β-topologies) of a given protein are enumerated...... of this paper is a method to deal with the inaccuracies of secondary structure predictors when enumerating potential β-topologies. The results reported in this paper are highly relevant for ab initio protein structure prediction methods based on decoy generation. They indicate that decoy generation can......, consistently top-ranks native β-topologies. Since the number of potential β-topologies grows exponentially with the number of β-strands, it is unrealistic to expect that all potential β-topologies can be enumerated for large proteins. The second result of this paper is an enumeration scheme of a subset of β...

  2. Crystal structure of Homo sapiens protein LOC79017

    Energy Technology Data Exchange (ETDEWEB)

    Bae, Euiyoung; Bingman, Craig A.; Aceti, David J.; Phillips, Jr., George N. (UW)

    2010-02-08

    LOC79017 (MW 21.0 kDa, residues 1-188) was annotated as a hypothetical protein encoded by Homo sapiens chromosome 7 open reading frame 24. It was selected as a target by the Center for Eukaryotic Structural Genomics (CESG) because it did not share more than 30% sequence identity with any protein for which the three-dimensional structure is known. The biological function of the protein has not been established yet. Parts of LOC79017 were identified as members of uncharacterized Pfam families (residues 1-95 as PB006073 and residues 104-180 as PB031696). BLAST searches revealed homologues of LOC79017 in many eukaryotes, but none of them have been functionally characterized. Here, we report the crystal structure of H. sapiens protein LOC79017 (UniGene code Hs.530024, UniProt code O75223, CESG target number go.35223).

  3. Applications of graph theory in protein structure identification.

    Science.gov (United States)

    Yan, Yan; Zhang, Shenggui; Wu, Fang-Xiang

    2011-10-14

    There is a growing interest in the identification of proteins on a proteome-wide scale. Among the different kinds of protein structure identification methods, graph-theoretic methods are particularly effective. Owing to their lower costs, higher effectiveness and many other advantages, they have drawn increasing attention from researchers. Specifically, graph-theoretic methods have been widely used in homology identification, side-chain cluster identification, peptide sequencing and so on. This paper reviews several methods for solving protein structure identification problems using graph theory. We mainly introduce classical methods and mathematical models including homology modeling based on clique finding, identification of side-chain clusters in protein structures based on the graph spectrum, and de novo peptide sequencing via tandem mass spectrometry using the spectrum graph model. In addition, concluding remarks and future priorities of each method are given.

  4. Blind Test of Physics-Based Prediction of Protein Structures

    Science.gov (United States)

    Shell, M. Scott; Ozkan, S. Banu; Voelz, Vincent; Wu, Guohong Albert; Dill, Ken A.

    2009-01-01

    We report here a multiprotein blind test of a computer method to predict native protein structures based solely on an all-atom physics-based force field. We use the AMBER 96 potential function with an implicit (GB/SA) model of solvation, combined with replica-exchange molecular-dynamics simulations. Coarse conformational sampling is performed using the zipping and assembly method (ZAM), an approach that is designed to mimic the putative physical routes of protein folding. ZAM was applied to the folding of six proteins, from 76 to 112 monomers in length, in CASP7, a community-wide blind test of protein structure prediction. Because these predictions have about the same level of accuracy as typical bioinformatics methods, and do not utilize information from databases of known native structures, this work opens up the possibility of predicting the structures of membrane proteins, synthetic peptides, or other foldable polymers, for which there is little prior knowledge of native structures. This approach may also be useful for predicting physical protein folding routes, non-native conformations, and other physical properties from amino acid sequences. PMID:19186130

  5. Blind test of physics-based prediction of protein structures.

    Science.gov (United States)

    Shell, M Scott; Ozkan, S Banu; Voelz, Vincent; Wu, Guohong Albert; Dill, Ken A

    2009-02-01

    We report here a multiprotein blind test of a computer method to predict native protein structures based solely on an all-atom physics-based force field. We use the AMBER 96 potential function with an implicit (GB/SA) model of solvation, combined with replica-exchange molecular-dynamics simulations. Coarse conformational sampling is performed using the zipping and assembly method (ZAM), an approach that is designed to mimic the putative physical routes of protein folding. ZAM was applied to the folding of six proteins, from 76 to 112 monomers in length, in CASP7, a community-wide blind test of protein structure prediction. Because these predictions have about the same level of accuracy as typical bioinformatics methods, and do not utilize information from databases of known native structures, this work opens up the possibility of predicting the structures of membrane proteins, synthetic peptides, or other foldable polymers, for which there is little prior knowledge of native structures. This approach may also be useful for predicting physical protein folding routes, non-native conformations, and other physical properties from amino acid sequences.

  6. Structural genomics target selection for the New York consortium on membrane protein structure.

    Science.gov (United States)

    Punta, Marco; Love, James; Handelman, Samuel; Hunt, John F; Shapiro, Lawrence; Hendrickson, Wayne A; Rost, Burkhard

    2009-12-01

    The New York Consortium on Membrane Protein Structure (NYCOMPS), a part of the Protein Structure Initiative (PSI) in the USA, has as its mission to establish a high-throughput pipeline for determination of novel integral membrane protein structures. Here we describe our current target selection protocol, which applies structural genomics approaches informed by the collective experience of our team of investigators. We first extract all annotated proteins from our reagent genomes, i.e. the 96 fully sequenced prokaryotic genomes from which we clone DNA. We filter this initial pool of sequences and obtain a list of valid targets. NYCOMPS defines valid targets as those that, among other features, have at least two predicted transmembrane helices, no predicted long disordered regions and, except for community nominated targets, no significant sequence similarity in the predicted transmembrane region to any known protein structure. Proteins that feed our experimental pipeline are selected by defining a protein seed and searching the set of all valid targets for proteins that are likely to have a transmembrane region structurally similar to that of the seed. We require sequence similarity aligning at least half of the predicted transmembrane region of seed and target. Seeds are selected according to their feasibility and/or biological interest, and they include both centrally selected targets and community nominated targets. As of December 2008, over 6,000 targets have been selected and are currently being processed by the experimental pipeline. We discuss how our target list may impact structural coverage of the membrane protein space.

  7. Packing of protein structures in clusters with magic numbers

    DEFF Research Database (Denmark)

    Lindgård, Per-Anker; Bohr, Henrik

    1997-01-01

    of clusters containing magic numbers of secondary structures and multiples of these clusters. A scheme for the relation between the sequence information and the native fold is given. We have performed a statistical analysis of available protein structures and found agreement with the predicted preferred...

  8. Water-mediated ionic interactions in protein structures

    Indian Academy of Sciences (India)

    It is well known that water molecules play an indispensable role in the structure and function of biological macromolecules. The water-mediated ionic interactions between the charged residues provide stability and plasticity and in turn address the function of the protein structures. Thus, this study specifically addresses the ...

  9. NMR structural studies of protein-small molecule interactions

    NARCIS (Netherlands)

    Shah, Dipen M.

    2014-01-01

    The research presented in the thesis describes the development and implementation of solution based NMR methods that provide 3D structural information on the protein-small molecule complexes. These methods can be critical for structure based drug design and can be readily applied in the early stages

  10. Connecting Protein Structure to Intermolecular Interactions: A Computer Modeling Laboratory

    Science.gov (United States)

    Abualia, Mohammed; Schroeder, Lianne; Garcia, Megan; Daubenmire, Patrick L.; Wink, Donald J.; Clark, Ginevra A.

    2016-01-01

    An understanding of protein folding relies on a solid foundation of a number of critical chemical concepts, such as molecular structure, intra-/intermolecular interactions, and relating structure to function. Recent reports show that students struggle on all levels to achieve these understandings and use them in meaningful ways. Further, several…

  11. Protein structure prediction using bee colony optimization metaheuristic

    DEFF Research Database (Denmark)

    Fonseca, Rasmus; Paluszewski, Martin; Winter, Pawel

    2010-01-01

    of the protein's structure, an energy potential and some optimization algorithm that finds the structure with minimal energy. Bee Colony Optimization (BCO) is a relatively new approach to solving optimization problems based on the foraging behaviour of bees. Several variants of BCO have been suggested...

  12. Statistical analysis of unstructured amino acid residues in protein structures.

    Science.gov (United States)

    Lobanov, M Yu; Garbuzynskiy, S O; Galzitskaya, O V

    2010-02-01

    We have performed a statistical analysis of unstructured amino acid residues in protein structures available in the databank of protein structures. Data on the occurrence of disordered regions at the ends and in the middle part of protein chains have been obtained: in the regions near the ends (at distance less than 30 residues from the N- or C-terminus), there are 66% of unstructured residues (38% are near the N-terminus and 28% are near the C-terminus), although these terminal regions include only 23% of the amino acid residues. The frequencies of occurrence of unstructured residues have been calculated for each of 20 types in different positions in the protein chain. It has been shown that relative frequencies of occurrence of unstructured residues of 20 types at the termini of protein chains differ from the ones in the middle part of the protein chain; amino acid residues of the same type have different probabilities to be unstructured in the terminal regions and in the middle part of the protein chain. The obtained frequencies of occurrence of unstructured residues in the middle part of the protein chain have been used as a scale for predicting disordered regions from amino acid sequence using the method (FoldUnfold) previously developed by us. This scale of frequencies of occurrence of unstructured residues correlates with the contact scale (previously developed by us and used for the same purpose) at a level of 95%. Testing the new scale on a database of 427 unstructured proteins and 559 completely structured proteins has shown that this scale can be successfully used for the prediction of disordered regions in protein chains.
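
    The terminal-region figures quoted in this abstract can be turned into a simple enrichment estimate. The sketch below is a back-of-the-envelope reading of the abstract's percentages, not a calculation from the paper; the function name is hypothetical.

```python
def enrichment(share_of_unstructured, share_of_all_residues):
    """Ratio of a region's share of unstructured residues to its share of all residues."""
    return share_of_unstructured / share_of_all_residues


# From the abstract: terminal regions hold 66% of unstructured residues
# but only 23% of all residues; the middle part holds the remainder of each.
terminal = enrichment(0.66, 0.23)
middle = enrichment(1 - 0.66, 1 - 0.23)
ratio = terminal / middle  # how much more often a terminal residue is unstructured
```

    On these numbers a residue in a terminal region is roughly six to seven times more likely to be unstructured than one in the middle of the chain, which is consistent with the abstract's use of separate frequency scales for the two regions.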

  13. Automated High Throughput Protein Crystallization Screening at Nanoliter Scale and Protein Structural Study on Lactate Dehydrogenase

    Energy Technology Data Exchange (ETDEWEB)

    Li, Fenglei [Iowa State Univ., Ames, IA (United States)

    2006-08-09

    The purposes of our research were: (1) To develop an economical, easy to use, automated, high throughput system for large scale protein crystallization screening. (2) To develop a new protein crystallization method with high screening efficiency, low protein consumption and complete compatibility with high throughput screening systems. (3) To determine the structure of lactate dehydrogenase complexed with NADH by x-ray protein crystallography to study its inherent structural properties. Firstly, we demonstrated that large scale protein crystallization screening can be performed in a high throughput manner at low cost and with easy operation. The overall system integrates liquid dispensing, crystallization and detection and serves as a whole solution to protein crystallization screening. The system can dispense protein and multiple different precipitants at nanoliter scale and in parallel. A new detection scheme, native fluorescence, has been developed in this system to form a two-detector system with a visible light detector for detecting protein crystallization screening results. This detection scheme is capable of eliminating common false positives by distinguishing protein crystals from inorganic crystals in a high throughput and non-destructive manner. The entire system from liquid dispensing, crystallization to crystal detection is essentially parallel, high throughput and compatible with automation. The system was successfully demonstrated by lysozyme crystallization screening. Secondly, we developed a new crystallization method with high screening efficiency, low protein consumption and compatibility with automation and high throughput. In this crystallization method, a gas permeable membrane is employed to achieve the gentle evaporation required by protein crystallization. Protein consumption is significantly reduced to nanoliter scale for each condition and thus permits exploring more conditions in a phase diagram for a given amount of protein. In addition

  14. PDBalert: automatic, recurrent remote homology tracking and protein structure prediction

    Directory of Open Access Journals (Sweden)

    Söding Johannes

    2008-11-01

    Background: In recent years, methods for remote homology detection have grown more and more sensitive and reliable. Automatic structure prediction servers relying on these methods can generate useful 3D models even below 20% sequence identity between the protein of interest and the known structure (template). When no homologs can be found in the protein structure database (PDB), the user would need to rerun the same search at regular intervals in order to make timely use of a template once it becomes available. Results: PDBalert is a web-based automatic system that sends an email alert as soon as a structure with homology to a protein in the user's watch list is released to the PDB database or appears among the sequences on hold. The mail contains links to the search results and to an automatically generated 3D homology model. The sequence search is performed with the same software as used by the very sensitive and reliable remote homology detection server HHpred, which is based on pairwise comparison of Hidden Markov models. Conclusion: PDBalert will accelerate the information flow from the PDB database to all those who can profit from the newly released protein structures for predicting the 3D structure or function of their proteins of interest.

  15. Develop Infrared Structural Biology for Probing Structural Dynamics of Protein Functions

    Science.gov (United States)

    Xie, Aihua; Kang, Zhouyang; Causey, Oliver; Liu, Charle

    2015-03-01

    Protein functions are carried out through a series of structural transitions. Lack of knowledge on functionally important structural motions of proteins impedes our understanding of protein functions. Infrared structural biology is an emerging technology with powerful applications for protein structural dynamics. One key element of infrared structural biology is the development of vibrational structural marker (VSM) database library that translates infrared spectroscopic signals into specific structural information. We report the development of VSM for probing the type, geometry and strength of hydrogen bonding interactions of buried COO- side chains of Asp and Glu in proteins. Quantum theory based first principle computational studies combined with bioinformatic hydrogen bond analysis are employed in this study. We will discuss the applications of VSM in mechanistic studies of protein functions. Infrared structural biology is expected to emerge as a powerful technique for elucidating the functional mechanism of a broad range of proteins, including water soluble and membrane proteins. This work is supported by OCAST HR10-078 and NSF DBI1338097.

  16. Computing energy landscape maps and structural excursions of proteins.

    Science.gov (United States)

    Sapin, Emmanuel; Carr, Daniel B; De Jong, Kenneth A; Shehu, Amarda

    2016-08-18

    Structural excursions of a protein at equilibrium are key to biomolecular recognition and function modulation. Protein modeling research is driven by the need to aid wet laboratories in characterizing equilibrium protein dynamics. In principle, structural excursions of a protein can be directly observed via simulation of its dynamics, but the disparate temporal scales involved in such excursions make this approach computationally impractical. On the other hand, an informative representation of the structure space available to a protein at equilibrium can be obtained efficiently via stochastic optimization, but this approach does not directly yield information on equilibrium dynamics. We present here a novel methodology that first builds a multi-dimensional map of the energy landscape that underlies the structure space of a given protein and then queries the computed map for energetically-feasible excursions between structures of interest. An evolutionary algorithm builds such maps with a practical computational budget. Graphical techniques analyze a computed multi-dimensional map and expose interesting features of an energy landscape, such as basins and barriers. A path searching algorithm then queries a nearest-neighbor graph representation of a computed map for energetically-feasible basin-to-basin excursions. Evaluation is conducted on intrinsically-dynamic proteins of importance in human biology and disease. Visual statistical analysis of the maps of energy landscapes computed by the proposed methodology reveals features already captured in the wet laboratory, as well as new features indicative of interesting, unknown thermodynamically-stable and semi-stable regions of the equilibrium structure space. Comparison of maps and structural excursions computed by the proposed methodology on sequence variants of a protein sheds light on the role of equilibrium structure and dynamics in the sequence-function relationship. Applications show that the proposed methodology

  17. Relationship between Molecular Structure Characteristics of Feed Proteins and Protein Digestibility and Solubility

    Directory of Open Access Journals (Sweden)

    Mingmei Bai

    2016-08-01

    Full Text Available. The nutritional value of feed proteins and their utilization by livestock are related not only to the chemical composition but also to the structure of feed proteins, but few studies thus far have investigated the relationship between the structure of feed proteins and their solubility as well as digestibility in monogastric animals. To address this question we analyzed soybean meal, fish meal, corn distiller’s dried grains with solubles, corn gluten meal, and feather meal by Fourier transform infrared (FTIR) spectroscopy to determine the protein molecular spectral band characteristics for amides I and II as well as α-helices and β-sheets and their ratios. Protein solubility and in vitro digestibility were measured with the Kjeldahl method using 0.2% KOH solution and the pepsin-pancreatin two-step enzymatic method, respectively. We found that all measured spectral band intensities (height and area) of feed proteins were correlated with their in vitro digestibility and solubility (p≤0.003); moreover, the relative amounts of α-helices and random coils and the α-helix to β-sheet ratio in protein secondary structures were positively correlated with protein in vitro digestibility and solubility (p≤0.004). On the other hand, the percentage of β-sheet structures was negatively correlated with protein in vitro digestibility (p<0.001) and solubility (p = 0.002). These results demonstrate that the molecular structure characteristics of feed proteins are closely related to their in vitro digestibility at 28 h and solubility. Furthermore, the α-helix-to-β-sheet ratio can be used to predict the nutritional value of feed proteins.

  18. Structures and Interactions of Proteins in the Brain

    DEFF Research Database (Denmark)

    Nielsen, Lau Dalby

    coding for Arc protein has been domesticated from the same branch of genes that has given rise to retroviruses. We show that even despite the large evolutionary distance between Arc and retroviruses. Despite large evolutionary distance Arc still self-assembles into higher order structures that resemble......The protein low density lipoprotein receptor related protein 1 (LRP1) plays multiple roles in the biology of amyloid β peptide (Aβ) and Alzheimer’s disease. LRP1 is very important for clearance of Aβ both in the brain and by facilitating Aβ export over the blood brain barrier. In spite...... the primary nucleation is increased. The data furthermore indicate that there is an interaction with the Aβ oligomer state and possibly also the fibrils. Another brain protein is the neuronal protein Activity-regulated cytoskeleton-associated protein (Arc), which is important for learning and memory. The gene...

  19. Perspective: Structural fluctuation of protein and Anfinsen's thermodynamic hypothesis.

    Science.gov (United States)

    Hirata, Fumio; Sugita, Masatake; Yoshida, Masasuke; Akasaka, Kazuyuki

    2018-01-14

    The thermodynamic hypothesis, casually referred to as "Anfinsen's dogma," is described theoretically in terms of a concept of the structural fluctuation of protein, that is, the first moment (average structure) and the second moment (variance and covariance) of the structural distribution. The new theoretical concept views the unfolding and refolding processes of protein as a shift of the structural distribution induced by a thermodynamic perturbation, with the variance-covariance matrix varying. Based on the theoretical concept, a method to characterize the mechanism of folding (or unfolding) is proposed. The transition state, if any, between two stable states is interpreted as a gap in the distribution, which is created due to an extensive reorganization of hydrogen bonds among backbone atoms of protein and with water molecules in the course of conformational change. Further perspectives on applying the theory to computer-aided drug design and to materials science are briefly discussed.

  20. Topological properties of four networks in protein structures

    Science.gov (United States)

    Min, Seungsik; Kim, Kyungsik; Chang, Ki-Ho; Ha, Deok-Ho; Lee, Jun-Ho

    2017-11-01

    In this paper, we investigate the complex networks of interacting amino acids in protein structures. The cellular networks and their random controls are treated for four threshold distances between atoms. The numerical simulation and analysis address the topological properties of the complex networks in the structural classification of proteins, and we mainly estimate the network metrics from the resultant network. The cellular network is shown to exhibit a small-world feature regardless of its structural class. The protein structures show positive assortativity coefficients, where this topological property describes the tendency of high-degree nodes to connect to one another. In particular, we show that both the modularity and the small-worldness increase significantly with the number of nodes.
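
    As an illustration of the kind of residue network described in the abstract above, the following is a minimal sketch, not the authors' code. It assumes one representative point per residue (for example a Cα atom), links residues closer than a chosen threshold distance, and computes the average clustering coefficient, one ingredient of a small-world analysis. The function names, the toy coordinates and the 1.5 threshold are hypothetical.

```python
import math
from itertools import combinations


def contact_network(coords, threshold):
    """Link every pair of points whose Euclidean distance is <= threshold.

    Returns a dict mapping each point index to the set of its neighbours.
    """
    adj = {i: set() for i in range(len(coords))}
    for i, j in combinations(range(len(coords)), 2):
        if math.dist(coords[i], coords[j]) <= threshold:
            adj[i].add(j)
            adj[j].add(i)
    return adj


def average_clustering(adj):
    """Mean local clustering coefficient; nodes with degree < 2 count as 0."""
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue
        # Count edges that exist among the neighbours of this node.
        links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        total += 2.0 * links / (k * (k - 1))
    return total / len(adj) if adj else 0.0


# Hypothetical toy coordinates (not taken from any PDB entry):
# four points on a unit square plus one distant, isolated point.
toy = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0), (10, 10, 10)]
net = contact_network(toy, threshold=1.5)
print(average_clustering(net))
```

    Repeating the calculation for several thresholds, and comparing against a randomized network with the same number of nodes and edges, would mirror the comparison with random controls mentioned in the abstract.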

  1. Structure and Dynamic Properties of Membrane Proteins using NMR

    DEFF Research Database (Denmark)

    Rösner, Heike; Kragelund, Birthe

    2012-01-01

    conformational changes. Their structural and functional decoding is challenging and has imposed demanding experimental development. Solution nuclear magnetic resonance (NMR) spectroscopy is one of the techniques providing the capacity to make a significant difference in the deciphering of the membrane protein...... structure-function paradigm. The method has evolved dramatically during the last decade resulting in a plethora of new experiments leading to a significant increase in the scientific repertoire for studying membrane proteins. Besides solving the three-dimensional structures using state-of-the-art approaches......-populated states, this review seeks to introduce the vast possibilities solution NMR can offer to the study of membrane protein structure-function analyses with special focus on applicability. © 2012 American Physiological Society. Compr Physiol 2:1491-1539, 2012....

  2. Backbone Solution Structures of Proteins Using Residual Dipolar Couplings: Application to a Novel Structural Genomics Target

    Science.gov (United States)

    Valafar, H.; Mayer, K. L.; Bougault, C. M.; LeBlond, P. D.; Jenney, F. E.; Brereton, P. S.; Adams, M.W.W.; Prestegard, J.H.

    2006-01-01

    Structural genomics (or proteomics) activities are critically dependent on the availability of high-throughput structure determination methodology. Development of such methodology has been a particular challenge for NMR based structure determination because of the demands for isotopic labeling of proteins and the requirements for very long data acquisition times. We present here a methodology that gains efficiency from a focus on determination of backbone structures of proteins as opposed to full structures with all side chains in place. This focus is appropriate given the presumption that many protein structures in the future will be built using computational methods that start from representative fold family structures and replace as many as 70% of the side chains in the course of structure determination. The methodology we present is based primarily on residual dipolar couplings (RDCs), readily accessible NMR observables that constrain the orientation of backbone fragments irrespective of separation in space. A new software tool is described for the assembly of backbone fragments under RDC constraints and an application to a structural genomics target is presented. The target is an 8.7 kDa protein from Pyrococcus furiosus, PF1061, that was previously not well annotated, and had a nearest structurally characterized neighbor with only 33% sequence identity. The structure produced shows structural similarity to this sequence homologue, but also shows similarity to other proteins that suggests a functional role in sulfur transfer. Given the backbone structure and a possible functional link this should be an ideal target for development of modeling methods. PMID:15704012

  3. Protein micro-structuring as a tool to texturize protein foods

    NARCIS (Netherlands)

    Purwanti, N.; Peters, J.P.C.M.; Goot, van der A.J.

    2013-01-01

    Structuring protein foods to control the textural properties receives growing attention nowadays. It requires decoupling of the product properties such as water holding capacity and the mechanical properties from the actual protein concentration in the product. From an application point of view,

  4. Tertiary alphabet for the observable protein structural universe.

    Science.gov (United States)

    Mackenzie, Craig O; Zhou, Jianfu; Grigoryan, Gevorg

    2016-11-22

    Here, we systematically decompose the known protein structural universe into its basic elements, which we dub tertiary structural motifs (TERMs). A TERM is a compact backbone fragment that captures the secondary, tertiary, and quaternary environments around a given residue, comprising one or more disjoint segments (three on average). We seek the set of universal TERMs that capture all structure in the Protein Data Bank (PDB), finding remarkable degeneracy. Only ∼600 TERMs are sufficient to describe 50% of the PDB at sub-Angstrom resolution. However, rarer geometries also exist, and the overall structural coverage grows logarithmically with the number of TERMs. We go on to show that universal TERMs provide an effective mapping between sequence and structure. We demonstrate that TERM-based statistics alone are sufficient to recapitulate close-to-native sequences given either NMR or X-ray backbones. Furthermore, sequence variability predicted from TERM data agrees closely with evolutionary variation. Finally, locations of TERMs in protein chains can be predicted from sequence alone based on sequence signatures emergent from TERM instances in the PDB. For multisegment motifs, this method identifies spatially adjacent fragments that are not contiguous in sequence, a major bottleneck in structure prediction. Although all TERMs recur in diverse proteins, some appear specialized for certain functions, such as interface formation, metal coordination, or even water binding. Structural biology has benefited greatly from previously observed degeneracies in structure. The decomposition of the known structural universe into a finite set of compact TERMs offers exciting opportunities toward better understanding, design, and prediction of protein structure.

  5. PDB-UF: database of predicted enzymatic functions for unannotated protein structures from structural genomics

    Directory of Open Access Journals (Sweden)

    Rychlewski Leszek

    2006-02-01

    Full Text Available. Abstract. Background: The number of protein structures from structural genomics centers is increasing dramatically in the Protein Data Bank (PDB). Many of these structures are functionally unannotated because they have no sequence similarity to proteins of known function. However, it is possible to successfully infer function using only structural similarity. Results: Here we present the PDB-UF database, a web-accessible collection of predictions of enzymatic properties using structure-function relationships. The assignments were conducted for three-dimensional protein structures of unknown function that come from structural genomics initiatives. We show that 4 hypothetical proteins (with PDB accession codes 1VH0, 1NS5, 1O6D, and 1TO0), for which standard BLAST tools such as PSI-BLAST or RPS-BLAST failed to assign any function, are probably methyltransferase enzymes. Conclusion: We suggest that the structure-based prediction of an EC number should be conducted with a different similarity score cutoff for different protein folds. Moreover, performing the annotation using two different algorithms can reduce the rate of false positive assignments. We believe that the presented web-based repository will help to decrease the number of protein structures that have functions marked as "unknown" in the PDB file. Availability: http://paradox.harvard.edu/PDB-UF and http://bioinfo.pl/PDB-UF

  6. Protein 3D Structure Computed from Evolutionary Sequence Variation

    Science.gov (United States)

    Sheridan, Robert; Hopf, Thomas A.; Pagnani, Andrea; Zecchina, Riccardo; Sander, Chris

    2011-01-01

    The evolutionary trajectory of a protein through sequence space is constrained by its function. Collections of sequence homologs record the outcomes of millions of evolutionary experiments in which the protein evolves according to these constraints. Deciphering the evolutionary record held in these sequences and exploiting it for predictive and engineering purposes presents a formidable challenge. The potential benefit of solving this challenge is amplified by the advent of inexpensive high-throughput genomic sequencing. In this paper we ask whether we can infer evolutionary constraints from a set of sequence homologs of a protein. The challenge is to distinguish true co-evolution couplings from the noisy set of observed correlations. We address this challenge using a maximum entropy model of the protein sequence, constrained by the statistics of the multiple sequence alignment, to infer residue pair couplings. Surprisingly, we find that the strength of these inferred couplings is an excellent predictor of residue-residue proximity in folded structures. Indeed, the top-scoring residue couplings are sufficiently accurate and well-distributed to define the 3D protein fold with remarkable accuracy. We quantify this observation by computing, from sequence alone, all-atom 3D structures of fifteen test proteins from different fold classes, ranging in size from 50 to 260 residues, including a G-protein coupled receptor. These blinded inferences are de novo, i.e., they do not use homology modeling or sequence-similar fragments from known structures. The co-evolution signals provide sufficient information to determine accurate 3D protein structure to 2.7–4.8 Å Cα-RMSD error relative to the observed structure, over at least two-thirds of the protein (method called EVfold, details at http://EVfold.org). This discovery provides insight into essential interactions constraining protein evolution and will facilitate a comprehensive survey of the universe of protein

  7. Clustering of protein structures using hydrophobic free energy and solvent accessibility of proteins.

    Science.gov (United States)

    Yu, Z G; Anh, V V; Lau, K S; Zhou, L Q

    2006-03-01

    The hydrophobic free energy and solvent accessibility of amino acids are used to study the relationship between the primary structure and structural classification of large proteins. A measure representation and a Z curve representation of protein sequences are proposed. Fractal analysis of the measure and Z curve representations of proteins and multifractal analysis of their hydrophobic free energy and solvent accessibility sequences indicate that the protein sequences possess correlations and multifractal scaling. The parameters from the fractal and multifractal analyses on these sequences are used to construct some parameter spaces. Each protein is represented by a point in these spaces. A method is proposed to distinguish and cluster proteins from the alpha, beta, alpha + beta, and alpha/beta structural classes in these parameter spaces. Fisher's linear discriminant algorithm is used to give a quantitative assessment of our clustering on the selected proteins. Numerical results indicate that the discriminant accuracies are satisfactory. In particular, they reach 94.12% and 88.89% in separating proteins from {alpha, alpha + beta, alpha/beta} proteins in a three-dimensional space.
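
    The abstract above names Fisher's linear discriminant as the tool for assessing how well structural classes separate in a parameter space. The following is a minimal two-class, two-dimensional sketch of that general technique, not the authors' implementation. The data points are invented for illustration, and the midpoint threshold is a common simplifying assumption rather than something stated in the source.

```python
def mean(points):
    """Component-wise mean of a list of 2-D points."""
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(2)]


def scatter(points, m):
    """2x2 within-class scatter matrix around the class mean m."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for p in points:
        d = [p[0] - m[0], p[1] - m[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s


def fisher_discriminant(class_a, class_b):
    """Return (w, threshold) with w = Sw^-1 (mean_b - mean_a).

    The threshold is the projection of the midpoint between the two
    class means (an assumption suited to balanced classes).
    """
    ma, mb = mean(class_a), mean(class_b)
    sa, sb = scatter(class_a, ma), scatter(class_b, mb)
    sw = [[sa[i][j] + sb[i][j] for j in range(2)] for i in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    if det == 0:
        raise ValueError("within-class scatter matrix is singular")
    inv = [[sw[1][1] / det, -sw[0][1] / det],
           [-sw[1][0] / det, sw[0][0] / det]]
    diff = [mb[0] - ma[0], mb[1] - ma[1]]
    w = [inv[0][0] * diff[0] + inv[0][1] * diff[1],
         inv[1][0] * diff[0] + inv[1][1] * diff[1]]
    mid = [(ma[0] + mb[0]) / 2, (ma[1] + mb[1]) / 2]
    return w, w[0] * mid[0] + w[1] * mid[1]


def classify(point, w, threshold):
    """Label a point 'b' if its projection exceeds the threshold, else 'a'."""
    return "b" if w[0] * point[0] + w[1] * point[1] > threshold else "a"


# Invented points standing in for two structural classes in a 2-D parameter space.
class_a = [(1, 2), (2, 1), (2, 3), (3, 2)]
class_b = [(6, 5), (7, 6), (7, 4), (8, 5)]
w, threshold = fisher_discriminant(class_a, class_b)
correct = sum(classify(p, w, threshold) == "a" for p in class_a)
correct += sum(classify(p, w, threshold) == "b" for p in class_b)
accuracy = correct / (len(class_a) + len(class_b))
```

    A discriminant accuracy such as the percentages quoted in the abstract would correspond to the fraction of proteins falling on the correct side of the threshold, computed here as `accuracy` for the toy data.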

  8. Structural and Functional Modeling of Artificial Bioactive Proteins

    Directory of Open Access Journals (Sweden)

    Nikola Štambuk

    2017-03-01

    Full Text Available. A total of 32 synthetic proteins designed by Michael Hecht and co-workers was investigated using standard bioinformatics tools for structure and function modeling. The dataset consisted of 15 artificial α-proteins (Hecht_α) designed to fold into 102-residue four-helix bundles and 17 artificial six-stranded β-sheet proteins (Hecht_β). We compared the experimentally-determined properties of the sequences investigated with the results of computational methods for protein structure and bioactivity prediction. The conclusion reached is that the dataset of Michael Hecht and co-workers could be successfully used both to test current methods and to develop new ones for the characterization of artificially-designed molecules based on the specific binary patterns of amino acid polarity. The comparative investigations of the bioinformatics methods on the datasets of both de novo proteins and natural ones may lead to: (1) improvement of the existing tools for protein structure and function analysis; (2) new algorithms for the construction of de novo protein subsets; and (3) additional information on the complex natural sequence space and its relation to the individual subspaces of de novo sequences. Additional investigations on different and varied datasets are needed to confirm the general applicability of this concept.

  9. Structural Elements Regulating AAA+ Protein Quality Control Machines

    Directory of Open Access Journals (Sweden)

    Chiung-Wen Chang

    2017-05-01

    Full Text Available. Members of the ATPases Associated with various cellular Activities (AAA+) superfamily participate in essential and diverse cellular pathways in all kingdoms of life by harnessing the energy of ATP binding and hydrolysis to drive their biological functions. Although most AAA+ proteins share a ring-shaped architecture, AAA+ proteins have evolved distinct structural elements that are fine-tuned to their specific functions. A central question in the field is how ATP binding and hydrolysis are coupled to substrate translocation through the central channel of ring-forming AAA+ proteins. In this mini-review, we will discuss structural elements present in AAA+ proteins involved in protein quality control, drawing similarities to their known role in substrate interaction by AAA+ proteins involved in DNA translocation. Elements to be discussed include the pore loop-1, the Inter-Subunit Signaling (ISS) motif, and the Pre-Sensor I insert (PS-I) motif. Lastly, we will summarize our current understanding of the inter-relationship of those structural elements and propose a model of how ATP binding and hydrolysis might be coupled to polypeptide translocation in protein quality control machines.

  10. Organization, Structure and Activity of Proteins in Monolayers

    Energy Technology Data Exchange (ETDEWEB)

    Boucher, J.; Trudel, E.; Methot, M.; Desmeules, P.; Salesse, C.

    2007-01-01

    Many different processes take place at the cell membrane interface. For instance, ligands bind membrane proteins which in turn activate peripheral membrane proteins, some of which are enzymes whose action is also located at the membrane interface. Native cell membranes are difficult to use to gain information on the activity of individual proteins at the membrane interface because of the large number of different proteins involved in membranous processes. Model membrane systems, such as monolayers at the air-water interface, have thus been extensively used during the last 50 years to reconstitute proteins and to gain information on their organization, structure and activity in membranes. In the present paper, we review the recent work we have performed with membrane and peripheral proteins as well as enzymes in monolayers at the air-water interface. We show that the structure and orientation of gramicidin have been determined by combining different methods. Furthermore, we demonstrate that the secondary structure of rhodopsin and bacteriorhodopsin is indistinguishable from that in native membranes when appropriate conditions are used. We also show that the kinetics and extent of monolayer binding of myristoylated recoverin is much faster than that of the nonmyristoylated form and that this binding is highly favored by the presence of polyunsaturated phospholipids. Moreover, we show that the use of fragments of RPE65 allows us to determine which region of this protein is most likely involved in membrane binding. Monomolecular films were also used to further understand the hydrolysis of organized phospholipids by phospholipases A2 and C.

  11. Structural determinants of the eosinophil cationic protein antimicrobial activity.

    Science.gov (United States)

    Boix, Ester; Salazar, Vivian A; Torrent, Marc; Pulido, David; Nogués, M Victòria; Moussaoui, Mohammed

    2012-08-01

    Antimicrobial RNases are small cationic proteins belonging to the vertebrate RNase A superfamily and endowed with a wide range of antipathogen activities. Vertebrate RNases, while sharing the active site architecture, are found to display a variety of noncatalytic biological properties, providing an excellent example of multitask proteins. The antibacterial activity of distantly related RNases suggested that the family evolved from an ancestral host-defence function. The review provides a structural insight into antimicrobial RNases, taking as a reference the human RNase 3, also named eosinophil cationic protein (ECP). A particularly high binding affinity for bacterial wall structures mediates the protein action. In particular, the interaction with the lipopolysaccharides at the Gram-negative outer membrane correlates with the protein's antimicrobial and specific cell agglutinating activity. Although a direct mechanical action at the bacterial wall seems to be sufficient to trigger bacterial death, a potential intracellular target cannot be discarded. Indeed, the cationic clusters at the protein surface may serve to interact with both nucleic acids and cell surface heterosaccharides. Sequence determinants for ECP activity were screened by prediction tools, proteolysis and peptide synthesis. Docking results complement the structural analysis to delineate the protein anchoring sites for anionic targets of biological significance.

  12. Models of protein-ligand crystal structures: trust, but verify

    Science.gov (United States)

    Deller, Marc C.; Rupp, Bernhard

    2015-09-01

    X-ray crystallography provides the most accurate models of protein-ligand structures. These models serve as the foundation of many computational methods including structure prediction, molecular modelling, and structure-based drug design. The success of these computational methods ultimately depends on the quality of the underlying protein-ligand models. X-ray crystallography offers the unparalleled advantage of a clear mathematical formalism relating the experimental data to the protein-ligand model. In the case of X-ray crystallography, the primary experimental evidence is the electron density of the molecules forming the crystal. The first step in the generation of an accurate and precise crystallographic model is the interpretation of the electron density of the crystal, typically carried out by construction of an atomic model. The atomic model must then be validated for fit to the experimental electron density and also for agreement with prior expectations of stereochemistry. Stringent validation of protein-ligand models has become possible as a result of the mandatory deposition of primary diffraction data, and many computational tools are now available to aid in the validation process. Validation of protein-ligand complexes has revealed some instances of overenthusiastic interpretation of ligand density. Fundamental concepts and metrics of protein-ligand quality validation are discussed and we highlight software tools to assist in this process. It is essential that end users select high quality protein-ligand models for their computational and biological studies, and we provide an overview of how this can be achieved.

  13. Fluorinated proteins: from design and synthesis to structure and stability.

    Science.gov (United States)

    Marsh, E Neil G

    2014-10-21

    recent detailed thermodynamic and structural studies in our laboratory have uncovered the basis for the remarkably general ability of fluorinated side chains to stabilize protein structure. Crystal structures of α4H and its fluorinated analogues show that the fluorinated residues fit into the hydrophobic core with remarkably little perturbation to the structure. This is explained by the fact that fluorinated side chains, although larger, very closely preserve the shape of the hydrophobic amino acids they replace. Thus, an increase in buried hydrophobic surface area in the folded state is responsible for the additional thermodynamic stability of the fluorinated protein. Measurements of ΔG°, ΔH°, ΔS°, and ΔCp° for unfolding demonstrate that the "fluorous" stabilization of these proteins arises from the hydrophobic effect in the same way that hydrophobic partitioning stabilizes natural proteins.

  14. Dynamic protein interaction networks and new structural paradigms in signaling

    Science.gov (United States)

    Csizmok, Veronika; Follis, Ariele Viacava; Kriwacki, Richard W.; Forman-Kay, Julie D.

    2017-01-01

    Understanding signaling and other complex biological processes requires elucidating the critical roles of intrinsically disordered proteins and regions (IDPs/IDRs), which represent ~30% of the proteome and enable unique regulatory mechanisms. In this review we describe the structural heterogeneity of disordered proteins that underpins these mechanisms and the latest progress in obtaining structural descriptions of ensembles of disordered proteins that are needed for linking structure and dynamics to function. We describe the diverse interactions of IDPs that can have unusual characteristics such as “ultrasensitivity” and “regulated folding and unfolding”. We also summarize the mounting data showing that large-scale assembly and protein phase separation occurs within a variety of signaling complexes and cellular structures. In addition, we discuss efforts to therapeutically target disordered proteins with small molecules. Overall, we interpret the remodeling of disordered state ensembles due to binding and post-translational modifications within an expanded framework for allostery that provides significant insights into how disordered proteins transmit biological information. PMID:26922996

  15. Protein-Protein Interactions: Structurally Conserved Residues Distinguish between Binding Sites and Exposed Protein Surfaces

    National Research Council Canada - National Science Library

    Buyong Ma; Tal Elkayam; Haim Wolfson; Ruth Nussinov

    2003-01-01

    Polar residue hot spots have been observed at protein-protein binding sites. Here we show that hot spots occur predominantly at the interfaces of macromolecular complexes, distinguishing binding sites from the remainder of the surface...

  16. Constraint Logic Programming approach to protein structure prediction

    Directory of Open Access Journals (Sweden)

    Fogolari Federico

    2004-11-01

    Full Text Available. Abstract. Background: The protein structure prediction problem is one of the most challenging problems in biological sciences. Many approaches have been proposed using database information and/or simplified protein models. The protein structure prediction problem can be cast in the form of an optimization problem. Notwithstanding its importance, the problem has very seldom been tackled by Constraint Logic Programming, a declarative programming paradigm suitable for solving combinatorial optimization problems. Results: Constraint Logic Programming techniques have been applied to the protein structure prediction problem on the face-centered cube lattice model. Molecular dynamics techniques, endowed with the notion of constraint, have also been exploited. Even using a very simplified model, Constraint Logic Programming on the face-centered cube lattice model allowed us to obtain acceptable results for a few small proteins. As a test implementation, their (known) secondary structure and the presence of disulfide bridges are used as constraints. Simplified structures obtained in this way have been converted to all-atom models with plausible structure. Results have been compared with a similar approach using a well-established technique such as molecular dynamics. Conclusions: The results obtained on small proteins show that Constraint Logic Programming techniques can be employed for studying simplified protein models, which can be converted into realistic all-atom models. The advantage of Constraint Logic Programming over other, much more explored, methodologies resides in the rapid software prototyping, in the easy way of encoding heuristics, and in exploiting all the advances made in this research area, e.g. in constraint propagation and its use for pruning the huge search space.
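
    To make the lattice setting in the abstract above concrete, here is a small sketch in Python. It is plain backtracking with constraint checks used for pruning, not Constraint Logic Programming and not the authors' code. It assumes the standard face-centered cubic lattice with 12 nearest-neighbour steps, places residues as a self-avoiding walk, and accepts an optional list of residue index pairs that must occupy neighbouring lattice sites (a crude stand-in for a disulfide bridge). All names are hypothetical.

```python
from itertools import product

# The 12 nearest-neighbour steps of the face-centered cubic lattice:
# vectors with two coordinates equal to +/-1 and one coordinate equal to 0.
FCC_STEPS = [v for v in product((-1, 0, 1), repeat=3)
             if sorted(abs(c) for c in v) == [0, 1, 1]]
STEP_SET = set(FCC_STEPS)


def in_contact(p, q):
    """True if lattice points p and q are nearest neighbours."""
    return (q[0] - p[0], q[1] - p[1], q[2] - p[2]) in STEP_SET


def walks(n, contacts=()):
    """Yield self-avoiding walks of n >= 1 residues starting at the origin.

    contacts is an iterable of (i, j) residue index pairs that must end up
    on neighbouring lattice sites. Each pair is checked as soon as the
    later residue is placed, so violating branches are pruned early.
    """
    required = {}
    for i, j in contacts:
        required.setdefault(max(i, j), []).append(min(i, j))

    def extend(path):
        if len(path) == n:
            yield tuple(path)
            return
        k = len(path)  # index of the residue about to be placed
        for s in FCC_STEPS:
            nxt = (path[-1][0] + s[0], path[-1][1] + s[1], path[-1][2] + s[2])
            if nxt in path:
                continue  # self-avoidance
            if any(not in_contact(path[i], nxt) for i in required.get(k, [])):
                continue  # contact constraint violated: prune this branch
            path.append(nxt)
            yield from extend(path)
            path.pop()

    yield from extend([(0, 0, 0)])


unconstrained = sum(1 for _ in walks(4))
bridged = list(walks(4, contacts=[(0, 3)]))
print(unconstrained, len(bridged))
```

    The contact constraint shrinks the set of admissible conformations, which is the effect the abstract attributes to constraint propagation; a real constraint solver would additionally propagate secondary-structure and energy constraints rather than only checking them at placement time.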

  17. Multiple structure single parameter: analysis of a single protein nano environment descriptor characterizing a shared loci on structurally aligned proteins.

    Science.gov (United States)

    Salim, José Augusto; Borro, Luiz; Mazoni, Ivan; Yano, Inácio; Jardine, José G; Neshich, Goran

    2016-06-15

    A graphical representation of physicochemical and structural descriptors attributed to amino acid residues occupying the same topological position in different, structurally aligned proteins can provide a more intuitive way to associate possible functional implications with identified variations in structural characteristics. This could be achieved by observing selected characteristics of amino acids and of their corresponding nano environments, described by the numerical value of the matching descriptor. For this purpose, a web-based tool called multiple structure single parameter (MSSP) was developed and is presented here. MSSP produces a two-dimensional plot of a single protein descriptor for a number of structurally aligned protein chains. From a total of 150 protein descriptors available in MSSP, selected from >1500 parameters stored in the STING database, it is possible to create easily readable and highly informative XY-plots, where the X-axis contains the amino acid position in the multiple structural alignment, and the Y-axis contains the descriptor's numerical values for each aligned structure. To illustrate one of the possible MSSP contributions to the investigation of changes in physicochemical and structural properties of mutants, comparing them with the cognate wild-type structure, the oncogenic mutation M918T in RET kinase is presented. The comparative analysis of wild-type and mutant structures shows great changes in their electrostatic potential. These variations are easily depicted in the MSSP-generated XY-plot. The web server is freely available at http://www.cbi.cnptia.embrapa.br/SMS/STINGm/MPA/index.html. The web server is implemented in Perl, Java and JavaScript, with JMol or Protein Viewer as structure visualizers. Contact: goran.neshich@embrapa.br or gneshich@gmail.com. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  18. Improved T1 mapping by motion correction and template based B1 correction in 3T MRI brain studies

    Science.gov (United States)

    Castro, Marcelo A.; Yao, Jianhua; Lee, Christabel; Pang, Yuxi; Baker, Eva; Butman, John; Thomasson, David

    2009-02-01

    Accurate estimation of relaxation time T1 from MRI images is increasingly important for some clinical applications. Low noise, high resolution, fast and accurate T1 maps from MRI images of the brain can be obtained using a dual flip angle method. However, accuracy is limited by the scanner's ability to deliver the prescribed flip angle due to the B1 inhomogeneity, particularly at high field strengths (e.g. 3T). One of the most accurate methods to correct that inhomogeneity is to acquire a subject-specific B1 map. However, since B1 map acquisition takes up precious scanning time and most retrospective studies do not have a B1 map, it would be desirable to perform that correction from a template. For this work a dual repetition time method was used for B1 map acquisition in five normal subjects. Inaccuracies due to misregistration of acquired T1-weighted images were corrected by rigid registration, and the effects of misalignment were compared to those of B1 inhomogeneity. T1-intensity histograms were produced and three-Gaussian curves were fitted for every fully-, partially- and non-corrected histogram in order to estimate and compare the white and gray matter peaks. In addition, in order to reduce the scanning time we designed a template-based correction strategy. Images from different subjects were aligned using a twelve-parameter affine registration, and B1 maps were aligned according to that transformation. Recomputed T1 maps showed a significant improvement with respect to non-corrected ones. These results are very promising and have the potential for clinical application.
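    To make the role of the B1 correction concrete, the following sketch estimates T1 from two flip angles using the standard steady-state spoiled gradient-echo signal equation and its linearised form. This is a textbook formulation assumed here for illustration; the abstract does not state which equations the authors used, and the function names, the `b1_scale` parameter and the numerical values are all hypothetical.

```python
import math


def spgr_signal(m0, t1, tr, flip_deg):
    """Steady-state spoiled gradient-echo signal for the flip angle actually delivered."""
    a = math.radians(flip_deg)
    e1 = math.exp(-tr / t1)
    return m0 * math.sin(a) * (1 - e1) / (1 - e1 * math.cos(a))


def t1_dual_flip(s1, s2, flip1_deg, flip2_deg, tr, b1_scale=1.0):
    """Estimate T1 from two signals acquired at two nominal flip angles.

    b1_scale is the ratio of delivered to prescribed flip angle (1.0 = no correction).
    """
    a1 = math.radians(flip1_deg * b1_scale)
    a2 = math.radians(flip2_deg * b1_scale)
    # Linearised signal equation: S/sin(a) = E1 * S/tan(a) + M0 * (1 - E1),
    # so the slope through the two points equals E1 = exp(-TR/T1).
    slope = (s2 / math.sin(a2) - s1 / math.sin(a1)) / (
        s2 / math.tan(a2) - s1 / math.tan(a1)
    )
    return -tr / math.log(slope)


if __name__ == "__main__":
    true_t1, tr, b1 = 1000.0, 10.0, 0.9  # ms, ms, delivered/prescribed ratio
    s1 = spgr_signal(1.0, true_t1, tr, 4.0 * b1)
    s2 = spgr_signal(1.0, true_t1, tr, 18.0 * b1)
    print(t1_dual_flip(s1, s2, 4.0, 18.0, tr))               # ignores B1 error
    print(t1_dual_flip(s1, s2, 4.0, 18.0, tr, b1_scale=b1))  # uses B1 map value
```

    In this toy setting, a B1 value taken from a subject-specific or template map plays the role of `b1_scale` at each voxel.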

  19. A Memetic Algorithm for 3-D Protein Structure Prediction Problem.

    Science.gov (United States)

    Correa, Leonardo; Borguesan, Bruno; Farfan, Camilo; Inostroza-Ponta, Mario; Dorn, Marcio

    2016-12-02

    Memetic Algorithms are population-based metaheuristics intrinsically concerned with exploiting all available knowledge about the problem under study. The incorporation of problem domain knowledge is not an optional mechanism, but a fundamental feature of the Memetic Algorithms. In this paper, we present a Memetic Algorithm to tackle the three-dimensional protein structure prediction problem. The method uses a structured population and incorporates a Simulated Annealing algorithm as a local search strategy, as well as ad-hoc crossover and mutation operators to deal with the problem. It takes advantage of structural knowledge stored in the Protein Data Bank, by using an Angle Probability List that helps to reduce the search space and to guide the search strategy. The proposed algorithm was tested on nineteen protein sequences of amino acid residues, and the results show the ability of the algorithm to find native-like protein structures. Experimental results have revealed that the proposed algorithm can find good solutions regarding root-mean-square deviation and global distance total score test in comparison with the experimental protein structures. We also show that our results are comparable in terms of folding organization with state-of-the-art prediction methods, corroborating the effectiveness of our proposal.

  20. Protein-RNA interactions: structural biology and computational modeling techniques.

    Science.gov (United States)

    Jones, Susan

    2016-12-01

    RNA-binding proteins are functionally diverse within cells, being involved in RNA-metabolism, translation, DNA damage repair, and gene regulation at both the transcriptional and post-transcriptional levels. Much has been learnt about their interactions with RNAs through structure determination techniques and computational modeling. This review gives an overview of the structural data currently available for protein-RNA complexes, and discusses the technical issues facing structural biologists working to solve their structures. The review focuses on three techniques used to solve the 3-dimensional structure of protein-RNA complexes at atomic resolution, namely X-ray crystallography, solution nuclear magnetic resonance (NMR) and cryo-electron microscopy (cryo-EM). The review then focuses on the main computational modeling techniques that use these atomic resolution data: discussing the prediction of RNA-binding sites on unbound proteins, docking proteins, and RNAs, and modeling the molecular dynamics of the systems. In conclusion, the review looks at the future directions this field of research might take.

  1. Delineation of protein structure classes from multivariate analysis of protein Raman optical activity data.

    Science.gov (United States)

    Zhu, Fujiang; Tranter, George E; Isaacs, Neil W; Hecht, Lutz; Barron, Laurence D

    2006-10-13

    Vibrational Raman optical activity (ROA), measured as a small difference in the intensity of Raman scattering from chiral molecules in right and left-circularly polarized incident light, or as the intensity of a small circularly polarized component in the scattered light, is a powerful probe of the aqueous solution structure of proteins. On account of the large number of structure-sensitive bands in protein ROA spectra, multivariate analysis techniques such as non-linear mapping (NLM) are especially favourable for determining structural relationships between different proteins. Here NLM is used to map a dataset of 80 polypeptide, protein and virus ROA spectra, considered as points in a multidimensional space with axes representing the digitized wavenumbers, into readily visualizable two and three-dimensional spaces in which points close to or distant from each other, respectively, represent similar or dissimilar structures. Discrete clusters are observed which correspond to the seven structure classes all alpha, mainly alpha, alphabeta, mainly beta, all beta, mainly disordered/irregular and all disordered/irregular. The average standardised ROA spectra of the proteins falling within each structure class have distinct features characteristic of each class. A distinct cluster containing the wheat protein A-gliadin and the plant viruses potato virus X, narcissus mosaic virus, papaya mosaic virus and tobacco rattle virus, all of which appear in the mainly alpha cluster in the two-dimensional representation, becomes clearly separated in the direction of increasing disorder in the three-dimensional representation. This suggests that the corresponding five proteins, none of which to date has yielded high-resolution X-ray structures, consist mainly of alpha-helix and disordered structure with little or no beta-sheet. 
This combination of structural elements may have functional significance, such as facilitating disorder-to-order transitions (and vice versa) and suppressing

  2. Distance matrix-based approach to protein structure prediction.

    Science.gov (United States)

    Kloczkowski, Andrzej; Jernigan, Robert L; Wu, Zhijun; Song, Guang; Yang, Lei; Kolinski, Andrzej; Pokarowski, Piotr

    2009-03-01

    Much structural information is encoded in the internal distances; a distance matrix-based approach can be used to predict protein structure and dynamics, and for structural refinement. Our approach is based on the square distance matrix $D = [r_{ij}^2]$ containing all square distances between residues in proteins. This distance matrix contains more information than the contact matrix $C$, that has elements of either 0 or 1 depending on whether the distance $r_{ij}$ is greater or less than a cutoff value $r_{\text{cutoff}}$. We have performed spectral decomposition of the distance matrices $D = \sum_k \lambda_k v_k v_k^T$, in terms of eigenvalues $\lambda_k$ and the corresponding eigenvectors $v_k$ and found that it contains at most five nonzero terms. A dominant eigenvector is proportional to $r^2$, the square distance of points from the center of mass, with the next three being the principal components of the system of points. By predicting $r^2$ from the sequence we can approximate a distance matrix of a protein with an expected RMSD value of about 7.3 Å, and by combining it with the prediction of the first principal component we can improve this approximation to 4.0 Å. We can also explain the role of hydrophobic interactions for the protein structure, because r is highly correlated with the hydrophobic profile of the sequence. Moreover, r is highly correlated with several sequence profiles which are useful in protein structure prediction, such as contact number, the residue-wise contact order (RWCO) or mean square fluctuations (i.e. crystallographic temperature factors). We have also shown that the next three components are related to spatial directionality of the secondary structure elements, and they may be also predicted from the sequence, improving overall structure prediction. We have also shown that the large number of available HIV-1 protease structures provides a remarkable sampling of conformations, which can be viewed as direct structural information about the
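    The statement that the spectral decomposition has at most five nonzero terms can be checked numerically. Each squared distance expands as |x_i|^2 + |x_j|^2 - 2 x_i·x_j, so for points in three dimensions the square distance matrix is a sum of matrices of rank 1, 1 and at most 3. The sketch below verifies the rank bound with exact rational arithmetic on random integer coordinates; the helper names and the random test points are assumptions made for illustration only.

```python
from fractions import Fraction
import random


def squared_distance_matrix(points):
    """Return D with D[i][j] = squared Euclidean distance between points i and j."""
    return [[sum((a - b) ** 2 for a, b in zip(p, q)) for q in points] for p in points]


def matrix_rank(matrix):
    """Exact rank by Gaussian elimination over the rationals.

    For a symmetric matrix this equals the number of nonzero eigenvalues.
    """
    rows = [[Fraction(v) for v in row] for row in matrix]
    rank = 0
    n_cols = len(rows[0]) if rows else 0
    for col in range(n_cols):
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for r in range(rank + 1, len(rows)):
            factor = rows[r][col] / rows[rank][col]
            if factor != 0:
                rows[r] = [x - factor * y for x, y in zip(rows[r], rows[rank])]
        rank += 1
    return rank


def random_points(n, seed=0):
    """Hypothetical 'residue' coordinates: n random integer points in 3-D."""
    rng = random.Random(seed)
    return [tuple(rng.randint(-20, 20) for _ in range(3)) for _ in range(n)]


if __name__ == "__main__":
    pts = random_points(12)
    print(matrix_rank(squared_distance_matrix(pts)))  # never exceeds 5
```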

  3. Lipid nanotechnologies for structural studies of membrane-associated proteins.

    Science.gov (United States)

    Stoilova-McPhie, Svetla; Grushin, Kirill; Dalm, Daniela; Miller, Jaimy

    2014-11-01

    We present a methodology of lipid nanotubes (LNT) and nanodisks technologies optimized in our laboratory for structural studies of membrane-associated proteins at close to physiological conditions. The application of these lipid nanotechnologies for structure determination by cryo-electron microscopy (cryo-EM) is fundamental for understanding and modulating their function. The LNTs in our studies are single bilayer galactosylceramide based nanotubes of ∼20 nm inner diameter and a few microns in length, that self-assemble in aqueous solutions. The lipid nanodisks (NDs) are self-assembled discoid lipid bilayers of ∼10 nm diameter, which are stabilized in aqueous solutions by a belt of amphipathic helical scaffold proteins. By combining LNT and ND technologies, we can examine structurally how the membrane curvature and lipid composition modulate the function of the membrane-associated proteins. As proof of principle, we have engineered these lipid nanotechnologies to mimic the activated platelet's phosphatidylserine-rich membrane and have successfully assembled functional membrane-bound coagulation factor VIII in vitro for structure determination by cryo-EM. The macromolecular organization of the proteins bound to ND and LNT is further defined by fitting the known atomic structures within the calculated three-dimensional maps. The combination of LNT and ND technologies offers a means to control the design and assembly of a wide range of functional membrane-associated proteins and complexes for structural studies by cryo-EM. The presented results confirm the suitability of the developed methodology for studying the functional structure of membrane-associated proteins, such as the coagulation factors, at a close to physiological environment. © 2014 Wiley Periodicals, Inc.

  4. A resource for benchmarking the usefulness of protein structure models.

    KAUST Repository

    Carbajo, Daniel

    2012-08-02

    BACKGROUND: Increasingly, biologists and biochemists use computational tools to design experiments to probe the function of proteins and/or to engineer them for a variety of different purposes. The most effective strategies rely on the knowledge of the three-dimensional structure of the protein of interest. However, it is often the case that an experimental structure is not available and that models of different quality are used instead. On the other hand, the relationship between the quality of a model and its appropriate use is not easy to derive in general, and so far it has been analyzed in detail only for specific applications. RESULTS: This paper describes a database and related software tools that allow testing of a given structure-based method on models of a protein representing different levels of accuracy. The comparison of the results of a computational experiment on the experimental structure and on a set of its decoy models will allow developers and users to assess which is the specific threshold of accuracy required to perform the task effectively. CONCLUSIONS: The ModelDB server automatically builds decoy models of different accuracy for a given protein of known structure and provides a set of useful tools for their analysis. Pre-computed data for a non-redundant set of deposited protein structures are available for analysis and download in the ModelDB database. IMPLEMENTATION, AVAILABILITY AND REQUIREMENTS: Project name: A resource for benchmarking the usefulness of protein structure models. Project home page: http://bl210.caspur.it/MODEL-DB/MODEL-DB_web/MODindex.php. Operating system(s): Platform independent. Programming language: Perl-BioPerl (program); mySQL, Perl DBI and DBD modules (database); php, JavaScript, Jmol scripting (web server). Other requirements: Java Runtime Environment v1.4 or later, Perl, BioPerl, CPAN modules, HHsearch, Modeller, LGA, NCBI Blast package, DSSP, Speedfill (Surfnet) and PSAIA. License: Free. Any restrictions to use by

  5. Protein-associated water and secondary structure effect removal of blood proteins from metallic substrates.

    Science.gov (United States)

    Anand, Gaurav; Zhang, Fuming; Linhardt, Robert J; Belfort, Georges

    2011-03-01

    Removing adsorbed protein from metals has significant health and industrial consequences. There are numerous protein-adsorption studies using model self-assembled monolayers or polymeric substrates but hardly any high-resolution measurements of adsorption and removal of proteins on industrially relevant transition metals. Surgeons and ship owners desire clean metal surfaces to reduce transmission of disease via surgical instruments and minimize surface fouling (to reduce friction and corrosion), respectively. A major finding of this work is that, besides hydrophobic interaction adhesion energy, water content in an adsorbed protein layer and secondary structure of proteins determined the access and hence ability to remove adsorbed proteins from metal surfaces with a strong alkaline-surfactant solution (NaOH and 5 mg/mL SDS in PBS at pH 11). This is demonstrated with three blood proteins (bovine serum albumin, immunoglobulin, and fibrinogen) and four transition metal substrates and stainless steel (platinum (Pt), gold (Au), tungsten (W), titanium (Ti), and 316 grade stainless steel (SS)). All the metallic substrates were checked for chemical contaminations like carbon and sulfur and were characterized using X-ray photoelectron spectroscopy (XPS). While Pt and Au surfaces were oxide-free (fairly inert elements), W, Ti, and SS substrates were associated with native oxide. Difference measurements between a quartz crystal microbalance with dissipation (QCM-D) and surface plasmon resonance spectroscopy (SPR) provided a measure of the water content in the protein-adsorbed layers. Hydrophobic adhesion forces, obtained with atomic force microscopy, between the proteins and the metals correlated with the amount of the adsorbed protein-water complex. Thus, the amount of protein adsorbed decreased with Pt, Au, W, Ti and SS, in this order. Neither sessile contact angle nor surface roughness of the metal substrates was useful as predictors here. All three globular proteins

  6. From Ramachandran Maps to Tertiary Structures of Proteins.

    Science.gov (United States)

    DasGupta, Debarati; Kaushik, Rahul; Jayaram, B

    2015-08-27

    Sequence to structure of proteins is an unsolved problem. A possible coarse grained resolution to this entails specification of all the torsional (Φ, Ψ) angles along the backbone of the polypeptide chain. The Ramachandran map quite elegantly depicts the allowed conformational (Φ, Ψ) space of proteins which is still very large for the purposes of accurate structure generation. We have divided the allowed (Φ, Ψ) space in Ramachandran maps into 27 distinct conformations sufficient to regenerate a structure to within 5 Å from the native, at least for small proteins, thus reducing the structure prediction problem to a specification of an alphanumeric string, i.e., the amino acid sequence together with one of the 27 conformations preferred by each amino acid residue. This still theoretically results in $27^n$ conformations for a protein comprising "n" amino acids. We then investigated the spatial correlations at the two-residue (dipeptide) and three-residue (tripeptide) levels in what may be described as higher order Ramachandran maps, with the premise that the allowed conformational space starts to shrink as we introduce neighborhood effects. We found, for instance, for a tripeptide which potentially can exist in any of the $27^3$ "allowed" conformations, three-fourths of these conformations are redundant to the 95% confidence level, suggesting sequence context dependent preferred conformations. We then created a look-up table of preferred conformations at the tripeptide level and correlated them with energetically favorable conformations. We found in particular that Boltzmann probabilities calculated from van der Waals energies for each conformation of tripeptides correlate well with the observed populations in the structural database (the average correlation coefficient is ∼0.8). An alpha-numeric string and hence the tertiary structure can be generated for any sequence from the look-up table within minutes on a single processor and to a higher level of accuracy
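    The comparison between Boltzmann probabilities and observed populations described above can be sketched as follows. The energies and counts in the usage block are made-up numbers, the temperature and units are assumptions, and the helper names are hypothetical; only the Boltzmann weighting and the Pearson correlation are standard.

```python
import math

K_B_KCAL = 0.0019872041  # Boltzmann constant in kcal/(mol*K)

# Size of the nominal tripeptide conformational space with 27 states per residue.
N_TRIPEPTIDE_STATES = 27 ** 3


def boltzmann_probabilities(energies, temperature=300.0):
    """Convert energies (kcal/mol) into normalised Boltzmann probabilities."""
    kt = K_B_KCAL * temperature
    e_min = min(energies)  # shift by the minimum for numerical stability
    weights = [math.exp(-(e - e_min) / kt) for e in energies]
    z = sum(weights)
    return [w / z for w in weights]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    energies = [-3.1, -2.4, -1.0, 0.2]  # hypothetical van der Waals energies
    observed = [410, 160, 25, 5]        # hypothetical database counts
    total = sum(observed)
    populations = [c / total for c in observed]
    print(N_TRIPEPTIDE_STATES)
    print(pearson(boltzmann_probabilities(energies), populations))
```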

  7. Structure of mega-hemocyanin reveals protein origami in snails.

    Science.gov (United States)

    Gatsogiannis, Christos; Hofnagel, Oliver; Markl, Jürgen; Raunser, Stefan

    2015-01-06

    Mega-hemocyanin is a 13.5 MDa oxygen transporter found in the hemolymph of some snails. Similar to typical gastropod hemocyanins, it is composed of 400 kDa building blocks but has additional 550 kDa subunits. Together, they form a large, completely filled cylinder. The structural basis for this highly complex protein packing is not known so far. Here, we report the electron cryomicroscopy (cryo-EM) structure of mega-hemocyanin complexes from two different snail species. The structures reveal that mega-hemocyanin is composed of flexible building blocks that differ in their conformation, but not in their primary structure. Like a protein origami, these flexible blocks are optimally packed, implementing different local symmetries and pseudosymmetries. A comparison between the two structures suggests a surprisingly simple evolutionary mechanism leading to these large oxygen transporters. Copyright © 2015 Elsevier Ltd. All rights reserved.

  8. Structural Conservation of the Myoviridae Phage Tail Sheath Protein Fold

    Energy Technology Data Exchange (ETDEWEB)

    Aksyuk, Anastasia A.; Kurochkina, Lidia P.; Fokine, Andrei; Forouhar, Farhad; Mesyanzhinov, Vadim V.; Tong, Liang; Rossmann, Michael G. (SOIBC); (Purdue); (Columbia)

    2012-02-21

    Bacteriophage phiKZ is a giant phage that infects Pseudomonas aeruginosa, a human pathogen. The phiKZ virion consists of a 1450 Å diameter icosahedral head and a 2000 Å-long contractile tail. The structure of the whole virus was previously reported, showing that its tail organization in the extended state is similar to the well-studied Myovirus bacteriophage T4 tail. The crystal structure of a tail sheath protein fragment of phiKZ was determined to 2.4 Å resolution. Furthermore, crystal structures of two prophage tail sheath proteins were determined to 1.9 and 3.3 Å resolution. Despite low sequence identity between these proteins, all of these structures have a similar fold. The crystal structure of the phiKZ tail sheath protein has been fitted into cryo-electron-microscopy reconstructions of the extended tail sheath and of a polysheath. The structural rearrangement of the phiKZ tail sheath contraction was found to be similar to that of phage T4.

  9. Structure-dependent sequence alignment for remotely related proteins.

    Science.gov (United States)

    Yang, An-Suei

    2002-12-01

    The quality of a model structure derived from a comparative modeling procedure is dictated by the accuracy of the predicted sequence-template alignment. As the sequence-template pairs are increasingly remote in sequence relationship, the prediction of the sequence-template alignments becomes increasingly problematic with sequence alignment methods. Structural information of the template, used in connection with the sequence relationship of the sequence-template pair, could significantly improve the accuracy of the sequence-template alignment. In this paper, we describe a sequence-template alignment method that integrates sequence and structural information to enhance the accuracy of sequence-template alignments for distantly related protein pairs. The structure-dependent sequence alignment (SDSA) procedure was optimized for coverage and accuracy on a training set of 412 protein pairs; the structures for each of the training pairs are similar (RMSD [...] SDSA procedure was then applied to extend PSI-BLAST local alignments by calculating the global alignments under the constraint of the residue pairs in the local alignments. This composite alignment procedure was assessed with a testing set of 1421 protein pairs, of which the pair-wise structures are similar (RMSD < approximately 4 Å) but the sequences are marginally related at best in each pair (average pair-wise sequence identity = 13%). The assessment showed that the composite alignment procedure predicted more aligned residue pairs, with an average 27% increase in correctly aligned residues over the standard PSI-BLAST alignments for the protein pairs in the testing set.

  10. [Structural and Functional Studies on Photoactive Retinal Proteins: Light Becomes Drugs with Proteins].

    Science.gov (United States)

    Sudo, Yuki

    2016-01-01

    Retinal proteins possess vitamin A aldehyde (retinal) as a chromophore within seven transmembrane α-helices. Their absorption of visible light triggers trans-cis photoisomerization of the retinal chromophore and induces structural changes in the protein moiety, resulting in a variety of biological functions such as vision, ion transport, and photosensing. Environmental genomics revealed that retinal proteins are widely distributed through all three biological kingdoms, eukarya, bacteria, and archaea, indicating the biological significance of their light energy conversion. In addition to their biological aspect, retinal proteins have become a focus of interest in part because of applications for optogenetics. On the basis of our results and other findings, we highlight the recent progress in structural and functional studies on retinal proteins.

  11. Control of Cellular Structural Networks Through Unstructured Protein Domains

    Science.gov (United States)

    2016-07-01

    transport receptor binding avidity triggers a self-healing collapse transition in FG-nucleoporin molecular brushes. Proc Natl Acad Sci U S A 109...L. & Minor, W. Data mining of metal ion environments present in protein structures. J Inorg Biochem 102, 1765-1776 (2008). 2872550 2. Harding...M.M. The architecture of metal coordination groups in proteins. Acta Crystallogr D Biol Crystallogr 60, 849-859 (2004). 3. Ho, Y., Yang, M., Chen, L

  12. Elasticity, structure, and relaxation of extended proteins under force

    OpenAIRE

    Stirnemann, Guillaume; Giganti, David; Fernandez, Julio M.; Berne, B. J.

    2013-01-01

    Force spectroscopies have emerged as a powerful and unprecedented tool to study and manipulate biomolecules directly at a molecular level. Usually, protein and DNA behavior under force is described within the framework of the worm-like chain (WLC) model for polymer elasticity. Although it has been surprisingly successful for the interpretation of experimental data, especially at high forces, the WLC model lacks structural and dynamical molecular details associated with protein relaxation unde...
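    For orientation, the worm-like chain force-extension relation referred to above is commonly evaluated with the Marko-Siggia interpolation formula, F = (kT/p) [1/(4(1 - x/L)^2) - 1/4 + x/L], where p is the persistence length and L the contour length. The sketch below implements that widely used approximation; it is not taken from the abstract, and the parameter values in the usage block are hypothetical.

```python
KT_PN_NM = 4.114  # thermal energy kT at about 298 K, in pN*nm


def wlc_force(extension, contour_length, persistence_length, kt=KT_PN_NM):
    """Marko-Siggia interpolation for the WLC force (pN) at a given extension (nm)."""
    if not 0 <= extension < contour_length:
        raise ValueError("extension must satisfy 0 <= x < contour length")
    rel = extension / contour_length
    return (kt / persistence_length) * (0.25 / (1.0 - rel) ** 2 - 0.25 + rel)


if __name__ == "__main__":
    # Hypothetical unfolded polypeptide: 100 nm contour length, 0.4 nm persistence length.
    for x in (10.0, 50.0, 80.0, 95.0):
        print(x, round(wlc_force(x, 100.0, 0.4), 2))
```

    The force diverges as the extension approaches the contour length, which is why the model is mostly used to fit the high-force regime.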

  13. A protein structural classes prediction method based on predicted secondary structure and PSI-BLAST profile.

    Science.gov (United States)

    Ding, Shuyan; Li, Yan; Shi, Zhuoxing; Yan, Shoujiang

    2014-02-01

    Knowledge of protein secondary structural classes plays an important role in understanding protein folding patterns. In this paper, 25 features based on position-specific scoring matrices are selected to reflect evolutionary information. In combination with 11 other rational features based on predicted protein secondary structure sequences proposed by previous researchers, a 36-dimensional representation feature vector is presented to predict protein secondary structural classes for low-similarity sequences. The ASTRALtraining dataset is used to train and design our method; three other low-similarity datasets, ASTRALtest, 25PDB and 1189, are used to test the proposed method. Comparisons with other methods show that our method is effective in predicting protein secondary structural classes. A stand-alone version of the proposed method (PSSS-PSSM) is written in MATLAB and can be downloaded from http://letsgob.com/bioinfo_PSSS_PSSM/. Copyright © 2013 Elsevier Masson SAS. All rights reserved.

  14. Filovirus proteins for antiviral drug discovery: Structure/function of proteins involved in assembly and budding.

    Science.gov (United States)

    Martin, Baptiste; Reynard, Olivier; Volchkov, Viktor; Decroly, Etienne

    2018-02-01

    There are no approved medications for the treatment of Marburg or Ebola virus infection. In two previous articles (Martin et al., 2016, Martin et al., 2017), we reviewed the structure/function relationships of the surface glycoprotein and replication proteins to decipher the molecular mechanisms of the filovirus life cycle and identify antiviral strategies. In the present article, we recapitulate knowledge about the viral proteins involved in filovirus assembly and budding. First we describe the structural data available for viral proteins associated with virus assembly and virion egress, and then we integrate the structural features of these proteins in the functional context of the viral replication cycle. Finally, we summarize recent advances in the development of innovative antiviral strategies to target filovirus assembly and egress. The development of such prophylactic or post-exposure treatments could help control future filovirus outbreaks. Copyright © 2018 Elsevier B.V. All rights reserved.

  15. PROGRAM SYSTEM AND INFORMATION METADATA BANK OF TERTIARY PROTEIN STRUCTURES

    Directory of Open Access Journals (Sweden)

    T. A. Nikitin

    2013-01-01

    The article deals with the architecture of a metadata storage model for check results of three-dimensional protein structures. A conceptual database model was built. The service and procedure for database updates, as well as data transformation algorithms for protein structures and their quality, are presented. The most important information about entries and their submission forms for storage, access, and delivery to users is highlighted. A software suite was developed for the implementation of functional tasks using the Java programming language in the NetBeans v.7.0 environment and JQL to query and interact with the JavaDB database. The service was tested, and the results have shown the system's effectiveness in filtering protein structures.

  16. Quantifying side-chain conformational variations in protein structure

    Science.gov (United States)

    Miao, Zhichao; Cao, Yang

    2016-11-01

    Protein side-chain conformation is closely related to biological function. Side-chain prediction is a key step in protein design, protein docking and structure optimization. However, side-chain polymorphism is widespread in proteins, takes various forms, and has long been overlooked by side-chain prediction. Moreover, such conformational variations have not been quantitatively studied, and the correlations between these variations and residue features are vague. Here, we performed statistical analyses on large scale data sets and found that the side-chain conformational flexibility is closely related to the exposure to solvent, degree of freedom and hydrophilicity. These analyses allowed us to quantify different types of side-chain variabilities in PDB. The results underscore that protein side-chain conformation prediction is not a single-answer problem, leading us to reconsider the assessment approaches of side-chain prediction programs.

  17. Structure and Modification of Electrode Materials for Protein Electrochemistry.

    Science.gov (United States)

    Jeuken, Lars J C

    The interactions between proteins and electrode surfaces are of fundamental importance in bioelectrochemistry, including photobioelectrochemistry. In order to optimise the interaction between electrode and redox protein, either the electrode or the protein can be engineered, with the former being the most adopted approach. This tutorial review provides a basic description of the most commonly used electrode materials in bioelectrochemistry and discusses approaches to modify these surfaces. Carbon, gold and transparent electrodes (e.g. indium tin oxide) are covered, while approaches to form meso- and macroporous structured electrodes are also described. Electrode modifications include the chemical modification with (self-assembled) monolayers and the use of conducting polymers in which the protein is imbedded. The proteins themselves can either be in solution, electrostatically adsorbed on the surface or covalently bound to the electrode. Drawbacks and benefits of each material and its modifications are discussed. Where examples exist of applications in photobioelectrochemistry, these are highlighted.

  18. On Ramachandran angles, closed strings and knots in protein structure

    Science.gov (United States)

    Chen, Si; Niemi, Antti J.

    2016-08-01

    The Ramachandran angles (φ, ψ) of a protein backbone form the vertices of a piecewise geodesic curve on the surface of a torus. When the ends of the curve are connected to each other similarly, by a geodesic, the result is a closed string that in general wraps around the torus a number of times both in the meridional and the longitudinal directions. The two wrapping numbers are global characteristics of the protein structure. A statistical analysis of the wrapping numbers in terms of crystallographic x-ray structures in the protein data bank (PDB) reveals that proteins have no net chirality in the φ direction but in the ψ direction, proteins prefer to display chirality. A comparison between the wrapping numbers and the concept of folding index discloses a non-linearity in their relationship. Thus these three integer valued invariants can be used in tandem, to scrutinize and classify the global loop structure of individual PDB proteins, in terms of the overall fold topology.
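
    The wrapping numbers described above can be illustrated with a minimal sketch. It assumes a flat torus on which the geodesic between two consecutive (φ, ψ) vertices is the component-wise shortest angular step; the function names `wrap` and `wrapping_numbers` and the toy angle list are hypothetical and are not taken from the paper.

```python
import math

def wrap(delta):
    """Map an angle difference in radians to the interval [-pi, pi)."""
    return (delta + math.pi) % (2 * math.pi) - math.pi

def wrapping_numbers(angles):
    """Count how often a closed (phi, psi) curve winds around each torus direction.

    angles: list of (phi, psi) pairs in radians, one per residue.
    Assumption: each segment is the shortest angular step in phi and in psi.
    """
    total_phi = 0.0
    total_psi = 0.0
    n = len(angles)
    for i in range(n):
        phi0, psi0 = angles[i]
        phi1, psi1 = angles[(i + 1) % n]  # the last step closes the string
        total_phi += wrap(phi1 - phi0)
        total_psi += wrap(psi1 - psi0)
    return (round(total_phi / (2 * math.pi)),
            round(total_psi / (2 * math.pi)))

# Toy curve: phi advances by +100 degrees per step, psi by -100 degrees.
toy = [(math.radians(100 * k), math.radians(-100 * k)) for k in range(4)]
print(wrapping_numbers(toy))
```

    Because every step is wrapped into [-π, π), the accumulated angle of a closed curve is always a multiple of 2π, so rounding only removes floating-point noise.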

  19. Computational structural analysis: multiple proteins bound to DNA.

    Directory of Open Access Journals (Sweden)

    Andrija Tomovic

    BACKGROUND: With increasing numbers of crystal structures of protein:DNA and protein:protein:DNA complexes publicly available, it is now possible to extract sufficient structural, physical-chemical and thermodynamic parameters to make general observations and predictions about their interactions. In particular, the properties of macromolecular assemblies of multiple proteins bound to DNA have not previously been investigated in detail. METHODOLOGY/PRINCIPAL FINDINGS: We have performed computational structural analyses on macromolecular assemblies of multiple proteins bound to DNA using a variety of different computational tools: PISA; PROMOTIF; X3DNA; ReadOut; DDNA and DCOMPLEX. Additionally, we have developed and employed an algorithm for approximate collision detection and overlapping volume estimation of two macromolecules. An implementation of this algorithm is available at http://promoterplot.fmi.ch/Collision1/. The results obtained are compared with structural, physical-chemical and thermodynamic parameters from protein:protein and single protein:DNA complexes. Many of the interface properties of multiple protein:DNA complexes were found to be very similar to those observed in binary protein:DNA and protein:protein complexes. However, the conformational change of the DNA upon protein binding is significantly higher when multiple proteins bind to it than is observed when single proteins bind. The water mediated contacts are less important (found in less quantity between the interfaces of components) in ternary (protein:protein:DNA) complexes than in those of binary complexes (protein:protein and protein:DNA). The thermodynamic stability of ternary complexes is also higher than in the binary interactions. Greater specificity and affinity of multiple proteins binding to DNA in comparison with binary protein-DNA interactions were observed. However, protein-protein binding affinities are stronger in

  20. Prediction of protein-destabilizing polymorphisms by manual curation with protein structure.

    Directory of Open Access Journals (Sweden)

    Craig Alan Gough

    The relationship between sequence polymorphisms and human disease has been studied mostly in terms of effects of single nucleotide polymorphisms (SNPs) leading to single amino acid substitutions that change protein structure and function. However, less attention has been paid to more drastic sequence polymorphisms which cause premature termination of a protein's sequence or large changes, insertions, or deletions in the sequence. We have analyzed a large set (n = 512) of insertions and deletions (indels) and single nucleotide polymorphisms causing premature termination of translation in disease-related genes. Prediction of protein-destabilization effects was performed by graphical presentation of the locations of polymorphisms in the protein structure, using the Genomes TO Protein (GTOP) database, and manual annotation with a set of specific criteria. Protein-destabilization was predicted for 44.4% of the nonsense SNPs, 32.4% of the frameshifting indels, and 9.1% of the non-frameshifting indels. A prediction of nonsense-mediated decay allowed us to infer which truncated proteins would actually be translated as defective proteins. These cases included the proteins linked to diseases inherited dominantly, suggesting a relation between these diseases and toxic aggregation. Our approach would be useful in identifying potentially aggregation-inducing polymorphisms that may have pathological effects.

  1. Protein-protein complex structure predictions by multimeric threading and template recombination

    Science.gov (United States)

    Mukherjee, Srayanta; Zhang, Yang

    2011-01-01

    Summary: The number of protein-protein complex structures is nearly 6 times smaller than that of tertiary structures in the PDB, which limits the power of homology-based approaches to complex structure modeling. We present a new threading-recombination approach, COTH, to boost the protein complex structure library by combining tertiary structure templates with complex alignments. The query sequences are first aligned to complex templates using a modified dynamic programming algorithm, guided by ab initio binding-site predictions. The monomer alignments are then shifted to the multimeric template framework by structural alignments. COTH was tested on 500 non-homologous dimeric proteins and successfully detected correct templates for half of the cases after homologous templates were excluded, significantly outperforming conventional homology modeling algorithms. It also shows a higher accuracy in interface modeling than rigid-body docking of unbound structures from ZDOCK, although with lower coverage. These data demonstrate new avenues to model complex structures from non-homologous templates. PMID:21742262

  2. Interactions of hepatotoxic agents with proteins and subcellular structures.

    Science.gov (United States)

    Faulstich, H

    1980-01-01

    Two proteins with high affinity for amatoxins have been characterized in calf thymus nuclei, the RNA-polymerase II (or B) and a 100 K protein of unknown function. Most of the toxic effects of amatoxins are based on the inhibited synthesis of mRNA. The 100 K protein may be involved in functions of cytokinesis as suggested by experiments with PtK1 cells and a fluorescent labelled amatoxin. The molecular toxicity of phallotoxins can be understood in terms of their affinity for actin. By interaction with rabbit muscle actin the concentration of actin monomers is decreased. In hepatocytes, the phallotoxins change the structure of the microfilamentous web.

  3. Understanding the structural ensembles of a highly extended disordered protein.

    Science.gov (United States)

    Daughdrill, Gary W; Kashtanov, Stepan; Stancik, Amber; Hill, Shannon E; Helms, Gregory; Muschol, Martin; Receveur-Bréchot, Véronique; Ytreberg, F Marty

    2012-01-01

    Developing a comprehensive description of the equilibrium structural ensembles for intrinsically disordered proteins (IDPs) is essential to understanding their function. The p53 transactivation domain (p53TAD) is an IDP that interacts with multiple protein partners and contains numerous phosphorylation sites. Multiple techniques were used to investigate the equilibrium structural ensemble of p53TAD in its native and chemically unfolded states. The results from these experiments show that the native state of p53TAD has dimensions similar to a classical random coil while the chemically unfolded state is more extended. To investigate the molecular properties responsible for this behavior, a novel algorithm that generates diverse and unbiased structural ensembles of IDPs was developed. This algorithm was used to generate a large pool of plausible p53TAD structures that were reweighted to identify a subset of structures with the best fit to small angle X-ray scattering data. High weight structures in the native state ensemble show features that are localized to protein binding sites and regions with high proline content. The features localized to the protein binding sites are mostly eliminated in the chemically unfolded ensemble; while, the regions with high proline content remain relatively unaffected. Data from NMR experiments support these results, showing that residues from the protein binding sites experience larger environmental changes upon unfolding by urea than regions with high proline content. This behavior is consistent with the urea-induced exposure of nonpolar and aromatic side-chains in the protein binding sites that are partially excluded from solvent in the native state ensemble.

  4. Critical Features of Fragment Libraries for Protein Structure Prediction

    Science.gov (United States)

    dos Santos, Karina Baptista

    2017-01-01

    The use of fragment libraries is a popular approach among protein structure prediction methods and has proven to substantially improve the quality of predicted structures. However, some vital aspects of a fragment library that influence the accuracy of modeling a native structure remain to be determined. This study investigates some of these features. Particularly, we analyze the effect of using secondary structure prediction guiding fragments selection, different fragments sizes and the effect of structural clustering of fragments within libraries. To have a clearer view of how these factors affect protein structure prediction, we isolated the process of model building by fragment assembly from some common limitations associated with prediction methods, e.g., imprecise energy functions and optimization algorithms, by employing an exact structure-based objective function under a greedy algorithm. Our results indicate that shorter fragments reproduce the native structure more accurately than the longer. Libraries composed of multiple fragment lengths generate even better structures, where longer fragments show to be more useful at the beginning of the simulations. The use of many different fragment sizes shows little improvement when compared to predictions carried out with libraries that comprise only three different fragment sizes. Models obtained from libraries built using only sequence similarity are, on average, better than those built with a secondary structure prediction bias. However, we found that the use of secondary structure prediction allows greater reduction of the search space, which is invaluable for prediction methods. The results of this study can be critical guidelines for the use of fragment libraries in protein structure prediction. PMID:28085928

  5. Critical Features of Fragment Libraries for Protein Structure Prediction.

    Science.gov (United States)

    Trevizani, Raphael; Custódio, Fábio Lima; Dos Santos, Karina Baptista; Dardenne, Laurent Emmanuel

    2017-01-01

    The use of fragment libraries is a popular approach among protein structure prediction methods and has proven to substantially improve the quality of predicted structures. However, some vital aspects of a fragment library that influence the accuracy of modeling a native structure remain to be determined. This study investigates some of these features. Particularly, we analyze the effect of using secondary structure prediction guiding fragments selection, different fragments sizes and the effect of structural clustering of fragments within libraries. To have a clearer view of how these factors affect protein structure prediction, we isolated the process of model building by fragment assembly from some common limitations associated with prediction methods, e.g., imprecise energy functions and optimization algorithms, by employing an exact structure-based objective function under a greedy algorithm. Our results indicate that shorter fragments reproduce the native structure more accurately than the longer. Libraries composed of multiple fragment lengths generate even better structures, where longer fragments show to be more useful at the beginning of the simulations. The use of many different fragment sizes shows little improvement when compared to predictions carried out with libraries that comprise only three different fragment sizes. Models obtained from libraries built using only sequence similarity are, on average, better than those built with a secondary structure prediction bias. However, we found that the use of secondary structure prediction allows greater reduction of the search space, which is invaluable for prediction methods. The results of this study can be critical guidelines for the use of fragment libraries in protein structure prediction.

  6. Pre-fusion structure of a human coronavirus spike protein.

    Science.gov (United States)

    Kirchdoerfer, Robert N; Cottrell, Christopher A; Wang, Nianshuang; Pallesen, Jesper; Yassine, Hadi M; Turner, Hannah L; Corbett, Kizzmekia S; Graham, Barney S; McLellan, Jason S; Ward, Andrew B

    2016-03-03

    HKU1 is a human betacoronavirus that causes mild yet prevalent respiratory disease, and is related to the zoonotic SARS and MERS betacoronaviruses, which have high fatality rates and pandemic potential. Cell tropism and host range is determined in part by the coronavirus spike (S) protein, which binds cellular receptors and mediates membrane fusion. As the largest known class I fusion protein, its size and extensive glycosylation have hindered structural studies of the full ectodomain, thus preventing a molecular understanding of its function and limiting development of effective interventions. Here we present the 4.0 Å resolution structure of the trimeric HKU1 S protein determined using single-particle cryo-electron microscopy. In the pre-fusion conformation, the receptor-binding subunits, S1, rest above the fusion-mediating subunits, S2, preventing their conformational rearrangement. Surprisingly, the S1 C-terminal domains are interdigitated and form extensive quaternary interactions that occlude surfaces known in other coronaviruses to bind protein receptors. These features, along with the location of the two protease sites known to be important for coronavirus entry, provide a structural basis to support a model of membrane fusion mediated by progressive S protein destabilization through receptor binding and proteolytic cleavage. These studies should also serve as a foundation for the structure-based design of betacoronavirus vaccine immunogens.

  7. RACK1, A Multifaceted Scaffolding Protein: Structure and Function

    LENUS (Irish Health Repository)

    Adams, David R

    2011-10-06

    The Receptor for Activated C Kinase 1 (RACK1) is a member of the tryptophan-aspartate repeat (WD-repeat) family of proteins and shares significant homology to the β subunit of G-proteins (Gβ). RACK1 adopts a seven-bladed β-propeller structure which facilitates protein binding. RACK1 has a significant role to play in shuttling proteins around the cell, anchoring proteins at particular locations and in stabilising protein activity. It interacts with the ribosomal machinery, with several cell surface receptors and with proteins in the nucleus. As a result, RACK1 is a key mediator of various pathways and contributes to numerous aspects of cellular function. Here, we discuss RACK1 gene and structure and its role in specific signaling pathways, and address how posttranslational modifications facilitate subcellular location and translocation of RACK1. This review condenses several recent studies suggesting a role for RACK1 in physiological processes such as development, cell migration, central nervous system (CNS) function and circadian rhythm as well as reviewing the role of RACK1 in disease.

  8. Protein Function Prediction Based on Sequence and Structure Information

    KAUST Repository

    Smaili, Fatima Z.

    2016-05-25

    The number of available protein sequences in public databases is increasing exponentially. However, a significant fraction of these sequences lack functional annotation which is essential to our understanding of how biological systems and processes operate. In this master thesis project, we worked on inferring protein functions based on the primary protein sequence. In the approach we follow, 3D models are first constructed using I-TASSER. Functions are then deduced by structurally matching these predicted models, using global and local similarities, through three independent enzyme commission (EC) and gene ontology (GO) function libraries. The method was tested on 250 “hard” proteins, which lack homologous templates in both structure and function libraries. The results show that this method outperforms the conventional prediction methods based on sequence similarity or threading. Additionally, our method could be improved even further by incorporating protein-protein interaction information. Overall, the method we use provides an efficient approach for automated functional annotation of non-homologous proteins, starting from their sequence.

  9. The structure and function of G-protein-coupled receptors

    DEFF Research Database (Denmark)

    Rosenbaum, Daniel M; Rasmussen, Søren Gøgsig Faarup; Kobilka, Brian K

    2009-01-01

    G-protein-coupled receptors (GPCRs) mediate most of our physiological responses to hormones, neurotransmitters and environmental stimulants, and so have great potential as therapeutic targets for a broad spectrum of diseases. They are also fascinating molecules from the perspective of membrane-protein structure and biology. Great progress has been made over the past three decades in understanding diverse GPCRs, from pharmacology to functional characterization in vivo. Recent high-resolution structural studies have provided insights into the molecular mechanisms of GPCR activation and constitutive...

  10. Optimizing an empirical scoring function for transmembrane protein structure determination.

    Energy Technology Data Exchange (ETDEWEB)

    Young, Malin M.; Sale, Kenneth L.; Gray, Genetha Anne; Kolda, Tamara Gibson

    2003-10-01

    We examine the problem of transmembrane protein structure determination. Like many other questions that arise in biological research, this problem cannot be addressed by traditional laboratory experimentation alone. An approach that integrates experiment and computation is required. We investigate a procedure which states the transmembrane protein structure determination problem as a bound constrained optimization problem using a special empirical scoring function, called Bundler, as the objective function. In this paper, we describe the optimization problem and some of its mathematical properties. We compare and contrast results obtained using two different derivative free optimization algorithms.

  11. PASSML: combining evolutionary inference and protein secondary structure prediction.

    Science.gov (United States)

    Liò, P; Goldman, N; Thorne, J L; Jones, D T

    1998-01-01

    Evolutionary models of amino acid sequences can be adapted to incorporate structure information; protein structure biologists can use phylogenetic relationships among species to improve prediction accuracy. Results: A computer program called PASSML ('Phylogeny and Secondary Structure using Maximum Likelihood') has been developed to implement an evolutionary model that combines protein secondary structure and amino acid replacement. The model is related to that of Dayhoff and co-workers, but we distinguish eight categories of structural environment: alpha helix, beta sheet, turn and coil, each further classified according to solvent accessibility, i.e. buried or exposed. The model of sequence evolution for each of the eight categories is a Markov process with discrete states in continuous time, and the organization of structure along protein sequences is described by a hidden Markov model. This paper describes the PASSML software and illustrates how it allows both the reconstruction of phylogenies and prediction of secondary structure from aligned amino acid sequences. PASSML 'ANSI C' source code and the example data sets described here are available at http://ng-dec1.gen.cam.ac.uk/hmm/Passml.html and 'downstream' Web pages. Contact: P.Lio@gen.cam.ac.uk

  12. Protein structural perturbation and aggregation on homogeneous surfaces.

    Science.gov (United States)

    Sethuraman, Ananthakrishnan; Belfort, Georges

    2005-02-01

    We have demonstrated that globular proteins, such as hen egg lysozyme in phosphate buffered saline at room temperature, lose native structural stability and activity when adsorbed onto well-defined homogeneous solid surfaces. This structural loss is evident as an alpha-helix to turns/random transition during the first 30 min, followed by a slow alpha-helix to beta-sheet transition. Increase in intramolecular and intermolecular beta-sheet content suggests conformational rearrangement and aggregation between different protein molecules, respectively. Amide I band attenuated total reflection/Fourier transformed infrared (ATR/FTIR) spectroscopy was used to quantify the secondary structure content of lysozyme adsorbed on six different self-assembled alkanethiol monolayer surfaces with -CH3, -OPh, -CF3, -CN, -OCH3, and -OH exposed functional end groups. Activity measurements of adsorbed lysozyme were in good agreement with the structural perturbations. Both surface chemistry (type of functional groups, wettability) and adsorbate concentration (i.e., lateral interactions) are responsible for the observed structural changes during adsorption. A kinetic model is proposed to describe secondary structural changes that occur in two dynamic phases. The results presented in this article demonstrate the utility of the ATR/FTIR spectroscopic technique for in situ characterization of protein secondary structures during adsorption on flat surfaces.

  13. Improving the accuracy of protein secondary structure prediction using structural alignment

    Directory of Open Access Journals (Sweden)

    Gallin Warren J

    2006-06-01

    Background: The accuracy of protein secondary structure prediction has steadily improved over the past 30 years. Now many secondary structure prediction methods routinely achieve an accuracy (Q3) of about 75%. We believe this accuracy could be further improved by including structure (as opposed to sequence) database comparisons as part of the prediction process. Indeed, given the large size of the Protein Data Bank (>35,000 sequences), the probability of a newly identified sequence having a structural homologue is actually quite high. Results: We have developed a method that performs structure-based sequence alignments as part of the secondary structure prediction process. By mapping the structure of a known homologue (sequence ID >25%) onto the query protein's sequence, it is possible to predict at least a portion of that query protein's secondary structure. By integrating this structural alignment approach with conventional (sequence-based) secondary structure methods and then combining it with a "jury-of-experts" system to generate a consensus result, it is possible to attain very high prediction accuracy. Using a sequence-unique test set of 1644 proteins from EVA, this new method achieves an average Q3 score of 81.3%. Extensive testing indicates this is approximately 4–5% better than any other method currently available. Assessments using non sequence-unique test sets (typical of those used in proteome annotation or structural genomics) indicate that this new method can achieve a Q3 score approaching 88%. Conclusion: By using both sequence and structure databases and by exploiting the latest techniques in machine learning it is possible to routinely predict protein secondary structure with an accuracy well above 80%. A program and web server, called PROTEUS, that performs these secondary structure predictions is accessible at http://wishart.biology.ualberta.ca/proteus. For high throughput or batch sequence analyses, the PROTEUS programs
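
    For readers unfamiliar with the Q3 measure quoted in this record, the following is a minimal sketch of how a three-state accuracy is commonly computed: the percentage of residues whose predicted state (helix H, strand E, coil C) matches the observed state. The function name `q3_score` and the toy strings are hypothetical; this is not the PROTEUS implementation.

```python
def q3_score(predicted, observed):
    """Return the percentage of residues with a correctly predicted state.

    predicted, observed: equal-length strings over the alphabet H, E, C.
    """
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("sequences must not be empty")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)

# Toy example: 12 residues, 2 of them mispredicted.
observed = "HHHHEEEECCCC"
predicted = "HHHCEEEECCCH"
print(f"Q3 = {q3_score(predicted, observed):.1f}%")
```

    Averaging this per-protein value over a test set gives the kind of mean Q3 figure reported in the abstract.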

  14. Efficient Multicriteria Protein Structure Comparison on Modern Processor Architectures.

    Science.gov (United States)

    Sharma, Anuj; Manolakos, Elias S

    2015-01-01

    Fast increasing computational demand for all-to-all protein structures comparison (PSC) is a result of three confounding factors: rapidly expanding structural proteomics databases, high computational complexity of pairwise protein comparison algorithms, and the trend in the domain towards using multiple criteria for protein structures comparison (MCPSC) and combining results. We have developed a software framework that exploits many-core and multicore CPUs to implement efficient parallel MCPSC in modern processors based on three popular PSC methods, namely, TMalign, CE, and USM. We evaluate and compare the performance and efficiency of the two parallel MCPSC implementations using Intel's experimental many-core Single-Chip Cloud Computer (SCC) as well as Intel's Core i7 multicore processor. We show that the 48-core SCC is more efficient than the latest generation Core i7, achieving a speedup factor of 42 (efficiency of 0.9), making many-core processors an exciting emerging technology for large-scale structural proteomics. We compare and contrast the performance of the two processors on several datasets and also show that MCPSC outperforms its component methods in grouping related domains, achieving a high F-measure of 0.91 on the benchmark CK34 dataset. The software implementation for protein structure comparison using the three methods and combined MCPSC, along with the developed underlying rckskel algorithmic skeletons library, is available via GitHub.
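
    The relation between the speedup and efficiency figures quoted above can be checked with a short sketch, assuming the usual definition of parallel efficiency as speedup divided by the number of cores. The function name `parallel_efficiency` is hypothetical.

```python
def parallel_efficiency(speedup, cores):
    """Parallel efficiency under the standard definition: speedup / cores."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    return speedup / cores

# Figures quoted in the abstract: speedup of 42 on the 48-core SCC.
eff = parallel_efficiency(42, 48)
print(f"efficiency = {eff:.3f}")
```

    Under this definition the quoted figures give 42/48 = 0.875, which agrees with the reported efficiency of 0.9 after rounding to one decimal place.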

  15. Efficient Multicriteria Protein Structure Comparison on Modern Processor Architectures

    Science.gov (United States)

    Manolakos, Elias S.

    2015-01-01

    Fast increasing computational demand for all-to-all protein structures comparison (PSC) is a result of three confounding factors: rapidly expanding structural proteomics databases, high computational complexity of pairwise protein comparison algorithms, and the trend in the domain towards using multiple criteria for protein structures comparison (MCPSC) and combining results. We have developed a software framework that exploits many-core and multicore CPUs to implement efficient parallel MCPSC in modern processors based on three popular PSC methods, namely, TMalign, CE, and USM. We evaluate and compare the performance and efficiency of the two parallel MCPSC implementations using Intel's experimental many-core Single-Chip Cloud Computer (SCC) as well as Intel's Core i7 multicore processor. We show that the 48-core SCC is more efficient than the latest generation Core i7, achieving a speedup factor of 42 (efficiency of 0.9), making many-core processors an exciting emerging technology for large-scale structural proteomics. We compare and contrast the performance of the two processors on several datasets and also show that MCPSC outperforms its component methods in grouping related domains, achieving a high F-measure of 0.91 on the benchmark CK34 dataset. The software implementation for protein structure comparison using the three methods and combined MCPSC, along with the developed underlying rckskel algorithmic skeletons library, is available via GitHub. PMID:26605332

  16. The Semantics of the Modular Architecture of Protein Structures.

    Science.gov (United States)

    Hleap, Jose Sergio; Blouin, Christian

    2016-01-01

    Protein structures can be conceptualized as context-aware self-organizing systems. One of their emerging properties is a modular architecture. Such modular architecture has been identified as domains and defined as its units of evolution and function. However, this modular architecture is not exclusively defined by domains. Also, the definition of a domain is an ongoing debate. Here we propose differentiating structural, evolutionary and functional domains as distinct concepts. Defining domains or modules is confounded by diverse definitions of the concept, and also by other elements inherent to protein structures. An apparent hierarchy in protein structure architecture is one of these elements, where lower level interactions may create noise for the definition of higher levels. Diverse modularity-molding factors such as folding, function, and selection, can have a misleading effect when trying to define a given type of module. It is thus important to keep this complexity in mind when defining modularity in protein structures and interpreting the outcome of modularity inference approaches.

  17. Improving classification in protein structure databases using text mining

    Directory of Open Access Journals (Sweden)

    Jones David T

    2009-05-01

    Background: The classification of protein domains in the CATH resource is primarily based on structural comparisons, sequence similarity and manual analysis. One of the main bottlenecks in the processing of new entries is the evaluation of 'borderline' cases by human curators with reference to the literature, and better tools for helping both expert and non-expert users quickly identify relevant functional information from text are urgently needed. A text based method for protein classification is presented, which complements the existing sequence and structure-based approaches, especially in cases exhibiting low similarity to existing members and requiring manual intervention. The method is based on the assumption that textual similarity between sets of documents relating to proteins reflects biological function similarities and can be exploited to make classification decisions. Results: An optimal strategy for the text comparisons was identified by using an established gold standard enzyme dataset. Filtering of the abstracts using a machine learning approach to discriminate sentences containing functional, structural and classification information that are relevant to the protein classification task improved performance. Testing this classification scheme on a dataset of 'borderline' protein domains that lack significant sequence or structure similarity to classified proteins showed that although, as expected, the structural similarity classifiers perform better on average, there is a significant benefit in incorporating text similarity in logistic regression models, indicating significant orthogonality in this additional information. Coverage was significantly increased especially at low error rates, which is important for routine classification tasks: 15.3% for the combined structure and text classifier compared to 10% for the structural classifier alone, at 10^-3 error rate. Finally when only the highest scoring predictions were used

  18. Structure and Protein-Protein Interaction Studies on Chlamydia trachomatis Protein CT670 (YscO Homolog)

    Energy Technology Data Exchange (ETDEWEB)

    Lorenzini, Emily; Singer, Alexander; Singh, Bhag; Lam, Robert; Skarina, Tatiana; Chirgadze, Nickolay Y.; Savchenko, Alexei; Gupta, Radhey S. (Toronto); (McMaster U.); (OCI)

    2010-07-28

    Comparative genomic studies have identified many proteins that are found only in various Chlamydiae species and exhibit no significant sequence similarity to any protein in organisms that do not belong to this group. The CT670 protein of Chlamydia trachomatis is one of the proteins whose genes are in one of the type III secretion gene clusters but whose cellular functions are not known. CT670 shares several characteristics with the YscO protein of Yersinia pestis, including the neighboring genes, size, charge, and secondary structure, but the structures and/or functions of these proteins remain to be determined. Although a BLAST search with CT670 did not identify YscO as a related protein, our analysis indicated that these two proteins exhibit significant sequence similarity. In this paper, we report that the CT670 crystal, solved at a resolution of 2 Å, consists of a single coiled coil containing just two long helices. Gel filtration and analytical ultracentrifugation studies showed that in solution CT670 exists in both monomeric and dimeric forms and that the monomer predominates at lower protein concentrations. We examined the interaction of CT670 with many type III secretion system-related proteins (viz., CT091, CT665, CT666, CT667, CT668, CT669, CT671, CT672, and CT673) by performing bacterial two-hybrid assays. In these experiments, CT670 was found to interact only with the CT671 protein (YscP homolog), whose gene is immediately downstream of ct670. A specific interaction between CT670 and CT671 was also observed when affinity chromatography pull-down experiments were performed. These results suggest that CT670 and CT671 are putative homologs of the YscO and YscP proteins, respectively, and that they likely form a chaperone-effector pair.

  19. Protein Primary Structure of the Vaccinia Virion at Increased Resolution

    Science.gov (United States)

    Ngo, Tuan; Mirzakhanyan, Yeva; Moussatche, Nissin; Gershon, Paul David

    2016-11-01

    Here we examine the protein covalent structure of the vaccinia virus virion. Within two virion preparations, >88% of the theoretical vaccinia virus-encoded proteome was detected with high confidence, including the first detection of products from 27 open reading frames (ORFs) previously designated "predicted," "uncharacterized," "inferred," or "hypothetical" polypeptides containing as few as 39 amino acids (aa) and six proteins whose detection required nontryptic proteolysis. We also detected the expression of four short ORFs, each of which was located within an ORF ("ORF-within-ORF"), including one not previously recognized or known to be expressed. Using quantitative mass spectrometry (MS), between 58 and 74 proteins were determined to be packaged. A total of 63 host proteins were also identified as candidates for packaging. Evidence is provided that some portion of virion proteins are "nicked" via a combination of endoproteolysis and concerted exoproteolysis in a manner, and at sites, independent of virus origin or laboratory procedures. The size of the characterized virion phosphoproteome was doubled from 189 (J. Matson, W. Chou, T. Ngo, and P. D. Gershon, Virology 452-453:310-323, 2014, doi:http://dx.doi.org/10.1016/j.virol.2014.01.012) to 396 confident, unique phosphorylation sites, 268 of which were within the packaged proteome. This included the unambiguous identification of phosphorylation "hot spots" within virion proteins. Using isotopically enriched ATP, 23 sites of intravirion kinase phosphorylation were detected within nine virion proteins, all at sites already partially occupied within the virion preparations. The clear phosphorylation of proteins RAP94 and RP19 was consistent with the roles of these proteins in intravirion early gene transcription. In a blind search for protein modifications, cysteine glutathionylation and O-linked glycosylation featured prominently. We provide evidence for the phosphoglycosylation of vaccinia virus proteins

  20. Predicting and validating protein interactions using network structure.

    Directory of Open Access Journals (Sweden)

    Pao-Yang Chen

    2008-07-01

    Full Text Available Protein interactions play a vital part in the function of a cell. As experimental techniques for detection and validation of protein interactions are time consuming, there is a need for computational methods for this task. Protein interactions appear to form a network with a relatively high degree of local clustering. In this paper we exploit this clustering by suggesting a score based on triplets of observed protein interactions. The score utilises both protein characteristics and network properties. Our score based on triplets is shown to complement existing techniques for predicting protein interactions, outperforming them on data sets which display a high degree of clustering. The predicted interactions score highly against test measures for accuracy. Compared to a similar score derived from pairwise interactions only, the triplet score displays higher sensitivity and specificity. By looking at specific examples, we show how an experimental set of interactions can be enriched and validated. As part of this work we also examine the effect of different prior databases upon the accuracy of prediction and find that the interactions from the same kingdom give better results than from across kingdoms, suggesting that there may be fundamental differences between the networks. These results all emphasize that network structure is important and helps in the accurate prediction of protein interactions. The protein interaction data set and the program used in our analysis, and a list of predictions and validations, are available at http://www.stats.ox.ac.uk/bioinfo/resources/PredictingInteractions.
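
    The idea of exploiting local clustering can be illustrated with a much simpler stand-in than the authors' triplet score: for each unobserved pair of proteins, count the partners they share, since every shared partner would close a triangle if the pair did interact. The sketch below is only that simplified count. It does not use protein characteristics or prior databases, and the protein names and function names are hypothetical.

```python
from itertools import combinations


def build_adjacency(edges):
    """Map each protein to the set of its observed interaction partners."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def shared_partner_score(adjacency, a, b):
    """Count proteins observed to interact with both a and b.

    Each shared partner c means the observed pairs (a, c) and (b, c)
    would form a closed triangle if a and b also interacted.
    """
    return len(adjacency.get(a, set()) & adjacency.get(b, set()))


def rank_candidates(edges):
    """Score every unobserved pair and sort by descending score."""
    adjacency = build_adjacency(edges)
    scored = []
    for a, b in combinations(sorted(adjacency), 2):
        if b in adjacency[a]:
            continue  # already observed, nothing to predict
        score = shared_partner_score(adjacency, a, b)
        if score > 0:
            scored.append((score, a, b))
    return sorted(scored, key=lambda item: (-item[0], item[1], item[2]))


# Hypothetical toy network, not data from the paper.
observed = [
    ("P1", "P2"), ("P1", "P3"), ("P2", "P3"),
    ("P2", "P4"), ("P3", "P4"), ("P4", "P5"),
]
ranking = rank_candidates(observed)
for score, a, b in ranking:
    print(a, b, score)
```

    In this toy network the pair P1/P4 ranks first because it shares two partners (P2 and P3), which is the kind of densely clustered neighbourhood where a clustering-based score is expected to help most.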

  1. How round is a protein? Exploring protein structures for globularity using conformal mapping.

    Directory of Open Access Journals (Sweden)

    Joel Hass

    2014-12-01

    Full Text Available We present a new algorithm that automatically computes a measure of the geometric difference between the surface of a protein and a round sphere. The algorithm takes as input two triangulated genus zero surfaces representing the protein and the round sphere, respectively, and constructs a discrete conformal map between these surfaces. The conformal map is chosen to minimize a symmetric elastic energy that measures the distance of the constructed conformal map from an isometry. We illustrate our approach on a set of basic sample problems and then on a dataset of diverse protein structures. We show first that the symmetric elastic energy is able to quantify the roundness of the Platonic solids and that for these surfaces it replicates well traditional measures of roundness such as the sphericity. We then demonstrate that the symmetric elastic energy captures both global and local differences between two surfaces, showing that our method identifies the presence of protruding regions in protein structures and quantifies how these regions make the shape of a protein deviate from globularity. Based on these results, we show that the symmetric elastic energy serves as a probe of the limits of the application of conformal mappings to parametrize protein shapes. We identify limitations of the method and discuss its extension to achieving automatic registration of protein structures based on their surface geometry.
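
    The traditional sphericity that the abstract uses as a reference has a closed form, psi = pi^(1/3) * (6V)^(2/3) / A, where V is the volume and A the surface area; it equals 1 for a sphere and is smaller for every other shape. The sketch below evaluates it for the five Platonic solids with unit edge length. It does not compute the symmetric elastic energy of the paper, and the function and variable names are illustrative.

```python
import math


def sphericity(volume, area):
    """Ratio of the area of a sphere with the same volume to the actual area."""
    return math.pi ** (1.0 / 3.0) * (6.0 * volume) ** (2.0 / 3.0) / area


# Volume and surface area of the Platonic solids with edge length 1.
sqrt2, sqrt3, sqrt5 = math.sqrt(2), math.sqrt(3), math.sqrt(5)
platonic = {
    "tetrahedron": (1.0 / (6.0 * sqrt2), sqrt3),
    "cube": (1.0, 6.0),
    "octahedron": (sqrt2 / 3.0, 2.0 * sqrt3),
    "dodecahedron": ((15.0 + 7.0 * sqrt5) / 4.0, 3.0 * math.sqrt(25.0 + 10.0 * sqrt5)),
    "icosahedron": (5.0 * (3.0 + sqrt5) / 12.0, 5.0 * sqrt3),
}

values = {name: sphericity(v, a) for name, (v, a) in platonic.items()}
for name, psi in values.items():
    print(f"{name:13s} {psi:.3f}")

# Sanity check: a sphere of radius 2 should give exactly 1 (up to rounding).
r = 2.0
sphere_psi = sphericity(4.0 / 3.0 * math.pi * r ** 3, 4.0 * math.pi * r ** 2)
```

    The values increase from the tetrahedron to the icosahedron, which is the ordering any roundness measure would be expected to reproduce on these test shapes.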

  2. Prediction of protein-protein interactions in dengue virus coat proteins guided by low resolution cryoEM structures

    Directory of Open Access Journals (Sweden)

    Srinivasan Narayanaswamy

    2010-06-01

    Full Text Available Abstract Background Dengue virus along with the other members of the flaviviridae family has reemerged as deadly human pathogens. Understanding the mechanistic details of these infections can be highly rewarding in developing effective antivirals. During maturation of the virus inside the host cell, the coat proteins E and M undergo conformational changes, altering the morphology of the viral coat. However, due to low resolution nature of the available 3-D structures of viral assemblies, the atomic details of these changes are still elusive. Results In the present analysis, starting from Cα positions of low resolution cryo electron microscopic structures the residue level details of protein-protein interaction interfaces of dengue virus coat proteins have been predicted. By comparing the preexisting structures of virus in different phases of life cycle, the changes taking place in these predicted protein-protein interaction interfaces were followed as a function of maturation process of the virus. Besides changing the current notion about the presence of only homodimers in the mature viral coat, the present analysis indicated presence of a proline-rich motif at the protein-protein interaction interface of the coat protein. Investigating the conservation status of these seemingly functionally crucial residues across other members of flaviviridae family enabled dissecting common mechanisms used for infections by these viruses. Conclusions Thus, using computational approach the present analysis has provided better insights into the preexisting low resolution structures of virus assemblies, the findings of which can be made use of in designing effective antivirals against these deadly human pathogens.

  3. Covalent bond symmetry breaking and protein secondary structure

    OpenAIRE

    Lundgren, Martin; Niemi, Antti J.

    2011-01-01

    Both symmetry and organized breaking of symmetry have a pivotal role in our understanding of structure and pattern formation in physical systems, including the origin of mass in the Universe and the chiral structure of biological macromolecules. Here we report on a new symmetry breaking phenomenon that takes place in all biologically active proteins, thus this symmetry breaking relates to the inception of life. The unbroken symmetry determines the covalent bond geometry of a sp3 hybridized ...

  4. Systematic comparison of crystal and NMR protein structures deposited in the protein data bank.

    Science.gov (United States)

    Sikic, Kresimir; Tomic, Sanja; Carugo, Oliviero

    2010-09-03

    Nearly all the macromolecular three-dimensional structures deposited in Protein Data Bank were determined by either crystallographic (X-ray) or Nuclear Magnetic Resonance (NMR) spectroscopic methods. This paper reports a systematic comparison of the crystallographic and NMR results deposited in the files of the Protein Data Bank, in order to find out to what extent this information can be aggregated in bioinformatics. A non-redundant data set containing 109 NMR - X-ray structure pairs of nearly identical proteins was derived from the Protein Data Bank. A series of comparisons were performed by focusing the attention towards both global features and local details. It was observed that: (1) the RMSD values between NMR and crystal structures range from about 1.5 Å to about 2.5 Å; (2) the correlation between conformational deviations and residue type reveals that hydrophobic amino acids are more similar in crystal and NMR structures than hydrophilic amino acids; (3) the correlation between solvent accessibility of the residues and their conformational variability in solid state and in solution is relatively modest (correlation coefficient = 0.462); (4) beta strands on average match better between NMR and crystal structures than helices and loops; (5) conformational differences between loops are independent of crystal packing interactions in the solid state; (6) very seldom, side chains buried in the protein interior are observed to adopt different orientations in the solid state and in solution.
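
    The root-mean-square deviation reported in point (1) is the square root of the mean squared distance between equivalent atoms in two structures. The sketch below is a minimal illustration that assumes the two coordinate sets are already superposed and listed in the same atom order; it omits the optimal superposition step that a real comparison would perform first, and the coordinates are made up for the example.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD in the input units between two equally ordered coordinate lists.

    Assumes the structures are already superposed; no fitting is done here.
    """
    if not coords_a or len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Hypothetical C-alpha positions in angstroms: every atom shifted by 2 along x.
crystal = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
nmr = [(2.0, 0.0, 0.0), (5.8, 0.0, 0.0), (9.6, 0.0, 0.0)]
value = rmsd(crystal, nmr)
print(f"RMSD = {value:.2f} A")
```

    Because no superposition is applied, a pure translation such as the one in this example shows up as a non-zero RMSD; with fitting it would vanish, which is why the superposition step matters in practice.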

  5. The Structure and Function of Non-Collagenous Bone Proteins

    Science.gov (United States)

    Hook, Magnus

    1997-01-01

    The long-term goal for this program is to determine the structural and functional relationships of bone proteins and proteins that interact with bone. This information will be used to design useful pharmacological compounds that will have a beneficial effect in osteoporotic patients and in the osteoporotic-like effects experienced on long duration space missions. The first phase of this program, funded under a cooperative research agreement with NASA through the Texas Medical Center, aimed to develop powerful recombinant expression systems and purification methods for production of large amounts of target proteins. Proteins expressed in sufficient amount and purity would be characterized by a variety of structural methods, and made available for crystallization studies. In order to increase the likelihood of crystallization and subsequent high resolution solution of structures, we undertook to develop expression of normal and mutant forms of proteins by bacterial and mammalian cells. In addition to the main goals of this program, we would also be able to provide reagents for other related studies, including development of anti-fibrotic and anti-metastatic therapeutics.

  6. Protein Flexibility Facilitates Quaternary Structure Assembly and Evolution

    Science.gov (United States)

    Marsh, Joseph A.; Teichmann, Sarah A.

    2014-01-01

    The intrinsic flexibility of proteins allows them to undergo large conformational fluctuations in solution or upon interaction with other molecules. Proteins also commonly assemble into complexes with diverse quaternary structure arrangements. Here we investigate how the flexibility of individual protein chains influences the assembly and evolution of protein complexes. We find that flexibility appears to be particularly conducive to the formation of heterologous (i.e., asymmetric) intersubunit interfaces. This leads to a strong association between subunit flexibility and homomeric complexes with cyclic and asymmetric quaternary structure topologies. Similarly, we also observe that the more nonhomologous subunits that assemble together within a complex, the more flexible those subunits tend to be. Importantly, these findings suggest that subunit flexibility should be closely related to the evolutionary history of a complex. We confirm this by showing that evolutionarily more recent subunits are generally more flexible than evolutionarily older subunits. Finally, we investigate the very different explorations of quaternary structure space that have occurred in different evolutionary lineages. In particular, the increased flexibility of eukaryotic proteins appears to enable the assembly of heteromeric complexes with more unique components. PMID:24866000

  7. Structure and Pathology of Tau Protein in Alzheimer Disease

    Directory of Open Access Journals (Sweden)

    Michala Kolarova

    2012-01-01

    Full Text Available Alzheimer's disease (AD) is the most common type of dementia. In connection with the global trend of prolonging human life and the increasing number of elderly in the population, AD is becoming one of the most serious health and socioeconomic problems of the present. Tau protein promotes assembly and stabilizes microtubules, which contributes to the proper function of neurons. Alterations in the amount or the structure of tau protein can affect its role as a stabilizer of microtubules as well as some of the processes in which it is implicated. The molecular mechanisms governing tau aggregation are mainly represented by several posttranslational modifications that alter its structure and conformational state. Hence, abnormal phosphorylation and truncation of tau protein have gained attention as key mechanisms that turn tau protein into a pathological entity. Evidence of the clinicopathological significance of phosphorylated and truncated tau has been documented during the progression of AD, as has their capacity to exert cytotoxicity when expressed in cell and animal models. This paper describes the normal structure and function of tau protein and its major alterations during its pathological aggregation in AD.

  8. The challenge of protein structure determination—lessons from structural genomics

    Science.gov (United States)

    Slabinski, Lukasz; Jaroszewski, Lukasz; Rodrigues, Ana P.C.; Rychlewski, Leszek; Wilson, Ian A.; Lesley, Scott A.; Godzik, Adam

    2007-01-01

    The process of experimental determination of protein structure is marred with a high ratio of failures at many stages. With availability of large quantities of data from high-throughput structure determination in structural genomics centers, we can now learn to recognize protein features correlated with failures; thus, we can recognize proteins more likely to succeed and eventually learn how to modify those that are less likely to succeed. Here, we identify several protein features that correlate strongly with successful protein production and crystallization and combine them into a single score that assesses “crystallization feasibility.” The formula derived here was tested with a jackknife procedure and validated on independent benchmark sets. The “crystallization feasibility” score described here is being applied to target selection in the Joint Center for Structural Genomics, and is now contributing to increasing the success rate, lowering the costs, and shortening the time for protein structure determination. Analyses of PDB depositions suggest that very similar features also play a role in non-high-throughput structure determination, suggesting that this crystallization feasibility score would also be of significant interest to structural biology, as well as to molecular and biochemistry laboratories. PMID:17962404

  9. Protein flexibility: coordinate uncertainties and interpretation of structural differences

    Energy Technology Data Exchange (ETDEWEB)

    Rashin, Alexander A., E-mail: alexander-rashin@hotmail.com [BioChemComp Inc., 543 Sagamore Avenue, Teaneck, NJ 07666 (United States); LH Baker Center for Bioinformatics and Department of Biochemistry, Biophysics and Molecular Biology, 112 Office and Lab Building, Iowa State University, Ames, IA 50011-3020 (United States); Rashin, Abraham H. L. [BioChemComp Inc., 543 Sagamore Avenue, Teaneck, NJ 07666 (United States); Rutgers, The State University of New Jersey, 22371 BPO WAY, Piscataway, NJ 08854-8123 (United States); Jernigan, Robert L. [LH Baker Center for Bioinformatics and Department of Biochemistry, Biophysics and Molecular Biology, 112 Office and Lab Building, Iowa State University, Ames, IA 50011-3020 (United States); BioChemComp Inc., 543 Sagamore Avenue, Teaneck, NJ 07666 (United States)

    2009-11-01

    Criteria for the interpretability of coordinate differences and a new method for identifying rigid-body motions and nonrigid deformations in protein conformational changes are developed and applied to functionally induced and crystallization-induced conformational changes. Valid interpretations of conformational movements in protein structures determined by X-ray crystallography require that the movement magnitudes exceed their uncertainty threshold. Here, it is shown that such thresholds can be obtained from the distance difference matrices (DDMs) of 1014 pairs of independently determined structures of bovine ribonuclease A and sperm whale myoglobin, with no explanations provided for reportedly minor coordinate differences. The smallest magnitudes of reportedly functional motions are just above these thresholds. Uncertainty thresholds can provide objective criteria that distinguish between true conformational changes and apparent ‘noise’, showing that some previous interpretations of protein coordinate changes attributed to external conditions or mutations may be doubtful or erroneous. The use of uncertainty thresholds, DDMs, the newly introduced CDDMs (contact distance difference matrices) and a novel simple rotation algorithm allows a more meaningful classification and description of protein motions, distinguishing between various rigid-fragment motions and nonrigid conformational deformations. It is also shown that half of 75 pairs of identical molecules, each from the same asymmetric crystallographic cell, exhibit coordinate differences that range from just outside the coordinate uncertainty threshold to the full magnitude of large functional movements. Thus, crystallization might often induce protein conformational changes that are comparable to those related to or induced by the protein function.
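
    A distance difference matrix (DDM) compares two conformations without superposing them: compute all pairwise atom distances in each structure, subtract one matrix from the other, and treat only entries whose magnitude exceeds an uncertainty threshold as real changes. The sketch below shows that basic construction. The threshold value and the coordinates are hypothetical; the paper derives its thresholds from independently determined structure pairs, and the CDDMs and rotation algorithm it introduces are not reproduced here.

```python
import math


def distance_matrix(coords):
    """All pairwise Euclidean distances within one conformation."""
    n = len(coords)
    return [[math.dist(coords[i], coords[j]) for j in range(n)] for i in range(n)]


def distance_difference_matrix(coords_a, coords_b):
    """Entry (i, j) is d_ij in conformation B minus d_ij in conformation A."""
    if len(coords_a) != len(coords_b):
        raise ValueError("conformations must contain the same atoms")
    da, db = distance_matrix(coords_a), distance_matrix(coords_b)
    n = len(coords_a)
    return [[db[i][j] - da[i][j] for j in range(n)] for i in range(n)]


def significant_pairs(ddm, threshold):
    """Atom pairs whose distance change exceeds the uncertainty threshold."""
    n = len(ddm)
    return [
        (i, j, ddm[i][j])
        for i in range(n)
        for j in range(i + 1, n)
        if abs(ddm[i][j]) > threshold
    ]


# Hypothetical three-atom example in angstroms: atom 2 swings out of line.
state_a = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (6.0, 0.0, 0.0)]
state_b = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (3.0, 4.0, 0.0)]

ddm = distance_difference_matrix(state_a, state_b)
moved = significant_pairs(ddm, threshold=0.5)  # 0.5 is an assumed value
for i, j, delta in moved:
    print(f"atoms {i}-{j}: {delta:+.2f} A")
```

    The distance between atoms 0 and 1 is unchanged, so that pair is reported as rigid, while both pairs involving atom 2 exceed the assumed threshold.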

  10. Trimeric structure for an essential protein in L1 retrotransposition.

    Science.gov (United States)

    Martin, Sandra L; Branciforte, Dan; Keller, David; Bain, David L

    2003-11-25

    Two proteins are encoded by the mammalian retrotransposon long interspersed nuclear element 1 (LINE-1 or L1); both are essential for retrotransposition. The function of the protein encoded by the 5'-most ORF, ORF1p, is incompletely understood, although the ORF1p from mouse L1 is known to bind single-stranded nucleic acids and function as a nucleic acid chaperone. ORF1p self-associates by means of a long coiled-coil domain in the N-terminal region of the protein, and the basic, C-terminal region (C-1/3 domain) contains the nucleic acid binding activity. The full-length and C-1/3 domains of ORF1p were purified to near homogeneity then analyzed by gel filtration chromatography and analytical ultracentrifugation. Both proteins were structurally homogeneous and asymmetric in solution, with the full-length version forming a stable trimer and the C-1/3 domain remaining a monomer. Examination of the full-length protein by atomic force microscopy revealed an asymmetric dumbbell shape, congruent with the chromatography and ultracentrifugation results. These structural features are compatible with the nucleic acid binding and chaperone activities of L1 ORF1p and offer further insight into the functions of this unique protein during LINE-1 retrotransposition.

  11. PCI-SS: MISO dynamic nonlinear protein secondary structure prediction

    Directory of Open Access Journals (Sweden)

    Aboul-Magd Mohammed O

    2009-07-01

    Full Text Available Abstract Background Since the function of a protein is largely dictated by its three dimensional configuration, determining a protein's structure is of fundamental importance to biology. Here we report on a novel approach to determining the one dimensional secondary structure of proteins (distinguishing α-helices, β-strands, and non-regular structures) from primary sequence data which makes use of Parallel Cascade Identification (PCI), a powerful technique from the field of nonlinear system identification. Results Using PSI-BLAST divergent evolutionary profiles as input data, dynamic nonlinear systems are built through a black-box approach to model the process of protein folding. Genetic algorithms (GAs) are applied in order to optimize the architectural parameters of the PCI models. The three-state prediction problem is broken down into a combination of three binary sub-problems and protein structure classifiers are built using 2 layers of PCI classifiers. Careful construction of the optimization, training, and test datasets ensures that no homology exists between any training and testing data. A detailed comparison between PCI and 9 contemporary methods is provided over a set of 125 new protein chains guaranteed to be dissimilar to all training data. Unlike other secondary structure prediction methods, here a web service is developed to provide both human- and machine-readable interfaces to PCI-based protein secondary structure prediction. This server, called PCI-SS, is available at http://bioinf.sce.carleton.ca/PCISS. In addition to a dynamic PHP-generated web interface for humans, a Simple Object Access Protocol (SOAP) interface is added to permit invocation of the PCI-SS service remotely. This machine-readable interface facilitates incorporation of PCI-SS into multi-faceted systems biology analysis pipelines requiring protein secondary structure information, and greatly simplifies high-throughput analyses. XML is used to represent the input

  12. Identification of similar regions of protein structures using integrated sequence and structure analysis tools

    Directory of Open Access Journals (Sweden)

    Heiland Randy

    2006-03-01

    Full Text Available Abstract Background Understanding protein function from its structure is a challenging problem. Sequence based approaches for finding homology have broad use for annotation of both structure and function. 3D structural information of protein domains and their interactions provide a complementary view to structure function relationships to sequence information. We have developed a web site http://www.sblest.org/ and an API of web services that enables users to submit protein structures and identify statistically significant neighbors and the underlying structural environments that make that match using a suite of sequence and structure analysis tools. To do this, we have integrated S-BLEST, PSI-BLAST and HMMer based superfamily predictions to give a unique integrated view to prediction of SCOP superfamilies, EC number, and GO term, as well as identification of the protein structural environments that are associated with that prediction. Additionally, we have extended UCSF Chimera and PyMOL to support our web services, so that users can characterize their own proteins of interest. Results Users are able to submit their own queries or use a structure already in the PDB. Currently the databases that a user can query include the popular structural datasets ASTRAL 40 v1.69, ASTRAL 95 v1.69, CLUSTER50, CLUSTER70 and CLUSTER90 and PDBSELECT25. The results can be downloaded directly from the site and include function prediction, analysis of the most conserved environments and automated annotation of query proteins. These results reflect both the hits found with PSI-BLAST, HMMer and with S-BLEST. We have evaluated how well annotation transfer can be performed on SCOP ID's, Gene Ontology (GO) ID's and EC Numbers. The method is very efficient and totally automated, generally taking around fifteen minutes for a 400 residue protein. Conclusion With structural genomics initiatives determining structures with little, if any, functional characterization

  13. Identification of similar regions of protein structures using integrated sequence and structure analysis tools.

    Science.gov (United States)

    Peters, Brandon; Moad, Charles; Youn, Eunseog; Buffington, Kris; Heiland, Randy; Mooney, Sean

    2006-03-09

    Understanding protein function from its structure is a challenging problem. Sequence based approaches for finding homology have broad use for annotation of both structure and function. 3D structural information of protein domains and their interactions provide a complementary view to structure function relationships to sequence information. We have developed a web site http://www.sblest.org/ and an API of web services that enables users to submit protein structures and identify statistically significant neighbors and the underlying structural environments that make that match using a suite of sequence and structure analysis tools. To do this, we have integrated S-BLEST, PSI-BLAST and HMMer based superfamily predictions to give a unique integrated view to prediction of SCOP superfamilies, EC number, and GO term, as well as identification of the protein structural environments that are associated with that prediction. Additionally, we have extended UCSF Chimera and PyMOL to support our web services, so that users can characterize their own proteins of interest. Users are able to submit their own queries or use a structure already in the PDB. Currently the databases that a user can query include the popular structural datasets ASTRAL 40 v1.69, ASTRAL 95 v1.69, CLUSTER50, CLUSTER70 and CLUSTER90 and PDBSELECT25. The results can be downloaded directly from the site and include function prediction, analysis of the most conserved environments and automated annotation of query proteins. These results reflect both the hits found with PSI-BLAST, HMMer and with S-BLEST. We have evaluated how well annotation transfer can be performed on SCOP ID's, Gene Ontology (GO) ID's and EC Numbers. The method is very efficient and totally automated, generally taking around fifteen minutes for a 400 residue protein. With structural genomics initiatives determining structures with little, if any, functional characterization, development of protein structure and function analysis tools are a

  14. Structure of haze forming proteins in white wines: Vitis vinifera thaumatin-like proteins.

    Directory of Open Access Journals (Sweden)

    Matteo Marangon

    Full Text Available Grape thaumatin-like proteins (TLPs) play roles in plant-pathogen interactions and can cause protein haze in white wine unless removed prior to bottling. Different isoforms of TLPs have different hazing potential and aggregation behavior. Here we present the elucidation of the molecular structures of three grape TLPs that display different hazing potential. The three TLPs have very similar structures despite belonging to two different classes (F2/4JRU is a thaumatin-like protein while I/4L5H and H2/4MBT are VVTL1), and having different unfolding temperatures (56 vs. 62°C), with protein F2/4JRU being heat unstable and forming haze, while I/4L5H does not. These differences in properties are attributable to the conformation of a single loop and the amino acid composition of its flanking regions.

  15. Protein solution photomodification analysis by means of craquelure structures

    Science.gov (United States)

    Malov, Alexander N.; Neupokoeva, Anna V.; Morozov, Alexey N.; Timoshenko, Elena A.

    2016-11-01

    The craquelure structure of a protein film is discussed as an indicator of macromolecule state. Craquelure is a network of fine cracks or crackles on the surface of a painting, caused chiefly by shrinkage of the paint film or varnish. The action of laser radiation in the red and green spectral regions on the craquelure structure of a protein film is considered, using albumin as an example. It is shown experimentally that after the protein layer dries, a craquelure pattern (a variety of cracks in the layer) forms whose parameters are strongly modified by the laser action and depend on the exposure time (energy density). The threshold energy of the laser action is determined; it does not depend significantly on wavelength.

  16. Folding of a large protein at high structural resolution.

    Science.gov (United States)

    Walters, Benjamin T; Mayne, Leland; Hinshaw, James R; Sosnick, Tobin R; Englander, S Walter

    2013-11-19

    Kinetic folding of the large two-domain maltose binding protein (MBP; 370 residues) was studied at high structural resolution by an advanced hydrogen-exchange pulse-labeling mass-spectrometry method (HX MS). Dilution into folding conditions initiates a fast molecular collapse into a polyglobular conformation (rest of the folding process. It contains the sites of three previously reported destabilizing mutations that greatly slow folding. These results indicate that the intermediate is an obligatory step on the MBP folding pathway. MBP then folds to the native state on a longer time scale (~100 s), suggestively in more than one step, the first of which forms structure adjacent to the 7-s intermediate. These results add a large protein to the list of proteins known to fold through distinct native-like intermediates in distinct pathways.

  17. Iron Sulfur Proteins and their Synthetic Analogues: Structure ...

    Indian Academy of Sciences (India)

    Resonance – Journal of Science Education. Iron Sulfur Proteins and their Synthetic Analogues: Structure, Reactivity and Redox Properties. B N Anand. General Article, Volume 3, Issue 6, June 1998, pp 52-61 ...

  18. Evaluation of Software for Introducing Protein Structure: Visualization and Simulation

    Science.gov (United States)

    White, Brian; Kahriman, Azmin; Luberice, Lois; Idleh, Farhia

    2010-01-01

    Communicating an understanding of the forces and factors that determine a protein's structure is an important goal of many biology and biochemistry courses at a variety of levels. Many educators use computer software that allows visualization of these complex molecules for this purpose. Although visualization is in wide use and has been associated…

  19. Correlated mutations in protein sequences: Phylogenetic and structural effects

    Energy Technology Data Exchange (ETDEWEB)

    Lapedes, A.S. [Los Alamos National Lab., NM (United States). Theoretical Div.]|[Santa Fe Inst., NM (United States); Giraud, B.G. [C.E.N. Saclay, Gif/Yvette (France). Service Physique Theorique; Liu, L.C. [Los Alamos National Lab., NM (United States). Theoretical Div.; Stormo, G.D. [Univ. of Colorado, Boulder, CO (United States). Dept. of Molecular, Cellular and Developmental Biology

    1998-12-01

    Covariation analysis of sets of aligned sequences for RNA molecules is relatively successful in elucidating RNA secondary structure, as well as some aspects of tertiary structure. Covariation analysis of sets of aligned sequences for protein molecules is successful in certain instances in elucidating certain structural and functional links, but in general, pairs of sites displaying highly covarying mutations in protein sequences do not necessarily correspond to sites that are spatially close in the protein structure. In this paper the authors identify two reasons why naive use of covariation analysis for protein sequences fails to reliably indicate sequence positions that are spatially proximate. The first reason involves the bias introduced in calculation of covariation measures due to the fact that biological sequences are generally related by a non-trivial phylogenetic tree. The authors present a null-model approach to solve this problem. The second reason involves linked chains of covariation which can result in pairs of sites displaying significant covariation even though they are not spatially proximate. They present a maximum entropy solution to this classic problem of causation versus correlation. The methodologies are validated in simulation.
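
    The abstract does not name the covariation measure it starts from; mutual information between alignment columns is one common choice and is used below purely as an assumption. The sketch computes it for a toy alignment and shows the naive form only: it applies neither the phylogenetic null model nor the maximum entropy correction that the authors propose, and the sequences are invented for the example.

```python
import math
from collections import Counter


def column(alignment, index):
    """Residues found at one position across all aligned sequences."""
    return [sequence[index] for sequence in alignment]


def mutual_information(col_i, col_j):
    """Mutual information in bits between two alignment columns."""
    if len(col_i) != len(col_j) or not col_i:
        raise ValueError("columns must be non-empty and of equal length")
    n = len(col_i)
    counts_i = Counter(col_i)
    counts_j = Counter(col_j)
    counts_ij = Counter(zip(col_i, col_j))
    mi = 0.0
    for (a, b), count in counts_ij.items():
        p_ab = count / n
        p_a = counts_i[a] / n
        p_b = counts_j[b] / n
        mi += p_ab * math.log2(p_ab / (p_a * p_b))
    return mi


# Hypothetical alignment: positions 0 and 1 always change together,
# position 2 is fully conserved.
alignment = ["AKL", "AKL", "DEL", "DEL"]

mi_01 = mutual_information(column(alignment, 0), column(alignment, 1))
mi_02 = mutual_information(column(alignment, 0), column(alignment, 2))
print(f"MI(0,1) = {mi_01:.2f} bits, MI(0,2) = {mi_02:.2f} bits")
```

    The toy alignment also hints at the first problem the authors raise: two pairs of identical sequences, as would arise from two closely related clades, are enough to produce a maximal score for positions 0 and 1 without any evidence of spatial proximity.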

  20. Protein quaternary structure and aggregation in relation to allergenicity

    NARCIS (Netherlands)

    Boxtel, van E.L.

    2007-01-01

    In order to induce systemic food allergic reactions in humans, proteins after digestion in the human gastro-intestinal tract should still be able to bind IgE. The aim of the work presented in this thesis was to determine the effects of heating on the structure and digestibility of cupin and prolamin

  1. Iron Sulfur Proteins and their Synthetic Analogues: Structure ...

    Indian Academy of Sciences (India)

    The understanding of structures and functions of iron sulfur proteins is an area of bio-inorganic chemistry which has developed into a subject of great significance over the last two decades. This group of non-heme iron-sulfur (Fe-S) compounds are involved in electron transfer reactions in biological systems and are thus.

  2. Neural network definitions of highly predictable protein secondary structure classes

    Energy Technology Data Exchange (ETDEWEB)

    Lapedes, A. [Los Alamos National Lab., NM (United States)]|[Santa Fe Inst., NM (United States); Steeg, E. [Toronto Univ., ON (Canada). Dept. of Computer Science; Farber, R. [Los Alamos National Lab., NM (United States)

    1994-02-01

    We use two co-evolving neural networks to determine new classes of protein secondary structure which are significantly more predictable from local amino sequence than the conventional secondary structure classification. Accurate prediction of the conventional secondary structure classes (alpha helix, beta strand, and coil) from primary sequence has long been an important problem in computational molecular biology. Neural networks have been a popular method to attempt to predict these conventional secondary structure classes. Accuracy has been disappointingly low. The algorithm presented here uses neural networks to simultaneously examine both sequence and structure data, and to evolve new classes of secondary structure that can be predicted from sequence with significantly higher accuracy than the conventional classes. These new classes have both similarities to, and differences from, the conventional alpha helix, beta strand and coil.

  3. A probabilistic fragment-based protein structure prediction algorithm.

    Science.gov (United States)

    Simoncini, David; Berenger, Francois; Shrestha, Rojan; Zhang, Kam Y J

    2012-01-01

    Conformational sampling is one of the bottlenecks in fragment-based protein structure prediction approaches. They generally start with a coarse-grained optimization where mainchain atoms and centroids of side chains are considered, followed by a fine-grained optimization with an all-atom representation of proteins. It is during this coarse-grained phase that fragment-based methods sample intensely the conformational space. If the native-like region is sampled more, the accuracy of the final all-atom predictions may be improved accordingly. In this work we present EdaFold, a new method for fragment-based protein structure prediction based on an Estimation of Distribution Algorithm. Fragment-based approaches build protein models by assembling short fragments from known protein structures. Whereas the probability mass functions over the fragment libraries are uniform in the usual case, we propose an algorithm that learns from previously generated decoys and steers the search toward native-like regions. A comparison with Rosetta AbInitio protocol shows that EdaFold is able to generate models with lower energies and to enhance the percentage of near-native coarse-grained decoys on a benchmark of [Formula: see text] proteins. The best coarse-grained models produced by both methods were refined into all-atom models and used in molecular replacement. All atom decoys produced out of EdaFold's decoy set reach high enough accuracy to solve the crystallographic phase problem by molecular replacement for some test proteins. EdaFold showed a higher success rate in molecular replacement when compared to Rosetta. Our study suggests that improving low resolution coarse-grained decoys allows computational methods to avoid subsequent sampling issues during all-atom refinement and to produce better all-atom models. EdaFold can be downloaded from http://www.riken.jp/zhangiru/software.html [corrected].
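
    The idea of replacing a uniform probability mass function over a fragment library with one learned from previously generated decoys can be sketched as follows. This is a generic estimation-of-distribution update, assumed here for illustration only; it is not EdaFold's actual update rule, and the function name, the `learning_rate` parameter and the fragment identifiers are hypothetical.

```python
from collections import Counter


def update_distribution(probs, elite_decoys, learning_rate=0.5):
    """Shift per-position fragment probabilities toward the elite decoys.

    probs        -- list with one dict per sequence position,
                    mapping fragment id -> current probability
    elite_decoys -- list of low-energy decoys; each decoy is a list that
                    gives the fragment id chosen at every position
    Returns a new list of dicts; each remains a valid distribution
    because it is a convex mix of two distributions.
    """
    total = len(elite_decoys)
    updated = []
    for pos, dist in enumerate(probs):
        counts = Counter(decoy[pos] for decoy in elite_decoys)
        updated.append({
            frag: (1.0 - learning_rate) * p + learning_rate * counts[frag] / total
            for frag, p in dist.items()
        })
    return updated


# Hypothetical example: one position, two candidate fragments, uniform start.
uniform = [{"f0": 0.5, "f1": 0.5}]
elite = [["f0"], ["f0"], ["f1"], ["f0"]]  # f0 appears in 3 of 4 elite decoys
learned = update_distribution(uniform, elite)
```

    Repeating the sample, select and update cycle concentrates sampling on fragments that recur in low-energy decoys, which is the steering effect the abstract describes.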

  4. Structure and Function of Caltrin (calcium transport inhibitor) Proteins

    Directory of Open Access Journals (Sweden)

    Ernesto Javier Grasso

    2017-12-01

    Full Text Available Caltrin (calcium transport inhibitor) is a family of small and basic proteins of the mammalian seminal plasma which bind to sperm cells during ejaculation and inhibit the extracellular Ca2+ uptake, preventing the premature acrosomal exocytosis and hyperactivation when sperm cells ascend through the female reproductive tract. The binding of caltrin proteins to specific areas of the sperm surface suggests the existence of caltrin receptors, or precise protein-phospholipid arrangements in the sperm membrane, distributed in the regions where Ca2+ influx may take place. However, the molecular mechanisms of recognition and interaction between caltrin and spermatozoa have not been elucidated. Therefore, the aim of this article is to describe in depth the known structural features and functional properties of caltrin proteins, to find out how they may possibly interact with the sperm membranes to control the intracellular signaling that triggers physiological events required for fertilization.

  5. Holo- And Apo- Structures of Bacterial Periplasmic Heme Binding Proteins

    Energy Technology Data Exchange (ETDEWEB)

    Ho, W.W.; Li, H.; Eakanunkul, S.; Tong, Y.; Wilks, A.; Guo, M.; Poulos, T.L.

    2009-06-01

    An essential component of heme transport in Gram-negative bacterial pathogens is the periplasmic protein that shuttles heme between outer and inner membranes. We have solved the first crystal structures of two such proteins, ShuT from Shigella dysenteriae and PhuT from Pseudomonas aeruginosa. Both share a common architecture typical of Class III periplasmic binding proteins. The heme binds in a narrow cleft between the N- and C-terminal binding domains and is coordinated by a Tyr residue. A comparison of the heme-free (apo) and -bound (holo) structures indicates little change in structure other than minor alterations in the heme pocket and movement of the Tyr heme ligand from an 'in' position where it can coordinate the heme iron to an 'out' orientation where it points away from the heme pocket. The detailed architecture of the heme pocket is quite different in ShuT and PhuT. Although Arg228 in PhuT H-bonds with a heme propionate, in ShuT a peptide loop partially takes up the space occupied by Arg228, and there is no Lys or Arg H-bonding with the heme propionates. A comparison of PhuT/ShuT with the vitamin B12-binding protein BtuF and the hydroxamic-type siderophore-binding protein FhuD, the only two other structurally characterized Class III periplasmic binding proteins, demonstrates that PhuT/ShuT more closely resembles BtuF, which reflects the closer similarity in ligands, heme and B12, compared with ligands for FhuD, a peptide siderophore.

  6. Compare local pocket and global protein structure models by small structure patterns

    KAUST Repository

    Cui, Xuefeng

    2015-09-09

    Researchers have proposed several criteria to assess the quality of predicted protein structures, because such assessment is one of the essential tasks in the Critical Assessment of Techniques for Protein Structure Prediction (CASP) competitions. Popular criteria include root mean squared deviation (RMSD), MaxSub score, TM-score, GDT-TS and GDT-HA scores. All these criteria require calculation of rigid transformations to superimpose the predicted protein structure onto the native protein structure. Yet how to obtain the optimal rigid transformations is either unknown or has high time complexity; hence, heuristic algorithms have been proposed. In this work, we carefully design various small structure patterns, including ones specifically tuned for local pockets. Such structure patterns are biologically meaningful, and address the issue that existing methods rely on a sufficient number of backbone residue fragments. We sample the rigid transformations from these small structure patterns, and the optimal superpositions yielded by these small structures are refined and reported. As a result, among 11,669 pairs of predicted and native local protein pocket models from the CASP10 dataset, the GDT-TS scores calculated by our method are significantly higher than those calculated by LGA. Moreover, our program is computationally much more efficient. Source code and executables are publicly available at http://www.cbrc.kaust.edu.sa/prosta/
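
    To make two of the criteria named above concrete, the following minimal sketch computes RMSD and a GDT-TS-style score for a model that is assumed to be already superimposed on the native structure. It does not search for the rigid transformation, which is the hard step the abstract addresses; a full GDT-TS calculation maximizes the residue count for each cutoff over many superpositions. The coordinates are hypothetical, and the 1, 2, 4 and 8 Å cutoffs are the ones conventionally used for GDT-TS.

```python
import math


def rmsd(model, native):
    """Root mean squared deviation between paired 3D points (no fitting)."""
    squared = [math.dist(p, q) ** 2 for p, q in zip(model, native)]
    return math.sqrt(sum(squared) / len(squared))


def gdt_ts(model, native, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """GDT-TS-style score (0-100) for one fixed superposition.

    For each cutoff, take the fraction of residues whose model position
    lies within that distance of the native position, then average.
    """
    dists = [math.dist(p, q) for p, q in zip(model, native)]
    fractions = [sum(d <= c for d in dists) / len(dists) for c in cutoffs]
    return 100.0 * sum(fractions) / len(cutoffs)


# Hypothetical C-alpha coordinates in angstroms, already superimposed.
native = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
model = [(0.0, 0.5, 0.0), (3.8, 1.5, 0.0), (7.6, 3.0, 0.0), (11.4, 10.0, 0.0)]

score = gdt_ts(model, native)      # per-residue deviations: 0.5, 1.5, 3.0, 10.0
deviation = rmsd(model, native)
```

    In this toy case a single badly placed residue dominates the RMSD, while the GDT-TS-style score still credits the three residues that are close to their native positions, which is one reason both kinds of criteria are reported.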

  7. Cloud prediction of protein structure and function with PredictProtein for Debian.

    Science.gov (United States)

    Kaján, László; Yachdav, Guy; Vicedo, Esmeralda; Steinegger, Martin; Mirdita, Milot; Angermüller, Christof; Böhm, Ariane; Domke, Simon; Ertl, Julia; Mertes, Christian; Reisinger, Eva; Staniewski, Cedric; Rost, Burkhard

    2013-01-01

    We report the release of PredictProtein for the Debian operating system and derivatives, such as Ubuntu, Bio-Linux, and Cloud BioLinux. The PredictProtein suite is available as a standard set of open source Debian packages. The release covers the most popular prediction methods from the Rost Lab, including methods for the prediction of secondary structure and solvent accessibility (profphd), nuclear localization signals (predictnls), and intrinsically disordered regions (norsnet). We also present two case studies that successfully utilize PredictProtein packages for high performance computing in the cloud: the first analyzes protein disorder for whole organisms, and the second analyzes the effect of all possible single sequence variants in protein coding regions of the human genome.

  8. Lanthanide labeling offers fast NMR approach to 3D structure determinations of protein-protein complexes.

    Science.gov (United States)

    Pintacuda, Guido; Park, Ah Young; Keniry, Max A; Dixon, Nicholas E; Otting, Gottfried

    2006-03-22

    A novel nuclear magnetic resonance (NMR) strategy based on labeling with lanthanides achieves rapid determinations of accurate three-dimensional (3D) structures of protein-protein complexes. The method employs pseudocontact shifts (PCS) induced by a site-specifically bound lanthanide ion to anchor the coordinate system of the magnetic susceptibility tensor in the molecular frames of the two molecules. Simple superposition of the tensors detected in the two protein molecules brings them together in a 3D model of the protein-protein complex. The method is demonstrated with the 30 kDa complex between two subunits of Escherichia coli polymerase III, comprising the N-terminal domain of the exonuclease subunit epsilon and the subunit theta. The 3D structures of the individual molecules were docked based on a limited number of PCS observed in 2D 15N-heteronuclear single quantum coherence spectra. Degeneracies in the mutual orientation of the protein structures were resolved by the use of two different lanthanide ions, Dy3+ and Er3+.

  9. Comparison of tertiary structures of proteins in protein-protein complexes with unbound forms suggests prevalence of allostery in signalling proteins

    Directory of Open Access Journals (Sweden)

    Swapna Lakshmipuram S

    2012-05-01

    Full Text Available Abstract Background Most signalling and regulatory proteins participate in transient protein-protein interactions during biological processes. They usually serve as key regulators of various cellular processes and are often stable in both protein-bound and unbound forms. Availability of high-resolution structures of their unbound and bound forms provides an opportunity to understand the molecular mechanisms involved. In this work, we have addressed the question “What is the nature, extent, location and functional significance of structural changes which are associated with formation of protein-protein complexes?” Results A database of 76 non-redundant sets of high resolution 3-D structures of protein-protein complexes, representing diverse functions, and corresponding unbound forms, has been used in this analysis. Structural changes associated with protein-protein complexation have been investigated using structural measures and Protein Blocks description. Our study highlights that significant structural rearrangement occurs on binding at the interface as well as at regions away from the interface to form a highly specific, stable and functional complex. Notably, predominantly unaltered interfaces interact mainly with interfaces undergoing substantial structural alterations, revealing the presence of at least one structural regulatory component in every complex. Interestingly, about one-half of the complexes, consisting largely of signalling proteins, show substantial localized structural change at surfaces away from the interface. Normal mode analysis and available information on the functions of some of these complexes suggest that many of these changes are allosteric. This change is largely manifest in the proteins whose interfaces are altered upon binding, implicating structural change as the possible trigger of allosteric effect. Although large-scale studies of allostery induced by small-molecule effectors are available in

  10. Thermal green protein, an extremely stable, nonaggregating fluorescent protein created by structure-guided surface engineering.

    Science.gov (United States)

    Close, Devin W; Paul, Craig Don; Langan, Patricia S; Wilce, Matthew C J; Traore, Daouda A K; Halfmann, Randal; Rocha, Reginaldo C; Waldo, Geoffery S; Payne, Riley J; Rucker, Joseph B; Prescott, Mark; Bradbury, Andrew R M

    2015-07-01

    In this article, we describe the engineering and X-ray crystal structure of Thermal Green Protein (TGP), an extremely stable, highly soluble, non-aggregating green fluorescent protein. TGP is a soluble variant of the fluorescent protein eCGP123, which despite being highly stable, has proven to be aggregation-prone. The X-ray crystal structure of eCGP123, also determined within the context of this paper, was used to carry out rational surface engineering to improve its solubility, leading to TGP. The approach involved simultaneously eliminating crystal lattice contacts while increasing the overall negative charge of the protein. Despite intentional disruption of lattice contacts and introduction of high entropy glutamate side chains, TGP crystallized readily in a number of different conditions and the X-ray crystal structure of TGP was determined to 1.9 Å resolution. The structural reasons for the enhanced stability of TGP and eCGP123 are discussed. We demonstrate the utility of using TGP as a fusion partner in various assays and significantly, in amyloid assays in which the standard fluorescent protein, EGFP, is undesirable because of aberrant oligomerization. © 2014 Wiley Periodicals, Inc.

  11. Water polygons in high-resolution protein crystal structures

    Science.gov (United States)

    Lee, Jonas; Kim, Sung-Hou

    2009-01-01

    We have analyzed the interstitial water (ISW) structures in 1500 protein crystal structures deposited in the Protein Data Bank that have greater than 1.5 Å resolution with less than 90% sequence similarity with each other. We observed varieties of polygonal water structures composed of three to eight water molecules. These polygons may represent the time- and space-averaged structures of “stable” water oligomers present in liquid water, and their presence as well as relative population may be relevant in understanding physical properties of liquid water at a given temperature. On average, 13% of ISWs are localized enough to be visible by X-ray diffraction. Of those, an average of 78% are water molecules in the first water layer on the protein surface. Of the localized ISWs beyond the first layer, almost half form water polygons such as trigons, tetragons, as well as expected pentagons, hexagons, higher polygons, partial dodecahedrons, and disordered networks. Most of the octagons and nonagons are formed by fusion of smaller polygons. The trigons are most commonly observed. We suggest that our observation provides an experimental basis for including these water polygon structures in correlating and predicting various water properties in liquid state. PMID:19551896

  12. Integrated visual analysis of protein structures, sequences, and feature data.

    Science.gov (United States)

    Stolte, Christian; Sabir, Kenneth S; Heinrich, Julian; Hammang, Christopher J; Schafferhans, Andrea; O'Donoghue, Seán I

    2015-01-01

    To understand the molecular mechanisms that give rise to a protein's function, biologists often need to (i) find and access all related atomic-resolution 3D structures, and (ii) map sequence-based features (e.g., domains, single-nucleotide polymorphisms, post-translational modifications) onto these structures. To streamline these processes we recently developed Aquaria, a resource offering unprecedented access to protein structure information based on an all-against-all comparison of SwissProt and PDB sequences. In this work, we provide a requirements analysis for several frequently occurring tasks in molecular biology and describe how design choices in Aquaria meet these requirements. Finally, we show how the interface can be used to explore features of a protein and gain biologically meaningful insights in two case studies conducted by domain experts. The user interface design of Aquaria enables biologists to gain unprecedented access to molecular structures and simplifies the generation of insight. The tasks involved in mapping sequence features onto structures can be conducted more easily and quickly using Aquaria.

  13. Structural and evolutionary versatility in protein complexes with uneven stoichiometry.

    Science.gov (United States)

    Marsh, Joseph A; Rees, Holly A; Ahnert, Sebastian E; Teichmann, Sarah A

    2015-03-16

    Proteins assemble into complexes with diverse quaternary structures. Although most heteromeric complexes of known structure have even stoichiometry, a significant minority have uneven stoichiometry--that is, differing numbers of each subunit type. To adopt this uneven stoichiometry, sequence-identical subunits must be asymmetric with respect to each other, forming different interactions within the complex. Here we first investigate the occurrence of uneven stoichiometry, demonstrating that it is common in vitro and is likely to be common in vivo. Next, we elucidate the structural determinants of uneven stoichiometry, identifying six different mechanisms by which it can be achieved. Finally, we study the frequency of uneven stoichiometry across evolution, observing a significant enrichment in bacteria compared with eukaryotes. We show that this arises due to a general increased tendency for bacterial proteins to self-assemble and form homomeric interactions, even within the context of a heteromeric complex.

  14. Beyond Membrane Protein Structure: Drug Discovery, Dynamics and Difficulties.

    Science.gov (United States)

    Biggin, Philip C; Aldeghi, Matteo; Bodkin, Michael J; Heifetz, Alexander

    2016-01-01

    Most of the previous content of this book has focused on obtaining the structures of membrane proteins. In this chapter we explore how those structures can be further used in two key ways. The first is their use in structure based drug design (SBDD) and the second is how they can be used to extend our understanding of their functional activity via the use of molecular dynamics. Both aspects now heavily rely on computations. This area is vast, and alas, too large to consider in depth in a single book chapter. Thus where appropriate we have referred the reader to recent reviews for deeper assessment of the field. We discuss progress via the use of examples from two main drug target areas; G-protein coupled receptors (GPCRs) and ion channels. We end with a discussion of some of the main challenges in the area.

  15. Protein kinase CK2 in health and disease: Protein kinase CK2: from structures to insights

    DEFF Research Database (Denmark)

    Niefind, K; Raaf, J; Issinger, Olaf-Georg

    2009-01-01

    Within the last decade, 40 crystal structures corresponding to protein kinase CK2 (former name 'casein kinase 2'), to its catalytic subunit CK2alpha and to its regulatory subunit CK2beta were published. Together they provide a valuable, yet by far not complete basis to rationalize the biochemical...... the critical region of CK2alpha recruitment is pre-formed in the unbound state. In CK2alpha the activation segment - a key element of protein kinase regulation - invariably adopts the typical conformation of the active enzymes. Recent structures of human CK2alpha revealed a surprising plasticity in the ATP...

  16. Structural Evolution of the Protein Kinase-Like Superfamily.

    Directory of Open Access Journals (Sweden)

    2005-10-01

    Full Text Available The protein kinase family is large and important, but it is only one family in a larger superfamily of homologous kinases that phosphorylate a variety of substrates and play important roles in all three superkingdoms of life. We used a carefully constructed structural alignment of selected kinases as the basis for a study of the structural evolution of the protein kinase-like superfamily. The comparison of structures revealed a "universal core" domain consisting only of regions required for ATP binding and the phosphotransfer reaction. Remarkably, even within the universal core some kinase structures display notable changes, while still retaining essential activity. Hence, the protein kinase-like superfamily has undergone substantial structural and sequence revision over long evolutionary timescales. We constructed a phylogenetic tree for the superfamily using a novel approach that allowed for the combination of sequence and structure information into a unified quantitative analysis. When considered against the backdrop of species distribution and other metrics, our tree provides a compelling scenario for the development of the various kinase families from a shared common ancestor. We propose that most of the so-called "atypical kinases" are not intermittently derived from protein kinases, but rather diverged early in evolution to form a distinct phyletic group. Within the atypical kinases, the aminoglycoside and choline kinase families appear to share the closest relationship. These two families in turn appear to be the most closely related to the protein kinase family. In addition, our analysis suggests that the actin-fragmin kinase, an atypical protein kinase, is more closely related to the phosphoinositide-3 kinase family than to the protein kinase family. The two most divergent families, alpha-kinases and phosphatidylinositol phosphate kinases (PIPKs), appear to have distinct evolutionary histories. While the PIPKs probably have an

  17. Analysis of residue conformations in peptides in Cambridge structural database and protein-peptide structural complexes.

    Science.gov (United States)

    Raghavender, Upadhyayula Surya

    2017-03-01

    A comprehensive statistical analysis of the geometric parameters of peptide chains in a reduced dataset of protein-peptide complexes in Protein Data Bank (PDB) is presented. The angular variables describing the backbone conformations of amino acid residues in peptide chains shed insights into the conformational preferences of peptide residues interacting with protein partners. Nonparametric statistical approaches are employed to evaluate the interrelationships and associations in structural variables. Grouping of residues based on their structure into chemical classes reveals characteristic trends in parameter relationships. A comparison of canonical amino acid residues in free peptide structures in Cambridge structural database (CSD) with identical residues in PDB complexes, suggests that the information can be integrated from both the structural repositories enabling efficient and accurate modeling of biologically active peptides. © 2016 John Wiley & Sons A/S.

  18. Structuring detergents for extracting and stabilizing functional membrane proteins.

    Directory of Open Access Journals (Sweden)

    Rima Matar-Merheb

    Full Text Available BACKGROUND: Membrane proteins are privileged pharmaceutical targets for which the development of structure-based drug design is challenging. One underlying reason is the fact that detergents do not stabilize membrane domains as efficiently as natural lipids in membranes, often leading to a partial to complete loss of activity/stability during protein extraction and purification and preventing crystallization in an active conformation. METHODOLOGY/PRINCIPAL FINDINGS: Anionic calix[4]arene based detergents (C4Cn, n=1-12) were designed to structure the membrane domains through hydrophobic interactions and a network of salt bridges with the basic residues found at the cytosol-membrane interface of membrane proteins. These compounds behave as surfactants, forming micelles of 5-24 nm, with the critical micellar concentration (CMC) being, as expected, sensitive to pH, ranging from 0.05 to 1.5 mM. Both by 1H NMR titration and Surface Tension titration experiments, the interaction of these molecules with the basic amino acids was confirmed. They extract membrane proteins from different origins behaving as mild detergents, leading to partial extraction in some cases. They also retain protein functionality, as shown for BmrA (Bacillus multidrug resistance ATP protein), a membrane multidrug-transporting ATPase, which is particularly sensitive to detergent extraction. These new detergents allow BmrA to bind daunorubicin with a Kd of 12 µM, a value similar to that observed after purification using dodecyl maltoside (DDM). They preserve the ATPase activity of BmrA (which resets the protein to its initial state after drug efflux) much more efficiently than SDS (sodium dodecyl sulphate), FC12 (Foscholine 12) or DDM. They also maintain in a functional state the C4Cn-extracted protein upon detergent exchange with FC12. Finally, they promote 3D-crystallization of the membrane protein. CONCLUSION/SIGNIFICANCE: These compounds seem promising to extract in a functional state

  19. A resource for benchmarking the usefulness of protein structure models

    Directory of Open Access Journals (Sweden)

    Carbajo Daniel

    2012-08-01

    Full Text Available Abstract Background Increasingly, biologists and biochemists use computational tools to design experiments to probe the function of proteins and/or to engineer them for a variety of different purposes. The most effective strategies rely on the knowledge of the three-dimensional structure of the protein of interest. However it is often the case that an experimental structure is not available and that models of different quality are used instead. On the other hand, the relationship between the quality of a model and its appropriate use is not easy to derive in general, and so far it has been analyzed in detail only for specific applications. Results This paper describes a database and related software tools that allow testing of a given structure based method on models of a protein representing different levels of accuracy. The comparison of the results of a computational experiment on the experimental structure and on a set of its decoy models will allow developers and users to assess which is the specific threshold of accuracy required to perform the task effectively. Conclusions The ModelDB server automatically builds decoy models of different accuracy for a given protein of known structure and provides a set of useful tools for their analysis. Pre-computed data for a non-redundant set of deposited protein structures are available for analysis and download in the ModelDB database. Implementation, availability and requirements Project name: A resource for benchmarking the usefulness of protein structure models. Project home page: http://bl210.caspur.it/MODEL-DB/MODEL-DB_web/MODindex.php. Operating system(s): Platform independent. Programming language: Perl-BioPerl (program); mySQL, Perl DBI and DBD modules (database); php, JavaScript, Jmol scripting (web server). Other requirements: Java Runtime Environment v1.4 or later, Perl, BioPerl, CPAN modules, HHsearch, Modeller, LGA, NCBI Blast package, DSSP, Speedfill (Surfnet) and PSAIA. License: Free.
Any

  20. Using chemical shifts to assess transient secondary structure and generate ensemble structures of intrinsically disordered proteins.

    Science.gov (United States)

    Kashtanov, Stepan; Borcherds, Wade; Wu, Hongwei; Daughdrill, Gary W; Ytreberg, F Marty

    2012-01-01

    The chemical shifts of backbone atoms in polypeptides are sensitive to the dihedral angles phi and psi and can be used to estimate transient secondary structure and to generate structural ensembles of intrinsically disordered proteins (IDPs). In this chapter, several of the random coil reference databases used to estimate transient secondary structure are described, and the procedure is outlined for using these databases to estimate transient secondary structure. A new protocol is also presented for generating a diverse ensemble of structures for an IDP and reweighting these structures to optimize the fit between simulated and experimental chemical shift values.

  1. Structure and assembly of scalable porous protein cages

    Science.gov (United States)

    Sasaki, Eita; Böhringer, Daniel; van de Waterbeemd, Michiel; Leibundgut, Marc; Zschoche, Reinhard; Heck, Albert J. R.; Ban, Nenad; Hilvert, Donald

    2017-03-01

    Proteins that self-assemble into regular shell-like polyhedra are useful, both in nature and in the laboratory, as molecular containers. Here we describe cryo-electron microscopy (EM) structures of two versatile encapsulation systems that exploit engineered electrostatic interactions for cargo loading. We show that increasing the number of negative charges on the lumenal surface of lumazine synthase, a protein that naturally assembles into a ~1-MDa dodecahedron composed of 12 pentamers, induces stepwise expansion of the native protein shell, giving rise to thermostable ~3-MDa and ~6-MDa assemblies containing 180 and 360 subunits, respectively. Remarkably, these expanded particles assume unprecedented tetrahedrally and icosahedrally symmetric structures constructed entirely from pentameric units. Large keyhole-shaped pores in the shell, not present in the wild-type capsid, enable diffusion-limited encapsulation of complementarily charged guests. The structures of these supercharged assemblies demonstrate how programmed electrostatic effects can be effectively harnessed to tailor the architecture and properties of protein cages.

  2. Detection of a fourth orbivirus non-structural protein.

    Directory of Open Access Journals (Sweden)

    Mourad Belhouchet

    Full Text Available The genus Orbivirus includes both insect and tick-borne viruses. The orbivirus genome, composed of 10 segments of dsRNA, encodes 7 structural proteins (VP1-VP7) and 3 non-structural proteins (NS1-NS3). An open reading frame (ORF) that spans almost the entire length of genome segment-9 (Seg-9) encodes VP6 (the viral helicase). However, bioinformatic analysis recently identified an overlapping ORF (ORFX) in Seg-9. We show that ORFX encodes a new non-structural protein, identified here as NS4. Western blotting and confocal fluorescence microscopy, using antibodies raised against recombinant NS4 from Bluetongue virus (BTV), which is insect-borne, or Great Island virus (GIV), which is tick-borne, demonstrate that these proteins are synthesised in BTV or GIV infected mammalian cells, respectively. BTV NS4 is also expressed in Culicoides insect cells. NS4 forms aggregates throughout the cytoplasm as well as in the nucleus, consistent with identification of nuclear localisation signals within the NS4 sequence. Bioinformatic analyses indicate that NS4 contains coiled-coils, is related to proteins that bind nucleic acids, or are associated with membranes and shows similarities to nucleolar protein UTP20 (a processome subunit). Recombinant NS4 of GIV protects dsRNA from degradation by endoribonucleases of the RNAse III family, indicating that it interacts with dsRNA. However, BTV NS4, which is only half the putative size of the GIV NS4, did not protect dsRNA from RNAse III cleavage. NS4 of both GIV and BTV protect DNA from degradation by DNAse. NS4 was found to associate with lipid droplets in cells infected with BTV or GIV or transfected with a plasmid expressing NS4.

  3. Sequence analysis and structural implications of rotavirus capsid proteins.

    Science.gov (United States)

    Parbhoo, N; Dewar, J B; Gildenhuys, S

    Rotavirus is the major cause of severe virus-associated gastroenteritis worldwide in children aged 5 and younger. Many children lose their lives annually due to this infection and the impact is particularly pronounced in developing countries. The mature rotavirus is a non-enveloped triple-layered nucleocapsid containing 11 double-stranded RNA segments. Here a global view of the sequence and structure of the three main capsid proteins, VP2, VP6 and VP7, is shown by generating a consensus sequence for each of these rotavirus proteins, for each species, obtained from published data of representative rotavirus genotypes from across the world and across species. The degree of conservation between species was represented on homology models for each of the proteins. VP7 shows the highest level of variation, with 14-45 amino acids showing conservation of less than 60%. These changes are localised to the outer surface, alluding to a possible mechanism for evading the immune system. The middle layer, VP6, shows lower variability, with only 14-32 sites having lower than 70% conservation. The inner structural layer made up of VP2 showed the lowest variability, with only 1-16 sites having less than 70% conservation across species. The results correlate with each protein's multiple structural roles in the infection cycle. Thus, although the nucleotide sequences vary due to the error-prone nature of replication and lack of proofreading, the corresponding amino acid sequences of VP2, VP6 and VP7 remain relatively conserved. Benefits of this knowledge about the conservation include the ability to target proteins at sites that cannot undergo mutational changes without influencing viral fitness, as well as the possibility to study systems that are highly evolved for structure and function in order to determine how to generate and manipulate such systems for use in various biotechnological applications.

  4. Post-translational regulation and modifications of flavivirus structural proteins.

    Science.gov (United States)

    Roby, Justin A; Setoh, Yin Xiang; Hall, Roy A; Khromykh, Alexander A

    2015-07-01

    Flaviviruses are a group of single-stranded, positive-sense RNA viruses that generally circulate between arthropod vectors and susceptible vertebrate hosts, producing significant human and veterinary disease burdens. Intensive research efforts have broadened our scientific understanding of the replication cycles of these viruses and have revealed several elegant and tightly co-ordinated post-translational modifications that regulate the activity of viral proteins. The three structural proteins in particular - capsid (C), pre-membrane (prM) and envelope (E) - are subjected to strict regulatory modifications as they progress from translation through virus particle assembly and egress. The timing of proteolytic cleavage events at the C-prM junction directly influences the degree of genomic RNA packaging into nascent virions. Proteolytic maturation of prM by host furin during Golgi transit facilitates rearrangement of the E proteins at the virion surface, exposing the fusion loop and thus increasing particle infectivity. Specific interactions between the prM and E proteins are also important for particle assembly, as prM acts as a chaperone, facilitating correct conformational folding of E. It is only once prM/E heterodimers form that these proteins can be secreted efficiently. The addition of branched glycans to the prM and E proteins during virion transit also plays a key role in modulating the rate of secretion, pH sensitivity and infectivity of flavivirus particles. The insights gained from research into post-translational regulation of structural proteins are beginning to be applied in the rational design of improved flavivirus vaccine candidates and make attractive targets for the development of novel therapeutics.

  5. Automatic classification of protein structures using physicochemical parameters.

    Science.gov (United States)

    Mohan, Abhilash; Rao, M Divya; Sunderrajan, Shruthi; Pennathur, Gautam

    2014-09-01

    Protein classification is the first step to functional annotation; SCOP and Pfam databases are currently the most relevant protein classification schemes. However, the disproportion in the number of three dimensional (3D) protein structures generated versus their classification into relevant superfamilies/families emphasizes the need for automated classification schemes. Predicting function of novel proteins based on sequence information alone has proven to be a major challenge. The present study focuses on the use of physicochemical parameters in conjunction with machine learning algorithms (Naive Bayes, Decision Trees, Random Forest and Support Vector Machines) to classify proteins into their respective SCOP superfamily/Pfam family, using sequence derived information. Spectrophores™, a 1D descriptor of the 3D molecular field surrounding a structure was used as a benchmark to compare the performance of the physicochemical parameters. The machine learning algorithms were modified to select features based on information gain for each SCOP superfamily/Pfam family. The effect of combining physicochemical parameters and spectrophores on classification accuracy (CA) was studied. Machine learning algorithms trained with the physicochemical parameters consistently classified SCOP superfamilies and Pfam families with a classification accuracy above 90%, while spectrophores performed with a CA of around 85%. Feature selection improved classification accuracy for both physicochemical parameters and spectrophores based machine learning algorithms. Combining both attributes resulted in a marginal loss of performance. Physicochemical parameters were able to classify proteins from both schemes with classification accuracy ranging from 90-96%. These results suggest the usefulness of this method in classifying proteins from amino acid sequences.

  6. Compact structure and proteins of pasta retard in vitro digestive evolution of branched starch molecular structure.

    Science.gov (United States)

    Zou, Wei; Sissons, Mike; Warren, Frederick J; Gidley, Michael J; Gilbert, Robert G

    2016-11-05

    The roles that the compact structure and proteins in pasta play in retarding evolution of starch molecular structure during in vitro digestion are explored, using four types of cooked samples: whole pasta, pasta powder, semolina (with proteins) and extracted starch without proteins. These were subjected to in vitro digestion with porcine α-amylase, collecting samples at different times and characterizing the weight distribution of branched starch molecules using size-exclusion chromatography. Measurement of α-amylase activity showed that a protein (or proteins) from semolina or pasta powder interacted with α-amylase, causing reduced enzymatic activity and retarding digestion of branched starch molecules with hydrodynamic radius (Rh)100nm. Copyright © 2016 Elsevier Ltd. All rights reserved.

  7. Structural model of dodecameric heat-shock protein Hsp21

    DEFF Research Database (Denmark)

    Rutsdottir, Gudrun; Härmark, Johan; Weide, Yoran

    2017-01-01

    Small heat-shock proteins (sHsps) prevent aggregation of thermosensitive client proteins in a first line of defense against cellular stress. The mechanisms by which they perform this function have been hard to define due to limited structural information; currently, there is only one high-resolution...... from previous undefined or inwardly facing arms. To test the importance of the IXVXI motif, we created the point mutant V181A, which, as expected, disrupts the Hsp21 dodecamer and decreases chaperone activity. Finally, our data emphasize that sHsp chaperone efficiency depends on oligomerization...

  8. Structural interface parameters are discriminatory in recognising near-native poses of protein-protein interactions.

    Directory of Open Access Journals (Sweden)

    Sony Malhotra

    Interactions at the molecular level in the cellular environment play a crucial role in maintaining the physiological functioning of the cell. These molecular interactions exist at varied levels, viz. protein-protein interactions, protein-nucleic acid interactions or protein-small molecule interactions. These interactions and their mechanisms are currently intensively studied areas of the field. Molecular interactions can also be studied computationally using the approach known as molecular docking. Molecular docking employs search algorithms to predict the possible conformations for interacting partners and then calculates interaction energies. However, docking proposes a number of solutions as different docked poses and hence makes it a serious challenge to identify the native (or near-native) structures from the pool of these docked poses. Here, we propose a rigorous scoring scheme called DockScore which can be used to rank the docked poses and identify the best docked pose out of the many proposed by the docking algorithm employed. The scoring identifies the optimal interactions between the two protein partners utilising various features of the putative interface like area, short contacts, conservation, spatial clustering and the presence of positively charged and hydrophobic residues. DockScore was first trained on a set of 30 protein-protein complexes to determine the weights for different parameters. Subsequently, we tested the scoring scheme on 30 different protein-protein complexes, and native or near-native structures were assigned the top rank from a pool of docked poses in 26 of the tested cases. We tested the ability of DockScore to discriminate likely dimer interactions that differ substantially within a homologous family and also demonstrate that DockScore can distinguish the correct pose for all 10 recent CAPRI targets.

  9. DNABII proteins play a central role in UPEC biofilm structure.

    Science.gov (United States)

    Devaraj, Aishwarya; Justice, Sheryl S; Bakaletz, Lauren O; Goodman, Steven D

    2015-06-01

    Most chronic and recurrent bacterial infections involve a biofilm component, the foundation of which is the extracellular polymeric substance (EPS). Extracellular DNA (eDNA) is a conserved and key component of the EPS of pathogenic biofilms. The DNABII protein family includes integration host factor (IHF) and histone-like protein (HU); both are present in the extracellular milieu. We have shown previously that the DNABII proteins are often found in association with eDNA and are critical for the structural integrity of bacterial communities that utilize eDNA as a matrix component. Here, we demonstrate that uropathogenic Escherichia coli (UPEC) strain UTI89 incorporates eDNA within its biofilm matrix and that the DNABII proteins are not only important for biofilm growth, but are limiting; exogenous addition of these proteins promotes biofilm formation that is dependent on eDNA. In addition, we show that both subunits of IHF, yet only one subunit of HU (HupB), are critical for UPEC biofilm development. We discuss the roles of these proteins in context of the UPEC EPS. © 2015 John Wiley & Sons Ltd.

  10. DNABII proteins play a central role in UPEC biofilm structure

    Science.gov (United States)

    Devaraj, Aishwarya; Justice, Sheryl S.; Bakaletz, Lauren O.; Goodman, Steven D.

    2015-01-01

    Most chronic and recurrent bacterial infections involve a biofilm component, the foundation of which is the extracellular polymeric substance (EPS). Extracellular DNA (eDNA) is a conserved and key component of the EPS of pathogenic biofilms. The DNABII protein family includes integration host factor (IHF) and histone-like protein (HU); both are present in the extracellular milieu. We have shown previously that the DNABII proteins are often found in association with eDNA and are critical for the structural integrity of bacterial communities that utilize eDNA as a matrix component. Here, we demonstrate that uropathogenic E. coli (UPEC) strain UTI89 incorporates eDNA within its biofilm matrix and that the DNABII proteins are not only important for biofilm growth, but are limiting; exogenous addition of these proteins promotes biofilm formation that is dependent on eDNA. In addition, we show that both subunits of IHF, yet only one subunit of HU (HupB), are critical for UPEC biofilm development. We discuss the roles of these proteins in context of the UPEC EPS. PMID:25757804

  11. Elasticity, structure, and relaxation of extended proteins under force

    Science.gov (United States)

    Stirnemann, Guillaume; Giganti, David; Fernandez, Julio M.; Berne, B. J.

    2013-01-01

    Force spectroscopies have emerged as a powerful and unprecedented tool to study and manipulate biomolecules directly at a molecular level. Usually, protein and DNA behavior under force is described within the framework of the worm-like chain (WLC) model for polymer elasticity. Although it has been surprisingly successful for the interpretation of experimental data, especially at high forces, the WLC model lacks structural and dynamical molecular details associated with protein relaxation under force that are key to the understanding of how force affects protein flexibility and reactivity. We use molecular dynamics simulations of ubiquitin to provide a deeper understanding of protein relaxation under force. We find that the WLC model successfully describes the simulations of ubiquitin, especially at higher forces, and we show how protein flexibility and persistence length, probed in the force regime of the experiments, are related to how specific classes of backbone dihedral angles respond to applied force. Although the WLC model is an average, backbone model, we show how the protein side chains affect the persistence length. Finally, we find that the diffusion coefficient of the protein's end-to-end distance is on the order of 10⁸ nm²/s, is position and side-chain dependent, but is independent of the length and independent of the applied force, in contrast with other descriptions. PMID:23407163
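
    For readers unfamiliar with the WLC model mentioned in this record, the sketch below evaluates the widely used Marko-Siggia interpolation formula, F = (k_B T / p) [1/(4(1 - x/L)²) - 1/4 + x/L], which relates force F to extension x for a chain of contour length L and persistence length p. It is not taken from the paper: the function name, the default temperature of 298 K, and the example values (L = 100 nm, p = 0.4 nm) are illustrative assumptions only.

```python
K_B = 1.380649e-23  # Boltzmann constant in J/K


def wlc_force_pn(extension_nm, contour_length_nm, persistence_length_nm,
                 temperature_k=298.0):
    """Force (pN) from the Marko-Siggia worm-like chain interpolation formula.

    All argument names and the default temperature are illustrative
    assumptions, not values from the record above.
    """
    if not 0.0 <= extension_nm < contour_length_nm:
        raise ValueError("extension must satisfy 0 <= x < contour length")
    x = extension_nm / contour_length_nm      # relative extension x/L
    kt_pn_nm = K_B * temperature_k * 1e21     # thermal energy, J -> pN*nm
    return (kt_pn_nm / persistence_length_nm) * (
        0.25 / (1.0 - x) ** 2 - 0.25 + x
    )


if __name__ == "__main__":
    # Hypothetical chain: 100 nm contour length, 0.4 nm persistence length.
    for ext in (10.0, 50.0, 90.0):
        print(ext, wlc_force_pn(ext, 100.0, 0.4))
```

    The force diverges as the extension approaches the contour length, which is why the function rejects x ≥ L rather than returning a value.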

  12. Structural organization of G-protein-coupled receptors

    Science.gov (United States)

    Lomize, Andrei L.; Pogozheva, Irina D.; Mosberg, Henry I.

    1999-07-01

    Atomic-resolution structures of the transmembrane 7-α-helical domains of 26 G-protein-coupled receptors (GPCRs) (including opsins, cationic amine, melatonin, purine, chemokine, opioid, and glycoprotein hormone receptors and two related proteins, retinochrome and Duffy erythrocyte antigen) were calculated by distance geometry using interhelical hydrogen bonds formed by various proteins from the family and collectively applied as distance constraints, as described previously [Pogozheva et al., Biophys. J., 70 (1997) 1963]. The main structural features of the calculated GPCR models are described and illustrated by examples. Some of the features reflect physical interactions that are responsible for the structural stability of the transmembrane α-bundle: the formation of extensive networks of interhelical H-bonds and sulfur-aromatic clusters that are spatially organized as 'polarity gradients'; the close packing of side-chains throughout the transmembrane domain; and the formation of interhelical disulfide bonds in some receptors and a plausible Zn²⁺ binding center in retinochrome. Other features of the models are related to biological function and evolution of GPCRs: the formation of a common 'minicore' of 43 evolutionarily conserved residues; a multitude of correlated replacements throughout the transmembrane domain; an Na⁺-binding site in some receptors; and excellent complementarity of receptor binding pockets to many structurally dissimilar, conformationally constrained ligands, such as retinal, cyclic opioid peptides, and cationic amine ligands. The calculated models are in good agreement with numerous experimental data.

  13. Structural plasticity in human heterochromatin protein 1β.

    Directory of Open Access Journals (Sweden)

    Francesca Munari

    As essential components of the molecular machine assembling heterochromatin in eukaryotes, HP1 (Heterochromatin Protein 1) proteins are key regulators of genome function. While several high-resolution structures of the two globular regions of HP1, the chromo and chromoshadow domains, in their free form or in complex with recognition-motif peptides are available, less is known about the conformational behavior of the full-length protein. Here, we used NMR spectroscopy in combination with small angle X-ray scattering and dynamic light scattering to characterize the dynamic and structural properties of full-length human HP1β (hHP1β) in solution. We show that the hinge region is highly flexible and enables a largely unrestricted spatial search by the two globular domains for their binding partners. In addition, the binding pockets within the chromo and chromoshadow domains experience internal dynamics that can be useful for the versatile recognition of different binding partners. In particular, we provide evidence for the presence of a distinct structural propensity in free hHP1β that prepares a binding-competent interface for the formation of the intermolecular β-sheet with methylated histone H3. The structural plasticity of hHP1β supports its ability to bind and connect a wide variety of binding partners in epigenetic processes.

  14. A minimal sequence code for switching protein structure and function.

    Science.gov (United States)

    Alexander, Patrick A; He, Yanan; Chen, Yihong; Orban, John; Bryan, Philip N

    2009-12-15

    We present here a structural and mechanistic description of how a protein changes its fold and function, mutation by mutation. Our approach was to create 2 proteins that (i) are stably folded into 2 different folds, (ii) have 2 different functions, and (iii) are very similar in sequence. In this simplified sequence space we explore the mutational path from one fold to another. We show that an IgG-binding, 4beta+alpha fold can be transformed into an albumin-binding, 3-alpha fold via a mutational pathway in which neither function nor native structure is completely lost. The stabilities of all mutants along the pathway are evaluated, key high-resolution structures are determined by NMR, and an explanation of the switching mechanism is provided. We show that the conformational switch from 4beta+alpha to 3-alpha structure can occur via a single amino acid substitution. On one side of the switch point, the 4beta+alpha fold is >90% populated (pH 7.2, 20 degrees C). A single mutation switches the conformation to the 3-alpha fold, which is >90% populated (pH 7.2, 20 degrees C). We further show that a bifunctional protein exists at the switch point with affinity for both IgG and albumin.

  15. Predicting protein structures with a multiplayer online game.

    Science.gov (United States)

    Cooper, Seth; Khatib, Firas; Treuille, Adrien; Barbero, Janos; Lee, Jeehyung; Beenen, Michael; Leaver-Fay, Andrew; Baker, David; Popović, Zoran; Players, Foldit

    2010-08-05

    People exert large amounts of problem-solving effort playing computer games. Simple image- and text-recognition tasks have been successfully 'crowd-sourced' through games, but it is not clear if more complex scientific problems can be solved with human-directed computing. Protein structure prediction is one such problem: locating the biologically relevant native conformation of a protein is a formidable computational challenge given the very large size of the search space. Here we describe Foldit, a multiplayer online game that engages non-scientists in solving hard prediction problems. Foldit players interact with protein structures using direct manipulation tools and user-friendly versions of algorithms from the Rosetta structure prediction methodology, while they compete and collaborate to optimize the computed energy. We show that top-ranked Foldit players excel at solving challenging structure refinement problems in which substantial backbone rearrangements are necessary to achieve the burial of hydrophobic residues. Players working collaboratively develop a rich assortment of new strategies and algorithms; unlike computational approaches, they explore not only the conformational space but also the space of possible search strategies. The integration of human visual problem-solving and strategy development capabilities with traditional computational algorithms through interactive multiplayer games is a powerful new approach to solving computationally-limited scientific problems.

  16. Conformational analysis of protein structures derived from NMR data.

    Science.gov (United States)

    MacArthur, M W; Thornton, J M

    1993-11-01

    A study is presented of the conformational characteristics of NMR-derived protein structures in the Protein Data Bank compared to X-ray structures. Both ensemble and energy-minimized average structures are analyzed. We have addressed the problem using the methods developed for crystal structures by examining the distribution of phi, psi, and chi angles as indicators of global conformational irregularity. All these features in NMR structures occur to varying degrees in multiple conformational states. Some measures of local geometry are very tightly constrained by the methods used to generate the structure, e.g., proline phi angles, alpha-helix phi,psi angles, omega angles, and C alpha chirality. The more lightly restrained torsion angles do show increased clustering as the number of overall experimental observations increases. phi, psi, and chi 1 angle conformational heterogeneity is strongly correlated with accessibility but shows additional differences which reflect the differing number of observations possible in NMR for the various side chains (e.g., many for Trp, few for Ser). In general, we find that the core is defined to a notional resolution of 2.0 to 2.3 Å. Of real interest is the behavior of surface residues and in particular the side chains where multiple rotameric states in different structures can vary from 10% to 88%. Later generation structures show a much tighter definition which correlates with increasing use of J-coupling information, stereospecific assignments, and heteronuclear techniques. A suite of programs is being developed to address the special needs of NMR-derived structures which will take into account the existence of increased mobility in solution.

  17. Dengue Virus Non-structural Protein 1 Modulates Infectious Particle Production via Interaction with the Structural Proteins.

    Directory of Open Access Journals (Sweden)

    Pietro Scaturro

    Non-structural protein 1 (NS1) is one of the most enigmatic proteins of the Dengue virus (DENV), playing distinct functions in immune evasion, pathogenesis and viral replication. The recently reported crystal structure of DENV NS1 revealed its peculiar three-dimensional fold; however, detailed information on NS1 function at different steps of the viral replication cycle is still missing. By using the recently reported crystal structure, as well as amino acid sequence conservation, as a guide for a comprehensive site-directed mutagenesis study, we discovered that in addition to being essential for RNA replication, DENV NS1 is also critically required for the production of infectious virus particles. Taking advantage of a trans-complementation approach based on fully functional epitope-tagged NS1 variants, we identified previously unreported interactions between NS1 and the structural proteins Envelope (E) and precursor Membrane (prM). Interestingly, coimmunoprecipitation revealed an additional association with capsid, arguing that NS1 interacts via the structural glycoproteins with DENV particles. Results obtained with mutations residing either in the NS1 Wing domain or in the β-ladder domain suggest that NS1 might have two distinct functions in the assembly of DENV particles. By using a trans-complementation approach with a C-terminally KDEL-tagged ER-resident NS1, we demonstrate that the secretion of NS1 is dispensable for both RNA replication and infectious particle production. In conclusion, our results provide an extensive genetic map of NS1 determinants essential for viral RNA replication and identify a novel role of NS1 in virion production that is mediated via interaction with the structural proteins. These studies extend the list of NS1 functions and argue for a central role in coordinating replication and assembly/release of infectious DENV particles.

  18. Structural basis for target protein recognition by the protein disulfide reductase thioredoxin

    DEFF Research Database (Denmark)

    Maeda, Kenji; Hägglund, Per; Finnie, Christine

    2006-01-01

    of this mixed disulfide shows a conserved hydrophobic motif in thioredoxin interacting with a sequence of residues from BASI through van der Waals contacts and backbone-backbone hydrogen bonds. The observed structural complementarity suggests that the recognition of features around protein disulfides plays...

  19. Nuclear spectrin-like proteins are structural actin-binding proteins in plants.

    Science.gov (United States)

    Pérez-Munive, Clara; Moreno Díaz de la Espina, Susana

    2011-03-01

    Although actin is a relevant component of the plant nucleus, only three nuclear ABPs (actin-binding proteins) have been identified in plants to date: cofilin, profilin and nuclear myosin I. Although plants lack orthologues of the main structural nuclear ABPs in animals, such as lamins, lamin-associated proteins and nesprins, their genome does contain sequences with spectrin repeats and N-terminal calponin homology domains for actin binding that might be distant relatives of spectrin. We investigated here whether spectrin-like proteins could act as structural nuclear ABPs in plants. We have investigated the presence of spectrins in Allium cepa meristematic nuclei by Western blotting, confocal and electron microscopy, using antibodies against α- and β-spectrin chains that cross-react in plant nuclei. Their role as nuclear ABPs was analysed by co-immunoprecipitation and IF (immunofluorescence) co-localization and their association with the nuclear matrix was investigated by sequential extraction of nuclei with non-ionic detergent, and in low- and high-salt buffers after nuclease digestion. Our results demonstrate the existence of several spectrin-like proteins in the nucleus of onion cells that have different intranuclear distributions in asynchronous meristematic populations and associate with the nuclear matrix. These nuclear proteins co-immunoprecipitate and co-localize with actin. These results reveal that the plant nucleus contains spectrin-like proteins that are structural nuclear components and function as ABPs. Their intranuclear distribution suggests that plant nuclear spectrin-like proteins could be involved in multiple nuclear functions.

  20. Structural principles within the human-virus protein-protein interaction network.

    Science.gov (United States)

    Franzosa, Eric A; Xia, Yu

    2011-06-28

    General properties of the antagonistic biomolecular interactions between viruses and their hosts (exogenous interactions) remain poorly understood, and may differ significantly from known principles governing the cooperative interactions within the host (endogenous interactions). Systems biology approaches have been applied to study the combined interaction networks of virus and human proteins, but such efforts have so far revealed only low-resolution patterns of host-virus interaction. Here, we layer curated and predicted 3D structural models of human-virus and human-human protein complexes on top of traditional interaction networks to reconstruct the human-virus structural interaction network. This approach reveals atomic resolution, mechanistic patterns of host-virus interaction, and facilitates systematic comparison with the host's endogenous interactions. We find that exogenous interfaces tend to overlap with and mimic endogenous interfaces, thereby competing with endogenous binding partners. The endogenous interfaces mimicked by viral proteins tend to participate in multiple endogenous interactions which are transient and regulatory in nature. While interface overlap in the endogenous network results largely from gene duplication followed by divergent evolution, viral proteins frequently achieve interface mimicry without any sequence or structural similarity to an endogenous binding partner. Finally, while endogenous interfaces tend to evolve more slowly than the rest of the protein surface, exogenous interfaces--including many sites of endogenous-exogenous overlap--tend to evolve faster, consistent with an evolutionary "arms race" between host and pathogen. These significant biophysical, functional, and evolutionary differences between host-pathogen and within-host protein-protein interactions highlight the distinct consequences of antagonism versus cooperation in biological networks.

  1. The E4 protein; structure, function and patterns of expression

    Energy Technology Data Exchange (ETDEWEB)

    Doorbar, John, E-mail: jdoorba@nimr.mrc.ac.uk

    2013-10-15

    The papillomavirus E4 open reading frame (ORF) is contained within the E2 ORF, with the primary E4 gene-product (E1^E4) being translated from a spliced mRNA that includes the E1 initiation codon and adjacent sequences. E4 is located centrally within the E2 gene, in a region that encodes the E2 protein's flexible hinge domain. Although a number of minor E4 transcripts have been reported, it is the product of the abundant E1^E4 mRNA that has been most extensively analysed. During the papillomavirus life cycle, the E1^E4 gene products generally become detectable at the onset of vegetative viral genome amplification as the late stages of infection begin. E4 contributes to genome amplification success and virus synthesis, with its high level of expression suggesting additional roles in virus release and/or transmission. In general, E4 is easily visualised in biopsy material by immunostaining, and can be detected in lesions caused by diverse papillomavirus types, including those of dogs, rabbits and cattle as well as humans. The E4 protein can serve as a biomarker of active virus infection, and in the case of high-risk human types also disease severity. In some cutaneous lesions, E4 can be expressed at higher levels than the virion coat proteins, and can account for as much as 30% of total lesional protein content. The E4 proteins of the Beta, Gamma and Mu HPV types assemble into distinctive cytoplasmic, and sometimes nuclear, inclusion granules. In general, the E4 proteins are expressed before L2 and L1, with their structure and function being modified, first by kinases as the infected cell progresses through the S and G2 cell cycle phases, but also by proteases as the cell exits the cell cycle and undergoes true terminal differentiation. The kinases that regulate E4 also affect other viral proteins simultaneously, and include protein kinase A, Cyclin-dependent kinase, members of the MAP Kinase family and protein kinase C. For HPV16 E1{sup

  2. Protein Data Bank (PDB): The Single Global Macromolecular Structure Archive.

    Science.gov (United States)

    Burley, Stephen K; Berman, Helen M; Kleywegt, Gerard J; Markley, John L; Nakamura, Haruki; Velankar, Sameer

    2017-01-01

    The Protein Data Bank (PDB)--the single global repository of experimentally determined 3D structures of biological macromolecules and their complexes--was established in 1971, becoming the first open-access digital resource in the biological sciences. The PDB archive currently houses ~130,000 entries (May 2017). It is managed by the Worldwide Protein Data Bank organization (wwPDB; wwpdb.org), which includes the RCSB Protein Data Bank (RCSB PDB; rcsb.org), the Protein Data Bank Japan (PDBj; pdbj.org), the Protein Data Bank in Europe (PDBe; pdbe.org), and BioMagResBank (BMRB; www.bmrb.wisc.edu). The four wwPDB partners operate a unified global software system that enforces community-agreed data standards and supports data Deposition, Biocuration, and Validation of ~11,000 new PDB entries annually (deposit.wwpdb.org). The RCSB PDB currently acts as the archive keeper, ensuring disaster recovery of PDB data and coordinating weekly updates. wwPDB partners disseminate the same archival data from multiple FTP sites, while operating complementary websites that provide their own views of PDB data with selected value-added information and links to related data resources. At present, the PDB archives experimental data, associated metadata, and 3D-atomic level structural models derived from three well-established methods: crystallography, nuclear magnetic resonance spectroscopy (NMR), and electron microscopy (3DEM). wwPDB partners are working closely with experts in related experimental areas (small-angle scattering, chemical cross-linking/mass spectrometry, Förster resonance energy transfer or FRET, etc.) to establish a federation of data resources that will support sustainable archiving and validation of 3D structural models and experimental data derived from integrative or hybrid methods.

  3. Modeling Protein Structures Based on Density Maps at Intermediate Resolutions

    Science.gov (United States)

    Ma, Jianpeng

    Structural biology is now in a special era in which increasingly more complex biomolecules are being studied. For many of them, only low- or intermediate-resolution density maps (6-10 Å) can be obtained by, for instance, electron cryomicroscopy (cryo-EM) (Bottcher et al., 1997; Conway et al., 1997; DeRosier and Harrison, 1997; Kuhn et al., 2002; Li et al., 2002; Mancini et al., 2000; Zhang et al., 2000; Zhou et al., 2000, 2001a,b). In certain cases, analysis in terms of intermediate-resolution density maps is also inevitable in X-ray crystallography as exemplified in the lengthy process of structural determination of the 50S ribosomal subunit that incremented from 9 Å, 5 Å, to 2.4 Å (Ban et al., 1998, 1999, 2000). As a common feature in all these cases, it is usually impossible, with conventional methods, to construct reasonably accurate atomic models from density maps. However, for the purpose of structural analysis, it would still be very helpful if one can build some kind of pseudo-atomic models from the density maps because this will not only facilitate the structural determination to higher resolutions, but also assist further biochemical studies and functional interpretation. For example, significant insights into the architecture and organization of proteins can often be learned if one can roughly locate the major secondary structural elements such as α-helices and β-sheets. This rationale is supported by the fact that the knowledge of protein folds can be obtained primarily from the spatial arrangement of the secondary structural elements independent of the sequence identity of the proteins, as different sequences can have the same fold.

  4. Structural Properties of Potexvirus Coat Proteins Detected by Optical Methods.

    Science.gov (United States)

    Semenyuk, P I; Karpova, O V; Ksenofontov, A L; Kalinina, N O; Dobrov, E N; Makarov, V V

    2016-12-01

    It has been shown by X-ray analysis that cores of coat proteins (CPs) from three potexviruses, flexible helical RNA-containing plant viruses, have similar α-helical structure. However, this similarity cannot explain structural lability of potexvirus virions, which is believed to determine their biological activity. Here, we used circular dichroism (CD) spectroscopy in the far UV region to compare optical properties of CPs from three potexviruses with the same morphology and similar structure. CPs from Alternanthera mosaic virus (AltMV), potato aucuba mosaic virus (PAMV), and potato virus X (PVX) have been studied in a free state and in virions. The CD spectrum of AltMV virions was similar to the previously obtained CD spectrum of papaya mosaic virus (PapMV) virions, but differed significantly from the CD spectrum of PAMV virions. The CD spectrum of PAMV virions resembled in its basic characteristics the CD spectrum of PVX virions characterized by molar ellipticity that is abnormally low for α-helical proteins. Homology modeling of the CP structures in AltMV, PAMV, and PVX virions was based on the known high-resolution structures of CPs from papaya mosaic virus and bamboo mosaic virus and confirmed that the structures of the CP cores in all three viruses were nearly identical. Comparison of amino acid sequences of different potexvirus CPs and prediction of unstructured regions in these proteins revealed a possible correlation between specific features in the virion CD spectra and the presence of disordered N-terminal segments in the CPs.

  5. Quality assessment of modeled protein structure using physicochemical properties.

    Science.gov (United States)

    Rana, Prashant Singh; Sharma, Harish; Bhattacharya, Mahua; Shukla, Anupam

    2015-04-01

    Physicochemical properties of proteins help determine the quality of a protein structure and have therefore been widely used to distinguish native or native-like structures from other predicted structures. In this work, we explore nine machine learning methods with six physicochemical properties to predict the Root Mean Square Deviation (RMSD), Template Modeling score (TM-score), and Global Distance Test score (GDT_TS-score) of a modeled protein structure in the absence of its true native state. The physicochemical properties used are total surface area, euclidean distance (ED), total empirical energy, secondary structure penalty (SS), sequence length (SL), and pair number (PN). There are a total of 95,091 modeled structures of 4896 native targets. A real coded Self-adaptive Differential Evolution algorithm (SaDE) is used to determine the feature importance. K-fold cross validation is used to measure the robustness of the best predictive method. Extensive experiments show that the Random Forest method outperforms the other machine learning methods. This work makes the prediction faster and less expensive. On the testing data set, the predictions of RMSD, TM-score, and GDT_TS-score have Root Mean Square Errors (RMSE) of 1.20, 0.06, and 0.06 respectively; correlation scores of 0.96, 0.92, and 0.91 respectively; R² values of 0.92, 0.85, and 0.84 respectively; and accuracies of 78.82% (with ± 0.1 err), 86.56% (with ± 0.1 err), and 87.37% (with ± 0.1 err) respectively. The data set used in the study is available as supplement at http://bit.ly/RF-PCP-DataSets.
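
    The error measures named in this abstract (RMSE, correlation, and R²) can be illustrated with a minimal sketch. The function names are hypothetical and the formulas are the standard textbook definitions; the abstract does not say how the authors computed them, and no data from the study are used here.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between true and predicted scores."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot
```

    Each function takes the true and the predicted values of one quality score (for example, TM-score) over a set of models, so the three scores in the abstract would be evaluated separately.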

  6. Structural Basis for Target Protein Recognition by Thioredoxin

    DEFF Research Database (Denmark)

    Maeda, Kenji

    2007-01-01

    Thioredoxin (Trx) is an ubiquitous protein disulfide reductase that possesses two redox active cysteines in the conserved active site sequence motif, Trp-CysN-Gly/Pro-Pro-CysC situated in the so called Trx-fold. The lack of insight into the protein substrate recognition mechanism of Trx has to date......Ser) and a mutant of an in vitro substrate alpha-amylase/subtilisin inhibitor (BASI) (Cys144Ser), as a reaction intermediate-mimic of Trx-catalyzed disulfide reduction. The resultant structure showed a sequence of BASI residues along a conserved hydrophobic groove constituted of three loop segments...... on HvTrxh2 surface, associated through several van der Waals contacts and three backbone-backbone hydrogen bonds resembling beta-sheet formation. Moreover, a pattern of interactions essentially identical to that in HvTrxh2-S-S-BASI was observed in the structure of HvTrxh1 crystallized in the oxidized...

  7. Validation of Structures in the Protein Data Bank.

    Science.gov (United States)

    Gore, Swanand; Sanz García, Eduardo; Hendrickx, Pieter M S; Gutmanas, Aleksandras; Westbrook, John D; Yang, Huanwang; Feng, Zukang; Baskaran, Kumaran; Berrisford, John M; Hudson, Brian P; Ikegawa, Yasuyo; Kobayashi, Naohiro; Lawson, Catherine L; Mading, Steve; Mak, Lora; Mukhopadhyay, Abhik; Oldfield, Thomas J; Patwardhan, Ardan; Peisach, Ezra; Sahni, Gaurav; Sekharan, Monica R; Sen, Sanchayita; Shao, Chenghua; Smart, Oliver S; Ulrich, Eldon L; Yamashita, Reiko; Quesada, Martha; Young, Jasmine Y; Nakamura, Haruki; Markley, John L; Berman, Helen M; Burley, Stephen K; Velankar, Sameer; Kleywegt, Gerard J

    2017-12-05

    The Worldwide PDB recently launched a deposition, biocuration, and validation tool: OneDep. At various stages of OneDep data processing, validation reports for three-dimensional structures of biological macromolecules are produced. These reports are based on recommendations of expert task forces representing crystallography, nuclear magnetic resonance, and cryoelectron microscopy communities. The reports provide useful metrics with which depositors can evaluate the quality of the experimental data, the structural model, and the fit between them. The validation module is also available as a stand-alone web server and as a programmatically accessible web service. A growing number of journals require the official wwPDB validation reports (produced at biocuration) to accompany manuscripts describing macromolecular structures. Upon public release of the structure, the validation report becomes part of the public PDB archive. Geometric quality scores for proteins in the PDB archive have improved over the past decade. Copyright © 2017 The Authors. Published by Elsevier Ltd.. All rights reserved.

  8. Solvation structure of ice-binding antifreeze proteins

    Science.gov (United States)

    Hansen-Goos, Hendrik; Wettlaufer, John

    2009-03-01

    Antifreeze proteins (AFPs) can be found in organisms which survive at subzero temperatures. They were first discovered in polar fishes in the 1950s [1] and have since also been isolated from insects, plants, and bacteria. AFPs shift the freezing point of water below the bulk melting point and hence can prevent recrystallization; the effect is non-colligative, and there is a pronounced hysteresis between freezing and melting. For many AFPs it is generally accepted that they function through an irreversible binding to the ice-water interface which leads to a piecewise convex growth front with a lower nonequilibrium freezing point due to the Kelvin effect. Recent molecular dynamics simulations of the AFP from Choristoneura fumiferana reveal that the solvation structures of water at ice-binding and non-ice-binding faces of the protein are crucial for understanding how the AFP binds to the ice surface and how it is protected from being overgrown [2]. We use density functional theory of classical fluids in order to assess the microscopic solvent structure in the vicinity of protein faces with different surface properties. With our method, binding energies of different protein faces to the water-ice-interface can be computed efficiently in a simplified model. [1] Y. Yeh and R.E. Feeney, Chem. Rev. 96, 601 (1996). [2] D.R. Nutt and J.C. Smith, J. Am. Chem. Soc. 130, 13066 (2008).

  9. NMR Structure of the Myristylated Feline Immunodeficiency Virus Matrix Protein

    Directory of Open Access Journals (Sweden)

    Lola A. Brown

    2015-04-01

    Membrane targeting by the Gag proteins of the human immunodeficiency viruses (HIV types-1 and -2) is mediated by Gag’s N-terminally myristylated matrix (MA) domain and is dependent on cellular phosphatidylinositol-4,5-bisphosphate [PI(4,5)P2]. To determine if other lentiviruses employ a similar membrane targeting mechanism, we initiated studies of the feline immunodeficiency virus (FIV), a widespread feline pathogen with potential utility for development of human therapeutics. Bacterial co-translational myristylation was facilitated by mutation of two amino acids near the amino-terminus of the protein (Q5A/G6S; myrMAQ5A/G6S). These substitutions did not affect virus assembly or release from transfected cells. NMR studies revealed that the myristyl group is buried within a hydrophobic pocket in a manner that is structurally similar to that observed for the myristylated HIV-1 protein. Comparisons with a recent crystal structure of the unmyristylated FIV protein [myr(-)MA] indicate that only small changes in helix orientation are required to accommodate the sequestered myr group. Depletion of PI(4,5)P2 from the plasma membrane of FIV-infected CRFK cells inhibited production of FIV particles, indicating that, like HIV, FIV hijacks the PI(4,5)P2 cellular signaling system to direct intracellular Gag trafficking during virus assembly.

  10. Structure determination of T-cell protein-tyrosine phosphatase

    DEFF Research Database (Denmark)

    Iversen, L.F.; Møller, K. B.; Pedersen, A.K.

    2002-01-01

    homologous T cell protein-tyrosine phosphatase (TC-PTP) has received much less attention, and no x-ray structure has been provided. We have previously co-crystallized PTP1B with a number of low molecular weight inhibitors that inhibit TC-PTP with similar efficiency. Unexpectedly, we were not able to co...... the high degree of functional and structural similarity between TC-PTP and PTP1B, we have been able to identify areas close to the active site that might be addressed to develop selective inhibitors of each enzyme....

  11. Site-specific electronic structure of bacterial surface protein layers

    Science.gov (United States)

    Vyalikh, D. V.; Kummer, K.; Kade, A.; Blüher, A.; Katzschner, B.; Mertig, M.; Molodtsov, S. L.

    2009-03-01

    We applied resonant photoemission and X-ray absorption spectroscopy for a detailed characterization of the valence electronic structure of the regular two-dimensional bacterial surface protein layer of Bacillus sphaericus NCTC 9602. Using this approach, we detected valence electron emission from specific chemical sites. In particular, it was found that electrons from the π clouds of aromatic systems make large contributions to the highest occupied molecular orbitals.

  12. Optimizing weights of protein energy function to improve ab initio protein structure prediction

    CERN Document Server

    Wang, Chao; Liu, Juntao; Zhang, Haicang; Ling, Bin; Li, Shuai Cheng; Zheng, Wei-Mou; Bu, Dongbo

    2013-01-01

    Predicting protein 3D structure from amino acid sequence remains a challenge in the field of computational biology. If protein structure homologues are not found, one has to construct 3D structural conformations from the very beginning by the so-called ab initio approach, using some empirical energy functions. A successful algorithm in this category, Rosetta, creates an ensemble of decoy conformations by assembling selected best short fragments of known protein structures and then recognizes the native state as the highly populated one with a very low energy. Typically, an energy function is a combination of a variety of terms characterizing different structural features, such as hydrophobic interactions, van der Waals forces, and hydrogen bonding. It is critical for an energy function to be capable of distinguishing native-like conformations from non-native ones and of driving most initial conformations assembled from fragments to a native-like one in a conformation search process. In this paper we propose a linea...
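
    The idea of an energy function built from several terms, and of judging it by whether it separates a native-like conformation from decoys, can be sketched as follows. The sketch assumes the simplest possible combination, a weighted sum; the abstract is truncated before it states the authors' own formulation, so the function names, term names, and numbers below are purely illustrative.

```python
def total_energy(terms, weights):
    """Weighted sum of energy terms (assumed form: E = sum_i w_i * E_i)."""
    return sum(weights[name] * value for name, value in terms.items())


def native_rank(native_terms, decoy_terms_list, weights):
    """Rank of the native conformation by energy; 1 means lowest energy."""
    e_native = total_energy(native_terms, weights)
    lower = sum(1 for d in decoy_terms_list if total_energy(d, weights) < e_native)
    return 1 + lower


# Made-up term values for one native-like conformation and two decoys.
native = {"hydrophobic": -10.0, "vdw": -5.0, "hbond": -8.0}
decoys = [
    {"hydrophobic": -12.0, "vdw": -2.0, "hbond": -3.0},
    {"hydrophobic": -6.0, "vdw": -6.0, "hbond": -4.0},
]

uniform = {"hydrophobic": 1.0, "vdw": 1.0, "hbond": 1.0}
hydrophobic_only = {"hydrophobic": 1.0, "vdw": 0.0, "hbond": 0.0}
```

    With these toy values the native conformation ranks first under `uniform` weights but second under `hydrophobic_only`, which is the kind of sensitivity to weights that motivates optimizing them.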

  13. Unconstrained Structure Formation in Coarse-Grained Protein Simulations

    Science.gov (United States)

    Bereau, Tristan

    The ability of proteins to fold into well-defined structures forms the basis of a wide variety of biochemical functions in and out of the cell membrane. Many of these processes, however, operate at time- and length-scales that are currently unattainable by all-atom computer simulations. To cope with this difficulty, increasingly more accurate and sophisticated coarse-grained models are currently being developed. In the present thesis, we introduce a solvent-free coarse-grained model for proteins. Proteins are modeled by four beads per amino acid, providing enough backbone resolution to allow for accurate sampling of local conformations. It relies on simple interactions that emphasize structure, such as hydrogen bonds and hydrophobicity. Realistic alpha/beta content is achieved by including an effective nearest-neighbor dipolar interaction. Parameters are tuned to reproduce both local conformations and tertiary structures. By studying both helical and extended conformations we make sure the force field is not biased towards any particular secondary structure. Without any further adjustments or bias a realistic oligopeptide aggregation scenario is observed. The model is subsequently applied to various biophysical problems: (i) kinetics of folding of two model peptides, (ii) large-scale amyloid-beta oligomerization, and (iii) protein folding cooperativity. The last topic---defined by the nature of the finite-size thermodynamic transition exhibited upon folding---was investigated from a microcanonical perspective: the accurate evaluation of the density of states can unambiguously characterize the nature of the transition, unlike its corresponding canonical analysis. Extending the results of lattice simulations and theoretical models, we find that it is the interplay between secondary structure and the loss of non-native tertiary contacts which determines the nature of the transition. Finally, we combine the peptide model with a high-resolution, solvent-free, lipid

  14. Structure-Energy Relationships of Halogen Bonds in Proteins.

    Science.gov (United States)

    Scholfield, Matthew R; Ford, Melissa Coates; Carlsson, Anna-Carin C; Butta, Hawera; Mehl, Ryan A; Ho, P Shing

    2017-06-06

    The structures and stabilities of proteins are defined by a series of weak noncovalent electrostatic, van der Waals, and hydrogen bond (HB) interactions. In this study, we have designed and engineered halogen bonds (XBs) site-specifically to study their structure-energy relationship in a model protein, T4 lysozyme. The evidence for XBs is the displacement of the aromatic side chain toward an oxygen acceptor, at distances that are equal to or less than the sums of their respective van der Waals radii, when the hydroxyl substituent of the wild-type tyrosine is replaced by a halogen. In addition, thermal melting studies show that the iodine XB rescues the stabilization energy from an otherwise destabilizing substitution (at an equivalent noninteracting site), indicating that the interaction is also present in solution. Quantum chemical calculations show that the XB complements an HB at this site and that solvent structure must also be considered in trying to design molecular interactions such as XBs into biological systems. A bromine substitution also shows displacement of the side chain, but the distances and geometries do not indicate formation of an XB. Thus, we have dissected the contributions from various noncovalent interactions of halogens introduced into proteins, to drive the application of XBs, particularly in biomolecular design.

  15. Prebiotic Alternatives to Proteins: Structure and Function of Hyperbranched Polyesters

    Science.gov (United States)

    Mamajanov, Irena; Callahan, Michael P.; Dworkin, Jason P.; Cody, George D.

    2015-06-01

    Proteins are responsible for multiple biological functions, such as ligand binding, catalysis, and ion channeling. This functionality is enabled by proteins' three-dimensional structures that require long polypeptides. Since plausibly prebiotic synthesis of functional polypeptides has proven challenging in the laboratory, we propose that these functions may have been initially performed by alternative macromolecular constructs, namely hyperbranched polymers (HBPs), during early stages of chemical evolution. HBPs can be straightforwardly synthesized in one-pot processes, possess globular structures determined by their architecture as opposed to folding in proteins, and have documented ligand binding and catalytic properties. Our initial study focuses on glycerol-citric acid HBPs synthesized via moderate heating in the dry state. The polymerization products consisted of a mixture of isomeric structures of varying molar mass as evidenced by NMR, mass spectrometry and size-exclusion chromatography. Addition of divalent cations during polymerization resulted in increased incorporation of citric acid into the HBPs and the possible formation of cation-oligomer complexes. The chelating properties of citric acid govern the makeup of the resulting polymer, turning the polymerization system into a rudimentary smart material.

  16. Minor snake venom proteins: Structure, function and potential applications.

    Science.gov (United States)

    Boldrini-França, Johara; Cologna, Camila Takeno; Pucca, Manuela Berto; Bordon, Karla de Castro Figueiredo; Amorim, Fernanda Gobbi; Anjolette, Fernando Antonio Pino; Cordeiro, Francielle Almeida; Wiezel, Gisele Adriano; Cerni, Felipe Augusto; Pinheiro-Junior, Ernesto Lopes; Shibao, Priscila Yumi Tanaka; Ferreira, Isabela Gobbo; de Oliveira, Isadora Sousa; Cardoso, Iara Aimê; Arantes, Eliane Candiani

    2017-04-01

    Snake venoms present a great diversity of pharmacologically active compounds that may be applied as research and biotechnological tools, as well as in drug development and diagnostic tests for certain diseases. The most abundant toxins have been extensively studied in the last decades and some of them have already been used for different purposes. Nevertheless, most of the minor snake venom protein classes remain poorly explored, despite their potential application in diverse areas. The main difficulty in studying these proteins lies in the impossibility of obtaining sufficient amounts of them for a comprehensive investigation. The advent of more sensitive techniques in the last few years allowed the discovery of new venom components and the in-depth study of some already known minor proteins. This review summarizes information regarding some structural and functional aspects of low-abundance snake venom protein classes, such as growth factors, hyaluronidases, cysteine-rich secretory proteins, nucleases and nucleotidases, cobra venom factors, vespryns, protease inhibitors, antimicrobial peptides, among others. Some potential applications of these molecules are discussed herein in order to encourage researchers to explore the full venom repertoire and to discover new molecules or applications for the already known venom components. Copyright © 2016. Published by Elsevier B.V.

  17. Ab Initio Protein Structure Assembly Using Continuous Structure Fragments and Optimized Knowledge-based Force Field

    Science.gov (United States)

    Xu, Dong; Zhang, Yang

    2012-01-01

    Ab initio protein folding is one of the major unsolved problems in computational biology due to the difficulties in force field design and conformational search. We developed a novel program, QUARK, for template-free protein structure prediction. Query sequences are first broken into fragments of 1–20 residues where multiple fragment structures are retrieved at each position from unrelated experimental structures. Full-length structure models are then assembled from fragments using replica-exchange Monte Carlo simulations, which are guided by a composite knowledge-based force field. A number of novel energy terms and Monte Carlo movements are introduced and the particular contributions to enhancing the efficiency of both force field and search engine are analyzed in detail. QUARK prediction procedure is depicted and tested on the structure modeling of 145 non-homologous proteins. Although no global templates are used and all fragments from experimental structures with template modeling score (TM-score) >0.5 are excluded, QUARK can successfully construct 3D models of correct folds in 1/3 cases of short proteins up to 100 residues. In the ninth community-wide Critical Assessment of protein Structure Prediction (CASP9) experiment, QUARK server outperformed the second and third best servers by 18% and 47% based on the cumulative Z-score of global distance test-total (GDT-TS) scores in the free modeling (FM) category. Although ab initio protein folding remains a significant challenge, these data demonstrate new progress towards the solution of the most important problem in the field. PMID:22411565
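
    The replica-exchange Monte Carlo simulations mentioned in this abstract rely on occasionally swapping conformations between replicas run at different temperatures. A minimal sketch of the standard Metropolis swap criterion is given below; it is the generic textbook rule, not QUARK's actual code, and the function name and the default `k_b=1.0` (reduced units) are assumptions made for illustration.

```python
import math


def swap_probability(energy_i, energy_j, temp_i, temp_j, k_b=1.0):
    """Probability of accepting a swap between replicas i and j.

    Standard replica-exchange rule:
        p = min(1, exp((beta_i - beta_j) * (E_i - E_j))),  beta = 1 / (k_b * T)
    """
    beta_i = 1.0 / (k_b * temp_i)
    beta_j = 1.0 / (k_b * temp_j)
    delta = (beta_i - beta_j) * (energy_i - energy_j)
    # A non-negative exponent means the swap is always accepted.
    return 1.0 if delta >= 0.0 else math.exp(delta)
```

    When the hotter replica holds the lower-energy conformation, the swap is always accepted, which is how low-energy structures found at high temperature migrate down to the cold replicas.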

  18. Structure and application of antifreeze proteins from Antarctic bacteria.

    Science.gov (United States)

    Muñoz, Patricio A; Márquez, Sebastián L; González-Nilo, Fernando D; Márquez-Miranda, Valeria; Blamey, Jenny M

    2017-08-07

    Production of antifreeze proteins (AFPs) is a survival strategy of psychrophiles in ice. These proteins have potential in the frozen food industry for avoiding damage to the structure of animal or vegetable foods. However, there is not much information regarding the interaction of Antarctic bacterial AFPs with ice, and new determinations are needed to understand the behaviour of these proteins at the water/ice interface. Different Antarctic sites were screened for antifreeze activity and microorganisms were selected for the presence of thermal hysteresis in their crude extracts. Isolates GU1.7.1, GU3.1.1, and AFP5.1 showed higher thermal hysteresis and were characterized using a polyphasic approach. Studies using cucumber and zucchini samples showed cellular protection when samples were treated with partially purified AFPs or a commercial AFP, as determined using toluidine blue O and neutral red staining. Additionally, genome analysis of these isolates revealed the presence of genes that encode putative AFPs. Deduced amino acid sequences from GU3.1.1 (gu3A and gu3B) and AFP5.1 (afp5A) showed high similarity to reported AFPs whose crystal structures have been solved, which allowed homology models to be generated. Modelled proteins showed a triangular prism form similar to β-helix AFPs with a linear distribution of threonine residues at one side of the prism that could correspond to the putative ice binding side. The statistically best models were used to build a protein-water system. Molecular dynamics simulations were then performed to compare the antifreezing behaviour of these AFPs at the ice/water interface. Docking and molecular dynamics simulations revealed that gu3B could have the most efficient antifreezing behavior, but gu3A could have a higher affinity for ice. AFPs from Antarctic microorganisms GU1.7.1, GU3.1.1 and AFP5.1 protect cellular structures of frozen food, showing potential for the frozen food industry. Modeled proteins possess a β-helix structure, and

  19. Hypochlorous acid-mediated protein oxidation: how important are chloramine transfer reactions and protein tertiary structure?

    Science.gov (United States)

    Pattison, David I; Hawkins, Clare L; Davies, Michael J

    2007-08-28

    Hypochlorous acid (HOCl) is a powerful oxidant generated from H2O2 and Cl- by the heme enzyme myeloperoxidase, which is released from activated leukocytes. HOCl possesses potent antibacterial properties, but excessive production can lead to host tissue damage that occurs in numerous human pathologies. As proteins and amino acids are highly abundant in vivo and react rapidly with HOCl, they are likely to be major targets for HOCl. In this study, two small globular proteins, lysozyme and insulin, have been oxidized with increasing excesses of HOCl to determine whether the pattern of HOCl-mediated amino acid consumption is consistent with reported kinetic data for isolated amino acids and model compounds. Identical experiments have been carried out with mixtures of N-acetyl amino acids (to prevent reaction at the alpha-amino groups) that mimic the protein composition to examine the role of protein structure on reactivity. The results indicate that tertiary structure facilitates secondary chlorine transfer reactions of chloramines formed on His and Lys side chains. In light of these data, second-order rate constants for reactions of Lys side chain and Gly chloramines with Trp side chains and disulfide bonds have been determined, together with those for further oxidation of Met sulfoxide by HOCl and His side chain chloramines. Computational kinetic models incorporating these additional rate constants closely predict the experimentally observed amino acid consumption. These studies provide insight into the roles of chloramine formation and three-dimensional structure on the reactions of HOCl with isolated proteins and demonstrate that kinetic models can predict the outcome of HOCl-mediated protein oxidation.
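
    The computational kinetic models mentioned in this abstract combine second-order rate constants for competing reactions. A minimal sketch of that idea is shown below: one oxidant is consumed by several targets, each with its own second-order rate law, integrated with a simple explicit Euler scheme. The function name, the target names, and all concentrations and rate constants are hypothetical placeholders, not values from the study, and the real models include many more reactions (for example, secondary chlorine transfer).

```python
def simulate_competition(oxidant0, targets, dt=1e-3, steps=10_000):
    """Integrate d[X]/dt = -k_X [oxidant][X] for every target X (explicit Euler).

    targets maps a name to (initial concentration in M, rate constant in M^-1 s^-1).
    Returns the remaining oxidant concentration and a dict of target concentrations.
    """
    oxidant = oxidant0
    conc = {name: c0 for name, (c0, _) in targets.items()}
    for _ in range(steps):
        # Second-order rate of each reaction at the current concentrations.
        rates = {name: targets[name][1] * oxidant * conc[name] for name in conc}
        for name, rate in rates.items():
            conc[name] -= rate * dt
        # Each reaction consumes one oxidant molecule per target molecule.
        oxidant -= sum(rates.values()) * dt
    return oxidant, conc


# Hypothetical example: two residues competing for a limiting amount of oxidant.
example_targets = {"fast_residue": (1e-3, 1e3), "slow_residue": (1e-3, 1e1)}
remaining, final = simulate_competition(1e-3, example_targets)
```

    In this toy setup the residue with the larger rate constant is consumed preferentially, which is the qualitative behaviour that a comparison between predicted and observed amino acid consumption tests.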

  20. Structural fragment clustering reveals novel structural and functional motifs in α-helical transmembrane proteins

    Directory of Open Access Journals (Sweden)

    Vassilev Boris

    2010-04-01

    Background: A large proportion of an organism's genome encodes for membrane proteins. Membrane proteins are important for many cellular processes, and several diseases can be linked to mutations in them. With the tremendous growth of sequence data, there is an increasing need to reliably identify membrane proteins from sequence, to functionally annotate them, and to correctly predict their topology. Results: We introduce a technique called structural fragment clustering, which learns sequential motifs from 3D structural fragments. From over 500,000 fragments, we obtain 213 statistically significant, non-redundant, and novel motifs that are highly specific to α-helical transmembrane proteins. From these 213 motifs, 58 of them were assigned to function and checked in the scientific literature for a biological assessment. Seventy percent of the motifs are found in co-factor, ligand, and ion binding sites, 30% at protein interaction interfaces, and 12% bind specific lipids such as glycerol or cardiolipins. The vast majority of motifs (94%) appear across evolutionarily unrelated families, highlighting the modularity of functional design in membrane proteins. We describe three novel motifs in detail: (1) a dimer interface motif found in voltage-gated chloride channels, (2) a proton transfer motif found in heme-copper oxidases, and (3) a convergently evolved interface helix motif found in an aspartate symporter, a serine protease, and cytochrome b. Conclusions: Our findings suggest that functional modules exist in membrane proteins, and that they occur in completely different evolutionary contexts and cover different binding sites. Structural fragment clustering allows us to link sequence motifs to function through clusters of structural fragments. The sequence motifs can be applied to identify and characterize membrane proteins in novel genomes.

  1. Rigidity analysis of protein biological assemblies and periodic crystal structures

    Science.gov (United States)

    2013-01-01

    Background: We initiate in silico rigidity-theoretical studies of biological assemblies and small crystals for protein structures. The goal is to determine if, and how, the interactions among neighboring cells and subchains affect the flexibility of a molecule in its crystallized state. We use experimental X-ray crystallography data from the Protein Data Bank (PDB). The analysis relies on an efficient graph-based algorithm. Computational experiments were performed using new protein rigidity analysis tools available in the new release of our KINARI-Web server http://kinari.cs.umass.edu. Results: We provide two types of results: on biological assemblies and on crystals. We found that when only isolated subchains are considered, structural and functional information may be missed. Indeed, the rigidity of biological assemblies is sometimes dependent on the count and placement of hydrogen bonds and other interactions among the individual subchains of the biological unit. Similarly, the rigidity of small crystals may be affected by the interactions between atoms belonging to different unit cells. We have analyzed a dataset of approximately 300 proteins, from which we generated 982 crystals (some of which are biological assemblies). We identified two types of behaviors. (a) Some crystals and/or biological assemblies will aggregate into rigid bodies that span multiple unit cells/asymmetric units. Some of them create substantially larger rigid clusters in the crystal/biological assembly form, while in other cases, the aggregation has a smaller effect just at the interface between the units. (b) In other cases, the rigidity properties of the asymmetric units are retained, because the rigid bodies did not combine. We also identified two interesting cases where rigidity analysis may be correlated with the functional behavior of the protein. This type of information, identified here for the first time, depends critically on the ability to create crystals and biological assemblies

  2. Evolution and structural organization of the C proteins of paramyxovirinae.

    Directory of Open Access Journals (Sweden)

    Michael K Lo

    The phosphoprotein (P) gene of most Paramyxovirinae encodes several proteins in overlapping frames: P and V, which share a common N-terminus (PNT), and C, which overlaps PNT. Overlapping genes are of particular interest because they encode proteins originated de novo, some of which have unknown structural folds, challenging the notion that nature utilizes only a limited, well-mapped area of fold space. The C proteins cluster in three groups, comprising measles, Nipah, and Sendai virus. We predicted that all C proteins have a similar organization: a variable, disordered N-terminus and a conserved, α-helical C-terminus. We confirmed this predicted organization by biophysically characterizing recombinant C proteins from Tupaia paramyxovirus (measles group) and human parainfluenza virus 1 (Sendai group). We also found that the C of the measles and Nipah groups have statistically significant sequence similarity, indicating a common origin. Although the C of the Sendai group lack sequence similarity with them, we speculate that they also have a common origin, given their similar genomic location and structural organization. Since C is dispensable for viral replication, unlike PNT, we hypothesize that C may have originated de novo by overprinting PNT in the ancestor of Paramyxovirinae. Intriguingly, in measles virus and Nipah virus, PNT encodes STAT1-binding sites that overlap different regions of the C-terminus of C, indicating they have probably originated independently. This arrangement, in which the same genetic region encodes simultaneously a crucial functional motif (a STAT1-binding site) and a highly constrained region (the C-terminus of C), seems paradoxical, since it should severely reduce the ability of the virus to adapt. The fact that it originated twice suggests that it must be balanced by an evolutionary advantage, perhaps from reducing the size of the genetic region vulnerable to mutations.

  3. FTIR protein secondary structure analysis of human ascending aortic tissues.

    Science.gov (United States)

    Bonnier, Franck; Rubin, Sylvain; Debelle, Laurent; Ventéo, Lydie; Pluot, Michel; Baehrel, Bernard; Manfait, Michel; Sockalingum, Ganesh D

    2008-08-01

    The advent of moderate dilatations in ascending aortas is often accompanied by structural modifications of the main components of the aortic tissue, elastin and collagen. In this study, we have undertaken an approach based on FTIR microscopy coupled to a curve-fitting procedure to analyze secondary structure modifications in these proteins in human normal and pathological aortic tissues. We found that the outcome of the aortic pathology is strongly influenced by these proteins, which are abundant in the media of the aortic wall, and that the advent of an aortic dilatation is generally accompanied by a decrease of parallel beta-sheet structures. Elastin, essentially composed of beta-sheet structures, seems to be directly related to these changes and therefore indicative of the elastic alteration of the aortic wall. Conventional microscopy and confocal fluorescence microscopy were used to compare FTIR microscopy results with the organization of the elastic fibers present in the tissues. This in vitro study of six patients (three normal and three pathological) suggests that such a spectroscopic marker, specific to aneurysmal tissue characterization, could provide important information for surgeons who face the dilemma of moderate aortic tissue dilatation of the ascending aortas.
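
    The abstract does not describe the curve-fitting procedure in detail. As a rough illustration of how fitted band components are commonly turned into secondary-structure estimates, the sketch below assumes Gaussian components and reports each component's share of the total fitted area. The function names, band positions, labels, and amplitudes are hypothetical placeholders, not values from the study.

```python
import math


def gaussian_area(amplitude, fwhm):
    """Area under a Gaussian band given its peak height and full width at half maximum."""
    # For a Gaussian, area = amplitude * fwhm * sqrt(pi / (4 * ln 2)).
    return amplitude * fwhm * math.sqrt(math.pi / (4.0 * math.log(2.0)))


def area_fractions(components):
    """Map each component label to its fraction of the total fitted band area."""
    areas = {c["label"]: gaussian_area(c["amplitude"], c["fwhm"]) for c in components}
    total = sum(areas.values())
    return {label: area / total for label, area in areas.items()}


# Hypothetical fitted components (positions in cm^-1; all values are illustrative only).
components = [
    {"label": "beta-sheet", "position": 1630.0, "amplitude": 0.2, "fwhm": 20.0},
    {"label": "alpha-helix", "position": 1655.0, "amplitude": 0.5, "fwhm": 20.0},
    {"label": "turn", "position": 1680.0, "amplitude": 0.3, "fwhm": 20.0},
]

fractions = area_fractions(components)
for label, fraction in fractions.items():
    print(f"{label}: {fraction:.2f}")
```

    Comparing such fractions between normal and pathological samples is one way a decrease in a given structural class could be quantified; the actual band assignments depend on the fitting protocol used.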

  4. NMR structure of the integral membrane protein OmpX.

    Science.gov (United States)

    Fernández, César; Hilty, Christian; Wider, Gerhard; Güntert, Peter; Wüthrich, Kurt

    2004-03-05

    The structure of the integral membrane protein OmpX from Escherichia coli reconstituted in 60 kDa DHPC micelles (OmpX/DHPC) was calculated from 526 NOE upper limit distance constraints. The structure determination was based on complete sequence-specific assignments for the amide protons and the Val, Leu, and Ile(delta1) methyl groups in OmpX, which were selectively protonated on a perdeuterated background. The solution structure of OmpX in the DHPC micelles consists of a well-defined, eight-stranded antiparallel beta-barrel, with successive pairs of beta-strands connected by mobile loops. Several long-range NOEs observed outside of the transmembrane barrel characterize an extension of a four-stranded beta-sheet beyond the height of the barrel. This protruding beta-sheet is believed to be involved in intermolecular interactions responsible for the biological functions of OmpX. The present approach for de novo structure determination should be quite widely applicable to membrane proteins reconstituted in mixed micelles with overall molecular masses up to about 100 kDa, and may also provide a platform for additional functional studies.
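
    To make the idea of NOE upper-limit distance constraints concrete, the following minimal sketch checks a candidate set of proton coordinates against a list of upper limits and reports any violations. It is not the structure-calculation method used by the authors; the function name, atom labels, coordinates, and limits are hypothetical and chosen only for illustration.

```python
import math


def noe_violations(coords, restraints, tolerance=0.0):
    """Return (atom_a, atom_b, distance, excess) for each violated upper-limit restraint.

    coords     -- dict mapping an atom label to its (x, y, z) position in angstroms
    restraints -- iterable of (atom_a, atom_b, upper_limit) tuples in angstroms
    tolerance  -- excess distance allowed before a restraint counts as violated
    """
    violations = []
    for atom_a, atom_b, upper_limit in restraints:
        distance = math.dist(coords[atom_a], coords[atom_b])
        excess = distance - upper_limit
        if excess > tolerance:
            violations.append((atom_a, atom_b, distance, excess))
    return violations


# Hypothetical toy model: three protons with made-up labels and coordinates.
coords = {
    "HN_10": (0.0, 0.0, 0.0),
    "HN_25": (3.0, 4.0, 0.0),
    "HD1_40": (0.0, 0.0, 6.5),
}

# Hypothetical upper limits in angstroms.
restraints = [
    ("HN_10", "HN_25", 5.5),
    ("HN_10", "HD1_40", 6.0),
]

violations = noe_violations(coords, restraints)
for atom_a, atom_b, distance, excess in violations:
    print(f"{atom_a} - {atom_b}: {distance:.2f} A exceeds limit by {excess:.2f} A")
```

    In a real calculation, restraints like these enter a target function that is minimized over many conformers; the check above only evaluates a single fixed set of coordinates.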

  5. Modeling Protein Structure at Near Atomic Resolutions With Gorgon

    Science.gov (United States)

    Baker, Matthew L.; Abeysinghe, Sasakthi S.; Schuh, Stephen; Coleman, Ross A.; Abrams, Austin; Marsh, Michael P.; Hryc, Corey F.; Ruths, Troy; Chiu, Wah; Ju, Tao

    2011-01-01

    Electron cryo-microscopy (cryo-EM) has played an increasingly important role in elucidating the structure and function of macromolecular assemblies in near native solution conditions. Typically, however, only non-atomic resolution reconstructions have been obtained for these large complexes, necessitating computational tools for integrating and extracting structural details. With recent advances in cryo-EM, maps at near-atomic resolutions have been achieved for several macromolecular assemblies from which models have been manually constructed. In this work, we describe a new interactive modeling toolkit called Gorgon targeted at intermediate to near-atomic resolution density maps (10-3.5 Å), particularly from cryo-EM. Gorgon's de novo modeling procedure couple