WorldWideScience

Sample records for existing protein information

  1. Information assessment on predicting protein-protein interactions

    Directory of Open Access Journals (Sweden)

    Gerstein Mark

    2004-10-01

    Full Text Available. Background: Identifying protein-protein interactions is fundamental for understanding the molecular machinery of the cell. Proteome-wide studies of protein-protein interactions are of significant value, but the high-throughput experimental technologies suffer from high rates of both false positive and false negative predictions. In addition to high-throughput experimental data, many diverse types of genomic data can help predict protein-protein interactions, such as mRNA expression, localization, essentiality, and functional annotation. Evaluations of the information contributions from different sources of evidence help to establish more parsimonious models with comparable or better prediction accuracy, and to obtain biological insights into the relationships between protein-protein interactions and other genomic information. Results: Our assessment is based on the genomic features used in a Bayesian network approach to predict protein-protein interactions genome-wide in yeast. In the special case when one does not have any missing information about any of the features, our analysis shows that there is a larger information contribution from the functional classification than from expression correlations or essentiality. We also show that in this case alternative models, such as logistic regression and random forest, may be more effective than Bayesian networks for predicting interactions. Conclusions: In the restricted problem posed by the complete-information subset, we identified the MIPS and Gene Ontology (GO) functional similarity datasets as the dominating information contributors for predicting protein-protein interactions under the framework proposed by Jansen et al. Random forests based on the MIPS and GO information alone can give highly accurate classifications. In this particular subset of complete information, adding other genomic data does little for improving predictions.
We also found that the data discretizations used in the

  2. Predicting protein complexes using a supervised learning method combined with local structural information.

    Science.gov (United States)

    Dong, Yadong; Sun, Yongqi; Qin, Chao

    2018-01-01

    The existing protein complex detection methods can be broadly divided into two categories: unsupervised and supervised learning methods. Most of the unsupervised learning methods assume that protein complexes are in dense regions of protein-protein interaction (PPI) networks even though many true complexes are not dense subgraphs. Supervised learning methods utilize the informative properties of known complexes; they often extract features from existing complexes and then use the features to train a classification model. The trained model is used to guide the search process for new complexes. However, insufficient extracted features, noise in the PPI data and the incompleteness of complex data make the classification model imprecise. Consequently, the classification model is not sufficient for guiding the detection of complexes. Therefore, we propose a new robust score function that combines the classification model with local structural information. Based on the score function, we provide a search method that works both forwards and backwards. The results from experiments on six benchmark PPI datasets and three protein complex datasets show that our approach can achieve better performance compared with the state-of-the-art supervised, semi-supervised and unsupervised methods for protein complex detection, occasionally significantly outperforming such methods.

  3. 78 FR 18620 - Agency Information Collection Activities: Extension, Without Change, of an Existing Information...

    Science.gov (United States)

    2013-03-27

    ... information technology, e.g., permitting electronic submission of responses. Overview of This Information... enforcement authority under the 287(g) program. This information is used by program managers and trainers in... Information Collection Activities: Extension, Without Change, of an Existing Information Collection; Comment...

  4. 77 FR 15115 - Agency Information Collection Activities: Extension, without Change, of an Existing Information...

    Science.gov (United States)

    2012-03-14

    ..., mechanical, or other technological collection techniques or other forms of information technology, e.g...; or inquiries for additional information should be directed to: John Ramsay, Forms Program Manager, U... Information Collection Activities: Extension, without Change, of an Existing Information Collection; Comment...

  5. INFORMATION SECURITY RISK ASSESSMENT USING EXISTING LEGAL AND METHODOLOGICAL BASE

    Directory of Open Access Journals (Sweden)

    A. I. Trubei

    2015-01-01

    Full Text Available. The article provides a survey of the existing regulatory framework for information security risk management. Practical methods for information security risk and vulnerability assessment are proposed.

  6. Existence of life-time stable proteins in mature rats-Dating of proteins' age by repeated short-term exposure to labeled amino acids throughout age

    DEFF Research Database (Denmark)

    Bechshøft, Cecilie Leidesdorff; Schjerling, Peter; Bornø, Andreas

    2017-01-01

    In vivo turnover rates of proteins covering the processes of protein synthesis and breakdown rates have been measured in many tissues and protein pools using various techniques. Connective tissue and collagen protein turnover is of specific interest since existing results are rather diverging. The aim of this study is to investigate whether we can verify the presence of protein pools within the same tissue with very distinct turnover rates over the life-span of rats with special focus on connective tissue. Male and female Lewis rats (n = 35) were injected with five different isotopically... living days, indicating very slow turnover. The data support the hypothesis that some proteins synthesized during the early development and growth still exist much later in life of animals and hence have a very slow turnover rate.

  7. Quantifying information transfer by protein domains: Analysis of the Fyn SH2 domain structure

    Directory of Open Access Journals (Sweden)

    Serrano Luis

    2008-10-01

    Full Text Available. Background: Efficient communication between distant sites within a protein is essential for cooperative biological response. Although often associated with large allosteric movements, more subtle changes in protein dynamics can also induce long-range correlations. However, an appropriate formalism that directly relates protein structural dynamics to information exchange between functional sites is still lacking. Results: Here we introduce a method to analyze protein dynamics within the framework of information theory and show that signal transduction within proteins can be considered as a particular instance of communication over a noisy channel. In particular, we analyze the conformational correlations between protein residues and apply the concept of mutual information to quantify information exchange. Mapping out changes of mutual information on the protein structure then allows visualizing how distal communication is achieved. We illustrate the approach by analyzing information transfer by the SH2 domain of Fyn tyrosine kinase, obtained from Monte Carlo dynamics simulations. Our analysis reveals that the Fyn SH2 domain forms a noisy communication channel that couples residues located in the phosphopeptide and specificity binding sites and a number of residues at the other side of the domain near the linkers that connect the SH2 domain to the SH3 and kinase domains. We find that for this particular domain, communication is affected by a series of contiguous residues that connect distal sites by crossing the core of the SH2 domain. Conclusion: As a result, our method provides a means to directly map the exchange of biological information on the structure of protein domains, making it clear how binding triggers conformational changes in the protein structure. As such it provides a structural road, next to the existing attempts at sequence level, to predict long-range interactions within protein structures.
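The mutual-information measure mentioned in this abstract can be illustrated with a small sketch. It assumes each residue's conformation has already been discretized into a state label per simulation frame; that discretization, the function name and the toy data are illustrative assumptions, not the authors' published procedure.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information (in bits) between two equal-length sequences
    of discrete state labels, estimated from observed frequencies."""
    if len(xs) != len(ys) or len(xs) == 0:
        raise ValueError("sequences must be non-empty and of equal length")
    n = len(xs)
    count_x = Counter(xs)            # marginal counts for residue X
    count_y = Counter(ys)            # marginal counts for residue Y
    count_xy = Counter(zip(xs, ys))  # joint counts
    mi = 0.0
    for (x, y), c in count_xy.items():
        p_xy = c / n
        # p_xy / (p_x * p_y) simplifies to c * n / (count_x * count_y)
        mi += p_xy * math.log2(c * n / (count_x[x] * count_y[y]))
    return mi


# Hypothetical per-frame states for two residues (toy data).
coupled = mutual_information([0, 1, 0, 1], [1, 0, 1, 0])
independent = mutual_information([0, 0, 1, 1], [0, 1, 0, 1])
```

In this toy data the first pair of residues always switches state together, while the second pair varies independently, so the first estimate is higher. With real simulation data the estimate depends strongly on how states are defined and on the number of frames.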

  8. Sharing information among existing data sources

    Science.gov (United States)

    Ashley, W. R., III

    1999-01-01

    The sharing of information between law enforcement agencies is a premise for the success of all jurisdictions. A wealth of information resides in both the databases and infrastructures of local, state, and regional agencies. However, this information is often not available to the law enforcement professionals who require it. When the information is available, individual investigators must not only know that it exists, but where it resides, and how to retrieve it. In many cases, these types of cross-jurisdictional communications are limited to personal relationships that result from telephone calls, faxes, and in some cases, e-mail. As criminal elements become more sophisticated and distributed, law enforcement agencies must begin to develop infrastructures and common sharing mechanisms that address a constantly evolving criminal threat. Historically, criminals have taken advantage of the lack of communication between law enforcement agencies. Examples of this are evident in the search for stolen property and monetary dealings. Pawned property, cash transactions, and failure to supply child support are three common cross-jurisdictional crimes that could be better enforced by strengthening the lines of communication. Criminal behavior demonstrates that it is easier to profit from their actions by dealing in separate jurisdictions. For example, stolen property is sold outside of the jurisdiction of its origin. In most cases, simply traveling a short distance to the adjoining county or municipality is sufficient to ensure that apprehension of the criminal or seizure of the stolen property is highly unlikely. In addition to the traditional burglar, fugitives often sell or pawn property to finance their continued evasion from the law. Sharing of information in a rapid manner would increase the ability of law enforcement personnel to track and capture fugitives, as well as criminals.
In an example to combat this threat, the State of Florida recently acted on the need to

  9. Combination of existing and alternative technologies to promote oilseeds and pulses proteins in food applications

    OpenAIRE

    Chéreau Denis; Videcoq Pauline; Ruffieux Cécile; Pichon Lisa; Motte Jean-Charles; Belaid Saliha; Ventureira Jorge; Lopez Michel

    2016-01-01

    The continuous world population growth induces a total protein demand increase based mainly on plant sources. To meet these global nutritional challenges, existing and innovative dry and wet fractionation processes will have to be combined to better valorise plant protein fraction from pulses and oilseeds. The worldwide success of soy protein isolates originates from the intrinsic qualities of soybean proteins but also from a con...

  10. Existing pavement input information for the mechanistic-empirical pavement design guide.

    Science.gov (United States)

    2009-02-01

    The objective of this study is to systematically evaluate the Iowa Department of Transportation's (DOT's) existing Pavement Management Information System (PMIS) with respect to the input information required for Mechanistic-Empirical Pavement Des...

  11. Recording information on protein complexes in an information management system.

    Science.gov (United States)

    Savitsky, Marc; Diprose, Jonathan M; Morris, Chris; Griffiths, Susanne L; Daniel, Edward; Lin, Bill; Daenke, Susan; Bishop, Benjamin; Siebold, Christian; Wilson, Keith S; Blake, Richard; Stuart, David I; Esnouf, Robert M

    2011-08-01

    The Protein Information Management System (PiMS) is a laboratory information management system (LIMS) designed for use with the production of proteins in a research environment. The software is distributed under the CCP4 licence, and so is available free of charge to academic laboratories. Like most LIMS, the underlying PiMS data model originally had no support for protein-protein complexes. To support the SPINE2-Complexes project the developers have extended PiMS to meet these requirements. The modifications to PiMS, described here, include data model changes, additional protocols, some user interface changes and functionality to detect when an experiment may have formed a complex. Example data are shown for the production of a crystal of a protein complex. Integration with SPINE2-Complexes Target Tracker application is also described. Copyright © 2011 Elsevier Inc. All rights reserved.

  12. 76 FR 9376 - Proposed Extension of Existing Information Collection;

    Science.gov (United States)

    2011-02-17

    ... Extension of Existing Information Collection; Refuse Piles and Impoundment Structures, Recordkeeping and... surface installations. More specifically, these sections address refuse piles (30 CFR 77.215), and... combination of materials; and refuse piles are deposits of coal mine waste (other than overburden or spoil...

  13. Relation between Protein Intrinsic Normal Mode Weights and Pre-Existing Conformer Populations.

    Science.gov (United States)

    Ozgur, Beytullah; Ozdemir, E Sila; Gursoy, Attila; Keskin, Ozlem

    2017-04-20

    Intrinsic fluctuations of a protein enable it to sample a large repertoire of conformers including the open and closed forms. These distinct forms of the protein, called conformational substates, pre-exist together in equilibrium as an ensemble independent from its ligands. The role of the ligand might be simply to alter the equilibrium toward the most appropriate form for binding. Normal mode analysis has proved useful in identifying the directions of conformational changes between substates. In this study, we demonstrate that the ratios of normalized weights of a few normal modes driving the protein between its substates can give insights about the ratios of kinetic conversion rates of the substates, although a direct relation between the eigenvalues and kinetic conversion rates or populations of each substate could not be observed. The correlation between the normalized mode weight ratios and the kinetic rate ratios is around 83% on a set of 11 non-enzyme proteins and around 59% on a set of 17 enzymes. The results suggest that mode motions carry intrinsic relations with thermodynamics and kinetics of the proteins.

  14. Improving protein-protein interaction prediction using evolutionary information from low-quality MSAs.

    Science.gov (United States)

    Várnai, Csilla; Burkoff, Nikolas S; Wild, David L

    2017-01-01

    Evolutionary information stored in multiple sequence alignments (MSAs) has been used to identify the interaction interface of protein complexes, by measuring either co-conservation or co-mutation of amino acid residues across the interface. Recently, maximum entropy related correlated mutation measures (CMMs) such as direct information, decoupling direct from indirect interactions, have been developed to identify residue pairs interacting across the protein complex interface. These studies have focussed on carefully selected protein complexes with large, good-quality MSAs. In this work, we study protein complexes with a more typical MSA consisting of fewer than 400 sequences, using a set of 79 intramolecular protein complexes. Using a maximum entropy based CMM at the residue level, we develop an interface level CMM score to be used in re-ranking docking decoys. We demonstrate that our interface level CMM score compares favourably to the complementarity trace score, an evolutionary information-based score measuring co-conservation, when combined with the number of interface residues, a knowledge-based potential and the variability score of individual amino acid sites. We also demonstrate that, since co-mutation and co-complementarity in the MSA contain orthogonal information, the best prediction performance using evolutionary information can be achieved by combining the co-mutation information of the CMM with co-conservation information of a complementarity trace score, predicting a near-native structure as the top prediction for 41% of the dataset. The method presented is not restricted to small MSAs, and will likely improve interface prediction also for complexes with large and good-quality MSAs.

  15. Computational methods using weighed-extreme learning machine to predict protein self-interactions with protein evolutionary information.

    Science.gov (United States)

    An, Ji-Yong; Zhang, Lei; Zhou, Yong; Zhao, Yu-Jun; Wang, Da-Fu

    2017-08-18

    Self-interacting proteins (SIPs) are important for biological activity owing to the inherent interactions among their secondary structures or domains. However, due to the limitations of experimental self-interaction detection, one major challenge in SIP prediction is how to exploit computational approaches for SIP detection based on the evolutionary information contained in protein sequences. In this work, we present a novel computational approach named WELM-LAG, which combines the Weighed-Extreme Learning Machine (WELM) classifier with Local Average Group (LAG) to predict SIPs based on protein sequence. The major improvement of our method lies in presenting an effective feature extraction method used to represent candidate self-interacting proteins by exploring the evolutionary information embedded in the PSI-BLAST-constructed position specific scoring matrix (PSSM), and then employing a reliable and robust WELM classifier to carry out classification. In addition, the Principal Component Analysis (PCA) approach is used to reduce the impact of noise. The WELM-LAG method gave very high average accuracies of 92.94% and 96.74% on yeast and human datasets, respectively. Meanwhile, we compared it with the state-of-the-art support vector machine (SVM) classifier and other existing methods on human and yeast datasets, respectively. Comparative results indicated that our approach is very promising and may provide a cost-effective alternative for predicting SIPs. In addition, we developed a freely available web server called WELM-LAG-SIPs to predict SIPs. The web server is available at http://219.219.62.123:8888/WELMLAG/ .

  16. 76 FR 3178 - Proposed Extension of Existing Information Collection; Rock Burst Control Plan, Metal and...

    Science.gov (United States)

    2011-01-19

    ... Extension of Existing Information Collection; Rock Burst Control Plan, Metal and Nonmetal Mines AGENCY: Mine... extension of the information collection for 30 CFR 57.3461 Rock Bursts. DATES: All comments must be received... contains the request for an extension of the existing collection of information in 30 CFR 57.3461 Rock...

  17. Protein complex prediction in large ontology attributed protein-protein interaction networks.

    Science.gov (United States)

    Zhang, Yijia; Lin, Hongfei; Yang, Zhihao; Wang, Jian; Li, Yanpeng; Xu, Bo

    2013-01-01

    Protein complexes are important for unraveling the secrets of cellular organization and function. Many computational approaches have been developed to predict protein complexes in protein-protein interaction (PPI) networks. However, most existing approaches focus mainly on the topological structure of PPI networks, and largely ignore the gene ontology (GO) annotation information. In this paper, we constructed ontology attributed PPI networks with PPI data and GO resource. After constructing ontology attributed networks, we proposed a novel approach called CSO (clustering based on network structure and ontology attribute similarity). Structural information and GO attribute information are complementary in ontology attributed networks. CSO can effectively take advantage of the correlation between frequent GO annotation sets and the dense subgraph for protein complex prediction. Our proposed CSO approach was applied to four different yeast PPI data sets and predicted many well-known protein complexes. The experimental results showed that CSO was valuable in predicting protein complexes and achieved state-of-the-art performance.

  18. Integration of relational and hierarchical network information for protein function prediction

    Directory of Open Access Journals (Sweden)

    Jiang Xiaoyu

    2008-08-01

    Full Text Available. Background: In the current climate of high-throughput computational biology, the inference of a protein's function from related measurements, such as protein-protein interaction relations, has become a canonical task. Most existing technologies pursue this task as a classification problem, on a term-by-term basis, for each term in a database, such as the Gene Ontology (GO) database, a popular rigorous vocabulary for biological functions. However, ontology structures are essentially hierarchies, with certain top to bottom annotation rules which protein function predictions should in principle follow. Currently, the most common approach to imposing these hierarchical constraints on network-based classifiers is through the use of transitive closure to predictions. Results: We propose a probabilistic framework to integrate information in relational data, in the form of a protein-protein interaction network, and a hierarchically structured database of terms, in the form of the GO database, for the purpose of protein function prediction. At the heart of our framework is a factorization of local neighborhood information in the protein-protein interaction network across successive ancestral terms in the GO hierarchy. We introduce a classifier within this framework, with computationally efficient implementation, that produces GO-term predictions that naturally obey a hierarchical 'true-path' consistency from root to leaves, without the need for further post-processing. Conclusion: A cross-validation study, using data from the yeast Saccharomyces cerevisiae, shows our method offers substantial improvements over both standard 'guilt-by-association' (i.e., Nearest-Neighbor) and more refined Markov random field methods, whether in their original form or when post-processed to artificially impose 'true-path' consistency.
Further analysis of the results indicates that these improvements are associated with increased predictive capabilities (i.e., increased
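The transitive-closure post-processing that this abstract describes as the common baseline can be sketched as follows. This is not the authors' probabilistic classifier; it only shows how a set of predicted terms is closed upward so that every ancestor of a predicted term is also predicted ('true-path' consistency). The term names and the `parents` mapping are hypothetical placeholders, not real GO identifiers.

```python
def ancestors(term, parents):
    """Return all ancestors of `term` in a hierarchy given as a
    mapping from each term to an iterable of its direct parents."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for parent in parents.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def close_upward(predicted, parents):
    """Add every ancestor of each predicted term, so the result
    satisfies the true-path rule."""
    closed = set(predicted)
    for term in predicted:
        closed |= ancestors(term, parents)
    return closed


# Hypothetical toy hierarchy: root -> A -> A1, root -> B.
toy_parents = {"A": ["root"], "B": ["root"], "A1": ["A"]}
consistent = close_upward({"A1"}, toy_parents)
```

Here a prediction of the leaf term alone is expanded to include its parent and the root. A classifier built to respect the hierarchy, as proposed in the abstract, would not need this separate step.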

  19. The Proteins API: accessing key integrated protein and genome information.

    Science.gov (United States)

    Nightingale, Andrew; Antunes, Ricardo; Alpi, Emanuele; Bursteinas, Borisas; Gonzales, Leonardo; Liu, Wudong; Luo, Jie; Qi, Guoying; Turner, Edd; Martin, Maria

    2017-07-03

    The Proteins API provides searching and programmatic access to protein and associated genomics data such as curated protein sequence positional annotations from UniProtKB, as well as mapped variation and proteomics data from large scale data sources (LSS). Using the coordinates service, researchers are able to retrieve the genomic sequence coordinates for proteins in UniProtKB. Thus, the LSS genomics and proteomics data for UniProt proteins are programmatically only available through this service. A Swagger UI has been implemented to provide documentation and an interface for users with little or no programming experience to 'talk' to the services, quickly and easily formulate queries, and obtain dynamically generated source code for popular programming languages, such as Java, Perl, Python and Ruby. Search results are returned as standard JSON, XML or GFF data objects. The Proteins API is a scalable, reliable, fast and easy-to-use set of RESTful services that provide a broad protein information resource for users to ask questions based upon their field of expertise, allowing them to gain an integrated overview of the protein annotations available to aid their knowledge of proteins in biological processes. The Proteins API is available at (http://www.ebi.ac.uk/proteins/api/doc). © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
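A minimal sketch of programmatic access to a RESTful service of this kind is given below, using only the Python standard library. The base URL is derived from the documentation address in the abstract; the `coordinates/<accession>` path layout, the helper names and the example accession are assumptions that should be checked against the Swagger documentation before use.

```python
import json
import urllib.parse
import urllib.request

# Assumed base URL, taken from the documentation address in the abstract.
BASE_URL = "https://www.ebi.ac.uk/proteins/api"


def build_url(service, accession):
    """Build a request URL such as <base>/coordinates/<accession>.
    The path layout is an assumption, not confirmed by the abstract."""
    return "{}/{}/{}".format(
        BASE_URL, service.strip("/"), urllib.parse.quote(accession, safe="")
    )


def fetch_json(url, timeout=30):
    """Request a URL, asking for JSON, and return the decoded object."""
    request = urllib.request.Request(url, headers={"Accept": "application/json"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Example usage (performs a network request, so it is left commented out):
# record = fetch_json(build_url("coordinates", "P12345"))
```

Requesting JSON through the `Accept` header follows the abstract's statement that results are returned as JSON, XML or GFF; the other formats would need a different header value.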

  20. IGF-Binding Proteins: Why Do They Exist and Why Are There So Many?

    Directory of Open Access Journals (Sweden)

    John B. Allard

    2018-04-01

    Full Text Available. Insulin-like growth factors (IGFs) are key growth-promoting peptides that act as both endocrine hormones and autocrine/paracrine growth factors. In the bloodstream and in local tissues, most IGF molecules are bound by one of the members of the IGF-binding protein (IGFBP) family, of which six distinct types exist. These proteins bind to IGF with an equal or greater affinity than the IGF1 receptor and are thus in a key position to regulate IGF signaling globally and locally. Binding to an IGFBP increases the half-life of IGF in the circulation and blocks its potential binding to the insulin receptor. In addition to these classical roles, IGFBPs have been shown to modulate IGF signaling locally under various conditions. Although members of the IGFBP family share significant sequence homology, they each have unique structural features and play distinct roles. These IGFBP genes also have different modes of regulation and distinct expression patterns. Some IGFBPs have been found to bind to their own receptors or to translocate into the interior compartments of cells where they may execute IGF-independent actions. In spite of this functional and regulatory diversity, it has been puzzling that loss-of-function studies have yielded relatively little information about the physiological functions of IGFBPs. In this review, we suggest that evolution has tended to retain an array of IGFBPs in order to facilitate fine-tuning of IGF signaling. We explore the emerging explanation that many IGFBP functions have evolved to allow the targeted adjustment of IGF signaling under stressful or irregular conditions, which would likely not be revealed in a standard laboratory setting.

  1. Protein Sub-Nuclear Localization Prediction Using SVM and Pfam Domain Information

    Science.gov (United States)

    Kumar, Ravindra; Jain, Sohni; Kumari, Bandana; Kumar, Manish

    2014-01-01

    The nucleus is the largest and a highly organized organelle of eukaryotic cells. Within the nucleus exist a number of pseudo-compartments, which are not separated by any membrane, yet each of them contains only a specific set of proteins. Understanding protein sub-nuclear localization can hence be an important step towards understanding biological functions of the nucleus. Here we describe a method, SubNucPred, developed by us for predicting the sub-nuclear localization of proteins. This method predicts protein localization for 10 different sub-nuclear locations sequentially by combining the presence or absence of unique Pfam domains with an amino acid composition based SVM model. The prediction accuracy during leave-one-out cross-validation for centromeric proteins was 85.05%, for chromosomal proteins 76.85%, for nuclear speckle proteins 81.27%, for nucleolar proteins 81.79%, for nuclear envelope proteins 79.37%, for nuclear matrix proteins 77.78%, for nucleoplasm proteins 76.98%, for nuclear pore complex proteins 88.89%, for PML body proteins 75.40% and for telomeric proteins 83.33%. Comparison with other reported methods showed that SubNucPred performs better than existing methods. A web-server for predicting protein sub-nuclear localization named SubNucPred has been established at http://14.139.227.92/mkumar/subnucpred/. A standalone version of SubNucPred can also be downloaded from the web-server. PMID:24897370
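The amino acid composition feature mentioned in this abstract can be sketched as a 20-dimensional vector of residue fractions, which is the kind of input an SVM could be trained on. This is an illustrative sketch under that assumption, not the SubNucPred implementation; the function name and the choice to ignore non-standard residue codes are hypothetical.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes


def amino_acid_composition(sequence):
    """Return the fraction of each standard amino acid in `sequence`
    as a list ordered like AMINO_ACIDS. Characters outside the 20
    standard codes are ignored (an assumption of this sketch)."""
    counts = {aa: 0 for aa in AMINO_ACIDS}
    total = 0
    for residue in sequence.upper():
        if residue in counts:
            counts[residue] += 1
            total += 1
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


# Hypothetical toy sequence: two alanines, one cysteine, one aspartate.
features = amino_acid_composition("AACD")
```

Each protein maps to a vector of the same length regardless of sequence length, which is what makes composition convenient as classifier input, at the cost of discarding residue order.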

  2. Using Existing Response Repertoires to Make Sense of Information System Implementation

    DEFF Research Database (Denmark)

    Jensen, Tina Blegind; Kjærgaard, Annemette Leonhardt

    2010-01-01

    The implementation of information systems (IS) in organizations often triggers new situations in which users experience a disruption of existing work patterns and routines. Sensemaking becomes central in making users’ meanings explicit, serving as a foundation for further actions and interactions...... with the new technology. The purpose of this paper is to study how users make sense of new technologies by building on existing response repertoires. Empirically, we present findings from a study of an Electronic Patient Record (EPR) system implementation in two Danish hospital wards. Our findings illustrate...... to existing literature by providing a detailed account of how users’ early sensemaking of a technology influences their subsequent actions and reactions towards it. Our findings support managers in understanding users’ perceptions of a new technology, helping them in planning and executing the implementation...

  3. Combination of existing and alternative technologies to promote oilseeds and pulses proteins in food applications

    Directory of Open Access Journals (Sweden)

    Chéreau Denis

    2016-07-01

    Full Text Available. The continuous world population growth induces a total protein demand increase based mainly on plant sources. To meet these global nutritional challenges, existing and innovative dry and wet fractionation processes will have to be combined to better valorise plant protein fraction from pulses and oilseeds. The worldwide success of soy protein isolates originates from the intrinsic qualities of soybean proteins but also from a continuous R&D effort since the mid-twentieth century. Therefore, the soy protein development model can be applied to protein isolates from diverse pulses and oilseed meals such as rapeseed, which has already been recognised as a novel food protein in Europe. To boost the delivery of plant proteins, agrofood industries and academics must pool their respective expertise. Innovative and issue-solving R&D projects have to be launched to better valorise pulses and oilseed proteins by (i) creating oil extraction processes which preserve native protein structure; (ii) developing novel protein extraction processes from lab up to industrial pilot scale; (iii) producing plant protein isolates having foaming, emulsifying or gelling functionality comparable to that of animal proteins; and (iv) generating hydrolysed proteins with high digestibility adapted to human nutrition. It is also essential to initiate research programs to innovate in wet and dry fractionations of plants or to design in vitro models to evaluate protein digestibility and allergenicity. The increased awareness regarding plant protein valorisation resulted in the creation by agro-industries and academics of the open platform IMPROVE, which proposes a combination of competencies and equipment to boost market uptake of Plant Based Proteins.

  4. 76 FR 30741 - Agency Information Collection Activities: Existing Collection; Comments Requested: Prison...

    Science.gov (United States)

    2011-05-26

    ..., practitioners, researchers, students, the media, and others interested in criminal justice statistics. (5) An... DEPARTMENT OF JUSTICE Office of Justice Programs [OMB Number 1121-0102] Agency Information Collection Activities: Existing Collection; Comments Requested: Prison Population Reports: Summary of...

  5. 77 FR 4574 - Agency Information Collection Activities: USCIS Case Status Online; Extension of an Existing...

    Science.gov (United States)

    2012-01-30

    ... DEPARTMENT OF HOMELAND SECURITY U.S. Citizenship and Immigration Services Agency Information Collection Activities: USCIS Case Status Online; Extension of an Existing Information Collection; Comment Request ACTION: 60-Day Notice of Information Collection Under Review: USCIS Case Status Online. The...

  6. The Protein Model Portal--a comprehensive resource for protein structure and model information.

    Science.gov (United States)

    Haas, Juergen; Roth, Steven; Arnold, Konstantin; Kiefer, Florian; Schmidt, Tobias; Bordoli, Lorenza; Schwede, Torsten

    2013-01-01

    The Protein Model Portal (PMP) has been developed to foster effective use of 3D molecular models in biomedical research by providing convenient and comprehensive access to structural information for proteins. Both experimental structures and theoretical models for a given protein can be searched simultaneously and analyzed for structural variability. By providing a comprehensive view on structural information, PMP offers the opportunity to apply consistent assessment and validation criteria to the complete set of structural models available for proteins. PMP is an open project so that new methods developed by the community can contribute to PMP, for example, new modeling servers for creating homology models and model quality estimation servers for model validation. The accuracy of participating modeling servers is continuously evaluated by the Continuous Automated Model EvaluatiOn (CAMEO) project. The PMP offers a unique interface to visualize structural coverage of a protein combining both theoretical models and experimental structures, allowing straightforward assessment of the model quality and hence their utility. The portal is updated regularly and actively developed to include latest methods in the field of computational structural biology. Database URL: http://www.proteinmodelportal.org.

  7. The Protein Model Portal—a comprehensive resource for protein structure and model information

    Science.gov (United States)

    Haas, Juergen; Roth, Steven; Arnold, Konstantin; Kiefer, Florian; Schmidt, Tobias; Bordoli, Lorenza; Schwede, Torsten

    2013-01-01

    The Protein Model Portal (PMP) has been developed to foster effective use of 3D molecular models in biomedical research by providing convenient and comprehensive access to structural information for proteins. Both experimental structures and theoretical models for a given protein can be searched simultaneously and analyzed for structural variability. By providing a comprehensive view on structural information, PMP offers the opportunity to apply consistent assessment and validation criteria to the complete set of structural models available for proteins. PMP is an open project so that new methods developed by the community can contribute to PMP, for example, new modeling servers for creating homology models and model quality estimation servers for model validation. The accuracy of participating modeling servers is continuously evaluated by the Continuous Automated Model EvaluatiOn (CAMEO) project. The PMP offers a unique interface to visualize structural coverage of a protein combining both theoretical models and experimental structures, allowing straightforward assessment of the model quality and hence their utility. The portal is updated regularly and actively developed to include latest methods in the field of computational structural biology. Database URL: http://www.proteinmodelportal.org PMID:23624946

  8. 77 FR 33002 - Proposed Extension of Existing Information Collection; Health Standards for Diesel Particulate...

    Science.gov (United States)

    2012-06-04

    ... information in accordance with the Paperwork Reduction Act of 1995. This program helps to assure that requested data can be provided in the desired format, reporting burden (time and financial resources) is... Extension of Existing Information Collection; Health Standards for Diesel Particulate Matter Exposure...

  9. 77 FR 58173 - Proposed Extension of Existing Information Collection; Explosive Materials and Blasting Units...

    Science.gov (United States)

    2012-09-19

    ... information in accordance with the Paperwork Reduction Act of 1995. This program helps to assure that requested data can be provided in the desired format, reporting burden (time and financial resources) is... Extension of Existing Information Collection; Explosive Materials and Blasting Units (Pertains to Metal and...

  10. Annotating the protein-RNA interaction sites in proteins using evolutionary information and protein backbone structure.

    Science.gov (United States)

    Li, Tao; Li, Qian-Zhong

    2012-11-07

    RNA-protein interactions play important roles in various biological processes. The precise detection of RNA-protein interaction sites is very important for understanding essential biological processes and annotating the function of the proteins. In this study, based on various features from amino acid sequence and structure, including evolutionary information, solvent accessible surface area and torsion angles (φ, ψ) in the backbone structure of the polypeptide chain, a computational method for predicting RNA-binding sites in proteins is proposed. When the method is applied to predict RNA-binding sites in three datasets (RBP86 containing 86 protein chains, RBP107 containing 107 protein chains and RBP109 containing 109 protein chains), better sensitivities and specificities are obtained compared to previously published methods in five-fold cross-validation tests. To further examine the performance of our method, the RBP107 dataset is used as the training set, and the RBP86 and RBP109 datasets are used as independent test sets. In addition, as examples of our prediction, RNA-binding sites in a few proteins are presented. The annotated results are consistent with the PDB annotation. These results show that our method is useful for annotating RNA-binding sites of novel proteins.
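
    The abstract above reports results as sensitivities and specificities. As a minimal sketch of how these two figures are usually computed from per-residue binary labels (1 = RNA-binding, 0 = non-binding), the following may help; the function name and the toy labels are hypothetical and are not taken from the paper.

```python
def sensitivity_specificity(actual, predicted):
    """Return (sensitivity, specificity) for binary labels, 1 = binding site.

    Assumes both classes occur in `actual`; otherwise a division by zero occurs.
    """
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)  # binding, predicted binding
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)  # binding, missed
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)  # non-binding, predicted non-binding
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)  # non-binding, predicted binding
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical labels for eight residues of one chain.
actual = [1, 1, 1, 0, 0, 0, 0, 0]
predicted = [1, 1, 0, 0, 0, 0, 1, 0]
sens, spec = sensitivity_specificity(actual, predicted)
print(f"sensitivity={sens:.3f} specificity={spec:.3f}")
```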

  11. Analysis of informational redundancy in the protein-assembling machinery

    Science.gov (United States)

    Berkovich, Simon

    2004-03-01

    Entropy analysis of the DNA structure does not reveal a significant departure from randomness, indicating a lack of informational redundancy. This signifies the absence of a hidden meaning in the genome text and supports the 'barcode' interpretation of DNA given in [1]. Lack of informational redundancy is a characteristic property of an identification label rather than of a message of instructions. Yet randomness of DNA has to induce non-random structures of the proteins. Protein synthesis is a two-step process: transcription into RNA with gene splicing, and formation of a structure of amino acids. Entropy estimations, performed by A. Djebbari, show typical values of redundancy of the biomolecules along these pathways: DNA gene 4%, proteins 15-40%. In gene expression, the RNA copy carries the same information as the original DNA template. Randomness is essentially eliminated only at the step of the protein creation by a degenerate code. According to [1], the significance of the substitution of U for T with a subsequent gene splicing is that these transformations result in a different pattern of RNA oscillations, so the vital DNA communications are protected against extraneous noise coming from the protein-making activities. 1. S. Berkovich, "On the 'barcode' functionality of DNA, or the Phenomenon of Life in the Physical Universe", Dorrance Publishing Co., Pittsburgh, 2003
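
    To make the notion of informational redundancy concrete, here is a minimal sketch that estimates it from single-symbol frequencies as R = 1 - H / H_max, where H is the Shannon entropy and H_max = log2(alphabet size). This zeroth-order estimate is an assumption chosen for illustration; the abstract does not say which estimator was used, and the toy sequence is hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(sequence: str) -> float:
    """Shannon entropy in bits per symbol, from single-symbol frequencies."""
    counts = Counter(sequence)
    total = len(sequence)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def redundancy(sequence: str, alphabet_size: int) -> float:
    """Redundancy R = 1 - H / H_max, with H_max = log2(alphabet_size)."""
    return 1.0 - shannon_entropy(sequence) / math.log2(alphabet_size)


# Hypothetical toy sequence with equal base frequencies.
dna = "ACGT" * 25
print(round(redundancy(dna, alphabet_size=4), 3))
```

    A sequence whose symbols are used equally often has no redundancy under this estimate, while a sequence made of a single repeated symbol is fully redundant.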

  12. 77 FR 58170 - Proposed Renewal of Existing Information Collection; Fire Protection (Underground Coal Mines)

    Science.gov (United States)

    2012-09-19

    ... Renewal of Existing Information Collection; Fire Protection (Underground Coal Mines) AGENCY: Mine Safety... INFORMATION: I. Background Fire protection standards for underground coal mines are based on section 311(a) of the Federal Mine Safety and Health Act of 1977 (Mine Act). 30 CFR 75.1100 requires that each coal mine...

  13. Identifying essential proteins based on sub-network partition and prioritization by integrating subcellular localization information.

    Science.gov (United States)

    Li, Min; Li, Wenkai; Wu, Fang-Xiang; Pan, Yi; Wang, Jianxin

    2018-06-14

    Essential proteins are important participants in various life activities and play a vital role in the survival and reproduction of living organisms. Identification of essential proteins from protein-protein interaction (PPI) networks has great significance for the study of complex human diseases, the design of drugs and the development of bioinformatics and computational science. Studies have shown that highly connected proteins in a PPI network tend to be essential. A series of computational methods have been proposed to identify essential proteins by analyzing topological structures of PPI networks. However, the high noise in the PPI data can degrade the accuracy of essential protein prediction. Moreover, proteins must be in the appropriate subcellular location to perform their functions, and they can interact with each other only when they are located in the same subcellular location. In this paper, we propose a new network-based essential protein discovery method based on sub-network partition and prioritization by integrating subcellular localization information, named SPP. SPP was tested on two different yeast PPI networks obtained from the DIP and BioGRID databases. The experimental results show that SPP can effectively reduce the effect of false positives in PPI networks and predict essential proteins more accurately than other existing computational methods (DC, BC, CC, SC, EC, IC and NC). Copyright © 2018 Elsevier Ltd. All rights reserved.
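
    The observation that highly connected proteins tend to be essential underlies the simplest of the baselines listed above. The sketch below ranks proteins by their number of distinct interaction partners, assuming that DC denotes degree centrality as is usual in this literature. It illustrates that baseline only, not SPP, and the edge list and protein names are hypothetical.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Count distinct interaction partners per protein in an undirected PPI network."""
    neighbors = defaultdict(set)
    for a, b in edges:
        if a != b:  # ignore self-interactions
            neighbors[a].add(b)
            neighbors[b].add(a)
    return {protein: len(partners) for protein, partners in neighbors.items()}


def top_candidates(edges, k):
    """Return the k proteins with the highest degree; ties are broken by name."""
    scores = degree_centrality(edges)
    return sorted(scores, key=lambda p: (-scores[p], p))[:k]


# Hypothetical toy network.
toy_edges = [("P1", "P2"), ("P1", "P3"), ("P1", "P4"), ("P2", "P3"), ("P4", "P5")]
print(top_candidates(toy_edges, 2))
```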

  14. Molecular eyes: proteins that transform light into biological information

    NARCIS (Netherlands)

    Kennis, J.T.M.; Mathes, T.

    2013-01-01

    Most biological photoreceptors are protein/cofactor complexes that induce a physiological reaction upon absorption of a photon. Therefore, these proteins represent signal converters that translate light into biological information. Researchers use this property to stimulate and study various

  15. Protein domain recurrence and order can enhance prediction of protein functions

    KAUST Repository

    Abdel Messih, Mario A.

    2012-09-07

    Motivation: Burgeoning sequencing technologies have generated massive amounts of genomic and proteomic data. Annotating the functions of proteins identified in this data has become a big and crucial problem. Various computational methods have been developed to infer protein functions based on either the sequences or the domains of proteins. The existing methods, however, ignore the recurrence and the order of the protein domains in this function inference. Results: We developed two new methods to infer protein functions based on protein domain recurrence and domain order. Our first method, DRDO, calculates the posterior probability of the Gene Ontology terms based on domain recurrence and domain order information, whereas our second method, DRDO-NB, relies on the naïve Bayes methodology using the same domain architecture information. Our large-scale benchmark comparisons show strong improvements in the accuracy of the protein function inference achieved by our new methods, demonstrating that domain recurrence and order can provide important information for inference of protein functions. © The Author(s) 2012. Published by Oxford University Press.

  16. Predicting protein complexes from weighted protein-protein interaction graphs with a novel unsupervised methodology: Evolutionary enhanced Markov clustering.

    Science.gov (United States)

    Theofilatos, Konstantinos; Pavlopoulou, Niki; Papasavvas, Christoforos; Likothanassis, Spiros; Dimitrakopoulos, Christos; Georgopoulos, Efstratios; Moschopoulos, Charalampos; Mavroudi, Seferina

    2015-03-01

    Proteins are considered to be the most important individual components of biological systems and they combine to form physical protein complexes which are responsible for certain molecular functions. Despite the large availability of protein-protein interaction (PPI) information, not much information is available about protein complexes. Experimental methods are limited in terms of time, efficiency, cost and performance constraints. Existing computational methods have provided encouraging preliminary results, but they face certain disadvantages: they require parameter tuning, some of them cannot handle weighted PPI data, and others do not allow a protein to participate in more than one protein complex. In the present paper, we propose a new fully unsupervised methodology for predicting protein complexes from weighted PPI graphs. The proposed methodology is called evolutionary enhanced Markov clustering (EE-MC) and it is a hybrid combination of an adaptive evolutionary algorithm and a state-of-the-art clustering algorithm named enhanced Markov clustering. EE-MC was compared with state-of-the-art methodologies when applied to datasets from human and from the yeast Saccharomyces cerevisiae. Using publicly available datasets, EE-MC outperformed existing methodologies (in some datasets the separation metric was increased by 10-20%). Moreover, when applied to new human datasets its performance was encouraging in the prediction of protein complexes which consist of proteins with high functional similarity. Specifically, 5737 protein complexes were predicted and 72.58% of them are enriched for at least one gene ontology (GO) function term. EE-MC is by design able to overcome intrinsic limitations of existing methodologies such as their inability to handle weighted PPI networks, their constraint to assign every protein to exactly one cluster and the difficulties they face concerning parameter tuning. This fact was experimentally validated and moreover, new

  17. Residential dynamics: the co-existence of formal and informal systems in Khartoum, Sudan

    CSIR Research Space (South Africa)

    Osman, A

    2010-05-01

    Full Text Available This paper looks at the residential dynamics in Khartoum, Sudan. Some patterns demonstrate that formal and informal systems co-exist and are mutually supportive. There are also particular spatial manifestations that have resulted from a unique socio...

  18. RStrucFam: a web server to associate structure and cognate RNA for RNA-binding proteins from sequence information.

    Science.gov (United States)

    Ghosh, Pritha; Mathew, Oommen K; Sowdhamini, Ramanathan

    2016-10-07

    RNA-binding proteins (RBPs) interact with their cognate RNA(s) to form large biomolecular assemblies. They are versatile in their functionality and are involved in a myriad of processes inside the cell. RBPs with similar structural features and common biological functions are grouped together into families and superfamilies. It will be useful to obtain an early understanding and association of the RNA-binding property of sequences of gene products. Here, we report a web server, RStrucFam, to predict the structure, type of cognate RNA(s) and function(s) of proteins, where possible, from mere sequence information. The web server employs Hidden Markov Model scan (hmmscan) to enable association to a back-end database of structural and sequence families. The database (HMMRBP) comprises 437 HMMs of RBP families of known structure that have been generated using structure-based sequence alignments and 746 sequence-centric RBP family HMMs. The input protein sequence is associated with structural or sequence domain families, if structure or sequence signatures exist. If the protein is associated with a family of known structures, output features such as a multiple structure-based sequence alignment (MSSA) of the query with all other members of that family are provided. Further, cognate RNA partner(s) for that protein, Gene Ontology (GO) annotations, if any, and a homology model of the protein can be obtained. Users can also browse the database for details pertaining to each family, protein or RNA and their related information based on keyword search or RNA motif search. RStrucFam is a web server that exploits structurally conserved features of RBPs, derived from known family members and imprinted in mathematical profiles, to predict putative RBPs from sequence information. Proteins that fail to associate with such structure-centric families are further queried against the sequence-centric RBP family HMMs in the HMMRBP database. Further, all other essential

  19. Mapping protein information to disease terminologies

    Directory of Open Access Journals (Sweden)

    Mottaz Anaïs

    2007-12-01

    Full Text Available In order to improve the accessibility of genomic and proteomic information to medical researchers, we have developed a procedure to link biological information on proteins involved in diseases to the MeSH and ICD-10 disease terminologies. For this purpose, we took advantage of the manually curated disease annotations in more than 2,000 human protein entries of the UniProt KnowledgeBase. We mapped disease names extracted from the entry comment lines or from the corresponding OMIM entry to the MeSH. The method was assessed on a benchmark set of 200 manually mapped disease comment lines. We obtained a recall of 54% for 91% precision. The same procedure was used to map the more than 3,000 diseases in Swiss-Prot to MeSH with comparable efficiency. Tested on ICD-10, the coverage of the mapped terms was lower, which could be explained by the coarse-grained structure of this terminology for hereditary disease description. The mapping is provided as supplementary material at http://research.isbsib.ch/unimed.

  20. 78 FR 69447 - Agency Information Collection Activities; Existing Collection, Comments Requested: Friction Ridge...

    Science.gov (United States)

    2013-11-19

    ... DEPARTMENT OF JUSTICE Federal Bureau of Investigation [OMB Number 1110-0046] Agency Information Collection Activities; Existing Collection, Comments Requested: Friction Ridge Cards: Arrest and Institution... expired. Reference: OMB control number of 1110-0046. (2) The title of the form/collection: Friction Ridge...

  1. ProDis-ContSHC: learning protein dissimilarity measures and hierarchical context coherently for protein-protein comparison in protein database retrieval.

    Science.gov (United States)

    Wang, Jingyan; Gao, Xin; Wang, Quanquan; Li, Yongping

    2012-05-08

    The need to retrieve or classify protein molecules using structure or sequence-based similarity measures underlies a wide range of biomedical applications. Traditional protein search methods rely on a pairwise dissimilarity/similarity measure for comparing a pair of proteins. Pairwise measures of this kind suffer from the limitation of neglecting the distribution of other proteins and thus cannot satisfy the need for high accuracy in retrieval systems. Recent work in the machine learning community has shown that exploiting the global structure of the database and learning contextual dissimilarity/similarity measures can improve retrieval performance significantly. However, most existing contextual dissimilarity/similarity learning algorithms work in an unsupervised manner, which does not utilize the information of the known class labels of proteins in the database. In this paper, we propose a novel protein-protein dissimilarity learning algorithm, ProDis-ContSHC. ProDis-ContSHC regularizes an existing dissimilarity measure dij by considering the contextual information of the proteins. The context of a protein is defined by its neighboring proteins. The basic idea is that, for a pair of proteins (i, j), if their contexts N(i) and N(j) are similar to each other, the two proteins should also have a high similarity. We implement this idea by regularizing dij by a factor learned from the contexts N(i) and N(j). Moreover, we divide the context into hierarchical sub-contexts and obtain the contextual dissimilarity vector for each protein pair. Using the class label information of the proteins, we select the relevant (a pair of proteins that has the same class labels) and irrelevant (with different labels) protein pairs, and train an SVM model to distinguish between their contextual dissimilarity vectors. The SVM model is further used to learn a supervised regularizing factor. Finally, with the new Supervised learned Dissimilarity measure, we update the Protein Hierarchical

  2. 76 FR 3175 - Proposed Extension of Existing Information Collection; Hoist Operators' Physical Fitness

    Science.gov (United States)

    2011-01-19

    ... Extension of Existing Information Collection; Hoist Operators' Physical Fitness AGENCY: Mine Safety and... fitness. DATES: All comments must be received by midnight Eastern Standard Time on March 21, 2011... 56.19057 and 57.19057 require the annual examination and certification of hoist operators' fitness by...

  3. Protein Annotators' Assistant: A Novel Application of Information Retrieval Techniques.

    Science.gov (United States)

    Wise, Michael J.

    2000-01-01

    Protein Annotators' Assistant (PAA) is a software system which assists protein annotators in assigning functions to newly sequenced proteins. PAA employs a number of information retrieval techniques in a novel setting and is thus related to text categorization, where multiple categories may be suggested, except that in this case none of the…

  4. 75 FR 79030 - Proposed Extension of Existing Information Collection; Training Plans and Records of Training

    Science.gov (United States)

    2010-12-17

    ... Extension of Existing Information Collection; Training Plans and Records of Training AGENCY: Mine Safety and... extension of the information collection for Training Plans and Records of Training, 30 CFR 48.3, 48.9, 48.23... require training plans for underground and surface mines, respectively. The standards are intended to...

  5. Exhaustive search of linear information encoding protein-peptide recognition.

    Science.gov (United States)

    Kelil, Abdellali; Dubreuil, Benjamin; Levy, Emmanuel D; Michnick, Stephen W

    2017-04-01

    High-throughput in vitro methods have been extensively applied to identify linear information that encodes peptide recognition. However, these methods are limited in the number of peptides, sequence variation, and length of peptides that can be explored, and often produce solutions that are not found in the cell. Despite the large number of methods developed to address these issues, the exhaustive search of linear information encoding protein-peptide recognition has so far been physically infeasible. Here, we describe a strategy, called DALEL, for the exhaustive search of linear sequence information encoded in proteins that bind to a common partner. We applied DALEL to explore the binding specificity of SH3 domains in the budding yeast Saccharomyces cerevisiae. Using only the polypeptide sequences of SH3 domain binding proteins, we succeeded in identifying the majority of known SH3 binding sites previously discovered either in vitro or in vivo. Moreover, we discovered a number of sites with both non-canonical sequences and distinct properties that may serve ancillary roles in peptide recognition. We compared DALEL to a variety of state-of-the-art algorithms in the blind identification of known binding sites of the human Grb2 SH3 domain. We also benchmarked DALEL on curated biological motifs derived from the ELM database to evaluate the effect of increasing or decreasing the enrichment of the motifs. Our strategy can be applied in conjunction with experimental data on proteins interacting with a common partner to identify binding sites among them. It can also be applied to any group of proteins of interest to identify enriched linear motifs or to exhaustively explore the space of linear information encoded in a polypeptide sequence. Finally, we have developed a webserver located at http://michnick.bcm.umontreal.ca/dalel, offering a user-friendly interface and providing different scenarios utilizing DALEL.

  6. Protein Signaling Networks from Single Cell Fluctuations and Information Theory Profiling

    Science.gov (United States)

    Shin, Young Shik; Remacle, F.; Fan, Rong; Hwang, Kiwook; Wei, Wei; Ahmad, Habib; Levine, R.D.; Heath, James R.

    2011-01-01

    Protein signaling networks among cells play critical roles in a host of pathophysiological processes, from inflammation to tumorigenesis. We report on an approach that integrates microfluidic cell handling, in situ protein secretion profiling, and information theory to determine an extracellular protein-signaling network and the role of perturbations. We assayed 12 proteins secreted from human macrophages that were subjected to lipopolysaccharide challenge, which emulates the macrophage-based innate immune responses against Gram-negative bacteria. We characterize the fluctuations in protein secretion of single cells, and of small cell colonies (n = 2, 3, ...), as a function of colony size. Measuring the fluctuations permits a validation of the conditions required for the application of a quantitative version of Le Chatelier's principle, as derived using information theory. This principle provides a quantitative prediction of the role of perturbations and allows a characterization of a protein-protein interaction network. PMID:21575571

  7. PIE the search: searching PubMed literature for protein interaction information.

    Science.gov (United States)

    Kim, Sun; Kwon, Dongseop; Shin, Soo-Yong; Wilbur, W John

    2012-02-15

    Finding protein-protein interaction (PPI) information in the literature is a challenging but important task. However, keyword search in PubMed® is often time-consuming because it requires a series of actions that refine keywords and browse search results until a goal is reached. Due to the rapid growth of biomedical literature, it has become more difficult for biologists and curators to locate PPI information quickly. Therefore, a tool for prioritizing PPI-informative articles can be a useful assistant for finding this PPI-relevant information. PIE (Protein Interaction information Extraction) the search is a web service implementing a competition-winning approach utilizing word and syntactic analyses by machine learning techniques. For easy user access, PIE the search provides a PubMed-like search environment, but the output is a list of articles prioritized by PPI confidence scores. By obtaining PPI-related articles at high rank, researchers can more easily find up-to-date PPI information that cannot be found in manually curated PPI databases. http://www.ncbi.nlm.nih.gov/IRET/PIE/.

  8. ProDis-ContSHC: Learning protein dissimilarity measures and hierarchical context coherently for protein-protein comparison in protein database retrieval

    KAUST Repository

    Wang, Jim Jing-Yan

    2012-05-08

    Background: The need to retrieve or classify protein molecules using structure or sequence-based similarity measures underlies a wide range of biomedical applications. Traditional protein search methods rely on a pairwise dissimilarity/similarity measure for comparing a pair of proteins. Pairwise measures of this kind suffer from the limitation of neglecting the distribution of other proteins and thus cannot satisfy the need for high accuracy in retrieval systems. Recent work in the machine learning community has shown that exploiting the global structure of the database and learning contextual dissimilarity/similarity measures can improve retrieval performance significantly. However, most existing contextual dissimilarity/similarity learning algorithms work in an unsupervised manner, which does not utilize the information of the known class labels of proteins in the database. Results: In this paper, we propose a novel protein-protein dissimilarity learning algorithm, ProDis-ContSHC. ProDis-ContSHC regularizes an existing dissimilarity measure dij by considering the contextual information of the proteins. The context of a protein is defined by its neighboring proteins. The basic idea is that, for a pair of proteins (i, j), if their contexts N(i) and N(j) are similar to each other, the two proteins should also have a high similarity. We implement this idea by regularizing dij by a factor learned from the contexts N(i) and N(j). Moreover, we divide the context into hierarchical sub-contexts and obtain the contextual dissimilarity vector for each protein pair. Using the class label information of the proteins, we select the relevant (a pair of proteins that has the same class labels) and irrelevant (with different labels) protein pairs, and train an SVM model to distinguish between their contextual dissimilarity vectors. The SVM model is further used to learn a supervised regularizing factor. Finally, with the new Supervised learned Dissimilarity measure, we update

  9. CISAPS: Complex Informational Spectrum for the Analysis of Protein Sequences

    Directory of Open Access Journals (Sweden)

    Charalambos Chrysostomou

    2015-01-01

    Full Text Available Complex informational spectrum analysis for protein sequences (CISAPS) and its web-based server are developed and presented. As recent studies show, using only the absolute spectrum in the analysis of protein sequences with informational spectrum analysis has proven insufficient. Therefore, CISAPS is developed to consider and provide results in three forms: the absolute, real, and imaginary spectra. Biologically related features, as shown in the analysis of influenza A subtypes presented as a case study here, can also appear individually in either the real or the imaginary spectrum. As the results show, protein classes can present similarities or differences according to the features extracted from the CISAPS web server. These associations are likely to be related to the protein feature that the specific amino acid index represents. In addition, various technical issues that may affect the analysis, such as zero-padding and windowing, are also addressed. CISAPS performs the analysis using an expanded list of 611 unique amino acid indices, each representing a different property. This web-based server enables researchers with little knowledge of signal processing methods to apply complex informational spectrum analysis in their work.
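
    The pipeline described above (encode residues with an amino acid index, transform, then inspect the absolute, real and imaginary spectra) can be sketched as follows. This is a minimal illustration under stated assumptions, not the CISAPS implementation: it uses a plain discrete Fourier transform, optional zero-padding and no windowing, and the function names and the two-letter toy index are hypothetical placeholders rather than a published amino acid index.

```python
import cmath


def dft(signal):
    """Plain O(n^2) discrete Fourier transform of a real-valued signal."""
    n = len(signal)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(signal))
        for k in range(n)
    ]


def complex_spectrum(sequence, index, padded_length=None):
    """Return the absolute, real and imaginary spectra of an encoded sequence."""
    signal = [index[residue] for residue in sequence]
    if padded_length is not None and padded_length > len(signal):
        signal += [0.0] * (padded_length - len(signal))  # zero-padding
    coefficients = dft(signal)
    return {
        "absolute": [abs(c) for c in coefficients],
        "real": [c.real for c in coefficients],
        "imaginary": [c.imag for c in coefficients],
    }


# Placeholder values for illustration only; not a published amino acid index.
toy_index = {"A": 1.0, "G": 0.0}
spectrum = complex_spectrum("AGAG", toy_index, padded_length=8)
print(len(spectrum["absolute"]))
```

    Zero-padding changes the number of frequency bins, which is one reason sequences of different lengths are often padded to a common length before their spectra are compared.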

  10. Exist and grow under internet world in the information manage office of the unit with R and D

    International Nuclear Information System (INIS)

    Chen Suyan

    2010-01-01

    In comprehensive research institutes, there exist information centers in addition to the main libraries. These information centers are either branches of a main library or units belonging to the research divisions. Compared to the main libraries, the information centers provide scientists with more professional, well-targeted and applicable research resources. Their contributions to successful research and development activities are essential and should not be ignored. In the computer age, people rely more on the Internet to obtain information. Commercialized information service providers challenge the existence of the traditional information centers, and even libraries are at risk of becoming obsolete. This paper reviews the characteristics, current status and challenges of the information centers. We share the successful experience of the Department of Reactor Engineering Research and Design and propose development strategies for information centers under the new environment. (author)

  11. Using the clustered circular layout as an informative method for visualizing protein-protein interaction networks.

    Science.gov (United States)

    Fung, David C Y; Wilkins, Marc R; Hart, David; Hong, Seok-Hee

    2010-07-01

    The force-directed layout is commonly used in computer-generated visualizations of protein-protein interaction networks. While it is good for providing a visual outline of the protein complexes and their interactions, it has two limitations when used as a visual analysis method. The first is poor reproducibility. Repeated running of the algorithm does not necessarily generate the same layout, and therefore demands cognitive readaptation on the investigator's part. The second limitation is that it does not explicitly display complementary biological information, e.g. Gene Ontology, other than the protein names or gene symbols. Here, we present an alternative layout called the clustered circular layout. Using the human DNA replication protein-protein interaction network as a case study, we compared the two network layouts for their merits and limitations in supporting visual analysis.

  12. 76 FR 9376 - Proposed Extension of Existing Information Collection on Qualification/Certification Program and...

    Science.gov (United States)

    2011-02-17

    ... Extension of Existing Information Collection on Qualification/Certification Program and Man Hoist Operators... 77.107-1 on Qualification/Certification Program and Man Hoist Operators Physical Fitness. DATES: All... Labor or Secretary of Health and Human Services must make frequent inspections and investigations in...

  13. 77 FR 4834 - Proposed Extension of Existing Information Collection; Refuge Alternatives for Underground Coal...

    Science.gov (United States)

    2012-01-31

    ... Extension of Existing Information Collection; Refuge Alternatives for Underground Coal Mines AGENCY: Mine... Underground Coal Mines DATES: Submit comments on or before April 2, 2012. ADDRESSES: Comments must be.... Title: Refuge Alternatives for Underground Coal Mines. OMB Number: 1219-0146. Affected Public: Business...

  14. The Universal Protein Resource (UniProt): an expanding universe of protein information.

    Science.gov (United States)

    Wu, Cathy H; Apweiler, Rolf; Bairoch, Amos; Natale, Darren A; Barker, Winona C; Boeckmann, Brigitte; Ferro, Serenella; Gasteiger, Elisabeth; Huang, Hongzhan; Lopez, Rodrigo; Magrane, Michele; Martin, Maria J; Mazumder, Raja; O'Donovan, Claire; Redaschi, Nicole; Suzek, Baris

    2006-01-01

    The Universal Protein Resource (UniProt) provides a central resource on protein sequences and functional annotation with three database components, each addressing a key need in protein bioinformatics. The UniProt Knowledgebase (UniProtKB), comprising the manually annotated UniProtKB/Swiss-Prot section and the automatically annotated UniProtKB/TrEMBL section, is the preeminent storehouse of protein annotation. The extensive cross-references, functional and feature annotations and literature-based evidence attribution enable scientists to analyse proteins and query across databases. The UniProt Reference Clusters (UniRef) speed similarity searches via sequence space compression by merging sequences that are 100% (UniRef100), 90% (UniRef90) or 50% (UniRef50) identical. Finally, the UniProt Archive (UniParc) stores all publicly available protein sequences, containing the history of sequence data with links to the source databases. UniProt databases continue to grow in size and in availability of information. Recent and upcoming changes to database contents, formats, controlled vocabularies and services are described. New download availability includes all major releases of UniProtKB, sequence collections by taxonomic division and complete proteomes. A bibliography mapping service has been added, and an ID mapping service will be available soon. UniProt databases can be accessed online at http://www.uniprot.org or downloaded at ftp://ftp.uniprot.org/pub/databases/.

  15. Protein Function Prediction Based on Sequence and Structure Information

    KAUST Repository

    Smaili, Fatima Z.

    2016-05-25

    The number of available protein sequences in public databases is increasing exponentially. However, a significant fraction of these sequences lack functional annotation which is essential to our understanding of how biological systems and processes operate. In this master thesis project, we worked on inferring protein functions based on the primary protein sequence. In the approach we follow, 3D models are first constructed using I-TASSER. Functions are then deduced by structurally matching these predicted models, using global and local similarities, through three independent enzyme commission (EC) and gene ontology (GO) function libraries. The method was tested on 250 “hard” proteins, which lack homologous templates in both structure and function libraries. The results show that this method outperforms the conventional prediction methods based on sequence similarity or threading. Additionally, our method could be improved even further by incorporating protein-protein interaction information. Overall, the method we use provides an efficient approach for automated functional annotation of non-homologous proteins, starting from their sequence.

  16. Informal waste collection and its co-existence with the formal waste sector: The case of Kampala, Uganda

    NARCIS (Netherlands)

    Katusiimeh, M.W.; Burger, C.P.J.; Mol, A.P.J.

    2013-01-01

    We analyze how the informal collectors and the formal sector co-exist in solid waste collection in Kampala. We rely on household surveys and a small survey among the informal collectors in Kampala. Findings suggest that informal collectors play a substantial role in the first stage – collecting

  17. HitPredict version 4: comprehensive reliability scoring of physical protein-protein interactions from more than 100 species

    OpenAIRE

    López, Yosvany; Nakai, Kenta; Patil, Ashwini

    2015-01-01

    HitPredict is a consolidated resource of experimentally identified, physical protein-protein interactions with confidence scores to indicate their reliability. The study of genes and their inter-relationships using methods such as network and pathway analysis requires high quality protein-protein interaction information. Extracting reliable interactions from most of the existing databases is challenging because they either contain only a subset of the available interactions, or a mixture of p...

  18. A method for partitioning the information contained in a protein sequence between its structure and function.

    Science.gov (United States)

    Possenti, Andrea; Vendruscolo, Michele; Camilloni, Carlo; Tiana, Guido

    2018-05-23

    Proteins employ the information stored in the genetic code and translated into their sequences to carry out well-defined functions in the cellular environment. The possibility to encode for such functions is controlled by the balance between the amount of information supplied by the sequence and that left after the protein has folded into its structure. We study the amount of information necessary to specify the protein structure, providing an estimate that takes into account the thermodynamic properties of protein folding. We thus show that the information remaining in the protein sequence after encoding for its structure (the 'information gap') is very close to what is needed to encode for its function and interactions. Then, by predicting the information gap directly from the protein sequence, we show that it may be possible to use these insights from information theory to discriminate between ordered and disordered proteins, to identify unknown functions, and to optimize artificially-designed protein sequences. This article is protected by copyright. All rights reserved. © 2018 Wiley Periodicals, Inc.

  19. Military Leadership in the Context of Challenges and Threats Existing in Information Environment

    Directory of Open Access Journals (Sweden)

    Tomasz Kacała

    2015-06-01

    The aim of the paper is to present the role of a military leader in engaging the challenges and threats existing in the Information Environment (IE). Military leadership is crucial for the functioning of a particular form of hierarchical institution, namely the armed forces, in their external surrounding called the Operational Environment (OE). A specific type of OE is the Information Environment (IE), characterized by three dimensions: physical, informational and cognitive. Moreover, its characteristics include the occurrence of a number of challenges and threats. The most important challenges include: overabundance of information, unstructured information, problematic value of information and low information-related competences of its users. In turn, the most important of the threats identified in the IE are disinformation and propaganda. The role of an effective leader is to prevent, and if that is impossible, to alleviate the consequences of the challenges and threats that may disrupt or even prevent the achievement of the objectives set by an organisation.

  20. Laboratory information management system for membrane protein structure initiative--from gene to crystal.

    Science.gov (United States)

    Troshin, Petr V; Morris, Chris; Prince, Stephen M; Papiz, Miroslav Z

    2008-12-01

    Membrane Protein Structure Initiative (MPSI) exploits laboratory competencies to work collaboratively and distribute work among the different sites. This is possible as protein structure determination requires a series of steps, starting with target selection, through cloning, expression, purification, crystallization and finally structure determination. Distributed sites create a unique set of challenges for integrating and passing on information on the progress of targets. This role is played by the Protein Information Management System (PIMS), which is a laboratory information management system (LIMS), serving as a hub for MPSI, allowing collaborative structural proteomics to be carried out in a distributed fashion. It holds key information on the progress of cloning, expression, purification and crystallization of proteins. PIMS is employed to track the status of protein targets and to manage constructs, primers, experiments, protocols, sample locations and their detailed histories: thus playing a key role in MPSI data exchange. It also serves as the centre of a federation of interoperable information resources such as local laboratory information systems and international archival resources, like PDB or NCBI. During the challenging task of PIMS integration, within the MPSI, we discovered a number of prerequisites for successful PIMS integration. In this article we share our experiences and provide invaluable insights into the process of LIMS adaptation. This information should be of interest to partners who are thinking about using LIMS as a data centre for their collaborative efforts.

  1. Quantifying information transfer by protein domains: Analysis of the Fyn SH2 domain structure

    DEFF Research Database (Denmark)

    Lenaerts, Tom; Ferkinghoff-Borg, Jesper; Stricher, Francois

    2008-01-01

    Background: Efficient communication between distant sites within a protein is essential for cooperative biological response. Although often associated with large allosteric movements, more subtle changes in protein dynamics can also induce long-range correlations. However, an appropriate formalism... instance of communication over a noisy channel. In particular, we analyze the conformational correlations between protein residues and apply the concept of mutual information to quantify information exchange. Mapping out changes of mutual information on the protein structure then allows visualizing how... distal communication is achieved. We illustrate the approach by analyzing information transfer by the SH2 domain of Fyn tyrosine kinase, obtained from Monte Carlo dynamics simulations. Our analysis reveals that the Fyn SH2 domain forms a noisy communication channel that couples residues located...

  2. 75 FR 79031 - Proposed Extension of Existing Information, Collection; Representative of Miners; Legal Identity...

    Science.gov (United States)

    2010-12-17

    ... Extension of Existing Information, Collection; Representative of Miners; Legal Identity Report; Opening and....2, 40.3, 40.4, and 40.5, Representative of Miners; 30 CFR 41.20, Legal Identity Report; 30 CFR 56... designation. Legal Identity Report Section 109(d) of the Mine Act requires each operator of a coal or other...

  3. ASSESSING AND COMBINING RELIABILITY OF PROTEIN INTERACTION SOURCES

    Science.gov (United States)

    LEACH, SONIA; GABOW, AARON; HUNTER, LAWRENCE; GOLDBERG, DEBRA S.

    2008-01-01

    Integrating diverse sources of interaction information to create protein networks requires strategies sensitive to differences in accuracy and coverage of each source. Previous integration approaches calculate reliabilities of protein interaction information sources based on congruity to a designated ‘gold standard.’ In this paper, we provide a comparison of the two most popular existing approaches and propose a novel alternative for assessing reliabilities which does not require a gold standard. We identify a new method for combining the resultant reliabilities and compare it against an existing method. Further, we propose an extrinsic approach to evaluation of reliability estimates, considering their influence on the downstream tasks of inferring protein function and learning regulatory networks from expression data. Results using this evaluation method show 1) our method for reliability estimation is an attractive alternative to those requiring a gold standard and 2) the new method for combining reliabilities is less sensitive to noise in reliability assignments than the similar existing technique. PMID:17990508

  4. A domain-based approach to predict protein-protein interactions

    Directory of Open Access Journals (Sweden)

    Resat Haluk

    2007-06-01

    Background: Knowing which proteins exist in a certain organism or cell type and how these proteins interact with each other is necessary for the understanding of biological processes at the whole cell level. The determination of protein-protein interaction (PPI) networks has been the subject of extensive research. Despite the development of reasonably successful methods, serious technical difficulties still exist. In this paper we present DomainGA, a quantitative computational approach that uses information about domain-domain interactions to predict the interactions between proteins. Results: DomainGA is a multi-parameter optimization method in which the available PPI information is used to derive a quantitative scoring scheme for the domain-domain pairs. The obtained domain interaction scores are then used to predict whether a pair of proteins interacts. Using the yeast PPI data and a series of tests, we show the robustness and insensitivity of the DomainGA method to the selection of the parameter sets, score ranges, and detection rules. Our DomainGA method achieves very high explanation ratios for the positive and negative PPIs in yeast. Based on our cross-verification tests on human PPIs, comparison of the optimized scores with the structurally observed domain interactions obtained from the iPFAM database, and sensitivity and specificity analysis, we conclude that our DomainGA method shows great promise to be applicable across multiple organisms. Conclusion: We envision DomainGA as a first step of a multiple tier approach to constructing organism specific PPIs. As it is based on fundamental structural information, the DomainGA approach can be used to create potential PPIs, and the accuracy of the constructed interaction template can be further improved using complementary methods. Explanation ratios obtained in the reported test case studies clearly show that the false prediction rates of the template networks constructed

  5. GRIP: A web-based system for constructing Gold Standard datasets for protein-protein interaction prediction

    Directory of Open Access Journals (Sweden)

    Zheng Huiru

    2009-01-01

    Background: Information about protein interaction networks is fundamental to understanding protein function and cellular processes. Interaction patterns among proteins can suggest new drug targets and aid in the design of new therapeutic interventions. Efforts have been made to map interactions on a proteome-wide scale using both experimental and computational techniques. Reference datasets that contain known interacting proteins (positive cases) and non-interacting proteins (negative cases) are essential to support computational prediction and validation of protein-protein interactions. Information on known interacting and non-interacting proteins is usually stored within databases. Extraction of these data can be both complex and time consuming. Although the automatic construction of reference datasets for classification is a useful resource for researchers, no public resource currently exists to perform this task. Results: GRIP (Gold Reference dataset constructor from Information on Protein complexes) is a web-based system that provides researchers with the functionality to create reference datasets for protein-protein interaction prediction in Saccharomyces cerevisiae. Both positive and negative cases for a reference dataset can be extracted, organised and downloaded by the user. GRIP also provides an upload facility whereby users can submit proteins to determine protein complex membership. A search facility is provided where a user can search for protein complex information in Saccharomyces cerevisiae. Conclusion: GRIP is developed to retrieve information on protein complexes, cellular localisation, and physical and genetic interactions in Saccharomyces cerevisiae. Manual construction of reference datasets can be a time consuming process requiring programming knowledge. GRIP simplifies and speeds up this process by allowing users to automatically construct reference datasets. GRIP is free to access at http://rosalind.infj.ulst.ac.uk/GRIP/.

  6. Using structural knowledge in the protein data bank to inform the search for potential host-microbe protein interactions in sequence space: application to Mycobacterium tuberculosis.

    Science.gov (United States)

    Mahajan, Gaurang; Mande, Shekhar C

    2017-04-04

    A comprehensive map of the human-M. tuberculosis (MTB) protein interactome would help fill the gaps in our understanding of the disease, and computational prediction can aid and complement experimental studies towards this end. Several sequence-based in silico approaches tap the existing data on experimentally validated protein-protein interactions (PPIs); these PPIs serve as templates from which novel interactions between pathogen and host are inferred. Such comparative approaches typically make use of local sequence alignment, which, in the absence of structural details about the interfaces mediating the template interactions, could lead to incorrect inferences, particularly when multi-domain proteins are involved. We propose leveraging the domain-domain interaction (DDI) information in PDB complexes to score and prioritize candidate PPIs between host and pathogen proteomes based on targeted sequence-level comparisons. Our method picks out a small set of human-MTB protein pairs as candidates for physical interactions, and the use of functional meta-data suggests that some of them could contribute to the in vivo molecular cross-talk between pathogen and host that regulates the course of the infection. Further, we present numerical data for Pfam domain families that highlights interaction specificity on the domain level. Not every instance of a pair of domains, for which interaction evidence has been found in a few instances (i.e. structures), is likely to functionally interact. Our sorting approach scores candidates according to how "distant" they are in sequence space from known examples of DDIs (templates). Thus, it provides a natural way to deal with the heterogeneity in domain-level interactions. Our method represents a more informed application of local alignment to the sequence-based search for potential human-microbial interactions that uses available PPI data as a prior. Our approach is somewhat limited in its sensitivity by the restricted size and

  7. Annotating Mutational Effects on Proteins and Protein Interactions: Designing Novel and Revisiting Existing Protocols.

    Science.gov (United States)

    Li, Minghui; Goncearenco, Alexander; Panchenko, Anna R

    2017-01-01

    In this review we describe a protocol to annotate the effects of missense mutations on proteins, their functions, stability, and binding. For this purpose we present a collection of the most comprehensive databases which store different types of sequencing data on missense mutations, and we discuss their relationships, possible intersections, and unique features. Next, we suggest an annotation workflow using state-of-the-art methods and highlight their usability, advantages, and limitations for different cases. Finally, we address a particularly difficult problem of deciphering the molecular mechanisms of mutations on proteins and protein complexes to understand the origins and mechanisms of diseases.

  8. Elman RNN based classification of proteins sequences on account of their mutual information.

    Science.gov (United States)

    Mishra, Pooja; Nath Pandey, Paras

    2012-10-21

    In the present work we have employed the method of estimating residue correlation within protein sequences by using the mutual information (MI) of adjacent residues, based on structural and solvent accessibility properties of amino acids. The long range correlation between nonadjacent residues is improved by constructing a mutual information vector (MIV) for a single protein sequence; in this way, each protein sequence is associated with its corresponding MIVs. These MIVs are given to an Elman RNN to obtain the classification of protein sequences. The modeling power of MIV was shown to be significantly better, giving a new approach towards alignment-free classification of protein sequences. We also conclude that MIVs based on sequence structural and solvent accessibility properties are better predictors. Copyright © 2012 Elsevier Ltd. All rights reserved.
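
    The mutual information of adjacent residues mentioned in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the authors' method: the two-class residue grouping (`BURIED_LIKE`) is a hypothetical stand-in for the structural and solvent accessibility properties they use, and the function names are invented here. The sketch estimates MI between the class at position i and the class at position i+1 from pair counts within one sequence.

```python
import math
from collections import Counter

# Hypothetical two-class grouping for illustration only; the abstract does not
# specify which property classes were used.
BURIED_LIKE = set("AVLIMFWC")


def to_classes(seq):
    """Map each residue to 'B' (buried-like) or 'E' (exposed-like)."""
    return "".join("B" if aa in BURIED_LIKE else "E" for aa in seq)


def adjacent_mutual_information(symbols):
    """Estimate MI (in bits) between neighbouring symbols of one sequence."""
    pairs = list(zip(symbols, symbols[1:]))
    if not pairs:
        return 0.0
    n = len(pairs)
    joint = Counter(pairs)
    left = Counter(a for a, _ in pairs)
    right = Counter(b for _, b in pairs)
    mi = 0.0
    for (a, b), count in joint.items():
        p_ab = count / n
        # p_ab / (p_a * p_b) simplifies to count * n / (left[a] * right[b]).
        mi += p_ab * math.log2(count * n / (left[a] * right[b]))
    return mi


if __name__ == "__main__":
    example = "MKVLAAGIEDKRSTLLVAWF"  # arbitrary toy sequence
    print(adjacent_mutual_information(to_classes(example)))
```

    A vector in the spirit of an MIV could be built by repeating the estimate for several property groupings or residue separations, though the exact construction used in the paper is not given in the abstract.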

  9. Identification of energy information needs and existing information sources for Pennsylvania

    Energy Technology Data Exchange (ETDEWEB)

    Wisch, A.; Kunzier, J.; Limaye, D.; Orlando, J.

    1976-01-01

    Through use of a comprehensive interviewing schedule designed to elicit information needs from state policymakers, this study has shown a statewide need for a workable energy information network. As a counterpoint to this needs survey, it was also demonstrated that many of the components of such an information base already are available at the state and Federal levels. In order to assure that Pennsylvania's decision makers have access to this required information in a current and useful format at a minimal cost, this study has suggested a three-pronged action program: (1) In order to construct a workable energy information network for use by the Commonwealth, a liaison should be established with the Governor's Energy Council and the various national and regional energy information sources as cited in this report. (2) An information directory on State, Federal and private sources should be maintained and distributed on a continuing basis. An assessment of each source should be included with information on ease of access and relevance of the source to Pennsylvania. (3) After an information need is unable to be met through use of (1) the state energy information network and/or (2) the state energy information directory, effort should be initiated to satisfy that specific requirement.

  10. Senescence marker protein 30 (SMP30) expression in eukaryotic cells: existence of multiple species and membrane localization.

    Directory of Open Access Journals (Sweden)

    Peethambaran Arun

    Senescence marker protein (SMP30), also known as regucalcin, is a 34 kDa cytosolic marker protein of aging which plays an important role in intracellular Ca(2+) homeostasis, ascorbic acid biosynthesis, oxidative stress, and detoxification of chemical warfare nerve agents. In our goal to investigate the activity of SMP30 for the detoxification of nerve agents, we have produced a recombinant adenovirus expressing human SMP30 as a fusion protein with a hemagglutinin tag (Ad-SMP30-HA). Ad-SMP30-HA transduced the expression of SMP30-HA and two additional forms of SMP30 with molecular sizes ∼28 kDa and 24 kDa in HEK-293A and C3A liver cells in a dose- and time-dependent manner. Intravenous administration of Ad-SMP30-HA in mice results in the expression of all three forms of SMP30 in the liver and diaphragm. LC-MS/MS results confirmed that the lower molecular weight 28 kDa and 24 kDa proteins are related to the 34 kDa SMP30. The 28 kDa and 24 kDa SMP30 forms were also detected in normal rat liver and in mice injected with Ad-SMP30-HA, suggesting that SMP30 does exist in multiple forms under physiological conditions. Time course experiments in both cell lines suggest that the 28 kDa and 24 kDa SMP30 forms are likely generated from the 34 kDa SMP30. Interestingly, the 28 kDa and 24 kDa SMP30 forms appeared initially in the cytosol and shifted to the particulate fraction. Studies using small molecule inhibitors of proteolytic pathways revealed the potential involvement of β- and γ-secretases but not calpains, lysosomal proteases, proteasome and caspases. This is the first report describing the existence of multiple forms of SMP30, their preferential distribution to membranes and their generation through proteolysis possibly mediated by secretase enzymes.

  11. Plutonium matters (Basic information about its creation, properties, uses and the quantities that exist)

    International Nuclear Information System (INIS)

    Meadley, T.

    1995-01-01

    Plutonium is almost unknown in nature and this, combined with the fact that it is "man made" in nuclear reactors, doubtless accounts for at least some of its notoriety as the world's most dangerous and poisonous substance. The purpose of this article is to demystify plutonium by providing some basic information about its creation, properties, uses and the quantities that exist. (Author)

  12. Radical SAM, A Novel Protein Superfamily Linking Unresolved Steps in Familiar Biosynthetic Pathways with Radical Mechanisms: Functional Characterization Using New Analysis and Information Visualization Methods

    Energy Technology Data Exchange (ETDEWEB)

    Sofia, Heidi J.; Chen, Guang; Hetzler, Elizabeth G.; Reyes Spindola, Jorge F.; Miller, Nancy E.

    2001-03-01

    A large protein superfamily with over 500 members has been discovered and analyzed using powerful new bioinformatics and information visualization methods. Evidence exists that these proteins generate a 5′-deoxyadenosyl radical by reductive cleavage of S-adenosylmethionine (SAM) through an unusual Fe-S center. Radical SAM superfamily proteins function in DNA precursor, vitamin, cofactor, antibiotic, and herbicide biosynthesis in a collection of basic and familiar pathways. One of the members is interferon-inducible and is considered a candidate drug target for osteoporosis. The identification of this superfamily suggests that radical-based catalysis is important in a number of previously well-studied but unresolved biochemical pathways.

  13. Deconstructing brain-derived neurotrophic factor actions in adult brain circuits to bridge an existing informational gap in neuro-cell biology

    Directory of Open Access Journals (Sweden)

    Heather Bowling

    2016-01-01

    Brain-derived neurotrophic factor (BDNF) plays an important role in neurodevelopment, synaptic plasticity, learning and memory, and in preventing neurodegeneration. Despite decades of investigations into downstream signaling cascades and changes in cellular processes, the mechanisms of how BDNF reshapes circuits in vivo remain unclear. This informational gap partly arises from the fact that the bulk of studies into the molecular actions of BDNF have been performed in dissociated neuronal cultures, while the majority of studies on synaptic plasticity, learning and memory were performed in acute brain slices or in vivo. A recent study by Bowling-Bhattacharya et al. measured the proteomic changes in acute adult hippocampal slices following treatment and reported changes in proteins of neuronal and non-neuronal origin that may in concert modulate synaptic release and secretion in the slice. In this paper, we place these findings into the context of existing literature and discuss how they impact our understanding of how BDNF can reshape the brain.

  14. pLoc-mVirus: Predict subcellular localization of multi-location virus proteins via incorporating the optimal GO information into general PseAAC.

    Science.gov (United States)

    Cheng, Xiang; Xiao, Xuan; Chou, Kuo-Chen

    2017-09-10

    Knowledge of the subcellular locations of proteins is crucially important for an in-depth understanding of their functions in a cell. With the explosive growth of protein sequences generated in the postgenomic age, computational tools are in high demand for timely annotation of their subcellular locations based on the sequence information alone. The current study is focused on virus proteins. Although considerable efforts have been made in this regard, the problem is far from being solved yet. Most existing methods can be used to deal with single-location proteins only. Actually, proteins with multi-locations may have some special biological functions. This kind of multiplex protein is particularly important for both basic research and drug design. Using the multi-label theory, we present a new predictor called "pLoc-mVirus" by extracting the optimal GO (Gene Ontology) information into the general PseAAC (Pseudo Amino Acid Composition). Rigorous cross-validation on the same stringent benchmark dataset indicated that the proposed pLoc-mVirus predictor is remarkably superior to iLoc-Virus, the state-of-the-art method in predicting virus protein subcellular localization. To maximize the convenience of most experimental scientists, a user-friendly web-server for the new predictor has been established at http://www.jci-bioinfo.cn/pLoc-mVirus/, by which users can easily get their desired results without the need to go through the complicated mathematics involved. Copyright © 2017 Elsevier B.V. All rights reserved.

  15. pLoc-mPlant: predict subcellular localization of multi-location plant proteins by incorporating the optimal GO information into general PseAAC.

    Science.gov (United States)

    Cheng, Xiang; Xiao, Xuan; Chou, Kuo-Chen

    2017-08-22

    One of the fundamental goals in cellular biochemistry is to identify the functions of proteins in the context of compartments that organize them in the cellular environment. To realize this, it is indispensable to develop an automated method for fast and accurate identification of the subcellular locations of uncharacterized proteins. The current study is focused on plant protein subcellular location prediction based on the sequence information alone. Although considerable efforts have been made in this regard, the problem is far from being solved yet. Most of the existing methods can be used to deal with single-location proteins only. Actually, proteins with multi-locations may have some special biological functions. This kind of multiplex protein is particularly important for both basic research and drug design. Using the multi-label theory, we present a new predictor called "pLoc-mPlant" by extracting the optimal GO (Gene Ontology) information into the Chou's general PseAAC (Pseudo Amino Acid Composition). Rigorous cross-validation on the same stringent benchmark dataset indicated that the proposed pLoc-mPlant predictor is remarkably superior to iLoc-Plant, the state-of-the-art method for predicting plant protein subcellular localization. To maximize the convenience of most experimental scientists, a user-friendly web-server for the new predictor has been established at , by which users can easily get their desired results without the need to go through the complicated mathematics involved.

  16. Utilization of information technology in eastern North Carolina physician practices: determining the existence of a digital divide.

    Science.gov (United States)

    Rosenthal, David A; Layman, Elizabeth J

    2008-02-13

    The United States Department of Health and Human Services (DHHS) has emphasized the importance of utilizing health information technologies, thus making the availability of electronic resources critical for physicians across the country. However, few empirical assessments exist regarding the current status of computerization and utilization of electronic resources in physician offices and physicians' perceptions of the advantages and disadvantages of computerization. Through a survey of physicians' utilization and perceptions of health information technology, this study found that a "digital divide" existed for eastern North Carolina physicians in smaller physician practices. The physicians in smaller practices were less likely to utilize or be interested in utilizing electronic health records, word processing applications, and the Internet.

  17. AFAL: a web service for profiling amino acids surrounding ligands in proteins

    Science.gov (United States)

    Arenas-Salinas, Mauricio; Ortega-Salazar, Samuel; Gonzales-Nilo, Fernando; Pohl, Ehmke; Holmes, David S.; Quatrini, Raquel

    2014-11-01

    With advancements in crystallographic technology and the increasing wealth of information populating structural databases, there is an increasing need for prediction tools based on spatial information that will support the characterization of proteins and protein-ligand interactions. Herein, a new web service is presented termed amino acid frequency around ligand (AFAL) for determining amino acids type and frequencies surrounding ligands within proteins deposited in the Protein Data Bank and for assessing the atoms and atom-ligand distances involved in each interaction (availability: http://structuralbio.utalca.cl/AFAL/index.html). AFAL allows the user to define a wide variety of filtering criteria (protein family, source organism, resolution, sequence redundancy and distance) in order to uncover trends and evolutionary differences in amino acid preferences that define interactions with particular ligands. Results obtained from AFAL provide valuable statistical information about amino acids that may be responsible for establishing particular ligand-protein interactions. The analysis will enable investigators to compare ligand-binding sites of different proteins and to uncover general as well as specific interaction patterns from existing data. Such patterns can be used subsequently to predict ligand binding in proteins that currently have no structural information and to refine the interpretation of existing protein models. The application of AFAL is illustrated by the analysis of proteins interacting with adenosine-5'-triphosphate.

  18. Text Mining for Protein Docking.

    Directory of Open Access Journals (Sweden)

    Varsha D Badal

    2015-12-01

    Full Text Available The rapidly growing amount of publicly available information from biomedical research is readily accessible on the Internet, providing a powerful resource for predictive biomolecular modeling. The accumulated data on experimentally determined structures transformed structure prediction of proteins and protein complexes. Instead of exploring the enormous search space, predictive tools can simply proceed to the solution based on similarity to the existing, previously determined structures. A similar major paradigm shift is emerging due to the rapidly expanding amount of information, other than experimentally determined structures, which still can be used as constraints in biomolecular structure prediction. Automated text mining has been widely used in recreating protein interaction networks, as well as in detecting small ligand binding sites on protein structures. Combining and expanding these two well-developed areas of research, we applied text mining to structural modeling of protein-protein complexes (protein docking). Protein docking can be significantly improved when constraints on the docking mode are available. We developed a procedure that retrieves published abstracts on a specific protein-protein interaction and extracts information relevant to docking. The procedure was assessed on protein complexes from Dockground (http://dockground.compbio.ku.edu). The results show that correct information on binding residues can be extracted for about half of the complexes. The amount of irrelevant information was reduced by conceptual analysis of a subset of the retrieved abstracts, based on the bag-of-words (features) approach. Support Vector Machine models were trained and validated on the subset. The remaining abstracts were filtered by the best-performing models, which decreased the irrelevant information for ~25% of the complexes in the dataset.
The extracted constraints were incorporated in the docking protocol and tested on the Dockground unbound

  19. Defining the limits of homology modeling in information-driven protein docking

    NARCIS (Netherlands)

    Garcia Lopes Maia Rodrigues, João; Melquiond, A S J; Karaca, E; Trellet, M; van Dijk, M; van Zundert, G C P; Schmitz, C; de Vries, S J; Bordogna, A; Bonati, L; Kastritis, P L; Bonvin, Alexandre M J J

    2013-01-01

    Information-driven docking is currently one of the most successful approaches to obtain structural models of protein interactions as demonstrated in the latest round of CAPRI. While various experimental and computational techniques can be used to retrieve information about the binding mode, the

  20. Can infrared spectroscopy provide information on protein-protein interactions?

    Science.gov (United States)

    Haris, Parvez I

    2010-08-01

    For most biophysical techniques, characterization of protein-protein interactions is challenging; this is especially true with methods that rely on a physical phenomenon that is common to both of the interacting proteins. Thus, for example, in IR spectroscopy, the carbonyl vibration (1600-1700 cm⁻¹) associated with the amide bonds from both of the interacting proteins will overlap extensively, making the interpretation of spectral changes very complicated. Isotope-edited infrared spectroscopy, where one of the interacting proteins is uniformly labelled with ¹³C or ¹³C,¹⁵N, has been introduced as a solution to this problem, enabling the study of protein-protein interactions using IR spectroscopy. The large shift of the amide I band (approx. 45 cm⁻¹ towards lower frequency) upon ¹³C labelling of one of the proteins reveals the amide I band of the unlabelled protein, enabling it to be used as a probe for monitoring conformational changes. With site-specific isotopic labelling, structural resolution at the level of individual amino acid residues can be achieved. Furthermore, the ability to record IR spectra of proteins in diverse environments means that isotope-edited IR spectroscopy can be used to structurally characterize difficult systems such as protein-protein complexes bound to membranes or large insoluble peptide/protein aggregates. In the present article, examples of application of isotope-edited IR spectroscopy for studying protein-protein interactions are provided.
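
    The direction and rough size of the isotope shift described above can be illustrated with a back-of-the-envelope estimate. The sketch below treats the amide carbonyl as an isolated diatomic harmonic oscillator, which is an assumption made here for illustration and not a model taken from the abstract; the amide I band is not a pure C=O stretch, so the estimate should only land in the same range as the reported shift of approx. 45 cm⁻¹. The function names, the rounded isotope masses, and the representative band position of 1650 cm⁻¹ are likewise illustrative choices.

```python
import math

# Approximate isotope masses in atomic mass units (rounded, illustrative values).
M_C12 = 12.000
M_C13 = 13.003
M_O16 = 15.995


def reduced_mass(m1, m2):
    """Reduced mass of a two-body oscillator."""
    return m1 * m2 / (m1 + m2)


def shifted_wavenumber(nu_light, mu_light, mu_heavy):
    """Harmonic-oscillator scaling: wavenumber is proportional to 1/sqrt(reduced mass)."""
    return nu_light * math.sqrt(mu_light / mu_heavy)


# Representative amide I position inside the 1600-1700 cm^-1 window (assumed).
nu_12 = 1650.0
nu_13 = shifted_wavenumber(
    nu_12,
    reduced_mass(M_C12, M_O16),
    reduced_mass(M_C13, M_O16),
)
shift = nu_12 - nu_13  # downshift in cm^-1 upon 13C substitution

print(f"Estimated 13C downshift: {shift:.1f} cm^-1")
```

    The heavier isotope increases the reduced mass and therefore lowers the vibrational frequency, which is why the labelled protein's band moves away from that of the unlabelled partner.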

  1. A novel Multi-Agent Ada-Boost algorithm for predicting protein structural class with the information of protein secondary structure.

    Science.gov (United States)

    Fan, Ming; Zheng, Bin; Li, Lihua

    2015-10-01

    Knowledge of the structural class of a given protein is important for understanding its folding patterns. Although much effort has been made, predicting protein structural class solely from protein sequences remains a challenging problem. Feature extraction and classification are the main problems in prediction. In this research, we extended our earlier work regarding these two aspects. For protein feature extraction, we proposed a scheme that calculates word frequency and word position from amino acid, reduced amino acid, and secondary structure sequences. For accurate classification of protein structural class, we developed a novel Multi-Agent Ada-Boost (MA-Ada) method by integrating the features of a Multi-Agent system into the Ada-Boost algorithm. Extensive experiments were carried out to test and compare the proposed method using four low-homology benchmark datasets. The results showed classification accuracies of 88.5%, 96.0%, 88.4%, and 85.5%, respectively, which are much better than those of the existing methods. The source code and dataset are available on request.
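
    The word-frequency and word-position features mentioned in this abstract might be computed along the following lines. This is a minimal sketch and not the authors' code: the five-group reduced alphabet, the word length `k`, and the definition of position as a mean relative index are all assumptions made here for illustration, and the MA-Ada classifier itself is not reproduced.

```python
from collections import defaultdict

# Hypothetical 5-group reduction of the 20 amino acids (illustrative grouping only).
REDUCED_GROUPS = {
    "A": "AVLIMC",  # aliphatic / hydrophobic
    "R": "FWYH",    # aromatic
    "P": "STNQ",    # polar
    "C": "KRDE",    # charged
    "G": "GP",      # glycine / proline
}
TO_REDUCED = {aa: group for group, members in REDUCED_GROUPS.items() for aa in members}


def reduce_sequence(seq):
    """Map an amino acid sequence onto the reduced alphabet."""
    return "".join(TO_REDUCED[aa] for aa in seq)


def word_features(seq, k=2):
    """Return {word: (frequency, mean relative position)} for all k-letter words."""
    n_words = len(seq) - k + 1
    if n_words <= 0:
        return {}
    positions = defaultdict(list)
    for i in range(n_words):
        positions[seq[i:i + k]].append(i)
    features = {}
    for word, pos in positions.items():
        frequency = len(pos) / n_words
        # Normalise the mean start index to the range [0, 1].
        mean_position = (sum(pos) / len(pos)) / max(n_words - 1, 1)
        features[word] = (frequency, mean_position)
    return features


if __name__ == "__main__":
    reduced = reduce_sequence("MKVFGSTE")
    print(reduced, word_features(reduced, k=2))
```

    The same `word_features` function can be applied to the raw sequence, the reduced sequence, or a secondary structure string, and the resulting dictionaries concatenated into one feature vector for a classifier.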

  2. MIToS.jl: mutual information tools for protein sequence analysis in the Julia language

    DEFF Research Database (Denmark)

    Zea, Diego J.; Anfossi, Diego; Nielsen, Morten

    2017-01-01

    Motivation: MIToS is an environment for mutual information analysis and a framework for managing protein multiple sequence alignments (MSAs) and protein structures (PDB) in the Julia language. It integrates sequence and structural information through SIFTS, making Pfam MSAs analysis straightforward... MIToS streamlines the implementation of any measure calculated from residue contingency tables and its optimization and testing in terms of protein contact prediction. As an example, we implemented and tested a BLOSUM62-based pseudo-count strategy in mutual information analysis. Availability... and Implementation: The software is implemented entirely in Julia and supported for Linux, OS X and Windows. It’s freely available on GitHub under MIT license: http://mitos.leloir.org.ar. Contacts: diegozea@gmail.com or cmb@leloir.org.ar. Supplementary information: Supplementary data are available at Bioinformatics...
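
    To make the idea of a measure calculated from a residue contingency table concrete, the following sketch computes mutual information between two alignment columns. It is written in plain Python and does not use or imitate the MIToS API; the function name and parameters are hypothetical. A uniform additive pseudo-count stands in for the BLOSUM62-based strategy mentioned in the abstract, which is not reproduced here.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def column_mutual_information(col_i, col_j, alphabet=AMINO_ACIDS, pseudocount=0.0):
    """Mutual information (in bits) between two MSA columns.

    col_i and col_j hold the residues of each sequence at two alignment
    positions. Pairs containing a symbol outside `alphabet` (for example a
    gap) are ignored. `pseudocount` is added uniformly to every cell of the
    contingency table.
    """
    if len(col_i) != len(col_j):
        raise ValueError("columns must come from the same alignment")
    observed = Counter(zip(col_i, col_j))
    # Residue-pair contingency table, including the pseudo-count.
    table = {
        (a, b): observed.get((a, b), 0) + pseudocount
        for a in alphabet
        for b in alphabet
    }
    total = sum(table.values())
    if total == 0:
        return 0.0
    p_i = {a: sum(table[(a, b)] for b in alphabet) / total for a in alphabet}
    p_j = {b: sum(table[(a, b)] for a in alphabet) / total for b in alphabet}
    mi = 0.0
    for (a, b), count in table.items():
        if count > 0:
            p_ab = count / total
            mi += p_ab * math.log2(p_ab / (p_i[a] * p_j[b]))
    return mi


if __name__ == "__main__":
    print(column_mutual_information("AAAACCCC", "DDDDEEEE"))
    print(column_mutual_information("AAAACCCC", "DDDDEEEE", pseudocount=0.05))
```

    Adding a pseudo-count pulls the table toward independence, so it lowers the score of columns whose apparent covariation rests on few sequences.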

  3. Experimental model considerations for the study of protein-energy malnutrition co-existing with ischemic brain injury.

    Science.gov (United States)

    Prosser-Loose, Erin J; Smith, Shari E; Paterson, Phyllis G

    2011-05-01

    Protein-energy malnutrition (PEM) affects ~16% of patients at admission for stroke. We previously modeled this in a gerbil global cerebral ischemia model and found that PEM impairs functional outcome and influences mechanisms of ischemic brain injury and recovery. Since this model is no longer reliable, we investigated the utility of the rat 2-vessel occlusion (2-VO) with hypotension model of global ischemia for further study of this clinical problem. Male Sprague-Dawley rats were exposed to either control diet (18% protein) or PEM induced by feeding a low protein diet (2% protein) for 7 d prior to either global ischemia or sham surgery. PEM did not significantly alter the hippocampal CA1 neuron death (p = 0.195 by 2-factor ANOVA) or the increase in dendritic injury caused by exposure to global ischemia. Unexpectedly, however, a strong trend was evident for PEM to decrease the consistency of hippocampal damage, as shown by an increased incidence of unilateral or no hippocampal damage (p = 0.069 by chi-square analysis). Although PEM caused significant changes to baseline arterial blood pH, pO₂, pCO₂, and fasting glucose (p0.269). Intra-ischemic tympanic temperature and blood pressure were strictly and equally controlled between ischemic groups. We conclude that co-existing PEM confounded the consistency of hippocampal injury in the 2-VO model. Although the mechanisms responsible were not identified, this model of brain ischemia should not be used for studying this co-morbidity factor. © 2011 Bentham Science Publishers Ltd.

  4. Improving rates of screening and prevention by leveraging existing information systems.

    Science.gov (United States)

    Neil, Nancy

    2003-11-01

    In 1997 Virginia Mason Health System (VMMC), a vertically integrated hospital and multispecialty group practice, had no process or system to deliver the right patient clinical data, in the right form, at the right place--when providers needed it for effective patient care. Without any new investment in technology, a work group of five individuals leveraged existing, primarily paper-based information systems to launch development and implementation of a provider prompting tool--a primary care and prevention (PCP) report--which prompted providers to complete screening, prevention, and disease management services at every patient appointment. The work group developed and pilot tested the report and created a mechanism by which the report could be delivered just in time before each patient's appointment. The report integrated information from independent appointment scheduling, laboratory results reporting, patient demographics, and billing data sources. MEASURING THE PCP REPORT'S IMPACT: The results of two separate analyses demonstrate improvement in rates of screening and prevention across VMMC soon after the PCP report became available. These results led senior leadership to make the PCP report's utilization a systemwide imperative. The PCP report is used by nearly all primary care providers as a prompt to complete screening, prevention, and disease management services at every patient appointment.

  5. Accurate protein structure modeling using sparse NMR data and homologous structure information.

    Science.gov (United States)

    Thompson, James M; Sgourakis, Nikolaos G; Liu, Gaohua; Rossi, Paolo; Tang, Yuefeng; Mills, Jeffrey L; Szyperski, Thomas; Montelione, Gaetano T; Baker, David

    2012-06-19

    While information from homologous structures plays a central role in X-ray structure determination by molecular replacement, such information is rarely used in NMR structure determination because it can be incorrect, both locally and globally, when evolutionary relationships are inferred incorrectly or there has been considerable evolutionary structural divergence. Here we describe a method that allows robust modeling of protein structures of up to 225 residues by combining ¹H^N, ¹³C, and ¹⁵N backbone and ¹³Cβ chemical shift data, distance restraints derived from homologous structures, and a physically realistic all-atom energy function. Accurate models are distinguished from inaccurate models generated using incorrect sequence alignments by requiring that (i) the all-atom energies of models generated using the restraints are lower than those of models generated in unrestrained calculations and (ii) the low-energy structures converge to within 2.0 Å backbone rmsd over 75% of the protein. Benchmark calculations on known structures and blind targets show that the method can accurately model protein structures, even with very remote homology information, to a backbone rmsd of 1.2-1.9 Å relative to the conventionally determined NMR ensembles and of 0.9-1.6 Å relative to X-ray structures for well-defined regions of the protein structures. This approach facilitates the accurate modeling of protein structures using backbone chemical shift data without the need for side-chain resonance assignments and extensive analysis of NOESY cross-peak assignments.

  6. Amino acid alphabet reduction preserves fold information contained in contact interactions in proteins.

    Science.gov (United States)

    Solis, Armando D

    2015-12-01

    To reduce complexity, understand generalized rules of protein folding, and facilitate de novo protein design, the 20-letter amino acid alphabet is commonly reduced to a smaller alphabet by clustering amino acids based on some measure of similarity. In this work, we seek the optimal alphabet that preserves as much as possible of the structural information found in long-range (contact) interactions among amino acids in natively-folded proteins. We employ the Information Maximization Device, based on information theory, to partition the amino acids into well-defined clusters. Numbering from 2 to 19 groups, these optimal clusters of amino acids, while generated automatically, embody well-known properties of amino acids such as hydrophobicity/polarity, charge, size, and aromaticity, and are demonstrated to maintain the discriminative power of long-range interactions with minimal loss of mutual information. Our measurements suggest that reduced alphabets (of fewer than 10 letters) are able to capture virtually all of the information residing in native contacts and may be sufficient for fold recognition, as demonstrated by extensive threading tests. In an expansive survey of the literature, we observe that alphabets derived from various approaches (including those derived from physicochemical intuition, local structure considerations, and sequence alignments of remote homologs) fare consistently well in preserving contact interaction information, highlighting a convergence in the various factors thought to be relevant to the folding code. Moreover, we find that alphabets commonly used in experimental protein design are nearly optimal and are largely coherent with observations that have arisen in this work. © 2015 Wiley Periodicals, Inc.
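
    The notion of reducing an alphabet "with minimal loss of mutual information" can be illustrated on a toy table. The sketch below is not the Information Maximization Device described in the abstract; it only shows how merging letters into clusters changes the mutual information between a residue label and a structural label. The counts, the two-state structural label, and the cluster assignment are invented purely for illustration.

```python
import math


def mutual_information(joint):
    """Mutual information (bits) of a joint count table {(x, y): count}."""
    total = sum(joint.values())
    count_x, count_y = {}, {}
    for (x, y), n in joint.items():
        count_x[x] = count_x.get(x, 0) + n
        count_y[y] = count_y.get(y, 0) + n
    mi = 0.0
    for (x, y), n in joint.items():
        if n > 0:
            mi += (n / total) * math.log2(n * total / (count_x[x] * count_y[y]))
    return mi


def reduce_joint(joint, clusters):
    """Merge residue letters into clusters and sum the corresponding counts."""
    reduced = {}
    for (x, y), n in joint.items():
        key = (clusters[x], y)
        reduced[key] = reduced.get(key, 0) + n
    return reduced


# Toy, made-up counts: residue type versus a hypothetical binary structural label.
toy_counts = {
    ("L", "contact"): 40, ("L", "free"): 10,
    ("I", "contact"): 38, ("I", "free"): 12,
    ("K", "contact"): 10, ("K", "free"): 40,
    ("E", "contact"): 12, ("E", "free"): 38,
}
# Hypothetical two-letter alphabet: hydrophobic (h) versus charged (c).
two_letter = {"L": "h", "I": "h", "K": "c", "E": "c"}

mi_full = mutual_information(toy_counts)
mi_reduced = mutual_information(reduce_joint(toy_counts, two_letter))
print(f"full: {mi_full:.4f} bits, reduced: {mi_reduced:.4f} bits")
```

    Merging letters can never increase the mutual information, so a good reduced alphabet is one whose clusters group residues with similar structural preferences and therefore lose very little.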

  7. POSSIBILITY OF IMPROVING EXISTING STANDARDS AND METHODOLOGIES FOR AUDITING INFORMATION SYSTEMS TO PROVIDE E-GOVERNMENT SERVICES

    Directory of Open Access Journals (Sweden)

    Евгений Геннадьевич Панкратов

    2014-03-01

    Full Text Available This article analyzes the existing methods for auditing e-government systems and examines their shortcomings. Approaches to improving existing techniques and adapting them to the specific characteristics of e-government systems are suggested. The paper describes a methodology that enables integrated assessment of information systems. This methodology uses system maturity models and can be used in the construction of e-government rankings, as well as in the audit of their implementation process. The maturity models are based on the COBIT and COSO methodologies and on models of e-government developed by the relevant committee of the UN. The methodology was tested during the audit of information systems involved in the payment of temporary disability benefits. The audit was carried out during analysis of the outcome of the pilot project for the abolition of the principle of crediting payments for disability benefits. DOI: http://dx.doi.org/10.12731/2218-7405-2014-2-5

  8. Representation of protein-sequence information by amino acid subalphabets

    DEFF Research Database (Denmark)

    Andersen, C.A.F.; Brunak, Søren

    2004-01-01

    -sequence information, using machine learning strategies, where the primary goal is the discovery of novel powerful representations for use in AI techniques. In the case of proteins and the 20 different amino acids they typically contain, it is also a secondary goal to discover how the current selection of amino acids...

  9. PROGRAM SYSTEM AND INFORMATION METADATA BANK OF TERTIARY PROTEIN STRUCTURES

    Directory of Open Access Journals (Sweden)

    T. A. Nikitin

    2013-01-01

    Full Text Available The article deals with the architecture of a metadata storage model for the results of checking three-dimensional protein structures. A conceptual database model was built. The service and procedure for updating the database, as well as data transformation algorithms for protein structures and their quality, are presented. The most important information about entries and the forms in which they are submitted for storage, access, and delivery to users is highlighted. A software suite was developed to implement the functional tasks using the Java programming language in the NetBeans v.7.0 environment and JQL to query and interact with the JavaDB database. The service was tested, and the results demonstrated the system's effectiveness in filtering protein structures.

  10. pLoc-mHum: predict subcellular localization of multi-location human proteins via general PseAAC to winnow out the crucial GO information.

    Science.gov (United States)

    Cheng, Xiang; Xiao, Xuan; Chou, Kuo-Chen

    2018-05-01

    For an in-depth understanding of the functions of proteins in a cell, knowledge of their subcellular localization is indispensable. The current study is focused on human protein subcellular location prediction based on sequence information alone. Although considerable efforts have been made in this regard, the problem is far from being solved. Most existing methods can deal with single-location proteins only. In fact, proteins with multiple locations may have some special biological functions that are particularly important for both basic research and drug design. Using the multi-label theory, we present a new predictor called 'pLoc-mHum' by extracting the crucial GO (Gene Ontology) information into the general PseAAC (Pseudo Amino Acid Composition). Rigorous cross-validations on the same stringent benchmark dataset have indicated that the proposed pLoc-mHum predictor is remarkably superior to iLoc-Hum, the state-of-the-art method for predicting human protein subcellular localization. To maximize the convenience of most experimental scientists, a user-friendly web-server for the new predictor has been established at http://www.jci-bioinfo.cn/pLoc-mHum/, by which users can easily get their desired results without the need to go through the complicated mathematics involved. Contact: xcheng@gordonlifescience.org. Supplementary information: Supplementary data are available at Bioinformatics online.

  11. 77 FR 42004 - Proposed Extension of Existing Information Collection; Main Fan Operation and Inspection in Gassy...

    Science.gov (United States)

    2012-07-17

    ... Extension of Existing Information Collection; Main Fan Operation and Inspection in Gassy Underground Metal...) conditions in underground metal and nonmetal mines are largely controlled by the main mine fans. When accumulations of explosive gases, such as methane, are not swept from the mine by the main fans, they may...

  12. 77 FR 26046 - Proposed Extension of Existing Information Collection; Ground Control for Surface Coal Mines and...

    Science.gov (United States)

    2012-05-02

    ... Extension of Existing Information Collection; Ground Control for Surface Coal Mines and Surface Work Areas of Underground Coal Mines AGENCY: Mine Safety and Health Administration, Labor. ACTION: Request for... inspections and investigations in coal or other mines shall be made each year for the purposes of, among other...

  13. Regulator of G Protein Signaling 7 (RGS7) Can Exist in a Homo-oligomeric Form That Is Regulated by Gαo and R7-binding Protein.

    Science.gov (United States)

    Tayou, Junior; Wang, Qiang; Jang, Geeng-Fu; Pronin, Alexey N; Orlandi, Cesare; Martemyanov, Kirill A; Crabb, John W; Slepak, Vladlen Z

    2016-04-22

    RGS (regulator of G protein signaling) proteins of the R7 subfamily (RGS6, -7, -9, and -11) are highly expressed in neurons where they regulate many physiological processes. R7 RGS proteins contain several distinct domains and form obligatory dimers with the atypical Gβ subunit, Gβ5. They also interact with other proteins such as R7-binding protein, R9-anchoring protein, and the orphan receptors GPR158 and GPR179. These interactions facilitate plasma membrane targeting and stability of R7 proteins and modulate their activity. Here, we investigated RGS7 complexes using in situ chemical cross-linking. We found that in mouse brain and transfected cells cross-linking causes formation of distinct RGS7 complexes. One of the products had the apparent molecular mass of ∼150 kDa on SDS-PAGE and did not contain Gβ5. Mass spectrometry analysis showed no other proteins to be present within the 150-kDa complex in the amount close to stoichiometric with RGS7. This finding suggested that RGS7 could form a homo-oligomer. Indeed, co-immunoprecipitation of differentially tagged RGS7 constructs, with or without chemical cross-linking, demonstrated RGS7 self-association. RGS7-RGS7 interaction required the DEP domain but not the RGS and DHEX domains or the Gβ5 subunit. Using transfected cells and knock-out mice, we demonstrated that R7-binding protein had a strong inhibitory effect on homo-oligomerization of RGS7. In contrast, our data indicated that GPR158 could bind to the RGS7 homo-oligomer without causing its dissociation. Co-expression of constitutively active Gαo prevented the RGS7-RGS7 interaction. These results reveal the existence of RGS protein homo-oligomers and show regulation of their assembly by R7 RGS-binding partners. © 2016 by The American Society for Biochemistry and Molecular Biology, Inc.

  14. The Mitochondrial Protein Atlas: A Database of Experimentally Verified Information on the Human Mitochondrial Proteome.

    Science.gov (United States)

    Godin, Noa; Eichler, Jerry

    2017-09-01

    Given its central role in various biological systems, as well as its involvement in numerous pathologies, the mitochondrion is one of the best-studied organelles. However, although the mitochondrial genome has been extensively investigated, protein-level information remains partial, and in many cases, hypothetical. The Mitochondrial Protein Atlas (MPA; URL: lifeserv.bgu.ac.il/wb/jeichler/MPA) is a database that provides a complete, manually curated inventory of only experimentally validated human mitochondrial proteins. The MPA presently contains 911 unique protein entries, each of which is associated with at least one experimentally validated and referenced mitochondrial localization. The MPA also contains experimentally validated and referenced information defining function, structure, involvement in pathologies, interactions with other MPA proteins, as well as the method(s) of analysis used in each instance. Connections to relevant external data sources are offered for each entry, including links to NCBI Gene, PubMed, and Protein Data Bank. The MPA offers a prototype for other information sources that allow for a distinction between what has been confirmed and what remains to be verified experimentally.

  15. Predicting protein folding rate change upon point mutation using residue-level coevolutionary information.

    Science.gov (United States)

    Mallik, Saurav; Das, Smita; Kundu, Sudip

    2016-01-01

    Change in folding kinetics of globular proteins upon point mutation is crucial to a wide spectrum of biological research, such as protein misfolding, toxicity, and aggregation. Here we seek to address whether residue-level coevolutionary information of globular proteins can be informative about folding rate changes upon point mutations. Generating residue-level coevolutionary networks of globular proteins, we analyze three parameters: relative coevolution order (rCEO), network density (ND), and characteristic path length (CPL). A point mutation is considered to be equivalent to a node deletion in this network, and the respective percentage changes in rCEO, ND, and CPL are found to be linearly correlated (0.84, 0.73, and -0.61, respectively) with experimental folding rate changes. The three parameters predict the folding rate change upon a point mutation with 0.031, 0.045, and 0.059 standard errors, respectively. © 2015 Wiley Periodicals, Inc.
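
    The node-deletion analysis described in this abstract can be sketched for two of the three parameters using their standard graph-theoretic definitions. This is a minimal illustration on a made-up five-node network, not the authors' pipeline: how the coevolutionary network is built is not shown, rCEO is omitted because the abstract does not define it, and all function names are hypothetical.

```python
from collections import deque


def network_density(adj):
    """Fraction of possible undirected edges that are present."""
    n = len(adj)
    if n < 2:
        return 0.0
    edges = sum(len(neighbours) for neighbours in adj.values()) / 2
    return 2 * edges / (n * (n - 1))


def characteristic_path_length(adj):
    """Mean shortest-path length over all connected node pairs (BFS from each node)."""
    total, pairs = 0, 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0


def delete_node(adj, node):
    """Return a copy of the network without `node` and its edges."""
    return {u: {v for v in nb if v != node} for u, nb in adj.items() if u != node}


def percent_change(before, after):
    return 100.0 * (after - before) / before


# Toy undirected network; nodes stand for residues, edges for coevolving pairs.
network = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3, 5}, 5: {4}}
mutated = delete_node(network, 5)

nd_change = percent_change(network_density(network), network_density(mutated))
cpl_change = percent_change(
    characteristic_path_length(network), characteristic_path_length(mutated)
)
print(f"ND change: {nd_change:.1f}%, CPL change: {cpl_change:.1f}%")
```

    In the study's framing, percentage changes of this kind, computed for the residue that is mutated, are the quantities compared against experimental folding rate changes.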

  16. Plant-mPLoc: a top-down strategy to augment the power for predicting plant protein subcellular localization.

    Directory of Open Access Journals (Sweden)

    Kuo-Chen Chou

    Full Text Available One of the fundamental goals in proteomics and cell biology is to identify the functions of proteins in various cellular organelles and pathways. Information on the subcellular locations of proteins can provide useful insights for revealing their functions and understanding how they interact with each other in cellular network systems. Most of the existing methods for predicting plant protein subcellular localization can only cover three or four location sites, and none of them can be used to deal with multiplex plant proteins that can simultaneously exist at, or move between, two or more different location sites. Such multiplex proteins might have special biological functions worthy of particular notice. The present study was devoted to improving the existing plant protein subcellular location predictors in these two respects. A new predictor called "Plant-mPLoc" is developed by integrating the gene ontology information, functional domain information, and sequential evolutionary information through three different modes of pseudo amino acid composition. It can be used to identify plant proteins among the following 12 location sites: (1) cell membrane, (2) cell wall, (3) chloroplast, (4) cytoplasm, (5) endoplasmic reticulum, (6) extracellular, (7) Golgi apparatus, (8) mitochondrion, (9) nucleus, (10) peroxisome, (11) plastid, and (12) vacuole. Compared with the existing methods for predicting plant protein subcellular localization, the new predictor is much more powerful and flexible. In particular, it also has the capacity to deal with multiple-location proteins, which is beyond the reach of any existing predictors specialized for identifying plant protein subcellular localization. As a user-friendly web-server, Plant-mPLoc is freely accessible at http://www.csbio.sjtu.edu.cn/bioinf/plant-multi/.
Moreover, for the convenience of the vast majority of experimental scientists, a step-by-step guide is provided on how to use the web-server to

  17. Minimal information: an urgent need to assess the functional reliability of recombinant proteins used in biological experiments

    Directory of Open Access Journals (Sweden)

    de Marco Ario

    2008-07-01

    Full Text Available Abstract Structural characterization of proteins used in biological experiments is largely neglected. In most publications, the information available is totally insufficient to judge the functionality of the proteins used and, therefore, the significance of identified protein-protein interactions (was the interaction specific or due to unspecific binding of misfolded protein regions?) or the reliability of kinetic and thermodynamic data (how much protein was in its native form?). As a consequence, not only might the results of single experiments become questionable, but the overall reliability of systems biology, built on these foundations, would be weakened. The introduction of Minimal Information concerning purified proteins, added as metadata to the main body of a manuscript, would make it straightforward to assess their functional and structural qualities and, consequently, the results obtained using these proteins. Furthermore, accepted standards for protein annotation would simplify data comparison and exchange. This article is intended as a proposal for bringing together scientists who share the opinion that the scientific community needs a platform for Minimum Information for Protein Functionality Evaluation (MIPFE).

  18. A protein-dependent side-chain rotamer library.

    KAUST Repository

    Bhuyan, M.S.; Gao, Xin

    2011-01-01

    The protein side-chain packing problem has remained one of the key open problems in bioinformatics. The three main components of protein side-chain prediction methods are a rotamer library, an energy function and a search algorithm. Rotamer libraries quantitatively summarize the existing knowledge of the experimentally determined structures. Depending on how much contextual information is encoded, there are backbone-independent rotamer libraries and backbone-dependent rotamer libraries. Backbone-independent libraries only encode sequential information, whereas backbone-dependent libraries encode both sequential and locally structural information. However, side-chain conformations are determined by spatially local information, rather than sequentially local information. Since the backbone structure is given in the side-chain prediction problem, spatially local information should ideally be encoded into the rotamer libraries. In this paper, we propose a new type of backbone-dependent rotamer library, which encodes structural information of all the spatially neighboring residues. We call these protein-dependent rotamer libraries. Given any rotamer library and a protein backbone structure, we first model the protein structure as a Markov random field. Then the marginal distributions are estimated by the inference algorithms, without doing global optimization or search. The rotamers from the given library are then re-ranked and associated with the updated probabilities. Experimental results demonstrate that the proposed protein-dependent libraries significantly outperform the widely used backbone-dependent libraries in terms of the side-chain prediction accuracy and the rotamer ranking ability. Furthermore, without global optimization/search, the side-chain prediction power of the protein-dependent library is still comparable to the global-search-based side-chain prediction methods.
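
    The step of re-ranking rotamers by marginal probabilities in a Markov random field can be sketched on a toy system. The abstract does not say which inference algorithms are used, so the example below substitutes exact brute-force enumeration, which is feasible only for a handful of residues. The energy terms, the Boltzmann weighting with unit temperature, and the function name are assumptions made for illustration.

```python
import itertools
import math


def rotamer_marginals(unary, pairwise):
    """Exact per-residue rotamer marginals of a small pairwise Markov random field.

    unary[i][r]           : energy of rotamer r at residue i (for example, the
                            negative log of its probability in the input library)
    pairwise[(i, j)][r][s]: interaction energy between rotamer r at residue i
                            and rotamer s at residue j (spatial neighbours only)
    """
    n = len(unary)
    marginals = [[0.0] * len(energies) for energies in unary]
    partition = 0.0
    for assignment in itertools.product(*(range(len(e)) for e in unary)):
        energy = sum(unary[i][assignment[i]] for i in range(n))
        energy += sum(
            table[assignment[i]][assignment[j]] for (i, j), table in pairwise.items()
        )
        weight = math.exp(-energy)
        partition += weight
        for i, r in enumerate(assignment):
            marginals[i][r] += weight
    return [[w / partition for w in row] for row in marginals]


# Two neighbouring residues with two rotamers each; the library alone rates
# both rotamers of each residue equally (equal unary energies).
unary = [[0.0, 0.0], [0.0, 0.0]]
# Rotamer 0 of residue 0 clashes with rotamer 0 of residue 1.
pairwise = {(0, 1): [[5.0, 0.0], [0.0, 0.0]]}

marginals = rotamer_marginals(unary, pairwise)
ranking = [sorted(range(len(row)), key=lambda r: -row[r]) for row in marginals]
print(marginals, ranking)
```

    Although the input library cannot distinguish the two rotamers at either residue, the marginals rank the clash-free rotamer first, which is the sense in which spatial context updates the library probabilities without a global search.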

  19. A protein-dependent side-chain rotamer library.

    KAUST Repository

    Bhuyan, M.S.

    2011-12-14

    The protein side-chain packing problem remains one of the key open problems in bioinformatics. The three main components of protein side-chain prediction methods are a rotamer library, an energy function and a search algorithm. Rotamer libraries quantitatively summarize the existing knowledge of experimentally determined structures. Depending on how much contextual information is encoded, there are backbone-independent rotamer libraries and backbone-dependent rotamer libraries. Backbone-independent libraries only encode sequential information, whereas backbone-dependent libraries encode both sequential and locally structural information. However, side-chain conformations are determined by spatially local information, rather than sequentially local information. Since the backbone structure is given in the side-chain prediction problem, spatially local information should ideally be encoded into the rotamer libraries. In this paper, we propose a new type of backbone-dependent rotamer library, which encodes structural information of all the spatially neighboring residues. We call these protein-dependent rotamer libraries. Given any rotamer library and a protein backbone structure, we first model the protein structure as a Markov random field. Then the marginal distributions are estimated by inference algorithms, without doing global optimization or search. The rotamers from the given library are then re-ranked and associated with the updated probabilities. Experimental results demonstrate that the proposed protein-dependent libraries significantly outperform the widely used backbone-dependent libraries in terms of side-chain prediction accuracy and rotamer ranking ability. Furthermore, without global optimization/search, the side-chain prediction power of the protein-dependent library is still comparable to that of global-search-based side-chain prediction methods.

  20. Enhancing the prediction of protein pairings between interacting families using orthology information

    Directory of Open Access Journals (Sweden)

    Pazos Florencio

    2008-01-01

    Full Text Available Abstract Background It has repeatedly been shown that interacting protein families tend to have similar phylogenetic trees. These similarities can be used to predict the mapping between two families of interacting proteins (i.e. which proteins from one family interact with which members of the other). The correct mapping will be that which maximizes the similarity between the trees. The two families may comprise both orthologs and paralogs, if members of the two families are present in more than one organism. This fact can be exploited to restrict the possible mappings, simply by forbidding links between proteins of different organisms. We present here an algorithm to predict the mapping between families of interacting proteins which is able to incorporate information regarding orthologues, or any other assignment of proteins to "classes" that may restrict possible mappings. Results For the first time in methods for predicting mappings, we have tested this new approach on a large number of interacting protein domains in order to statistically assess its performance. The method accurately predicts around 80% in the most favourable cases. We also analysed in detail the results of the method for a well defined case of interacting families, the sensor and kinase components of the Ntr-type two-component system, for which up to 98% of the pairings predicted by the method were correct. Conclusion Based on the well established relationship between tree similarity and interactions we developed a method for predicting the mapping between two interacting families using genomic information alone. The program is available through a web interface.
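    The idea of choosing the mapping that maximizes tree similarity, restricted to same-organism pairings, can be sketched as follows. This is an illustrative brute-force version under stated assumptions, not the published algorithm: tree similarity is approximated by the Pearson correlation between the two families' pairwise distance matrices, every candidate permutation is enumerated, and the names `best_mapping`, `org_a`, `org_b` and all distances are hypothetical.

```python
from itertools import permutations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)


def upper(dist, order):
    """Upper-triangle distances after reordering the proteins by `order`."""
    n = len(order)
    return [dist[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def best_mapping(dist_a, dist_b, org_a, org_b):
    """Find the permutation of family B that best matches family A.

    Only mappings that pair proteins from the same organism are considered.
    """
    n = len(dist_a)
    ref = upper(dist_a, list(range(n)))
    best, best_r = None, -2.0
    for perm in permutations(range(n)):
        if any(org_a[i] != org_b[perm[i]] for i in range(n)):
            continue  # link between different organisms: not allowed
        r = pearson(ref, upper(dist_b, list(perm)))
        if r > best_r:
            best, best_r = perm, r
    return best, best_r


# Hypothetical distances for two families of four proteins in organisms X and Y.
dist_a = [[0, 1, 4, 6], [1, 0, 5, 7], [4, 5, 0, 2], [6, 7, 2, 0]]
dist_b = [[0, 2, 14, 10], [2, 0, 12, 8], [14, 12, 0, 4], [10, 8, 4, 0]]
org_a = ["X", "X", "Y", "Y"]
org_b = ["X", "X", "Y", "Y"]
mapping, score = best_mapping(dist_a, dist_b, org_a, org_b)
```

    Here the organism restriction cuts the 24 possible permutations down to 4, which is the kind of search-space reduction the abstract describes; `mapping[i]` gives the index in family B paired with protein `i` of family A.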

  1. Exploration of the dynamic properties of protein complexes predicted from spatially constrained protein-protein interaction networks.

    Directory of Open Access Journals (Sweden)

    Eric A Yen

    2014-05-01

    Full Text Available Protein complexes are not static, but rather highly dynamic with subunits that undergo 1-dimensional diffusion with respect to each other. Interactions within protein complexes are modulated through regulatory inputs that alter interactions and introduce new components and deplete existing components through exchange. While it is clear that the structure and function of any given protein complex is coupled to its dynamical properties, it remains a challenge to predict the possible conformations that complexes can adopt. Protein-fragment Complementation Assays detect physical interactions between protein pairs constrained to ≤8 nm from each other in living cells. This method has been used to build networks composed of thousands of pair-wise interactions. Significantly, these networks contain a wealth of dynamic information, as the assay is fully reversible and the proteins are expressed in their natural context. In this study, we describe a method that extracts this valuable information in the form of predicted conformations, allowing the user to explore the conformational landscape, to search for structures that correlate with an activity state, and to estimate the abundance of conformations in the living cell. The generator is based on a Markov Chain Monte Carlo simulation that uses the interaction dataset as input and is constrained by the physical resolution of the assay. We applied this method to an 18-member protein complex composed of the seven core proteins of the budding yeast Arp2/3 complex and 11 associated regulators and effector proteins. We generated 20,480 output structures and identified conformational states using principal component analysis. We interrogated the conformational landscape and found evidence of symmetry breaking, a mixture of likely active and inactive conformational states and dynamic exchange of the core protein Arc15 between core and regulatory components. Our method provides a novel tool for prediction and

  2. Integrating genomic information with protein sequence and 3D atomic level structure at the RCSB protein data bank.

    Science.gov (United States)

    Prlic, Andreas; Kalro, Tara; Bhattacharya, Roshni; Christie, Cole; Burley, Stephen K; Rose, Peter W

    2016-12-15

    The Protein Data Bank (PDB) now contains more than 120,000 three-dimensional (3D) structures of biological macromolecules. To allow an interpretation of how PDB data relates to other publicly available annotations, we developed a novel data integration platform that maps 3D structural information across various datasets. This integration bridges from the human genome across protein sequence to 3D structure space. We developed novel software solutions for data management and visualization, while incorporating new libraries for web-based visualization using SVG graphics. The new views are available from http://www.rcsb.org and software is available from https://github.com/rcsb/. Contact: andreas.prlic@rcsb.org. Supplementary information: Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.

  3. Automated quantitative assessment of proteins' biological function in protein knowledge bases.

    Science.gov (United States)

    Mayr, Gabriele; Lepperdinger, Günter; Lackner, Peter

    2008-01-01

    Primary protein sequence data are archived in databases together with information regarding corresponding biological functions. In this respect, UniProt/Swiss-Prot is currently the most comprehensive collection and it is routinely cross-examined when trying to unravel the biological role of hypothetical proteins. Bioscientists frequently extract single entries and further evaluate those on a subjective basis. In lieu of a standardized procedure for scoring the existing knowledge regarding individual proteins, we here report a computer-assisted method, which we applied to score the present knowledge about any given Swiss-Prot entry. Applying this quantitative score allows the comparison of proteins with respect to their sequence yet highlights the comprehension of functional data. pfs analysis may also be applied for quality control of individual entries or for database management in order to rank entry listings.

  4. Automated Quantitative Assessment of Proteins' Biological Function in Protein Knowledge Bases

    Directory of Open Access Journals (Sweden)

    Gabriele Mayr

    2008-01-01

    Full Text Available Primary protein sequence data are archived in databases together with information regarding corresponding biological functions. In this respect, UniProt/Swiss-Prot is currently the most comprehensive collection and it is routinely cross-examined when trying to unravel the biological role of hypothetical proteins. Bioscientists frequently extract single entries and further evaluate those on a subjective basis. In lieu of a standardized procedure for scoring the existing knowledge regarding individual proteins, we here report a computer-assisted method, which we applied to score the present knowledge about any given Swiss-Prot entry. Applying this quantitative score allows the comparison of proteins with respect to their sequence yet highlights the comprehension of functional data. pfs analysis may also be applied for quality control of individual entries or for database management in order to rank entry listings.

  5. The Coding of Biological Information: From Nucleotide Sequence to Protein Recognition

    Science.gov (United States)

    Štambuk, Nikola

    The paper reviews the classic results of Swanson, Dayhoff, Grantham, Blalock and Root-Bernstein, which link genetic code nucleotide patterns to the protein structure, evolution and molecular recognition. Symbolic representation of the binary addresses defining particular nucleotide and amino acid properties is discussed, with consideration of: structure and metric of the code, direct correspondence between amino acid and nucleotide information, and molecular recognition of the interacting protein motifs coded by the complementary DNA and RNA strands.

  6. The Protein Information Management System (PiMS): a generic tool for any structural biology research laboratory

    International Nuclear Information System (INIS)

    Morris, Chris; Pajon, Anne; Griffiths, Susanne L.; Daniel, Ed; Savitsky, Marc; Lin, Bill; Diprose, Jonathan M.; Wilter da Silva, Alan; Pilicheva, Katya; Troshin, Peter; Niekerk, Johannes van; Isaacs, Neil; Naismith, James; Nave, Colin; Blake, Richard; Wilson, Keith S.; Stuart, David I.; Henrick, Kim; Esnouf, Robert M.

    2011-01-01

    The Protein Information Management System (PiMS) is described together with a discussion of how its features make it well suited to laboratories of all sizes. The techniques used in protein production and structural biology have been developing rapidly, but techniques for recording the laboratory information produced have not kept pace. One approach is the development of laboratory information-management systems (LIMS), which typically use a relational database schema to model and store results from a laboratory workflow. The underlying philosophy and implementation of the Protein Information Management System (PiMS), a LIMS development specifically targeted at the flexible and unpredictable workflows of protein-production research laboratories of all scales, is described. PiMS is a web-based Java application that uses either Postgres or Oracle as the underlying relational database-management system. PiMS is available under a free licence to all academic laboratories either for local installation or for use as a managed service.

  7. The Protein Information Management System (PiMS): a generic tool for any structural biology research laboratory

    Energy Technology Data Exchange (ETDEWEB)

    Morris, Chris [STFC Daresbury Laboratory, Warrington WA4 4AD (United Kingdom); Pajon, Anne [Wellcome Trust Genome Campus, Hinxton CB10 1SD (United Kingdom); Griffiths, Susanne L. [University of York, Heslington, York YO10 5DD (United Kingdom); Daniel, Ed [STFC Daresbury Laboratory, Warrington WA4 4AD (United Kingdom); Savitsky, Marc [University of Oxford, Roosevelt Drive, Oxford OX3 7BN (United Kingdom); Lin, Bill [STFC Daresbury Laboratory, Warrington WA4 4AD (United Kingdom); Diprose, Jonathan M. [University of Oxford, Roosevelt Drive, Oxford OX3 7BN (United Kingdom); Wilter da Silva, Alan [Wellcome Trust Genome Campus, Hinxton CB10 1SD (United Kingdom); Pilicheva, Katya [University of Oxford, Roosevelt Drive, Oxford OX3 7BN (United Kingdom); Troshin, Peter [STFC Daresbury Laboratory, Warrington WA4 4AD (United Kingdom); Niekerk, Johannes van [University of Dundee, Dundee DD1 5EH, Scotland (United Kingdom); Isaacs, Neil [University of Glasgow, Glasgow G12 8QQ, Scotland (United Kingdom); Naismith, James [University of St Andrews, St Andrews, Fife KY16 9ST, Scotland (United Kingdom); Nave, Colin; Blake, Richard [STFC Daresbury Laboratory, Warrington WA4 4AD (United Kingdom); Wilson, Keith S. [University of York, Heslington, York YO10 5DD (United Kingdom); Stuart, David I. [University of Oxford, Roosevelt Drive, Oxford OX3 7BN (United Kingdom); Henrick, Kim [Wellcome Trust Genome Campus, Hinxton CB10 1SD (United Kingdom); Esnouf, Robert M., E-mail: robert@strubi.ox.ac.uk [University of Oxford, Roosevelt Drive, Oxford OX3 7BN (United Kingdom); STFC Daresbury Laboratory, Warrington WA4 4AD (United Kingdom)

    2011-04-01

    The Protein Information Management System (PiMS) is described together with a discussion of how its features make it well suited to laboratories of all sizes. The techniques used in protein production and structural biology have been developing rapidly, but techniques for recording the laboratory information produced have not kept pace. One approach is the development of laboratory information-management systems (LIMS), which typically use a relational database schema to model and store results from a laboratory workflow. The underlying philosophy and implementation of the Protein Information Management System (PiMS), a LIMS development specifically targeted at the flexible and unpredictable workflows of protein-production research laboratories of all scales, is described. PiMS is a web-based Java application that uses either Postgres or Oracle as the underlying relational database-management system. PiMS is available under a free licence to all academic laboratories either for local installation or for use as a managed service.

  8. The protein side of the central dogma: permanence and change.

    Science.gov (United States)

    Morange, Michel

    2006-01-01

    There are two facets to the central dogma proposed by Francis Crick in 1957. One concerns the relation between the sequence of nucleotides and the sequence of amino acids, the second is devoted to the relation between the sequence of amino acids and the native three-dimensional structure of proteins. 'Folding is simply a function of the order of the amino acids,' i.e. no information is required for the proper folding of a protein other than the information contained in its sequence. This protein side of the central dogma was elaborated in a scientific context in which the characteristics and functions of proteins, and the mechanisms of protein folding, were seen very differently. This context, which made the folding problem a simple one, supported the bold proposition of Francis Crick. The protein side of the central dogma was not challenged by the discovery of prions if one adopts the definition of information given by Francis Crick. It might have been challenged by the discovery that regulatory enzymes exist in different conformations, and the evidence for the existence of chaperones assisting protein folding. But it was not, and folding remains what it was for Francis Crick, 'simply a function of the order of amino acids'. But the meaning of 'function' has dramatically changed. It is no longer the result of simple physicochemical laws, but that of a long evolutionary process which has optimized protein folding. Molecular mechanistic explanations have to be allied with evolutionary explanations, in a way characteristic of present biology.

  9. A Novel Approach for Protein-Named Entity Recognition and Protein-Protein Interaction Extraction

    Directory of Open Access Journals (Sweden)

    Meijing Li

    2015-01-01

    Full Text Available Many researchers focus on developing protein-named entity recognition (Protein-NER) or PPI extraction systems. However, studies on these two topics have not been well integrated, so the Protein-NER component of existing PPI extraction systems still needs improvement. In this paper, we developed a protein-protein interaction extraction system named PPIMiner, based on a Support Vector Machine (SVM) and parsing trees. PPIMiner consists of three main models: a natural language processing (NLP) model, a Protein-NER model, and a PPI discovery model. The Protein-NER model, which is named ProNER, identifies protein names using two methods: a dictionary-based method and a machine learning-based method. ProNER is capable of identifying more proteins than the dictionary-based Protein-NER models in other existing systems. The PPIs extracted by the PPI discovery model are represented in detail, showing the protein interaction types and the occurrence frequency through two different methods. In the experiments, the results show that the performance achieved by our ProNER and PPI discovery models is better than that of other existing tools. PPIMiner applies this protein-named entity recognition approach and the parsing tree-based PPI extraction method to improve the performance of PPI extraction. We also provide an easy-to-use interface to access the PPI database and an online system for PPI extraction and Protein-NER.

  10. A Type-2 fuzzy data fusion approach for building reliable weighted protein interaction networks with application in protein complex detection.

    Science.gov (United States)

    Mehranfar, Adele; Ghadiri, Nasser; Kouhsar, Morteza; Golshani, Ashkan

    2017-09-01

    Detecting protein complexes is an important task in analyzing protein interaction networks. Although many algorithms predict protein complexes in different ways, surveys of interaction networks indicate that about 50% of detected interactions are false positives. Consequently, the accuracy of existing methods needs to be improved. In this paper we propose a novel algorithm to detect protein complexes in 'noisy' protein interaction data. First, we integrate several biological data sources to determine the reliability of each interaction and determine more accurate weights for the interactions. A data fusion component is used for this step, based on the interval type-2 fuzzy voter that provides an efficient combination of the information sources. This fusion component detects the errors and diminishes their effect on the detection of protein complexes. Thus, in the first step, reliability scores are assigned to every interaction in the network. In the second step, we propose a general protein complex detection algorithm by exploiting and adopting the strong points of other algorithms and existing hypotheses regarding real complexes. Finally, the proposed method is applied to the yeast interaction datasets for predicting the interactions. The results show that our framework performs better in terms of precision and F-measure than the existing approaches. Copyright © 2017 Elsevier Ltd. All rights reserved.

  11. From Point Clouds to Building Information Models: 3D Semi-Automatic Reconstruction of Indoors of Existing Buildings

    Directory of Open Access Journals (Sweden)

    Hélène Macher

    2017-10-01

    Full Text Available The creation of as-built Building Information Models requires the acquisition of the as-is state of existing buildings. Laser scanners are widely used to achieve this goal since they collect information about object geometry in the form of point clouds and provide a large amount of accurate data very quickly and with a high level of detail. Unfortunately, the scan-to-BIM (Building Information Model) process currently remains largely a manual process, which is time consuming and error-prone. In this paper, a semi-automatic approach is presented for the 3D reconstruction of indoors of existing buildings from point clouds. Several segmentations are performed so that point clouds corresponding to grounds, ceilings and walls are extracted. Based on these point clouds, walls and slabs of buildings are reconstructed and described in the IFC format in order to be integrated into BIM software. The approach is assessed on two datasets. The evaluation items are the degree of automation, the transferability of the approach and the geometric quality of the results of the 3D reconstruction. Additionally, quality indexes are introduced to inspect the results in order to detect potential reconstruction errors.

  12. Normal mode analysis as a method to derive protein dynamics information from the Protein Data Bank.

    Science.gov (United States)

    Wako, Hiroshi; Endo, Shigeru

    2017-12-01

    Normal mode analysis (NMA) can facilitate quick and systematic investigation of protein dynamics using data from the Protein Data Bank (PDB). We developed an elastic network model-based NMA program using dihedral angles as independent variables. Compared to the NMA programs that use Cartesian coordinates as independent variables, key attributes of the proposed program are as follows: (1) chain connectivity related to the folding pattern of a polypeptide chain is naturally embedded in the model; (2) the full-atom system is acceptable, and owing to a considerably smaller number of independent variables, the PDB data can be used without further manipulation; (3) the number of variables can be easily reduced by some of the rotatable dihedral angles; (4) the PDB data for any molecule besides proteins can be considered without coarse-graining; and (5) individual motions of constituent subunits and ligand molecules can be easily decomposed into external and internal motions to examine their mutual and intrinsic motions. Its performance is illustrated with an example of a DNA-binding allosteric protein, a catabolite activator protein. In particular, the focus is on the conformational change upon cAMP and DNA binding, and on the communication between their binding sites remotely located from each other. In this illustration, NMA creates a vivid picture of the protein dynamics at various levels of the structures, i.e., atoms, residues, secondary structures, domains, subunits, and the complete system, including DNA and cAMP. Comparative studies of the specific protein in different states, e.g., apo- and holo-conformations, and free and complexed configurations, provide useful information for studying structurally and functionally important aspects of the protein.

  13. Reversible unfolding of infectious prion assemblies reveals the existence of an oligomeric elementary brick.

    Directory of Open Access Journals (Sweden)

    Angélique Igel-Egalon

    2017-09-01

    Full Text Available Mammalian prions, the pathogens that cause transmissible spongiform encephalopathies, propagate by self-perpetuating the structural information stored in the abnormally folded, aggregated conformer (PrPSc) of the host-encoded prion protein (PrPC). To date, no structural model related to prion assembly organization satisfactorily describes how strain-specified structural information is encoded and by which mechanism this information is transferred to PrPC. To achieve progress on this issue, we correlated the PrPSc quaternary structural transition from three distinct prion strains during unfolding and refolding with their templating activity. We reveal the existence of a mesoscopic organization in PrPSc through the packing of a highly stable oligomeric elementary subunit (suPrP), in which the strain structural determinant (SSD) is encoded. Once kinetically trapped, this elementary subunit reversibly loses all replicative information. We demonstrate that acquisition of the templating interface and infectivity requires structural rearrangement of suPrP, in concert with its condensation. The existence of such an elementary brick scales down the SSD support to a small oligomer and provides a basis for reflection on the prion templating process and propagation.

  14. Prediction of glutathionylation sites in proteins using minimal sequence information and their experimental validation.

    Science.gov (United States)

    Pal, Debojyoti; Sharma, Deepak; Kumar, Mukesh; Sandur, Santosh K

    2016-09-01

    S-glutathionylation of proteins plays an important role in various biological processes and is known to be a protective modification during oxidative stress. Since experimental detection of S-glutathionylation is labor intensive and time consuming, a bioinformatics-based approach is a viable alternative. Available methods require relatively longer sequence information, which may prevent prediction if sequence information is incomplete. Here, we present a model to predict glutathionylation sites from pentapeptide sequences. It is based upon differential association of amino acids with glutathionylated and non-glutathionylated cysteines from a database of experimentally verified sequences. These data were used to calculate position-dependent F-scores, which measure how a particular amino acid at a particular position may affect the likelihood of a glutathionylation event. A glutathionylation score (G-score), indicating the propensity of a sequence to undergo glutathionylation, was calculated using position-dependent F-scores for each amino acid. Cut-off values were used for prediction. Our model returned an accuracy of 58% with a Matthews correlation coefficient (MCC) value of 0.165. On an independent dataset, our model outperformed the currently available model, in spite of needing much less sequence information. Pentapeptide motifs having high abundance among glutathionylated proteins were identified. A list of potential glutathionylation hotspot sequences was obtained by assigning G-scores, and subsequent Protein-BLAST analysis revealed a total of 254 putative glutathionable proteins, a number of which were already known to be glutathionylated. Our model predicted glutathionylation sites in 93.93% of experimentally verified glutathionylated proteins. The outcome of this study may assist in discovering novel glutathionylation sites and finding candidate proteins for glutathionylation.
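    Because this abstract reports both accuracy and the Matthews correlation coefficient, a short sketch of how the two metrics are computed from a binary confusion matrix may help in reading them side by side. The formulas are the standard definitions; the counts in the example are hypothetical and are not taken from the study.

```python
from math import sqrt


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient for a binary classifier.

    Ranges from -1 (total disagreement) through 0 (no better than chance)
    to +1 (perfect prediction). Returns 0.0 when the denominator vanishes.
    """
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


# Hypothetical counts, for illustration only.
acc = accuracy(tp=40, tn=30, fp=20, fn=10)
coef = mcc(tp=40, tn=30, fp=20, fn=10)
```

    Unlike accuracy, MCC uses all four cells of the confusion matrix, so it stays informative when the positive and negative classes are unbalanced.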

  15. ProteinTracker: an application for managing protein production and purification.

    Science.gov (United States)

    Ponko, Stefan C; Bienvenue, David

    2012-05-10

    Laboratories that produce protein reagents for research and development face the challenge of deciding whether to track batch-related data using simple file based storage mechanisms (e.g. spreadsheets and notebooks), or commit the time and effort to install, configure and maintain a more complex laboratory information management system (LIMS). Managing reagent data stored in files is challenging because files are often copied, moved, and reformatted. Furthermore, there is no simple way to query the data if/when questions arise. Commercial LIMS often include additional modules that may be paid for but not actually used, and often require software expertise to truly customize them for a given environment. This web-application allows small to medium-sized protein production groups to track data related to plasmid DNA, conditioned media samples (supes), cell lines used for expression, and purified protein information, including method of purification and quality control results. In addition, a request system was added that includes a means of prioritizing requests to help manage the high demand of protein production resources at most organizations. ProteinTracker makes extensive use of existing open-source libraries and is designed to track essential data related to the production and purification of proteins. ProteinTracker is an open-source web-based application that provides organizations with the ability to track key data involved in the production and purification of proteins and may be modified to meet the specific needs of an organization. The source code and database setup script can be downloaded from http://sourceforge.net/projects/proteintracker. This site also contains installation instructions and a user guide. A demonstration version of the application can be viewed at http://www.proteintracker.org.

  16. Predicting protein-protein interactions from multimodal biological data sources via nonnegative matrix tri-factorization.

    Science.gov (United States)

    Wang, Hua; Huang, Heng; Ding, Chris; Nie, Feiping

    2013-04-01

    Protein interactions are central to all the biological processes and structural scaffolds in living organisms, because they orchestrate a number of cellular processes such as metabolic pathways and immunological recognition. Several high-throughput methods, for example, yeast two-hybrid system and mass spectrometry method, can help determine protein interactions, which, however, suffer from high false-positive rates. Moreover, many protein interactions predicted by one method are not supported by another. Therefore, computational methods are necessary and crucial to complete the interactome expeditiously. In this work, we formulate the problem of predicting protein interactions from a new mathematical perspective--sparse matrix completion, and propose a novel nonnegative matrix factorization (NMF)-based matrix completion approach to predict new protein interactions from existing protein interaction networks. Through using manifold regularization, we further develop our method to integrate different biological data sources, such as protein sequences, gene expressions, protein structure information, etc. Extensive experimental results on four species, Saccharomyces cerevisiae, Drosophila melanogaster, Homo sapiens, and Caenorhabditis elegans, have shown that our new methods outperform related state-of-the-art protein interaction prediction methods.
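    The matrix-completion framing in this abstract can be illustrated with a much simpler stand-in. The sketch below is plain masked NMF with multiplicative updates, fitted only on observed entries of a small interaction matrix. It is not the tri-factorization with manifold regularization that the abstract describes, and the function name `masked_nmf`, the toy matrix and all parameters are assumptions made for illustration.

```python
import random


def masked_nmf(X, mask, rank, iters=200, seed=0, eps=1e-9):
    """Factor X ~ W @ H with W, H >= 0, fitting only entries where mask == 1.

    Returns the full predicted matrix plus the masked squared error
    before and after fitting.
    """
    rng = random.Random(seed)
    n, m = len(X), len(X[0])
    W = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]

    def predict():
        return [[sum(W[i][k] * H[k][j] for k in range(rank))
                 for j in range(m)] for i in range(n)]

    def loss():
        P = predict()
        return sum(mask[i][j] * (X[i][j] - P[i][j]) ** 2
                   for i in range(n) for j in range(m))

    start = loss()
    for _ in range(iters):
        P = predict()  # all W entries are updated against the same P
        for i in range(n):
            for k in range(rank):
                num = sum(mask[i][j] * X[i][j] * H[k][j] for j in range(m))
                den = sum(mask[i][j] * P[i][j] * H[k][j] for j in range(m))
                W[i][k] *= num / (den + eps)
        P = predict()  # refresh predictions before updating H
        for k in range(rank):
            for j in range(m):
                num = sum(mask[i][j] * X[i][j] * W[i][k] for i in range(n))
                den = sum(mask[i][j] * P[i][j] * W[i][k] for i in range(n))
                H[k][j] *= num / (den + eps)
    return predict(), start, loss()


# Toy network: proteins 0-2 form one group, proteins 3-4 another.
X = [[1, 1, 0, 0, 0],
     [1, 1, 1, 0, 0],
     [0, 1, 1, 0, 0],
     [0, 0, 0, 1, 1],
     [0, 0, 0, 1, 1]]
mask = [[1] * 5 for _ in range(5)]
mask[0][2] = mask[2][0] = 0  # pair (0, 2) is treated as unobserved
pred, start, end = masked_nmf(X, mask, rank=2)
```

    After fitting, the entries of `pred` at masked positions serve as scores for unobserved pairs, which could then be ranked to propose candidate interactions.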

  17. Incorporating information on predicted solvent accessibility to the co-evolution-based study of protein interactions.

    Science.gov (United States)

    Ochoa, David; García-Gutiérrez, Ponciano; Juan, David; Valencia, Alfonso; Pazos, Florencio

    2013-01-27

    A widespread family of methods for studying and predicting protein interactions using sequence information is based on co-evolution, quantified as similarity of phylogenetic trees. Part of the co-evolution observed between interacting proteins could be due to co-adaptation caused by inter-protein contacts. In this case, the co-evolution is expected to be more evident when evaluated on the surface of the proteins or the internal layers close to it. In this work we study the effect of incorporating information on predicted solvent accessibility to three methods for predicting protein interactions based on similarity of phylogenetic trees. We evaluate the performance of these methods in predicting different types of protein associations when trees based on positions with different characteristics of predicted accessibility are used as input. We found that predicted accessibility improves the results of two recent versions of the mirrortree methodology in predicting direct binary physical interactions, while it neither improves these methods, nor the original mirrortree method, in predicting other types of interactions. That improvement comes at no cost in terms of applicability since accessibility can be predicted for any sequence. We also found that predictions of protein-protein interactions are improved when multiple sequence alignments with a richer representation of sequences (including paralogs) are incorporated in the accessibility prediction.

  18. SCOWLP: a web-based database for detailed characterization and visualization of protein interfaces

    Directory of Open Access Journals (Sweden)

    Schroeder Michael

    2006-03-01

    Background Currently there is a strong need for methods that help to obtain an accurate description of protein interfaces in order to be able to understand the principles that govern molecular recognition and protein function. Many of the recent efforts to computationally identify and characterize protein networks extract protein interaction information at atomic resolution from the PDB. However, they pay little or no attention to small protein ligands and solvent. They are key components and mediators of protein interactions and fundamental for a complete description of protein interfaces. Interactome profiling requires the development of computational tools to extract and analyze protein-protein, protein-ligand and detailed solvent interaction information from the PDB in an automatic and comparative fashion. Adding this information to the existing one on protein-protein interactions will allow us to better understand protein interaction networks and protein function. Description SCOWLP (Structural Characterization Of Water, Ligands and Proteins) is a user-friendly and publicly accessible web-based relational database for detailed characterization and visualization of the PDB protein interfaces. The SCOWLP database includes proteins, peptidic-ligands and interface water molecules as descriptors of protein interfaces. It contains currently 74,907 protein interfaces and 2,093,976 residue-residue interactions formed by 60,664 structural units (protein domains and peptidic-ligands) and their interacting solvent. The SCOWLP web-server allows detailed structural analysis and comparisons of protein interfaces at atomic level by text query of PDB codes and/or by navigating a SCOP-based tree. It includes a visualization tool to interactively display the interfaces and label interacting residues and interface solvent by atomic physicochemical properties. SCOWLP is automatically updated with every SCOP release. Conclusion SCOWLP enriches

  19. Improving protein fold recognition and structural class prediction accuracies using physicochemical properties of amino acids.

    Science.gov (United States)

    Raicar, Gaurav; Saini, Harsh; Dehzangi, Abdollah; Lal, Sunil; Sharma, Alok

    2016-08-07

    Predicting the three-dimensional (3-D) structure of a protein is an important task in the field of bioinformatics and biological sciences. However, directly predicting the 3-D structure from the primary structure is hard to achieve. Therefore, predicting the fold or structural class of a protein sequence is generally used as an intermediate step in determining the protein's 3-D structure. For protein fold recognition (PFR) and structural class prediction (SCP), two steps are required - feature extraction step and classification step. Feature extraction techniques generally utilize syntactical-based information, evolutionary-based information and physicochemical-based information to extract features. In this study, we explore the importance of utilizing the physicochemical properties of amino acids for improving PFR and SCP accuracies. For this, we propose a Forward Consecutive Search (FCS) scheme which aims to strategically select physicochemical attributes that will supplement the existing feature extraction techniques for PFR and SCP. An exhaustive search is conducted on all the existing 544 physicochemical attributes using the proposed FCS scheme and a subset of physicochemical attributes is identified. Features extracted from these selected attributes are then combined with existing syntactical-based and evolutionary-based features, to show an improvement in the recognition and prediction performance on benchmark datasets.

  20. Prediction of vitamin interacting residues in a vitamin binding protein using evolutionary information

    Directory of Open Access Journals (Sweden)

    Panwar Bharat

    2013-02-01

    Background The vitamins are important cofactors in various enzymatic-reactions. In past, many inhibitors have been designed against vitamin binding pockets in order to inhibit vitamin-protein interactions. Thus, it is important to identify vitamin interacting residues in a protein. It is possible to detect vitamin-binding pockets on a protein, if its tertiary structure is known. Unfortunately tertiary structures of limited proteins are available. Therefore, it is important to develop in-silico models for predicting vitamin interacting residues in protein from its primary structure. Results In this study, first we compared protein-interacting residues of vitamins with other ligands using Two Sample Logo (TSL). It was observed that ATP, GTP, NAD, FAD and mannose preferred {G,R,K,S,H}, {G,K,T,S,D,N}, {T,G,Y}, {G,Y,W} and {Y,D,W,N,E} residues respectively, whereas vitamins preferred {Y,F,S,W,T,G,H} residues for the interaction with proteins. Furthermore, compositional information of preferred and non-preferred residues along with patterns-specificity was also observed within different vitamin-classes. Vitamins A, B and B6 preferred {F,I,W,Y,L,V}, {S,Y,G,T,H,W,N,E} and {S,T,G,H,Y,N} interacting residues respectively. It suggested that protein-binding patterns of vitamins are different from other ligands, and motivated us to develop separate predictor for vitamins and their sub-classes. The four different prediction modules, (i) vitamin interacting residues (VIRs), (ii) vitamin-A interacting residues (VAIRs), (iii) vitamin-B interacting residues (VBIRs) and (iv) pyridoxal-5-phosphate (vitamin B6) interacting residues (PLPIRs) have been developed. We applied various classifiers of SVM, BayesNet, NaiveBayes, ComplementNaiveBayes, NaiveBayesMultinomial, RandomForest and IBk etc., as machine learning techniques, using binary and Position-Specific Scoring Matrix (PSSM) features of protein sequences. Finally, we selected best performing SVM modules and

  1. Prediction of vitamin interacting residues in a vitamin binding protein using evolutionary information.

    Science.gov (United States)

    Panwar, Bharat; Gupta, Sudheer; Raghava, Gajendra P S

    2013-02-07

    The vitamins are important cofactors in various enzymatic-reactions. In past, many inhibitors have been designed against vitamin binding pockets in order to inhibit vitamin-protein interactions. Thus, it is important to identify vitamin interacting residues in a protein. It is possible to detect vitamin-binding pockets on a protein, if its tertiary structure is known. Unfortunately tertiary structures of limited proteins are available. Therefore, it is important to develop in-silico models for predicting vitamin interacting residues in protein from its primary structure. In this study, first we compared protein-interacting residues of vitamins with other ligands using Two Sample Logo (TSL). It was observed that ATP, GTP, NAD, FAD and mannose preferred {G,R,K,S,H}, {G,K,T,S,D,N}, {T,G,Y}, {G,Y,W} and {Y,D,W,N,E} residues respectively, whereas vitamins preferred {Y,F,S,W,T,G,H} residues for the interaction with proteins. Furthermore, compositional information of preferred and non-preferred residues along with patterns-specificity was also observed within different vitamin-classes. Vitamins A, B and B6 preferred {F,I,W,Y,L,V}, {S,Y,G,T,H,W,N,E} and {S,T,G,H,Y,N} interacting residues respectively. It suggested that protein-binding patterns of vitamins are different from other ligands, and motivated us to develop separate predictor for vitamins and their sub-classes. The four different prediction modules, (i) vitamin interacting residues (VIRs), (ii) vitamin-A interacting residues (VAIRs), (iii) vitamin-B interacting residues (VBIRs) and (iv) pyridoxal-5-phosphate (vitamin B6) interacting residues (PLPIRs) have been developed. We applied various classifiers of SVM, BayesNet, NaiveBayes, ComplementNaiveBayes, NaiveBayesMultinomial, RandomForest and IBk etc., as machine learning techniques, using binary and Position-Specific Scoring Matrix (PSSM) features of protein sequences. Finally, we selected best performing SVM modules and obtained highest MCC of 0.53, 0.48, 0.61, 0

  2. Functional Coverage of the Human Genome by Existing Structures, Structural Genomics Targets, and Homology Models.

    Directory of Open Access Journals (Sweden)

    2005-08-01

    The bias in protein structure and function space resulting from experimental limitations and targeting of particular functional classes of proteins by structural biologists has long been recognized, but never continuously quantified. Using the Enzyme Commission and the Gene Ontology classifications as a reference frame, and integrating structure data from the Protein Data Bank (PDB), target sequences from the structural genomics projects, structure homology derived from the SUPERFAMILY database, and genome annotations from Ensembl and NCBI, we provide a quantified view, both at the domain and whole-protein levels, of the current and projected coverage of protein structure and function space relative to the human genome. Protein structures currently provide at least one domain that covers 37% of the functional classes identified in the genome; whole structure coverage exists for 25% of the genome. If all the structural genomics targets were solved (twice the current number of structures in the PDB), it is estimated that structures of one domain would cover 69% of the functional classes identified and complete structure coverage would be 44%. Homology models from existing experimental structures extend the 37% coverage to 56% of the genome as single domains and 25% to 31% for complete structures. Coverage from homology models is not evenly distributed by protein family, reflecting differing degrees of sequence and structure divergence within families. While these data provide coverage, conversely, they also systematically highlight functional classes of proteins for which structures should be determined. Current key functional families without structure representation are highlighted here; updated information on the "most wanted list" that should be solved is available on a weekly basis from http://function.rcsb.org:8080/pdb/function_distribution/index.html.

  3. Biases in the experimental annotations of protein function and their effect on our understanding of protein function space.

    Directory of Open Access Journals (Sweden)

    Alexandra M Schnoes

    The ongoing functional annotation of proteins relies upon the work of curators to capture experimental findings from scientific literature and apply them to protein sequence and structure data. However, with the increasing use of high-throughput experimental assays, a small number of experimental studies dominate the functional protein annotations collected in databases. Here, we investigate just how prevalent is the "few articles - many proteins" phenomenon. We examine the experimentally validated annotation of proteins provided by several groups in the GO Consortium, and show that the distribution of proteins per published study is exponential, with 0.14% of articles providing the source of annotations for 25% of the proteins in the UniProt-GOA compilation. Since each of the dominant articles describes the use of an assay that can find only one function or a small group of functions, this leads to substantial biases in what we know about the function of many proteins. Mass-spectrometry, microscopy and RNAi experiments dominate high throughput experiments. Consequently, the functional information derived from these experiments is mostly of the subcellular location of proteins, and of the participation of proteins in embryonic developmental pathways. For some organisms, the information provided by different studies overlap by a large amount. We also show that the information provided by high throughput experiments is less specific than those provided by low throughput experiments. Given the experimental techniques available, certain biases in protein function annotation due to high-throughput experiments are unavoidable. Knowing that these biases exist and understanding their characteristics and extent is important for database curators, developers of function annotation programs, and anyone who uses protein function annotation data to plan experiments.

  4. The Existence of Equilibrium Asset Price Under Diverse Information

    Directory of Open Access Journals (Sweden)

    R. Agus Sartono

    2005-09-01

    Our model shows that the more diverse the information, the higher the lambda coefficient, which means the market becomes less liquid. The model is consistent with Miller (1977), who found that the bigger the gap in private information, the less liquid the market will be. If both informed traders have the same information, they will demand the same amount of the risky asset, and the result is similar to the Kyle (1985) model.

  5. Seven fundamental, unsolved questions in molecular biology. Cooperative storage and bi-directional transfer of biological information by nucleic acids and proteins: an alternative to "central dogma".

    Science.gov (United States)

    Biro, J C

    2004-01-01

    The Human Genome Mapping Project provided us with a large amount of sequence data. However, our understanding of these data did not grow proportionally, because old dogmas still set the limits of our thinking. The gene-centric, reductionist side of molecular biology is reviewed and seven problems are formulated, each indicating the insufficiency of the "central dogma". The following is concluded and suggested: 1. Genes are located and expressed on both DNA strands; 2. Introns are the source of important biological regulation and diversity; 3. Repeats are the frame of the chromatin structure and participate in the chromatin regulation; 4. The molecular accessibility of the canonical dsDNA structure is poor; 5. The genetic code is co-evolved with the amino acids and there is a stereochemical matching between the codes and amino acids; 6. The flow of information between nucleic acids and proteins is bi-directional and reverse translation might exist; 7. Complex genetic information is always carried and stored by nucleic acids and proteins together.

  6. ProteinTracker: an application for managing protein production and purification

    Directory of Open Access Journals (Sweden)

    Ponko Stefan C

    2012-05-01

    Background Laboratories that produce protein reagents for research and development face the challenge of deciding whether to track batch-related data using simple file based storage mechanisms (e.g. spreadsheets and notebooks), or commit the time and effort to install, configure and maintain a more complex laboratory information management system (LIMS). Managing reagent data stored in files is challenging because files are often copied, moved, and reformatted. Furthermore, there is no simple way to query the data if/when questions arise. Commercial LIMS often include additional modules that may be paid for but not actually used, and often require software expertise to truly customize them for a given environment. Findings This web-application allows small to medium-sized protein production groups to track data related to plasmid DNA, conditioned media samples (supes), cell lines used for expression, and purified protein information, including method of purification and quality control results. In addition, a request system was added that includes a means of prioritizing requests to help manage the high demand of protein production resources at most organizations. ProteinTracker makes extensive use of existing open-source libraries and is designed to track essential data related to the production and purification of proteins. Conclusions ProteinTracker is an open-source web-based application that provides organizations with the ability to track key data involved in the production and purification of proteins and may be modified to meet the specific needs of an organization. The source code and database setup script can be downloaded from http://sourceforge.net/projects/proteintracker. This site also contains installation instructions and a user guide. A demonstration version of the application can be viewed at http://www.proteintracker.org.

  7. ProteinTracker: an application for managing protein production and purification

    Science.gov (United States)

    2012-01-01

    Background Laboratories that produce protein reagents for research and development face the challenge of deciding whether to track batch-related data using simple file based storage mechanisms (e.g. spreadsheets and notebooks), or commit the time and effort to install, configure and maintain a more complex laboratory information management system (LIMS). Managing reagent data stored in files is challenging because files are often copied, moved, and reformatted. Furthermore, there is no simple way to query the data if/when questions arise. Commercial LIMS often include additional modules that may be paid for but not actually used, and often require software expertise to truly customize them for a given environment. Findings This web-application allows small to medium-sized protein production groups to track data related to plasmid DNA, conditioned media samples (supes), cell lines used for expression, and purified protein information, including method of purification and quality control results. In addition, a request system was added that includes a means of prioritizing requests to help manage the high demand of protein production resources at most organizations. ProteinTracker makes extensive use of existing open-source libraries and is designed to track essential data related to the production and purification of proteins. Conclusions ProteinTracker is an open-source web-based application that provides organizations with the ability to track key data involved in the production and purification of proteins and may be modified to meet the specific needs of an organization. The source code and database setup script can be downloaded from http://sourceforge.net/projects/proteintracker. This site also contains installation instructions and a user guide. A demonstration version of the application can be viewed at http://www.proteintracker.org. PMID:22574679

  8. The master two-dimensional gel database of human AMA cell proteins: towards linking protein and genome sequence and mapping information (update 1991)

    DEFF Research Database (Denmark)

    Celis, J E; Leffers, H; Rasmussen, H H

    1991-01-01

    autoantigens" and "cDNAs". For convenience we have included an alphabetical list of all known proteins recorded in this database. In the long run, the main goal of this database is to link protein and DNA sequencing and mapping information (Human Genome Program) and to provide an integrated picture......The master two-dimensional gel database of human AMA cells currently lists 3801 cellular and secreted proteins, of which 371 cellular polypeptides (306 IEF; 65 NEPHGE) were added to the master images during the last 10 months. These include: (i) very basic and acidic proteins that do not focus...

  9. The Protein Information Management System (PiMS): a generic tool for any structural biology research laboratory.

    Science.gov (United States)

    Morris, Chris; Pajon, Anne; Griffiths, Susanne L; Daniel, Ed; Savitsky, Marc; Lin, Bill; Diprose, Jonathan M; da Silva, Alan Wilter; Pilicheva, Katya; Troshin, Peter; van Niekerk, Johannes; Isaacs, Neil; Naismith, James; Nave, Colin; Blake, Richard; Wilson, Keith S; Stuart, David I; Henrick, Kim; Esnouf, Robert M

    2011-04-01

    The techniques used in protein production and structural biology have been developing rapidly, but techniques for recording the laboratory information produced have not kept pace. One approach is the development of laboratory information-management systems (LIMS), which typically use a relational database schema to model and store results from a laboratory workflow. The underlying philosophy and implementation of the Protein Information Management System (PiMS), a LIMS development specifically targeted at the flexible and unpredictable workflows of protein-production research laboratories of all scales, is described. PiMS is a web-based Java application that uses either Postgres or Oracle as the underlying relational database-management system. PiMS is available under a free licence to all academic laboratories either for local installation or for use as a managed service.

  10. Accessible surface area of proteins from purely sequence information and the importance of global features

    Science.gov (United States)

    Faraggi, Eshel; Zhou, Yaoqi; Kloczkowski, Andrzej

    2014-03-01

    We present a new approach for predicting the accessible surface area of proteins. The novelty of this approach lies in not using residue mutation profiles generated by multiple sequence alignments as descriptive inputs. Rather, sequential window information and the global monomer and dimer compositions of the chain are used. We find that much of the lost accuracy due to the elimination of evolutionary information is recouped by the use of global features. Furthermore, this new predictor produces similar results for proteins with or without sequence homologs deposited in the Protein Data Bank, and hence shows generalizability. Finally, these predictions are obtained in a small fraction (1/1000) of the time required to run mutation profile based prediction. All these factors indicate the possible usability of this work in de-novo protein structure prediction and in de-novo protein design using iterative searches. Funded in part by the financial support of the National Institutes of Health through Grants R01GM072014 and R01GM073095, and the National Science Foundation through Grant NSF MCB 1071785.
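
    The inputs named in the abstract above (a sequential window plus global monomer and dimer compositions of the chain) can be sketched as plain feature extraction. This is only an illustration of what such descriptors might look like, not the authors' feature set or predictor; the function names, the window half-width and the padding symbol `X` are assumptions.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
DIMERS = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]

def monomer_composition(seq):
    """Fraction of each of the 20 standard residues in the chain."""
    n = max(len(seq), 1)
    return [seq.count(a) / n for a in AMINO_ACIDS]

def dimer_composition(seq):
    """Fraction of each of the 400 adjacent residue pairs in the chain."""
    counts = dict.fromkeys(DIMERS, 0)
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:  # skip pairs containing non-standard residues
            counts[pair] += 1
    total = max(len(seq) - 1, 1)
    return [counts[d] / total for d in DIMERS]

def window(seq, i, half_width=2, pad="X"):
    """Residues around position i, padded at the chain termini."""
    padded = pad * half_width + seq + pad * half_width
    return padded[i:i + 2 * half_width + 1]

seq = "ACAC"
global_features = monomer_composition(seq) + dimer_composition(seq)
print(len(global_features), window(seq, 0, half_width=1))
```

    The global vector is the same for every residue of a chain, while the window changes per position; a per-residue predictor would typically take both, with the window encoded numerically.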

  11. Biases in the Experimental Annotations of Protein Function and Their Effect on Our Understanding of Protein Function Space

    Science.gov (United States)

    Schnoes, Alexandra M.; Ream, David C.; Thorman, Alexander W.; Babbitt, Patricia C.; Friedberg, Iddo

    2013-01-01

    The ongoing functional annotation of proteins relies upon the work of curators to capture experimental findings from scientific literature and apply them to protein sequence and structure data. However, with the increasing use of high-throughput experimental assays, a small number of experimental studies dominate the functional protein annotations collected in databases. Here, we investigate just how prevalent is the “few articles - many proteins” phenomenon. We examine the experimentally validated annotation of proteins provided by several groups in the GO Consortium, and show that the distribution of proteins per published study is exponential, with 0.14% of articles providing the source of annotations for 25% of the proteins in the UniProt-GOA compilation. Since each of the dominant articles describes the use of an assay that can find only one function or a small group of functions, this leads to substantial biases in what we know about the function of many proteins. Mass-spectrometry, microscopy and RNAi experiments dominate high throughput experiments. Consequently, the functional information derived from these experiments is mostly of the subcellular location of proteins, and of the participation of proteins in embryonic developmental pathways. For some organisms, the information provided by different studies overlap by a large amount. We also show that the information provided by high throughput experiments is less specific than those provided by low throughput experiments. Given the experimental techniques available, certain biases in protein function annotation due to high-throughput experiments are unavoidable. Knowing that these biases exist and understanding their characteristics and extent is important for database curators, developers of function annotation programs, and anyone who uses protein function annotation data to plan experiments. PMID:23737737

  12. Evidence for the Existence of One Antenna-Associated, Lipid-Dissolved and Two Protein-Bound Pools of Diadinoxanthin Cycle Pigments in Diatoms

    Science.gov (United States)

    Lepetit, Bernard; Volke, Daniela; Gilbert, Matthias; Wilhelm, Christian; Goss, Reimund

    2010-01-01

    We studied the localization of diadinoxanthin cycle pigments in the diatoms Cyclotella meneghiniana and Phaeodactylum tricornutum. Isolation of pigment protein complexes revealed that the majority of high-light-synthesized diadinoxanthin and diatoxanthin is associated with the fucoxanthin chlorophyll protein (FCP) complexes. The characterization of intact cells, thylakoid membranes, and pigment protein complexes by absorption and low-temperature fluorescence spectroscopy showed that the FCPs contain certain amounts of protein-bound diadinoxanthin cycle pigments, which are not significantly different in high-light and low-light cultures. The largest part of high-light-formed diadinoxanthin cycle pigments, however, is not bound to antenna apoproteins but located in a lipid shield around the FCPs, which is copurified with the complexes. This lipid shield is primarily composed of the thylakoid membrane lipid monogalactosyldiacylglycerol. We also show that the photosystem I (PSI) fraction contains a tightly connected FCP complex that is enriched in protein-bound diadinoxanthin cycle pigments. The peripheral FCP and the FCP associated with PSI are composed of different apoproteins. Tandem mass spectrometry analysis revealed that the peripheral FCP is composed mainly of the light-harvesting complex protein Lhcf and also significant amounts of Lhcr. The PSI fraction, on the other hand, shows an enrichment of Lhcr proteins, which are thus responsible for the diadinoxanthin cycle pigment binding. The existence of lipid-dissolved and protein-bound diadinoxanthin cycle pigments in the peripheral antenna and in PSI is discussed with respect to different specific functions of the xanthophylls. PMID:20935178

  13. How to prove the existence of metabolons?

    DEFF Research Database (Denmark)

    Bassard, Jean-Étienne André; Halkier, Barbara Ann

    2017-01-01

    Sequential enzymes in biosynthetic pathways are organized in metabolons. It is challenging to provide experimental evidence for the existence of metabolons as biosynthetic pathways are composed of highly dynamic protein–protein interactions. Many different methods are being applied, each with str...

  14. SAS-Based Studies of Protein Fibrillation

    DEFF Research Database (Denmark)

    Marasini, Carlotta; Vestergaard, Bente

    2017-01-01

    Protein fibrillation is associated with a number of fatal amyloid diseases (e.g. Alzheimer's and Parkinson's diseases). From a structural point of view, the aggregation process starts from an ensemble of native states that convert into transiently formed oligomers, higher order assemblies and pro...... and highlight existing reports, exemplifying the wealth of information that can be derived from the method....

  15. Worldwide Protein Data Bank validation information: usage and trends.

    Science.gov (United States)

    Smart, Oliver S; Horský, Vladimír; Gore, Swanand; Svobodová Vařeková, Radka; Bendová, Veronika; Kleywegt, Gerard J; Velankar, Sameer

    2018-03-01

    Realising the importance of assessing the quality of the biomolecular structures deposited in the Protein Data Bank (PDB), the Worldwide Protein Data Bank (wwPDB) partners established Validation Task Forces to obtain advice on the methods and standards to be used to validate structures determined by X-ray crystallography, nuclear magnetic resonance spectroscopy and three-dimensional electron cryo-microscopy. The resulting wwPDB validation pipeline is an integral part of the wwPDB OneDep deposition, biocuration and validation system. The wwPDB Validation Service webserver (https://validate.wwpdb.org) can be used to perform checks prior to deposition. Here, it is shown how validation metrics can be combined to produce an overall score that allows the ranking of macromolecular structures and domains in search results. The ValTrends DB database provides users with a convenient way to access and analyse validation information and other properties of X-ray crystal structures in the PDB, including investigating trends in and correlations between different structure properties and validation metrics.

  16. Learning to leverage existing information systems: Part 1. Principles.

    Science.gov (United States)

    Neil, Nancy; Nerenz, David

    2003-10-01

    The success of performance improvement efforts depends on effective measurement and feedback regarding clinical processes and outcomes. Yet most health care organizations have fragmented rather than integrated data systems. Methods and practical guidance are provided for leveraging available information sources to obtain and create valid performance improvement-related information for use by clinicians and administrators. At Virginia Mason Health System (VMHS; Seattle), a vertically integrated hospital and multispecialty group practice, patient records are paper based and are supplemented with electronic reporting for laboratory and radiology services. Despite growth in the resources and interest devoted to organization-wide performance measurement, quality improvement, and evidence-based tools, VMHS's information systems consist of largely stand-alone, legacy systems organized around the ability to retrieve information on patients, one at a time. By 2002, without any investment in technology, VMHS had developed standardized, clinic-wide key indicators of performance updated and reported regularly at the patient, provider, site, and organizational levels. On the basis of VMHS's experience, principles can be suggested to guide other organizations to explore solutions using their own information systems: for example, start simply, but start; identify information needs; tap multiple data streams; and improve incrementally.

  17. A study on the relevance and influence of the existing regulation and risk informed/performance based regulation

    Energy Technology Data Exchange (ETDEWEB)

    Cheong, B. J.; Kang, J. M.; Kim, H. S.; Koh, S. H.; Kang, D. H.; Park, C. H. [Cheju Univ., Jeju (Korea, Republic of)

    2003-02-15

    The goal of this study is to estimate the relevance and influence of the existing regulation and of risk-informed, performance-based regulation (RI-PBR) on the institutionalization of the regulatory system. The study reviews the current regulatory system and the status of RI-PBR implementation at the US NRC and in Korea, based on SECY papers, the Risk Informed Regulation Implementation Plan (RIRIP) of the US NRC, and other domestic studies. To investigate perceptions, knowledge levels, and the grounds for regulatory change, a survey on perceptions of risk-informed regulation (RIR) was conducted among Korean nuclear utilities, researchers, and regulators. The questionnaire comprised 50 questions covering personal details (work experience, level of education, and specific field of work); level of knowledge of RI-PBR; perceptions of the current regulation (its effectiveness, level of procedure, flexibility, dependency on the regulator, and personal view); and perceptions of RI-PBR (flexibility of regulation, introduction time, the effect of RI-PBR, safety improvement, public perception, parts of the existing regulatory system that should be changed, etc.). A total of 515 responses were received from all sectors of the nuclear field: utilities, engineering companies, research institutes, and regulatory bodies.

  18. MIPS: a database for protein sequences, homology data and yeast genome information.

    Science.gov (United States)

    Mewes, H W; Albermann, K; Heumann, K; Liebl, S; Pfeiffer, F

    1997-01-01

    The MIPS group (Martinsried Institute for Protein Sequences) at the Max-Planck-Institute for Biochemistry, Martinsried near Munich, Germany, collects, processes and distributes protein sequence data within the framework of the tripartite association of the PIR-International Protein Sequence Database. MIPS contributes nearly 50% of the data input to the PIR-International Protein Sequence Database. The database is distributed on CD-ROM together with PATCHX, an exhaustive supplement of unique, unverified protein sequences from external sources compiled by MIPS. Through its WWW server (http://www.mips.biochem.mpg.de/) MIPS permits internet access to sequence databases, homology data and to yeast genome information. (i) Sequence similarity results from the FASTA program are stored in the FASTA database for all proteins from PIR-International and PATCHX. The database is dynamically maintained and permits instant access to FASTA results. (ii) Starting with FASTA database queries, proteins have been classified into families and superfamilies (PROT-FAM). (iii) The HPT (hashed position tree) data structure developed at MIPS is a new approach for rapid sequence and pattern searching. (iv) MIPS provides access to the sequence and annotation of the complete yeast genome, the functional classification of yeast genes (FunCat) and its graphical display, the 'Genome Browser'. A CD-ROM based on the JAVA programming language providing dynamic interactive access to the yeast genome and the related protein sequences has been compiled and is available on request. PMID:9016498

  19. Diversity and evolution of ABC proteins in basidiomycetes.

    Science.gov (United States)

    Kovalchuk, Andriy; Lee, Yong-Hwan; Asiegbu, Fred O

    2013-01-01

    ABC proteins constitute one of the largest families of proteins. They are implicated in a wide variety of cellular processes ranging from ribosome biogenesis to multidrug resistance. With the advance of fungal genomics, the number of known fungal ABC proteins is increasing rapidly, but information on their biological functions remains scarce. In this work we extended the previous analysis of fungal ABC proteins to include recently sequenced species of basidiomycetes. We performed an identification and initial cataloging of ABC proteins from 23 fungal species representing 10 orders within class Agaricomycotina. We identified more than 1000 genes coding for ABC proteins. Comparison of the sets of ABC proteins present in basidiomycetes and ascomycetes revealed the existence of two groups of ABC proteins specific for basidiomycetes. The results of this survey should contribute to a better understanding of the evolution of ABC proteins in fungi and support further experimental work on their characterization.

  20. Canola/rapeseed protein-functionality and nutrition

    Directory of Open Access Journals (Sweden)

    Wanasundara Janitha P.D.

    2016-07-01

    Full Text Available Protein rich meal is a valuable co-product of canola/rapeseed oil extraction. Seed storage proteins that include cruciferin (11S) and napin (2S) dominate the protein complement of canola, while oleosins, lipid transfer proteins and other minor proteins of non-storage nature are also found. Although oil-free canola meal contains 36–40% protein on a dry weight basis, non-protein components including fibre, polymeric phenolics, phytates and sinapine, etc. of the seed coat and cellular components make the protein less suitable for food use. Separation of canola protein from non-protein components is a technical challenge but necessary to obtain the full nutritional and functional potential of the protein. Process conditions of raw material and protein preparation are critical to the nutritional and functional value of the final protein product. The storage proteins of canola can satisfy many nutritional and functional requirements for food applications. Protein macromolecules of canola also provide functionalities required in applications beyond edible uses; there exists substantial potential as a source of plant protein and a renewable biopolymer. Available information at present is mostly based on the protein products that can be obtained as mixtures of storage protein types and other chemical constituents of the seed; therefore, the full potential of canola storage proteins is yet to be revealed.

  1. Minimum Information Loss Based Multi-kernel Learning for Flagellar Protein Recognition in Trypanosoma Brucei

    KAUST Repository

    Wang, Jim Jing-Yan

    2014-12-01

    Trypanosoma brucei (T. brucei) is an important pathogenic agent of African trypanosomiasis. The flagellum is an essential and multifunctional organelle of T. brucei, so it is very important to recognize the flagellar proteins among T. brucei proteins for the purposes of both biological research and drug design. In this paper, we investigate computationally recognizing flagellar proteins in T. brucei by pattern recognition methods. It is argued that an optimal decision function can be obtained as the difference of the probability functions of the flagellar protein and the non-flagellar protein for the purpose of flagellar protein recognition. We propose to learn a multi-kernel classification function to approximate this optimal decision function by minimizing the information loss of such approximation, which is measured by the Kullback-Leibler (KL) divergence. An iterative multi-kernel classifier learning algorithm is developed to minimize the KL divergence for the problem of T. brucei flagellar protein recognition. Experiments show its advantage over other T. brucei flagellar protein recognition and multi-kernel learning methods. © 2014 IEEE.
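
    The information loss named in this abstract is a Kullback-Leibler divergence between a target distribution and its approximation. As a minimal sketch of that quantity only (not the authors' multi-kernel learning algorithm), the discrete KL divergence can be computed as below. The function name and the three example distributions are illustrative assumptions, not values from the paper.

```python
import math


def kl_divergence(p, q):
    """Discrete Kullback-Leibler divergence D(p || q), in nats.

    p and q are sequences of probabilities over the same outcomes.
    Terms with p_i == 0 contribute nothing; if q_i == 0 where p_i > 0,
    the divergence is infinite.
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi > 0.0:
            if qi <= 0.0:
                return math.inf
            total += pi * math.log(pi / qi)
    return total


# Hypothetical target distribution and two candidate approximations.
target = [0.7, 0.2, 0.1]
close_approximation = [0.6, 0.3, 0.1]
poor_approximation = [0.1, 0.2, 0.7]

loss_close = kl_divergence(target, close_approximation)
loss_poor = kl_divergence(target, poor_approximation)
```

    An approximation that stays near the target loses little information (small divergence), while one that reverses the probabilities loses much more; a learning procedure that minimizes this loss would prefer the former.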

  2. Coulomb interactions between cytoplasmic electric fields and phosphorylated messenger proteins optimize information flow in cells.

    Directory of Open Access Journals (Sweden)

    Robert A Gatenby

    2010-08-01

    Full Text Available Normal cell function requires timely and accurate transmission of information from receptors on the cell membrane (CM) to the nucleus. Movement of messenger proteins in the cytoplasm is thought to be dependent on random walk. However, Brownian motion will disperse messenger proteins throughout the cytosol resulting in slow and highly variable transit times. We propose that a critical component of information transfer is an intracellular electric field generated by distribution of charge on the nuclear membrane (NM). While the latter has been demonstrated experimentally for decades, the role of the consequent electric field has been assumed to be minimal due to a Debye length of about 1 nanometer that results from screening by intracellular Cl- and K+. We propose inclusion of these inorganic ions in the Debye-Huckel equation is incorrect because nuclear pores allow transit through the membrane at a rate far faster than the time to thermodynamic equilibrium. In our model, only the charged, mobile messenger proteins contribute to the Debye length. Using this revised model and published data, we estimate the NM possesses a Debye-Huckel length of a few microns and find this is consistent with recent measurement using intracellular nano-voltmeters. We demonstrate the field will accelerate isolated messenger proteins toward the nucleus through Coulomb interactions with negative charges added by phosphorylation. We calculate transit times as short as 0.01 sec. When large numbers of phosphorylated messenger proteins are generated by increasing concentrations of extracellular ligands, we demonstrate they generate a self-screening environment that regionally attenuates the cytoplasmic field, slowing movement but permitting greater cross talk among pathways. Preliminary experimental results with phosphorylated RAF are consistent with model predictions. This work demonstrates that previously unrecognized Coulomb interactions between phosphorylated messenger
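
    The "about 1 nanometer" figure quoted in this abstract can be checked with the textbook Debye-Huckel expression for a monovalent electrolyte, lambda = sqrt(eps0 * eps_r * kB * T / (2 * NA * e^2 * I)). The sketch below is an illustration of that standard formula, not the authors' revised model. The ionic strength of 0.15 mol/L, the relative permittivity of 78.4 and the temperature of 298.15 K are assumed example values and do not come from the paper.

```python
import math

EPS0 = 8.8541878128e-12      # vacuum permittivity, F/m
K_B = 1.380649e-23           # Boltzmann constant, J/K
E_CHARGE = 1.602176634e-19   # elementary charge, C
N_A = 6.02214076e23          # Avogadro constant, 1/mol


def debye_length_m(ionic_strength_molar, rel_permittivity=78.4,
                   temperature_k=298.15):
    """Debye screening length in metres for a monovalent electrolyte.

    ionic_strength_molar is in mol/L; the defaults describe water near
    room temperature and are assumptions chosen for illustration.
    """
    if ionic_strength_molar <= 0:
        raise ValueError("ionic strength must be positive")
    ionic_strength = ionic_strength_molar * 1000.0  # mol/L -> mol/m^3
    numerator = EPS0 * rel_permittivity * K_B * temperature_k
    denominator = 2.0 * N_A * E_CHARGE ** 2 * ionic_strength
    return math.sqrt(numerator / denominator)


# Screening by a physiological-like salt concentration (assumed 0.15 mol/L).
salt_screening_nm = debye_length_m(0.15) * 1e9

# The length scales as 1/sqrt(I): 100-fold fewer carriers, 10-fold longer.
scaling_ratio = debye_length_m(0.0015) / debye_length_m(0.15)
```

    Under these assumptions the screening length comes out just under 1 nm, and because it grows as the inverse square root of the carrier concentration, counting far fewer screening charges yields a much longer length. That scaling is the general reason a model restricted to sparse charged species can reach longer length scales; the micron-scale estimate itself is the paper's and is not reproduced here.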

  3. AT_CHLORO, a comprehensive chloroplast proteome database with subplastidial localization and curated information on envelope proteins.

    Science.gov (United States)

    Ferro, Myriam; Brugière, Sabine; Salvi, Daniel; Seigneurin-Berny, Daphné; Court, Magali; Moyet, Lucas; Ramus, Claire; Miras, Stéphane; Mellal, Mourad; Le Gall, Sophie; Kieffer-Jaquinod, Sylvie; Bruley, Christophe; Garin, Jérôme; Joyard, Jacques; Masselon, Christophe; Rolland, Norbert

    2010-06-01

    Recent advances in the proteomics field have allowed a series of high throughput experiments to be conducted on chloroplast samples, and the data are available in several public databases. However, the accurate localization of many chloroplast proteins often remains hypothetical. This is especially true for envelope proteins. We went a step further into the knowledge of the chloroplast proteome by focusing, in the same set of experiments, on the localization of proteins in the stroma, the thylakoids, and envelope membranes. LC-MS/MS-based analyses first allowed building the AT_CHLORO database (http://www.grenoble.prabi.fr/protehome/grenoble-plant-proteomics/), a comprehensive repertoire of the 1323 proteins, identified by 10,654 unique peptide sequences, present in highly purified chloroplasts and their subfractions prepared from Arabidopsis thaliana leaves. This database also provides extensive proteomics information (peptide sequences and molecular weight, chromatographic retention times, MS/MS spectra, and spectral count) for a unique chloroplast protein accurate mass and time tag database gathering identified peptides with their respective and precise analytical coordinates, molecular weight, and retention time. We assessed the partitioning of each protein in the three chloroplast compartments by using a semiquantitative proteomics approach (spectral count). These data together with an in-depth investigation of the literature were compiled to provide accurate subplastidial localization of previously known and newly identified proteins. A unique knowledge base containing extensive information on the proteins identified in envelope fractions was thus obtained, allowing new insights into this membrane system to be revealed. Altogether, the data we obtained provide unexpected information about plastidial or subplastidial localization of some proteins that were not suspected to be associated with this membrane system. The spectral counting-based strategy was further

  4. Protein Inference from the Integration of Tandem MS Data and Interactome Networks.

    Science.gov (United States)

    Zhong, Jiancheng; Wang, Jianxing; Ding, Xiaojun; Zhang, Zhen; Li, Min; Wu, Fang-Xiang; Pan, Yi

    2017-01-01

    Since proteins are digested into a mixture of peptides in the preprocessing step of tandem mass spectrometry (MS), it is difficult to determine which specific protein a shared peptide belongs to. In recent studies, besides tandem MS data and peptide identification information, some other information is exploited to infer proteins. Unlike methods that first use only tandem MS data to infer proteins and then use network information to refine them, this study proposes a protein inference method named TMSIN, which uses interactome networks directly. As two interacting proteins should co-exist, it is reasonable to assume that if one of the interacting proteins is confidently inferred in a sample, its interacting partners should have a high probability of being in the same sample, too. Therefore, we can use the neighborhood information of a protein in an interactome network to adjust the probability that the shared peptide belongs to the protein. In TMSIN, a multi-weighted graph is constructed by incorporating the bipartite graph with interactome network information, where the bipartite graph is built with the peptide identification information. Based on multi-weighted graphs, TMSIN adopts an iterative workflow to infer proteins. At each iterative step, the probability that a shared peptide belongs to a specific protein is calculated using Bayes' rule based on the neighbor protein support scores of each protein, which are mapped by the shared peptides. We carried out experiments on yeast data and human data to evaluate the performance of TMSIN in terms of ROC, q-value, and accuracy. The experimental results show that the AUC scores yielded by TMSIN are 0.742 and 0.874 on the yeast and human datasets, respectively, and that TMSIN yields the maximum number of true positives when the q-value is less than or equal to 0.05. The overlap analysis shows that TMSIN is an effective complementary approach for protein inference.
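
    The core idea in this abstract, using network neighborhood support to decide which protein a shared peptide belongs to, can be illustrated with a single Bayes' rule update. The sketch below is a simplified illustration and not the TMSIN algorithm: the function name, the protein labels and the numeric scores are hypothetical, and the neighbor support score is simply treated as an unnormalized prior.

```python
def assign_shared_peptide(likelihoods, neighbor_support):
    """Posterior probability that a shared peptide belongs to each protein.

    likelihoods: dict mapping protein -> P(peptide evidence | protein).
    neighbor_support: dict mapping protein -> non-negative support score
        derived from its interaction partners (used here as a prior weight).
    """
    unnormalized = {
        protein: likelihoods[protein] * neighbor_support[protein]
        for protein in likelihoods
    }
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("no candidate protein has non-zero support")
    return {protein: value / total for protein, value in unnormalized.items()}


# Hypothetical case: the peptide matches two proteins equally well, but
# the partners of P1 are confidently present in the sample.
posterior = assign_shared_peptide(
    likelihoods={"P1": 0.5, "P2": 0.5},
    neighbor_support={"P1": 0.9, "P2": 0.3},
)
```

    With equal peptide evidence, the protein whose interaction partners are better supported receives the larger share of the peptide. An iterative scheme would recompute the support scores from these posteriors and repeat until the assignments stabilize.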

  5. Prediction of membrane transport proteins and their substrate specificities using primary sequence information.

    Directory of Open Access Journals (Sweden)

    Nitish K Mishra

    Full Text Available Membrane transport proteins (transporters) move hydrophilic substrates across hydrophobic membranes and play vital roles in most cellular functions. Transporters represent a diverse group of proteins that differ in topology, energy coupling mechanism, and substrate specificity as well as sequence similarity. Among the functional annotations of transporters, information about their transporting substrates is especially important. The experimental identification and characterization of transporters is currently costly and time-consuming. The development of robust bioinformatics-based methods for the prediction of membrane transport proteins and their substrate specificities is therefore an important and urgent task. Support vector machine (SVM)-based computational models, which comprehensively utilize integrative protein sequence features such as amino acid composition, dipeptide composition, physico-chemical composition, biochemical composition, and position-specific scoring matrices (PSSM), were developed to predict the substrate specificity of seven transporter classes: amino acid, anion, cation, electron, protein/mRNA, sugar, and other transporters. An additional model to differentiate transporters from non-transporters was also developed. Among the developed models, the biochemical composition and PSSM hybrid model outperformed other models and achieved an overall average prediction accuracy of 76.69% with a Matthews correlation coefficient (MCC) of 0.49 and a receiver operating characteristic area under the curve (AUC) of 0.833 on our main dataset. This model also achieved an overall average prediction accuracy of 78.88% and MCC of 0.41 on an independent dataset. Our analyses suggest that evolutionary information (i.e., the PSSM) and the AAIndex are key features for the substrate specificity prediction of transport proteins. In comparison, similarity-based methods such as BLAST, PSI-BLAST, and hidden Markov models do not provide accurate predictions
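
    Two of the sequence features listed in this abstract, amino acid composition and dipeptide composition, are simple enough to sketch directly. The code below shows one common way to compute them as fixed-length vectors (20 and 400 values) that a classifier such as an SVM could consume. The function names and the example sequence are illustrative assumptions, and the paper's exact feature definitions may differ.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def amino_acid_composition(sequence):
    """Fraction of each standard amino acid in the sequence (20 values)."""
    sequence = sequence.upper()
    if not sequence:
        raise ValueError("sequence must not be empty")
    length = len(sequence)
    return [sequence.count(residue) / length for residue in AMINO_ACIDS]


def dipeptide_composition(sequence):
    """Fraction of each ordered residue pair in the sequence (400 values)."""
    sequence = sequence.upper()
    if len(sequence) < 2:
        raise ValueError("sequence must contain at least two residues")
    counts = {a + b: 0 for a, b in product(AMINO_ACIDS, repeat=2)}
    pairs = [sequence[i:i + 2] for i in range(len(sequence) - 1)]
    for pair in pairs:
        if pair in counts:  # pairs with non-standard letters are skipped
            counts[pair] += 1
    return [count / len(pairs) for count in counts.values()]


# Hypothetical ten-residue fragment used only to exercise the functions.
example = "MKTAYIAKQR"
aac = amino_acid_composition(example)
dpc = dipeptide_composition(example)
```

    Both vectors have the same length for every protein regardless of sequence length, which is what makes them convenient inputs for a kernel classifier.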

  6. Purification of the spliced leader ribonucleoprotein particle from Leptomonas collosoma revealed the existence of an Sm protein in trypanosomes. Cloning the SmE homologue.

    Science.gov (United States)

    Goncharov, I; Palfi, Z; Bindereif, A; Michaeli, S

    1999-04-30

    Trans-splicing in trypanosomes involves the addition of a common spliced leader (SL) sequence, which is derived from a small RNA, the SL RNA, to all mRNA precursors. The SL RNA is present in the cell in the form of a ribonucleoprotein, the SL RNP. Using conventional chromatography and affinity selection with 2'-O-methylated RNA oligonucleotides at high ionic strength, five proteins of 70, 16, 13, 12, and 8 kDa were co-selected with the SL RNA from Leptomonas collosoma, representing the SL RNP core particle. Under conditions of lower ionic strength, additional proteins of 28 and 20 kDa were revealed. On the basis of peptide sequences, the gene coding for a protein with a predicted molecular weight of 11.9 kDa was cloned and identified as a homologue of the cis-spliceosomal SmE. The protein carries the Sm motifs 1 and 2 characteristic of Sm antigens that bind to all known cis-spliceosomal uridylic acid-rich small nuclear RNAs (U snRNAs), suggesting the existence of Sm proteins in trypanosomes. This finding is of special interest because trypanosome snRNPs are the only snRNPs examined to date that are not recognized by anti-Sm antibodies. Because of the early divergence of trypanosomes from the eukaryotic lineage, the trypanosome SmE protein represents one of the primordial Sm proteins in nature.

  7. 78 FR 28894 - Agency Information Collection Activities: Extension, Without Change, of an Existing Information...

    Science.gov (United States)

    2013-05-16

    ... evaluate the company for inclusion in the IMAGE program. The information provided by the company plays a... entity in the private sector to participate in the program and the information obtained from the company... collection of information is necessary for the proper performance of the functions of the agency, including...

  8. A Review of Avian Monitoring and Mitigation Information at Existing Utility-Scale Solar Facilities

    Energy Technology Data Exchange (ETDEWEB)

    Walston, Leroy J. [Argonne National Lab. (ANL), Argonne, IL (United States)]; Rollins, Katherine E. [Argonne National Lab. (ANL), Argonne, IL (United States)]; Smith, Karen P. [Argonne National Lab. (ANL), Argonne, IL (United States)]; LaGory, Kirk E. [Argonne National Lab. (ANL), Argonne, IL (United States)]; Sinclair, Karin [National Renewable Energy Lab. (NREL), Golden, CO (United States)]; Turchi, Craig [National Renewable Energy Lab. (NREL), Golden, CO (United States)]; Wendelin, Tim [National Renewable Energy Lab. (NREL), Golden, CO (United States)]; Souder, Heidi [National Renewable Energy Lab. (NREL), Golden, CO (United States)]

    2015-01-01

    There are two basic types of solar energy technology: photovoltaic and concentrating solar power. As the number of utility-scale solar energy facilities using these technologies is expected to increase in the United States, so are the potential impacts on wildlife and their habitats. Recent attention is on the risk of fatality to birds. Understanding the current rates of avian mortality and existing monitoring requirements is an important first step in developing science-based mitigation and minimization protocols. The resulting information also allows a comparison of the avian mortality rates of utility-scale solar energy facilities with those from other technologies and sources, as well as the identification of data gaps and research needs. This report will present and discuss the current state of knowledge regarding avian issues at utility-scale solar energy facilities.

  9. Ensemble Linear Neighborhood Propagation for Predicting Subchloroplast Localization of Multi-Location Proteins.

    Science.gov (United States)

    Wan, Shibiao; Mak, Man-Wai; Kung, Sun-Yuan

    2016-12-02

    In the postgenomic era, the number of unreviewed protein sequences is remarkably larger and grows tremendously faster than that of reviewed ones. However, existing methods for protein subchloroplast localization often ignore the information from these unlabeled proteins. This paper proposes a multi-label predictor based on ensemble linear neighborhood propagation (LNP), namely, LNP-Chlo, which leverages hybrid sequence-based feature information from both labeled and unlabeled proteins for predicting localization of both single- and multi-label chloroplast proteins. Experimental results on a stringent benchmark dataset and a novel independent dataset suggest that LNP-Chlo performs at least 6% (absolute) better than state-of-the-art predictors. This paper also demonstrates that ensemble LNP significantly outperforms LNP based on individual features. For readers' convenience, the online Web server LNP-Chlo is freely available at http://bioinfo.eie.polyu.edu.hk/LNPChloServer/ .

  10. A study on the relevance and influence of the existing regulation and risk informed/performance based regulation

    Energy Technology Data Exchange (ETDEWEB)

    Cheong, B. J.; Koh, Y. J.; Kim, H. S.; Koh, S. H.; Kang, D. H.; Kang, T. W. [Cheju National Univ., Jeju (Korea, Republic of)]

    2004-02-15

    The goal of this study is to estimate the relevance and influence of the existing regulation and the RI-PBR to the institutionalization of the regulatory system. This study reviews the current regulatory system and the status of the RI-PBR implementation of the US NRC and Korea based upon SECY Papers, Risk Informed Regulation Implementation Plan (RIRIP) of the US NRC and other domestic studies. The recent trends of the individual technologies regarding the RI-PBR and RIA are also summarized.

  11. Time-resolved infrared studies of protein conformational dynamics

    Energy Technology Data Exchange (ETDEWEB)

    Woodruff, W.H.; Causgrove, T.P.; Dyer, R.B. [Los Alamos National Laboratory, NM (United States)]; Callender, R.H. [Univ. of New York, NY (United States)]

    1994-12-01

    We have demonstrated that TRIR in the amide I region gives structural information regarding protein conformational changes in real time, both on processes involved in the development of the functional structure (protein folding) and on protein structural changes that accompany the functional dynamics of the native structure. Assignment of many of the amide I peaks to specific amide or sidechain structures will require much additional effort. Specifically, the congestion and complexity of the protein vibrational spectra dictate that isotope studies are an absolute requirement for more than a qualitative notion of the structural interpretation of these measurements. It is clear, however, that enormous potential exists for elucidating structural relaxation dynamics and energetics with a high degree of structural specificity using this approach.

  12. Protein-protein interaction inference based on semantic similarity of Gene Ontology terms.

    Science.gov (United States)

    Zhang, Shu-Bo; Tang, Qiang-Rong

    2016-07-21

    Identifying protein-protein interactions is important in molecular biology. Experimental methods for this task have their limitations, and computational approaches have attracted more and more attention from the biological community. The semantic similarity derived from the Gene Ontology (GO) annotation has been regarded as one of the most powerful indicators for protein interaction. However, conventional methods based on GO similarity fail to take advantage of the specificity of GO terms in the ontology graph. We proposed a GO-based method to predict protein-protein interaction by integrating different kinds of similarity measures derived from the intrinsic structure of the GO graph. We extended five existing methods to derive the semantic similarity measures from the descending part of two GO terms in the GO graph, then adopted a feature integration strategy to combine both the ascending and the descending similarity scores derived from the three sub-ontologies to construct various kinds of features to characterize each protein pair. Support vector machines (SVM) were employed as discriminative classifiers, and five-fold cross validation experiments were conducted on both human and yeast protein-protein interaction datasets to evaluate the performance of different kinds of integrated features. The experimental results suggest that the feature combining information from both the ascending and the descending parts of the three ontologies performs best. Our method is appealing for effective prediction of protein-protein interaction. Copyright © 2016 Elsevier Ltd. All rights reserved.
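
    The distinction this abstract draws between the "ascending" part (ancestors) and the "descending" part (descendants) of two GO terms can be made concrete on a toy ontology. The sketch below uses a plain Jaccard overlap of ancestor sets and of descendant sets; this is a stand-in chosen for illustration and is not one of the five measures the authors extended. The six-term graph and all term names are hypothetical.

```python
# Hypothetical miniature ontology: each term maps to its direct parents.
PARENTS = {
    "root": [],
    "binding": ["root"],
    "catalysis": ["root"],
    "protein_binding": ["binding"],
    "dna_binding": ["binding"],
    "kinase_binding": ["protein_binding"],
}


def ancestors(term):
    """The term itself plus every term reachable by following parents."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def descendants(term):
    """The term itself plus every term that has it as an ancestor."""
    return {other for other in PARENTS if term in ancestors(other)}


def jaccard(first, second):
    """Size of the intersection divided by size of the union."""
    union = first | second
    return len(first & second) / len(union) if union else 0.0


def ascending_similarity(term_a, term_b):
    return jaccard(ancestors(term_a), ancestors(term_b))


def descending_similarity(term_a, term_b):
    return jaccard(descendants(term_a), descendants(term_b))


# Feature vector for one hypothetical pair of annotations.
features = [
    ascending_similarity("protein_binding", "dna_binding"),
    descending_similarity("binding", "protein_binding"),
]
```

    Concatenating ascending and descending scores, computed per sub-ontology, gives the kind of integrated feature vector that could then be passed to a classifier.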

  13. The RCSB protein data bank: integrative view of protein, gene and 3D structural information.

    Science.gov (United States)

    Rose, Peter W; Prlić, Andreas; Altunkaya, Ali; Bi, Chunxiao; Bradley, Anthony R; Christie, Cole H; Costanzo, Luigi Di; Duarte, Jose M; Dutta, Shuchismita; Feng, Zukang; Green, Rachel Kramer; Goodsell, David S; Hudson, Brian; Kalro, Tara; Lowe, Robert; Peisach, Ezra; Randle, Christopher; Rose, Alexander S; Shao, Chenghua; Tao, Yi-Ping; Valasatava, Yana; Voigt, Maria; Westbrook, John D; Woo, Jesse; Yang, Huangwang; Young, Jasmine Y; Zardecki, Christine; Berman, Helen M; Burley, Stephen K

    2017-01-04

    The Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB, http://rcsb.org), the US data center for the global PDB archive, makes PDB data freely available to all users, from structural biologists to computational biologists and beyond. New tools and resources have been added to the RCSB PDB web portal in support of a 'Structural View of Biology.' Recent developments have improved the user experience, including the high-speed NGL Viewer that provides 3D molecular visualization in any web browser, improved support for data file download and enhanced organization of website pages for query, reporting and individual structure exploration. Structure validation information is now visible for all archival entries. PDB data have been integrated with external biological resources, including chromosomal position within the human genome; protein modifications; and metabolic pathways. PDB-101 educational materials have been reorganized into a searchable website and expanded to include new features such as the Geis Digital Archive. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

  14. Cell-specific monitoring of protein synthesis in vivo.

    Directory of Open Access Journals (Sweden)

    Nikos Kourtis

    Full Text Available Analysis of general and specific protein synthesis provides important information, relevant to cellular physiology and function. However, existing methodologies, involving metabolic labelling by incorporation of radioactive amino acids into nascent polypeptides, cannot be applied to monitor protein synthesis in specific cells or tissues, in live specimens. We have developed a novel approach for monitoring protein synthesis in specific cells or tissues, in vivo. Fluorescent reporter proteins such as GFP are expressed in specific cells and tissues of interest or throughout animals using appropriate promoters. Protein synthesis rates are assessed by following fluorescence recovery after partial photobleaching of the fluorophore at targeted sites. We evaluate the method by examining protein synthesis rates in diverse cell types of live, wild type or mRNA translation-defective Caenorhabditis elegans animals. Because it is non-invasive, our approach allows monitoring of protein synthesis in single cells or tissues with intrinsically different protein synthesis rates. Furthermore, it can be readily implemented in other organisms or cell culture systems.

  15. DeepLoc: prediction of protein subcellular localization using deep learning

    DEFF Research Database (Denmark)

    Almagro Armenteros, Jose Juan; Sønderby, Casper Kaae; Sønderby, Søren Kaae

    2017-01-01

    The prediction of eukaryotic protein subcellular localization is a well-studied topic in bioinformatics due to its relevance in proteomics research. Many machine learning methods have been successfully applied in this task, but in most of them, predictions rely on annotation of homologues from...... knowledge databases. For novel proteins where no annotated homologues exist, and for predicting the effects of sequence variants, it is desirable to have methods for predicting protein properties from sequence information only. Here, we present a prediction algorithm using deep neural networks to predict...... current state-of-the-art algorithms, including those relying on homology information. The method is available as a web server at http://www.cbs.dtu.dk/services/DeepLoc . Example code is available at https://github.com/JJAlmagro/subcellular_localization . The dataset is available at http...

  16. Kullback-Leibler information in resolving natural resource conflicts when definitive data exist

    Science.gov (United States)

    Anderson, D.R.; Burnham, K.P.; White, Gary C.

    2001-01-01

    Conflicts often arise in the management of natural resources. Often they result from differing perceptions, varying interpretations of the law, and self-interests among stakeholder groups (for example, the values and perceptions about spotted owls and forest management differ markedly among environmental groups, government regulatory agencies, and timber industries). We extend the conceptual approach to conflict resolution of Anderson et al. (1999) by using information-theoretic methods to provide quantitative evidence for differing stakeholder positions. Importantly, we assume that relevant empirical data exist that are central to the potential resolution of the conflict. We present a hypothetical example involving an experiment to assess potential effects of a chemical on monthly survival probabilities of the hen clam (Spisula solidissima). The conflict centers on 3 stakeholder positions: 1) no effect, 2) an acute effect, and 3) an acute and chronic effect of the chemical treatment. Such data were given to 18 analytical teams to make independent analyses and provide the relative evidence for each of the 3 stakeholder positions in the conflict. The empirical evidence strongly supports only one of the 3 positions in the conflict: the application of the chemical causes acute and chronic effects on monthly survival, following treatment. Formal inference from all the stakeholder positions is provided for the 2 key parameters underlying the hen clam controversy. The estimates of these parameters were essentially unbiased (the relative bias for the control and treatment group's survival probability was -0.857% and 1.400%, respectively) and precise (coefficients of variation were 0.576% and 2.761%, respectively). The advantages of making formal inference from all the models, rather than drawing conclusions from only the estimated best model, are illustrated. Finally, we contrast information-theoretic and Bayesian approaches in terms of how positions in the controversy enter
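
    One common way to express "relative evidence" for competing models in an information-theoretic framework is through Akaike weights, which rescale AIC differences into values that sum to one. The abstract does not state which statistic the teams reported, so the use of Akaike weights here is an assumption, and the three AIC values below are invented for illustration rather than taken from the hen clam example.

```python
import math


def akaike_weights(aic_values):
    """Convert AIC values into Akaike weights that sum to 1.

    Each weight is exp(-delta_i / 2) normalized over all models, where
    delta_i is the difference between model i's AIC and the smallest AIC.
    """
    if not aic_values:
        raise ValueError("at least one AIC value is required")
    best = min(aic_values)
    relative_likelihoods = [math.exp(-0.5 * (aic - best)) for aic in aic_values]
    total = sum(relative_likelihoods)
    return [value / total for value in relative_likelihoods]


# Hypothetical AIC values for the three stakeholder positions.
positions = ["no effect", "acute effect", "acute and chronic effect"]
aic = [112.4, 104.1, 100.0]
weights = akaike_weights(aic)

# Evidence ratio of the best-supported position against the runner-up.
evidence_ratio = weights[2] / weights[1]
```

    Because every candidate position receives a weight, inference can draw on all models at once instead of on the single best-ranked model, which is the advantage the abstract highlights.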

  17. Neuron-Like Networks Between Ribosomal Proteins Within the Ribosome

    Science.gov (United States)

    Poirot, Olivier; Timsit, Youri

    2016-05-01

    From the brain to the World Wide Web, information-processing networks share common scale invariant properties. Here, we reveal the existence of neural-like networks at a molecular scale within the ribosome. We show that with their extensions, ribosomal proteins form complex assortative interaction networks through which they communicate through tiny interfaces. The analysis of the crystal structures of 50S eubacterial particles reveals that most of these interfaces involve key phylogenetically conserved residues. The systematic observation of interactions between basic and aromatic amino acids at the interfaces and along the extension provides new structural insights that may help decipher the molecular mechanisms of signal transmission within or between the ribosomal proteins. Similar to neurons interacting through “molecular synapses”, ribosomal proteins form a network that suggests an analogy with a simple molecular brain in which the “sensory-proteins” innervate the functional ribosomal sites, while the “inter-proteins” interconnect them into circuits suitable to process the information flow that circulates during protein synthesis. It is likely that these circuits have evolved to coordinate both the complex macromolecular motions and the binding of the multiple factors during translation. This opens new perspectives on nanoscale information transfer and processing.

  18. HitPredict version 4: comprehensive reliability scoring of physical protein-protein interactions from more than 100 species.

    Science.gov (United States)

    López, Yosvany; Nakai, Kenta; Patil, Ashwini

    2015-01-01

    HitPredict is a consolidated resource of experimentally identified, physical protein-protein interactions with confidence scores to indicate their reliability. The study of genes and their inter-relationships using methods such as network and pathway analysis requires high quality protein-protein interaction information. Extracting reliable interactions from most of the existing databases is challenging because they either contain only a subset of the available interactions, or a mixture of physical, genetic and predicted interactions. Automated integration of interactions is further complicated by varying levels of accuracy of database content and lack of adherence to standard formats. To address these issues, the latest version of HitPredict provides a manually curated dataset of 398 696 physical associations between 70 808 proteins from 105 species. Manual confirmation was used to resolve all issues encountered during data integration. For improved reliability assessment, this version combines a new score derived from the experimental information of the interactions with the original score based on the features of the interacting proteins. The combined interaction score performs better than either of the individual scores in HitPredict as well as the reliability score of another similar database. HitPredict provides a web interface to search proteins and visualize their interactions, and the data can be downloaded for offline analysis. Data usability has been enhanced by mapping protein identifiers across multiple reference databases. Thus, the latest version of HitPredict provides a significantly larger, more reliable and usable dataset of protein-protein interactions from several species for the study of gene groups. Database URL: http://hintdb.hgc.jp/htp. © The Author(s) 2015. Published by Oxford University Press.

  19. Wiki-pi: a web-server of annotated human protein-protein interactions to aid in discovery of protein function.

    Directory of Open Access Journals (Sweden)

    Naoki Orii

    Full Text Available Protein-protein interactions (PPIs) are the basis of biological functions. Knowledge of the interactions of a protein can help understand its molecular function and its association with different biological processes and pathways. Several publicly available databases provide comprehensive information about individual proteins, such as their sequence, structure, and function. There also exist databases that are built exclusively to provide PPIs by curating them from published literature. The information provided in these web resources is protein-centric, and not PPI-centric. The PPIs are typically provided as lists of interactions of a given gene with links to interacting partners; they do not present a comprehensive view of the nature of both the proteins involved in the interactions. A web database that allows search and retrieval based on biomedical characteristics of PPIs is lacking, and is needed. We present Wiki-Pi (read Wiki-π), a web-based interface to a database of human PPIs, which allows users to retrieve interactions by their biomedical attributes such as their association to diseases, pathways, drugs and biological functions. Each retrieved PPI is shown with annotations of both of the participant proteins side-by-side, creating a basis to hypothesize the biological function facilitated by the interaction. Conceptually, it is a search engine for PPIs analogous to PubMed for scientific literature. Its usefulness in generating novel scientific hypotheses is demonstrated through the study of IGSF21, a little-known gene that was recently identified to be associated with diabetic retinopathy. Using Wiki-Pi, we infer that its association to diabetic retinopathy may be mediated through its interactions with the genes HSPB1, KRAS, TMSB4X and DGKD, and that it may be involved in cellular response to external stimuli, cytoskeletal organization and regulation of molecular activity. The website also provides a wiki-like capability allowing users

  20. Triplet Excited States as a Source of Relevant (Bio)Chemical Information

    OpenAIRE

    Jiménez Molero, María Consuelo; Miranda Alonso, Miguel Ángel

    2014-01-01

    The properties of triplet excited states are markedly medium-dependent, which makes these species valuable tools for investigating the microenvironments existing in protein binding pockets. Monitoring of the triplet excited state behavior of drugs within transport proteins (serum albumins and alpha(1)-acid glycoproteins) by laser flash photolysis constitutes a valuable source of information on the strength of interaction, conformational freedom and protection from oxygen or other external...

  1. Automatic annotation of protein motif function with Gene Ontology terms

    Directory of Open Access Journals (Sweden)

    Gopalakrishnan Vanathi

    2004-09-01

    Full Text Available Abstract Background: Conserved protein sequence motifs are short stretches of amino acid sequence patterns that potentially encode the function of proteins. Several sequence pattern searching algorithms and programs exist for identifying candidate protein motifs at the whole genome level. However, a much needed and important task is to determine the functions of the newly identified protein motifs. The Gene Ontology (GO) project is an endeavor to annotate the function of genes or protein sequences with terms from a dynamic, controlled vocabulary, and these annotations serve well as a knowledge base. Results: This paper presents methods to mine the GO knowledge base and use the association between the GO terms assigned to a sequence and the motifs matched by the same sequence as evidence for predicting the functions of novel protein motifs automatically. The task of assigning GO terms to protein motifs is viewed as both a binary classification and an information retrieval problem, where PROSITE motifs are used as samples for model training and functional prediction. The mutual information of a motif and a GO term association is found to be a very useful feature. We take advantage of the known motifs to train a logistic regression classifier, which allows us to combine mutual information with other frequency-based features and obtain a probability of correct association. The trained logistic regression model has intuitively meaningful and logically plausible parameter values, and performs very well empirically according to our evaluation criteria. Conclusions: In this research, different methods for automatic annotation of protein motifs have been investigated. Empirical results demonstrate that the methods have great potential for detecting and augmenting information about the functions of newly discovered candidate protein motifs.
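
    The mutual-information feature mentioned in this abstract can be illustrated with a minimal sketch. It assumes the motif/GO-term association is summarized as a 2x2 table of sequence counts; the function name and argument names are hypothetical, and the abstract does not say which estimator the authors actually used.

```python
import math


def mutual_information(n11, n10, n01, n00):
    """Mutual information (in bits) between two binary events.

    Hypothetical count layout (an assumption, not taken from the paper):
      n11: sequences matching the motif AND annotated with the GO term
      n10: sequences matching the motif only
      n01: sequences annotated with the GO term only
      n00: sequences with neither
    """
    total = n11 + n10 + n01 + n00
    if total == 0:
        raise ValueError("the contingency table is empty")
    motif_yes, motif_no = n11 + n10, n01 + n00
    term_yes, term_no = n11 + n01, n10 + n00
    # Each cell: (joint count, motif marginal, GO-term marginal).
    cells = [
        (n11, motif_yes, term_yes),
        (n10, motif_yes, term_no),
        (n01, motif_no, term_yes),
        (n00, motif_no, term_no),
    ]
    mi = 0.0
    for joint, row, col in cells:
        if joint > 0:  # empty cells contribute nothing to the sum
            mi += (joint / total) * math.log2(joint * total / (row * col))
    return mi
```

    A value near zero means the motif and the term co-occur about as often as chance predicts; larger values indicate a stronger association that a downstream classifier could use as one input feature.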

  2. 76 FR 81517 - Agency Information Collection Activities: Form I-131, Revision of an Existing Information...

    Science.gov (United States)

    2011-12-28

    ... information collection. (2) Title of the Form/Collection: Application for Travel Document. (3) Agency form... DEPARTMENT OF HOMELAND SECURITY U.S. Citizenship and Immigration Services Agency Information...-Day Notice of Information Collection Under Review: Form I- 131, Application for Travel Document. The...

  3. Protein engineering techniques gateways to synthetic protein universe

    CERN Document Server

    Poluri, Krishna Mohan

    2017-01-01

    This brief provides a broad overview of protein-engineering research, offering a glimpse of the most common experimental methods. It also presents various computational programs with applications that are widely used in directed evolution, computational and de novo protein design. Further, it sheds light on the advantages and pitfalls of existing methodologies and future perspectives of protein engineering techniques.

  4. The human interactome knowledge base (hint-kb): An integrative human protein interaction database enriched with predicted protein–protein interaction scores using a novel hybrid technique

    KAUST Repository

    Theofilatos, Konstantinos A.

    2013-07-12

    Proteins are the functional components of many cellular processes and the identification of their physical protein–protein interactions (PPIs) is an area of mature academic research. Various databases have been developed containing information about experimentally and computationally detected human PPIs as well as their corresponding annotation data. However, these databases contain many false positive interactions, are incomplete, and only a few of them incorporate data from various sources. To overcome these limitations, we have developed HINT-KB (http://biotools.ceid.upatras.gr/hint-kb/), a knowledge base that integrates data from various sources, provides a user-friendly interface for their retrieval, calculates a set of features of interest and computes a confidence score for every candidate protein interaction. This confidence score is essential for filtering the false positive interactions which are present in existing databases, predicting new protein interactions and measuring the frequency of each true protein interaction. For this reason, a novel machine learning hybrid methodology, called Evolutionary Kalman Mathematical Modelling (EvoKalMaModel), was used to achieve an accurate and interpretable scoring methodology. The experimental results indicated that the proposed scoring scheme outperforms existing computational methods for the prediction of PPIs.

  5. The Protein Model Portal.

    Science.gov (United States)

    Arnold, Konstantin; Kiefer, Florian; Kopp, Jürgen; Battey, James N D; Podvinec, Michael; Westbrook, John D; Berman, Helen M; Bordoli, Lorenza; Schwede, Torsten

    2009-03-01

    Structural Genomics has been successful in determining the structures of many unique proteins in a high throughput manner. Still, the number of known protein sequences is much larger than the number of experimentally solved protein structures. Homology (or comparative) modeling methods make use of experimental protein structures to build models for evolutionarily related proteins. Thereby, experimental structure determination efforts and homology modeling complement each other in the exploration of the protein structure space. One of the challenges in using model information effectively has been to access all models available for a specific protein in heterogeneous formats at different sites using various incompatible accession code systems. Often, structure models for hundreds of proteins can be derived from a given experimentally determined structure, using a variety of established methods. This has been done by all of the PSI centers, and by various independent modeling groups. The goal of the Protein Model Portal (PMP) is to provide a single portal which gives access to the various models that can be leveraged from PSI targets and other experimental protein structures. A single interface allows all existing pre-computed models across these various sites to be queried simultaneously, and provides links to interactive services for template selection, target-template alignment, model building, and quality assessment. The current release of the portal consists of 7.6 million model structures provided by different partner resources (CSMP, JCSG, MCSG, NESG, NYSGXRC, JCMM, ModBase, SWISS-MODEL Repository). The PMP is available at http://www.proteinmodelportal.org and from the PSI Structural Genomics Knowledgebase.

  6. Improving Classification of Protein Interaction Articles Using Context Similarity-Based Feature Selection.

    Science.gov (United States)

    Chen, Yifei; Sun, Yuxing; Han, Bing-Qing

    2015-01-01

    Protein interaction article classification is a text classification task in the biological domain to determine which articles describe protein-protein interactions. Since the feature space in text classification is high-dimensional, feature selection is widely used for reducing the dimensionality of features to speed up computation without sacrificing classification performance. Many existing feature selection methods are based on the statistical measure of document frequency and term frequency. One potential drawback of these methods is that they treat features separately. Hence, first we design a similarity measure between the context information to take word cooccurrences and phrase chunks around the features into account. Then we introduce the similarity of context information to the importance measure of the features to substitute the document and term frequency. Hence we propose new context similarity-based feature selection methods. Their performance is evaluated on two protein interaction article collections and compared against the frequency-based methods. The experimental results reveal that the context similarity-based methods perform better in terms of the F1 measure and the dimension reduction rate. Benefiting from the context information surrounding the features, the proposed methods can select distinctive features effectively for protein interaction article classification.
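
    As a reminder of how the F1 measure cited in this abstract is computed, here is a minimal sketch using the standard definition (the harmonic mean of precision and recall). The function name and the count-based interface are illustrative assumptions and are not taken from the paper.

```python
def f1_score(true_pos, false_pos, false_neg):
    """F1 measure from raw counts for the positive class.

    In this setting the positive class would be "article describes a
    protein-protein interaction" (an illustrative assumption).
    """
    if true_pos == 0:
        # Precision and recall are both zero (or undefined); report 0.0.
        return 0.0
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)
```

    Because F1 ignores true negatives, it suits collections in which most articles do not describe an interaction.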

  7. Determination of structural fluctuations of proteins from structure-based calculations of residual dipolar couplings

    International Nuclear Information System (INIS)

    Montalvao, Rinaldo W.; De Simone, Alfonso; Vendruscolo, Michele

    2012-01-01

    Residual dipolar couplings (RDCs) have the potential of providing detailed information about the conformational fluctuations of proteins. It is very challenging, however, to extract such information because of the complex relationship between RDCs and protein structures. A promising approach to decode this relationship involves structure-based calculations of the alignment tensors of protein conformations. By implementing this strategy to generate structural restraints in molecular dynamics simulations we show that it is possible to extract effectively the information provided by RDCs about the conformational fluctuations in the native states of proteins. The approach that we present can be used in a wide range of alignment media, including Pf1, charged bicelles and gels. The accuracy of the method is demonstrated by the analysis of the Q factors for RDCs not used as restraints in the calculations, which are significantly lower than those corresponding to existing high-resolution structures and structural ensembles, hence showing that we capture effectively the contributions to RDCs from conformational fluctuations.
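
    The Q factor used above as an accuracy measure can be sketched as follows. The abstract does not state which normalization the authors used, so this assumes the commonly used form, the root of the summed squared deviations divided by the summed squared observed couplings; the function name is hypothetical.

```python
import math


def q_factor(observed, calculated):
    """Q factor between observed and back-calculated RDCs (same units).

    Assumed form: sqrt(sum((obs - calc)^2) / sum(obs^2)).
    Lower values mean better agreement; 0.0 is a perfect match.
    """
    if len(observed) != len(calculated) or len(observed) == 0:
        raise ValueError("need two non-empty sequences of equal length")
    squared_dev = sum((o - c) ** 2 for o, c in zip(observed, calculated))
    squared_obs = sum(o ** 2 for o in observed)
    if squared_obs == 0:
        raise ValueError("observed couplings are all zero")
    return math.sqrt(squared_dev / squared_obs)
```

    Evaluating it only on couplings that were left out of the restraints, as the abstract describes, gives a cross-validated rather than a fitted measure of agreement.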

  8. Inferring repeat-protein energetics from evolutionary information.

    Directory of Open Access Journals (Sweden)

    Rocío Espada

    2017-06-01

    Full Text Available Natural protein sequences contain a record of their history. A common constraint in a given protein family is the ability to fold to specific structures, and it has been shown possible to infer the main native ensemble by analyzing covariations in extant sequences. Still, many natural proteins that fold into the same structural topology show different stabilization energies, and these are often related to their physiological behavior. We propose a description for the energetic variation given by sequence modifications in repeat proteins, systems for which the overall problem is simplified by their inherent symmetry. We explicitly account for single amino acid and pair-wise interactions and treat higher order correlations with a single term. We show that the resulting evolutionary field can be interpreted with structural detail. We trace the variations in the energetic scores of natural proteins and relate them to their experimental characterization. The resulting energetic evolutionary field allows the prediction of the folding free energy change for several mutants, and can be used to generate synthetic sequences that are statistically indistinguishable from the natural counterparts.

  9. Actin, actin-binding proteins, and actin-related proteins in the nucleus.

    Science.gov (United States)

    Kristó, Ildikó; Bajusz, Izabella; Bajusz, Csaba; Borkúti, Péter; Vilmos, Péter

    2016-04-01

    Extensive research in the past decade has significantly broadened our view about the role actin plays in the life of the cell and added novel aspects to actin research. One of these new aspects is the discovery of the existence of nuclear actin which became evident only recently. Nuclear activities including transcriptional activation in the case of all three RNA polymerases, editing and nuclear export of mRNAs, and chromatin remodeling all depend on actin. It also became clear that there is a fine-tuned equilibrium between cytoplasmic and nuclear actin pools and that this balance is ensured by an export-import system dedicated to actin. After over half a century of research on conventional actin and its organizing partners in the cytoplasm, it was also an unexpected finding that the nucleus contains more than 30 actin-binding proteins and new classes of actin-related proteins which are not able to form filaments but had evolved nuclear-specific functions. The actin-binding and actin-related proteins in the nucleus have been linked to RNA transcription and processing, nuclear transport, and chromatin remodeling. In this paper, we attempt to provide an overview of the wide range of information that is now available about actin, actin-binding, and actin-related proteins in the nucleus.

  10. FireProt: web server for automated design of thermostable proteins

    Science.gov (United States)

    Musil, Milos; Stourac, Jan; Brezovsky, Jan; Prokop, Zbynek; Zendulka, Jaroslav; Martinek, Tomas

    2017-01-01

    There is a continuous interest in increasing protein stability to enhance the usability of proteins in numerous biomedical and biotechnological applications. A number of in silico tools for the prediction of the effect of mutations on protein stability have been developed recently. However, only single-point mutations with a small effect on protein stability are typically predicted with the existing tools and have to be followed by laborious protein expression, purification, and characterization. Here, we present FireProt, a web server for the automated design of multiple-point thermostable mutant proteins that combines structural and evolutionary information in its calculation core. FireProt utilizes sixteen tools and three protein engineering strategies for making reliable protein designs. The server is complemented with an interactive, easy-to-use interface that allows users to directly analyze and optionally modify designed thermostable mutants. FireProt is freely available at http://loschmidt.chemi.muni.cz/fireprot. PMID:28449074

  11. DbPTM 3.0: an informative resource for investigating substrate site specificity and functional association of protein post-translational modifications.

    Science.gov (United States)

    Lu, Cheng-Tsung; Huang, Kai-Yao; Su, Min-Gang; Lee, Tzong-Yi; Bretaña, Neil Arvin; Chang, Wen-Chi; Chen, Yi-Ju; Chen, Yu-Ju; Huang, Hsien-Da

    2013-01-01

    Protein modification is an extremely important post-translational regulation that adjusts the physical and chemical properties, conformation, stability and activity of a protein, thus altering protein function. Due to the high throughput of mass spectrometry (MS)-based methods in identifying site-specific post-translational modifications (PTMs), dbPTM (http://dbPTM.mbc.nctu.edu.tw/) is updated to integrate experimental PTMs obtained from public resources as well as manually curated MS/MS peptides associated with PTMs from research articles. Version 3.0 of dbPTM aims to be an informative resource for investigating the substrate specificity of PTM sites and functional association of PTMs between substrates and their interacting proteins. In order to investigate the substrate specificity for modification sites, a newly developed statistical method has been applied to identify the significant substrate motifs for each type of PTM containing sufficient experimental data. According to the data statistics in dbPTM, >60% of PTM sites are located in the functional domains of proteins. It is known that most PTMs can create binding sites for specific protein-interaction domains that work together for cellular function. Thus, this update integrates protein-protein interaction and domain-domain interaction to determine the functional association of PTM sites located in protein-interacting domains. Additionally, the information of structural topologies on transmembrane (TM) proteins is integrated in dbPTM in order to delineate the structural correlation between the reported PTM sites and TM topologies. To facilitate the investigation of PTMs on TM proteins, the PTM substrate sites and the structural topology are graphically represented. Literature information related to PTMs, orthologous conservations and substrate motifs of PTMs are also provided in the resource. Finally, this version features an improved web interface to facilitate convenient access to the resource.

  12. Inhibition of existing denitrification enzyme activity by chloramphenicol

    Science.gov (United States)

    Brooks, M.H.; Smith, R.L.; Macalady, D.L.

    1992-01-01

    Chloramphenicol completely inhibited the activity of existing denitrification enzymes in acetylene-block incubations with (i) sediments from a nitrate-contaminated aquifer and (ii) a continuous culture of denitrifying groundwater bacteria. Control flasks with no antibiotic produced significant amounts of nitrous oxide in the same time period. Amendment with chloramphenicol after nitrous oxide production had begun resulted in a significant decrease in the rate of nitrous oxide production. Chloramphenicol also decreased (>50%) the activity of existing denitrification enzymes in pure cultures of Pseudomonas denitrificans that were harvested during log-phase growth and maintained for 2 weeks in a starvation medium lacking electron donor. Short-term time courses of nitrate consumption and nitrous oxide production in the presence of acetylene with P. denitrificans undergoing carbon starvation were performed under optimal conditions designed to mimic denitrification enzyme activity assays used with soils. Time courses were linear for both chloramphenicol and control flasks, and rate estimates for the two treatments were significantly different at the 95% confidence level. Complete or partial inhibition of existing enzyme activity is not consistent with the current understanding of the mode of action of chloramphenicol or current practice, in which the compound is frequently employed to inhibit de novo protein synthesis during the course of microbial activity assays. The results of this study demonstrate that chloramphenicol amendment can inhibit the activity of existing denitrification enzymes and suggest that caution is needed in the design and interpretation of denitrification activity assays in which chloramphenicol is used to prevent new protein synthesis.

  13. DECK: Distance and environment-dependent, coarse-grained, knowledge-based potentials for protein-protein docking

    Directory of Open Access Journals (Sweden)

    Vakser Ilya A

    2011-07-01

    Full Text Available Abstract Background: Computational approaches to protein-protein docking typically include scoring aimed at improving the rank of the near-native structure relative to the false-positive matches. Knowledge-based potentials improve modeling of protein complexes by taking advantage of the rapidly increasing amount of experimentally derived information on protein-protein association. An essential element of knowledge-based potentials is defining the reference state for an optimal description of the residue-residue (or atom-atom) pairs in the non-interaction state. Results: The study presents a new Distance- and Environment-dependent, Coarse-grained, Knowledge-based (DECK) potential for scoring of protein-protein docking predictions. Training sets of protein-protein matches were generated based on bound and unbound forms of proteins taken from the DOCKGROUND resource. Each residue was represented by a pseudo-atom in the geometric center of the side chain. To capture the long-range and the multi-body interactions, residues in different secondary structure elements at protein-protein interfaces were considered as different residue types. Five reference states for the potentials were defined and tested. The optimal reference state was selected and the cutoff effect on the distance-dependent potentials investigated. The potentials were validated on the docking decoy sets, showing better performance than the existing potentials used in scoring of protein-protein docking results. Conclusions: A novel residue-based statistical potential for protein-protein docking was developed and validated on docking decoy sets. The results show that the scoring function DECK can successfully identify near-native protein-protein matches and thus is useful in protein docking.
In addition to the practical application of the potentials, the study provides insights into the relative utility of the reference states, the scope of the distance dependence, and the coarse-graining of

  14. Protein in Urine: MedlinePlus Lab Test Information

    Science.gov (United States)

    ... this page: https://medlineplus.gov/labtests/proteininurine.html Protein in Urine What is a Protein in Urine Test? A protein in urine test ...

  15. Implementation of an anonymisation tool for clinical trials using a clinical trial processor integrated with an existing trial patient data information system

    NARCIS (Netherlands)

    Aryanto, Kadek Y. E.; Broekema, Andre; Oudkerk, Matthijs; van Ooijen, Peter M. A.

    To present an adapted Clinical Trial Processor (CTP) test set-up for receiving, anonymising and saving Digital Imaging and Communications in Medicine (DICOM) data using external input from the original database of an existing clinical study information system to guide the anonymisation process. Two

  16. Financial gap calculations for existing cogeneration 2008

    International Nuclear Information System (INIS)

    Hers, S.J.; Wetzels, W.; Seebregts, A.J.; Van der Welle, A.J.

    2008-05-01

    The Dutch SDE (abbreviation for the renewable energy incentive) subsidy scheme promotes the reduction of CO2 emissions which results from the use of Combined Heat and Power (CHP) plants. This report calculates the profitability of operation of existing CHP plants. This information can be used for decision making on the SDE subsidy for existing CHP plants in 2008 [nl

  17. On the existence of optimal contract mechanisms for incomplete information principal-agent models

    NARCIS (Netherlands)

    Balder, E.J.

    1997-01-01

    Two abstract results are given for the existence of optimal contract selection mechanisms in principal-agent models; by a suitable reformulation of the (almost) incentive compatibility constraint, they deal with both single- and multi-agent models. In particular, it is shown that the existence

  18. Proteomic amino-termini profiling reveals targeting information for protein import into complex plastids.

    Directory of Open Access Journals (Sweden)

    Pitter F Huesgen

    Full Text Available In organisms with complex plastids acquired by secondary endosymbiosis from a photosynthetic eukaryote, the majority of plastid proteins are nuclear-encoded, translated on cytoplasmic ribosomes, and guided across four membranes by a bipartite targeting sequence. In-depth understanding of this vital import process has been impeded by a lack of information about the transit peptide part of this sequence, which mediates transport across the inner three membranes. We determined the mature N-termini of hundreds of proteins from the model diatom Thalassiosira pseudonana, revealing extensive N-terminal modification by acetylation and proteolytic processing in both cytosol and plastid. We identified 63 mature N-termini of nucleus-encoded plastid proteins, deduced their complete transit peptide sequences, determined a consensus motif for their cleavage by the stromal processing peptidase, and found evidence for subsequent processing by a plastid methionine aminopeptidase. The cleavage motif differs from that of higher plants, but is shared with other eukaryotes with complex plastids.

  19. Protein immobilization strategies for protein biochips

    NARCIS (Netherlands)

    Rusmini, F.; Rusmini, Federica; Zhong, Zhiyuan; Feijen, Jan

    2007-01-01

    In the past few years, protein biochips have emerged as promising proteomic and diagnostic tools for obtaining information about protein functions and interactions. Important technological innovations have been made. However, considerable development is still required, especially regarding protein

  20. Protein Molecular Structures, Protein SubFractions, and Protein Availability Affected by Heat Processing: A Review

    International Nuclear Information System (INIS)

    Yu, P.

    2007-01-01

    The utilization and availability of protein depended on the types of protein and their specific susceptibility to enzymatic hydrolysis (inhibitory activities) in the gastrointestinal tract and was highly associated with protein molecular structures. Studying internal protein structure and protein subfraction profiles led to an understanding of the components that make up a whole protein. An understanding of the molecular structure of the whole protein was often vital to understanding its digestive behavior and nutritive value in animals. In this review, recently obtained information on protein molecular structural effects of heat processing was reviewed, in relation to protein characteristics affecting digestive behavior and nutrient utilization and availability. The emphasis of this review was on (1) using the newly advanced synchrotron technology (S-FTIR) as a novel approach to reveal protein molecular chemistry affected by heat processing within intact plant tissues; (2) revealing the effects of heat processing on the profile changes of protein subfractions associated with digestive behaviors and kinetics manipulated by heat processing; (3) prediction of the changes of protein availability and supply after heat processing, using the advanced DVE/OEB and NRC-2001 models, and (4) obtaining information on optimal processing conditions of protein as intestinal protein source to achieve target values for potential high net absorbable protein in the small intestine. The information described in this article may give better insight into the mechanisms involved and the intrinsic protein molecular structural changes occurring upon processing.

  1. Survey of the Effectiveness of Internet Information on Patient Education for Bone Morphogenetic Protein.

    Science.gov (United States)

    Huang, Meng; Briceño, Valentina; Lam, Sandi K; Luerssen, Thomas G; Jea, Andrew

    2016-03-01

    In light of recent reports of potential short- and long-term complications of bone morphogenetic protein (BMP) and increasing "off-label" use among spine surgeons, we wished to analyze online information on BMP and its controversial uses, as patients frequently search the Internet for medical information, even though the quality and accuracy of available information are highly variable. Between December 2014 and January 2015, we conducted a Google search to identify the 50 most accessed websites providing BMP information using the search phrase "bone morphogenetic protein." Websites were classified based on authorship. Each website was examined for the provision of appropriate patient inclusion and exclusion criteria, surgical and nonsurgical treatment alternatives, purported benefits, disclosure of common and potential complications, peer-reviewed literature citations, and discussion of off-label use. Two percent of websites were authored by private medical groups, 2% by academic medical groups, 10% by insurance companies, 16% by biomedical industries, 4% by news sources, 0% by lawyers, and 66% by others. Sixty-two percent referenced peer-reviewed literature. Benefits and complications were reported in 44% and 26% of websites, respectively. Surgical and nonsurgical treatment alternatives were mentioned in 16% and 4% of websites, respectively. Discussion of off-label BMP use occurred in 18% of websites. Our study showed the ineffectiveness of the Internet in reporting quality information on BMP use. We found that websites authored by insurance companies provide an acceptable foundation for patient education. This, however, cannot replace the need for a thorough dialogue between doctor and patient about risks, benefits, and indications. Copyright © 2016. Published by Elsevier Inc.

  2. 75 FR 63180 - Agency Information Collection Activities: Existing Collection; Emergency Extension

    Science.gov (United States)

    2010-10-14

    ... be effective after the current October 31, 2010 expiration date. FOR FURTHER INFORMATION CONTACT... collected information from State and local governments with 100 or more full-time employees since 1974... governments and to provide information on the employment status of minorities and women. The data are shared...

  3. 77 FR 15787 - Agency Information Collection Activities: Form I-131, Revision of an Existing Information...

    Science.gov (United States)

    2012-03-16

    ... the Form/Collection: Application for Travel Document. (3) Agency Form Number, if any, and the... DEPARTMENT OF HOMELAND SECURITY U.S. Citizenship and Immigration Services Agency Information...-Day Notice of Information Collection Under Review: Form I- 131, Application for Travel Document. The...

  4. 75 FR 40829 - Agency Information Collection Activities: Existing Collection; Emergency Extension

    Science.gov (United States)

    2010-07-14

    ... be effective after the current July 31, 2010 expiration date. FOR FURTHER INFORMATION CONTACT: Ronald... information from State and local governments with 100 or more full-time employees since 1974 (biennially in... information on the employment status of minorities and women. The data are shared with several other Federal...

  5. Predicting protein-protein interactions in Arabidopsis thaliana through integration of orthology, gene ontology and co-expression

    Directory of Open Access Journals (Sweden)

    Vandepoele Klaas

    2009-06-01

    Background: Large-scale identification of the interrelationships between different components of the cell, such as the interactions between proteins, has recently gained great interest. However, unraveling large-scale protein-protein interaction maps is laborious and expensive. Moreover, assessing the reliability of the interactions can be cumbersome. Results: In this study, we have developed a computational method that exploits the existing knowledge on protein-protein interactions in diverse species through orthologous relations on the one hand, and functional association data on the other hand, to predict and filter protein-protein interactions in Arabidopsis thaliana. A highly reliable set of protein-protein interactions is predicted through this integrative approach, making use of existing protein-protein interaction data from yeast, human, C. elegans and D. melanogaster. Localization, biological process, and co-expression data are used as powerful indicators for protein-protein interactions. The functional repertoire of the identified interactome reveals interactions between proteins functioning in well-conserved as well as plant-specific biological processes. We observe that although common mechanisms (e.g. actin polymerization) and components (e.g. ARPs, actin-related proteins) exist between different lineages, they are active in specific processes such as growth, cancer metastasis and trichome development in yeast, human and Arabidopsis, respectively. Conclusion: We conclude that the integration of orthology with functional association data is adequate to predict protein-protein interactions. Through this approach, a high number of novel protein-protein interactions with diverse biological roles is discovered. Overall, we have predicted a reliable set of protein-protein interactions suitable for further computational as well as experimental analyses.
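
    The orthology-based transfer and functional filtering described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the 0.5 co-expression cutoff, and all protein identifiers are assumptions made up for the example, and the record does not say how the evidence types are actually combined.

```python
from itertools import product


def predict_interologs(known_ppis, orthologs):
    """Transfer interactions from a reference species to a target species.

    known_ppis: iterable of (protein_a, protein_b) pairs in the reference species.
    orthologs: dict mapping a reference protein to a set of target orthologs.
    Returns a set of frozensets, each a predicted target-species pair.
    """
    predicted = set()
    for a, b in known_ppis:
        for ta, tb in product(orthologs.get(a, ()), orthologs.get(b, ())):
            if ta != tb:  # self-pairs are skipped in this simple sketch
                predicted.add(frozenset((ta, tb)))
    return predicted


def filter_by_evidence(pairs, coexpression, shared_go, min_corr=0.5):
    """Keep pairs supported by co-expression or by a shared GO annotation.

    min_corr is an arbitrary illustrative cutoff, not a published value.
    """
    kept = set()
    for pair in pairs:
        if coexpression.get(pair, 0.0) >= min_corr or pair in shared_go:
            kept.add(pair)
    return kept


# Toy data: every identifier below is hypothetical.
yeast_ppis = [("Y1", "Y2"), ("Y2", "Y3")]
yeast_to_arabidopsis = {"Y1": {"AT1"}, "Y2": {"AT2", "AT2b"}, "Y3": {"AT3"}}
candidates = predict_interologs(yeast_ppis, yeast_to_arabidopsis)

coexpr = {frozenset(("AT1", "AT2")): 0.8, frozenset(("AT2b", "AT3")): 0.1}
go_support = {frozenset(("AT2", "AT3"))}
reliable = filter_by_evidence(candidates, coexpr, go_support)
```

    Storing each pair as a frozenset makes the interaction undirected, so ("AT1", "AT2") and ("AT2", "AT1") count as the same prediction.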

  6. Construction of ontology augmented networks for protein complex prediction.

    Science.gov (United States)

    Zhang, Yijia; Lin, Hongfei; Yang, Zhihao; Wang, Jian

    2013-01-01

    Protein complexes are of great importance in understanding the principles of cellular organization and function. The increase in available protein-protein interaction data, gene ontology and other resources makes it possible to develop computational methods for protein complex prediction. Most existing methods focus mainly on the topological structure of protein-protein interaction networks, and largely ignore the gene ontology annotation information. In this article, we constructed ontology augmented networks with protein-protein interaction data and gene ontology, which effectively unified the topological structure of protein-protein interaction networks and the similarity of gene ontology annotations into unified distance measures. After constructing ontology augmented networks, a novel method (clustering based on ontology augmented networks) was proposed to predict protein complexes, which was capable of taking into account the topological structure of the protein-protein interaction network, as well as the similarity of gene ontology annotations. Our method was applied to two different yeast protein-protein interaction datasets and predicted many well-known complexes. The experimental results showed that (i) ontology augmented networks and the unified distance measure can effectively combine the structure closeness and gene ontology annotation similarity; (ii) our method is valuable in predicting protein complexes and has higher F1 and accuracy compared to other competing methods.

  7. 76 FR 4930 - Agency Information Collection Activities: Extension of an Existing Information Collection...

    Science.gov (United States)

    2011-01-27

    ... DEPARTMENT OF HOMELAND SECURITY United States Immigration and Customs Enforcement Agency... addressed to OMB Desk Officer, for United States Immigration and Customs Enforcement, Department of Homeland...-Day Notice of Information Collection for Review; Immigration Bond; OMB Control No. 1653-0022. The...

  8. EST2Prot: Mapping EST sequences to proteins

    Directory of Open Access Journals (Sweden)

    Lin David M

    2006-03-01

    Background: EST libraries are used in various biological studies, from microarray experiments to proteomic and genetic screens. These libraries usually contain many uncharacterized ESTs that are typically ignored since they cannot be mapped to known genes. Consequently, new discoveries are possibly overlooked. Results: We describe a system (EST2Prot) that uses multiple elements to map EST sequences to their corresponding protein products. EST2Prot uses UniGene clusters, substring analysis, information about protein coding regions in existing DNA sequences and protein database searches to detect protein products related to a query EST sequence. Gene Ontology terms, Swiss-Prot keywords, and protein similarity data are used to map the ESTs to functional descriptors. Conclusion: EST2Prot extends and significantly enriches the popular UniGene mapping by utilizing multiple relations between known biological entities. It produces a mapping between ESTs and proteins in real-time through a simple web-interface. The system is part of the Biozon database and is accessible at http://biozon.org/tools/est/.

  9. 76 FR 10609 - Agency Information Collection Activities: Form I-290B, Revision of an Existing Information...

    Science.gov (United States)

    2011-02-25

    ...: 30-Day Notice of Information Collection Under Review: Form I- 290B, Notice of Appeal to the Office of Administrative Appeals (AAO); OMB Control No. 1615-0095. The Department of Homeland Security, U.S. Citizenship... submission of responses. Overview of this information collection: (1) Type of Information Collection...

  10. CMsearch: simultaneous exploration of protein sequence space and structure space improves not only protein homology detection but also protein structure prediction

    KAUST Repository

    Cui, Xuefeng

    2016-06-15

    Motivation: Protein homology detection, a fundamental problem in computational biology, is an indispensable step toward predicting protein structures and understanding protein functions. Despite the advances in recent decades on sequence alignment, threading and alignment-free methods, protein homology detection remains a challenging open problem. Recently, network methods that try to find transitive paths in the protein structure space demonstrate the importance of incorporating network information of the structure space. Yet, current methods merge the sequence space and the structure space into a single space, and thus introduce inconsistency in combining different sources of information. Method: We present a novel network-based protein homology detection method, CMsearch, based on cross-modal learning. Instead of exploring a single network built from the mixture of sequence and structure space information, CMsearch builds two separate networks to represent the sequence space and the structure space. It then learns sequence–structure correlation by simultaneously taking sequence information, structure information, sequence space information and structure space information into consideration. Results: We tested CMsearch on two challenging tasks, protein homology detection and protein structure prediction, by querying all 8332 PDB40 proteins. Our results demonstrate that CMsearch is insensitive to the similarity metrics used to define the sequence and the structure spaces. By using HMM–HMM alignment as the sequence similarity metric, CMsearch clearly outperforms state-of-the-art homology detection methods and the CASP-winning template-based protein structure prediction methods.

  11. 30 CFR 778.9 - Certifying and updating existing permit application information.

    Science.gov (United States)

    2010-07-01

    ... you have previously applied for a permit and the required information is already in AVS, then you may... information already in AVS is accurate and complete may certify to us by swearing or affirming, under oath and in writing, that the relevant information in AVS is accurate, complete, and up to date. (2) Part of...

  12. 77 FR 14535 - Agency Information Collection Activities: Extension of an Existing Information Collection...

    Science.gov (United States)

    2012-03-12

    ... DEPARTMENT OF HOMELAND SECURITY U.S. Citizenship and Immigration Services Agency Information... Department of Homeland Security, U.S. Citizenship and Immigration Services (USCIS) will be submitting the...; U.S. Citizenship and Immigration Services (USCIS). (4) Affected public who will be asked or required...

  13. IRaPPA: information retrieval based integration of biophysical models for protein assembly selection.

    Science.gov (United States)

    Moal, Iain H; Barradas-Bautista, Didier; Jiménez-García, Brian; Torchala, Mieczyslaw; van der Velde, Arjan; Vreven, Thom; Weng, Zhiping; Bates, Paul A; Fernández-Recio, Juan

    2017-06-15

    In order to function, proteins frequently bind to one another and form 3D assemblies. Knowledge of the atomic details of these structures helps our understanding of how proteins work together and how mutations can lead to disease, and facilitates the design of drugs which prevent or mimic the interaction. Atomic modeling of protein-protein interactions requires the selection of near-native structures from a set of docked poses based on their calculable properties. By considering this as an information retrieval problem, we have adapted methods developed for Internet search ranking and electoral voting into IRaPPA, a pipeline integrating biophysical properties. The approach enhances the identification of near-native structures when applied to four docking methods, resulting in a near-native appearing in the top 10 solutions for up to 50% of complexes benchmarked, and up to 70% in the top 100. IRaPPA has been implemented in the SwarmDock server (http://bmm.crick.ac.uk/~SwarmDock/), pyDock server (http://life.bsc.es/pid/pydockrescoring/) and ZDOCK server (http://zdock.umassmed.edu/), with code available on request. Contact: moal@ebi.ac.uk. Supplementary data are available at Bioinformatics online. © The Author 2017. Published by Oxford University Press.
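
    As a rough illustration of treating pose selection as a voting problem, the sketch below merges several rankings of the same docked poses with a Borda count. The Borda count is chosen here only as one familiar electoral method; the record does not state which voting or ranking schemes IRaPPA actually uses, and the pose IDs and scoring-term names are hypothetical.

```python
def borda_aggregate(rankings):
    """Combine several rankings of the same items with a Borda count.

    rankings: list of lists, each ordering the same pose IDs from best to worst.
    Returns pose IDs sorted by total Borda points (highest first); ties are
    broken by pose ID so the result is deterministic.
    """
    n = len(rankings[0])
    points = {}
    for ranking in rankings:
        for position, pose in enumerate(ranking):
            # The best pose in a ranking earns n - 1 points, the worst earns 0.
            points[pose] = points.get(pose, 0) + (n - 1 - position)
    return sorted(points, key=lambda pose: (-points[pose], pose))


# Hypothetical rankings of four docked poses by three scoring terms.
by_electrostatics = ["p3", "p1", "p2", "p4"]
by_desolvation = ["p1", "p3", "p4", "p2"]
by_vdw = ["p1", "p2", "p3", "p4"]

consensus = borda_aggregate([by_electrostatics, by_desolvation, by_vdw])
```

    Working on ranks rather than raw scores means the individual terms do not need to share a scale, which is one reason rank aggregation is attractive for combining heterogeneous scoring functions.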

  14. Information pertinent to the migration of radionuclides in ground water at the Nevada Test Site. Part 1. Review and analysis of existing information

    International Nuclear Information System (INIS)

    Borg, I.Y.; Stone, R.; Levy, H.B.; Ramspott, L.D.

    1976-01-01

    A history of NTS is given, the geologic and hydrologic setting is described, and the amount of radioactivity deposited within and near the main aquifers is estimated. The conclusions include: information currently available is insufficient to state categorically that radioactivity will never be carried off the Nevada Test Site by ground water movement; nonetheless, such a migration at levels above the maximum permissible concentration to existing wells and springs is considered unlikely; if offsite migration occurs, it will probably be from the southwestern margins of Pahute Mesa, where there is only a small chance of contaminating existing public water supplies; tritium is the most mobile radionuclide and may be the only long-lived isotope of concern. Highest priority is assigned to measurement of tritium and other radionuclides in large water samples taken from nuclear chimneys that water has re-entered after an explosion; expansion of the existing groundwater monitoring program at NTS to include wells with a higher probability of intersecting flow of contaminated water; measurement of groundwater flow velocities and other associated hydrologic parameters. High priority is assigned to production of an inventory of radionuclides deposited near NTS borders, especially beneath Pahute Mesa; determination of amounts of radioactivity deposited directly into the Lower Carbonate Aquifer; a sensitivity analysis of the many parameters that enter into transport calculations; a study of the many unplugged holes that penetrate the Tuff Aquitard; testing of the assumption that radionuclides deposited in the unsaturated zone are isolated from the saturated zone because of limited precipitation and downward movement of moisture; and determination of distribution coefficients for NTS alluvium, carbonate, and rhyolitic rocks, which are lacking or poorly represented in the literature. Twelve other recommendations of lesser priority are also given

  15. Bioinformatics and moonlighting proteins

    Directory of Open Access Journals (Sweden)

    Sergio Hernández

    2015-06-01

    Multitasking or moonlighting is the capability of some proteins to execute two or more biochemical functions. Usually, moonlighting proteins are experimentally revealed by serendipity. For this reason, it would be helpful if bioinformatics could predict this multifunctionality, especially because of the large amounts of sequences from genome projects. In the present work, we analyse and describe several approaches that use sequences, structures, interactomics and current bioinformatics algorithms and programs to try to overcome this problem. Among these approaches are: (a) remote homology searches using Psi-Blast, (b) detection of functional motifs and domains, (c) analysis of data from protein-protein interaction databases (PPIs), (d) matching the query protein sequence to 3D databases (i.e., algorithms such as PISITE), and (e) mutation correlation analysis between amino acids by algorithms such as MISTIC. Programs designed to identify functional motifs/domains detect mainly the canonical function but usually fail in the detection of the moonlighting one, Pfam and ProDom being the best methods. Remote homology search by Psi-Blast combined with data from interactomics databases (PPIs) has the best performance. Structural information and mutation correlation analysis can help us to map the functional sites. Mutation correlation analysis can only be used in very specific situations (it requires the existence of multialigned family protein sequences) but can suggest how the evolutionary process of second function acquisition took place. The multitasking protein database MultitaskProtDB (http://wallace.uab.es/multitask/), previously published by our group, has been used as a benchmark for all of the analyses.

  16. Seismic assessment of existing nuclear chemical plants

    International Nuclear Information System (INIS)

    Merriman, P.A.

    1997-01-01

    This paper outlines the generic approach to the seismic assessment of existing structures. It describes the role of the safety case in determining the studies carried out by the functional departments on individual projects. There is an emphasis on the role of existing information and material tests to provide realistic properties for analysis to account for possible degradation effects. Finally, a case study of a concrete containment cell is shown to illustrate the approach. (author)

  17. Coevolution study of mitochondria respiratory chain proteins: toward the understanding of protein-protein interaction.

    Science.gov (United States)

    Yang, Ming; Ge, Yan; Wu, Jiayan; Xiao, Jingfa; Yu, Jun

    2011-05-20

    Coevolution can be seen as the interdependency between evolutionary histories. In the context of protein evolution, functionally correlated proteins show coordinated evolutionary characters without disruption of organismal integrity. In complex systems, there are two forms of protein-protein interaction in vivo: inter-complex interaction and intra-complex interaction. In this paper, we studied the difference in coevolution characters between inter-complex interaction and intra-complex interaction using the "Mirror tree" method on the respiratory chain (RC) proteins. We divided the correlation coefficients of every pair of RC proteins into two groups, corresponding to binary intra-complex protein-protein interactions and binary inter-complex protein-protein interactions, respectively. A dramatic discrepancy is detected between the coevolution characters of the two sets of protein interactions (Wilcoxon test, p-value = 4.4 × 10^-6). Our finding reveals some critical information for coevolutionary study and assists the mechanistic investigation of protein-protein interaction. Furthermore, the results also provide unique clues for the supramolecular organization of protein complexes in the mitochondrial inner membrane. A more detailed binding-site map and genome information for nuclear-encoded RC proteins will be extraordinarily valuable for further study of mitochondrial dynamics. Copyright © 2011. Published by Elsevier Ltd.
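
    The correlation coefficient at the heart of the mirror tree approach can be sketched as the Pearson correlation between the off-diagonal entries of two evolutionary distance matrices built over the same species. The matrices below are invented toy values, and the sketch leaves out how the distances are derived from alignments or trees.

```python
from math import sqrt


def upper_triangle(matrix):
    """Flatten the strict upper triangle of a square distance matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def mirror_tree_score(dist_a, dist_b):
    """Correlate two distance matrices whose rows follow the same species order."""
    return pearson(upper_triangle(dist_a), upper_triangle(dist_b))


# Toy distance matrices for two protein families across four species.
dist_a = [
    [0.0, 0.1, 0.4, 0.5],
    [0.1, 0.0, 0.3, 0.6],
    [0.4, 0.3, 0.0, 0.2],
    [0.5, 0.6, 0.2, 0.0],
]
# Family B evolves twice as fast but with the same pattern as family A.
dist_b = [[2 * d for d in row] for row in dist_a]
# Family C has an unrelated pattern of distances.
dist_c = [
    [0.0, 0.5, 0.1, 0.3],
    [0.5, 0.0, 0.4, 0.2],
    [0.1, 0.4, 0.0, 0.6],
    [0.3, 0.2, 0.6, 0.0],
]

score_ab = mirror_tree_score(dist_a, dist_b)  # close to 1: mirrored histories
score_ac = mirror_tree_score(dist_a, dist_c)  # much lower: no shared pattern
```

    Coefficients computed this way for every protein pair could then be split into intra-complex and inter-complex groups and compared with a rank-based test, as the abstract describes.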

  18. The Princeton Protein Orthology Database (P-POD): a comparative genomics analysis tool for biologists.

    OpenAIRE

    Sven Heinicke; Michael S Livstone; Charles Lu; Rose Oughtred; Fan Kang; Samuel V Angiuoli; Owen White; David Botstein; Kara Dolinski

    2007-01-01

    Many biological databases that provide comparative genomics information and tools are now available on the internet. While certainly quite useful, to our knowledge none of the existing databases combine results from multiple comparative genomics methods with manually curated information from the literature. Here we describe the Princeton Protein Orthology Database (P-POD, http://ortholog.princeton.edu), a user-friendly database system that allows users to find and visualize the phylogenetic r...

  19. Mapping functional prion-prion protein interaction sites using prion protein based peptide-arrays

    NARCIS (Netherlands)

    Rigter, A.; Priem, J.; Timmers-Parohi, D.; Langeveld, J.; Bossers, A.

    2009-01-01

    Protein-protein interactions are at the basis of most if not all biological processes in living cells. Therefore, adapting existing techniques or developing new techniques to study interactions between proteins are of importance in elucidating which amino acid sequences contribute to these

  20. Common data model for natural language processing based on two existing standard information models: CDA+GrAF.

    Science.gov (United States)

    Meystre, Stéphane M; Lee, Sanghoon; Jung, Chai Young; Chevrier, Raphaël D

    2012-08-01

    An increasing need for collaboration and resources sharing in the Natural Language Processing (NLP) research and development community motivates efforts to create and share a common data model and a common terminology for all information annotated and extracted from clinical text. We have combined two existing standards: the HL7 Clinical Document Architecture (CDA), and the ISO Graph Annotation Format (GrAF; in development), to develop such a data model entitled "CDA+GrAF". We experimented with several methods to combine these existing standards, and eventually selected a method wrapping separate CDA and GrAF parts in a common standoff annotation (i.e., separate from the annotated text) XML document. Two use cases, clinical document sections, and the 2010 i2b2/VA NLP Challenge (i.e., problems, tests, and treatments, with their assertions and relations), were used to create examples of such standoff annotation documents, and were successfully validated with the XML schemata provided with both standards. We developed a tool to automatically translate annotation documents from the 2010 i2b2/VA NLP Challenge format to GrAF, and automatically generated 50 annotation documents using this tool, all successfully validated. Finally, we adapted the XSL stylesheet provided with HL7 CDA to allow viewing annotation XML documents in a web browser, and plan to adapt existing tools for translating annotation documents between CDA+GrAF and the UIMA and GATE frameworks. This common data model may ease directly comparing NLP tools and applications, combining their output, transforming and "translating" annotations between different NLP applications, and eventually "plug-and-play" of different modules in NLP applications. Copyright © 2011 Elsevier Inc. All rights reserved.

  1. VaProS: a database-integration approach for protein/genome information retrieval

    KAUST Repository

    Gojobori, Takashi; Ikeo, Kazuho; Katayama, Yukie; Kawabata, Takeshi; Kinjo, Akira R.; Kinoshita, Kengo; Kwon, Yeondae; Migita, Ohsuke; Mizutani, Hisashi; Muraoka, Masafumi; Nagata, Koji; Omori, Satoshi; Sugawara, Hideaki; Yamada, Daichi; Yura, Kei

    2016-01-01

    Life science research now heavily relies on all sorts of databases for genome sequences, transcription, protein three-dimensional (3D) structures, protein–protein interactions, phenotypes and so forth. The knowledge accumulated by all the omics research is so vast that a computer-aided search of data is now a prerequisite for starting a new study. In addition, a combinatory search throughout these databases has a chance to extract new ideas and new hypotheses that can be examined by wet-lab experiments. By virtually integrating the related databases on the Internet, we have built a new web application that facilitates life science researchers for retrieving experts’ knowledge stored in the databases and for building a new hypothesis of the research target. This web application, named VaProS, puts stress on the interconnection between the functional information of genome sequences and protein 3D structures, such as structural effect of the gene mutation. In this manuscript, we present the notion of VaProS, the databases and tools that can be accessed without any knowledge of database locations and data formats, and the power of search exemplified in quest of the molecular mechanisms of lysosomal storage disease. VaProS can be freely accessed at http://p4d-info.nig.ac.jp/vapros/.

  2. VaProS: a database-integration approach for protein/genome information retrieval

    KAUST Repository

    Gojobori, Takashi

    2016-12-24

    Life science research now heavily relies on all sorts of databases for genome sequences, transcription, protein three-dimensional (3D) structures, protein–protein interactions, phenotypes and so forth. The knowledge accumulated by all the omics research is so vast that a computer-aided search of data is now a prerequisite for starting a new study. In addition, a combinatory search throughout these databases has a chance to extract new ideas and new hypotheses that can be examined by wet-lab experiments. By virtually integrating the related databases on the Internet, we have built a new web application that facilitates life science researchers for retrieving experts’ knowledge stored in the databases and for building a new hypothesis of the research target. This web application, named VaProS, puts stress on the interconnection between the functional information of genome sequences and protein 3D structures, such as structural effect of the gene mutation. In this manuscript, we present the notion of VaProS, the databases and tools that can be accessed without any knowledge of database locations and data formats, and the power of search exemplified in quest of the molecular mechanisms of lysosomal storage disease. VaProS can be freely accessed at http://p4d-info.nig.ac.jp/vapros/.

  3. Validation of protein carbonyl measurement

    DEFF Research Database (Denmark)

    Augustyniak, Edyta; Adam, Aisha; Wojdyla, Katarzyna

    2015-01-01

    Protein carbonyls are widely analysed as a measure of protein oxidation. Several different methods exist for their determination. A previous study had described orders of magnitude variance that existed when protein carbonyls were analysed in a single laboratory by ELISA using different commercial...... protein carbonyl analysis across Europe. ELISA and Western blotting techniques detected an increase in protein carbonyl formation between 0 and 5min of UV irradiation irrespective of method used. After irradiation for 15min, less oxidation was detected by half of the laboratories than after 5min...... irradiation. Three of the four ELISA carbonyl results fell within 95% confidence intervals. Likely errors in calculating absolute carbonyl values may be attributed to differences in standardisation. Out of up to 88 proteins identified as containing carbonyl groups after tryptic cleavage of irradiated...

  4. Multi-label learning with fuzzy hypergraph regularization for protein subcellular location prediction.

    Science.gov (United States)

    Chen, Jing; Tang, Yuan Yan; Chen, C L Philip; Fang, Bin; Lin, Yuewei; Shang, Zhaowei

    2014-12-01

    Protein subcellular location prediction aims to predict the location where a protein resides within a cell using computational methods. Considering the main limitations of the existing methods, we propose a hierarchical multi-label learning model FHML for both single-location proteins and multi-location proteins. The latent concepts are extracted through feature space decomposition and label space decomposition under the nonnegative data factorization framework. The extracted latent concepts are used as the codebook to indirectly connect the protein features to their annotations. We construct dual fuzzy hypergraphs to capture the intrinsic high-order relations embedded in not only feature space, but also label space. Finally, the subcellular location annotation information is propagated from the labeled proteins to the unlabeled proteins by performing dual fuzzy hypergraph Laplacian regularization. The experimental results on the six protein benchmark datasets demonstrate the superiority of our proposed method by comparing it with the state-of-the-art methods, and illustrate the benefit of exploiting both feature correlations and label correlations.

  5. Collection and analysis of existing information on applicability of investigation methods for estimation of beginning age of faulting in present faulting pattern

    International Nuclear Information System (INIS)

    Doke, Ryosuke; Yasue, Ken-ichi; Tanikawa, Shin-ichi; Nakayasu, Akio; Niizato, Tadafumi; Tanaka, Takenobu; Aoki, Michinori; Sekiya, Ayako

    2011-12-01

    In R&D programs for the geological disposal of high-level radioactive waste, it is of great importance to develop a set of investigation and analysis techniques for assessing long-term geosphere stability over geological time, meaning that changes in the geological environment will not significantly affect the long-term safety of a geological disposal system. In the Japanese archipelago, crustal movements are so active that uplift and subsidence have been remarkable over the last several hundred thousand years. Therefore, it is necessary to assess long-term geosphere stability taking into account topographic change caused by crustal movements. One factor in topographic change is the movement of active faults, a geological process that releases strain accumulated by plate motion. The beginning age of faulting in the present faulting pattern indicates the beginning age of neotectonic activity around the active fault, and also provides basic information for identifying the stage of geomorphic development of mountains. Therefore, the age of faulting in the present faulting pattern is important information for estimating future topographic change in the mountain regions of Japan. In this study, existing information on methods for estimating the beginning age of faulting in the present faulting pattern of active faults was collected and reviewed. The principle of each method, points to note and technical know-how in applying the methods, data uncertainty, and so on were extracted from the existing information. Based on this extracted information, task flows showing the working process for estimating the beginning age of faulting of an active fault were drawn up for each method. Additionally, a distribution map of the beginning age, with accuracy, of faulting in the present faulting pattern of active faults was prepared. (author)

  6. Protein kinesis: The dynamics of protein trafficking and stability

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    1995-12-31

    The purpose of this conference is to provide a multidisciplinary forum for exchange of state-of-the-art information on protein kinesis. This volume contains abstracts of papers in the following areas: protein folding and modification in the endoplasmic reticulum; protein trafficking; protein translocation and folding; protein degradation; polarity; nuclear trafficking; membrane dynamics; and protein import into organelles.

  7. 77 FR 25206 - Proposed Extension of Existing Information Collection; Underground Retorts

    Science.gov (United States)

    2012-04-27

    ... information in accordance with the Paperwork Reduction Act of 1995. This program helps to ensure that requested data can be provided in the desired format, reporting burden (time and financial resources) is... Information Collection; Underground Retorts AGENCY: Mine Safety and Health Administration, Labor. ACTION...

  8. Protein fold recognition using geometric kernel data fusion.

    Science.gov (United States)

    Zakeri, Pooya; Jeuris, Ben; Vandebril, Raf; Moreau, Yves

    2014-07-01

    Various approaches based on features extracted from protein sequences and often machine learning methods have been used in the prediction of protein folds. Finding an efficient technique for integrating these different protein features has received increasing attention. In particular, kernel methods are an interesting class of techniques for integrating heterogeneous data. Various methods have been proposed to fuse multiple kernels. Most techniques for multiple kernel learning focus on learning a convex linear combination of base kernels. In addition to the limitation of linear combinations, working with such approaches could cause a loss of potentially useful information. We design several techniques to combine kernel matrices by taking more involved, geometry-inspired means of these matrices instead of convex linear combinations. We consider various sequence-based protein features including information extracted directly from position-specific scoring matrices and local sequence alignment. We evaluate our methods for classification on the SCOP PDB-40D benchmark dataset for protein fold recognition. The best overall accuracy on the protein fold recognition test set obtained by our methods is approximately 86.7%. This is an improvement over the results of the best existing approach. Moreover, our computational model has been developed by incorporating the functional domain composition of proteins through a hybridization model. It is observed that by using our proposed hybridization model, the protein fold recognition accuracy is further improved to 89.30%. Furthermore, we investigate the performance of our approach on the protein remote homology detection problem by fusing multiple string kernels. The MATLAB code used for our proposed geometric kernel fusion frameworks is publicly available at http://people.cs.kuleuven.be/~raf.vandebril/homepage/software/geomean.php?menu=5/. © The Author 2014. Published by Oxford University Press.

  9. Protein subcellular localization assays using split fluorescent proteins

    Science.gov (United States)

    Waldo, Geoffrey S [Santa Fe, NM]; Cabantous, Stephanie [Los Alamos, NM]

    2009-09-08

    The invention provides protein subcellular localization assays using split fluorescent protein systems. The assays are conducted in living cells, do not require fixation and washing steps inherent in existing immunostaining and related techniques, and permit rapid, non-invasive, direct visualization of protein localization in living cells. The split fluorescent protein systems used in the practice of the invention generally comprise two or more self-complementing fragments of a fluorescent protein, such as GFP, wherein one or more of the fragments correspond to one or more beta-strand microdomains and are used to "tag" proteins of interest, and a complementary "assay" fragment of the fluorescent protein. Either or both of the fragments may be functionalized with a subcellular targeting sequence enabling it to be expressed in or directed to a particular subcellular compartment (e.g., the nucleus).

  10. 75 FR 18833 - Agency Information Collection Activities: Existing Collection; Emergency Extension

    Science.gov (United States)

    2010-04-13

    ... be effective after the current April 30, 2010 expiration date. FOR FURTHER INFORMATION CONTACT... collected information from State and local governments with 100 or more full-time employees since 1974... minorities and women. The data are shared with several other Federal agencies. Pursuant to section 709(d) of...

  11. Growing functional modules from a seed protein via integration of protein interaction and gene expression data

    Directory of Open Access Journals (Sweden)

    Dimitrakopoulou Konstantina

    2007-10-01

    Full Text Available Abstract Background Nowadays modern biology aims at unravelling the strands of complex biological structures such as the protein-protein interaction (PPI) networks. A key concept in the organization of PPI networks is the existence of dense subnetworks (functional modules) in them. In recent approaches clustering algorithms were applied to these networks and the resulting subnetworks were evaluated by estimating the coverage of well-established protein complexes they contained. However, most of these algorithms elaborate on an unweighted graph structure which in turn fails to elevate those interactions that would contribute to the construction of biologically more valid and coherent functional modules. Results In the current study, we present a method that corroborates the integration of protein interaction and microarray data via the discovery of biologically valid functional modules. Initially the gene expression information is overlaid as weights onto the PPI network and the enriched PPI graph allows us to exploit its topological aspects, while simultaneously highlighting enhanced functional association in specific pairs of proteins. Then we present an algorithm that unveils the functional modules of the weighted graph by expanding a kernel protein set, which originates from a given 'seed' protein used as starting-point. Conclusion The integrated data and the concept of our approach provide reliable functional modules. We give proofs based on yeast data that our method manages to give accurate results in terms of both structural coherency and functional consistency.

  12. 77 FR 6134 - Agency Information Collection Activities: Form I-290B, Extension of an Existing Information...

    Science.gov (United States)

    2012-02-07

    ...: 60-Day Notice of Information Collection Under Review: Form I- 290B, Notice of Appeal or Motion...., permitting electronic submission of responses. Overview of this information collection: (1) Type of...: Notice of Appeal or Motion. (3) Agency form number, if any, and the applicable component of the...

  13. Sequence-specific capture of protein-DNA complexes for mass spectrometric protein identification.

    Directory of Open Access Journals (Sweden)

    Cheng-Hsien Wu

    Full Text Available The regulation of gene transcription is fundamental to the existence of complex multicellular organisms such as humans. Although it is widely recognized that much of gene regulation is controlled by gene-specific protein-DNA interactions, there presently exists little in the way of tools to identify proteins that interact with the genome at locations of interest. We have developed a novel strategy to address this problem, which we refer to as GENECAPP, for Global ExoNuclease-based Enrichment of Chromatin-Associated Proteins for Proteomics. In this approach, formaldehyde cross-linking is employed to covalently link DNA to its associated proteins; subsequent fragmentation of the DNA, followed by exonuclease digestion, produces a single-stranded region of the DNA that enables sequence-specific hybridization capture of the protein-DNA complex on a solid support. Mass spectrometric (MS) analysis of the captured proteins is then used for their identification and/or quantification. We show here the development and optimization of GENECAPP for an in vitro model system, comprised of the murine insulin-like growth factor-binding protein 1 (IGFBP1) promoter region and FoxO1, a member of the forkhead rhabdomyosarcoma (FoxO) subfamily of transcription factors, which binds specifically to the IGFBP1 promoter. This novel strategy provides a powerful tool for studies of protein-DNA and protein-protein interactions.

  14. 76 FR 14073 - Agency Information Collection Activities: Existing Collection; Comments Requested

    Science.gov (United States)

    2011-03-15

    ... forms to replace the NPS-1A which will collect data on special topics, such as mental health, medical... information collection is published to obtain comments from the public and affected agencies. Comments are... or additional information, please contact Paul Guerino by e-mail at paul[email protected] or at (202...

  15. Cysteine and tryptophan anomalies found when scanning all the binding sites in the Protein Data Bank.

    Science.gov (United States)

    Iván, Gábor; Szabadka, Zoltán; Grolmusz, Vince

    2010-01-01

    The Protein Data Bank (PDB) is one of the richest sources of structural biological information in the World. It started to exist as the computer-readable depository of crystallographic data complementing printed papers. The proper interpretation of the content of the individual files in the PDB still needs the detailed information found in the citing publication. An advanced graph theoretical method is presented here for automatically repairing, re-organising and re-structuring PDB data yielding the identification of all the protein-ligand complexes and all the binding sites in the PDB. As an application, we identified strong cysteine and tryptophan irregularities in the data.

  16. An overview of the prediction of protein DNA-binding sites.

    Science.gov (United States)

    Si, Jingna; Zhao, Rui; Wu, Rongling

    2015-03-06

    Interactions between proteins and DNA play an important role in many essential biological processes such as DNA replication, transcription, splicing, and repair. The identification of amino acid residues involved in DNA-binding sites is critical for understanding the mechanism of these biological activities. In the last decade, numerous computational approaches have been developed to predict protein DNA-binding sites based on protein sequence and/or structural information, which play an important role in complementing experimental strategies. At this time, approaches can be divided into three categories: sequence-based DNA-binding site prediction, structure-based DNA-binding site prediction, and homology modeling and threading. In this article, we review existing research on computational methods to predict protein DNA-binding sites, which includes data sets, various residue sequence/structural features, machine learning methods for comparison and selection, evaluation methods, performance comparison of different tools, and future directions in protein DNA-binding site prediction. In particular, we detail the meta-analysis of protein DNA-binding sites. We also propose specific implications that are likely to result in novel prediction methods, increased performance, or practical applications.

  17. Identification and characterization of novel ERC-55 interacting proteins: evidence for the existence of several ERC-55 splicing variants; including the cytosolic ERC-55-C.

    Science.gov (United States)

    Ludvigsen, Maja; Jacobsen, Christian; Maunsbach, Arvid B; Honoré, Bent

    2009-12-01

    ERC-55, encoded from RCN2, is localized in the ER and belongs to the CREC protein family. ERC-55 is involved in various diseases and abnormal cell behavior, however, the function is not well defined and it has controversially been reported to interact with a cytosolic protein, the vitamin D receptor. We have used a number of proteomic techniques to further our functional understanding of ERC-55. By affinity purification, we observed interaction with a large variety of proteins, including those secreted and localized outside of the secretory pathway, in the cytosol and also in various organelles. We confirm the existence of several ERC-55 splicing variants including ERC-55-C localized in the cytosol in association with the cytoskeleton. Localization was verified by immunoelectron microscopy and sub-cellular fractionation. Interaction of lactoferrin, S100P, calcyclin (S100A6), peroxiredoxin-6, kininogen and lysozyme with ERC-55 was further studied in vitro by SPR experiments. Interaction of S100P requires [Ca(2+)] of approximately 10(-7) M or greater, while calcyclin interaction requires [Ca(2+)] of >10(-5) M. Interaction with peroxiredoxin-6 is independent of Ca(2+). Co-localization of lactoferrin, S100P and calcyclin with ERC-55 in the perinuclear area was analyzed by fluorescence confocal microscopy. The functional variety of the interacting proteins indicates a broad spectrum of ERC-55 activities such as immunity, redox homeostasis, cell cycle regulation and coagulation.

  18. Informing the Human Plasma Protein Binding of ...

    Science.gov (United States)

    The free fraction of a xenobiotic in plasma (Fub) is an important determinant of chemical adsorption, distribution, metabolism, elimination, and toxicity, yet experimental plasma protein binding data is scarce for environmentally relevant chemicals. The presented work explores the merit of utilizing available pharmaceutical data to predict Fub for environmentally relevant chemicals via machine learning techniques. Quantitative structure-activity relationship (QSAR) models were constructed with k nearest neighbors (kNN), support vector machines (SVM), and random forest (RF) machine learning algorithms from a training set of 1045 pharmaceuticals. The models were then evaluated with independent test sets of pharmaceuticals (200 compounds) and environmentally relevant ToxCast chemicals (406 total, in two groups of 238 and 168 compounds). The selection of a minimal feature set of 10-15 2D molecular descriptors allowed for both informative feature interpretation and practical applicability domain assessment via a bounded box of descriptor ranges and principal component analysis. The diverse pharmaceutical and environmental chemical sets exhibit similarities in terms of chemical space (99-82% overlap), as well as comparable bias and variance in constructed learning curves. All the models exhibit significant predictability with mean absolute errors (MAE) in the range of 0.10-0.18 Fub. The models performed best for highly bound chemicals (MAE 0.07-0.12), neutrals (MAE 0

  19. 77 FR 16865 - Proposed Extension of Existing Information Collection; Occupational Noise Exposure

    Science.gov (United States)

    2012-03-22

    ... harmful physical agent and one of the most pervasive health hazards in mining. Repeated exposure to high... employees working nearby. The Mine Safety and Health Administration (MSHA), the Occupational Safety and... DEPARTMENT OF LABOR Mine Safety and Health Administration Proposed Extension of Existing...

  20. 75 FR 37815 - Agency Information Collection Activities: Two-Year Extension of an Existing Information...

    Science.gov (United States)

    2010-06-30

    ... health professions and nursing education and training programs. The reporting system measures the grantee... developed for the Bureau of Health Professions' Title VII and VIII health professions and nursing education... and Results Act (GPRA). The Bureau will be making minor changes to the previously approved information...

  1. De novo protein structure prediction by dynamic fragment assembly and conformational space annealing.

    Science.gov (United States)

    Lee, Juyong; Lee, Jinhyuk; Sasaki, Takeshi N; Sasai, Masaki; Seok, Chaok; Lee, Jooyoung

    2011-08-01

    Ab initio protein structure prediction is a challenging problem that requires both an accurate energetic representation of a protein structure and an efficient conformational sampling method for successful protein modeling. In this article, we present an ab initio structure prediction method which combines a recently suggested fragment assembly approach, dynamic fragment assembly (DFA), with the conformational space annealing (CSA) algorithm. In DFA, model structures are scored by continuous functions constructed based on short- and long-range structural restraint information from a fragment library. Here, DFA is represented by the CHARMM full-atom model with the addition of the empirical potential of DFIRE. The relative contributions of the various energy terms are optimized using linear programming. The conformational sampling was carried out with the CSA algorithm, which can find low energy conformations more efficiently than the simulated annealing used in the existing DFA study. The newly introduced DFA energy function and CSA sampling algorithm are implemented in CHARMM. Test results on 30 small single-domain proteins and 13 template-free modeling targets of the 8th Critical Assessment of protein Structure Prediction show that the current method provides comparable and complementary prediction results to existing top methods. Copyright © 2011 Wiley-Liss, Inc.

  2. Detection of protein complex from protein-protein interaction network using Markov clustering

    International Nuclear Information System (INIS)

    Ochieng, P J; Kusuma, W A; Haryanto, T

    2017-01-01

    Detection of complexes, or groups of functionally related proteins, is an important challenge when analysing biological networks. However, existing algorithms to identify protein complexes are insufficient when applied to dense networks of experimentally derived interaction data. Therefore, we introduced a graph clustering method based on the Markov clustering (MCL) algorithm to identify protein complexes within highly interconnected protein-protein interaction networks. A protein-protein interaction network was first constructed to develop a geometrical network; the network was then partitioned using Markov clustering to detect protein complexes. The interest of the proposed method was illustrated by its application to human proteins associated with type II diabetes mellitus. Flow simulation of the MCL algorithm was initially performed and topological properties of the resultant network were analysed for detection of the protein complexes. The results indicated that the proposed method successfully detected a total of 34 complexes, with 11 complexes consisting of overlapping modules and 20 non-overlapping modules. The major complex consisted of 102 proteins and 521 interactions with cluster modularity and density of 0.745 and 0.101 respectively. The comparison analysis revealed that MCL outperformed the AP, MCODE and SCPS algorithms, with a high clustering coefficient (0.751), network density and modularity index (0.630). This demonstrated that MCL was the most reliable and efficient graph clustering algorithm for detection of protein complexes from PPI networks. (paper)
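The flow simulation behind Markov clustering can be sketched in a few lines. The following is a minimal, illustrative implementation of the generic MCL loop (alternating expansion and inflation on a column-stochastic matrix), not the pipeline used in the record above. The function name `markov_cluster`, its default parameters, and the toy graph are assumptions made for illustration.

```python
import numpy as np

def markov_cluster(adjacency, expansion=2, inflation=2.0, max_iter=100, tol=1e-6):
    """Cluster an undirected graph given as a NumPy adjacency matrix."""
    # Add self-loops and normalize columns so each column is a probability vector.
    m = adjacency.astype(float) + np.eye(len(adjacency))
    m /= m.sum(axis=0, keepdims=True)
    for _ in range(max_iter):
        prev = m
        m = np.linalg.matrix_power(m, expansion)   # expansion: spread the flow
        m = m ** inflation                         # inflation: favor strong flow
        m /= m.sum(axis=0, keepdims=True)          # re-normalize columns
        if np.abs(m - prev).max() < tol:
            break
    # Each non-empty row is an attractor; its non-zero columns form a cluster.
    clusters = set()
    for row in m:
        members = frozenset(np.flatnonzero(row > 1e-6).tolist())
        if members:
            clusters.add(members)
    return sorted(sorted(c) for c in clusters)

# Toy interaction graph: two triangles with no edges between them.
graph = np.array([
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
])
print(markov_cluster(graph))  # → [[0, 1, 2], [3, 4, 5]]
```

Raising the inflation parameter generally yields finer clusters. On real PPI data a sparse-matrix implementation with pruning would be the more practical choice.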

  3. Imbalanced multi-modal multi-label learning for subcellular localization prediction of human proteins with both single and multiple sites.

    Directory of Open Access Journals (Sweden)

    Jianjun He

    Full Text Available It is well known that an important step toward understanding the functions of a protein is to determine its subcellular location. Although numerous prediction algorithms have been developed, most of them typically focused on the proteins with only one location. In recent years, researchers have begun to pay attention to the subcellular localization prediction of the proteins with multiple sites. However, almost all the existing approaches have failed to take into account the correlations among the locations caused by the proteins with multiple sites, which may be important information for improving the prediction accuracy of the proteins with multiple sites. In this paper, a new algorithm which can effectively exploit the correlations among the locations is proposed by using a Gaussian process model. Besides, the algorithm can also realize an optimal linear combination of various feature extraction technologies and can be robust to imbalanced data sets. Experimental results on a human protein data set show that the proposed algorithm is valid and can achieve better performance than the existing approaches.

  4. MUFOLD-SS: New deep inception-inside-inception networks for protein secondary structure prediction.

    Science.gov (United States)

    Fang, Chao; Shang, Yi; Xu, Dong

    2018-05-01

    Protein secondary structure prediction can provide important information for protein 3D structure prediction and protein functions. Deep learning offers a new opportunity to significantly improve prediction accuracy. In this article, a new deep neural network architecture, named the Deep inception-inside-inception (Deep3I) network, is proposed for protein secondary structure prediction and implemented as a software tool MUFOLD-SS. The input to MUFOLD-SS is a carefully designed feature matrix corresponding to the primary amino acid sequence of a protein, which consists of a rich set of information derived from individual amino acids, as well as the context of the protein sequence. Specifically, the feature matrix is a composition of physicochemical properties of amino acids, PSI-BLAST profile, and HHBlits profile. MUFOLD-SS is composed of a sequence of nested inception modules and maps the input matrix to either eight states or three states of secondary structures. The architecture of MUFOLD-SS enables effective processing of local and global interactions between amino acids in making accurate predictions. In extensive experiments on multiple datasets, MUFOLD-SS outperformed the best existing methods and other deep neural networks significantly. MUFOLD-SS can be downloaded from http://dslsrv8.cs.missouri.edu/~cf797/MUFoldSS/download.html. © 2018 Wiley Periodicals, Inc.

  5. Prediction of hot spots in protein interfaces using a random forest model with hybrid features.

    Science.gov (United States)

    Wang, Lin; Liu, Zhi-Ping; Zhang, Xiang-Sun; Chen, Luonan

    2012-03-01

    Prediction of hot spots in protein interfaces provides crucial information for the research on protein-protein interaction and drug design. Existing machine learning methods generally judge whether a given residue is likely to be a hot spot by extracting features only from the target residue. However, hot spots usually form a small cluster of residues which are tightly packed together at the center of protein interface. With this in mind, we present a novel method to extract hybrid features which incorporate a wide range of information of the target residue and its spatially neighboring residues, i.e. the nearest contact residue in the other face (mirror-contact residue) and the nearest contact residue in the same face (intra-contact residue). We provide a novel random forest (RF) model to effectively integrate these hybrid features for predicting hot spots in protein interfaces. Our method can achieve accuracy (ACC) of 82.4% and Matthews correlation coefficient (MCC) of 0.482 in Alanine Scanning Energetics Database, and ACC of 77.6% and MCC of 0.429 in Binding Interface Database. In a comparison study, performance of our RF model exceeds other existing methods, such as Robetta, FOLDEF, KFC, KFC2, MINERVA and HotPoint. Of our hybrid features, three physicochemical features of target residues (mass, polarizability and isoelectric point), the relative side-chain accessible surface area and the average depth index of mirror-contact residues are found to be the main discriminative features in hot spots prediction. We also confirm that hot spots tend to form large contact surface areas between two interacting proteins. Source data and code are available at: http://www.aporc.org/doc/wiki/HotSpot.
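For readers unfamiliar with the two reported metrics, the sketch below shows how ACC and MCC are computed from binary confusion-matrix counts using their standard definitions. The function names and the example counts are hypothetical and are not taken from the study.

```python
import math

def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)

def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient in [-1, 1]; 0.0 if undefined."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0  # common convention when a row or column of the table is empty
    return (tp * tn - fp * fn) / denom

# Hypothetical counts for a hot-spot classifier (illustrative only).
tp, tn, fp, fn = 30, 50, 10, 10
print(accuracy(tp, tn, fp, fn))
print(matthews_corrcoef(tp, tn, fp, fn))
```

MCC is often preferred over ACC when classes are imbalanced, as is typical for hot-spot residues, because it uses all four cells of the confusion matrix.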

  6. Molecular tweezers modulate 14-3-3 protein-protein interactions

    Science.gov (United States)

    Bier, David; Rose, Rolf; Bravo-Rodriguez, Kenny; Bartel, Maria; Ramirez-Anguita, Juan Manuel; Dutt, Som; Wilch, Constanze; Klärner, Frank-Gerrit; Sanchez-Garcia, Elsa; Schrader, Thomas; Ottmann, Christian

    2013-03-01

    Supramolecular chemistry has recently emerged as a promising way to modulate protein functions, but devising molecules that will interact with a protein in the desired manner is difficult as many competing interactions exist in a biological environment (with solvents, salts or different sites for the target biomolecule). We now show that lysine-specific molecular tweezers bind to a 14-3-3 adapter protein and modulate its interaction with partner proteins. The tweezers inhibit binding between the 14-3-3 protein and two partner proteins—a phosphorylated (C-Raf) protein and an unphosphorylated one (ExoS)—in a concentration-dependent manner. Protein crystallography shows that this effect arises from the binding of the tweezers to a single surface-exposed lysine (Lys214) of the 14-3-3 protein in the proximity of its central channel, which normally binds the partner proteins. A combination of structural analysis and computer simulations provides rules for the tweezers' binding preferences, thus allowing us to predict their influence on this type of protein-protein interactions.

  7. ProCKSI: a decision support system for Protein (Structure) Comparison, Knowledge, Similarity and Information

    Directory of Open Access Journals (Sweden)

    Błażewicz Jacek

    2007-10-01

    Full Text Available Abstract Background We introduce the decision support system for Protein (Structure) Comparison, Knowledge, Similarity and Information (ProCKSI). ProCKSI integrates various protein similarity measures through an easy to use interface that allows the comparison of multiple proteins simultaneously. It employs the Universal Similarity Metric (USM), the Maximum Contact Map Overlap (MaxCMO) of protein structures and other external methods such as the DaliLite and the TM-align methods, the Combinatorial Extension (CE) of the optimal path, and the FAST Align and Search Tool (FAST). Additionally, ProCKSI allows the user to upload a user-defined similarity matrix supplementing the methods mentioned, and computes a similarity consensus in order to provide a rich, integrated, multicriteria view of large datasets of protein structures. Results We present ProCKSI's architecture and workflow describing its intuitive user interface, and show its potential on three distinct test-cases. In the first case, ProCKSI is used to evaluate the results of a previous CASP competition, assessing the similarity of proposed models for given targets where the structures could have a large deviation from one another. To perform this type of comparison reliably, we introduce a new consensus method. The second study deals with the verification of a classification scheme for protein kinases, originally derived by sequence comparison by Hanks and Hunter, but here we use a consensus similarity measure based on structures. In the third experiment using the Rost and Sander dataset (RS126), we investigate how a combination of different sets of similarity measures influences the quality and performance of ProCKSI's new consensus measure. ProCKSI performs well with all three datasets, showing its potential for complex, simultaneous multi-method assessment of structural similarity in large protein datasets. Furthermore, combining different similarity measures is usually more robust than

  8. Ontological Proofs of Existence and Non-Existence

    Czech Academy of Sciences Publication Activity Database

    Hájek, Petr

    2008-01-01

    Vol. 90, No. 2 (2008), pp. 257-262 ISSN 0039-3215 R&D Projects: GA AV ČR IAA100300503 Institutional research plan: CEZ:AV0Z10300504 Keywords: ontological proofs * existence * non-existence * Gödel * Caramuel Subject RIV: BA - General Mathematics

  9. Directed Evolution of Proteins through In Vitro Protein Synthesis in Liposomes

    Directory of Open Access Journals (Sweden)

    Takehiro Nishikawa

    2012-01-01

    Full Text Available Directed evolution of proteins is a technique used to modify protein functions through “Darwinian selection.” In vitro compartmentalization (IVC) is an in vitro gene screening system for directed evolution of proteins. IVC establishes the link between genetic information (genotype) and the protein translated from the information (phenotype), which is essential for all directed evolution methods, by encapsulating both in a nonliving microcompartment. Herein, we introduce a new liposome-based IVC system consisting of a liposome, the protein synthesis using recombinant elements (PURE) system and a fluorescence-activated cell sorter (FACS) used as a microcompartment, in vitro protein synthesis system, and high-throughput screen, respectively. Liposome-based IVC is characterized by in vitro protein synthesis from a single copy of a gene in a cell-sized unilamellar liposome and quantitative functional evaluation of the synthesized proteins. Examples of liposome-based IVC for screening proteins such as GFP and β-glucuronidase are described. We discuss the future directions for this method and its applications.

  10. Structural Transition and Antibody Binding of EBOV GP and ZIKV E Proteins from Pre-Fusion to Fusion-Initiation State

    Directory of Open Access Journals (Sweden)

    Anna Lappala

    2018-05-01

    Full Text Available Membrane fusion proteins are responsible for viral entry into host cells, a crucial first step in viral infection. These proteins undergo large conformational changes from pre-fusion to fusion-initiation structures, and, despite differences in viral genomes and disease etiology, many fusion proteins are arranged as trimers. Structural information for both pre-fusion and fusion-initiation states is critical for understanding virus neutralization by the host immune system. In the case of Ebola virus glycoprotein (EBOV GP) and Zika virus envelope protein (ZIKV E), pre-fusion state structures have been identified experimentally, but only partial structures of fusion-initiation states have been described. While the fusion-initiation structure is in an energetically unfavorable state that is difficult to solve experimentally, the existing structural information combined with computational approaches enabled the modeling of fusion-initiation state structures of both proteins. These structural models provide an improved understanding of four different neutralizing antibodies in the prevention of viral host entry.

  11. Gene regulatory network inference by point-based Gaussian approximation filters incorporating the prior information.

    Science.gov (United States)

    Jia, Bin; Wang, Xiaodong

    2013-12-17

    The extended Kalman filter (EKF) has been applied to inferring gene regulatory networks. However, it is well known that the EKF becomes less accurate when the system exhibits high nonlinearity. In addition, certain prior information about the gene regulatory network exists in practice, and no systematic approach has been developed to incorporate such prior information into the Kalman-type filter for inferring the structure of the gene regulatory network. In this paper, an inference framework based on point-based Gaussian approximation filters that can exploit the prior information is developed to solve the gene regulatory network inference problem. Different point-based Gaussian approximation filters, including the unscented Kalman filter (UKF), the third-degree cubature Kalman filter (CKF3), and the fifth-degree cubature Kalman filter (CKF5) are employed. Several types of network prior information, including the existing network structure information, sparsity assumption, and the range constraint of parameters, are considered, and the corresponding filters incorporating the prior information are developed. Experiments on a synthetic network of eight genes and the yeast protein synthesis network of five genes are carried out to demonstrate the performance of the proposed framework. The results show that the proposed methods provide more accurate inference results than existing methods, such as the EKF and the traditional UKF.
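The core of a point-based Gaussian approximation filter is the rule that turns a mean and covariance into a small set of weighted points. As an illustration, the sketch below generates the 2n equally weighted points of the standard third-degree spherical-radial cubature rule and pushes them through a function to approximate the transformed mean and covariance. It covers only this propagation step (no process noise, measurement update, or prior-information constraints) and is not the authors' implementation. The names `cubature_points` and `propagate` and the toy dynamics are assumptions.

```python
import numpy as np

def cubature_points(mean, cov):
    """Third-degree cubature rule: 2n points at mean +/- sqrt(n) * columns of chol(cov)."""
    n = mean.shape[0]
    sqrt_cov = np.linalg.cholesky(cov)                        # cov = L @ L.T
    offsets = np.sqrt(n) * np.hstack([sqrt_cov, -sqrt_cov])   # shape (n, 2n)
    points = mean[:, None] + offsets
    weights = np.full(2 * n, 1.0 / (2 * n))
    return points, weights

def propagate(f, mean, cov):
    """Approximate the mean and covariance of f(x) for x ~ N(mean, cov)."""
    points, weights = cubature_points(mean, cov)
    transformed = np.column_stack([f(points[:, i]) for i in range(points.shape[1])])
    new_mean = transformed @ weights
    centered = transformed - new_mean[:, None]
    new_cov = (centered * weights) @ centered.T
    return new_mean, new_cov

# Toy nonlinear "regulation" function acting on a two-gene state (illustrative only).
def toy_dynamics(x):
    return np.tanh(np.array([[0.5, -1.0], [1.5, 0.2]]) @ x)

m, c = propagate(toy_dynamics, np.array([0.1, -0.3]), np.array([[0.2, 0.05], [0.05, 0.1]]))
print(m)
print(c)
```

For a linear function the rule is exact, which gives a convenient sanity check. The benefit over the EKF appears for nonlinear functions, where no Jacobian is required.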

  12. [Non-invasive analysis of proteins in living cells using NMR spectroscopy].

    Science.gov (United States)

    Tochio, Hidehito; Murayama, Shuhei; Inomata, Kohsuke; Morimoto, Daichi; Ohno, Ayako; Shirakawa, Masahiro

    2015-01-01

    NMR spectroscopy enables structural analyses of proteins and has been widely used in the structural biology field in recent decades. NMR spectroscopy can be applied to proteins inside living cells, allowing characterization of their structures and dynamics in intracellular environments. The simplest "in-cell NMR" approach employs bacterial cells; in this approach, live Escherichia coli cells overexpressing a specific protein are subjected to NMR. The cells are grown in an NMR active isotope-enriched medium to ensure that the overexpressed proteins are labeled with the stable isotopes. Thus the obtained NMR spectra, which are derived from labeled proteins, contain atomic-level information about the structure and dynamics of the proteins. Recent progress enables us to work with higher eukaryotic cells such as HeLa and HEK293 cells, for which a number of techniques have been developed to achieve isotope labeling of the specific target protein. In this review, we describe successful use of electroporation for in-cell NMR. In addition, (19)F-NMR to characterize protein-ligand interactions in cells is presented. Because (19)F nuclei rarely exist in natural cells, when (19)F-labeled proteins are delivered into cells and (19)F-NMR signals are observed, one can safely ascertain that these signals originate from the delivered proteins and not other molecules.

  13. Clinical trials information in drug development and regulation : existing systems and standards

    NARCIS (Netherlands)

    Valkenhoef, Gert van; Tervonen, Tommi; Brock, Bert de; Hillege, Hans

    2012-01-01

    Clinical trials provide pivotal evidence on drug efficacy and safety. The evidence, information from clinical trials, is currently used by regulatory decision makers in marketing authorization decisions, but only in an implicit manner. For clinical trials information to be used in a transparent and

  14. An Overview of the Prediction of Protein DNA-Binding Sites

    Directory of Open Access Journals (Sweden)

    Jingna Si

    2015-03-01

    Full Text Available Interactions between proteins and DNA play an important role in many essential biological processes such as DNA replication, transcription, splicing, and repair. The identification of amino acid residues involved in DNA-binding sites is critical for understanding the mechanism of these biological activities. In the last decade, numerous computational approaches have been developed to predict protein DNA-binding sites based on protein sequence and/or structural information, which play an important role in complementing experimental strategies. At this time, approaches can be divided into three categories: sequence-based DNA-binding site prediction, structure-based DNA-binding site prediction, and homology modeling and threading. In this article, we review existing research on computational methods to predict protein DNA-binding sites, which includes data sets, various residue sequence/structural features, machine learning methods for comparison and selection, evaluation methods, performance comparison of different tools, and future directions in protein DNA-binding site prediction. In particular, we detail the meta-analysis of protein DNA-binding sites. We also propose specific implications that are likely to result in novel prediction methods, increased performance, or practical applications.

  15. 77 FR 17098 - Proposed Extension of Existing Information Collection; Independent Contractor Registration and...

    Science.gov (United States)

    2012-03-23

    ... Information Collection; Independent Contractor Registration and Identification AGENCY: Mine Safety and Health...-00040, Independent Contractor Register. OMB last approved this information collection request (ICR) on...); or 202-693-9441 (facsimile). SUPPLEMENTARY INFORMATION: I. Background Independent contractors...

  16. Developing an Actuarial Track Utilizing Existing Resources

    Science.gov (United States)

    Rodgers, Kathy V.; Sarol, Yalçin

    2014-01-01

    Students earning a degree in mathematics often seek information on how to apply their mathematical knowledge. One option is to follow a curriculum with an actuarial emphasis designed to prepare students to work as applied mathematicians in the actuarial field. By developing only two new courses and utilizing existing courses for Validation by…

  17. Gα and regulator of G-protein signaling (RGS) protein pairs maintain functional compatibility and conserved interaction interfaces throughout evolution despite frequent loss of RGS proteins in plants.

    Science.gov (United States)

    Hackenberg, Dieter; McKain, Michael R; Lee, Soon Goo; Roy Choudhury, Swarup; McCann, Tyler; Schreier, Spencer; Harkess, Alex; Pires, J Chris; Wong, Gane Ka-Shu; Jez, Joseph M; Kellogg, Elizabeth A; Pandey, Sona

    2017-10-01

    Signaling pathways regulated by heterotrimeric G-proteins exist in all eukaryotes. The regulator of G-protein signaling (RGS) proteins are key interactors and critical modulators of the Gα protein of the heterotrimer. However, while G-proteins are widespread in plants, RGS proteins have been reported to be missing from the entire monocot lineage, with two exceptions. A single amino acid substitution-based adaptive coevolution of the Gα:RGS proteins was proposed to enable the loss of RGS in monocots. We used a combination of evolutionary and biochemical analyses and homology modeling of the Gα and RGS proteins to address their expansion and its potential effects on the G-protein cycle in plants. Our results show that RGS proteins are widely distributed in the monocot lineage, despite their frequent loss. There is no support for the adaptive coevolution of the Gα:RGS protein pair based on single amino acid substitutions. RGS proteins interact with, and affect the activity of, Gα proteins from species with or without endogenous RGS. This cross-functional compatibility expands between the metazoan and plant kingdoms, illustrating striking conservation of their interaction interface. We propose that additional proteins or alternative mechanisms may exist which compensate for the loss of RGS in certain plant species. © 2016 The Authors. New Phytologist © 2016 New Phytologist Trust.

  18. Analysis of protein folds using protein contact networks

    Indian Academy of Sciences (India)

    is a well-recognized classification system of proteins, which is based on manual in- ... can easily correspond to the information in the 2D matrix. ..... [7] U K Muppirala and Zhijun Li, Protein Engineering, Design & Selection 19, 265 (2006).

  19. Triplet excited States as a source of relevant (bio)chemical information.

    Science.gov (United States)

    Jiménez, M Consuelo; Miranda, Miguel A

    2014-01-01

    The properties of triplet excited states are markedly medium-dependent, which makes these species valuable tools for investigating the microenvironments existing in protein binding pockets. Monitoring of the triplet excited state behavior of drugs within transport proteins (serum albumins and α1-acid glycoproteins) by laser flash photolysis constitutes a valuable source of information on the strength of interaction, conformational freedom and protection from oxygen or other external quenchers. With proteins, formation of spatially confined triplet excited states is favored over competitive processes affording ionic species. Remarkably, under aerobic atmosphere, the triplet decay of drug@protein complexes is dramatically longer than in bulk solution. This offers a convenient dynamic range for assignment of different triplet populations or for stereochemical discrimination. In this review, selected examples of the application of the laser flash photolysis technique are described, including drug distribution between the bulk solution and the protein cavities, or between two types of proteins, detection of drug-drug interactions inside proteins, and enzyme-like activity processes mediated by proteins. Finally, protein encapsulation can also modify the photoreactivity of the guest. This is illustrated by presenting an example of retarded photooxidation.

  20. Prediction of heterodimeric protein complexes from weighted protein-protein interaction networks using novel features and kernel functions.

    Directory of Open Access Journals (Sweden)

    Peiying Ruan

    Full Text Available Since many proteins express their functional activity by interacting with other proteins and forming protein complexes, it is very useful to identify sets of proteins that form complexes. For that purpose, many prediction methods for protein complexes from protein-protein interactions have been developed such as MCL, MCODE, RNSC, PCP, RRW, and NWE. These methods have dealt with only complexes with size of more than three because the methods often are based on some density of subgraphs. However, heterodimeric protein complexes that consist of two distinct proteins occupy a large part according to several comprehensive databases of known complexes. In this paper, we propose several feature space mappings from protein-protein interaction data, in which each interaction is weighted based on reliability. Furthermore, we make use of prior knowledge on protein domains to develop feature space mappings, domain composition kernel and its combination kernel with our proposed features. We perform ten-fold cross-validation computational experiments. These results suggest that our proposed kernel considerably outperforms the naive Bayes-based method, which is the best existing method for predicting heterodimeric protein complexes.

  1. Identification of group specific motifs in Beta-lactamase family of proteins

    Directory of Open Access Journals (Sweden)

    Saxena Akansha

    2009-12-01

    Full Text Available Abstract Background Beta-lactamases are one of the most serious threats to public health. In order to combat this threat we need to study the molecular and functional diversity of these enzymes and identify signatures specific to these enzymes. These signatures will enable us to develop inhibitors and diagnostic probes specific to lactamases. The existing classification of beta-lactamases was developed nearly 30 years ago when few lactamases were available. The DLact database contains more than 2000 beta-lactamases, which can be used to study the molecular diversity and to identify signatures specific to this family. Methods A set of 2020 beta-lactamase proteins available in the DLact database http://59.160.102.202/DLact were classified using graph-based clustering of Best Bi-Directional Hits. Non-redundant (> 90 percent identical) protein sequences from each group were aligned using T-Coffee and annotated using information available in literature. Motifs specific to each group were predicted using PRATT program. Results The graph-based classification of beta-lactamase proteins resulted in the formation of six groups (four major groups containing 191, 726, 774 and 73 proteins and two minor groups containing 50 and 8 proteins). Based on the information available in literature, we found that each of the four major groups corresponds to one of the four classes proposed by Ambler. The two minor groups were novel and do not contain molecular signatures of beta-lactamase proteins reported in literature. The group-specific motifs showed high sensitivity (> 70%) and very high specificity (> 90%). The motifs from three groups (corresponding to class A, C and D) had a high level of conservation at DNA as well as protein level whereas the motifs from the fourth group (corresponding to class B) showed conservation at only protein level. Conclusion The graph-based classification of beta-lactamase proteins corresponds with the classification proposed by Ambler, thus there is

  2. 75 FR 51094 - Agency Information Collection Activities: Form N-600; Extension of an Existing Information...

    Science.gov (United States)

    2010-08-18

    ... DEPARTMENT OF HOMELAND SECURITY U.S. Citizenship and Immigration Services [OMB Control No. 1615..., Application for Certificate of Citizenship; OMB Control No. 1615- 0057. The Department of Homeland Security, U.S. Citizenship and Immigration Services (USCIS) will be submitting the following information...

  3. 77 FR 34052 - Agency Information Collection Activities: Form I-102; Revision of an Existing Information...

    Science.gov (United States)

    2012-06-08

    ... DEPARTMENT OF HOMELAND SECURITY U.S. Citizenship and Immigration Services Agency Information... Nonimmigrant Arrival-Departure Document. The Department of Homeland Security, U.S. Citizenship and Immigration...-102; U.S. Citizenship and Immigration Services (USCIS). (4) Affected public who will be asked or...

  4. 76 FR 17144 - Agency Information Collection Activities: Form N-300; Extension of an Existing Information...

    Science.gov (United States)

    2011-03-28

    ... DEPARTMENT OF HOMELAND SECURITY U.S. Citizenship and Immigration Services Agency Information... Intention; OMB Control No. 1615-0078. The Department Homeland Security, U.S. Citizenship and Immigration... the Department of Homeland Security sponsoring the collection: Form N-300; U.S. Citizenship and...

  5. 75 FR 51096 - Agency Information Collection Activities: Form N-470; Extension of an Existing Information...

    Science.gov (United States)

    2010-08-18

    ... DEPARTMENT OF HOMELAND SECURITY U.S. Citizenship and Immigration Services Agency Information... for Naturalization; OMB Control No. 1615-0056. The Department of Homeland Security, U.S. Citizenship... of Homeland Security sponsoring the collection: Form N-470; U.S. Citizenship and Immigration Services...

  6. 75 FR 6212 - Agency Information Collection Activities: Form I-129, Revision of an Existing Information...

    Science.gov (United States)

    2010-02-08

    ... DEPARTMENT OF HOMELAND SECURITY U.S. Citizenship and Immigration Services Agency Information... Control Number 1615-0009. The Department of Homeland Security, U.S. Citizenship and Immigration Services... the Department of Homeland Security sponsoring the collection: Form I-129. U.S. Citizenship and...

  7. Protein Function Prediction Based on Sequence and Structure Information

    KAUST Repository

    Smaili, Fatima Z.

    2016-01-01

    operate. In this master thesis project, we worked on inferring protein functions based on the primary protein sequence. In the approach we follow, 3D models are first constructed using I-TASSER. Functions are then deduced by structurally matching

  8. Detrended cross-correlation coefficient: Application to predict apoptosis protein subcellular localization.

    Science.gov (United States)

    Liang, Yunyun; Liu, Sanyang; Zhang, Shengli

    2016-12-01

    Apoptosis, or programmed cell death, plays a central role in the development and homeostasis of an organism. Obtaining information on the subcellular location of apoptosis proteins is very helpful for understanding the apoptosis mechanism. The prediction of subcellular localization of an apoptosis protein is still a challenging task, and existing methods are mainly based on protein primary sequences. In this paper, we introduce a new position-specific scoring matrix (PSSM)-based method by using the detrended cross-correlation (DCCA) coefficient of non-overlapping windows. Then a 190-dimensional (190D) feature vector is constructed on two widely used datasets: CL317 and ZD98, and a support vector machine is adopted as the classifier. To evaluate the proposed method, objective and rigorous jackknife cross-validation tests are performed on the two datasets. The results show that our approach offers a novel and reliable PSSM-based tool for prediction of apoptosis protein subcellular localization. Copyright © 2016 Elsevier Inc. All rights reserved.
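
    The DCCA coefficient named in this abstract can be illustrated with a minimal sketch. It assumes two equal-length numeric series (for instance two columns of a PSSM, which is an assumption here, not something the abstract states), linear detrending, and non-overlapping windows of a caller-chosen size. The function names are hypothetical. The number 190 equals the count of unordered pairs among 20 columns, which hints at one coefficient per column pair, but the abstract does not confirm this.

```python
from math import sqrt


def _profile(series):
    """Cumulative sum of deviations from the mean (the integrated series)."""
    mean = sum(series) / len(series)
    total, out = 0.0, []
    for value in series:
        total += value - mean
        out.append(total)
    return out


def _linear_residuals(window):
    """Residuals of a least-squares straight-line fit over one window."""
    n = len(window)
    t_mean = (n - 1) / 2
    y_mean = sum(window) / n
    s_tt = sum((t - t_mean) ** 2 for t in range(n))
    s_ty = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(window))
    slope = s_ty / s_tt
    return [y - (y_mean + slope * (t - t_mean)) for t, y in enumerate(window)]


def dcca_coefficient(x, y, window):
    """DCCA coefficient of two equal-length series using non-overlapping windows."""
    if len(x) != len(y):
        raise ValueError("series must have equal length")
    if window < 3:
        raise ValueError("window must hold at least 3 points")
    px, py = _profile(x), _profile(y)
    f_xy = f_xx = f_yy = 0.0
    # Trailing points that do not fill a whole window are ignored.
    for start in range(0, len(px) - window + 1, window):
        rx = _linear_residuals(px[start:start + window])
        ry = _linear_residuals(py[start:start + window])
        f_xy += sum(a * b for a, b in zip(rx, ry))   # detrended covariance
        f_xx += sum(a * a for a in rx)               # detrended variance of x
        f_yy += sum(b * b for b in ry)               # detrended variance of y
    return f_xy / sqrt(f_xx * f_yy)
```

    The coefficient is bounded between -1 and 1, so a series compared with itself gives 1 and compared with its negation gives -1.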

  9. MEGADOCK-Web: an integrated database of high-throughput structure-based protein-protein interaction predictions.

    Science.gov (United States)

    Hayashi, Takanori; Matsuzaki, Yuri; Yanagisawa, Keisuke; Ohue, Masahito; Akiyama, Yutaka

    2018-05-08

    Protein-protein interactions (PPIs) play several roles in living cells, and computational PPI prediction is a major focus of many researchers. The three-dimensional (3D) structure and binding surface are important for the design of PPI inhibitors. Therefore, rigid body protein-protein docking calculations for two protein structures are expected to allow elucidation of PPIs different from known complexes in terms of 3D structures because known PPI information is not explicitly required. We have developed rapid PPI prediction software based on protein-protein docking, called MEGADOCK. In order to fully utilize the benefits of computational PPI predictions, it is necessary to construct a comprehensive database to gather prediction results and their predicted 3D complex structures and to make them easily accessible. Although several databases exist that provide predicted PPIs, the previous databases do not contain a sufficient number of entries for the purpose of discovering novel PPIs. In this study, we constructed an integrated database of MEGADOCK PPI predictions, named MEGADOCK-Web. MEGADOCK-Web provides more than 10 times the number of PPI predictions than previous databases and enables users to conduct PPI predictions that cannot be found in conventional PPI prediction databases. In MEGADOCK-Web, there are 7528 protein chains and 28,331,628 predicted PPIs from all possible combinations of those proteins. Each protein structure is annotated with PDB ID, chain ID, UniProt AC, related KEGG pathway IDs, and known PPI pairs. Additionally, MEGADOCK-Web provides four powerful functions: 1) searching precalculated PPI predictions, 2) providing annotations for each predicted protein pair with an experimentally known PPI, 3) visualizing candidates that may interact with the query protein on biochemical pathways, and 4) visualizing predicted complex structures through a 3D molecular viewer. MEGADOCK-Web provides a huge amount of comprehensive PPI predictions based on
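
    The two counts quoted in this abstract can be checked against each other. The sketch below assumes that "all possible combinations" means unordered pairs of distinct chains; that reading is inferred from the arithmetic and is not stated in the abstract. The function name is hypothetical.

```python
from math import comb


def unordered_pairs(n_chains):
    """Number of unordered pairs of distinct items among n_chains items."""
    return comb(n_chains, 2)


# 7528 chains give 7528 * 7527 / 2 pairs.
pair_count = unordered_pairs(7528)
```

    Under that assumption the result matches the 28,331,628 predicted PPIs reported for 7528 protein chains.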

  10. 76 FR 72209 - Agency Information Collection Activities: Form N-300; Revision of an Existing Information...

    Science.gov (United States)

    2011-11-22

    ...) will be submitting the following information collection request to the Office of Management and Budget... Security (DHS), and to the Office of Management and Budget (OMB) USCIS Desk Officer. Comments may be... documentary requirements for those seeking to work in certain occupations

  11. 75 FR 30050 - Agency Information Collection Activities: Form N-648, Revision of an Existing Information...

    Science.gov (United States)

    2010-05-28

    ... DEPARTMENT OF HOMELAND SECURITY U.S. Citizenship and Immigration Services Agency Information... Disability Exceptions. OMB Control No. 1615-0060. The Department of Homeland Security, U.S. Citizenship and...-648. U.S. Citizenship and Immigration Services (USCIS). (4) Affected public who will be asked or...

  12. 75 FR 51096 - Agency Information Collection Activities: Form N-400; Extension of an Existing Information...

    Science.gov (United States)

    2010-08-18

    ... DEPARTMENT OF HOMELAND SECURITY U.S. Citizenship and Immigration Services Agency Information...; OMB Control No. 1615-0052. The Department of Homeland Security, U.S. Citizenship and Immigration.... Citizenship and Immigration Services (USCIS). (4) Affected public who will be asked or required to respond, as...

  13. Isotopomer distributions in amino acids from a highly expressed protein as a proxy for those from total protein

    Energy Technology Data Exchange (ETDEWEB)

    Shaikh, Afshan; Shaikh, Afshan S.; Tang, Yinjie; Mukhopadhyay, Aindrila; Keasling, Jay D.

    2008-06-27

    13C-based metabolic flux analysis provides valuable information about bacterial physiology. Though many biological processes rely on the synergistic functions of microbial communities, study of individual organisms in a mixed culture using existing flux analysis methods is difficult. Isotopomer-based flux analysis typically relies on hydrolyzed amino acids from a homogeneous biomass. Thus metabolic flux analysis of a given organism in a mixed culture requires its separation from the mixed culture. Swift and efficient cell separation is difficult and a major hurdle for isotopomer-based flux analysis of mixed cultures. Here we demonstrate the use of a single highly-expressed protein to analyze the isotopomer distribution of amino acids from one organism. Using the model organism E. coli expressing a plasmid-borne, his-tagged Green Fluorescent Protein (GFP), we show that induction of GFP does not affect E. coli growth kinetics or the isotopomer distribution in nine key metabolites. Further, the isotopomer labeling patterns of amino acids derived from purified GFP and total cell protein are indistinguishable, indicating that amino acids from a purified protein can be used to infer metabolic fluxes of targeted organisms in a mixed culture. This study provides the foundation to extend isotopomer-based flux analysis to study metabolism of individual strains in microbial communities.

  14. PARPs database: A LIMS systems for protein-protein interaction data mining or laboratory information management system

    Directory of Open Access Journals (Sweden)

    Picard-Cloutier Aude

    2007-12-01

    Full Text Available Abstract Background In the "post-genome" era, mass spectrometry (MS) has become an important method for the analysis of proteins, and the rapid advancement of this technique, in combination with other proteomics methods, results in an increasing amount of proteome data. This data must be archived and analysed using specialized bioinformatics tools. Description We herein describe "PARPs database," a data analysis and management pipeline for liquid chromatography tandem mass spectrometry (LC-MS/MS) proteomics. PARPs database is a web-based tool whose features include experiment annotation, protein database searching, protein sequence management, as well as data-mining of the peptides and proteins identified. Conclusion Using this pipeline, we have successfully identified several interactions of biological significance between PARP-1 and other proteins, namely RFC-1, 2, 3, 4 and 5.

  15. Induced variation in protein mutants after multiple EMS and X-ray treatments

    International Nuclear Information System (INIS)

    Walther, H.; Seibold, K.H.

    1978-01-01

    Results from two experiments with spring barley in the M3 and M8 generations gave information on the efficiency of selected mutants, improved in protein yield by different mutagenic treatments. A selection rate of 1-2% was found to be realistic according to a 5% significance level. However, differences between mutagenic treatments with EMS and X-rays and between varieties were found to be not only incidental. To evaluate the results within a protein mutation breeding programme a bivariate selection model was applied and was found to allow clear decisions on mutagenic improvements in protein production, measured in g protein/m². When substituting protein yield in g/seed for protein yield in g/m² in the early generations, all relations to protein and grain yield in g/m² were found to be low and negative. We conclude that this substituted selection character can be of only limited aid. But very high positive correlations exist between protein yield/m², lysine yield/m² and grain yield/m², which means that these selection characters would render a more reliable basis for selection in early generations. (author)

  16. 76 FR 12750 - Agency Information Collection Activities: Form I-829, Extension of an Existing Information...

    Science.gov (United States)

    2011-03-08

    ... ACTION: 30-Day Notice of Information Collection Under Review: Form I- 829, Petition by Entrepreneur to... the Form/Collection: Petition by Entrepreneur to Remove Conditions. (3) Agency form number, if any... conditional resident alien entrepreneur who obtained such status through a qualifying investment, to apply to...

  17. A sampling framework for incorporating quantitative mass spectrometry data in protein interaction analysis.

    Science.gov (United States)

    Tucker, George; Loh, Po-Ru; Berger, Bonnie

    2013-10-04

    Comprehensive protein-protein interaction (PPI) maps are a powerful resource for uncovering the molecular basis of genetic interactions and providing mechanistic insights. Over the past decade, high-throughput experimental techniques have been developed to generate PPI maps at proteome scale, first using yeast two-hybrid approaches and more recently via affinity purification combined with mass spectrometry (AP-MS). Unfortunately, data from both protocols are prone to both high false positive and false negative rates. To address these issues, many methods have been developed to post-process raw PPI data. However, with few exceptions, these methods only analyze binary experimental data (in which each potential interaction tested is deemed either observed or unobserved), neglecting quantitative information available from AP-MS such as spectral counts. We propose a novel method for incorporating quantitative information from AP-MS data into existing PPI inference methods that analyze binary interaction data. Our approach introduces a probabilistic framework that models the statistical noise inherent in observations of co-purifications. Using a sampling-based approach, we model the uncertainty of interactions with low spectral counts by generating an ensemble of possible alternative experimental outcomes. We then apply the existing method of choice to each alternative outcome and aggregate results over the ensemble. We validate our approach on three recent AP-MS data sets and demonstrate performance comparable to or better than state-of-the-art methods. Additionally, we provide an in-depth discussion comparing the theoretical bases of existing approaches and identify common aspects that may be key to their performance. Our sampling framework extends the existing body of work on PPI analysis using binary interaction data to apply to the richer quantitative data now commonly available through AP-MS assays. 
This framework is quite general, and many enhancements are likely
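
    The sample-then-aggregate idea described in this abstract can be sketched as follows. This is an illustration only: the noise model in `keep_probability`, the stand-in scoring function `observed_indicator`, and all other names are hypothetical and are not taken from the paper, whose probabilistic model is not given in the abstract.

```python
import random


def keep_probability(spectral_count):
    """Hypothetical noise model: low counts are kept less often than high counts."""
    return 1.0 - 0.5 ** spectral_count


def sample_outcome(counts, rng):
    """Draw one alternative binary experiment from the quantitative counts."""
    return {pair for pair, c in counts.items() if rng.random() < keep_probability(c)}


def ensemble_scores(counts, binary_method, n_samples=200, seed=0):
    """Average a binary-data scoring method over sampled alternative outcomes."""
    rng = random.Random(seed)
    totals = {pair: 0.0 for pair in counts}
    for _ in range(n_samples):
        observed = sample_outcome(counts, rng)
        scores = binary_method(observed)          # existing method, unchanged
        for pair in counts:
            totals[pair] += scores.get(pair, 0.0)
    return {pair: total / n_samples for pair, total in totals.items()}


def observed_indicator(observed):
    """Stand-in for an existing binary PPI method: score 1 for each observed pair."""
    return {pair: 1.0 for pair in observed}
```

    Any method that accepts a set of observed pairs and returns per-pair scores can be passed as `binary_method`, which mirrors the abstract's point that the existing method of choice is applied to each alternative outcome.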

  18. TOF-SIMS imaging technique with information entropy

    International Nuclear Information System (INIS)

    Aoyagi, Satoka; Kawashima, Y.; Kudo, Masahiro

    2005-01-01

    Time-of-flight secondary ion mass spectrometry (TOF-SIMS) is capable of chemical imaging of proteins on insulated samples in principle. However, selection of specific peaks related to a particular protein, which are necessary for chemical imaging, out of numerous candidates had been difficult without an appropriate spectrum analysis technique. Therefore multivariate analysis techniques, such as principal component analysis (PCA), and analysis with mutual information defined by information theory, have been applied to interpret SIMS spectra of protein samples. In this study mutual information was applied to select specific peaks related to proteins in order to obtain chemical images. Proteins on insulated materials were measured with TOF-SIMS and then SIMS spectra were analyzed by means of the analysis method based on the comparison using mutual information. Chemical mapping of each protein was obtained using specific peaks related to each protein selected based on values of mutual information. The results of TOF-SIMS images of proteins on the materials provide some useful information on properties of protein adsorption, optimality of immobilization processes and reaction between proteins. Thus chemical images of proteins by TOF-SIMS contribute to understanding interactions between material surfaces and proteins and to developing sophisticated biomaterials.
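
    Peak selection by mutual information, as described in this abstract, can be illustrated with a short sketch. It assumes each peak has already been reduced to a discrete flag per spectrum (for example present or absent) and that each spectrum carries a sample label; both the discretization and the function names are assumptions made here, not details from the study.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Mutual information (in bits) between two equal-length discrete sequences."""
    if len(xs) != len(ys):
        raise ValueError("sequences must have equal length")
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px, py = Counter(xs), Counter(ys)
    # Sum p(x, y) * log2(p(x, y) / (p(x) * p(y))) over observed combinations.
    return sum(
        (c / n) * log2(c * n / (px[x] * py[y]))
        for (x, y), c in joint.items()
    )


def rank_peaks(peak_flags, labels):
    """Sort peaks by how informative their flags are about the sample label."""
    scores = {peak: mutual_information(flags, labels)
              for peak, flags in peak_flags.items()}
    return sorted(scores, key=scores.get, reverse=True)
```

    A peak whose flag tracks the label perfectly scores the full label entropy, while a peak that is independent of the label scores zero, so the top of the ranking gives candidate protein-specific peaks.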

  19. Protein docking prediction using predicted protein-protein interface

    Directory of Open Access Journals (Sweden)

    Li Bin

    2012-01-01

    Full Text Available Abstract Background Many important cellular processes are carried out by protein complexes. To provide physical pictures of interacting proteins, many computational protein-protein prediction methods have been developed in the past. However, it is still difficult to identify the correct docking complex structure within top ranks among alternative conformations. Results We present a novel protein docking algorithm that utilizes imperfect protein-protein binding interface prediction for guiding protein docking. Since the accuracy of protein binding site prediction varies depending on cases, the challenge is to develop a method which does not deteriorate but improves docking results by using a binding site prediction which may not be 100% accurate. The algorithm, named PI-LZerD (using Predicted Interface with Local 3D Zernike descriptor-based Docking algorithm), is based on a pairwise protein docking prediction algorithm, LZerD, which we have developed earlier. PI-LZerD starts from performing docking prediction using the provided protein-protein binding interface prediction as constraints, which is followed by the second round of docking with updated docking interface information to further improve docking conformation. Benchmark results on bound and unbound cases show that PI-LZerD consistently improves the docking prediction accuracy as compared with docking without using binding site prediction or using the binding site prediction as post-filtering. Conclusion We have developed PI-LZerD, a pairwise docking algorithm, which uses imperfect protein-protein binding interface prediction to improve docking accuracy. PI-LZerD consistently showed better prediction accuracy over alternative methods in the series of benchmark experiments including docking using actual docking interface site predictions as well as unbound docking cases.

  20. Protein docking prediction using predicted protein-protein interface.

    Science.gov (United States)

    Li, Bin; Kihara, Daisuke

    2012-01-10

    Many important cellular processes are carried out by protein complexes. To provide physical pictures of interacting proteins, many computational protein-protein prediction methods have been developed in the past. However, it is still difficult to identify the correct docking complex structure within top ranks among alternative conformations. We present a novel protein docking algorithm that utilizes imperfect protein-protein binding interface prediction for guiding protein docking. Since the accuracy of protein binding site prediction varies depending on cases, the challenge is to develop a method which does not deteriorate but improves docking results by using a binding site prediction which may not be 100% accurate. The algorithm, named PI-LZerD (using Predicted Interface with Local 3D Zernike descriptor-based Docking algorithm), is based on a pairwise protein docking prediction algorithm, LZerD, which we have developed earlier. PI-LZerD starts from performing docking prediction using the provided protein-protein binding interface prediction as constraints, which is followed by the second round of docking with updated docking interface information to further improve docking conformation. Benchmark results on bound and unbound cases show that PI-LZerD consistently improves the docking prediction accuracy as compared with docking without using binding site prediction or using the binding site prediction as post-filtering. We have developed PI-LZerD, a pairwise docking algorithm, which uses imperfect protein-protein binding interface prediction to improve docking accuracy. PI-LZerD consistently showed better prediction accuracy over alternative methods in the series of benchmark experiments including docking using actual docking interface site predictions as well as unbound docking cases.

  1. Whey Protein

    Science.gov (United States)

    ... reliable information about the safety of taking whey protein if you are pregnant or breast feeding. Stay on the safe side and avoid use. Milk allergy: If you are allergic to cow's milk, avoid using whey protein.

  2. 75 FR 32800 - Agency Information Collection Activities: Form N-300; Extension of an Existing Information...

    Science.gov (United States)

    2010-06-09

    ... Services (USCIS) has submitted the following information collection request to the Office of Management and... Homeland Security (DHS), and to the Office of Management and Budget (OMB) USCIS Desk Officer. Comments may... citizen of the United States. This collection is also used to satisfy documentary requirements for those...

  3. 77 FR 35991 - Agency Information Collection Activities: Form I-829, Extension of an Existing Information...

    Science.gov (United States)

    2012-06-15

    ... Entrepreneur to Remove Conditions. On June 7, 2012, the Department of Homeland Security (DHS), U.S. Citizenship... information collection. (2) Title of the Form/Collection: Petition by Entrepreneur to Remove Conditions. (3.... This form is used by a conditional resident alien entrepreneur who obtained such status through a...

  4. Information flow analysis of interactome networks.

    Directory of Open Access Journals (Sweden)

    Patrycja Vasilyev Missiuro

    2009-04-01

    Full Text Available Recent studies of cellular networks have revealed modular organizations of genes and proteins. For example, in interactome networks, a module refers to a group of interacting proteins that form molecular complexes and/or biochemical pathways and together mediate a biological process. However, it is still poorly understood how biological information is transmitted between different modules. We have developed information flow analysis, a new computational approach that identifies proteins central to the transmission of biological information throughout the network. In the information flow analysis, we represent an interactome network as an electrical circuit, where interactions are modeled as resistors and proteins as interconnecting junctions. Construing the propagation of biological signals as flow of electrical current, our method calculates an information flow score for every protein. Unlike previous metrics of network centrality such as degree or betweenness that only consider topological features, our approach incorporates confidence scores of protein-protein interactions and automatically considers all possible paths in a network when evaluating the importance of each protein. We apply our method to the interactome networks of Saccharomyces cerevisiae and Caenorhabditis elegans. We find that the likelihood of observing lethality and pleiotropy when a protein is eliminated is positively correlated with the protein's information flow score. Even among proteins of low degree or low betweenness, high information scores serve as a strong predictor of loss-of-function lethality or pleiotropy. The correlation between information flow scores and phenotypes supports our hypothesis that the proteins of high information flow reside in central positions in interactome networks. We also show that the ranks of information flow scores are more consistent than that of betweenness when a large amount of noisy data is added to an interactome. Finally, we
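
    The circuit analogy in this abstract (interactions as resistors, proteins as junctions) can be sketched in a few lines. The code below is a simplified illustration, not the authors' scoring: it assumes each interaction's confidence is used directly as a conductance, injects one unit of current for every source and sink pair, and sums the current passing through each node. All function names are hypothetical, and the network must be connected.

```python
from itertools import combinations


def solve(matrix, rhs):
    """Solve a small dense linear system by Gaussian elimination with pivoting."""
    n = len(rhs)
    a = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n + 1):
                a[r][c] -= factor * a[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (a[r][n] - sum(a[r][c] * x[c] for c in range(r + 1, n))) / a[r][r]
    return x


def node_currents(nodes, edges, source, sink):
    """Current through every node when 1 unit enters at source and leaves at sink."""
    free = [n for n in nodes if n != sink]          # the sink is grounded at 0
    index = {n: i for i, n in enumerate(free)}
    lap = [[0.0] * len(free) for _ in free]         # grounded graph Laplacian
    for (u, v), g in edges.items():
        for a, b in ((u, v), (v, u)):
            if a in index:
                lap[index[a]][index[a]] += g
                if b in index:
                    lap[index[a]][index[b]] -= g
    rhs = [1.0 if n == source else 0.0 for n in free]
    volts = dict(zip(free, solve(lap, rhs)))
    volts[sink] = 0.0
    through = {n: 0.0 for n in nodes}
    for (u, v), g in edges.items():
        current = abs(g * (volts[u] - volts[v]))    # Ohm's law on each edge
        through[u] += current / 2
        through[v] += current / 2
    through[source] += 0.5   # injected current counts as flow through the source
    through[sink] += 0.5     # extracted current counts as flow through the sink
    return through


def information_flow_scores(nodes, edges):
    """Sum node currents over all source/sink pairs of a connected network."""
    scores = {n: 0.0 for n in nodes}
    for s, t in combinations(nodes, 2):
        for n, value in node_currents(nodes, edges, s, t).items():
            scores[n] += value
    return scores
```

    On a three-protein chain A-B-C with equal confidences, the middle protein B receives the highest score because every unit of current between A and C must pass through it.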

  5. Protein-protein docking using region-based 3D Zernike descriptors

    Directory of Open Access Journals (Sweden)

    Sael Lee

    2009-12-01

    Full Text Available Abstract Background Protein-protein interactions are a pivotal component of many biological processes and mediate a variety of functions. Knowing the tertiary structure of a protein complex is therefore essential for understanding the interaction mechanism. However, experimental techniques to solve the structure of the complex are often found to be difficult. To this end, computational protein-protein docking approaches can provide a useful alternative to address this issue. Prediction of docking conformations relies on methods that effectively capture shape features of the participating proteins while giving due consideration to conformational changes that may occur. Results We present a novel protein docking algorithm based on the use of 3D Zernike descriptors as regional features of molecular shape. The key motivation of using these descriptors is their invariance to transformation, in addition to a compact representation of local surface shape characteristics. Docking decoys are generated using geometric hashing, which are then ranked by a scoring function that incorporates a buried surface area and a novel geometric complementarity term based on normals associated with the 3D Zernike shape description. Our docking algorithm was tested on both bound and unbound cases in the ZDOCK benchmark 2.0 dataset. In 74% of the bound docking predictions, our method was able to find a near-native solution (interface Cα RMSD ≤ 2.5 Å) within the top 1000 ranks. For unbound docking, among the 60 complexes for which our algorithm returned at least one hit, 60% of the cases were ranked within the top 2000. Comparison with existing shape-based docking algorithms shows that our method has a better performance than the others in unbound docking while remaining competitive for bound docking cases. Conclusion We show for the first time that the 3D Zernike descriptors are adept in capturing shape complementarity at the protein-protein interface and useful for

  6. Fast mapping rapidly integrates information into existing memory networks.

    Science.gov (United States)

    Coutanche, Marc N; Thompson-Schill, Sharon L

    2014-12-01

    Successful learning involves integrating new material into existing memory networks. A learning procedure known as fast mapping (FM), thought to simulate the word-learning environment of children, has recently been linked to distinct neuroanatomical substrates in adults. This idea has suggested the (never-before tested) hypothesis that FM may promote rapid incorporation into cortical memory networks. We test this hypothesis here in 2 experiments. In our 1st experiment, we introduced 50 participants to 16 unfamiliar animals and names through FM or explicit encoding (EE) and tested participants on the training day, and again after sleep. Learning through EE produced strong declarative memories, without immediate lexical competition, as expected from slow-consolidation models. Learning through FM, however, led to almost immediate lexical competition, which continued to the next day. Additionally, the learned words began to prime related concepts on the day following FM (but not EE) training. In a 2nd experiment, we replicated the lexical integration results and determined that presenting an already-known item during learning was crucial for rapid integration through FM. The findings presented here indicate that learned items can be integrated into cortical memory networks at an accelerated rate through fast mapping. The retrieval of a related known concept, in order to infer the target of the FM question, is critical for this effect. PsycINFO Database Record (c) 2014 APA, all rights reserved.

  7. Controversies surrounding high-protein diet intake: satiating effect and kidney and bone health.

    Science.gov (United States)

    Cuenca-Sánchez, Marta; Navas-Carrillo, Diana; Orenes-Piñero, Esteban

    2015-05-01

    Long-term consumption of a high-protein diet could be linked with metabolic and clinical problems, such as loss of bone mass and renal dysfunction. However, although it is well accepted that a high-protein diet may be detrimental to individuals with existing kidney dysfunction, there is little evidence that high protein intake is dangerous for healthy individuals. High-protein meals and foods are thought to have a greater satiating effect than high-carbohydrate or high-fat meals. The effect of high-protein diets on the modulation of satiety involves multiple metabolic pathways. Protein intake induces complex signals, with peptide hormones being released from the gastrointestinal tract and blood amino acids and derived metabolites being released in the blood. Protein intake also stimulates metabolic hormones that communicate information about energy status to the brain. Long-term ingestion of high amounts of protein seems to decrease food intake, body weight, and body adiposity in many well-documented studies. The aim of this article is to provide an extensive overview of the efficacy of high protein consumption in weight loss and maintenance, as well as the potential consequences in human health of long-term intake. © 2015 American Society for Nutrition.

  8. Evolution of an intricate J-protein network driving protein disaggregation in eukaryotes.

    Science.gov (United States)

    Nillegoda, Nadinath B; Stank, Antonia; Malinverni, Duccio; Alberts, Niels; Szlachcic, Anna; Barducci, Alessandro; De Los Rios, Paolo; Wade, Rebecca C; Bukau, Bernd

    2017-05-15

    Hsp70 participates in a broad spectrum of protein folding processes extending from nascent chain folding to protein disaggregation. This versatility in function is achieved through a diverse family of J-protein cochaperones that select substrates for Hsp70. Substrate selection is further tuned by transient complexation between different classes of J-proteins, which expands the range of protein aggregates targeted by metazoan Hsp70 for disaggregation. We assessed the prevalence and evolutionary conservation of J-protein complexation and cooperation in disaggregation. We find the emergence of a eukaryote-specific signature for interclass complexation of canonical J-proteins. Consistently, complexes exist in yeast and human cells, but not in bacteria, and correlate with cooperative action in disaggregation in vitro. Signature alterations exclude some J-proteins from networking, which ensures correct J-protein pairing, functional network integrity and J-protein specialization. This fundamental change in J-protein biology during the prokaryote-to-eukaryote transition allows for increased fine-tuning and broadening of Hsp70 function in eukaryotes.

  9. 78 FR 41062 - Agency Information Collection Activities: Existing Collection; Emergency Extension

    Science.gov (United States)

    2013-07-09

    ... Collection Title: Demographic Information on Federal Job Applicants. OMB Control No.: 3046-0046. Description... agencies to evaluate their employment practices through the collection and analysis of data on the race...

  10. On the analysis of protein-protein interactions via knowledge-based potentials for the prediction of protein-protein docking

    DEFF Research Database (Denmark)

    Feliu, Elisenda; Aloy, Patrick; Oliva, Baldo

    2011-01-01

    Development of effective methods to screen binary interactions obtained by rigid-body protein-protein docking is key for structure prediction of complexes and for elucidating physicochemical principles of protein-protein binding. We have derived empirical knowledge-based potential functions for selecting rigid-body docking poses. These potentials include the energetic component that provides the residues with a particular secondary structure and surface accessibility. These scoring functions have been tested on a state-of-art benchmark dataset and on a decoy dataset of permanent interactions. Our... and with independence of the partner. This information is encoded at the residue level and could be easily incorporated in the initial grid scoring for Fast Fourier Transform rigid-body docking methods.

  11. CLMSVault: A Software Suite for Protein Cross-Linking Mass-Spectrometry Data Analysis and Visualization.

    Science.gov (United States)

    Courcelles, Mathieu; Coulombe-Huntington, Jasmin; Cossette, Émilie; Gingras, Anne-Claude; Thibault, Pierre; Tyers, Mike

    2017-07-07

    Protein cross-linking mass spectrometry (CL-MS) enables the sensitive detection of protein interactions and the inference of protein complex topology. The detection of chemical cross-links between protein residues can identify intra- and interprotein contact sites or provide physical constraints for molecular modeling of protein structure. Recent innovations in cross-linker design, sample preparation, mass spectrometry, and software tools have significantly improved CL-MS approaches. Although a number of algorithms now exist for the identification of cross-linked peptides from mass spectral data, a dearth of user-friendly analysis tools represents a practical bottleneck to the broad adoption of the approach. To facilitate the analysis of CL-MS data, we developed CLMSVault, a software suite designed to leverage existing CL-MS algorithms and provide intuitive and flexible tools for cross-platform data interpretation. CLMSVault stores and combines complementary information obtained from different cross-linkers and search algorithms. CLMSVault provides filtering, comparison, and visualization tools to support CL-MS analyses and includes a workflow for label-free quantification of cross-linked peptides. An embedded 3D viewer enables the visualization of quantitative data and the mapping of cross-linked sites onto PDB structural models. We demonstrate the application of CLMSVault for the analysis of a noncovalent Cdc34-ubiquitin protein complex cross-linked under different conditions. CLMSVault is open-source software (available at https://gitlab.com/courcelm/clmsvault.git), and a live demo is available at http://democlmsvault.tyerslab.com/.

  12. Degradation kinetics of fisetin and quercetin in solutions affected by medium pH, temperature and co-existed proteins

    Directory of Open Access Journals (Sweden)

    Wang Jing

    2016-01-01

    Full Text Available The impacts of medium pH, temperature and coexisting proteins on the degradation of two flavonoids, fisetin and quercetin, were assessed by a spectroscopic method in the present study. Based on the measured degradation rate constants (k), fisetin was more stable than quercetin in all cases. Increasing medium pH from 6.0 to 7.5 at 37°C raised the respective k values of fisetin and quercetin from 8.30x10−3 and 2.81x10−2 to 0.202 and 0.375 h−1 (P<0.05). In comparison with their degradation at 37°C, fisetin and quercetin showed larger k values at higher temperatures (0.124 and 0.245 h−1 at 50°C, or 0.490 and 1.42 h−1 at 65°C). Four protein products in the medium could stabilize the two flavonoids (P<0.05), as these proteins at 0.10 g L−1 decreased the respective k values of fisetin and quercetin to 2.28x10−2-2.98x10−2 and 4.37x10−2-5.97x10−2 h−1. Hydrophobic interaction between the proteins and the two flavonoids was shown to be responsible for the stabilization, as sodium dodecyl sulfate could destroy the stabilization significantly (P<0.05). Casein and soybean protein provided greater stabilization than whey protein isolate. It is thus concluded that higher temperature and alkaline pH can enhance flavonoid loss, whereas coexisting proteins, acting as flavonoid stabilizers, can inhibit flavonoid degradation.
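
    To give the rate constants a more intuitive scale, they can be converted into half-lives. The sketch below assumes first-order degradation kinetics, which is consistent with rate constants reported in h−1 but is not stated explicitly in the abstract. The function names and the `k_at_37C` table are hypothetical; the numbers in the table are the 37°C values quoted in the abstract.

```python
import math


def half_life(k_per_hour):
    """Half-life in hours for first-order decay: t_1/2 = ln(2) / k."""
    return math.log(2) / k_per_hour


def fraction_remaining(k_per_hour, hours):
    """Fraction of the flavonoid left after `hours`: exp(-k * t)."""
    return math.exp(-k_per_hour * hours)


# Rate constants (h^-1) at 37 °C quoted in the abstract, keyed by (compound, pH).
k_at_37C = {
    ("fisetin", 6.0): 8.30e-3,
    ("quercetin", 6.0): 2.81e-2,
    ("fisetin", 7.5): 0.202,
    ("quercetin", 7.5): 0.375,
}

half_lives = {key: half_life(k) for key, k in k_at_37C.items()}
```

    Under this assumption, the half-life of quercetin at pH 7.5 comes out at a little under two hours, whereas fisetin at pH 6.0 persists for several days.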

  13. Adaptive GDDA-BLAST: fast and efficient algorithm for protein sequence embedding.

    Directory of Open Access Journals (Sweden)

    Yoojin Hong

    2010-10-01

    Full Text Available A major computational challenge in the genomic era is annotating structure/function to the vast quantities of sequence information that is now available. This problem is illustrated by the fact that most proteins lack comprehensive annotations, even when experimental evidence exists. We previously theorized that embedded-alignment profiles (simply "alignment profiles" hereafter) provide a quantitative method that is capable of relating the structural and functional properties of proteins, as well as their evolutionary relationships. A key feature of alignment profiles lies in the interoperability of data format (e.g., alignment information, physio-chemical information, genomic information, etc.). Indeed, we have demonstrated that the Position Specific Scoring Matrices (PSSMs) are an informative M-dimension that is scored by quantitatively measuring the embedded or unmodified sequence alignments. Moreover, the information obtained from these alignments is informative, and remains so even in the "twilight zone" of sequence similarity (<25% identity). Although our previous embedding strategy was powerful, it suffered from contaminating alignments (embedded AND unmodified) and high computational costs. Herein, we describe the logic and algorithmic process for a heuristic embedding strategy named "Adaptive GDDA-BLAST." Adaptive GDDA-BLAST is, on average, up to 19 times faster than, but has similar sensitivity to our previous method. Further, data are provided to demonstrate the benefits of embedded-alignment measurements in terms of detecting structural homology in highly divergent protein sequences and isolating secondary structural elements of transmembrane and ankyrin-repeat domains. Together, these advances allow further exploration of the embedded alignment data space within sufficiently large data sets to eventually induce relevant statistical inferences. We show that sequence embedding could serve as one of the vehicles for measurement of low

  14. Assessment of Existing Information for Atlantic Coastal Fish Habitat Partnership (ACFHP)

    Data.gov (United States)

    National Oceanic and Atmospheric Administration, Department of Commerce — The ACFHP database consists of three primary data tables, joined within SQL Server, a relational DBMS: 1. The Bibliographic table provides information on over 500...

  15. Information processing in network architecture of genome controlled signal transduction circuit. A proposed theoretical explanation.

    Science.gov (United States)

    Chakraborty, Chiranjib; Sarkar, Bimal Kumar; Patel, Pratiksha; Agoramoorthy, Govindasamy

    2012-01-01

    In this paper, Shannon information theory has been applied to elaborate cell signaling. It is proposed that in the cellular network architecture, four components, viz. source (DNA), transmitter (mRNA), receiver (protein) and destination (another protein), are involved. The message passes from the source (DNA) to the transmitter (mRNA) and then through a noisy channel, finally reaching the receiver (protein). The protein synthesis process is here considered as the noisy channel. Ultimately, the signal is transmitted from the receiver to the destination (another protein). The genome network architecture elements were compared with the genetic alphabet L = {A, C, G, T} using a biophysical model based on Shannon information theory. This study found that the channel capacity is maximal for zero error (sigma = 0); under this condition, the transition matrix becomes a unit matrix with rank 4. As sigma increases, the transition matrix becomes erroneous, and at sigma = 1 the channel capacity has a local maximum with a value of 0.415, owing to the increased value of sigma. On the other hand, a minimum exists at sigma = 0.75, where all transition probabilities become 0.25 and uncertainty is maximal, resulting in a channel capacity of zero.
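
    The three capacity values quoted in this abstract can be reproduced with a short calculation. The sketch below assumes a symmetric four-letter channel in which a base is received correctly with probability 1 − sigma and is otherwise replaced by one of the three other bases with equal probability; the abstract does not spell out the transition matrix, so this form is an assumption chosen because it matches the reported numbers. The function name `capacity` is hypothetical.

```python
import math


def capacity(sigma, alphabet_size=4):
    """Capacity (bits per symbol) of a symmetric channel with error rate sigma.

    Each row of the assumed transition matrix is
    (1 - sigma, sigma/3, sigma/3, sigma/3) up to permutation, so
    C = log2(alphabet_size) - H(row).
    """
    wrong = sigma / (alphabet_size - 1)
    row = [1.0 - sigma] + [wrong] * (alphabet_size - 1)
    row_entropy = -sum(p * math.log2(p) for p in row if p > 0)
    return math.log2(alphabet_size) - row_entropy


# Evaluate at the three error rates discussed in the abstract.
values = {sigma: capacity(sigma) for sigma in (0.0, 0.75, 1.0)}
```

    With this matrix the capacity is 2 bits at sigma = 0, zero at sigma = 0.75 (every row is uniform), and 2 − log2(3) ≈ 0.415 bits at sigma = 1, in agreement with the values in the abstract.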

  16. Mitochondrial Dysfunction in Protein Conformational Disorders

    Indian Academy of Sciences (India)

    EstherShlomi

    protein misfolding of α-synuclein involves conformational changes in the protein .... upon association with a membrane surface it can adopt a helical form with an 11/3 ... case of α-synuclein electrostatic interactions exist between positively ...

  17. cuticleDB: a relational database of Arthropod cuticular proteins

    Directory of Open Access Journals (Sweden)

    Willis Judith H

    2004-09-01

    Full Text Available Abstract Background The insect exoskeleton or cuticle is a bi-partite composite of proteins and chitin that provides protective, skeletal and structural functions. Little information is available about the molecular structure of this important complex that exhibits a helicoidal architecture. Scores of sequences of cuticular proteins have been obtained from direct protein sequencing, from cDNAs, and from genomic analyses. Most of these cuticular protein sequences contain motifs found only in arthropod proteins. Description cuticleDB is a relational database containing all structural proteins of Arthropod cuticle identified to date. Many come from direct sequencing of proteins isolated from cuticle and from sequences from cDNAs that share common features with these authentic cuticular proteins. It also includes proteins from the Drosophila melanogaster and the Anopheles gambiae genomes that have been predicted to be cuticular proteins, based on a Pfam motif (PF00379) responsible for chitin binding in Arthropod cuticle. The total number of the database entries is 445: 370 derive from insects, 60 from Crustacea and 15 from Chelicerata. The database can be accessed from our web server at http://bioinformatics.biol.uoa.gr/cuticleDB. Conclusions CuticleDB was primarily designed to contain correct and full annotation of cuticular protein data. The database will be of help to future genome annotators. Users will be able to test hypotheses for the existence of known and also of yet unknown motifs in cuticular proteins. An analysis of motifs may contribute to understanding how proteins contribute to the physical properties of cuticle as well as to the precise nature of their interaction with chitin.

  18. Data requirements for valuing externalities: The role of existing permitting processes

    Energy Technology Data Exchange (ETDEWEB)

    Lee, A.D.; Baechler, M.C.; Callaway, J.M.

    1990-08-01

    While the assessment of externalities, or residual impacts, will place new demands on regulators, utilities, and developers, existing processes already require certain data and information that may fulfill some of the data needs for externality valuation. This paper examines existing siting, permitting, and other processes and highlights similarities and differences between their data requirements and the data required to value environmental externalities. It specifically considers existing requirements for siting new electricity resources in Oregon and compares them with the information and data needed to value externalities for such resources. This paper also presents several observations about how states can take advantage of data acquired through processes already in place as they move into an era when externalities are considered in utility decision-making. It presents other observations on the similarities and differences between the data requirements under existing processes and those for valuing externalities. This paper also briefly discusses the special case of cumulative impacts. And it presents recommendations on what steps to take in future efforts to value externalities. 35 refs., 2 tabs.

  19. Crowdsourcing step-by-step information extraction to enhance existing how-to videos

    OpenAIRE

    Nguyen, Phu Tran; Weir, Sarah; Guo, Philip J.; Miller, Robert C.; Gajos, Krzysztof Z.; Kim, Ju Ho

    2014-01-01

    Millions of learners today use how-to videos to master new skills in a variety of domains. But browsing such videos is often tedious and inefficient because video player interfaces are not optimized for the unique step-by-step structure of such videos. This research aims to improve the learning experience of existing how-to videos with step-by-step annotations. We first performed a formative study to verify that annotations are actually useful to learners. We created ToolScape, an interac...

  20. High Dietary Protein Intake and Protein-Related Acid Load on Bone Health.

    Science.gov (United States)

    Cao, Jay J

    2017-12-01

    Consumption of high-protein diets is increasingly popular due to the benefits of protein on preserving lean mass and controlling appetite and satiety. This paper reviews recent clinical research assessing the effect of dietary protein on calcium metabolism and bone health. Epidemiological studies show that long-term, high-protein intake is positively associated with bone mineral density and reduced risk of bone fracture incidence. Short-term interventional studies demonstrate that a high-protein diet does not negatively affect calcium homeostasis. Existing evidence supports that the negative effects of the acid load of protein on urinary calcium excretion are offset by the beneficial skeletal effects of high-protein intake. Future research should focus on the role and the degree of contribution of other dietary and physiological factors, such as intake of fruits and vegetables, in reducing the acid load and further enhancing the anabolic effects of protein on the musculoskeletal system.

  1. From existing data to novel hypotheses : design and application of structure-based Molecular Class Specific Information Systems

    NARCIS (Netherlands)

    Kuipers, R.K.P.

    2012-01-01

    As the active component of many biological systems, proteins are of great interest to life scientists. Proteins are used in a large number of different applications such as the production of precursors and compounds, for bioremediation, as drug targets, to diagnose patients suffering from

  2. 77 FR 39238 - Agency Information Collection Activities: Existing Collection; Emergency Extension

    Science.gov (United States)

    2012-07-02

    ... requirements for elementary and secondary public school districts. The EEOC uses EEO-5 data to investigate charges of employment discrimination against elementary and secondary public school districts. The data... Information Collection--Emergency Request--Revision of a Currently Approved Collection: Elementary-Secondary...

  3. Protein function prediction using neighbor relativity in protein-protein interaction network.

    Science.gov (United States)

    Moosavi, Sobhan; Rahgozar, Masoud; Rahimi, Amir

    2013-04-01

    There is a large gap between the number of discovered proteins and the number of functionally annotated ones. Due to the high cost of determining protein function by wet-lab research, function prediction has become a major task for computational biology and bioinformatics. Some studies use protein interaction information to predict function for un-annotated proteins. In this paper, we propose a novel approach called "Neighbor Relativity Coefficient" (NRC), based on interaction network topology, which estimates the functional similarity between two proteins. NRC is calculated for each pair of proteins based on their graph-based features, including distance, common neighbors and the number of paths between them. In order to ascribe function to an un-annotated protein, NRC estimates a weight for each neighbor to transfer its annotation to the unknown protein. Finally, the unknown protein is annotated with the top-scoring transferred functions. We also investigate the effect of using different coefficients for various types of functions. The proposed method has been evaluated on Saccharomyces cerevisiae and Homo sapiens interaction networks. The performance analysis demonstrates that NRC yields better results in comparison with previous protein function prediction approaches that use interaction networks. Copyright © 2012 Elsevier Ltd. All rights reserved.

  4. Modular protein switches derived from antibody mimetic proteins.

    Science.gov (United States)

    Nicholes, N; Date, A; Beaujean, P; Hauk, P; Kanwar, M; Ostermeier, M

    2016-02-01

    Protein switches have potential applications as biosensors and selective protein therapeutics. Protein switches built by fusion of proteins with the prerequisite input and output functions are currently developed using an ad hoc process. A modular switch platform in which existing switches could be readily adapted to respond to any ligand would be advantageous. We investigated the feasibility of a modular protein switch platform based on fusions of the enzyme TEM-1 β-lactamase (BLA) with two different antibody mimetic proteins: designed ankyrin repeat proteins (DARPins) and monobodies. We created libraries of random insertions of the gene encoding BLA into genes encoding a DARPin or a monobody designed to bind maltose-binding protein (MBP). From these libraries, we used a genetic selection system for β-lactamase activity to identify genes that conferred MBP-dependent ampicillin resistance to Escherichia coli. Some of these selected genes encoded switch proteins whose enzymatic activity increased up to 14-fold in the presence of MBP. We next introduced mutations into the antibody mimetic domain of these switches that were known to cause binding to different ligands. To different degrees, introduction of the mutations resulted in switches with the desired specificity, illustrating the potential modularity of these platforms. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  5. A configuration space of homologous proteins conserving mutual information and allowing a phylogeny inference based on pair-wise Z-score probabilities.

    Science.gov (United States)

    Bastien, Olivier; Ortet, Philippe; Roy, Sylvaine; Maréchal, Eric

    2005-03-10

    Popular methods to reconstruct molecular phylogenies are based on multiple sequence alignments, in which addition or removal of data may change the resulting tree topology. We have sought a representation of homologous proteins that would conserve the information of pair-wise sequence alignments, respect probabilistic properties of Z-scores (Monte Carlo methods applied to pair-wise comparisons) and be the basis for a novel method of consistent and stable phylogenetic reconstruction. We have built up a spatial representation of protein sequences using concepts from particle physics (configuration space) and respecting a frame of constraints deduced from pair-wise alignment score properties in information theory. The obtained configuration space of homologous proteins (CSHP) allows the representation of real and shuffled sequences, and thereupon an expression of the TULIP theorem for Z-score probabilities. Based on the CSHP, we propose a phylogeny reconstruction using Z-scores. Deduced trees, called TULIP trees, are consistent with multiple-alignment based trees. Furthermore, the TULIP tree reconstruction method provides a solution for some previously reported incongruent results, such as the apicomplexan enolase phylogeny. The CSHP is a unified model that conserves mutual information between proteins in the way physical models conserve energy. Applications include the reconstruction of evolutionary consistent and robust trees, the topology of which is based on a spatial representation that is not reordered after addition or removal of sequences. The CSHP and its assigned phylogenetic topology, provide a powerful and easily updated representation for massive pair-wise genome comparisons based on Z-score computations.
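
    The pair-wise Z-score mentioned in this abstract is a Monte Carlo statistic: the observed alignment score is compared with the distribution of scores obtained after shuffling one of the sequences. The sketch below shows only that general idea. As a loudly stated simplification, it replaces a real alignment score with a toy count of identical positions, and the names `identity_score` and `z_score`, the number of shuffles and the seed are all hypothetical choices rather than anything taken from the paper.

```python
import random
import statistics


def identity_score(a, b):
    """Toy stand-in for an alignment score: identical positions, no gaps."""
    return sum(x == y for x, y in zip(a, b))


def z_score(a, b, score=identity_score, shuffles=200, seed=0):
    """Z = (observed score - mean shuffled score) / std of shuffled scores."""
    rng = random.Random(seed)
    observed = score(a, b)
    letters = list(b)
    null_scores = []
    for _ in range(shuffles):
        # Shuffling keeps the composition of b but destroys its order.
        rng.shuffle(letters)
        null_scores.append(score(a, "".join(letters)))
    spread = statistics.pstdev(null_scores)
    if spread == 0:
        raise ValueError("shuffled scores have zero variance")
    return (observed - statistics.mean(null_scores)) / spread


# Assumed example: a sequence compared with itself scores far above its shuffles.
seq = "ACDEFGHIKLMNPQRSTVWY"
z_self = z_score(seq, seq)
```

    A production version would substitute a proper alignment score (for example from a Smith-Waterman implementation) for `identity_score`; the shuffling logic stays the same.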

  6. A configuration space of homologous proteins conserving mutual information and allowing a phylogeny inference based on pair-wise Z-score probabilities

    Directory of Open Access Journals (Sweden)

    Maréchal Eric

    2005-03-01

    Full Text Available Abstract Background Popular methods to reconstruct molecular phylogenies are based on multiple sequence alignments, in which addition or removal of data may change the resulting tree topology. We have sought a representation of homologous proteins that would conserve the information of pair-wise sequence alignments, respect probabilistic properties of Z-scores (Monte Carlo methods applied to pair-wise comparisons) and be the basis for a novel method of consistent and stable phylogenetic reconstruction. Results We have built up a spatial representation of protein sequences using concepts from particle physics (configuration space) and respecting a frame of constraints deduced from pair-wise alignment score properties in information theory. The obtained configuration space of homologous proteins (CSHP) allows the representation of real and shuffled sequences, and thereupon an expression of the TULIP theorem for Z-score probabilities. Based on the CSHP, we propose a phylogeny reconstruction using Z-scores. Deduced trees, called TULIP trees, are consistent with multiple-alignment based trees. Furthermore, the TULIP tree reconstruction method provides a solution for some previously reported incongruent results, such as the apicomplexan enolase phylogeny. Conclusion The CSHP is a unified model that conserves mutual information between proteins in the way physical models conserve energy. Applications include the reconstruction of evolutionary consistent and robust trees, the topology of which is based on a spatial representation that is not reordered after addition or removal of sequences. The CSHP and its assigned phylogenetic topology, provide a powerful and easily updated representation for massive pair-wise genome comparisons based on Z-score computations.

  7. Protein-protein interactions and cancer: targeting the central dogma.

    Science.gov (United States)

    Garner, Amanda L; Janda, Kim D

    2011-01-01

    Between 40,000 and 200,000 protein-protein interactions have been predicted to exist within the human interactome. As these interactions are critical to many important cellular functions and their dysregulation causes disease, the modulation of these binding events has emerged as a leading, yet difficult, therapeutic arena. In particular, the targeting of protein-protein interactions relevant to cancer is of fundamental importance, as the tumor-promoting function of several aberrantly expressed proteins in the cancerous state results directly from their ability to interact with a protein-binding partner. Of significance, these protein complexes play a crucial role in each of the steps of the central dogma of molecular biology, the fundamental processes of genetic transmission. With the many important discoveries being made regarding the mechanisms of these genetic processes, the identification of new chemical probes is needed to better understand and validate the druggability of protein-protein interactions related to the central dogma. In this review, we provide an overview of current small molecule-based protein-protein interaction inhibitors for each stage of the central dogma: transcription, mRNA splicing and translation. Importantly, through our analysis we have uncovered a lack of necessary probes targeting mRNA splicing and translation, thus opening up the possibility for expansion of these fields.

  8. Protein-based stable isotope probing.

    Science.gov (United States)

    Jehmlich, Nico; Schmidt, Frank; Taubert, Martin; Seifert, Jana; Bastida, Felipe; von Bergen, Martin; Richnow, Hans-Hermann; Vogt, Carsten

    2010-12-01

    We describe a stable isotope probing (SIP) technique that was developed to link microbe-specific metabolic function to phylogenetic information. Carbon (13C)- or nitrogen (15N)-labeled substrates (typically with >98% heavy label) were used in cultivation experiments and the heavy isotope incorporation into proteins (protein-SIP) on growth was determined. The amount of incorporation provides a measure for assimilation of a substrate, and the sequence information from peptide analysis obtained by mass spectrometry delivers phylogenetic information about the microorganisms responsible for the metabolism of the particular substrate. In this article, we provide guidelines for incubating microbial cultures with labeled substrates and a protocol for protein-SIP. The protocol guides readers through the proteomics pipeline, including protein extraction, gel-free and gel-based protein separation, the subsequent mass spectrometric analysis of peptides and the calculation of the incorporation of stable isotopes into peptides. Extraction of proteins and the mass fingerprint measurements of unlabeled and labeled fractions can be performed in 2-3 d.

  9. Mining disease genes using integrated protein-protein interaction and gene-gene co-regulation information.

    Science.gov (United States)

    Li, Jin; Wang, Limei; Guo, Maozu; Zhang, Ruijie; Dai, Qiguo; Liu, Xiaoyan; Wang, Chunyu; Teng, Zhixia; Xuan, Ping; Zhang, Mingming

    2015-01-01

    In humans, despite the rapid increase in disease-associated gene discovery, a large proportion of disease-associated genes are still unknown. Many network-based approaches have been used to prioritize disease genes. Many networks, such as the protein-protein interaction (PPI), KEGG, and gene co-expression networks, have been used. Expression quantitative trait loci (eQTLs) have been successfully applied for the determination of genes associated with several diseases. In this study, we constructed an eQTL-based gene-gene co-regulation network (GGCRN) and used it to mine for disease genes. We adopted the random walk with restart (RWR) algorithm to mine for genes associated with Alzheimer disease. Compared to the Human Protein Reference Database (HPRD) PPI network alone, the integrated HPRD PPI and GGCRN networks provided faster convergence and revealed new disease-related genes. Therefore, using the RWR algorithm for integrated PPI and GGCRN is an effective method for disease-associated gene mining.
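
    The random walk with restart named in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the toy network, the gene names G1 to G4, the restart probability of 0.7 and the convergence tolerance are all assumptions chosen for the example. The update used is p_next = (1 - r) * W * p + r * p0, where W is the column-normalized adjacency matrix and p0 places equal weight on the seed (known disease) genes.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.7, tol=1e-10, max_iter=1000):
    """Score every node by its proximity to the seed nodes.

    adjacency: dict mapping node -> set of neighbours (undirected, no isolated nodes).
    seeds:     set of seed nodes (hypothetical known disease genes).
    restart:   probability of jumping back to a seed at each step (assumed value).
    """
    nodes = sorted(adjacency)
    # Initial distribution: equal probability on each seed, zero elsewhere.
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        # Restart term: return to the seeds with probability `restart`.
        nxt = {n: restart * p0[n] for n in nodes}
        # Walk term: each node passes its remaining mass equally to its neighbours.
        for u in nodes:
            share = (1.0 - restart) * p[u] / len(adjacency[u])
            for v in adjacency[u]:
                nxt[v] += share
        change = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if change < tol:  # L1 change below tolerance: treat as converged
            break
    return p


# Hypothetical four-gene path network G1 - G2 - G3 - G4, with G1 as the seed.
toy_network = {
    "G1": {"G2"},
    "G2": {"G1", "G3"},
    "G3": {"G2", "G4"},
    "G4": {"G3"},
}
scores = random_walk_with_restart(toy_network, seeds={"G1"})
ranking = sorted(scores, key=scores.get, reverse=True)
```

    In this sketch, genes closer to the seed receive higher steady-state scores, which is the basis for ranking candidate genes. Integrating two networks, as the abstract describes, could be approximated here by taking the union of their edges before running the walk, although the abstract does not state how the authors combined them.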

  10. Protein complex prediction based on k-connected subgraphs in protein interaction network

    Directory of Open Access Journals (Sweden)

    Habibi Mahnaz

    2010-09-01

    Full Text Available Abstract. Background: Protein complexes play an important role in cellular mechanisms. Recently, several methods have been presented to predict protein complexes in a protein interaction network. In these methods, a protein complex is predicted as a dense subgraph of protein interactions. However, interaction data are incomplete and a protein complex does not have to be a complete or dense subgraph. Results: We propose a more appropriate protein complex prediction method, CFA, that is based on connectivity number on subgraphs. We evaluate CFA using several protein interaction networks on reference protein complexes in two benchmark data sets (MIPS and Aloy), containing 1142 and 61 known complexes respectively. We compare CFA to some existing protein complex prediction methods (CMC, MCL, PCP and RNSC) in terms of recall and precision. We show that CFA predicts more complexes correctly at a competitive level of precision. Conclusions: Many real complexes with different connectivity levels in a protein interaction network can be predicted based on connectivity number. Our CFA program and results are freely available from http://www.bioinf.cs.ipm.ir/softwares/cfa/CFA.rar.

  11. PredPPCrys: accurate prediction of sequence cloning, protein production, purification and crystallization propensity from protein sequences using multi-step heterogeneous feature fusion and selection.

    Directory of Open Access Journals (Sweden)

    Huilin Wang

    Full Text Available X-ray crystallography is the primary approach to solve the three-dimensional structure of a protein. However, a major bottleneck of this method is the failure of multi-step experimental procedures to yield diffraction-quality crystals, including sequence cloning, protein material production, purification, crystallization and ultimately, structural determination. Accordingly, prediction of the propensity of a protein to successfully undergo these experimental procedures based on the protein sequence may help narrow down laborious experimental efforts and facilitate target selection. A number of bioinformatics methods based on protein sequence information have been developed for this purpose. However, our knowledge on the important determinants of propensity for a protein sequence to produce high diffraction-quality crystals remains largely incomplete. In practice, most of the existing methods display poorer performance when evaluated on larger and updated datasets. To address this problem, we constructed an up-to-date dataset as the benchmark, and subsequently developed a new approach termed 'PredPPCrys' using the support vector machine (SVM). Using a comprehensive set of multifaceted sequence-derived features in combination with a novel multi-step feature selection strategy, we identified and characterized the relative importance and contribution of each feature type to the prediction performance of five individual experimental steps required for successful crystallization. The resulting optimal candidate features were used as inputs to build the first-level SVM predictor (PredPPCrys I). Next, prediction outputs of PredPPCrys I were used as the input to build second-level SVM classifiers (PredPPCrys II), which led to significantly enhanced prediction performance. Benchmarking experiments indicated that our PredPPCrys method outperforms most existing procedures on both up-to-date and previous datasets. In addition, the predicted crystallization

  12. A systematic identification of species-specific protein succinylation sites using joint element features information

    Directory of Open Access Journals (Sweden)

    Hasan MM

    2017-08-01

    Full Text Available Md Mehedi Hasan (1), Mst Shamima Khatun (2), Md Nurul Haque Mollah (2), Cao Yong (3), Dianjing Guo (1). (1) School of Life Sciences and the State Key Laboratory of Agrobiotechnology, The Chinese University of Hong Kong, Shatin, New Territory, Hong Kong, People’s Republic of China; (2) Laboratory of Bioinformatics, Department of Statistics, University of Rajshahi, Rajshahi, Bangladesh; (3) Department of Mechanical Engineering and Automation, Harbin Institute of Technology, Shenzhen Graduate School, Shenzhen, People’s Republic of China. Abstract: Lysine succinylation, an important type of protein posttranslational modification, plays significant roles in many cellular processes. Accurate identification of succinylation sites can facilitate our understanding of the molecular mechanism and potential roles of lysine succinylation. However, even in well-studied systems, a majority of the succinylation sites remain undetected because the traditional experimental approaches to succinylation site identification are often costly, time-consuming, and laborious. In silico approaches, on the other hand, are a potential alternative strategy for predicting succinylation substrates. In this paper, a novel computational predictor, SuccinSite2.0, was developed for predicting generic and species-specific protein succinylation sites. This predictor takes the composition of profile-based amino acid and orthogonal binary features, which were used to train a random forest classifier. We demonstrated that the proposed SuccinSite2.0 predictor outperformed other currently existing implementations on a complementarily independent dataset. Furthermore, the important features that make visible contributions to species-specific and cross-species-specific prediction of protein succinylation sites were analyzed. The proposed predictor is anticipated to be a useful computational resource for lysine succinylation site prediction. The integrated species-specific online tool of SuccinSite2.0 is publicly

  13. Gene Unprediction with Spurio: A tool to identify spurious protein sequences.

    Science.gov (United States)

    Höps, Wolfram; Jeffryes, Matt; Bateman, Alex

    2018-01-01

    We now have access to the sequences of tens of millions of proteins. These protein sequences are essential for modern molecular biology and computational biology. The vast majority of protein sequences are derived from gene prediction tools and have no experimental supporting evidence for their translation. Despite the increasing accuracy of gene prediction tools, there likely exists a large number of spurious protein predictions in the sequence databases. We have developed the Spurio tool to help identify spurious protein predictions in prokaryotes. Spurio searches the query protein sequence against a prokaryotic nucleotide database using tblastn and identifies homologous sequences. The tblastn matches are used to score the query sequence's likelihood of being a spurious protein prediction using a Gaussian process model. The most informative feature is the appearance of stop codons within the presumed translation of homologous DNA sequences. Benchmarking shows that the Spurio tool is able to distinguish spurious from true proteins. However, transposon proteins are prone to be predicted as spurious because of the frequency of degraded homologs found in the DNA sequence databases. Our initial experiments suggest that less than 1% of the proteins in the UniProtKB sequence database are likely to be spurious and that Spurio is able to identify over 60 times more spurious proteins than the AntiFam resource. The Spurio software and source code are available under an MIT license at the following URL: https://bitbucket.org/bateman-group/spurio.

  14. 76 FR 55081 - Agency Information Collection Activities: Form I-129S; Extension of an Existing Information...

    Science.gov (United States)

    2011-09-06

    ... DEPARTMENT OF HOMELAND SECURITY U.S. Citizenship and Immigration Services Agency Information... Blanket L Petition. The Department Homeland Security, U.S. Citizenship and Immigration Services (USCIS... of the Department of Homeland Security sponsoring the collection: Form I-129S; U.S. Citizenship and...

  15. Architectures and Functional Coverage of Protein-Protein Interfaces

    Science.gov (United States)

    Tuncbag, Nurcan; Gursoy, Attila; Guney, Emre; Nussinov, Ruth; Keskin, Ozlem

    2008-01-01

    The diverse range of cellular functions is performed by a limited number of protein folds existing in nature. One may similarly expect that cellular functional diversity would be covered by a limited number of protein-protein interface architectures. Here, we present 8205 interface clusters, each representing a unique interface architecture. This dataset of protein-protein interfaces is analyzed and compared with older datasets. We observe that the number of both biological and crystal interfaces increases significantly compared to the number of PDB entries. Further, we find that the number of distinct interface architectures grows at a much faster rate than the number of folds and is yet to level off. We further analyze the growth trend of the functional coverage by constructing functional interaction networks from interfaces. The functional coverage is also found to steadily increase. Interestingly, we also observe that despite the diversity of interface architectures, some are more favorable and frequently used, and of particular interest, those are the ones which are also preferred in single chains. PMID:18620705

  16. 77 FR 23270 - Agency Information Collection Activities: Form I-290B, Extension of an Existing Information...

    Science.gov (United States)

    2012-04-18

    ... DEPARTMENT OF HOMELAND SECURITY U.S. Citizenship and Immigration Services Agency Information... Department of Homeland Security, U.S. Citizenship and Immigration Services (USCIS) will be [[Page 23271...: Form I-290B. U.S. Citizenship and Immigration Services. (4) Affected public who will be asked or...

  17. SDSL-ESR-based protein structure characterization

    NARCIS (Netherlands)

    Strancar, J.; Kavalenka, A.A.; Urbancic, I.; Ljubetic, A.; Hemminga, M.A.

    2010-01-01

    As proteins are key molecules in living cells, knowledge about their structure can provide important insights and applications in science, biotechnology, and medicine. However, many protein structures are still a big challenge for existing high-resolution structure-determination methods, as can be

  18. Information-processing genes

    International Nuclear Information System (INIS)

    Tahir Shah, K.

    1995-01-01

    There are an estimated 100,000 genes in the human genome, of which 97% is non-coding. On the other hand, bacteria have little or no non-coding DNA. Non-coding regions include introns, ALU sequences, satellite DNA, and other segments not expressed as proteins. Why do they exist? Why has nature kept non-coding DNA during the long evolutionary period if it has no role in the development of complex life forms? Is the complexity of a species somehow correlated with the existence of apparently useless sequences? What kind of capability is encoded within such nucleotide sequences that is a necessary, but not a sufficient, condition for the evolution of complex life forms, keeping in mind the C-value paradox and the omnipresence of non-coding segments in higher eukaryotes and also in many archaea and prokaryotes? The physico-chemical description of biological processes is hardware oriented and does not highlight the algorithmic or information-processing aspect. However, an algorithm without its hardware implementation is as useless as hardware without the capability to run an algorithm. The nature and type of computation an information-processing hardware can perform depends only on its algorithm and the architecture that reflects the algorithm. Given that enormously difficult tasks such as high-fidelity replication, transcription, editing and regulation are all achieved within a long linear sequence, it is natural to think that some parts of a genome are involved in these tasks. If some complex algorithms are encoded within these parts, then it is natural to think that non-coding regions contain information-processing algorithms. A comparison between well-known automatic sequences and sequences constructed out of motifs found in all species proves the point: noncoding regions are a sort of "hardwired" programs, i.e., they are linear representations of information-processing machines. Thus in our model, a noncoding region, e.g., an intron contains a program (or equivalently, it is

  19. RPPAML/RIMS: a metadata format and an information management system for reverse phase protein arrays.

    Science.gov (United States)

    Stanislaus, Romesh; Carey, Mark; Deus, Helena F; Coombes, Kevin; Hennessy, Bryan T; Mills, Gordon B; Almeida, Jonas S

    2008-12-22

    Reverse Phase Protein Arrays (RPPA) are convenient assay platforms to investigate the presence of biomarkers in tissue lysates. As with other high-throughput technologies, substantial amounts of analytical data are generated. Over 1,000 samples may be printed on a single nitrocellulose slide. Up to 100 different proteins may be assessed using immunoperoxidase or immunofluorescence techniques in order to determine relative amounts of protein expression in the samples of interest. In this report an RPPA Information Management System (RIMS) is described and made available with open source software. In order to implement the proposed system, we propose a metadata format known as reverse phase protein array markup language (RPPAML). RPPAML would enable researchers to describe, document and disseminate RPPA data. The complexity of the data structure needed to describe the results and the graphic tools necessary to visualize them require a software deployment distributed between a client and a server application. This was achieved without sacrificing interoperability between individual deployments through the use of an open source semantic database, S3DB. This data service backbone is available to multiple client side applications that can also access other server side deployments. The RIMS platform was designed to interoperate with other data analysis and data visualization tools such as Cytoscape. The proposed RPPAML data format aims to standardize RPPA data. Standardization of data would result in diverse client applications being able to operate on the same set of data. Additionally, having data in a standard format would enable data dissemination and data analysis.

  20. RPPAML/RIMS: A metadata format and an information management system for reverse phase protein arrays

    Directory of Open Access Journals (Sweden)

    Hennessy Bryan T

    2008-12-01

    Full Text Available Abstract. Background: Reverse Phase Protein Arrays (RPPA) are convenient assay platforms to investigate the presence of biomarkers in tissue lysates. As with other high-throughput technologies, substantial amounts of analytical data are generated. Over 1000 samples may be printed on a single nitrocellulose slide. Up to 100 different proteins may be assessed using immunoperoxidase or immunofluorescence techniques in order to determine relative amounts of protein expression in the samples of interest. Results: In this report an RPPA Information Management System (RIMS) is described and made available with open source software. In order to implement the proposed system, we propose a metadata format known as reverse phase protein array markup language (RPPAML). RPPAML would enable researchers to describe, document and disseminate RPPA data. The complexity of the data structure needed to describe the results and the graphic tools necessary to visualize them require a software deployment distributed between a client and a server application. This was achieved without sacrificing interoperability between individual deployments through the use of an open source semantic database, S3DB. This data service backbone is available to multiple client side applications that can also access other server side deployments. The RIMS platform was designed to interoperate with other data analysis and data visualization tools such as Cytoscape. Conclusion: The proposed RPPAML data format aims to standardize RPPA data. Standardization of data would result in diverse client applications being able to operate on the same set of data. Additionally, having data in a standard format would enable data dissemination and data analysis.

  1. Protein space: a natural method for realizing the nature of protein universe.

    Science.gov (United States)

    Yu, Chenglong; Deng, Mo; Cheng, Shiu-Yuen; Yau, Shek-Chung; He, Rong L; Yau, Stephen S-T

    2013-02-07

    Current methods cannot tell us what the nature of the protein universe is concretely. They are based on different models of amino acid substitution and multiple sequence alignment which is an NP-hard problem and requires manual intervention. Protein structural analysis also gives a direction for mapping the protein universe. Unfortunately, now only a minuscule fraction of proteins' 3-dimensional structures are known. Furthermore, the phylogenetic tree representations are not unique for any existing tree construction methods. Here we develop a novel method to realize the nature of protein universe. We show the protein universe can be realized as a protein space in 60-dimensional Euclidean space using a distance based on a normalized distribution of amino acids. Every protein is in one-to-one correspondence with a point in protein space, where proteins with similar properties stay close together. Thus the distance between two points in protein space represents the biological distance of the corresponding two proteins. We also propose a natural graphical representation for inferring phylogenies. The representation is natural and unique based on the biological distances of proteins in protein space. This will solve the fundamental question of how proteins are distributed in the protein universe. Copyright © 2012 Elsevier Ltd. All rights reserved.

  2. Sequence Based Prediction of Antioxidant Proteins Using a Classifier Selection Strategy.

    Directory of Open Access Journals (Sweden)

    Lina Zhang

    Full Text Available Antioxidant proteins perform significant functions in maintaining oxidation/antioxidation balance and have therapeutic potential for some diseases. Accurate identification of antioxidant proteins could contribute to revealing physiological processes of oxidation/antioxidation balance and developing novel antioxidation-based drugs. In this study, an ensemble method is presented to predict antioxidant proteins with hybrid features, incorporating SSI (Secondary Structure Information), PSSM (Position Specific Scoring Matrix), RSA (Relative Solvent Accessibility), and CTD (Composition, Transition, Distribution). The prediction results of the ensemble predictor are determined by an average of prediction results of multiple base classifiers. Based on a classifier selection strategy, we obtain an optimal ensemble classifier composed of RF (Random Forest), SMO (Sequential Minimal Optimization), NNA (Nearest Neighbor Algorithm), and J48 with an accuracy of 0.925. A Relief combined with IFS (Incremental Feature Selection) method is adopted to obtain optimal features from hybrid features. With the optimal features, the ensemble method achieves improved performance with a sensitivity of 0.95, a specificity of 0.93, an accuracy of 0.94, and an MCC (Matthews Correlation Coefficient) of 0.880, far better than the existing method. To evaluate the prediction performance objectively, the proposed method is compared with existing methods on the same independent testing dataset. Encouragingly, our method performs better than previous studies. In addition, our method achieves more balanced performance with a sensitivity of 0.878 and a specificity of 0.860. These results suggest that the proposed ensemble method can be a potential candidate for antioxidant protein prediction. For public access, we develop a user-friendly web server for antioxidant protein identification that is freely accessible at http://antioxidant.weka.cc.
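
    The four performance measures quoted in this abstract are all derived from a binary confusion matrix. The sketch below shows the standard formulas; the function name and the confusion counts are hypothetical and are not taken from the paper. The counts assume a balanced test set of 100 positives and 100 negatives, chosen only because they reproduce a sensitivity of 0.95 and a specificity of 0.93.

```python
from math import sqrt


def binary_metrics(tp, fn, tn, fp):
    """Return sensitivity, specificity, accuracy and MCC from confusion counts."""
    sensitivity = tp / (tp + fn)          # true positive rate
    specificity = tn / (tn + fp)          # true negative rate
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    # Matthews correlation coefficient; defined as 0 when any marginal is empty.
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


# Hypothetical counts: 100 positives (95 found), 100 negatives (93 rejected).
metrics = binary_metrics(tp=95, fn=5, tn=93, fp=7)
```

    Under this balanced assumption the accuracy is the mean of sensitivity and specificity, and the MCC comes out close to the reported 0.880. On an unbalanced test set the same sensitivity and specificity would give a different MCC, which is why MCC is often reported alongside accuracy.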

  3. DomPep--a general method for predicting modular domain-mediated protein-protein interactions.

    Directory of Open Access Journals (Sweden)

    Lei Li

    Full Text Available Protein-protein interactions (PPIs) are frequently mediated by the binding of a modular domain in one protein to a short, linear peptide motif in its partner. The advent of proteomic methods such as peptide and protein arrays has led to the accumulation of a wealth of interaction data for modular interaction domains. Although several computational programs have been developed to predict modular domain-mediated PPI events, they are often restricted to a given domain type. We describe DomPep, a method that can potentially be used to predict PPIs mediated by any modular domains. DomPep combines proteomic data with sequence information to achieve high accuracy and high coverage in PPI prediction. Proteomic binding data were employed to determine a simple yet novel parameter Ligand-Binding Similarity which, in turn, is used to calibrate Domain Sequence Identity and Position-Weighted-Matrix distance, two parameters that are used in constructing prediction models. Moreover, DomPep can be used to predict PPIs for both domains with experimental binding data and those without. Using the PDZ and SH2 domain families as test cases, we show that DomPep can predict PPIs with accuracies superior to existing methods. To evaluate DomPep as a discovery tool, we deployed DomPep to identify interactions mediated by three human PDZ domains. Subsequent in-solution binding assays validated the high accuracy of DomPep in predicting authentic PPIs at the proteome scale. Because DomPep makes use of only interaction data and the primary sequence of a domain, it can be readily expanded to include other types of modular domains.

  4. Utilizing knowledge base of amino acids structural neighborhoods to predict protein-protein interaction sites.

    Science.gov (United States)

    Jelínek, Jan; Škoda, Petr; Hoksza, David

    2017-12-06

    Protein-protein interactions (PPI) play a key role in an investigation of various biochemical processes, and their identification is thus of great importance. Although computational prediction of which amino acids take part in a PPI has been an active field of research for some time, the quality of in-silico methods is still far from perfect. We have developed a novel prediction method called INSPiRE which benefits from a knowledge base built from data available in Protein Data Bank. All proteins involved in PPIs were converted into labeled graphs with nodes corresponding to amino acids and edges to pairs of neighboring amino acids. A structural neighborhood of each node was then encoded into a bit string and stored in the knowledge base. When predicting PPIs, INSPiRE labels amino acids of unknown proteins as interface or non-interface based on how often their structural neighborhood appears as interface or non-interface in the knowledge base. We evaluated INSPiRE's behavior with respect to different types and sizes of the structural neighborhood. Furthermore, we examined the suitability of several different features for labeling the nodes. Our evaluations showed that INSPiRE clearly outperforms existing methods with respect to Matthews correlation coefficient. In this paper we introduce a new knowledge-based method for identification of protein-protein interaction sites called INSPiRE. Its knowledge base utilizes structural patterns of known interaction sites in the Protein Data Bank which are then used for PPI prediction. Extensive experiments on several well-established datasets show that INSPiRE significantly surpasses existing PPI approaches.

  5. Functional NifD-K fusion protein in Azotobacter vinelandii is a homodimeric complex equivalent to the native heterotetrameric MoFe protein

    International Nuclear Information System (INIS)

    Lahiri, Surobhi; Pulakat, Lakshmi; Gavini, Nara

    2005-01-01

    The MoFe protein of the complex metalloenzyme nitrogenase folds as a heterotetramer containing two copies each of the homologous α and β subunits, encoded by the nifD and the nifK genes respectively. Recently, the functional expression of a fusion NifD-K protein of nitrogenase was demonstrated in Azotobacter vinelandii, strongly implying that the MoFe protein is flexible as it could accommodate major structural changes, yet remain functional [M.H. Suh, L. Pulakat, N. Gavini, J. Biol. Chem. 278 (2003) 5353-5360]. This finding led us to further explore the type of interaction between the fused MoFe protein units. We aimed to determine whether an interaction exists between the two fusion MoFe proteins to form a homodimer that is equivalent to native heterotetrameric MoFe protein. Using the Bacteriomatch Two-Hybrid System, translationally fused constructs of NifD-K (fusion) with the full-length λCI of the pBT bait vector and also NifD-K (fusion) with the N-terminal α-RNAP of the pTRG target vector were made. To compare the extent of interaction between the fused NifD-K proteins to that of the β-β interactions in the native MoFe protein, we proceeded to generate translationally fused constructs of NifK with the α-RNAP of the pTRG vector and λCI protein of the pBT vector. The strength of the interaction between the proteins in study was determined by measuring the β-galactosidase activity and extent of ampicillin resistance of the colonies expressing these proteins. This analysis demonstrated that direct protein-protein interaction exists between NifD-K fusion proteins, suggesting that they exist as homodimers. As the interaction takes place at the β-interfaces of the NifD-K fusion proteins, we propose that these homodimers of NifD-K fusion protein may function in a similar manner as that of the heterotetrameric native MoFe protein. The observation that the extent of protein-protein interaction between the β-subunits of the native MoFe protein in Bacterio

  6. Mining Proteomic Data to Expose Protein Modifications in Methanosarcina mazei strain Gö1

    Directory of Open Access Journals (Sweden)

    Deborah Leon

    2015-03-01

    Full Text Available Proteomic tools identify constituents of complex mixtures, often delivering long lists of identified proteins. The high-throughput methods excel at matching tandem mass spectrometry data to spectra predicted from sequence databases. Unassigned mass spectra are ignored, but could, in principle, provide valuable information on unanticipated modifications and improve protein annotations while consuming limited quantities of material. Strategies to mine information from these discards are presented, along with discussion of features that, when present, provide strong support for modifications. In this study we mined LC-MS/MS datasets of proteolytically-digested concanavalin A pull down fractions from Methanosarcina mazei Gö1 cell lysates. Analyses identified 154 proteins. Many of the observed proteins displayed post-translationally modified forms, including O-formylated and methyl-esterified segments that appear biologically relevant (i.e., not artifacts of sample handling). Interesting cleavages and modifications (e.g., S-cyanylation and trimethylation) were observed near catalytic sites of methanogenesis enzymes. Of 31 Methanosarcina protein N-termini recovered by concanavalin A binding or from a previous study, only M. mazei S-layer protein MM1976 and its M. acetivorans C2A orthologue, MA0829, underwent signal peptide excision. Experimental results contrast with predictions from algorithms SignalP 3.0 and Exprot, which were found to over-predict the presence of signal peptides. Proteins MM0002, MM0716, MM1364, and MM1976 were found to be glycosylated, and employing chromatography tailored specifically for glycopeptides will likely reveal more. This study supplements limited, existing experimental datasets of mature archaeal N-termini, including presence or absence of signal peptides, translation initiation sites, and other processing. Methanosarcina surface and membrane proteins are richly modified.

  7. A guild of 45 CRISPR-associated (Cas) protein families and multiple CRISPR/Cas subtypes exist in prokaryotic genomes.

    Directory of Open Access Journals (Sweden)

    Daniel H Haft

    2005-11-01

    Full Text Available Clustered regularly interspaced short palindromic repeats (CRISPRs) are a family of DNA direct repeats found in many prokaryotic genomes. Repeats of 21-37 bp typically show weak dyad symmetry and are separated by regularly sized, nonrepetitive spacer sequences. Four CRISPR-associated (Cas) protein families, designated Cas1 to Cas4, are strictly associated with CRISPR elements and always occur near a repeat cluster. Some spacers originate from mobile genetic elements and are thought to confer "immunity" against the elements that harbor these sequences. In the present study, we have systematically investigated uncharacterized proteins encoded in the vicinity of these CRISPRs and found many additional protein families that are strictly associated with CRISPR loci across multiple prokaryotic species. Multiple sequence alignments and hidden Markov models have been built for 45 Cas protein families. These models identify family members with high sensitivity and selectivity and classify key regulators of development, DevR and DevS, in Myxococcus xanthus as Cas proteins. These identifications show that CRISPR/cas gene regions can be quite large, with up to 20 different, tandem-arranged cas genes next to a repeat cluster or filling the region between two repeat clusters. Distinctive subsets of the collection of Cas proteins recur in phylogenetically distant species and correlate with characteristic repeat periodicity. The analyses presented here support initial proposals of mobility of these units, along with the likelihood that loci of different subtypes interact with one another as well as with host cell defensive, replicative, and regulatory systems. It is evident from this analysis that CRISPR/cas loci are larger, more complex, and more heterogeneous than previously appreciated.

  8. Multi-level learning: improving the prediction of protein, domain and residue interactions by allowing information flow between levels

    Directory of Open Access Journals (Sweden)

    McDermott Drew

    2009-08-01

    Full Text Available Abstract. Background: Proteins interact through specific binding interfaces that contain many residues in domains. Protein interactions thus occur on three different levels of a concept hierarchy: whole-proteins, domains, and residues. Each level offers a distinct and complementary set of features for computationally predicting interactions, including functional genomic features of whole proteins, evolutionary features of domain families and physical-chemical features of individual residues. The predictions at each level could benefit from using the features at all three levels. However, it is not trivial as the features are provided at different granularity. Results: To link up the predictions at the three levels, we propose a multi-level machine-learning framework that allows for explicit information flow between the levels. We demonstrate, using representative yeast interaction networks, that our algorithm is able to utilize complementary feature sets to make more accurate predictions at the three levels than when the three problems are approached independently. To facilitate application of our multi-level learning framework, we discuss three key aspects of multi-level learning and the corresponding design choices that we have made in the implementation of a concrete learning algorithm. (1) Architecture of information flow: we show the greater flexibility of bidirectional flow over independent levels and unidirectional flow; (2) Coupling mechanism of the different levels: we show how this can be accomplished via augmenting the training sets at each level, and discuss the prevention of error propagation between different levels by means of soft coupling; (3) Sparseness of data: we show that the multi-level framework compounds data sparsity issues, and discuss how this can be dealt with by building local models in information-rich parts of the data. Our proof-of-concept learning algorithm demonstrates the advantage of combining levels, and opens up

  9. Dietary Proteins and Angiogenesis

    Directory of Open Access Journals (Sweden)

    Miguel Ángel Medina

    2014-01-01

    Full Text Available Both defective and persistent angiogenesis are linked to pathological situations in the adult. Compounds able to modulate angiogenesis have a potential value for the treatment of such pathologies. Several small molecules present in the diet have been shown to have modulatory effects on angiogenesis. This review presents the current state of knowledge on the potential modulatory roles of dietary proteins on angiogenesis. There is currently limited available information on the topic. Milk contains at least three proteins for which modulatory effects on angiogenesis have been previously demonstrated. On the other hand, there is some scarce information on the potential of dietary lectins, edible plant proteins and high protein diets to modulate angiogenesis.

  10. Proteins of unknown function in the Protein Data Bank (PDB): an inventory of true uncharacterized proteins and computational tools for their analysis.

    Science.gov (United States)

    Nadzirin, Nurul; Firdaus-Raih, Mohd

    2012-10-08

    Proteins of uncharacterized functions form a large part of many of the currently available biological databases and this situation exists even in the Protein Data Bank (PDB). Our analysis of recent PDB data revealed that only 42.53% of PDB entries (1084 coordinate files) that were categorized under "unknown function" are true examples of proteins of unknown function at this point in time. The remaining 1465 entries also annotated as such appear to be able to have their annotations re-assessed, based on the availability of direct functional characterization experiments for the protein itself, or for homologous sequences or structures thus enabling computational function inference.

  11. Proteins of Unknown Function in the Protein Data Bank (PDB): An Inventory of True Uncharacterized Proteins and Computational Tools for Their Analysis

    Directory of Open Access Journals (Sweden)

    Nurul Nadzirin

    2012-10-01

    Full Text Available Proteins of uncharacterized functions form a large part of many of the currently available biological databases and this situation exists even in the Protein Data Bank (PDB). Our analysis of recent PDB data revealed that only 42.53% of PDB entries (1084 coordinate files) that were categorized under “unknown function” are true examples of proteins of unknown function at this point in time. The remaining 1465 entries also annotated as such appear to be able to have their annotations re-assessed, based on the availability of direct functional characterization experiments for the protein itself, or for homologous sequences or structures thus enabling computational function inference.

  12. Thiazolidine-2,4-dione derivatives: programmed chemical weapons for key protein targets of various pathological conditions.

    Science.gov (United States)

    Chadha, Navriti; Bahia, Malkeet Singh; Kaur, Maninder; Silakari, Om

    2015-07-01

    Thiazolidine-2,4-dione is an extensively explored heterocyclic nucleus for designing of novel agents implicated for a wide variety of pathophysiological conditions, that is, diabetes, diabetic complications, cancer, arthritis, inflammation, microbial infection, and melanoma, etc. The current paradigm of drug development has shifted to the structure-based drug design, since high-throughput screenings have continued to generate disappointing results. The gap between hit generation and drug establishment can be narrowed down by investigation of ligand interactions with its receptor protein. Therefore, it would always be highly beneficial to gain knowledge of molecular level interactions between specific protein target and developed ligands; since this information can be maneuvered to design new molecules with improved protein fitting. Thus, considering this aspect, we have corroborated the information about molecular (target) level implementations of thiazolidine-2,4-diones (TZD) derivatives having therapeutic implementations such as, but not limited to, anti-diabetic (glitazones), anti-cancer, anti-arthritic, anti-inflammatory, anti-oxidant and anti-microbial, etc. The structure based SAR of TZD derivatives for various protein targets would serve as a benchmark for the alteration of existing ligands to design new ones with better binding interactions. Copyright © 2015 Elsevier Ltd. All rights reserved.

  13. Analysis of secreted proteins from Aspergillus flavus.

    Science.gov (United States)

    Medina, Martha L; Haynes, Paul A; Breci, Linda; Francisco, Wilson A

    2005-08-01

    MS/MS techniques in proteomics make possible the identification of proteins from organisms with little or no genome sequence information available. Peptide sequences are obtained from tandem mass spectra by matching peptide mass and fragmentation information to protein sequence information from related organisms, including unannotated genome sequence data. This peptide identification data can then be grouped and reconstructed into protein data. In this study, we have used this approach to study protein secretion by Aspergillus flavus, a filamentous fungus for which very little genome sequence information is available. A. flavus is capable of degrading the flavonoid rutin (quercetin 3-O-glycoside), as the only source of carbon via an extracellular enzyme system. In this continuing study, a proteomic analysis was used to identify secreted proteins from A. flavus when grown on rutin. The growth media glucose and potato dextrose were used to identify differentially expressed secreted proteins. The secreted proteins were analyzed by 1- and 2-DE and MS/MS. A total of 51 unique A. flavus secreted proteins were identified from the three growth conditions. Ten proteins were unique to rutin-, five to glucose- and one to potato dextrose-grown A. flavus. Sixteen secreted proteins were common to all three media. Fourteen identifications were of hypothetical proteins or proteins of unknown functions. To our knowledge, this is the first extensive proteomic study conducted to identify the secreted proteins from a filamentous fungus.

  14. A method for investigating protein-protein interactions related to Salmonella typhimurium pathogenesis

    Energy Technology Data Exchange (ETDEWEB)

    Chowdhury, Saiful M. [Pacific Northwest National Lab. (PNNL), Richland, WA (United States); Shi, Liang [Pacific Northwest National Lab. (PNNL), Richland, WA (United States); Yoon, Hyunjin [Dartmouth College, Hanover, NH (United States); Ansong, Charles [Pacific Northwest National Lab. (PNNL), Richland, WA (United States); Rommereim, Leah M. [Dartmouth College, Hanover, NH (United States); Norbeck, Angela D. [Pacific Northwest National Lab. (PNNL), Richland, WA (United States); Auberry, Kenneth J. [Pacific Northwest National Lab. (PNNL), Richland, WA (United States); Moore, R. J. [Pacific Northwest National Lab. (PNNL), Richland, WA (United States); Adkins, Joshua N. [Pacific Northwest National Lab. (PNNL), Richland, WA (United States); Heffron, Fred [Oregon Health and Science Univ., Portland, OR (United States); Smith, Richard D. [Pacific Northwest National Lab. (PNNL), Richland, WA (United States)

    2009-02-10

    We successfully modified an existing method to investigate protein-protein interactions in the pathogenic bacterium Salmonella typhimurium (STM). This method includes i) addition of a histidine-biotin-histidine tag to the bait proteins via recombinant DNA techniques; ii) in vivo cross-linking with formaldehyde; iii) tandem affinity purification of bait proteins under fully denaturing conditions; and iv) identification of the proteins cross-linked to the bait proteins by liquid chromatography in conjunction with tandem mass spectrometry. In vivo cross-linking stabilized protein interactions and permitted the subsequent two-step purification to be conducted under denaturing conditions. The two-step purification greatly reduced nonspecific binding of non-cross-linked proteins to bait proteins. Two different negative controls were employed to reduce false-positive identification. In an initial demonstration of this approach, we tagged three selected STM proteins (HimD, PduB and PhoP) with known binding partners that ranged from stable (e.g., HimD) to transient (i.e., PhoP). Distinct sets of interacting proteins were identified with each bait protein, including the known binding partners such as HimA for HimD, as well as anticipated and unexpected binding partners. Our results suggest that novel protein-protein interactions may be critical to pathogenesis by Salmonella typhimurium.

  15. Information and Informality

    DEFF Research Database (Denmark)

    Larsson, Magnus; Segerstéen, Solveig; Svensson, Cathrin

    2011-01-01

    leaders on the basis of their possession of reliable knowledge in technical as well as organizational domains. The informal leaders engaged in interpretation and brokering of information and knowledge, as well as in mediating strategic values and priorities on both formal and informal arenas. Informal...... leaders were thus seen to function on the level of the organization as a whole, and in cooperation with formal leaders. Drawing on existing theory of leadership in creative and professional contexts, this cooperation can be specified to concern task structuring. The informal leaders in our study...... contributed to task structuring through sensemaking activities, while formal leaders focused on aspects such as clarifying output expectations, providing feedback, project structure, and diversity....

  16. Serum peptide/protein profiling by mass spectrometry provides diagnostic information independently of CA125 in women with an ovarian tumor

    DEFF Research Database (Denmark)

    Callesen, Anne; Madsen, Jonna S; Iachina, Maria

    2010-01-01

    In the present study, the use of a robust and sensitive mass spectrometry based protein profiling analysis was tested as diagnostic tools for women with an ovarian tumor. The potential additional diagnostic value of serum protein profiles independent of the information provided by CA125 were also...... investigated. Protein profiles of 113 serum samples from women with an ovarian tumor (54 malign and 59 benign) were generated using MALDI-TOF MS. A total of 98 peaks with a significant difference (pwomen with benign tumors/cysts and malignant ovarian tumors were identified. After...... average linkage clustering, a profile of 46 statistical significant mass peaks was identified to distinguish malignant tumors and benign tumors/cysts. In the subgroup of women with normal CA125 values (

  17. Analyzing the security of an existing computer system

    Science.gov (United States)

    Bishop, M.

    1986-01-01

    Most work concerning secure computer systems has dealt with the design, verification, and implementation of provably secure computer systems, or has explored ways of making existing computer systems more secure. The problem of locating security holes in existing systems has received considerably less attention; methods generally rely on thought experiments as a critical step in the procedure. The difficulty is that such experiments require that a large amount of information be available in a format that makes correlating the details of various programs straightforward. This paper describes a method of providing such a basis for the thought experiment by writing a special manual for parts of the operating system, system programs, and library subroutines.

  18. Measurements of Protein Crystal Face Growth Rates

    Science.gov (United States)

    Gorti, S.

    2014-01-01

    Protein crystal growth rates will be determined for several hyperthermophile proteins. The growth rates will be assessed using available theoretical models, including kinetic roughening. If/when kinetic roughening supersaturations are established, determinations of protein crystal quality over a range of supersaturations will also be assessed. The results of our ground based effort may well address the existence of a correlation between fundamental growth mechanisms and protein crystal quality.

  19. 77 FR 62267 - Proposed Extension of Existing Information Collection; Gamma Radiation Surveys

    Science.gov (United States)

    2012-10-12

    ... debilitating occupational diseases. Natural sources include rocks, soils, and ground water. Gamma radiation..., electronic, mechanical, or other technological collection techniques or other forms of information technology...

  20. mPLR-Loc: an adaptive decision multi-label classifier based on penalized logistic regression for protein subcellular localization prediction.

    Science.gov (United States)

    Wan, Shibiao; Mak, Man-Wai; Kung, Sun-Yuan

    2015-03-15

    Proteins located in appropriate cellular compartments are of paramount importance to exert their biological functions. Prediction of protein subcellular localization by computational methods is required in the post-genomic era. Recent studies have been focusing on predicting not only single-location proteins but also multi-location proteins. However, most of the existing predictors are far from effective for tackling the challenges of multi-label proteins. This article proposes an efficient multi-label predictor, namely mPLR-Loc, based on penalized logistic regression and adaptive decisions for predicting both single- and multi-location proteins. Specifically, for each query protein, mPLR-Loc exploits the information from the Gene Ontology (GO) database by using its accession number (AC) or the ACs of its homologs obtained via BLAST. The frequencies of GO occurrences are used to construct feature vectors, which are then classified by an adaptive decision-based multi-label penalized logistic regression classifier. Experimental results based on two recent stringent benchmark datasets (virus and plant) show that mPLR-Loc remarkably outperforms existing state-of-the-art multi-label predictors. In addition to being able to rapidly and accurately predict subcellular localization of single- and multi-label proteins, mPLR-Loc can also provide probabilistic confidence scores for the prediction decisions. For readers' convenience, the mPLR-Loc server is available online (http://bioinfo.eie.polyu.edu.hk/mPLRLocServer). Copyright © 2014 Elsevier Inc. All rights reserved.
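
The feature-construction step described above (counting how often each GO term occurs for a query protein) can be sketched roughly as follows. This is only an illustration under stated assumptions: the function name `go_frequency_vector`, the fixed vocabulary, and the example hit list are hypothetical and are not taken from mPLR-Loc, and the classifier and adaptive decision step are not shown.

```python
from collections import Counter
from typing import List, Sequence


def go_frequency_vector(go_hits: Sequence[str], vocabulary: Sequence[str]) -> List[int]:
    """Count occurrences of each vocabulary GO term among the hits for one protein.

    go_hits    -- GO term IDs retrieved for the query protein (hypothetical input,
                  e.g. gathered via its accession number or those of its homologs)
    vocabulary -- ordered list of GO term IDs defining the feature dimensions
    """
    counts = Counter(go_hits)
    # Terms absent from the hits get a frequency of zero.
    return [counts[term] for term in vocabulary]


# Illustrative GO term IDs only; the vocabulary used by any real predictor will differ.
vocabulary = ["GO:0005634", "GO:0005737", "GO:0005739"]
hits = ["GO:0005634", "GO:0005737", "GO:0005634"]
print(go_frequency_vector(hits, vocabulary))
```

Keeping the vocabulary as an ordered list fixes the meaning of each vector position, so vectors from different proteins can be passed to the same classifier.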

  1. Completion of autobuilt protein models using a database of protein fragments

    International Nuclear Information System (INIS)

    Cowtan, Kevin

    2012-01-01

    Two developments in the process of automated protein model building in the Buccaneer software are described: the use of a database of protein fragments in improving the model completeness and the assembly of disconnected chain fragments into complete molecules. Two developments in the process of automated protein model building in the Buccaneer software are presented. A general-purpose library for protein fragments of arbitrary size is described, with a highly optimized search method allowing the use of a larger database than in previous work. The problem of assembling an autobuilt model into complete chains is discussed. This involves the assembly of disconnected chain fragments into complete molecules and the use of the database of protein fragments in improving the model completeness. Assembly of fragments into molecules is a standard step in existing model-building software, but the methods have not received detailed discussion in the literature.

  2. Binomial probability distribution model-based protein identification algorithm for tandem mass spectrometry utilizing peak intensity information.

    Science.gov (United States)

    Xiao, Chuan-Le; Chen, Xiao-Zhou; Du, Yang-Li; Sun, Xuesong; Zhang, Gong; He, Qing-Yu

    2013-01-04

    Mass spectrometry has become one of the most important technologies in proteomic analysis. Tandem mass spectrometry (LC-MS/MS) is a major tool for the analysis of peptide mixtures from protein samples. The key step of MS data processing is the identification of peptides from experimental spectra by searching public sequence databases. Although a number of algorithms to identify peptides from MS/MS data have been already proposed, e.g. Sequest, OMSSA, X!Tandem, Mascot, etc., they are mainly based on statistical models considering only peak-matches between experimental and theoretical spectra, but not peak intensity information. Moreover, different algorithms gave different results from the same MS data, implying their probable incompleteness and questionable reproducibility. We developed a novel peptide identification algorithm, ProVerB, based on a binomial probability distribution model of protein tandem mass spectrometry combined with a new scoring function, making full use of peak intensity information and, thus, enhancing the ability of identification. Compared with Mascot, Sequest, and SQID, ProVerB identified significantly more peptides from LC-MS/MS data sets than the current algorithms at 1% False Discovery Rate (FDR) and provided more confident peptide identifications. ProVerB is also compatible with various platforms and experimental data sets, showing its robustness and versatility. The open-source program ProVerB is available at http://bioinformatics.jnu.edu.cn/software/proverb/ .
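
To make the idea of a binomial peak-match model concrete, the sketch below computes the probability of seeing at least k matched peaks by chance and turns it into a score. It is a generic illustration, not ProVerB's scoring function: the names `binomial_tail` and `match_score`, the random-match probability, and the -10 log10 scaling are assumptions made here, and peak intensities (which the abstract says ProVerB uses) are deliberately left out.

```python
from math import comb, log10


def binomial_tail(n: int, k: int, p: float) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if k <= 0:
        return 1.0
    if k > n:
        return 0.0
    return sum(comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k, n + 1))


def match_score(n_theoretical: int, n_matched: int, p_random: float) -> float:
    """Score a peptide-spectrum match as -10*log10 of the chance-match probability.

    n_theoretical -- number of theoretical fragment peaks for the candidate peptide
    n_matched     -- how many of them were found in the experimental spectrum
    p_random      -- assumed probability that a single peak matches by chance
    """
    if not 0.0 < p_random < 1.0:
        raise ValueError("p_random must lie strictly between 0 and 1")
    if not 0 <= n_matched <= n_theoretical:
        raise ValueError("n_matched must lie between 0 and n_theoretical")
    return -10.0 * log10(binomial_tail(n_theoretical, n_matched, p_random))


# More matched peaks out of the same 20 gives a higher score.
print(match_score(20, 5, 0.05), match_score(20, 10, 0.05))
```

A score of this kind depends only on the number of matches, which is the limitation of peak-match-only models that the abstract points out.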

  3. Ultra-high resolution protein crystallography

    International Nuclear Information System (INIS)

    Takeda, Kazuki; Hirano, Yu; Miki, Kunio

    2010-01-01

    Many protein structures have been determined by X-ray crystallography and deposited with the Protein Data Bank. However, these structures at usual resolution (1.5 < d < 3.0 Å) are insufficient in their precision and quantity for elucidating the molecular mechanism of protein functions directly from structural information. Several studies at ultra-high resolution (d < 0.8 Å) have been performed with synchrotron radiation in the last decade. The highest resolution achieved for a protein crystal is 0.54 Å, for a small protein, crambin. In such high-resolution crystals, almost all hydrogen atoms of proteins and some hydrogen atoms of bound water molecules are experimentally observed. In addition, outer-shell electrons of proteins can be analyzed by the multipole refinement procedure. However, the influence of X-rays should be precisely estimated in order to derive meaningful information from the crystallographic results. In this review, we summarize refinement procedures, current status and perspectives for ultra-high resolution protein crystallography. (author)

  4. Structural deformation upon protein-protein interaction: a structural alphabet approach.

    Science.gov (United States)

    Martin, Juliette; Regad, Leslie; Lecornet, Hélène; Camproux, Anne-Claude

    2008-02-28

    In a number of protein-protein complexes, the 3D structures of bound and unbound partners significantly differ, supporting the induced fit hypothesis for protein-protein binding. In this study, we explore the induced fit modifications on a set of 124 proteins available in both bound and unbound forms, in terms of local structure. The local structure is described thanks to a structural alphabet of 27 structural letters that allows a detailed description of the backbone. Using a control set to distinguish induced fit from experimental error and natural protein flexibility, we show that the fraction of structural letters modified upon binding is significantly greater than in the control set (36% versus 28%). This proportion is even greater in the interface regions (41%). Interface regions preferentially involve coils. Our analysis further reveals that some structural letters in coil are not favored in the interface. We show that certain structural letters in coil are particularly subject to modifications at the interface, and that the severity of structural change also varies. This information is used to derive a structural letter substitution matrix that summarizes the local structural changes observed in our data set. We also illustrate the usefulness of our approach to identify common binding motifs in unrelated proteins. Our study provides qualitative information about induced fit. These results could be of help for flexible docking.
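
The two quantities mentioned above, the fraction of structural letters modified upon binding and a letter substitution table, can be illustrated with a small sketch. It assumes each protein is already encoded as an aligned string of structural letters for its unbound and bound forms. The function names and the toy strings are hypothetical, and the table holds raw counts only, whereas the substitution matrix in the study may be normalized differently.

```python
from collections import Counter
from typing import Iterable, Tuple


def modified_fraction(unbound: str, bound: str) -> float:
    """Fraction of aligned positions whose structural letter differs after binding."""
    if len(unbound) != len(bound) or not unbound:
        raise ValueError("sequences must be non-empty and of equal length")
    changed = sum(a != b for a, b in zip(unbound, bound))
    return changed / len(unbound)


def substitution_counts(pairs: Iterable[Tuple[str, str]]) -> Counter:
    """Count (unbound letter, bound letter) pairs over all aligned positions."""
    counts: Counter = Counter()
    for unbound, bound in pairs:
        if len(unbound) != len(bound):
            raise ValueError("sequences in a pair must have equal length")
        counts.update(zip(unbound, bound))
    return counts


# Toy structural-letter strings, not real data.
print(modified_fraction("AABBC", "AABCC"))
print(substitution_counts([("AABBC", "AABCC")]))
```

Comparing the same fraction on a control set of repeated unbound structures is what separates induced fit from experimental error and natural flexibility in the abstract's design.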

  5. Structural deformation upon protein-protein interaction: A structural alphabet approach

    Directory of Open Access Journals (Sweden)

    Lecornet Hélène

    2008-02-01

    Full Text Available Abstract Background: In a number of protein-protein complexes, the 3D structures of bound and unbound partners significantly differ, supporting the induced fit hypothesis for protein-protein binding. Results: In this study, we explore the induced fit modifications on a set of 124 proteins available in both bound and unbound forms, in terms of local structure. The local structure is described thanks to a structural alphabet of 27 structural letters that allows a detailed description of the backbone. Using a control set to distinguish induced fit from experimental error and natural protein flexibility, we show that the fraction of structural letters modified upon binding is significantly greater than in the control set (36% versus 28%). This proportion is even greater in the interface regions (41%). Interface regions preferentially involve coils. Our analysis further reveals that some structural letters in coil are not favored in the interface. We show that certain structural letters in coil are particularly subject to modifications at the interface, and that the severity of structural change also varies. This information is used to derive a structural letter substitution matrix that summarizes the local structural changes observed in our data set. We also illustrate the usefulness of our approach to identify common binding motifs in unrelated proteins. Conclusion: Our study provides qualitative information about induced fit. These results could be of help for flexible docking.

  6. Additional phase information from UV damage of selenomethionine labelled proteins

    Energy Technology Data Exchange (ETDEWEB)

    Sanctis, Daniele de [ESRF, Structural Biology Group, 6 rue Jules Horowitz, 38043 Grenoble Cedex (France); Tucker, Paul A.; Panjikar, Santosh, E-mail: panjikar@embl-hamburg.de [EMBL Hamburg Outstation, c/o DESY, Notkestrasse 85, D-22603 Hamburg (Germany)

    2011-05-01

    Successful examples of ultraviolet radiation-damage-induced phasing with anomalous scattering from selenomethionine protein crystals have been demonstrated. Currently, selenium is the most widely used phasing vehicle for experimental phasing, either by single anomalous scattering or multiple-wavelength anomalous dispersion (MAD) procedures. The use of the single isomorphous replacement anomalous scattering (SIRAS) phasing procedure with selenomethionine containing proteins is not so commonly used, as it requires isomorphous native data. Here it is demonstrated that isomorphous differences can be measured from intensity changes measured from a selenium labelled protein crystal before and after UV exposure. These can be coupled with the anomalous signal from the dataset collected at the selenium absorption edge to obtain SIRAS phases in a UV-RIPAS phasing experiment. The phasing procedure for two selenomethionine proteins, the feruloyl esterase module of xylanase 10B from Clostridium thermocellum and the Mycobacterium tuberculosis chorismate synthase, have been investigated using datasets collected near the absorption edge of selenium before and after UV radiation. The utility of UV radiation in measuring radiation damage data for isomorphous differences is highlighted and it is shown that, after such measurements, the UV-RIPAS procedure yields comparable phase sets with those obtained from the conventional MAD procedure. The results presented are encouraging for the development of alternative phasing approaches for selenomethionine proteins in difficult cases.

  7. NetTurnP – Neural Network Prediction of Beta-turns by Use of Evolutionary Information and Predicted Protein Sequence Features

    DEFF Research Database (Denmark)

    Petersen, Bent; Lundegaard, Claus; Petersen, Thomas Nordahl

    2010-01-01

    is the highest reported performance on a two-class prediction of β-turn and not-β-turn. Furthermore NetTurnP shows improved performance on some of the specific β-turn types. In the present work, neural network methods have been trained to predict β-turn or not and individual β-turn types from the primary amino......β-turns are the most common type of non-repetitive structures, and constitute on average 25% of the amino acids in proteins. The formation of β-turns plays an important role in protein folding, protein stability and molecular recognition processes. In this work we present the neural network method...... NetTurnP, for prediction of two-class β-turns and prediction of the individual β-turn types, by use of evolutionary information and predicted protein sequence features. It has been evaluated against a commonly used dataset BT426, and achieves a Matthews correlation coefficient of 0.50, which...
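
For reference, the Matthews correlation coefficient quoted in this record can be computed from a two-class confusion matrix as sketched below. The function name and the convention of returning 0.0 when the denominator vanishes are choices made for this illustration and are not taken from NetTurnP. The counts in the usage line are made up.

```python
from math import sqrt


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient for a two-class confusion matrix.

    tp, tn, fp, fn -- counts of true positives, true negatives,
                      false positives and false negatives
    """
    denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        # Convention assumed here: undefined cases are reported as 0.0.
        return 0.0
    return (tp * tn - fp * fn) / denominator


# Made-up counts for a beta-turn / not-beta-turn classifier.
print(matthews_corrcoef(tp=60, tn=250, fp=40, fn=50))
```

The coefficient ranges from -1 to 1 and stays informative when the two classes are unbalanced, which matters here because β-turn residues make up only about a quarter of the amino acids.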

  8. 78 FR 56940 - Agency Information Collection Activities; Existing Collection-Reinstatement, Comments Requested...

    Science.gov (United States)

    2013-09-16

    ... Services Division (CJIS), Biometric Services Section, Customer Support Unit, Module E-1, 1000 Custer Hollow... techniques of other forms of information technology, e.g., permitting electronic submission of responses...

  9. HIV protein sequence hotspots for crosstalk with host hub proteins.

    Directory of Open Access Journals (Sweden)

    Mahdi Sarmady

    Full Text Available HIV proteins target host hub proteins for transient binding interactions. The presence of viral proteins in the infected cell results in out-competition of host proteins in their interaction with hub proteins, drastically affecting cell physiology. Functional genomics and interactome datasets can be used to quantify the sequence hotspots on the HIV proteome mediating interactions with host hub proteins. In this study, we used the HIV and human interactome databases to identify HIV targeted host hub proteins and their host binding partners (H2). We developed a high throughput computational procedure utilizing motif discovery algorithms on sets of protein sequences, including sequences of HIV and H2 proteins. We identified as HIV sequence hotspots those linear motifs that are highly conserved on HIV sequences and at the same time have a statistically enriched presence on the sequences of H2 proteins. The HIV protein motifs discovered in this study are expressed by subsets of H2 host proteins potentially outcompeted by HIV proteins. A large subset of these motifs is involved in cleavage, nuclear localization, phosphorylation, and transcription factor binding events. Many such motifs are clustered on an HIV sequence in the form of hotspots. The sequential positions of these hotspots are consistent with the curated literature on phenotype altering residue mutations, as well as with existing binding site data. The hotspot map produced in this study is the first global portrayal of HIV motifs involved in altering the host protein network at highly connected hub nodes.

  10. Future Protein Supply and Demand: Strategies and Factors Influencing a Sustainable Equilibrium

    Directory of Open Access Journals (Sweden)

    Maeve Henchion

    2017-07-01

    Full Text Available A growing global population, combined with factors such as changing socio-demographics, will place increased pressure on the world’s resources to provide not only more but also different types of food. Increased demand for animal-based protein in particular is expected to have a negative environmental impact, generating greenhouse gas emissions, requiring more water and more land. Addressing this “perfect storm” will necessitate more sustainable production of existing sources of protein as well as alternative sources for direct human consumption. This paper outlines some potential demand scenarios and provides an overview of selected existing and novel protein sources in terms of their potential to sustainably deliver protein for the future, considering drivers and challenges relating to nutritional, environmental, and technological and market/consumer domains. It concludes that different factors influence the potential of existing and novel sources. Existing protein sources are primarily hindered by their negative environmental impacts with some concerns around health. However, they offer social and economic benefits, and have a high level of consumer acceptance. Furthermore, recent research emphasizes the role of livestock as part of the solution to greenhouse gas emissions, and indicates that animal-based protein has an important role as part of a sustainable diet and as a contributor to food security. Novel proteins require the development of new value chains, and attention to issues such as production costs, food safety, scalability and consumer acceptance. Furthermore, positive environmental impacts cannot be assumed with novel protein sources and care must be taken to ensure that comparisons between novel and existing protein sources are valid. Greater alignment of political forces, and the involvement of wider stakeholders in a governance role, as well as development/commercialization role, is required to address both sources of

  11. Future Protein Supply and Demand: Strategies and Factors Influencing a Sustainable Equilibrium

    Science.gov (United States)

    Henchion, Maeve; Hayes, Maria; Mullen, Anne Maria; Fenelon, Mark; Tiwari, Brijesh

    2017-01-01

    A growing global population, combined with factors such as changing socio-demographics, will place increased pressure on the world’s resources to provide not only more but also different types of food. Increased demand for animal-based protein in particular is expected to have a negative environmental impact, generating greenhouse gas emissions, requiring more water and more land. Addressing this “perfect storm” will necessitate more sustainable production of existing sources of protein as well as alternative sources for direct human consumption. This paper outlines some potential demand scenarios and provides an overview of selected existing and novel protein sources in terms of their potential to sustainably deliver protein for the future, considering drivers and challenges relating to nutritional, environmental, and technological and market/consumer domains. It concludes that different factors influence the potential of existing and novel sources. Existing protein sources are primarily hindered by their negative environmental impacts with some concerns around health. However, they offer social and economic benefits, and have a high level of consumer acceptance. Furthermore, recent research emphasizes the role of livestock as part of the solution to greenhouse gas emissions, and indicates that animal-based protein has an important role as part of a sustainable diet and as a contributor to food security. Novel proteins require the development of new value chains, and attention to issues such as production costs, food safety, scalability and consumer acceptance. Furthermore, positive environmental impacts cannot be assumed with novel protein sources and care must be taken to ensure that comparisons between novel and existing protein sources are valid. Greater alignment of political forces, and the involvement of wider stakeholders in a governance role, as well as development/commercialization role, is required to address both sources of protein and ensure

  12. Towards New Ambient Light Systems: a Close Look at Existing Encodings of Ambient Light Systems

    Directory of Open Access Journals (Sweden)

    Andrii Matviienko

    2015-10-01

    Full Text Available Ambient systems provide information in the periphery of a user’s attention. Their aim is to present information as unobtrusively as possible to avoid interrupting primary tasks (e.g. writing or reading). In recent years, light has been used to create ambient systems to display information. Examples of ambient light systems range from simple notification systems, such as displaying messages or calendar event reminders, to more complex systems, such as those conveying information about health activity tracking. However, for ambient light systems, there is a broad design space that lacks guidelines on when to make use of light displays and how to design them. In this paper we provide a systematic overview of existing ambient light systems over four identified information classes derived from 72 existing ambient light systems. The most prominent encoding parameters among the surveyed ambient light systems are color, brightness, and their combination. By analyzing existing ambient light systems, we provide a first step towards developing guidelines for designing future ambient light systems.

  13. Information asymmetries, information externalities, oil companies strategies and oil exploration information efficiency

    International Nuclear Information System (INIS)

    Nyouki, E.

    1998-07-01

    For both economics in general and energy economics in particular, it is important to achieve efficient oil exploration. A pragmatic approach is to use the concept of information efficiency, which means that tracts are drilled in decreasing order of estimated profitability, with estimates based on the best (that is, most reliable) available information. What does 'best available information' mean? It corresponds either to the information held by the most experienced oil companies (because information asymmetries favor these companies), or to information revealed by drilling, which allows the probabilities of success on neighboring tracts with similar geological features to be revised (because of information externalities). In view of these information asymmetries and externalities, we say that exploration is information efficient when, on the one hand, initial exploration choices are directed by the most experienced companies and, on the other hand, during the drilling phase, companies facing the information externality adopt sequential drilling, i.e. they avoid both over-investment and strategic under-investment. The question addressed in this thesis is whether oil companies, when placed under normal competitive conditions, are likely to bring about a state of information efficiency in exploration; the analysis is conducted both theoretically and empirically. (author)

  14. Protein sequence comparison and protein evolution

    Energy Technology Data Exchange (ETDEWEB)

    Pearson, W.R. [Univ. of Virginia, Charlottesville, VA (United States). Dept. of Biochemistry

    1995-12-31

    This tutorial was one of eight tutorials selected to be presented at the Third International Conference on Intelligent Systems for Molecular Biology, which was held in the United Kingdom from July 16 to 19, 1995. This tutorial examines how the information conserved during the evolution of a protein molecule can be used to reliably infer homology, and thus a shared protein fold and possibly a shared active site or function. The authors start by reviewing a geological/evolutionary time scale. Next they look at the evolution of several protein families. During the tutorial, these families will be used to demonstrate that homologous protein ancestry can be inferred with confidence. They also examine different modes of protein evolution and consider some hypotheses that have been presented to explain the very earliest events in protein evolution. The next part of the tutorial will examine the technical aspects of protein sequence comparison. Both optimal and heuristic algorithms, and the associated parameters that are used to characterize protein sequence similarities, are discussed. Perhaps more importantly, they survey the statistics of local similarity scores, and how these statistics can be used both to improve the selectivity of a search and to evaluate the significance of a match. They then examine distantly related members of three protein families: the serine proteases, the glutathione transferases, and the G-protein-coupled receptors (GCRs). Finally, they discuss how sequence similarity can be used to examine internal repeated or mosaic structures in proteins.
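
    As an illustration of what an "optimal" local similarity score is, the sketch below implements Smith-Waterman local alignment scoring. The abstract does not name the algorithms the tutorial covers, so the choice of Smith-Waterman, the function name, and the simple match/mismatch/linear-gap scores are all assumptions made for this example; practical protein searches typically use a substitution matrix and affine gap penalties instead.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score between sequences a and b.

    Scoring values are illustrative assumptions, not taken from the tutorial.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # H[i][j] = best score of a local alignment ending at a[i-1], b[j-1]
    H = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            s = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(
                0,                    # start a new local alignment
                H[i - 1][j - 1] + s,  # align a[i-1] with b[j-1]
                H[i - 1][j] + gap,    # gap in b
                H[i][j - 1] + gap,    # gap in a
            )
            best = max(best, H[i][j])
    return best
```

    Because every cell is floored at zero, the score reflects only the best-matching region, which is why local scores are suited to finding a shared domain inside otherwise unrelated sequences.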

  15. Extracting Various Classes of Data From Biological Text Using the Concept of Existence Dependency.

    Science.gov (United States)

    Taha, Kamal

    2015-11-01

    One of the key goals of biological natural language processing (NLP) is the automatic information extraction from biomedical publications. Most current constituency and dependency parsers overlook the semantic relationships between the constituents comprising a sentence and may not be well suited for capturing complex long-distance dependences. We propose in this paper a hybrid constituency-dependency parser for biological NLP information extraction called EDCC. EDCC aims at enhancing the state of the art of biological text mining by applying novel linguistic computational techniques that overcome the limitations of current constituency and dependency parsers outlined earlier, as follows: 1) it determines the semantic relationship between each pair of constituents in a sentence using novel semantic rules; and 2) it applies a semantic relationship extraction model that extracts information from different structural forms of constituents in sentences. EDCC can be used to extract different types of data from biological texts for purposes such as protein function prediction, genetic network construction, and protein-protein interaction detection. We evaluated the quality of EDCC by comparing it experimentally with six systems. Results showed marked improvement.

  16. Structure and expression of the maize (Zea mays L.) SUN-domain protein gene family: evidence for the existence of two divergent classes of SUN proteins in plants

    Directory of Open Access Journals (Sweden)

    Simmons Carl R

    2010-12-01

    Full Text Available Abstract Background The nuclear envelope that separates the contents of the nucleus from the cytoplasm provides a surface for chromatin attachment and organization of the cortical nucleoplasm. Proteins associated with it have been well characterized in many eukaryotes but not in plants. SUN (Sad1p/Unc-84) domain proteins reside in the inner nuclear membrane and function with other proteins to form a physical link between the nucleoskeleton and the cytoskeleton. These bridges transfer forces across the nuclear envelope and are increasingly recognized to play roles in nuclear positioning, nuclear migration, cell cycle-dependent breakdown and reformation of the nuclear envelope, telomere-led nuclear reorganization during meiosis, and karyogamy. Results We found and characterized a family of maize SUN-domain proteins, starting with a screen of maize genomic sequence data. We characterized five different maize ZmSUN genes (ZmSUN1-5), which fell into two classes (probably of ancient origin, as they are also found in other monocots, eudicots, and even mosses). The first (ZmSUN1, 2), here designated canonical C-terminal SUN-domain (CCSD), includes structural homologs of the animal and fungal SUN-domain protein genes. The second (ZmSUN3, 4, 5), here designated plant-prevalent mid-SUN 3 transmembrane (PM3), includes a novel but conserved structural variant SUN-domain protein gene class. Microarray-based expression analyses revealed an intriguing pollen-preferred expression for ZmSUN5 mRNA but low-level expression (50-200 parts per ten million) in multiple tissues for all the others. Cloning and characterization of a full-length cDNA for a PM3-type maize gene, ZmSUN4, is described. Peptide antibodies to ZmSUN3, 4 were used in western-blot and cell-staining assays to show that they are expressed and show concentrated staining at the nuclear periphery. 
Conclusions The maize genome encodes and expresses at least five different SUN-domain proteins, of which the PM3

  17. Artificial Association of Pre-stored Information to Generate a Qualitatively New Memory

    Directory of Open Access Journals (Sweden)

    Noriaki Ohkawa

    2015-04-01

    Full Text Available Memory is thought to be stored in the brain as an ensemble of cells activated during learning. Although optical stimulation of a cell ensemble triggers the retrieval of the corresponding memory, it is unclear how the association of information occurs at the cell ensemble level. Using optogenetic stimulation without any sensory input in mice, we found that an artificial association between stored, non-related contextual, and fear information was generated through the synchronous activation of distinct cell ensembles corresponding to the stored information. This artificial association shared characteristics with physiologically associated memories, such as N-methyl-D-aspartate receptor activity and protein synthesis dependence. These findings suggest that the association of information is achieved through the synchronous activity of distinct cell ensembles. This mechanism may underlie memory updating by incorporating novel information into pre-existing networks to form qualitatively new memories.

  18. Protein Data Bank (PDB)

    Data.gov (United States)

    U.S. Department of Health & Human Services — The Protein Data Bank (PDB) archive is the single worldwide repository of information about the 3D structures of large biological molecules, including proteins and...

  19. Computational Identification of Protein Pupylation Sites by Using Profile-Based Composition of k-Spaced Amino Acid Pairs.

    Directory of Open Access Journals (Sweden)

    Md Mehedi Hasan

    Full Text Available Prokaryotic proteins are regulated by pupylation, a type of post-translational modification that contributes to cellular function in bacterial organisms. In the pupylation process, prokaryotic ubiquitin-like protein (Pup) tagging is functionally analogous to ubiquitination: it tags target proteins for proteasomal degradation. To date, several experimental methods have been developed to identify pupylated proteins and their pupylation sites, but these experimental methods are generally laborious and costly. Therefore, computational methods that can accurately predict potential pupylation sites based on protein sequence information are highly desirable. In this paper, a novel predictor termed pbPUP has been developed for accurate prediction of pupylation sites. In particular, a sophisticated sequence encoding scheme [i.e. the profile-based composition of k-spaced amino acid pairs (pbCKSAAP)] is used to represent the sequence patterns and evolutionary information of the sequence fragments surrounding pupylation sites. Then, a Support Vector Machine (SVM) classifier is trained using the pbCKSAAP encoding scheme. The final pbPUP predictor achieves an AUC value of 0.849 in 10-fold cross-validation tests and outperforms other existing predictors on a comprehensive independent test dataset. The proposed method is anticipated to be a helpful computational resource for the prediction of pupylation sites. The web server and curated datasets in this study are freely available at http://protein.cau.edu.cn/pbPUP/.
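
    To make the "composition of k-spaced amino acid pairs" idea concrete, here is a minimal sketch of the plain (sequence-only) CKSAAP encoding of a fragment. It is not the pbCKSAAP scheme itself: the abstract does not describe how the evolutionary profile is folded in, so that step is omitted. The function name `cksaap`, the default `k_max=2`, and the handling of non-standard residues (pairs are skipped) are assumptions made for illustration.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues

def cksaap(fragment, k_max=2):
    """Encode a fragment as frequencies of residue pairs separated by k residues.

    Returns 400 * (k_max + 1) values: one block of 400 pair frequencies
    for each spacing k = 0 .. k_max (hypothetical default).
    """
    features = []
    for k in range(k_max + 1):
        counts = {a + b: 0 for a, b in product(AMINO_ACIDS, repeat=2)}
        total = len(fragment) - k - 1  # number of k-spaced pairs in the fragment
        for i in range(max(total, 0)):
            pair = fragment[i] + fragment[i + k + 1]
            if pair in counts:  # skip pairs containing non-standard symbols
                counts[pair] += 1
        features.extend(counts[p] / total if total > 0 else 0.0 for p in counts)
    return features
```

    A vector of this kind, computed for the window around each candidate lysine, is the sort of fixed-length input that a classifier such as an SVM can be trained on.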

  20. An ensemble method for predicting subnuclear localizations from primary protein structures.

    Directory of Open Access Journals (Sweden)

    Guo Sheng Han

    Full Text Available BACKGROUND: Predicting protein subnuclear localization is a challenging problem. Some previous works based on non-sequence information including Gene Ontology annotations and kernel fusion have respective limitations. The aim of this work is twofold: one is to propose a novel individual feature extraction method; another is to develop an ensemble method to improve prediction performance using comprehensive information represented in the form of high dimensional feature vector obtained by 11 feature extraction methods. METHODOLOGY/PRINCIPAL FINDINGS: A novel two-stage multiclass support vector machine is proposed to predict protein subnuclear localizations. It only considers those feature extraction methods based on amino acid classifications and physicochemical properties. In order to speed up our system, an automatic search method for the kernel parameter is used. The prediction performance of our method is evaluated on four datasets: Lei dataset, multi-localization dataset, SNL9 dataset and a new independent dataset. The overall accuracy of prediction for 6 localizations on Lei dataset is 75.2% and that for 9 localizations on SNL9 dataset is 72.1% in the leave-one-out cross validation, 71.7% for the multi-localization dataset and 69.8% for the new independent dataset, respectively. Comparisons with those existing methods show that our method performs better for both single-localization and multi-localization proteins and achieves more balanced sensitivities and specificities on large-size and small-size subcellular localizations. The overall accuracy improvements are 4.0% and 4.7% for single-localization proteins and 6.5% for multi-localization proteins. The reliability and stability of our classification model are further confirmed by permutation analysis. CONCLUSIONS: It can be concluded that our method is effective and valuable for predicting protein subnuclear localizations. 
A web server has been designed to implement the proposed method

  1. 77 FR 51578 - Agency Information Collection Activities: Existing Collection; Comments Requested: Extension of a...

    Science.gov (United States)

    2012-08-24

    ... First and last names Demographic information: Sex; race; Hispanic origin; education level Offense type... military service, date and type of last discharge BJS uses the information gathered in NCRP in published...

  2. Biophysics of protein evolution and evolutionary protein biophysics

    Science.gov (United States)

    Sikosek, Tobias; Chan, Hue Sun

    2014-01-01

    The study of molecular evolution at the level of protein-coding genes often entails comparing large datasets of sequences to infer their evolutionary relationships. Despite the importance of a protein's structure and conformational dynamics to its function and thus its fitness, common phylogenetic methods embody minimal biophysical knowledge of proteins. To underscore the biophysical constraints on natural selection, we survey effects of protein mutations, highlighting the physical basis for marginal stability of natural globular proteins and how requirement for kinetic stability and avoidance of misfolding and misinteractions might have affected protein evolution. The biophysical underpinnings of these effects have been addressed by models with an explicit coarse-grained spatial representation of the polypeptide chain. Sequence–structure mappings based on such models are powerful conceptual tools that rationalize mutational robustness, evolvability, epistasis, promiscuous function performed by ‘hidden’ conformational states, resolution of adaptive conflicts and conformational switches in the evolution from one protein fold to another. Recently, protein biophysics has been applied to derive more accurate evolutionary accounts of sequence data. Methods have also been developed to exploit sequence-based evolutionary information to predict biophysical behaviours of proteins. The success of these approaches demonstrates a deep synergy between the fields of protein biophysics and protein evolution. PMID:25165599

  3. A comparative approach for the investigation of biological information processing: An examination of the structure and function of computer hard drives and DNA

    OpenAIRE

    D'Onofrio, David J; An, Gary

    2010-01-01

    Abstract Background The robust storage, updating and utilization of information are necessary for the maintenance and perpetuation of dynamic systems. These systems can exist as constructs of metal-oxide semiconductors and silicon, as in a digital computer, or in the "wetware" of organic compounds, proteins and nucleic acids that make up biological organisms. We propose that there are essential functional properties of centralized information-processing systems; for digital computers these pr...

  4. Validation of protein carbonyl measurement: A multi-centre study

    Directory of Open Access Journals (Sweden)

    Edyta Augustyniak

    2015-04-01

    Full Text Available Protein carbonyls are widely analysed as a measure of protein oxidation. Several different methods exist for their determination. A previous study had described orders of magnitude variance that existed when protein carbonyls were analysed in a single laboratory by ELISA using different commercial kits. We have further explored the potential causes of variance in carbonyl analysis in a ring study. A soluble protein fraction was prepared from rat liver and exposed to 0, 5 and 15 min of UV irradiation. Lyophilised preparations were distributed to six different laboratories that routinely undertook protein carbonyl analysis across Europe. ELISA and Western blotting techniques detected an increase in protein carbonyl formation between 0 and 5 min of UV irradiation irrespective of method used. After irradiation for 15 min, less oxidation was detected by half of the laboratories than after 5 min irradiation. Three of the four ELISA carbonyl results fell within 95% confidence intervals. Likely errors in calculating absolute carbonyl values may be attributed to differences in standardisation. Out of up to 88 proteins identified as containing carbonyl groups after tryptic cleavage of irradiated and control liver proteins, only seven were common in all three liver preparations. Lysine and arginine residues modified by carbonyls are likely to be resistant to tryptic proteolysis. Use of a cocktail of proteases may increase the recovery of oxidised peptides. In conclusion, standardisation is critical for carbonyl analysis and heavily oxidised proteins may not be effectively analysed by any existing technique.

  5. Protein Circular Dichroism Data Bank (PCDDB): data bank and website design.

    Science.gov (United States)

    Whitmore, Lee; Janes, Robert W; Wallace, B A

    2006-06-01

    The Protein Circular Dichroism Data Bank (PCDDB) is a new deposition data bank for validated circular dichroism spectra of biomacromolecules. Its aim is to be a resource for the structural biology and bioinformatics communities, providing open access and archiving facilities for circular dichroism and synchrotron radiation circular dichroism spectra. It is named in parallel with the Protein Data Bank (PDB), a long-existing valuable reference data bank for protein crystal and NMR structures. In this article, we discuss the design of the data bank structure and the deposition website located at http://pcddb.cryst.bbk.ac.uk. Our aim is to produce a flexible and comprehensive archive, which enables user-friendly spectral deposition and searching. In the case of a protein whose crystal structure and sequence are known, the PCDDB entry will be linked to the appropriate PDB and sequence data bank files, respectively. It is anticipated that the PCDDB will provide a readily accessible biophysical catalogue of information on folded proteins that may be of value in structural genomics programs, for quality control and archiving in industrial and academic labs, as a resource for programs developing spectroscopic structural analysis methods, and in bioinformatics studies. Copyright 2006 Wiley-Liss, Inc.

  6. Hierarchical organization in aggregates of protein molecules

    DEFF Research Database (Denmark)

    Bohr, Henrik; Kyhle, Anders; Sørensen, Alexis Hammer

    1997-01-01

    of the solution and the density of protein are varied shows the existence of specific growth processes resulting in different branch-like structures. The resulting structures are strongly influenced by the shape of each protein molecule. Lysozyme and ribonuclease are found to form spherical structures...

  7. Existing and Expected Service Quality of Grameenphone Users in Bangladesh

    Directory of Open Access Journals (Sweden)

    Azmat Ullah

    2015-12-01

    Full Text Available Grameenphone (GP) is a market leader in the telecommunication industry in Bangladesh. This study investigates the existing and expected service quality of Grameenphone users in Bangladesh. The study reveals that there are significant gaps between existing and expected perceived service for network, 3G, customer care, physical facilities, billing cost, information service, mobile banking and GP offers. The study concludes that customer satisfaction is a dynamic phenomenon. Maintaining the desired level of customer satisfaction requires proactive corporate responsiveness in accessing, building and retaining satisfied customers for sustainable competitive advantage in the marketplace.

  8. Prediction of protein-protein interaction sites in sequences and 3D structures by random forests.

    Directory of Open Access Journals (Sweden)

    Mile Sikić

    2009-01-01

    Full Text Available Identifying interaction sites in proteins provides important clues to the function of a protein and is becoming increasingly relevant in topics such as systems biology and drug discovery. Although there are numerous papers on the prediction of interaction sites using information derived from structure, there are only a few case reports on the prediction of interaction residues based solely on protein sequence. Here, a sliding window approach is combined with the Random Forests method to predict protein interaction sites using (i) a combination of sequence- and structure-derived parameters and (ii) sequence information alone. For sequence-based prediction we achieved a precision of 84% with a 26% recall and an F-measure of 40%. When combined with structural information, the prediction performance increases to a precision of 76% and a recall of 38% with an F-measure of 51%. We also present an attempt to rationalize the sliding window size and demonstrate that a nine-residue window is the most suitable for predictor construction. Finally, we demonstrate the applicability of our prediction methods by modeling the Ras-Raf complex using predicted interaction sites as target binding interfaces. Our results suggest that it is possible to predict protein interaction sites with quite a high accuracy using only sequence information.
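
    The reported figures can be checked with a short worked example, and the windowing step can be sketched alongside it. The code assumes the F-measure is the balanced harmonic mean of precision and recall (F1), which is consistent with the numbers quoted above; the function names and the padding symbol "X" used at the sequence ends are illustrative assumptions, not details from the paper.

```python
def f_measure(precision, recall):
    """Balanced F-measure (harmonic mean of precision and recall)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

def sliding_windows(sequence, size=9, pad="X"):
    """Return one window of `size` residues centred on each position.

    `size` is assumed odd; ends are padded with a placeholder symbol
    (an assumption) so that every residue gets a full-length window.
    """
    half = size // 2
    padded = pad * half + sequence + pad * half
    return [padded[i:i + size] for i in range(len(sequence))]

# Sequence only: precision 0.84, recall 0.26
print(round(f_measure(0.84, 0.26), 2))
# Sequence plus structure: precision 0.76, recall 0.38
print(round(f_measure(0.76, 0.38), 2))
```

    The comparison shows why the F-measure rises even though precision falls: the gain in recall from 26% to 38% outweighs the loss in precision under a harmonic mean.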

  9. Mining biomarker information in biomedical literature

    Directory of Open Access Journals (Sweden)

    Younesi Erfan

    2012-12-01

    Full Text Available Abstract Background For selection and evaluation of potential biomarkers, inclusion of already published information is of utmost importance. In spite of significant advancements in text- and data-mining techniques, the vast knowledge space of biomarkers in biomedical text has remained unexplored. Existing named entity recognition approaches are not sufficiently selective for the retrieval of biomarker information from the literature. The purpose of this study was to identify textual features that enhance the effectiveness of biomarker information retrieval for different indication areas and diverse end user perspectives. Methods A biomarker terminology was created and further organized into six concept classes. Performance of this terminology was optimized towards balanced selectivity and specificity. The information retrieval performance using the biomarker terminology was evaluated based on various combinations of the terminology's six classes. Further validation of these results was performed on two independent corpora representing two different neurodegenerative diseases. Results The current state of the biomarker terminology contains 119 entity classes supported by 1890 different synonyms. The result of information retrieval shows improved retrieval rate of informative abstracts, which is achieved by including clinical management terms and evidence of gene/protein alterations (e.g. gene/protein expression status or certain polymorphisms) in combination with disease and gene name recognition. When additional filtering through other classes (e.g. diagnostic or prognostic methods) is applied, the typical high number of unspecific search results is significantly reduced. The evaluation results suggest that this approach enables the automated identification of biomarker information in the literature. A demo version of the search engine SCAIView, including the biomarker retrieval, is made available to the public through http

  10. Computational Approaches for Prediction of Pathogen-Host Protein-Protein Interactions

    Directory of Open Access Journals (Sweden)

    Esmaeil Nourani

    2015-02-01

    Full Text Available Infectious diseases are still among the major and prevalent health problems, mostly because of the drug resistance of novel variants of pathogens. Molecular interactions between pathogens and their hosts are the key part of the infection mechanisms. Developing novel antimicrobial therapeutics to fight drug resistance is only possible with a thorough understanding of pathogen-host interaction (PHI) systems. Existing databases, which contain experimentally verified PHI data, suffer from a scarcity of reported interactions because the experiments are technically challenging and time consuming. This has motivated many researchers to address the problem by proposing computational approaches for analysis and prediction of PHIs. The computational methods primarily utilize sequence information, protein structure and known interactions. Classic machine learning techniques are used when there are sufficient known interactions to be used as training data. In the opposite case, transfer and multi-task learning methods are preferred. Here, we present an overview of these computational approaches for PHI prediction, discussing their weaknesses and strengths, with future directions.

  11. Modulation of protein synthesis by polyamines.

    Science.gov (United States)

    Igarashi, Kazuei; Kashiwagi, Keiko

    2015-03-01

    Polyamines are ubiquitous small basic molecules that play important roles in cell growth and viability. Since polyamines mainly exist as a polyamine-RNA complex, we looked for proteins whose synthesis is preferentially stimulated by polyamines at the level of translation, and thus far identified 17 proteins in Escherichia coli and 6 proteins in eukaryotes. The mechanisms of polyamine stimulation of synthesis of these proteins were investigated. In addition, the role of eIF5A, containing hypusine formed from spermidine, on protein synthesis is described. These results clearly indicate that polyamines and eIF5A contribute to cell growth and viability through modulation of protein synthesis. © 2015 International Union of Biochemistry and Molecular Biology.

  12. Can we replace curation with information extraction software?

    Science.gov (United States)

    Karp, Peter D

    2016-01-01

    Can we use programs for automated or semi-automated information extraction from scientific texts as practical alternatives to professional curation? I show that error rates of current information extraction programs are too high to replace professional curation today. Furthermore, current information extraction programs extract single narrow slivers of information, such as individual protein interactions; they cannot extract the large breadth of information extracted by professional curators for databases such as EcoCyc. They also cannot arbitrate among conflicting statements in the literature as curators can. Therefore, funding agencies should not hobble the curation efforts of existing databases on the assumption that a problem that has stymied Artificial Intelligence researchers for more than 60 years will be solved tomorrow. Semi-automated extraction techniques appear to have significantly more potential based on a review of recent tools that enhance curator productivity. But a full cost-benefit analysis for these tools is lacking. Without such analysis it is possible to expend significant effort developing information-extraction tools that automate small parts of the overall curation workflow without achieving a significant decrease in curation costs. © The Author(s) 2016. Published by Oxford University Press.

  13. Quality control methodology for high-throughput protein-protein interaction screening.

    Science.gov (United States)

    Vazquez, Alexei; Rual, Jean-François; Venkatesan, Kavitha

    2011-01-01

    Protein-protein interactions are key to many aspects of the cell, including its cytoskeletal structure, the signaling processes in which it is involved, or its metabolism. Failure to form protein complexes or signaling cascades may sometimes translate into pathologic conditions such as cancer or neurodegenerative diseases. The set of all protein interactions between the proteins encoded by an organism constitutes its protein interaction network, representing a scaffold for biological function. Knowing the protein interaction network of an organism, combined with other sources of biological information, can unravel fundamental biological circuits and may help better understand the molecular basics of human diseases. The protein interaction network of an organism can be mapped by combining data obtained from both low-throughput screens, i.e., "one gene at a time" experiments and high-throughput screens, i.e., screens designed to interrogate large sets of proteins at once. In either case, quality controls are required to deal with the inherent imperfect nature of experimental assays. In this chapter, we discuss experimental and statistical methodologies to quantify error rates in high-throughput protein-protein interactions screens.

  14. Improving accuracy of protein-protein interaction prediction by considering the converse problem for sequence representation

    Directory of Open Access Journals (Sweden)

    Wang Yong

    2011-10-01

    Full Text Available Abstract Background With the development of genome-sequencing technologies, protein sequences are readily obtained by translating the measured mRNAs. Therefore predicting protein-protein interactions from the sequences is in great demand. The reason lies in the fact that identifying protein-protein interactions is becoming a bottleneck for eventually understanding the functions of proteins, especially for those organisms barely characterized. Although a few methods have been proposed, the converse problem, namely whether the features used extract sufficient and unbiased information from protein sequences, is almost untouched. Results In this study, we interrogate this problem theoretically by an optimization scheme. Motivated by the theoretical investigation, we find novel encoding methods for both protein sequences and protein pairs. Our new methods sufficiently exploit the information of protein sequences and reduce artificial bias and computational cost. Thus, they significantly outperform the available methods regarding sensitivity, specificity, precision, and recall with cross-validation evaluation and reach ~80% and ~90% accuracy in Escherichia coli and Saccharomyces cerevisiae respectively. Our findings here hold important implications for other sequence-based prediction tasks because representation of biological sequence is always the first step in computational biology. Conclusions By considering the converse problem, we propose new representation methods for both protein sequences and protein pairs. The results show that our method significantly improves the accuracy of protein-protein interaction predictions.

  15. Applications of statistical physics and information theory to the analysis of DNA sequences

    Science.gov (United States)

    Grosse, Ivo

    2000-10-01

    DNA carries the genetic information of most living organisms, and the goal of genome projects is to uncover that genetic information. One basic task in the analysis of DNA sequences is the recognition of protein coding genes. Powerful computer programs for gene recognition have been developed, but most of them are based on statistical patterns that vary from species to species. In this thesis I address the question of whether there exist universal statistical patterns that are different in coding and noncoding DNA of all living species, regardless of their phylogenetic origin. In search of such species-independent patterns I study the mutual information function of genomic DNA sequences, and find that it shows persistent period-three oscillations. To understand the biological origin of the observed period-three oscillations, I compare the mutual information function of genomic DNA sequences to the mutual information function of stochastic model sequences. I find that the pseudo-exon model is able to reproduce the mutual information function of genomic DNA sequences. Moreover, I find that a generalization of the pseudo-exon model can connect the existence and the functional form of long-range correlations to the presence and the length distributions of coding and noncoding regions. Based on these theoretical studies I am able to find an information-theoretical quantity, the average mutual information (AMI), whose probability distributions are significantly different in coding and noncoding DNA, while they are almost identical in all studied species. These findings show that there exist universal statistical patterns that are different in coding and noncoding DNA of all studied species, and they suggest that the AMI may be used to identify genes in different living species, irrespective of their taxonomic origin.
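
    The mutual information function mentioned in this abstract can be illustrated with a minimal sketch. It assumes the standard definition I(k) = sum over symbol pairs (a, b) of p_ab(k) * log2(p_ab(k) / (p_a * p_b)), where p_ab(k) is the frequency of finding a and b separated by k positions. The function names mutual_information and mi_profile are hypothetical, and the plug-in frequency estimator used here is an assumption, not necessarily the estimator used in the thesis.

```python
from collections import Counter
from math import log2


def mutual_information(seq: str, k: int) -> float:
    """Estimate I(k), in bits, between symbols k positions apart in seq."""
    n = len(seq) - k  # number of symbol pairs at distance k
    if k < 1 or n < 1:
        raise ValueError("k must satisfy 1 <= k < len(seq)")
    # Joint counts of (symbol at i, symbol at i + k) and the two marginals.
    pairs = Counter((seq[i], seq[i + k]) for i in range(n))
    first = Counter(seq[i] for i in range(n))
    second = Counter(seq[i + k] for i in range(n))
    mi = 0.0
    for (a, b), count in pairs.items():
        p_ab = count / n
        mi += p_ab * log2(p_ab / ((first[a] / n) * (second[b] / n)))
    return mi


def mi_profile(seq: str, max_k: int) -> list[float]:
    """Return [I(1), I(2), ..., I(max_k)] for seq."""
    return [mutual_information(seq, k) for k in range(1, max_k + 1)]
```

    Plotting mi_profile(sequence, max_k) against k is one way to look for the period-three oscillations the abstract reports in genomic DNA; short sequences give noisy estimates, so any real use would need a finite-size correction that this sketch omits.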

  16. Competitive protein binding assay

    International Nuclear Information System (INIS)

    Kaneko, Toshio; Oka, Hiroshi

    1975-01-01

    The measurement of cyclic GMP (cGMP) by competitive protein binding assay was described and discussed. The principle of the binding assay was presented briefly. The procedures of our method consisted of preparation of the cGMP binding protein, selection of commercially available 3H-cyclic GMP, and the measurement procedures. In our method, the binding protein was isolated from the chrysalis of the silkworm. The method was discussed from the points of incubation medium, specificity of the binding protein, the separation of bound cGMP from free cGMP, and treatment of the tissue from which cGMP was extracted. cGMP existing in the tissue was only a tenth or less of that of cAMP, and in addition, cAMP competed with cGMP in binding with the binding protein. Therefore, Murad's technique was applied to the isolation of cGMP. This method provided the measurement with sufficient accuracy; the contamination by cAMP was within several per cent. (Kanao, N.)

  17. Dr. PIAS: an integrative system for assessing the druggability of protein-protein interactions

    Directory of Open Access Journals (Sweden)

    Furuya Toshio

    2011-02-01

    Full Text Available Abstract Background The amount of data on protein-protein interactions (PPIs) available in public databases and in the literature has rapidly expanded in recent years. PPI data can provide useful information for researchers in pharmacology and medicine as well as those in interactome studies. There is urgent need for a novel methodology or software allowing the efficient utilization of PPI data in pharmacology and medicine. Results To address this need, we have developed the 'Druggable Protein-protein Interaction Assessment System' (Dr. PIAS). Dr. PIAS has a meta-database that stores various types of information (tertiary structures, drugs/chemicals, and biological functions) associated with PPIs retrieved from public sources. By integrating this information, Dr. PIAS assesses whether a PPI is druggable as a target for small chemical ligands by using a supervised machine-learning method, support vector machine (SVM). Dr. PIAS holds not only known druggable PPIs but also all PPIs of human, mouse, rat, and human immunodeficiency virus (HIV) proteins identified to date. Conclusions The design concept of Dr. PIAS is distinct from other published PPI databases in that it focuses on selecting the PPIs most likely to make good drug targets, rather than merely collecting PPI data.

  18. Applications of Protein Thermodynamic Database for Understanding Protein Mutant Stability and Designing Stable Mutants.

    Science.gov (United States)

    Gromiha, M Michael; Anoosha, P; Huang, Liang-Tsung

    2016-01-01

    Protein stability is the free energy difference between unfolded and folded states of a protein, which lies in the range of 5-25 kcal/mol. Experimentally, protein stability is measured with circular dichroism, differential scanning calorimetry, and fluorescence spectroscopy using thermal and denaturant denaturation methods. These experimental data have been accumulated in the form of a database, ProTherm, a thermodynamic database for proteins and mutants. It also contains sequence and structure information of a protein, experimental methods and conditions, and literature information. Different features such as search, display, and sorting options and visualization tools have been incorporated in the database. ProTherm is a valuable resource for understanding/predicting the stability of proteins and it can be accessed at http://www.abren.net/protherm/ . ProTherm has been effectively used to examine the relationship among thermodynamics, structure, and function of proteins. We describe the recent progress on the development of methods for understanding/predicting protein stability, such as (1) general trends on mutational effects on stability, (2) relationship between the stability of protein mutants and amino acid properties, (3) applications of protein three-dimensional structures for predicting their stability upon point mutations, (4) prediction of protein stability upon single mutations from amino acid sequence, and (5) prediction methods for addressing double mutants. A list of online resources for predicting protein stability has also been provided.

  19. Technical Meeting on Existing and Proposed Experimental Facilities for Fast Neutron Systems. Presentations

    International Nuclear Information System (INIS)

    2013-01-01

    The objective of the TM on “Existing and proposed experimental facilities for fast neutron systems” is threefold: first, it is intended for presenting and exchanging information about existing and planned experimental facilities in support of the development of innovative fast neutron systems; second, it will allow the creation of a catalogue of existing and planned experimental facilities currently operated/developed within national or international fast reactor programmes; third, once a clear picture of the existing experimental infrastructures is defined, new experimental facilities will be discussed and proposed, on the basis of the identified R&D needs.

  20. Differential scanning microcalorimetry of intrinsically disordered proteins.

    Science.gov (United States)

    Permyakov, Sergei E

    2012-01-01

    Ultrasensitive differential scanning calorimetry (DSC) is an indispensable thermophysical technique that provides direct information on the enthalpies accompanying heating/cooling of dilute biopolymer solutions. The thermal dependence of protein heat capacity extracted from DSC data is a valuable source of information on the level of intrinsic disorder of a protein. Application details and limitations of the DSC technique in the exploration of protein intrinsic disorder are described.

  1. WildSpan: mining structured motifs from protein sequences

    Directory of Open Access Journals (Sweden)

    Chen Chien-Yu

    2011-03-01

    of WildSpan is developed for discovering functional regions of a single protein by referring to a set of related sequences (e.g. its homologues). The discovered W-patterns are used to characterize the protein sequence and the results are compared with the conserved positions identified by multiple sequence alignment (MSA). The family-based mining mode of WildSpan is developed for extracting sequence signatures for a group of related proteins (e.g. a protein family) for protein function classification. In this situation, the discovered W-patterns are compared with PROSITE patterns as well as the patterns generated by three existing methods performing a similar task. Finally, analysis of the execution time of WildSpan reveals that the proposed pruning strategy is effective in improving the scalability of the proposed algorithm. Conclusions The mining results obtained in this study reveal that WildSpan is efficient and effective in discovering functional signatures of proteins directly from sequences. The proposed pruning strategy is effective in improving the scalability of WildSpan. It is demonstrated in this study that the W-patterns discovered by WildSpan provide useful information in characterizing protein sequences. The WildSpan executable and open source codes are available on the web (http://biominer.csie.cyu.edu.tw/wildspan).

  2. 77 FR 38323 - Proposed Extension of Existing Information Collection; Respirable Coal Mine Dust Sampling

    Science.gov (United States)

    2012-06-27

    ... Information Collection; Respirable Coal Mine Dust Sampling AGENCY: Mine Safety and Health Administration... Sampling'' to more accurately reflect the type of information that is collected. Chronic exposure to... dust levels since 1970 and, consequently, the prevalence rate of black lung among coal miners, severe...

  3. PDTD: a web-accessible protein database for drug target identification

    Directory of Open Access Journals (Sweden)

    Gao Zhenting

    2008-02-01

    Full Text Available Abstract Background Target identification is important for modern drug discovery. With the advances in the development of molecular docking, potential binding proteins may be discovered by docking a small molecule to a repository of proteins with three-dimensional (3D) structures. To complete this task, a reverse docking program and a drug target database with 3D structures are necessary. To this end, we have developed a web server tool, TarFisDock (Target Fishing Docking, http://www.dddc.ac.cn/tarfisdock), which has been used widely by others. Recently, we have constructed a protein target database, Potential Drug Target Database (PDTD), and have integrated PDTD with TarFisDock. This combination aims to assist target identification and validation. Description PDTD is a web-accessible protein database for in silico target identification. It currently contains >1100 protein entries with 3D structures presented in the Protein Data Bank. The data are extracted from the literature and several online databases such as TTD, DrugBank and Thomson Pharma. The database covers diverse information on >830 known or potential drug targets, including protein and active site structures in both PDB and mol2 formats, related diseases, biological functions as well as associated regulating (signaling) pathways. Each target is categorized by both nosology and biochemical function. PDTD supports keyword searches by, for example, PDB ID, target name, and disease name. Data sets generated by PDTD can be viewed with plug-in molecular visualization tools and can also be downloaded freely. Remarkably, PDTD is specially designed for target identification. In conjunction with TarFisDock, PDTD can be used to identify binding proteins for small molecules. The results can be downloaded in the form of a mol2 file with the binding pose of the probe compound and a list of potential binding targets according to their ranking scores. Conclusion PDTD serves as a comprehensive and

  4. 76 FR 58301 - Proposed Extension of Existing Information Collection; Automatic Fire Sensor and Warning Device...

    Science.gov (United States)

    2011-09-20

    ... Information Collection; Automatic Fire Sensor and Warning Device Systems; Examination and Test Requirements ACTION: Notice of request for public comments. SUMMARY: The Mine Safety and Health Administration (MSHA... public comment version of this information collection package. FOR FURTHER INFORMATION CONTACT: Roslyn B...

  5. Analysis of substructural variation in families of enzymatic proteins with applications to protein function prediction

    Directory of Open Access Journals (Sweden)

    Fofanov Viacheslav Y

    2010-05-01

    Full Text Available Abstract Background Structural variations caused by a wide range of physico-chemical and biological sources directly influence the function of a protein. For enzymatic proteins, the structure and chemistry of the catalytic binding site residues can be loosely defined as a substructure of the protein. Comparative analysis of drug-receptor substructures across and within species has been used for lead evaluation. Substructure-level similarity between the binding sites of functionally similar proteins has also been used to identify instances of convergent evolution among proteins. In functionally homologous protein families, shared chemistry and geometry at catalytic sites provide a common, local point of comparison among proteins that may differ significantly at the sequence, fold, or domain topology levels. Results This paper describes two key results that can be used separately or in combination for protein function analysis. The Family-wise Analysis of SubStructural Templates (FASST) method uses all-against-all substructure comparison to determine Substructural Clusters (SCs). SCs characterize the binding site substructural variation within a protein family. In this paper we focus on examples of automatically determined SCs that can be linked to phylogenetic distance between family members, segregation by conformation, and organization by homology among convergent protein lineages. The Motif Ensemble Statistical Hypothesis (MESH) framework constructs a representative motif for each protein cluster among the SCs determined by FASST to build motif ensembles that are shown through a series of function prediction experiments to improve the function prediction power of existing motifs. Conclusions FASST contributes a critical feedback and assessment step to existing binding site substructure identification methods and can be used for the thorough investigation of structure-function relationships. The application of MESH allows for an automated

  6. Microvillar membrane microdomains exist at physiological temperature. Role of galectin-4 as lipid raft stabilizer revealed by "superrafts"

    DEFF Research Database (Denmark)

    Braccia, Anita; Villani, Maristella; Immerdal, Lissi

    2003-01-01

    rafts prepared by the two protocols were morphologically different but had essentially similar profiles of protein and lipid components, showing that raft microdomains do exist at 37 degrees C and are not "low temperature artifacts." We also employed a novel method of sequential detergent extraction...... and the transmembrane aminopeptidase N, whereas the peripheral lipid raft protein annexin 2 was essentially absent. In conclusion, in the microvillar membrane, galectin-4 functions as a core raft stabilizer/organizer for other, more loosely raft-associated proteins. The superraft analysis might be applicable to other...

  7. Computational prediction of protein-protein interactions in Leishmania predicted proteomes.

    Directory of Open Access Journals (Sweden)

    Antonio M Rezende

    Full Text Available The trypanosomatid parasites Leishmania braziliensis, Leishmania major and Leishmania infantum are important human pathogens. Despite years of study and genome availability, an effective vaccine has not yet been developed, and the chemotherapy is highly toxic. Therefore, it is clear that only interdisciplinary, integrated studies will succeed in the search for new targets for the development of vaccines and drugs. An essential part of this rationale is the study of protein-protein interaction (PPI) networks, which can provide a better understanding of complex protein interactions in a biological system. Thus, we modeled PPIs for trypanosomatids through computational methods, using sequence comparison against public databases of protein or domain interactions for interaction prediction (Interolog Mapping), and developed a dedicated combined system score to address the robustness of the predictions. The confidence of the network prediction approach was evaluated using gold standard positive and negative datasets, and the AUC value obtained was 0.94. As a result, 39,420, 43,531 and 45,235 interactions were predicted for L. braziliensis, L. major and L. infantum respectively. For each predicted network the top 20 proteins were ranked by the MCC topological index. In addition, information related to immunological potential, degree of protein sequence conservation among orthologs and degree of identity compared to proteins of potential parasite hosts was integrated. This information integration provides a better understanding and usefulness of the predicted networks, which can be valuable for selecting new potential biological targets for drug and vaccine development. Network modularity analysis, which is key when one is interested in destabilizing the PPIs for drug or vaccine purposes, along with multiple alignments of the predicted PPIs, was performed, revealing patterns associated with protein turnover. In addition, around 50% of hypothetical proteins present in the networks

  8. The research methods and model of protein turnover in animal

    International Nuclear Information System (INIS)

    Wu Xilin; Yang Feng

    2002-01-01

    The authors discussed the concept and research methods of protein turnover in the animal body. The existing problems and the research results on animal protein turnover in recent years were presented. Meanwhile, measures to improve the models of animal protein turnover were analyzed.

  9. Tau protein

    DEFF Research Database (Denmark)

    Frederiksen, Jette Lautrup Battistini; Kristensen, Kim; Bahl, Jmc

    2011-01-01

    Background: Tau protein has been proposed as biomarker of axonal damage leading to irreversible neurological impairment in MS. CSF concentrations may be useful when determining risk of progression from ON to MS. Objective: To investigate the association between tau protein concentration and 14......-3-3 protein in the cerebrospinal fluid (CSF) of patients with monosymptomatic optic neuritis (ON) versus patients with monosymptomatic onset who progressed to multiple sclerosis (MS). To evaluate results against data found in a complete literature review. Methods: A total of 66 patients with MS and/or ON from...... the Department of Neurology of Glostrup Hospital, University of Copenhagen, Denmark, were included. CSF samples were analysed for tau protein and 14-3-3 protein, and clinical and paraclinical information was obtained from medical records. Results: The study shows a significantly increased concentration of tau...

  10. Scoring functions for protein-protein interactions.

    Science.gov (United States)

    Moal, Iain H; Moretti, Rocco; Baker, David; Fernández-Recio, Juan

    2013-12-01

    The computational evaluation of protein-protein interactions will play an important role in organising the wealth of data being generated by high-throughput initiatives. Here we discuss future applications, report recent developments and identify areas requiring further investigation. Many functions have been developed to quantify the structural and energetic properties of interacting proteins, finding use in interrelated challenges revolving around the relationship between sequence, structure and binding free energy. These include loop modelling, side-chain refinement, docking, multimer assembly, affinity prediction, affinity change upon mutation, hotspots location and interface design. Information derived from models optimised for one of these challenges can be used to benefit the others, and can be unified within the theoretical frameworks of multi-task learning and Pareto-optimal multi-objective learning. Copyright © 2013 Elsevier Ltd. All rights reserved.

  11. MIPS: analysis and annotation of proteins from whole genomes.

    Science.gov (United States)

    Mewes, H W; Amid, C; Arnold, R; Frishman, D; Güldener, U; Mannhaupt, G; Münsterkötter, M; Pagel, P; Strack, N; Stümpflen, V; Warfsmann, J; Ruepp, A

    2004-01-01

    The Munich Information Center for Protein Sequences (MIPS-GSF), Neuherberg, Germany, provides protein sequence-related information based on whole-genome analysis. The main focus of the work is directed toward the systematic organization of sequence-related attributes as gathered by a variety of algorithms, primary information from experimental data together with information compiled from the scientific literature. MIPS maintains automatically generated and manually annotated genome-specific databases, develops systematic classification schemes for the functional annotation of protein sequences and provides tools for the comprehensive analysis of protein sequences. This report updates the information on the yeast genome (CYGD), the Neurospora crassa genome (MNCDB), the database of complete cDNAs (German Human Genome Project, NGFN), the database of mammalian protein-protein interactions (MPPI), the database of FASTA homologies (SIMAP), and the interface for the fast retrieval of protein-associated information (QUIPOS). The Arabidopsis thaliana database, the rice database, the plant EST databases (MATDB, MOsDB, SPUTNIK), as well as the databases for the comprehensive set of genomes (PEDANT genomes) are described elsewhere in the 2003 and 2004 NAR database issues, respectively. All databases described, and the detailed descriptions of our projects can be accessed through the MIPS web server (http://mips.gsf.de).

  12. Mars - robust automatic backbone assignment of proteins

    International Nuclear Information System (INIS)

    Jung, Young-Sang; Zweckstetter, Markus

    2004-01-01

    MARS, a program for robust automatic backbone assignment of 13C/15N-labeled proteins, is presented. MARS does not require tight thresholds for establishing sequential connectivity or detailed adjustment of these thresholds and it can work with a wide variety of NMR experiments. Using only 13Cα/13Cβ connectivity information, MARS allows automatic, error-free assignment of 96% of the 370-residue maltose-binding protein. MARS can successfully be used when data are missing for a substantial portion of residues or for proteins with very high chemical shift degeneracy such as partially or fully unfolded proteins. Other sources of information, such as residue-specific information or known assignments from a homologous protein, can be included in the assignment process. MARS exports its results in SPARKY format. This allows visual validation and integration of automated and manual assignment.

  13. Does representative wind information exist?

    NARCIS (Netherlands)

    Wieringa, J.

    1996-01-01

    Representativity requirements are discussed for various wind data users. It is shown that most applications can be dealt with by using data from wind stations when these are made to conform with WMO specifications. Methods to achieve this WMO normalization are reviewed, giving minimum specifications

  14. Protein-protein docking using region-based 3D Zernike descriptors.

    Science.gov (United States)

    Venkatraman, Vishwesh; Yang, Yifeng D; Sael, Lee; Kihara, Daisuke

    2009-12-09

    Protein-protein interactions are a pivotal component of many biological processes and mediate a variety of functions. Knowing the tertiary structure of a protein complex is therefore essential for understanding the interaction mechanism. However, experimental techniques to solve the structure of the complex are often found to be difficult. To this end, computational protein-protein docking approaches can provide a useful alternative to address this issue. Prediction of docking conformations relies on methods that effectively capture shape features of the participating proteins while giving due consideration to conformational changes that may occur. We present a novel protein docking algorithm based on the use of 3D Zernike descriptors as regional features of molecular shape. The key motivation of using these descriptors is their invariance to transformation, in addition to a compact representation of local surface shape characteristics. Docking decoys are generated using geometric hashing, which are then ranked by a scoring function that incorporates a buried surface area and a novel geometric complementarity term based on normals associated with the 3D Zernike shape description. Our docking algorithm was tested on both bound and unbound cases in the ZDOCK benchmark 2.0 dataset. In 74% of the bound docking predictions, our method was able to find a near-native solution (interface C-alphaRMSD 3D Zernike descriptors are adept in capturing shape complementarity at the protein-protein interface and useful for protein docking prediction. Rigorous benchmark studies show that our docking approach has a superior performance compared to existing methods.

  15. Differential regulation of synaptic and extrasynaptic α4 GABA(A) receptor populations by protein kinase A and protein kinase C in cultured cortical neurons.

    Science.gov (United States)

    Bohnsack, John Peyton; Carlson, Stephen L; Morrow, A Leslie

    2016-06-01

    The GABAA α4 subunit exists in two distinct populations of GABAA receptors. Synaptic GABAA α4 receptors are localized at the synapse and mediate phasic inhibitory neurotransmission, while extrasynaptic GABAA receptors are located outside of the synapse and mediate tonic inhibitory transmission. These receptors have distinct pharmacological and biophysical properties that contribute to interest in how these different subtypes are regulated under physiological and pathological states. We utilized subcellular fractionation procedures to separate these populations of receptors in order to investigate their regulation by protein kinases in cortical cultured neurons. Protein kinase A (PKA) activation decreases synaptic α4 expression while protein kinase C (PKC) activation increases α4 subunit expression, and these effects are associated with increased β3 S408/409 or γ2 S327 phosphorylation respectively. In contrast, PKA activation increases extrasynaptic α4 and δ subunit expression, while PKC activation has no effect. Our findings suggest synaptic and extrasynaptic GABAA α4 subunit expression can be modulated by PKA to inform the development of more specific therapeutics for neurological diseases that involve deficits in GABAergic transmission. Copyright © 2016 Elsevier Ltd. All rights reserved.

  16. Multiple protonation equilibria in electrostatics of protein-protein binding.

    Science.gov (United States)

    Piłat, Zofia; Antosiewicz, Jan M

    2008-11-27

    All proteins contain groups capable of exchanging protons with their environment. We present here an approach, based on a rigorous thermodynamic cycle and the partition functions for energy levels characterizing protonation states of the associating proteins and their complex, to compute the electrostatic pH-dependent contribution to the free energy of protein-protein binding. The computed electrostatic binding free energies include the pH of the solution as the variable of state, mutual "polarization" of associating proteins reflected as changes in the distribution of their protonation states upon binding and fluctuations between available protonation states. The only fixed property of both proteins is the conformation; the structure of the monomers is kept in the same conformation as they have in the complex structure. As a reference, we use the electrostatic binding free energies obtained from the traditional Poisson-Boltzmann model, computed for a single macromolecular conformation fixed in a given protonation state, appropriate for given solution conditions. The new approach was tested for 12 protein-protein complexes. It is shown that explicit inclusion of protonation degrees of freedom might lead to a substantially different estimation of the electrostatic contribution to the binding free energy than that based on the traditional Poisson-Boltzmann model. This has important implications for the balancing of different contributions to the energetics of protein-protein binding and other related problems, for example, the choice of protein models for Brownian dynamics simulations of their association. Our procedure can be generalized to include conformational degrees of freedom by combining it with molecular dynamics simulations at constant pH. Unfortunately, in practice, a prohibitive factor is an enormous requirement for computer time and power. However, there may be some hope for solving this problem by combining existing constant pH molecular dynamics

  17. Identification and characterization of stable membrane protein complexes

    NARCIS (Netherlands)

    Spelbrink, R.E.J.

    2007-01-01

    Many membrane proteins exist as oligomers. Such oligomers play an important role in a broad variety of cellular processes such as ion transport, energy transduction, osmosensing and cell wall synthesis. We developed an electrophoresis-based method of identifying oligomeric membrane proteins that are

  18. Improved protein surface comparison and application to low-resolution protein structure data

    Directory of Open Access Journals (Sweden)

    Kihara Daisuke

    2010-12-01

    Full Text Available Abstract Background Recent advancements of experimental techniques for determining protein tertiary structures raise significant challenges for protein bioinformatics. With the number of known structures of unknown function expanding at a rapid pace, an urgent task is to provide reliable clues to their biological function on a large scale. Conventional approaches for structure comparison are not suitable for a real-time database search due to their slow speed. Moreover, a new challenge has arisen from recent techniques such as electron microscopy (EM), which provide low-resolution structure data. Previously, we have introduced a method for protein surface shape representation using the 3D Zernike descriptors (3DZDs). The 3DZD enables fast structure database searches, taking advantage of its rotation invariance and compact representation. The search results for protein surfaces represented with the 3DZD have shown good agreement with the existing structure classifications, but some discrepancies were also observed. Results Three surface representations are examined: a new representation of backbone atoms, the originally devised all-atom surface representation, and the combination of the all-atom surface with the backbone representation. All representations are encoded with the 3DZD. Also, we have investigated the applicability of the 3DZD for searching protein EM density maps of varying resolutions. The surface representations are evaluated on structure retrieval using two existing classifications, SCOP and the CE-based classification. Conclusions Overall, the 3DZDs representing backbone atoms show better retrieval performance than the original all-atom surface representation. The performance further improved when the two representations are combined. Moreover, we observed that the 3DZD is also powerful in comparing low-resolution structures obtained by electron microscopy.

  19. Improved protein surface comparison and application to low-resolution protein structure data.

    Science.gov (United States)

    Sael, Lee; Kihara, Daisuke

    2010-12-14

    Recent advancements of experimental techniques for determining protein tertiary structures raise significant challenges for protein bioinformatics. With the number of known structures of unknown function expanding at a rapid pace, an urgent task is to provide reliable clues to their biological function on a large scale. Conventional approaches for structure comparison are not suitable for a real-time database search due to their slow speed. Moreover, a new challenge has arisen from recent techniques such as electron microscopy (EM), which provide low-resolution structure data. Previously, we have introduced a method for protein surface shape representation using the 3D Zernike descriptors (3DZDs). The 3DZD enables fast structure database searches, taking advantage of its rotation invariance and compact representation. The search results of protein surfaces represented with the 3DZD have shown good agreement with the existing structure classifications, but some discrepancies were also observed. The three new surface representations of backbone atoms, originally devised all-atom-surface representation, and the combination of all-atom surface with the backbone representation are examined. All representations are encoded with the 3DZD. Also, we have investigated the applicability of the 3DZD for searching protein EM density maps of varying resolutions. The surface representations are evaluated on structure retrieval using two existing classifications, SCOP and the CE-based classification. Overall, the 3DZDs representing backbone atoms show better retrieval performance than the original all-atom surface representation. The performance further improved when the two representations are combined. Moreover, we observed that the 3DZD is also powerful in comparing low-resolution structures obtained by electron microscopy.

  20. The Information Tekhnology Share In Management Information System

    Directory of Open Access Journals (Sweden)

    Nur Zeina Maya Sari

    2015-08-01

    Full Text Available Abstract The growth of management information systems changes the role of all managers in decision making about information technology. The prima facie reason for using information technology in business is to support operations so that the information system may operate better (O'Brien & Marakas, 2004). This means that, with information technology in the company's management information system (SIM), management decision making, which was initially often influenced by many non-technical factors, becomes accurate, relevant, complete, and on schedule.

  1. Technical Meeting on Existing and Proposed Experimental Facilities for Fast Neutron Systems. Working Material

    International Nuclear Information System (INIS)

    2013-01-01

    The objective of the TM on “Existing and proposed experimental facilities for fast neutron systems” was threefold: 1) presenting and exchanging information about existing and planned experimental facilities in support of the development of innovative fast neutron systems; 2) creating a catalogue of existing and planned experimental facilities currently operated/developed within national or international fast reactor programmes; 3) once a clear picture of the existing experimental infrastructure is defined, discussing and proposing new experimental facilities on the basis of the identified R&D needs.

  2. HDOCK: a web server for protein-protein and protein-DNA/RNA docking based on a hybrid strategy.

    Science.gov (United States)

    Yan, Yumeng; Zhang, Di; Zhou, Pei; Li, Botong; Huang, Sheng-You

    2017-07-03

    Protein-protein and protein-DNA/RNA interactions play a fundamental role in a variety of biological processes. Determining the complex structures of these interactions is valuable, in which molecular docking has played an important role. To automatically make use of the binding information from the PDB in docking, here we have presented HDOCK, a novel web server of our hybrid docking algorithm of template-based modeling and free docking, in which cases with misleading templates can be rescued by the free docking protocol. The server supports protein-protein and protein-DNA/RNA docking and accepts both sequence and structure inputs for proteins. The docking process is fast and consumes about 10-20 min for a docking run. Tested on the cases with weakly homologous complexes of server. The HDOCK web server is available at http://hdock.phys.hust.edu.cn/. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  3. Escherichia coli Protein Expression System for Acetylcholine Binding Proteins (AChBPs).

    Directory of Open Access Journals (Sweden)

    Nikita Abraham

    Full Text Available Nicotinic acetylcholine receptors (nAChR) are ligand gated ion channels, identified as therapeutic targets for a range of human diseases. Drug design for nAChR related disorders is increasingly using structure-based approaches. Many of these structural insights for therapeutic lead development have been obtained from co-crystal structures of nAChR agonists and antagonists with the acetylcholine binding protein (AChBP). AChBP is a water soluble, structural and functional homolog of the extracellular, ligand-binding domain of nAChRs. Currently, AChBPs are recombinantly expressed in eukaryotic expression systems for structural and biophysical studies. Here, we report the establishment of an Escherichia coli (E. coli) expression system that significantly reduces the cost and time of production compared to the existing expression systems. E. coli can efficiently express unglycosylated AChBP for crystallography and makes the expression of isotopically labelled forms feasible for NMR. We used a pHUE vector containing an N-terminal His-tagged ubiquitin fusion protein to facilitate AChBP expression in the soluble fractions, and thus avoid the need to recover protein from inclusion bodies. The purified protein yield obtained from the E. coli expression system is comparable to that obtained from existing AChBP expression systems. E. coli expressed AChBP bound nAChR agonists and antagonists with affinities matching those previously reported. Thus, the E. coli expression system significantly simplifies the expression and purification of functional AChBP for structural and biophysical studies.

  4. Function and structure of GFP-like proteins in the protein data bank.

    Science.gov (United States)

    Ong, Wayne J-H; Alvarez, Samuel; Leroux, Ivan E; Shahid, Ramza S; Samma, Alex A; Peshkepija, Paola; Morgan, Alicia L; Mulcahy, Shawn; Zimmer, Marc

    2011-04-01

    The RCSB protein databank contains 266 crystal structures of green fluorescent proteins (GFP) and GFP-like proteins. This is the first systematic analysis of all the GFP-like structures in the pdb. We have used the pdb to examine the function of fluorescent proteins (FP) in nature, aspects of excited state proton transfer (ESPT) in FPs, deformation from planarity of the chromophore and chromophore maturation. The conclusions reached in this review are that (1) The lid residues are highly conserved, particularly those on the "top" of the β-barrel. They are important to the function of GFP-like proteins, perhaps in protecting the chromophore or in β-barrel formation. (2) The primary/ancestral function of GFP-like proteins may well be to aid in light induced electron transfer. (3) The structural prerequisites for light activated proton pumps exist in many structures and it's possible that like bioluminescence, proton pumps are secondary functions of GFP-like proteins. (4) In most GFP-like proteins the protein matrix exerts a significant strain on planar chromophores forcing most GFP-like proteins to adopt non-planar chromophores. These chromophoric deviations from planarity play an important role in determining the fluorescence quantum yield. (5) The chemospatial characteristics of the chromophore cavity determine the isomerization state of the chromophore. The cavities of highlighter proteins that can undergo cis/trans isomerization have chemospatial properties that are common to both cis and trans GFP-like proteins.

  5. The Prediction of Key Cytoskeleton Components Involved in Glomerular Diseases Based on a Protein-Protein Interaction Network.

    Science.gov (United States)

    Ding, Fangrui; Tan, Aidi; Ju, Wenjun; Li, Xuejuan; Li, Shao; Ding, Jie

    2016-01-01

    Maintenance of the physiological morphologies of different types of cells and tissues is essential for the normal functioning of each system in the human body. Dynamic variations in cell and tissue morphologies depend on accurate adjustments of the cytoskeletal system. The cytoskeletal system in the glomerulus plays a key role in the normal process of kidney filtration. To enhance the understanding of the possible roles of the cytoskeleton in glomerular diseases, we constructed the Glomerular Cytoskeleton Network (GCNet), which shows the protein-protein interaction network in the glomerulus, and identified several possible key cytoskeletal components involved in glomerular diseases. In this study, genes/proteins annotated to the cytoskeleton were detected by Gene Ontology analysis, and glomerulus-enriched genes were selected from nine available glomerular expression datasets. Then, the GCNet was generated by combining these two sets of information. To predict the possible key cytoskeleton components in glomerular diseases, we then examined the common regulation of the genes in GCNet in the context of five glomerular diseases based on their transcriptomic data. As a result, twenty-one cytoskeleton components were highlighted as potential candidates because they were consistently down- or up-regulated in all five glomerular diseases. These candidates were then examined in relation to existing known glomerular diseases and genes to determine their possible functions and interactions. In addition, the mRNA levels of these candidates were validated in a puromycin aminonucleoside (PAN)-induced rat nephropathy model and were also matched with existing Diabetic Nephropathy (DN) transcriptomic data. As a result, 15 of the 21 candidates in the PAN-induced nephropathy model were consistent with our prediction, and 12 of the 21 candidates matched differentially expressed genes in the DN transcriptomic data. By providing a novel interaction network and prediction, GCNet

  6. Affinity-based, biophysical methods to detect and analyze ligand binding to recombinant proteins: matching high information content with high throughput.

    Science.gov (United States)

    Holdgate, Geoff A; Anderson, Malcolm; Edfeldt, Fredrik; Geschwindner, Stefan

    2010-10-01

    Affinity-based technologies have become impactful tools to detect, monitor and characterize molecular interactions using recombinant target proteins. This can aid the understanding of biological function by revealing mechanistic details, and even more importantly, enables the identification of new improved ligands that can modulate the biological activity of those targets in a desired fashion. The selection of the appropriate technology is a key step in that process, as each one of the currently available technologies offers a characteristic type of biophysical information about the ligand-binding event. Alongside the indisputable advantages of each of those technologies they naturally display diverse restrictions that are quite frequently related to the target system to be studied but also to the affinity, solubility and molecular size of the ligands. This paper discusses some of the theoretical and experimental aspects of the most common affinity-based methods, what type of information can be gained from each one of those approaches, and what requirements as well as limitations are expected from working with recombinant proteins on those platforms and how those can be optimally addressed.

  7. Aging Is Accompanied by a Blunted Muscle Protein Synthetic Response to Protein Ingestion.

    Directory of Open Access Journals (Sweden)

    Benjamin Toby Wall

    Full Text Available Progressive loss of skeletal muscle mass with aging (sarcopenia) forms a global health concern. It has been suggested that an impaired capacity to increase muscle protein synthesis rates in response to protein intake is a key contributor to sarcopenia. We assessed whether differences in post-absorptive and/or post-prandial muscle protein synthesis rates exist between large cohorts of healthy young and older men. We performed a cross-sectional, retrospective study comparing in vivo post-absorptive muscle protein synthesis rates determined with stable isotope methodologies between 34 healthy young (22±1 y) and 72 older (75±1 y) men, and post-prandial muscle protein synthesis rates between 35 healthy young (22±1 y) and 40 older (74±1 y) men. Post-absorptive muscle protein synthesis rates did not differ significantly between the young and older group. Post-prandial muscle protein synthesis rates were 16% lower in the older subjects when compared with the young. Muscle protein synthesis rates were >3 fold more responsive to dietary protein ingestion in the young. Irrespective of age, there was a strong negative correlation between post-absorptive muscle protein synthesis rates and the increase in muscle protein synthesis rate following protein ingestion. Aging is associated with the development of muscle anabolic inflexibility which represents a key physiological mechanism underpinning sarcopenia.

  8. The presence of blogs in the field of Library and Information Sciences in Spain: Can we talk about the existence of a “Biblogsfera”?

    Directory of Open Access Journals (Sweden)

    Miguel Ángel Vera Baceta

    2014-03-01

    Full Text Available The blog phenomenon has meant a natural way of grouping and establishing communication amongst those who have similar concerns. This has happened the same way in the field of science, thus creating communities of special interest in a context in which blogs are consolidating as sources of information and extra tools for research. While the authors’ freedom to organise and offer their contents may be considered one of the main advantages and a trigger for the success of blogs, it is also one of their main drawbacks. The contents and communities of blogs are not easy to identify, classify and quantify, so that they constitute a vague concept under the name of “blogosphere”. This article is meant to identify the blogs related to the field of Library and Information Sciences in Spain, their authorship, their contents and their connections in order to clarify if we can really talk about the existence of a “Biblogsfera”.

  9. Identification of Protein Complexes Using Weighted PageRank-Nibble Algorithm and Core-Attachment Structure.

    Science.gov (United States)

    Peng, Wei; Wang, Jianxin; Zhao, Bihai; Wang, Lusheng

    2015-01-01

    Protein complexes play a significant role in understanding the underlying mechanism of most cellular functions. Recently, many researchers have explored computational methods to identify protein complexes from protein-protein interaction (PPI) networks. One group of researchers focuses on detecting local dense subgraphs, which correspond to protein complexes, by considering local neighbors. The drawback of this kind of approach is that the global information of the networks is ignored. Some methods, such as the Markov Clustering algorithm (MCL) and PageRank-Nibble, have been proposed to find protein complexes based on a random walk technique, which can exploit the global structure of networks. However, these methods ignore the inherent core-attachment structure of protein complexes and treat adjacent nodes equally. In this paper, we design a weighted PageRank-Nibble algorithm which assigns each adjacent node a different probability, and propose a novel method named WPNCA to detect protein complexes from PPI networks by using the weighted PageRank-Nibble algorithm and core-attachment structure. Firstly, WPNCA partitions the PPI networks into multiple dense clusters by using the weighted PageRank-Nibble algorithm. Then the cores of these clusters are detected and the rest of the proteins in the clusters are selected as attachments to form the final predicted protein complexes. The experiments on yeast data show that WPNCA outperforms the existing methods in terms of both accuracy and p-value. The software for WPNCA is available at "http://netlab.csu.edu.cn/bioinfomatics/weipeng/WPNCA/download.html".
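
    The random-walk idea in this abstract can be illustrated with a small sketch. It is not the WPNCA software: it assumes a toy weighted graph with hypothetical edge weights, uses plain power iteration for personalized PageRank in place of the push-based approximation that PageRank-Nibble normally uses, and then applies a conductance sweep to pick a cluster around a seed protein. The function names `personalized_pagerank` and `sweep_cut` are illustrative only.

```python
def personalized_pagerank(graph, seed, alpha=0.15, iters=100):
    """Personalized PageRank by power iteration on a weighted undirected graph.

    graph: dict mapping node -> dict of neighbor -> positive edge weight.
    The walker restarts at `seed` with probability alpha; otherwise it moves
    to a neighbor with probability proportional to the edge weight.
    """
    rank = {node: 0.0 for node in graph}
    rank[seed] = 1.0
    for _ in range(iters):
        nxt = {node: 0.0 for node in graph}
        nxt[seed] += alpha
        for node, mass in rank.items():
            total = sum(graph[node].values())
            if total == 0:
                nxt[seed] += (1 - alpha) * mass  # isolated node: send mass back to seed
                continue
            for nbr, weight in graph[node].items():
                nxt[nbr] += (1 - alpha) * mass * weight / total
        rank = nxt
    return rank


def sweep_cut(graph, rank):
    """Return the prefix of nodes, ordered by rank / weighted degree, with lowest conductance.

    Assumes a symmetric graph without self-loops.
    """
    degree = {node: sum(nbrs.values()) for node, nbrs in graph.items()}
    total_volume = sum(degree.values())
    order = sorted(
        (n for n in graph if rank[n] > 0 and degree[n] > 0),
        key=lambda n: rank[n] / degree[n],
        reverse=True,
    )
    inside, volume, cut = set(), 0.0, 0.0
    best, best_phi = [], float("inf")
    for i, node in enumerate(order):
        inside.add(node)
        volume += degree[node]
        for nbr, weight in graph[node].items():
            # An edge to a node already inside stops crossing the cut;
            # an edge to an outside node starts crossing it.
            cut += -weight if nbr in inside else weight
        denom = min(volume, total_volume - volume)
        if denom <= 0:
            break
        phi = cut / denom
        if phi < best_phi:
            best_phi, best = phi, order[: i + 1]
    return set(best), best_phi


# Toy network (hypothetical weights): two triangles joined by one weak edge.
toy = {
    "a": {"b": 1.0, "c": 1.0},
    "b": {"a": 1.0, "c": 1.0},
    "c": {"a": 1.0, "b": 1.0, "d": 0.1},
    "d": {"c": 0.1, "e": 1.0, "f": 1.0},
    "e": {"d": 1.0, "f": 1.0},
    "f": {"d": 1.0, "e": 1.0},
}
ranks = personalized_pagerank(toy, seed="a")
cluster, conductance = sweep_cut(toy, ranks)
```

    In this toy case the sweep should stop at the triangle containing the seed, because the weak bridge contributes little to the cut. How WPNCA actually derives its edge weights and detects cores and attachments is not specified in the abstract and is not modelled here.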

  10. Protein fiber linear dichroism for structure determination and kinetics in a low-volume, low-wavelength couette flow cell.

    Science.gov (United States)

    Dafforn, Timothy R; Rajendra, Jacindra; Halsall, David J; Serpell, Louise C; Rodger, Alison

    2004-01-01

    High-resolution structure determination of soluble globular proteins relies heavily on x-ray crystallography techniques. Such an approach is often ineffective for investigations into the structure of fibrous proteins as these proteins generally do not crystallize. Thus investigations into fibrous protein structure have relied on less direct methods such as x-ray fiber diffraction and circular dichroism. Ultraviolet linear dichroism has the potential to provide additional information on the structure of such biomolecular systems. However, existing systems are not optimized for the requirements of fibrous proteins. We have designed and built a low-volume (200 microL), low-wavelength (down to 180 nm), low-pathlength (100 microm), high-alignment flow-alignment system (couette) to perform ultraviolet linear dichroism studies on the fibers formed by a range of biomolecules. The apparatus has been tested using a number of proteins for which longer wavelength linear dichroism spectra had already been measured. The new couette cell has also been used to obtain data on two medically important protein fibers, the all-beta-sheet amyloid fibers of the Alzheimer's derived protein Abeta and the long-chain assemblies of alpha1-antitrypsin polymers.

  11. Efficient extraction of protein-protein interactions from full-text articles.

    Science.gov (United States)

    Hakenberg, Jörg; Leaman, Robert; Vo, Nguyen Ha; Jonnalagadda, Siddhartha; Sullivan, Ryan; Miller, Christopher; Tari, Luis; Baral, Chitta; Gonzalez, Graciela

    2010-01-01

    Proteins and their interactions govern virtually all cellular processes, such as regulation, signaling, metabolism, and structure. Most experimental findings pertaining to such interactions are discussed in research papers, which, in turn, get curated by protein interaction databases. Authors, editors, and publishers benefit from efforts to alleviate the tasks of searching for relevant papers, evidence for physical interactions, and proper identifiers for each protein involved. The BioCreative II.5 community challenge addressed these tasks in a competition-style assessment to evaluate and compare different methodologies, to make aware of the increasing accuracy of automated methods, and to guide future implementations. In this paper, we present our approaches for protein-named entity recognition, including normalization, and for extraction of protein-protein interactions from full text. Our overall goal is to identify efficient individual components, and we compare various compositions to handle a single full-text article in between 10 seconds and 2 minutes. We propose strategies to transfer document-level annotations to the sentence-level, which allows for the creation of a more fine-grained training corpus; we use this corpus to automatically derive around 5,000 patterns. We rank sentences by relevance to the task of finding novel interactions with physical evidence, using a sentence classifier built from this training corpus. Heuristics for paraphrasing sentences help to further remove unnecessary information that might interfere with patterns, such as additional adjectives, clauses, or bracketed expressions. In BioCreative II.5, we achieved an f-score of 22 percent for finding protein interactions, and 43 percent for mapping proteins to UniProt IDs; disregarding species, f-scores are 30 percent and 55 percent, respectively. On average, our best-performing setup required around 2 minutes per full text. All data and pattern sets as well as Java classes that
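
    The f-scores quoted in this abstract combine precision and recall into a single number. A minimal sketch of that calculation for extracted interaction pairs follows, assuming the balanced F1 form and unordered protein pairs; the identifiers and the helper name `f_score` are made up for illustration and are not taken from the BioCreative evaluation scripts.

```python
def f_score(gold_pairs, predicted_pairs):
    """Balanced F-score (harmonic mean of precision and recall) for interaction pairs.

    Pairs are treated as unordered, so ("P1", "P2") equals ("P2", "P1").
    """
    gold = {frozenset(pair) for pair in gold_pairs}
    predicted = {frozenset(pair) for pair in predicted_pairs}
    hits = len(gold & predicted)
    if hits == 0:
        return 0.0
    precision = hits / len(predicted)
    recall = hits / len(gold)
    return 2 * precision * recall / (precision + recall)


# Hypothetical example: one of two predictions is correct, one of three gold pairs is found.
gold = [("P1", "P2"), ("P2", "P3"), ("P3", "P4")]
predicted = [("P2", "P1"), ("P4", "P5")]
score = f_score(gold, predicted)
```

    Here precision is 1/2 and recall is 1/3, so the score is 0.4.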

  12. False positive reduction in protein-protein interaction predictions using gene ontology annotations

    Directory of Open Access Journals (Sweden)

    Lin Yen-Han

    2007-07-01

    Full Text Available Abstract Background Many crucial cellular operations such as metabolism, signalling, and regulations are based on protein-protein interactions. However, the lack of robust protein-protein interaction information is a challenge. One reason for the lack of solid protein-protein interaction information is poor agreement between experimental findings and computational sets that, in turn, comes from huge false positive predictions in computational approaches. Reduction of false positive predictions and enhancing true positive fraction of computationally predicted protein-protein interaction datasets based on highly confident experimental results has not been adequately investigated. Results Gene Ontology (GO) annotations were used to reduce false positive protein-protein interaction (PPI) pairs resulting from computational predictions. Using experimentally obtained PPI pairs as a training dataset, eight top-ranking keywords were extracted from GO molecular function annotations. The sensitivity of these keywords is 64.21% in the yeast experimental dataset and 80.83% in the worm experimental dataset. The specificities, a measure of recovery power, of these keywords applied to four predicted PPI datasets for each studied organism, are 48.32% and 46.49% (by average of four datasets) in yeast and worm, respectively. Based on eight top-ranking keywords and co-localization of interacting proteins a set of two knowledge rules were deduced and applied to remove false positive protein pairs. The 'strength', a measure of improvement provided by the rules was defined based on the signal-to-noise ratio and implemented to measure the applicability of knowledge rules applying to the predicted PPI datasets. Depending on the employed PPI-predicting methods, the strength varies between two and ten-fold of randomly removing protein pairs from the datasets. Conclusion Gene Ontology annotations along with the deduced knowledge rules could be implemented to partially

  13. Commercial Milk Enzyme-Linked Immunosorbent Assay (ELISA) Kit Reactivities to Purified Milk Proteins and Milk-Derived Ingredients.

    Science.gov (United States)

    Ivens, Katherine O; Baumert, Joseph L; Taylor, Steve L

    2016-07-01

    Numerous commercial enzyme-linked immunosorbent assay (ELISA) kits exist to quantitatively detect bovine milk residues in foods. Milk contains many proteins that can serve as ELISA targets including caseins (α-, β-, or κ-casein) and whey proteins (α-lactalbumin or β-lactoglobulin). Nine commercially-available milk ELISA kits were selected to compare the specificity and sensitivity with 5 purified milk proteins and 3 milk-derived ingredients. All of the milk kits were capable of quantifying nonfat dry milk (NFDM), but did not necessarily detect all individual protein fractions. While milk-derived ingredients were detected by the kits, their quantitation may be inaccurate due to the use of different calibrators, reference materials, and antibodies in kit development. The establishment of a standard reference material for the calibration of milk ELISA kits is increasingly important. The appropriate selection and understanding of milk ELISA kits for food analysis is critical to accurate quantification of milk residues and informed risk management decisions. © 2016 Institute of Food Technologists®

  14. General Protein Data Bank-Based Collective Variables for Protein Folding.

    Science.gov (United States)

    Ardevol, Albert; Palazzesi, Ferruccio; Tribello, Gareth A; Parrinello, Michele

    2016-01-12

    New, automated forms of data analysis are required to understand the high-dimensional trajectories that are obtained from molecular dynamics simulations on proteins. Dimensionality reduction algorithms are particularly appealing in this regard as they allow one to construct unbiased, low-dimensional representations of the trajectory using only the information encoded in the trajectory. The downside of this approach is that a different set of coordinates are required for each different chemical system under study precisely because the coordinates are constructed using information from the trajectory. In this paper, we show how one can resolve this problem by using the sketch-map algorithm that we recently proposed to construct a low-dimensional representation of the structures contained in the protein data bank. We show that the resulting coordinates are as useful for analyzing trajectory data as coordinates constructed using landmark configurations taken from the trajectory and that these coordinates can thus be used for understanding protein folding across a range of systems.

  15. Leveraging Existing Heritage Documentation for Animations: Senate Virtual Tour

    Science.gov (United States)

    Dhanda, A.; Fai, S.; Graham, K.; Walczak, G.

    2017-08-01

    The use of digital documentation techniques has led to an increase in opportunities for using documentation data for valorization purposes, in addition to technical purposes. Likewise, building information models (BIMs) made from these data sets hold valuable information that can be as effective for public education as it is for rehabilitation. A BIM can reveal the elements of a building, as well as the different stages of a building over time. Valorizing this information increases the possibility for public engagement and interest in a heritage place. Digital data sets were leveraged by the Carleton Immersive Media Studio (CIMS) for parts of a virtual tour of the Senate of Canada. For the tour, workflows involving four different programs were explored to determine an efficient and effective way to leverage the existing documentation data to create informative and visually enticing animations for public dissemination: Autodesk Revit, Enscape, Autodesk 3ds Max, and Bentley Pointools. The explored workflows involve animations of point clouds, BIMs, and a combination of the two.

  16. Inform: Efficient Information-Theoretic Analysis of Collective Behaviors

    Directory of Open Access Journals (Sweden)

    Douglas G. Moore

    2018-06-01

    Full Text Available The study of collective behavior has traditionally relied on a variety of different methodological tools ranging from more theoretical methods such as population or game-theoretic models to empirical ones like Monte Carlo or multi-agent simulations. An approach that is increasingly being explored is the use of information theory as a methodological framework to study the flow of information and the statistical properties of collectives of interacting agents. While a few general purpose toolkits exist, most of the existing software for information theoretic analysis of collective systems is limited in scope. We introduce Inform, an open-source framework for efficient information theoretic analysis that exploits the computational power of a C library while simplifying its use through a variety of wrappers for common higher-level scripting languages. We focus on two such wrappers here: PyInform (Python) and rinform (R). Inform and its wrappers are cross-platform and general-purpose. They include classical information-theoretic measures, measures of information dynamics and information-based methods to study the statistical behavior of collective systems, and expose a lower-level API that allows users to construct measures of their own. We describe the architecture of the Inform framework, study its computational efficiency and use it to analyze three different case studies of collective behavior: biochemical information storage in regenerating planaria, nest-site selection in the ant Temnothorax rugatulus, and collective decision making in multi-agent simulations.
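
    As an illustration of the "classical information-theoretic measures" mentioned in this abstract, the sketch below computes Shannon entropy and mutual information for discrete time series using only the Python standard library. It deliberately does not call Inform or PyInform, whose API is not described in the abstract; the function names are hypothetical and the estimates are simple plug-in (frequency-count) estimates.

```python
import math
from collections import Counter


def shannon_entropy(series):
    """Plug-in Shannon entropy, in bits, of a sequence of hashable states."""
    counts = Counter(series)
    n = len(series)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def mutual_information(xs, ys):
    """Mutual information I(X;Y) = H(X) + H(Y) - H(X,Y), in bits."""
    joint = list(zip(xs, ys))
    return shannon_entropy(xs) + shannon_entropy(ys) - shannon_entropy(joint)


# Two hypothetical binary agents observed over four time steps.
agent_a = [0, 0, 1, 1]
agent_b = [0, 1, 0, 1]
independent = mutual_information(agent_a, agent_b)  # states are unrelated
identical = mutual_information(agent_a, agent_a)    # states fully shared
```

    With these toy series the two agents share no information, while a series compared with itself shares its full one bit of entropy. A compiled library such as Inform would be preferable for long series or for measures of information dynamics.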

  17. Synthesis and Comparison of Baseline Avian and Bat Use, Raptor Nesting and Mortality Information from Proposed and Existing Wind Developments: Final Report.

    Energy Technology Data Exchange (ETDEWEB)

    Erickson, Wallace P.

    2002-12-01

    Primarily due to concerns generated from observed raptor mortality at the Altamont Pass (CA) wind plant, one of the first commercial electricity generating wind plants in the U.S., new proposed wind projects both within and outside of California have received a great deal of scrutiny and environmental review. A large amount of baseline and operational monitoring data have been collected at proposed and existing U.S. wind plants. The primary use of the avian baseline data collected at wind developments has been to estimate the overall project impacts (e.g., very low, low, moderate, and high relative mortality) on birds, especially raptors and sensitive species (e.g., state and federally listed species). In a few cases, these data have also been used for guiding placement of turbines within a project boundary. This new information has strengthened our ability to accurately predict and mitigate impacts from new projects. This report should assist various stakeholders in the interpretation and use of this large information source in evaluating new projects. This report also suggests that the level of baseline data (e.g., avian use data) required to adequately assess expected impacts of some projects may be reduced. This report provides an evaluation of the ability to predict direct impacts on avian resources (primarily raptors and waterfowl/waterbirds) using less than an entire year of baseline avian use data (one season, two seasons, etc.). This evaluation is important because pre-construction wildlife surveys can be one of the most time-consuming aspects of permitting wind power projects. For baseline data, this study focuses primarily on standardized avian use data usually collected using point count survey methodology and raptor nest survey data. In addition to avian use and raptor nest survey data, other baseline data is usually collected at a proposed project to further quantify potential impacts. These surveys often include vegetation mapping and state or

  18. Awareness of patients about existing oral precancerous lesions/conditions in Nashik city of Maharashtra

    Directory of Open Access Journals (Sweden)

    Bhushan Sukdeo Ahire

    2016-01-01

    Full Text Available Introduction: Many oral squamous cell carcinomas develop from premalignant lesions/conditions of oral cavity. Hence, the awareness of such lesions/conditions is important. Aim: To assess the awareness about existing oral precancerous lesions/conditions among patients arriving for dental treatment at a dental hospital, in Nashik city of Maharashtra. Materials and Methods: A questionnaire was used to collect information from 80 patients with existing oral precancerous lesions/conditions attending the dental hospital, in Nashik city of Maharashtra. The questionnaire included questions to ascertain information on sociodemographic parameters, awareness, and sources of information about oral precancerous lesions/conditions, habit of tobacco, areca nut chewing, smoking, alcohol, and combined habits. Results: We found that 40% (n = 32) of respondents knew about the existence of a lesion in their mouth, of whom only 50% (out of the 40%) thought that it was a precancerous lesion/condition. Among all subjects, only 47.5% (n = 38) were aware of oral precancerous lesions/conditions. Television was the major source of information about oral precancerous lesions/conditions. Almost all the subjects (97.5%) wanted more information about oral precancerous lesions/conditions, preferably through television (42.5%) and lectures (27.5%). Conclusion: Awareness of patients (coming to hospital) about oral precancerous lesions/conditions was found to be low. The people must be made aware of symptoms, signs, and preventive strategies of oral precancerous lesions/conditions through their preferred media – television and lectures.

  19. Predicting Genes Involved in Human Cancer Using Network Contextual Information

    Directory of Open Access Journals (Sweden)

    Rahmani Hossein

    2012-03-01

    Full Text Available Protein-Protein Interaction (PPI) networks have been widely used for the task of predicting proteins involved in cancer. Previous research has shown that functional information about the protein for which a prediction is made, proximity to specific other proteins in the PPI network, as well as local network structure are informative features in this respect. In this work, we introduce two new types of input features, reflecting additional information: (1) Functional Context: the functions of proteins interacting with the target protein (rather than the protein itself); and (2) Structural Context: the relative position of the target protein with respect to specific other proteins selected according to a novel ANOVA (analysis of variance) based measure. We also introduce a selection strategy to pinpoint the most informative features. Results show that the proposed feature types and feature selection strategy yield informative features. A standard machine learning method (Naive Bayes) that uses the features proposed here outperforms the current state-of-the-art methods by more than 5% with respect to F-measure. In addition, manual inspection confirms the biological relevance of the top-ranked features.

  20. Deep learning methods for protein torsion angle prediction.

    Science.gov (United States)

    Li, Haiou; Hou, Jie; Adhikari, Badri; Lyu, Qiang; Cheng, Jianlin

    2017-09-18

    Deep learning is one of the most powerful machine learning methods and has achieved state-of-the-art performance in many domains. Since deep learning was introduced to the field of bioinformatics in 2012, it has achieved success in a number of areas such as protein residue-residue contact prediction, secondary structure prediction, and fold recognition. In this work, we developed deep learning methods to improve the prediction of torsion (dihedral) angles of proteins. We design four different deep learning architectures to predict protein torsion angles. The architectures include a deep neural network (DNN) and a deep restricted Boltzmann machine (DRBM), as well as a deep recurrent neural network (DRNN) and a deep recurrent restricted Boltzmann machine (DReRBM), since protein torsion angle prediction is a sequence-related problem. In addition to existing protein features, two new features (predicted residue contact number and the error distribution of torsion angles extracted from sequence fragments) are used as input to each of the four deep learning architectures to predict phi and psi angles of the protein backbone. The mean absolute error (MAE) of phi and psi angles predicted by DRNN, DReRBM, DRBM and DNN is about 20-21° and 29-30° on an independent dataset. The MAE of the phi angle is comparable to the existing methods, but the MAE of the psi angle is 29°, 2° lower than the existing methods. On the latest CASP12 targets, our methods also achieved performance better than or comparable to a state-of-the-art method. Our experiment demonstrates that deep learning is a valuable method for predicting protein torsion angles. The deep recurrent network architecture performs slightly better than the deep feed-forward architecture, and the predicted residue contact number and the error distribution of torsion angles extracted from sequence fragments are useful features for improving prediction accuracy.
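
    The MAE figures quoted above are for angles, which are periodic. The abstract does not say how periodicity was handled, so the following is only a minimal sketch of one common convention: take the shorter way around the circle before averaging. The function name `angular_mae` and the sample values are illustrative assumptions, not taken from the paper.

```python
def angular_mae(predicted, observed):
    """Mean absolute error between two equal-length lists of angles in degrees.

    Assumption: errors are wrapped so that, for example, 179 and -179 degrees
    are treated as 2 degrees apart rather than 358.
    """
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for p, o in zip(predicted, observed):
        diff = abs(p - o) % 360.0          # raw difference folded into [0, 360)
        total += min(diff, 360.0 - diff)   # shorter arc between the two angles
    return total / len(predicted)


if __name__ == "__main__":
    # Hypothetical phi predictions versus reference values, in degrees.
    print(angular_mae([-60.0, 179.0], [-65.0, -179.0]))
```

    Without the wrap step, the second pair would contribute 358° instead of 2° and dominate the average.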

  1. Extracting knowledge from protein structure geometry

    DEFF Research Database (Denmark)

    Røgen, Peter; Koehl, Patrice

    2013-01-01

    potential from geometric knowledge extracted from native and misfolded conformers of protein structures. This new potential, Metric Protein Potential (MPP), has two main features that are key to its success. Firstly, it is composite in that it includes local and nonlocal geometric information on proteins...

  2. Kirkwood-Buff Approach Rescues Overcollapse of a Disordered Protein in Canonical Protein Force Fields.

    Science.gov (United States)

    Mercadante, Davide; Milles, Sigrid; Fuertes, Gustavo; Svergun, Dmitri I; Lemke, Edward A; Gräter, Frauke

    2015-06-25

    Understanding the function of intrinsically disordered proteins is intimately related to our capacity to correctly sample their conformational dynamics. So far, a gap between experimentally and computationally derived ensembles exists, as simulations show overcompacted conformers. Increasing evidence suggests that the solvent plays a crucial role in shaping the ensembles of intrinsically disordered proteins and has led to several attempts to modify water parameters and thereby favor protein-water over protein-protein interactions. This study tackles the problem from a different perspective, which is the use of the Kirkwood-Buff theory of solutions to reproduce the correct conformational ensemble of intrinsically disordered proteins (IDPs). A protein force field recently developed on such a basis was found to be highly effective in reproducing ensembles for a fragment from the FG-rich nucleoporin 153, with dimensions matching experimental values obtained from small-angle X-ray scattering and single molecule FRET experiments. Kirkwood-Buff theory presents a complementary and fundamentally different approach to the recently developed four-site TIP4P-D water model, both of which can rescue the overcollapse observed in IDPs with canonical protein force fields. As such, our study provides a new route for tackling the deficiencies of current protein force fields in describing protein solvation.

  3. SDSL-ESR-based protein structure characterization.

    Science.gov (United States)

    Strancar, Janez; Kavalenka, Aleh; Urbancic, Iztok; Ljubetic, Ajasja; Hemminga, Marcus A

    2010-03-01

    As proteins are key molecules in living cells, knowledge about their structure can provide important insights and applications in science, biotechnology, and medicine. However, many protein structures are still a big challenge for existing high-resolution structure-determination methods, as can be seen in the number of protein structures published in the Protein Data Bank. This is especially the case for less-ordered, more hydrophobic and more flexible protein systems. The lack of efficient methods for structure determination calls for urgent development of a new class of biophysical techniques. This work attempts to address this problem with a novel combination of site-directed spin labelling electron spin resonance spectroscopy (SDSL-ESR) and protein structure modelling, which is coupled by restriction of the conformational spaces of the amino acid side chains. Comparison of the application to four different protein systems enables us to generalize the new method and to establish a general procedure for determination of protein structure.

  4. Cell penetrating peptides to dissect host-pathogen protein-protein interactions in Theileria -transformed leukocytes

    KAUST Repository

    Haidar, Malak

    2017-09-08

    One powerful application of cell penetrating peptides is the delivery into cells of molecules that function as specific competitors or inhibitors of protein-protein interactions. Ablating defined protein-protein interactions is a refined way to explore their contribution to a particular cellular phenotype in a given disease context. Cell-penetrating peptides can be synthetically constrained through various chemical modifications that stabilize a given structural fold with the potential to improve competitive binding to specific targets. Theileria-transformed leukocytes display high PKA activity, but PKA is an enzyme that plays key roles in multiple cellular processes; consequently genetic ablation of kinase activity gives rise to a myriad of confounding phenotypes. By contrast, ablation of a specific kinase-substrate interaction has the potential to give more refined information and we illustrate this here by describing how surgically ablating PKA interactions with BAD gives precise information on the type of glycolysis performed by Theileria-transformed leukocytes. In addition, we provide two other examples of how ablating specific protein-protein interactions in Theileria-infected leukocytes leads to precise phenotypes and argue that constrained penetrating peptides have great therapeutic potential to combat infectious diseases in general.

  5. Information, Informal finance, and SME financing

    Institute of Scientific and Technical Information of China (English)

    LIN Justin Yifu; SUN Xifang

    2006-01-01

    Informal finance exists extensively and has been playing an important role in small- and medium-sized enterprise (SME) financing in developing economies. This paper tries to rationalize the extensiveness of informal finance. SME financing suffers more serious information asymmetry to the extent that most SMEs are more opaque and can only provide less collateral. Informal lenders have an advantage over formal financial institutions in collecting "soft information" about SME borrowers. This paper establishes a model including formal and informal lenders and high- and low-risk borrowers with or without sufficient collateral and shows that the credit market in which informal finance is eliminated will allocate funds in some inefficient way, and the efficiency of allocating credit funds can be improved once informal finance is allowed to coexist with formal finance.

  6. Use of operational data for the assessment of pre-existing software

    International Nuclear Information System (INIS)

    Helminen, Atte; Gran, Bjoern Axel; Kristiansen, Monica; Winther, Rune

    2004-01-01

    To build sufficient confidence in the reliability of the safety systems of nuclear power plants, all available sources of information should be used. One important data source is the operational experience collected for the system. The operational experience is particularly applicable for systems of pre-existing software. Even though systems and devices involving pre-existing software are not considered for the functions of highest safety levels of nuclear power plants, they will most probably be introduced to functions of lower safety levels and to non-safety-related applications. In the paper we briefly discuss the use of operational experience data for the reliability assessment of pre-existing software in general, and the role of pre-existing software in relation to safety applications. Then we discuss the modelling of operational profiles, the application of expert judgement on operational profiles and the need for a realistic test case. Finally, we discuss the application of operational experience data in Bayesian statistics. (Author)

  7. InSilico Proteomics System: Integration and Application of Protein and Protein-Protein Interaction Data using Microsoft .NET

    Directory of Open Access Journals (Sweden)

    Straßer Wolfgang

    2006-12-01

    Full Text Available In the last decades, biological databases became the major knowledge resource for researchers in the field of molecular biology. The distribution of information among these databases is one of the major problems. An overview about the subject area of data access and representation of protein and protein-protein interaction data within public biological databases is described. For a comprehensive and consistent way of searching and analysing integrated protein and protein-protein interaction data, the InSilico Proteomics (ISP) project has been initiated. Its three main objectives are (1) to provide an integrated knowledge pool for data investigation and global network analysis functions for a better understanding of a cell’s interactome, (2) employment of public data for plausibility analysis and validation of in-house experimental data and (3) testing the applicability of Microsoft’s .NET architecture for bioinformatics applications. Data integrated into the ISP database can be queried through the Web portal PRIMOS (PRotein Interaction and MOlecule Search), which is freely available at http://biomis.fh-hagenberg.at/isp/primos.

  8. Chemical cross-linking and mass spectrometry for protein structural modeling

    NARCIS (Netherlands)

    Back, Jaap Willem; de Jong, Luitzen; Muijsers, Anton O.; de Koster, Chris G.

    2003-01-01

    The growth of gene and protein sequence information is currently so rapid that three-dimensional structural information is lacking for the overwhelming majority of known proteins. In this review, efforts towards rapid and sensitive methods for protein structural characterization are described,

  9. Improving prediction of heterodimeric protein complexes using combination with pairwise kernel.

    Science.gov (United States)

    Ruan, Peiying; Hayashida, Morihiro; Akutsu, Tatsuya; Vert, Jean-Philippe

    2018-02-19

    Since many proteins become functional only after they interact with their partner proteins and form protein complexes, it is essential to identify the sets of proteins that form complexes. Therefore, several computational methods have been proposed to predict complexes from the topology and structure of experimental protein-protein interaction (PPI) networks. These methods work well to predict complexes involving at least three proteins, but generally fail at identifying complexes involving only two different proteins, called heterodimeric complexes or heterodimers. There is however an urgent need for efficient methods to predict heterodimers, since the majority of known protein complexes are precisely heterodimers. In this paper, we use three promising kernel functions, Min kernel and two pairwise kernels, which are Metric Learning Pairwise Kernel (MLPK) and Tensor Product Pairwise Kernel (TPPK). We also consider the normalization forms of Min kernel. Then, we combine Min kernel or its normalization form and one of the pairwise kernels by plugging. We applied kernels based on PPI, domain, phylogenetic profile, and subcellular localization properties to predicting heterodimers. Then, we evaluate our method by employing C-Support Vector Classification (C-SVC), carrying out 10-fold cross-validation, and calculating the average F-measures. The results suggest that the combination of normalized-Min-kernel and MLPK leads to the best F-measure and improved the performance of our previous work, which had been the best existing method so far. We propose new methods to predict heterodimers, using a machine learning-based approach. We train a support vector machine (SVM) to discriminate interacting vs non-interacting protein pairs, based on information extracted from PPI, domain, phylogenetic profiles and subcellular localization. We evaluate in detail new kernel functions to encode these data, and report prediction performance that outperforms the state-of-the-art.
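
    To make "plugging" a base kernel into a pairwise kernel concrete, here is a minimal sketch using the usual textbook forms of TPPK and MLPK for unordered pairs (a, b) and (c, d). The base kernel below is a plain dot product used as a stand-in; it is not the Min kernel of the paper, whose exact definition the abstract does not give, and the feature vectors are made up.

```python
def dot_kernel(x, y):
    """Stand-in base kernel on single proteins (an assumption, not the paper's Min kernel)."""
    return sum(a * b for a, b in zip(x, y))


def tppk(pair1, pair2, base=dot_kernel):
    """Tensor Product Pairwise Kernel: K(a,c)K(b,d) + K(a,d)K(b,c)."""
    (a, b), (c, d) = pair1, pair2
    return base(a, c) * base(b, d) + base(a, d) * base(b, c)


def mlpk(pair1, pair2, base=dot_kernel):
    """Metric Learning Pairwise Kernel: (K(a,c) - K(a,d) - K(b,c) + K(b,d))^2."""
    (a, b), (c, d) = pair1, pair2
    return (base(a, c) - base(a, d) - base(b, c) + base(b, d)) ** 2


if __name__ == "__main__":
    # Hypothetical two-dimensional feature vectors for four proteins.
    a, b, c, d = [1, 0], [0, 1], [1, 1], [2, 0]
    print(tppk((a, b), (c, d)), mlpk((a, b), (c, d)))
    # Both kernels ignore the order of proteins within a pair.
    print(tppk((b, a), (c, d)), mlpk((b, a), (c, d)))
```

    Swapping `base` for another kernel on single proteins is all that "plugging" amounts to in this sketch; the resulting pairwise Gram matrix could then be passed to an SVM that accepts precomputed kernels.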

  10. Protein binding of psychotropic agents

    International Nuclear Information System (INIS)

    Hassan, H.A.

    1990-01-01

    Based upon fluorescence measurements, protein binding of some psychotropic agents (chlorpromazine, promethazine, and trifluoperazine) to human IgG and HSA was studied in aqueous cacodylate buffer, pH 7. The interaction parameters were determined from emission quenching of the proteins. The interaction parameters determined include the equilibrium constant (K), calculated from equations derived by Borazan and coworkers, and the number of binding sites (n) available to the monomer molecules on a single protein molecule. The results revealed a high level of affinity, as reflected by high values of K, and the existence of specific binding sites, since a limited number of n values are obtained. 39 tabs.; 37 figs.; 83 refs

  11. 77 FR 57080 - Proposed Agency Information Collection

    Science.gov (United States)

    2012-09-17

    ...-0149: Existing; (2) Information Collection Request Title: Evaluation of the Financial Reporting System... existing data collection, the Financial Reporting System, EIA-28. This is not a request to collect data... DEPARTMENT OF ENERGY Energy Information Administration Proposed Agency Information Collection...

  12. Protein status elicits compensatory changes in food intake and food preferences

    NARCIS (Netherlands)

    Griffioen-Roose, S.; Mars, M.; Siebelink, E.; Finlayson, G.; Tome, D.; Graaf, de C.

    2012-01-01

    Background: Protein is an indispensable component within the human diet. It is unclear, however, whether behavioral strategies exist to avoid shortages. Objective: The objective was to investigate the effect of a low protein status compared with a high protein status on food intake and food

  13. Identification of DNA-Binding Proteins Using Mixed Feature Representation Methods.

    Science.gov (United States)

    Qu, Kaiyang; Han, Ke; Wu, Song; Wang, Guohua; Wei, Leyi

    2017-09-22

    DNA-binding proteins play vital roles in cellular processes, such as DNA packaging, replication, transcription, regulation, and other DNA-associated activities. The current main prediction method is based on machine learning, and its accuracy mainly depends on the features extraction method. Therefore, using an efficient feature representation method is important to enhance the classification accuracy. However, existing feature representation methods cannot efficiently distinguish DNA-binding proteins from non-DNA-binding proteins. In this paper, a multi-feature representation method, which combines three feature representation methods, namely, K-Skip-N-Grams, Information theory, and Sequential and structural features (SSF), is used to represent the protein sequences and improve feature representation ability. In addition, the classifier is a support vector machine. The mixed-feature representation method is evaluated using 10-fold cross-validation and a test set. Feature vectors, which are obtained from a combination of three feature extractions, show the best performance in 10-fold cross-validation both under non-dimensional reduction and dimensional reduction by max-relevance-max-distance. Moreover, the reduced mixed feature method performs better than the non-reduced mixed feature technique. The feature vectors, which are a combination of SSF and K-Skip-N-Grams, show the best performance in the test set. Among these methods, mixed features exhibit superiority over the single features.

  14. Identification of DNA-Binding Proteins Using Mixed Feature Representation Methods

    Directory of Open Access Journals (Sweden)

    Kaiyang Qu

    2017-09-01

    Full Text Available DNA-binding proteins play vital roles in cellular processes, such as DNA packaging, replication, transcription, regulation, and other DNA-associated activities. The current main prediction method is based on machine learning, and its accuracy mainly depends on the features extraction method. Therefore, using an efficient feature representation method is important to enhance the classification accuracy. However, existing feature representation methods cannot efficiently distinguish DNA-binding proteins from non-DNA-binding proteins. In this paper, a multi-feature representation method, which combines three feature representation methods, namely, K-Skip-N-Grams, Information theory, and Sequential and structural features (SSF), is used to represent the protein sequences and improve feature representation ability. In addition, the classifier is a support vector machine. The mixed-feature representation method is evaluated using 10-fold cross-validation and a test set. Feature vectors, which are obtained from a combination of three feature extractions, show the best performance in 10-fold cross-validation both under non-dimensional reduction and dimensional reduction by max-relevance-max-distance. Moreover, the reduced mixed feature method performs better than the non-reduced mixed feature technique. The feature vectors, which are a combination of SSF and K-Skip-N-Grams, show the best performance in the test set. Among these methods, mixed features exhibit superiority over the single features.

  15. Co-existing institutional logics and agency among top-level public servants

    DEFF Research Database (Denmark)

    Bjerregaard, Toke

    2011-01-01

    to address parts of this void. This study examines the agency exerted by top-level public servants through their everyday strategy and policy work in face of co-existing logics of public administration. The findings illustrate how their action strategies span from more passive strategies of coping...... with coexisting logics of administration to more skilled agency of combining logics aimed at enhancing their opportunity and action space. The study suggests that the interplay between co-existing institutional logics, action strategies and the practical skills of top-level public servants provides the basis...... for both coping and more proactive strategies in pluralistic public administrations. Findings illustrate the role of public servants' practical sense of realizable opportunities that inform such strategies of handling co-existing institutional logics. Implications for institutional studies of organizations...

  16. Hidden Structural Codes in Protein Intrinsic Disorder.

    Science.gov (United States)

    Borkosky, Silvia S; Camporeale, Gabriela; Chemes, Lucía B; Risso, Marikena; Noval, María Gabriela; Sánchez, Ignacio E; Alonso, Leonardo G; de Prat Gay, Gonzalo

    2017-10-17

    Intrinsic disorder is a major structural category in biology, accounting for more than 30% of coding regions across the domains of life, yet consists of conformational ensembles in equilibrium, a major challenge in protein chemistry. Anciently evolved papillomavirus genomes constitute an unparalleled case for sequence to structure-function correlation in cases in which there are no folded structures. E7, the major transforming oncoprotein of human papillomaviruses, is a paradigmatic example among the intrinsically disordered proteins. Analysis of a large number of sequences of the same viral protein allowed for the identification of a handful of residues with absolute conservation, scattered along the sequence of its N-terminal intrinsically disordered domain, which intriguingly are mostly leucine residues. Mutation of these led to a pronounced increase in both α-helix and β-sheet structural content, reflected by drastic effects on equilibrium propensities and oligomerization kinetics, and uncovered the existence of local structural elements that oppose canonical folding. These folding relays suggest the existence of yet undefined hidden structural codes behind intrinsic disorder in this model protein. Thus, evolution pinpoints conformational hot spots that could not have been identified by direct experimental methods for analyzing or perturbing the equilibrium of an intrinsically disordered protein ensemble.

  17. Review of existing residential energy efficiency certification and rating programs

    Energy Technology Data Exchange (ETDEWEB)

    Hendrickson, P.L.

    1986-11-01

    This report was prepared for the Office of Buildings and Community Systems, US Department of Energy (DOE). The principal objective of the report is to present information on existing Home Energy Rating Systems (HERS) and their features. Much of the information in this report updates a 1982 report (PNL-4359), also prepared by the Pacific Northwest Laboratory (PNL) for DOE. Secondary objectives of the report are to qualitatively examine the benefits and costs of HERS programs, review survey results on the attitudes of various user groups toward the programs, and discuss selected design and implementation issues.

  18. ExoLocator--an online view into genetic makeup of vertebrate proteins.

    Science.gov (United States)

    Khoo, Aik Aun; Ogrizek-Tomas, Mario; Bulovic, Ana; Korpar, Matija; Gürler, Ece; Slijepcevic, Ivan; Šikic, Mile; Mihalek, Ivana

    2014-01-01

    ExoLocator (http://exolocator.eopsf.org) collects in a single place information needed for comparative analysis of protein-coding exons from vertebrate species. The main source of data--the genomic sequences, and the existing exon and homology annotation--is the ENSEMBL database of completed vertebrate genomes. To these, ExoLocator adds the search for ostensibly missing exons in orthologous protein pairs across species, using an extensive computational pipeline to narrow down the search region for the candidate exons and find a suitable template in the other species, as well as state-of-the-art implementations of pairwise alignment algorithms. The resulting complements of exons are organized in a way currently unique to ExoLocator: multiple sequence alignments, both on the nucleotide and on the peptide levels, clearly indicating the exon boundaries. The alignments can be inspected in the web-embedded viewer, downloaded or used on the spot to produce an estimate of conservation within orthologous sets, or functional divergence across paralogues.

  19. Updating flood maps efficiently using existing hydraulic models, very-high-accuracy elevation data, and a geographic information system; a pilot study on the Nisqually River, Washington

    Science.gov (United States)

    Jones, Joseph L.; Haluska, Tana L.; Kresch, David L.

    2001-01-01

    A method of updating flood inundation maps at a fraction of the expense of using traditional methods was piloted in Washington State as part of the U.S. Geological Survey Urban Geologic and Hydrologic Hazards Initiative. Large savings in expense may be achieved by building upon previous Flood Insurance Studies and automating the process of flood delineation with a Geographic Information System (GIS); increases in accuracy and detail result from the use of very-high-accuracy elevation data and automated delineation; and the resulting digital data sets contain valuable ancillary information such as flood depth, as well as greatly facilitating map storage and utility. The method consists of creating stage-discharge relations from the archived output of the existing hydraulic model, using these relations to create updated flood stages for recalculated flood discharges, and using a GIS to automate the map generation process. Many of the effective flood maps were created in the late 1970s and early 1980s, and suffer from a number of well recognized deficiencies such as out-of-date or inaccurate estimates of discharges for selected recurrence intervals, changes in basin characteristics, and relatively low quality elevation data used for flood delineation. FEMA estimates that 45 percent of effective maps are over 10 years old (FEMA, 1997). Consequently, Congress has mandated the updating and periodic review of existing maps, which have cost the Nation almost 3 billion (1997) dollars. The need to update maps and the cost of doing so were the primary motivations for piloting a more cost-effective and efficient updating method. New technologies such as Geographic Information Systems and LIDAR (Light Detection and Ranging) elevation mapping are key to improving the efficiency of flood map updating, but they also improve the accuracy, detail, and usefulness of the resulting digital flood maps. GISs produce digital maps without manual estimation of inundated areas between
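
    The step of "using these relations to create updated flood stages for recalculated flood discharges" can be pictured as a table lookup. The sketch below assumes the stage-discharge relation at a cross section is stored as sorted (discharge, stage) pairs and uses plain linear interpolation; the report excerpt does not state which interpolation was used, and the function name, units, and numbers are hypothetical.

```python
from bisect import bisect_left


def stage_for_discharge(rating, discharge):
    """Interpolate a flood stage from a stage-discharge table.

    rating: list of (discharge, stage) tuples sorted by increasing discharge,
            e.g. taken from archived hydraulic-model output (assumed layout).
    discharge: the recalculated flood discharge to look up.
    """
    discharges = [q for q, _ in rating]
    if not discharges[0] <= discharge <= discharges[-1]:
        # Refuse to extrapolate beyond the modelled range.
        raise ValueError("discharge outside the range of the rating table")
    i = bisect_left(discharges, discharge)
    if discharges[i] == discharge:
        return rating[i][1]
    (q0, h0), (q1, h1) = rating[i - 1], rating[i]
    return h0 + (h1 - h0) * (discharge - q0) / (q1 - q0)


if __name__ == "__main__":
    # Hypothetical table for one cross section: (discharge, stage).
    table = [(100.0, 1.0), (200.0, 2.0), (400.0, 3.0)]
    print(stage_for_discharge(table, 300.0))
```

    Repeating the lookup at every cross section would give the updated stages that a GIS could then intersect with the elevation data to delineate the inundated area.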

  20. Social Loafing in the Refugee Crisis: Information about Existing Initiatives Decreases Willingness to Help

    Directory of Open Access Journals (Sweden)

    Simon Schindler

    2017-05-01

    Full Text Available In light of the European refugee situation, we investigate how information about others’ support influences individuals’ willingness to help. When individuals see information about other people supporting refugees, they may either be influenced by a descriptive norm, and act accordingly. Alternatively, they may perceive that others are already doing the job, and thus engage in social loafing. In an experiment (N = 132), we tested these competing predictions. Specifically, participants were exposed to a map of Germany that either indicated many or few helping initiatives across the country. In a control group, no map was shown. Subsequently, participants were asked about their willingness to help. While there was no effect between the two map conditions, results revealed that participants reported lower willingness to help in both map conditions, compared with the control group. Thus, providing information about helping projects results in social loafing, jeopardizing widespread communication strategies to increase solidarity.

  1. Multi-Label Learning via Random Label Selection for Protein Subcellular Multi-Locations Prediction.

    Science.gov (United States)

    Wang, Xiao; Li, Guo-Zheng

    2013-03-12

    Prediction of protein subcellular localization is an important but challenging problem, particularly when proteins may simultaneously exist at, or move between, two or more different subcellular location sites. Most of the existing protein subcellular localization methods are only used to deal with the single-location proteins. In the past few years, only a few methods have been proposed to tackle proteins with multiple locations. However, they only adopt a simple strategy, that is, transforming the multi-location proteins to multiple proteins with single location, which doesn't take correlations among different subcellular locations into account. In this paper, a novel method named RALS (multi-label learning via RAndom Label Selection), is proposed to learn from multi-location proteins in an effective and efficient way. Through five-fold cross validation test on a benchmark dataset, we demonstrate our proposed method with consideration of label correlations obviously outperforms the baseline BR method without consideration of label correlations, indicating correlations among different subcellular locations really exist and contribute to improvement of prediction performance. Experimental results on two benchmark datasets also show that our proposed methods achieve significantly higher performance than some other state-of-the-art methods in predicting subcellular multi-locations of proteins. The prediction web server is available at http://levis.tongji.edu.cn:8080/bioinfo/MLPred-Euk/ for the public usage.
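
    The two baseline strategies mentioned above can be shown as data transformations. This is a minimal sketch with made-up protein identifiers and location names: the first function copies each multi-location protein once per location, and the second builds the independent per-label 0/1 targets that a binary relevance (BR) baseline trains on. Neither function is RALS itself, which the abstract does not describe in enough detail to reproduce.

```python
def single_location_copies(proteins, location_sets):
    """Turn each multi-location protein into several single-location examples."""
    return [
        (protein, location)
        for protein, locations in zip(proteins, location_sets)
        for location in sorted(locations)
    ]


def binary_relevance_targets(location_sets, all_locations):
    """Build one independent 0/1 target vector per location (BR baseline)."""
    return {
        location: [1 if location in locations else 0 for locations in location_sets]
        for location in all_locations
    }


if __name__ == "__main__":
    # Hypothetical data: P2 is annotated with two locations.
    proteins = ["P1", "P2"]
    location_sets = [{"nucleus"}, {"cytoplasm", "nucleus"}]
    print(single_location_copies(proteins, location_sets))
    print(binary_relevance_targets(location_sets, ["cytoplasm", "nucleus"]))
```

    In both transformations each location is learned on its own, so any tendency of two locations to co-occur is invisible to the classifier; that is the gap the abstract says label-correlation methods aim to close.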

  2. Introduction of the EISA-PC into existing fusion experiments

    International Nuclear Information System (INIS)

    Tenten, W.; Bertschinger, G.; Mueller, K.D.; Reinhart, P.; Rongen, F.

    1995-01-01

    A general problem in the data acquisition field in fusion research is the lack of sufficient local memory for the storage of information acquired during a single discharge. While it is absolutely necessary to keep these data locally before transferring them to a central node, there has been a steadily increasing demand for more capacity. The introduction of an EISA-Personal-Computer with its vast and cheap memory resources is presenting a very interesting solution for the upgrade of existing installations and the design of new experiments. An innovative PC interface using Programmable Logic techniques was developed that allows easy and fast integration of a PC into an existing experimental setup. Several typical applications of this method are presented, that are of special interest for fusion experiments. (orig.)

  3. Stress proteins, autoimmunity, and autoimmune disease.

    Science.gov (United States)

    Winfield, J B; Jarjour, W N

    1991-01-01

    At birth, the immune system is biased toward recognition of microbial antigens in order to protect the host from infection. Recent data suggest that an important initial line of defense in this regard involves autologous stress proteins, especially conserved peptides of hsp60, which are presented to T cells bearing gamma delta receptors by relatively nonpolymorphic class Ib molecules. Natural antibodies may represent a parallel B cell mechanism. Through an evolving process of "physiological" autoreactivity and selection by immunodominant stress proteins common to all prokaryotes, B and T cell repertoires expand during life to meet the continuing challenge of infection. Because stress proteins of bacteria are homologous with stress proteins of the host, there exists in genetically susceptible individuals a constant risk of autoimmune disease due to failure of mechanisms for self-nonself discrimination. That stress proteins actually play a role in autoimmune processes is supported by a growing body of evidence which, collectively, suggests that autoreactivity in chronic inflammatory arthritis involves, at least initially, gamma delta cells which recognize epitopes of the stress protein hsp60. Alternate mechanisms for T cell stimulation by stress proteins undoubtedly also exist, e.g., molecular mimicry of the DR beta third hypervariable region susceptibility locus for rheumatoid arthritis by a DnaJ stress protein epitope in gram-negative bacteria. While there still is confusion with respect to the most relevant stress protein epitopes, a central role for stress proteins in the etiology of arthritis appears likely. Furthermore, insight derived from the work thus far in adjuvant-induced arthritis already is stimulating analyses of related phenomena in autoimmune diseases other than those involving joints. Only limited data are available in the area of humoral autoimmunity to stress proteins.
Autoantibodies to a number of stress proteins have been identified in SLE and

  4. Companied P16 genetic and protein status together providing useful information on the clinical outcome of urinary bladder cancer.

    Science.gov (United States)

    Pu, Xiaohong; Zhu, Liya; Fu, Yao; Fan, Zhiwen; Zheng, Jinyu; Zhang, Biao; Yang, Jun; Guan, Wenyan; Wu, Hongyan; Ye, Qing; Huang, Qing

    2018-04-01

    SPEC P16/CEN3/7/17 Probe fluorescence in situ hybridization (FISH) has become the most sensitive method for identifying urothelial tumors, and loss of P16 has often been identified in low-grade urothelial lesions; however, little is known about the significance of other P16 genetic statuses (normal and amplification) in bladder cancer. We detected P16 gene status by FISH in 259 urine samples and divided these samples into 3 groups: 1, normal P16; 2, loss of P16; and 3, amplified P16. Meanwhile, p16 protein expression was measured by immunocytochemistry and we characterized the clinicopathologic features of cases by P16 gene status. Loss of P16 occurred in 26.2%, P16 amplification in 41.3%, and normal P16 in 32.4% of all cases. P16 genetic status was significantly associated with tumor grade and primary tumor status (P = .008 and .017), but not with pathological tumor stage, overall survival, or p16 protein expression. However, cases with P16 gene amplification accompanied by high protein expression had shorter overall survival than the overall patient group (P = .023), and P16 gene loss accompanied by loss of protein also tended to predict poor prognosis (P = .067). Studies show that the genetic status of P16 is closely related to the stages of bladder cancer. Loss of P16 is associated with low-grade urothelial malignancy, while amplified P16 denotes high-grade malignancy. Neither P16 gene status nor p16 protein expression alone is an independent predictor of urothelial bladder carcinoma, but combining gene and protein status provides useful information on the clinical outcome of these patients.

  5. EXIST Perspective for SFXTs

    Science.gov (United States)

    Ubertini, Pietro; Sidoli, L.; Sguera, V.; Bazzano, A.

    2009-12-01

    Supergiant Fast X-ray Transients (SFXTs) are one of the most interesting (and unexpected) results of the INTEGRAL mission. They are a new class of HMXBs displaying short hard X-ray outbursts (duration less than a day) characterized by fast flares (few hours timescale) and large dynamic range (10E3-10E4). The physical mechanism driving their peculiar behaviour is still unclear and highly debated: some models involve the structure of the supergiant companion donor wind (likely clumpy, in a spherical or non-spherical geometry) and the orbital properties (wide separation with eccentric or circular orbit), while others involve the properties of the neutron star compact object and invoke very low magnetic field values (B 1E14 G, magnetars). The picture is still highly unclear from the observational point of view as well: no cyclotron lines have been detected in the spectra, thus the strength of the neutron star magnetic field is unknown. Orbital periods have been measured in only 4 systems, spanning from 3.3 days to 165 days. Even the duty cycle seems to be quite different from source to source. The Energetic X-ray Imaging Survey Telescope (EXIST), with its hard X-ray all-sky survey and greatly improved limiting sensitivity, will allow us to get a clearer picture of SFXTs. A complete census of their number is essential to enlarge the sample. Long-term X-ray monitoring that is as continuous as possible is crucial to (1) obtain the duty cycle, (2) investigate their unknown orbital properties (separation, orbital period, eccentricity), (3) completely cover the whole outburst activity, and (4) search for cyclotron lines in the high energy spectra. EXIST observations will provide crucial information to test the different models and shed light on the peculiar behaviour of SFXTs.

  6. Proteomic analysis of the excretory/secretory products and antigenic proteins of Echinococcus granulosus adult worms from infected dogs.

    Science.gov (United States)

    Wang, Ying; Xiao, Di; Shen, Yujuan; Han, Xiuming; Zhao, Fei; Li, Xiaohong; Wu, Weiping; Zhou, Hejun; Zhang, Jianzhong; Cao, Jianping

    2015-05-21

    Cystic echinococcosis, which is caused by Echinococcus granulosus, is one of the most widespread zoonotic helminth diseases that affects humans and livestock. Dogs, which harbor adult worms in their small intestines, are a pivotal source of E. granulosus infection in humans and domestic animals. Therefore, novel molecular approaches for the prevention and diagnosis of this parasite infection in dogs need to be developed. In this study, we performed proteomic analysis to identify excretory/secretory products (ES) and antigenic proteins of E. granulosus adult worms using two-dimensional electrophoresis, tandem matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF/TOF), and Western blotting of sera from infected dogs. This study identified 33 ES product spots corresponding to 9 different proteins and 21 antigenic protein spots corresponding to 13 different proteins. Six antigenic proteins were identified for the first time. The present study extended the existing proteomic data of E. granulosus and provides further information regarding host-parasite interactions and survival mechanisms. The results of this study contribute to vaccination and immunodiagnoses for E. granulosus infections.

  7. HIV-specific probabilistic models of protein evolution.

    Directory of Open Access Journals (Sweden)

    David C Nickle

    2007-06-01

    Full Text Available Comparative sequence analyses, including such fundamental bioinformatics techniques as similarity searching, sequence alignment and phylogenetic inference, have become a mainstay for researchers studying type 1 Human Immunodeficiency Virus (HIV-1) genome structure and evolution. Implicit in comparative analyses is an underlying model of evolution, and the chosen model can significantly affect the results. In general, evolutionary models describe the probabilities of replacing one amino acid character with another over a period of time. Most widely used evolutionary models for protein sequences have been derived from curated alignments of hundreds of proteins, usually based on mammalian genomes. It is unclear to what extent these empirical models are generalizable to a very different organism, such as HIV-1, the most extensively sequenced organism in existence. We developed a maximum likelihood model fitting procedure to a collection of HIV-1 alignments sampled from different viral genes, and inferred two empirical substitution models, suitable for describing between- and within-host evolution. Our procedure pools the information from multiple sequence alignments, and the provided software implementation can be run efficiently in parallel on a computer cluster. We describe how the inferred substitution models can be used to generate scoring matrices suitable for alignment and similarity searches. Our models had a consistently superior fit relative to the best existing models and to parameter-rich data-driven models when benchmarked on independent HIV-1 alignments, demonstrating evolutionary biases in amino-acid substitution that are unique to HIV, and that are not captured by the existing models. The scoring matrices derived from the models showed a marked difference from common amino-acid scoring matrices. The use of an appropriate evolutionary model recovered a known viral transmission history, whereas a poorly chosen model introduced phylogenetic
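
    The step from a substitution model to a scoring matrix, mentioned in the abstract above, is commonly done with log-odds scores: the score for aligning residue i with residue j is the scaled logarithm of the substitution probability divided by the background frequency of j. The sketch below illustrates that general idea only; the function name `log_odds_matrix`, the three-letter toy alphabet, the numbers, and the scale factor are all assumptions for illustration and are not taken from the paper or its software.

```python
import math

def log_odds_matrix(P, pi, scale=2.0):
    """Turn substitution probabilities into integer log-odds scores.

    P[i][j] is the probability that residue i is replaced by residue j
    over some fixed evolutionary distance; pi[j] is the background
    frequency of residue j. Both are hypothetical inputs here.
    """
    n = len(pi)
    return [
        [round(scale * math.log2(P[i][j] / pi[j])) for j in range(n)]
        for i in range(n)
    ]

# Toy 3-letter alphabet with made-up numbers (not fitted to HIV-1 or any data).
# The values satisfy pi[i] * P[i][j] == pi[j] * P[j][i] (time reversibility),
# so the resulting score matrix is symmetric.
pi = [0.5, 0.3, 0.2]
P = [
    [0.80, 0.12, 0.08],
    [0.20, 0.70, 0.10],
    [0.20, 0.15, 0.65],
]

S = log_odds_matrix(P, pi)
for row in S:
    print(row)
```

    Positive scores mark substitutions that occur more often than chance under the model, negative scores mark those that occur less often; a real amino-acid matrix would use a 20-letter alphabet and probabilities derived from the fitted model.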

  8. Semi-Supervised Learning for Classification of Protein Sequence Data

    Directory of Open Access Journals (Sweden)

    Brian R. King

    2008-01-01

    Full Text Available Protein sequence data continue to become available at an exponential rate. Annotation of functional and structural attributes of these data lags far behind, with only a small fraction of the data understood and labeled by experimental methods. Classification methods that are based on semi-supervised learning can increase the overall accuracy of classifying partly labeled data in many domains, but very few methods exist that have shown their effect on protein sequence classification. We show how proven methods from text classification can be applied to protein sequence data, as we consider both existing and novel extensions to the basic methods, and demonstrate restrictions and differences that must be considered. We demonstrate comparative results against the transductive support vector machine, and show superior results on the most difficult classification problems. Our results show that large repositories of unlabeled protein sequence data can indeed be used to improve predictive performance, particularly in situations where there are fewer labeled protein sequences available, and/or the data are highly unbalanced in nature.

  9. Modeling complexes of modeled proteins.

    Science.gov (United States)

    Anishchenko, Ivan; Kundrotas, Petras J; Vakser, Ilya A

    2017-03-01

    Structural characterization of proteins is essential for understanding life processes at the molecular level. However, only a fraction of known proteins have experimentally determined structures. This fraction is even smaller for protein-protein complexes. Thus, structural modeling of protein-protein interactions (docking) primarily has to rely on modeled structures of the individual proteins, which typically are less accurate than the experimentally determined ones. Such "double" modeling is the Grand Challenge of structural reconstruction of the interactome. Yet it remains so far largely untested in a systematic way. We present a comprehensive validation of template-based and free docking on a set of 165 complexes, where each protein model has six levels of structural accuracy, from 1 to 6 Å Cα RMSD. Many template-based docking predictions fall into the acceptable quality category, according to the CAPRI criteria, even for highly inaccurate proteins (5-6 Å RMSD), although the number of such models (and, consequently, the docking success rate) drops significantly for models with RMSD > 4 Å. The results show that the existing docking methodologies can be successfully applied to protein models with a broad range of structural accuracy, and the template-based docking is much less sensitive to inaccuracies of protein models than the free docking. Proteins 2017; 85:470-478. © 2016 Wiley Periodicals, Inc.
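
    The accuracy levels above are expressed as Cα RMSD, the root-mean-square distance between matching Cα atoms of a model and a reference structure. A minimal sketch of that measure follows, assuming the two coordinate lists are already optimally superimposed and listed in the same residue order; the function name `ca_rmsd` and the coordinates are hypothetical, and the superposition step itself is not shown.

```python
import math

def ca_rmsd(model, reference):
    """Root-mean-square deviation between two equally long lists of
    (x, y, z) C-alpha coordinates in angstroms.

    Assumes the structures are already superimposed; no fitting is done here.
    """
    if len(model) != len(reference):
        raise ValueError("coordinate lists must have the same length")
    squared = sum(
        (a - b) ** 2
        for p, q in zip(model, reference)
        for a, b in zip(p, q)
    )
    return math.sqrt(squared / len(model))

# Hypothetical three-residue example: every model atom is displaced by
# (0, 3, 4) from the reference, i.e. by exactly 5 angstroms.
reference = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
model = [(0.0, 3.0, 4.0), (3.8, 3.0, 4.0), (7.6, 3.0, 4.0)]

print(ca_rmsd(model, reference))
```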

  10. Texturized dairy proteins.

    Science.gov (United States)

    Onwulata, Charles I; Phillips, John G; Tunick, Michael H; Qi, Phoebi X; Cooke, Peter H

    2010-03-01

    Dairy proteins are amenable to structural modifications induced by high temperature, shear, and moisture; in particular, whey proteins can change conformation to new unfolded states. The change in protein state is a basis for creating new foods. The dairy products, nonfat dried milk (NDM), whey protein concentrate (WPC), and whey protein isolate (WPI) were modified using a twin-screw extruder at melt temperatures of 50, 75, and 100 degrees C, and moistures ranging from 20 to 70 wt%. Viscoelasticity and solubility measurements showed that extrusion temperature was a more significant (P extruded dairy protein ranged from rigid (2500 N) to soft (2.7 N). Extruding at or above 75 degrees C resulted in increased peak force for WPC (138 to 2500 N) and WPI (2.7 to 147.1 N). NDM was marginally texturized; the presence of lactose interfered with its texturization. WPI products extruded at 50 degrees C were not texturized; their solubility values ranged from 71.8% to 92.6%. A wide possibility exists for creating new foods with texturized dairy proteins due to the extensive range of states achievable. Dairy proteins can be used to boost the protein content in puffed snacks made from corn meal, but unmodified, they bind water and form doughy pastes with starch. To minimize the water binding property of dairy proteins, WPI, or WPC, or NDM were modified by extrusion processing. Extrusion temperature conditions were adjusted to 50, 75, or 100 degrees C, sufficient to change the structure of the dairy proteins, but not destroy them. Extrusion modified the structures of these dairy proteins for ease of use in starchy foods to boost nutrient levels.

  11. SwissPalm: Protein Palmitoylation database.

    Science.gov (United States)

    Blanc, Mathieu; David, Fabrice; Abrami, Laurence; Migliozzi, Daniel; Armand, Florence; Bürgi, Jérôme; van der Goot, Françoise Gisou

    2015-01-01

    Protein S-palmitoylation is a reversible post-translational modification that regulates many key biological processes, although the full extent and functions of protein S-palmitoylation remain largely unexplored. Recent developments of new chemical methods have allowed the establishment of palmitoyl-proteomes of a variety of cell lines and tissues from different species. As the amount of information generated by these high-throughput studies is increasing, the field requires centralization and comparison of this information. Here we present SwissPalm (http://swisspalm.epfl.ch), our open, comprehensive, manually curated resource to study protein S-palmitoylation. It currently encompasses more than 5000 S-palmitoylated protein hits from seven species, and contains more than 500 specific sites of S-palmitoylation. SwissPalm also provides curated information and filters that increase the confidence in true positive hits, and integrates predictions of S-palmitoylated cysteine scores, orthologs and isoform multiple alignments. Systems analysis of the palmitoyl-proteome screens indicates that 10% or more of the human proteome is susceptible to S-palmitoylation. Moreover, ontology and pathway analyses of the human palmitoyl-proteome reveal that key biological functions involve this reversible lipid modification. Comparative analysis finally shows a strong crosstalk between S-palmitoylation and other post-translational modifications. Through the compilation of data and continuous updates, SwissPalm will provide a powerful tool to unravel the global importance of protein S-palmitoylation.

  12. Indoor air quality in mechanically ventilated residential dwellings/low-rise buildings: A review of existing information

    DEFF Research Database (Denmark)

    Aganovic, Amar; Hamon, Mathieu; Kolarik, Jakub

    Mechanical ventilation has become a mandatory requirement in multiple European standards addressing indoor air quality (IAQ) and ventilation in residential dwellings (single family houses and low-rise apartment buildings). This article presents the state of the art study through a review...... of the existing literature, to establish a link between ventilation rate and key indoor air pollutants. Design characteristics of a mechanical ventilation system such as supply/exhaust airflow, system and design of supply and exhaust outlets were considered. The performance of various ventilation solutions was......-house ventilation rate was reported below 0.5h-1 or 14 l/s·person in bedrooms, the concentrations of the pollutants elevated above minimum threshold limits (CO2>1350 ppm; TVOC >3000 μg/m3) defined by the standard. Insufficient or non-existent supply of air was related to significantly higher pollutant...

  13. Large scale analysis of co-existing post-translational modifications in histone tails reveals global fine structure of cross-talk

    DEFF Research Database (Denmark)

    Schwämmle, Veit; Aspalter, Claudia-Maria; Sidoli, Simone

    2014-01-01

    Mass spectrometry (MS) is a powerful analytical method for the identification and quantification of co-existing post-translational modifications in histone proteins. One of the most important challenges in current chromatin biology is to characterize the relationships between co-existing histone...... sample-specific patterns for the co-frequency of histone post-translational modifications. We implemented a new method to identify positive and negative interplay between pairs of methylation and acetylation marks in proteins. Many of the detected features were conserved between different cell types...... sites but negative cross-talk for distant ones, and for discrete methylation states at Lys-9, Lys-27, and Lys-36 of histone H3, suggesting a more differentiated functional role of methylation beyond the general expectation of enhanced activity at higher methylation states....

  14. Microwave-enhanced folding and denaturation of globular proteins

    DEFF Research Database (Denmark)

    Bohr, Henrik; Bohr, Jakob

    2000-01-01

    It is shown that microwave irradiation can affect the kinetics of the folding process of some globular proteins, especially beta-lactoglobulin. At low temperature the folding from the cold denatured phase of the protein is enhanced, while at a higher temperature the denaturation of the protein from...... its folded state is enhanced. In the latter case, a negative temperature gradient is needed for the denaturation process, suggesting that the effects of the microwaves are nonthermal. This supports the notion that coherent topological excitations can exist in proteins. The application of microwaves...

  15. LEVERAGING EXISTING HERITAGE DOCUMENTATION FOR ANIMATIONS: SENATE VIRTUAL TOUR

    Directory of Open Access Journals (Sweden)

    A. Dhanda

    2017-08-01

    Full Text Available The use of digital documentation techniques has led to an increase in opportunities for using documentation data for valorization purposes, in addition to technical purposes. Likewise, building information models (BIMs) made from these data sets hold valuable information that can be as effective for public education as it is for rehabilitation. A BIM can reveal the elements of a building, as well as the different stages of a building over time. Valorizing this information increases the possibility for public engagement and interest in a heritage place. Digital data sets were leveraged by the Carleton Immersive Media Studio (CIMS) for parts of a virtual tour of the Senate of Canada. For the tour, workflows involving four different programs were explored to determine an efficient and effective way to leverage the existing documentation data to create informative and visually enticing animations for public dissemination: Autodesk Revit, Enscape, Autodesk 3ds Max, and Bentley Pointools. The explored workflows involve animations of point clouds, BIMs, and a combination of the two.

  16. Existing reflection seismic data re-processing

    International Nuclear Information System (INIS)

    Higashinaka, Motonori; Sano, Yukiko; Kozawa, Takeshi

    2005-08-01

    This document reports the results of re-processing existing seismic data around Horonobe town, Hokkaido, Japan, as part of the Horonobe Underground Research Project. The main purpose of this re-processing is to identify the subsurface structure of the Omagari Fault and the fold system around it. The seismic lines re-processed are the TYHR-A3 and SHRB-2 lines, which JAPEX surveyed in 1975. By applying weathering static correction using refraction analysis and a noise suppression procedure, we obtained a much enhanced seismic profile. The following information was obtained from the re-processing results. TYHR-A3 line: There are strong reflections dipping to the west. These reflections correspond to the western limb of the anticline on the west side of the Omagari Fault. SHRB-2 line: There are strong reflections dipping to the west at CDP 60-140, while there are reflections dipping to the east on the east side of CDP 140. These reflections correspond to the western limb and the eastern limb of the anticline, which is parallel to the Omagari Fault. This seismic re-processing provides useful information on the geological structure around the Omagari Fault. (author)

  17. Local sequence information in cellular retinoic acid-binding protein I: specific residue roles in beta-turns.

    Science.gov (United States)

    Rotondi, Kenneth S; Gierasch, Lila M

    2003-01-01

    We have recently shown that two of the beta-turns (III and IV) in the ten-stranded, beta-clam protein, cellular retinoic acid-binding protein I (CRABP I), are favored in short peptide fragments, arguing that they are encoded by local interactions (K. S. Rotondi and L. M. Gierasch, Biochemistry, 2003, Vol. 42, pp. 7976-7985). In this paper we examine these turns in greater detail to dissect the specific local interactions responsible for their observed native conformational biases. Conformations of peptides corresponding to the turn III and IV fragments were examined under conditions designed to selectively disrupt stabilizing interactions, using pH variation, chaotrope addition, or mutagenesis to probe specific side-chain influences. We find that steric constraints imposed by excluded volume effects between near neighbor residues (i,i+2), favorable polar (i,i+2) interactions, and steric permissiveness of glycines are the principal factors accounting for the observed native bias in these turns. Longer-range stabilizing interactions across the beta-turns do not appear to play a significant role in turn stability in these short peptides, in contrast to their importance in hairpins. Additionally, our data add to a growing number of examples of the 3:5 type I turn with a beta-bulge as a class of turns with high propensity to form locally defined structure. Current work is directed at the interplay between the local sequence information in the turns and more long-range influences in the mechanism of folding of this predominantly beta-sheet protein. Copyright 2004 Wiley Periodicals, Inc.

  18. Dynamical modeling of microRNA action on the protein translation process.

    Science.gov (United States)

    Zinovyev, Andrei; Morozova, Nadya; Nonne, Nora; Barillot, Emmanuel; Harel-Bellan, Annick; Gorban, Alexander N

    2010-02-24

    Protein translation is a multistep process which can be represented as a cascade of biochemical reactions (initiation, ribosome assembly, elongation, etc.), the rate of which can be regulated by small non-coding microRNAs through multiple mechanisms. It remains unclear which mechanisms of microRNA action are the most dominant; moreover, many experimental reports deliver controversial messages on which concrete mechanism is actually observed in the experiment. Nissan and Parker have recently demonstrated that it might be impossible to distinguish alternative biological hypotheses using the steady state data on the rate of protein synthesis. For their analysis they used two simple kinetic models of protein translation. In contrast to the study by Nissan and Parker, we show that dynamical data allow discriminating some of the mechanisms of microRNA action. We demonstrate this using the same models as developed by Nissan and Parker for the sake of comparison, but the methods developed (asymptotology of biochemical networks) can be used for other models. We formulate a hypothesis that the effect of microRNA action is measurable and observable only if it affects the dominant system (generalization of the limiting step notion for complex networks) of the protein translation machinery. The dominant system can vary in different experimental conditions, which can partially explain the existing controversy of some of the experimental data. Our analysis of the transient protein translation dynamics shows that it gives enough information to verify or reject a hypothesis about a particular molecular mechanism of microRNA action on protein translation. For multiscale systems, only that action of microRNA is distinguishable which affects the parameters of the dominant system (critical parameters), or changes the dominant system itself. Dominant systems generalize and further develop the old and very popular idea of limiting step. 
Algorithms for identifying dominant systems in multiscale

  19. Dynamical modeling of microRNA action on the protein translation process

    Directory of Open Access Journals (Sweden)

    Barillot Emmanuel

    2010-02-01

    Full Text Available Abstract Background Protein translation is a multistep process which can be represented as a cascade of biochemical reactions (initiation, ribosome assembly, elongation, etc.), the rate of which can be regulated by small non-coding microRNAs through multiple mechanisms. It remains unclear which mechanisms of microRNA action are the most dominant; moreover, many experimental reports deliver controversial messages on which concrete mechanism is actually observed in the experiment. Nissan and Parker have recently demonstrated that it might be impossible to distinguish alternative biological hypotheses using the steady state data on the rate of protein synthesis. For their analysis they used two simple kinetic models of protein translation. Results In contrast to the study by Nissan and Parker, we show that dynamical data allow discriminating some of the mechanisms of microRNA action. We demonstrate this using the same models as developed by Nissan and Parker for the sake of comparison, but the methods developed (asymptotology of biochemical networks) can be used for other models. We formulate a hypothesis that the effect of microRNA action is measurable and observable only if it affects the dominant system (generalization of the limiting step notion for complex networks) of the protein translation machinery. The dominant system can vary in different experimental conditions, which can partially explain the existing controversy of some of the experimental data. Conclusions Our analysis of the transient protein translation dynamics shows that it gives enough information to verify or reject a hypothesis about a particular molecular mechanism of microRNA action on protein translation. For multiscale systems, only that action of microRNA is distinguishable which affects the parameters of the dominant system (critical parameters), or changes the dominant system itself. Dominant systems generalize and further develop the old and very popular idea of limiting step
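
    The limiting-step idea that this abstract builds on can be illustrated with a toy calculation. The sketch below is not the Nissan and Parker model or the authors' method; it assumes a hypothetical closed cycle of first-order steps X1 -> X2 -> ... -> Xn -> X1 with a conserved total amount. At steady state every step carries the same flux J = k_i * x_i, so x_i = J / k_i and J = total / sum(1 / k_i). The rate constants are made up.

```python
def steady_state_flux(rates, total=1.0):
    """Steady-state flux through a closed cycle of first-order steps.

    rates: rate constants k_1..k_n of the consecutive steps (assumed values).
    total: conserved total amount distributed over the n states.
    """
    return total / sum(1.0 / k for k in rates)

# One slow step (k = 0.1) and two fast steps (k = 10).
base = [0.1, 10.0, 10.0]
j0 = steady_state_flux(base)

# Halving a fast step barely changes the steady-state output ...
j_fast_hit = steady_state_flux([0.1, 5.0, 10.0])
# ... while halving the slow (limiting) step roughly halves it.
j_slow_hit = steady_state_flux([0.05, 10.0, 10.0])

print(f"baseline flux:           {j0:.4f}")
print(f"fast step halved:        {j_fast_hit:.4f} ({j_fast_hit / j0:.1%} of baseline)")
print(f"limiting step halved:    {j_slow_hit:.4f} ({j_slow_hit / j0:.1%} of baseline)")
```

    In this toy cycle, an inhibitor acting on a fast step is nearly invisible in steady-state output, which is consistent with the abstract's point that steady-state data alone may not reveal which step is affected.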

  20. Molecular mechanisms for protein-encoded inheritance

    Science.gov (United States)

    Wiltzius, Jed J. W.; Landau, Meytal; Nelson, Rebecca; Sawaya, Michael R.; Apostol, Marcin I.; Goldschmidt, Lukasz; Soriaga, Angela B.; Cascio, Duilio; Rajashankar, Kanagalaghatta; Eisenberg, David

    2013-01-01

    Strains are phenotypic variants, encoded by nucleic acid sequences in chromosomal inheritance and by protein “conformations” in prion inheritance and transmission. But how is a protein “conformation” stable enough to endure transmission between cells or organisms? Here new polymorphic crystal structures of segments of prion and other amyloid proteins offer structural mechanisms for prion strains. In packing polymorphism, prion strains are encoded by alternative packings (polymorphs) of β-sheets formed by the same segment of a protein; in a second mechanism, segmental polymorphism, prion strains are encoded by distinct β-sheets built from different segments of a protein. Both forms of polymorphism can produce enduring “conformations,” capable of encoding strains. These molecular mechanisms for transfer of information into prion strains share features with the familiar mechanism for transfer of information by nucleic acid inheritance, including sequence specificity and recognition by non-covalent bonds. PMID:19684598

  1. RANKING RELATIONS USING ANALOGIES IN BIOLOGICAL AND INFORMATION NETWORKS

    Science.gov (United States)

    Silva, Ricardo; Heller, Katherine; Ghahramani, Zoubin; Airoldi, Edoardo M.

    2013-01-01

    Analogical reasoning depends fundamentally on the ability to learn and generalize about relations between objects. We develop an approach to relational learning which, given a set of pairs of objects S = {A(1) : B(1), A(2) : B(2), …, A(N) : B(N)}, measures how well other pairs A : B fit in with the set S. Our work addresses the following question: is the relation between objects A and B analogous to those relations found in S? Such questions are particularly relevant in information retrieval, where an investigator might want to search for analogous pairs of objects that match the query set of interest. There are many ways in which objects can be related, making the task of measuring analogies very challenging. Our approach combines a similarity measure on function spaces with Bayesian analysis to produce a ranking. It requires data containing features of the objects of interest and a link matrix specifying which relationships exist; no further attributes of such relationships are necessary. We illustrate the potential of our method on text analysis and information networks. An application on discovering functional interactions between pairs of proteins is discussed in detail, where we show that our approach can work in practice even if a small set of protein pairs is provided. PMID:24587838

  2. Thermodynamic database for proteins: features and applications.

    Science.gov (United States)

    Gromiha, M Michael; Sarai, Akinori

    2010-01-01

    We have developed a thermodynamic database for proteins and mutants, ProTherm, which is a collection of a large number of thermodynamic data on protein stability along with the sequence and structure information, experimental methods and conditions, and literature information. This is a valuable resource for understanding/predicting the stability of proteins, and it is accessible at http://www.gibk26.bse.kyutech.ac.jp/jouhou/Protherm/protherm.html . ProTherm has several features including various search, display, and sorting options and visualization tools. We have analyzed the data in ProTherm to examine the relationship among thermodynamics, structure, and function of proteins. We describe the progress on the development of methods for understanding/predicting protein stability, such as (i) relationship between the stability of protein mutants and amino acid properties, (ii) average assignment method, (iii) empirical energy functions, (iv) torsion, distance, and contact potentials, and (v) machine learning techniques. A list of online resources for predicting protein stability is also provided.

  3. Prediction of Protein Hotspots from Whole Protein Sequences by a Random Projection Ensemble System

    Directory of Open Access Journals (Sweden)

    Jinjian Jiang

    2017-07-01

    Full Text Available Hotspot residues are important in the determination of protein-protein interactions, and they always perform specific functions in biological processes. Hotspot residues are commonly determined by alanine scanning mutagenesis experiments, which are costly and time consuming. To address this issue, computational methods have been developed. Most of them are structure based, i.e., they use the information of solved protein structures. However, the number of solved protein structures is far smaller than that of sequences. Moreover, almost all of the predictors identify hotspots from the interfaces of protein complexes, seldom from whole protein sequences. Therefore, methods for determining hotspots from whole protein sequences by sequence information alone are urgently needed. To address the issue of hotspot prediction from the whole sequences of proteins, we proposed an ensemble system with random projections using statistical physicochemical properties of amino acids. First, an encoding scheme involving sequence profiles of residues and physicochemical properties from the AAindex1 dataset is developed. Then, the random projection technique was adopted to project the encoding instances into a reduced space. Then, several better random projections were obtained by training an IBk classifier on the training dataset, and these were applied to the test dataset. The ensemble of random projection classifiers is thus obtained. Experimental results showed that although the performance of our method is not good enough for real applications of hotspots, it is very promising in the determination of hotspot residues from whole sequences.
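
    The pipeline described above (random projection into a reduced space, a nearest-neighbour classifier per projection, then an ensemble) can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the authors' system: it uses a 1-nearest-neighbour rule in place of the IBk classifier, skips the step that selects the better projections, and combines members by majority vote. All function names, feature vectors, and labels are hypothetical.

```python
import random
from collections import Counter

def random_projection(dim_in, dim_out, rng):
    """Gaussian random matrix mapping dim_in features to dim_out features."""
    scale = dim_out ** 0.5
    return [[rng.gauss(0.0, 1.0) / scale for _ in range(dim_in)]
            for _ in range(dim_out)]

def project(R, x):
    """Apply projection matrix R to feature vector x."""
    return [sum(r * v for r, v in zip(row, x)) for row in R]

def nn_predict(train_X, train_y, x):
    """Label of the training point closest to x (squared Euclidean distance)."""
    best = min(range(len(train_X)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)))
    return train_y[best]

def ensemble_predict(train_X, train_y, x, n_members=5, dim_out=2, seed=0):
    """Majority vote over 1-NN classifiers, each in its own random subspace."""
    rng = random.Random(seed)
    votes = []
    for _ in range(n_members):
        R = random_projection(len(x), dim_out, rng)
        projected_train = [project(R, t) for t in train_X]
        votes.append(nn_predict(projected_train, train_y, project(R, x)))
    return Counter(votes).most_common(1)[0][0]

# Made-up 4-dimensional residue encodings with made-up labels.
train_X = [
    [0.9, 0.1, 0.8, 0.2],
    [1.0, 0.0, 0.7, 0.3],
    [0.1, 0.9, 0.2, 0.8],
    [0.0, 1.0, 0.3, 0.7],
]
train_y = ["hotspot", "hotspot", "non-hotspot", "non-hotspot"]

print(ensemble_predict(train_X, train_y, [0.8, 0.2, 0.9, 0.1]))
```

    Each member sees the data through a different low-dimensional view, so their errors are partly independent; voting is one simple way to combine them, chosen here for brevity.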

  4. Novel Tripod Amphiphiles for Membrane Protein Analysis

    DEFF Research Database (Denmark)

    Chae, Pil Seok; Kruse, Andrew C; Gotfryd, Kamil

    2013-01-01

    Integral membrane proteins play central roles in controlling the flow of information and molecules across membranes. Our understanding of membrane protein structures and functions, however, is seriously limited, mainly due to difficulties in handling and analysing these proteins in aqueous solution...

  5. Protein folding and protein metallocluster studies using synchrotron small angle X-ray scattering

    International Nuclear Information System (INIS)

    Eliezer, D.

    1994-06-01

    Proteins, biological macromolecules composed of amino-acid building blocks, possess unique three dimensional shapes or conformations which are intimately related to their biological function. All of the information necessary to determine this conformation is stored in a protein's amino acid sequence. The problem of understanding the process by which nature maps protein amino-acid sequences to three-dimensional conformations is known as the protein folding problem, and is one of the central unsolved problems in biophysics today. The possible applications of a solution are broad, ranging from the elucidation of thousands of protein structures to the rational modification and design of protein-based drugs. The scattering of X-rays by matter has long been useful as a tool for the characterization of physical properties of materials, including biological samples. The high photon flux available at synchrotron X-ray sources allows for the measurement of scattering cross-sections of dilute and/or disordered samples. Such measurements do not yield the detailed geometrical information available from crystalline samples, but do allow for lower resolution studies of dynamical processes not observable in the crystalline state. The main focus of the work described here has been the study of the protein folding process using time-resolved small-angle x-ray scattering measurements. The original intention was to observe the decrease in overall size which must accompany the folding of a protein from an extended conformation to its compact native state. Although this process proved too fast for the current time-resolution of the technique, upper bounds were set on the probable compaction times of several small proteins. In addition, an interesting and unexpected process was detected, in which the folding protein passes through an intermediate state which shows a tendency to associate. This state is proposed to be a kinetic molten globule folding intermediate

  6. Dynamics in electron transfer protein complexes

    OpenAIRE

    Bashir, Qamar

    2010-01-01

    Recent studies have provided experimental evidence for the existence of an encounter complex, a transient intermediate in the formation of protein complexes. We have used paramagnetic relaxation enhancement NMR spectroscopy in combination with Monte Carlo simulations to characterize and visualize the ensemble of encounter orientations in the short-lived electron transfer complex of yeast Cc and CcP. The complete conformational space sampled by the protein molecules during the dynamic part of ...

  7. Context-specific protein network miner - an online system for exploring context-specific protein interaction networks from the literature

    KAUST Repository

    Chowdhary, Rajesh

    2012-04-06

    Background: Protein interaction networks (PINs) specific within a particular context contain crucial information regarding many cellular biological processes. For example, PINs may include information on the type and directionality of interaction (e.g. phosphorylation), location of interaction (i.e. tissues, cells), and related diseases. Currently, very few tools are capable of deriving context-specific PINs for conducting exploratory analysis. Results: We developed a literature-based online system, Context-specific Protein Network Miner (CPNM), which derives context-specific PINs in real-time from the PubMed database based on a set of user-input keywords and enhanced PubMed query system. CPNM reports enriched information on protein interactions (with type and directionality), their network topology with summary statistics (e.g. most densely connected proteins in the network; most densely connected protein-pairs; and proteins connected by most inbound/outbound links) that can be explored via a user-friendly interface. Some of the novel features of the CPNM system include PIN generation, ontology-based PubMed query enhancement, real-time, user-queried, up-to-date PubMed document processing, and prediction of PIN directionality. Conclusions: CPNM provides a tool for biologists to explore PINs. It is freely accessible at http://www.biotextminer.com/CPNM/. © 2012 Chowdhary et al.

  8. Context-specific protein network miner - an online system for exploring context-specific protein interaction networks from the literature

    KAUST Repository

    Chowdhary, Rajesh; Tan, Sin Lam; Zhang, Jinfeng; Karnik, Shreyas; Bajic, Vladimir B.; Liu, Jun S.

    2012-01-01

    Background: Protein interaction networks (PINs) specific within a particular context contain crucial information regarding many cellular biological processes. For example, PINs may include information on the type and directionality of interaction (e.g. phosphorylation), location of interaction (i.e. tissues, cells), and related diseases. Currently, very few tools are capable of deriving context-specific PINs for conducting exploratory analysis. Results: We developed a literature-based online system, Context-specific Protein Network Miner (CPNM), which derives context-specific PINs in real-time from the PubMed database based on a set of user-input keywords and enhanced PubMed query system. CPNM reports enriched information on protein interactions (with type and directionality), their network topology with summary statistics (e.g. most densely connected proteins in the network; most densely connected protein-pairs; and proteins connected by most inbound/outbound links) that can be explored via a user-friendly interface. Some of the novel features of the CPNM system include PIN generation, ontology-based PubMed query enhancement, real-time, user-queried, up-to-date PubMed document processing, and prediction of PIN directionality. Conclusions: CPNM provides a tool for biologists to explore PINs. It is freely accessible at http://www.biotextminer.com/CPNM/. © 2012 Chowdhary et al.

  9. 75 FR 41244 - Proposed Collection; Request for Comments on an Existing Information Collection: (OMB Control No...

    Science.gov (United States)

    2010-07-15

    ... information collection. ``Self-Certification of Full-Time School Attendance for the School Year'' (OMB Control... other forms of information technology. We estimate 14,000 RI 25-14s will be processed annually. We... Information Collection: (OMB Control No. 3206-0032; RI 25-14 and RI 25- 14A) AGENCY: Office of Personnel...

  10. Regulation of the vertebrate cell cycle by the cdc2 protein kinase

    International Nuclear Information System (INIS)

    Draetta, G.; Brizuela, L.; Moran, B.; Beach, D.

    1988-01-01

    A homolog of the cdc2/CDC28 protein kinase of yeast is found in all vertebrate species that have been investigated. Human cdc2 exists as a complex with a 13-kD protein that is homologous to the suc1 gene product of fission yeast. In both human and fission yeast cells, the protein kinase also exists in a complex with a 62-kD polypeptide that has not been identified genetically but acts as a substrate in vitro. The authors have studied the properties of the protein kinase in rat and human cells, as well as in Xenopus eggs. They find that in baby rat kidney (BRK) cells, which are quiescent in cell culture, the cdc2 protein is not synthesized. However, synthesis is rapidly induced in response to proliferative activation by infection with adenovirus. In human HeLa cells, the protein kinase is present continuously. It behaves as a cell-cycle oscillator that is inactive in G1 but displays maximal enzymatic activity during mitotic metaphase. These observations indicate that in a wide variety of vertebrate cells, the cdc2 protein kinase is involved in regulating mitosis. The authors' approach taken toward study of the cdc2 protein kinase highlights the possibilities that now exist for combining the advantages of ascomycete genetics with the cell-free systems of Xenopus and the biochemical advantages of tissue culture cells to investigate fundamental problems of the cell cycle.

  11. Protein - Which is Best?

    Science.gov (United States)

    Hoffman, Jay R; Falvo, Michael J

    2004-09-01

    Protein intake that exceeds the recommended daily allowance is widely accepted for both endurance and power athletes. However, considering the variety of proteins that are available, much less is known concerning the benefits of consuming one protein versus another. The purpose of this paper is to identify and analyze key factors in order to make responsible recommendations to both the general and athletic populations. Evaluation of a protein is fundamental in determining its appropriateness in the human diet. Proteins that are of inferior content and digestibility are important to recognize and restrict or limit in the diet. Similarly, such knowledge will provide an ability to identify proteins that provide the greatest benefit and should be consumed. The various techniques utilized to rate protein will be discussed. Traditionally, sources of dietary protein are seen as either being of animal or vegetable origin. Animal sources provide a complete source of protein (i.e. containing all essential amino acids), whereas vegetable sources generally lack one or more of the essential amino acids. Animal sources of dietary protein, despite providing a complete protein and numerous vitamins and minerals, have some health professionals concerned about the amount of saturated fat common in these foods compared to vegetable sources. The advent of processing techniques has shifted some of this attention and ignited the sports supplement marketplace with derivative products such as whey, casein and soy. Individually, these products vary in quality and applicability to certain populations. The benefits that these particular proteins possess are discussed. In addition, the impact that elevated protein consumption has on health and safety issues (i.e. bone health, renal function) is also reviewed. Key Points: Higher protein needs are seen in athletic populations. Animal proteins are an important source of protein; however, potential health concerns do exist from a diet of protein

  12. Reconstruction of the yeast protein-protein interaction network involved in nutrient sensing and global metabolic regulation

    DEFF Research Database (Denmark)

    Nandy, Subir Kumar; Jouhten, Paula; Nielsen, Jens

    2010-01-01

    proteins. Despite the value of BioGRID for studying protein-protein interactions, there is a need for manual curation of these interactions in order to remove false positives. RESULTS: Here we describe an annotated reconstruction of the protein-protein interactions around four key nutrient......) and for all the interactions between them (edges). The annotated information is readily available utilizing the functionalities of network modelling tools such as Cytoscape and CellDesigner. CONCLUSIONS: The reported fully annotated interaction model serves as a platform for integrated systems biology studies...

  13. Detection of significant protein coevolution.

    Science.gov (United States)

    Ochoa, David; Juan, David; Valencia, Alfonso; Pazos, Florencio

    2015-07-01

    The evolution of proteins cannot be fully understood without taking into account the coevolutionary linkages entangling them. From a practical point of view, coevolution between protein families has been used as a way of detecting protein interactions and functional relationships from genomic information. The most common approach to inferring protein coevolution involves the quantification of phylogenetic tree similarity using a family of methodologies termed mirrortree. In spite of their success, a fundamental problem of these approaches is the lack of an adequate statistical framework to assess the significance of a given coevolutionary score (tree similarity). As a consequence, a number of ad hoc filters and arbitrary thresholds are required in an attempt to obtain a final set of confident coevolutionary signals. In this work, we developed a method for associating confidence estimators (P values) to the tree-similarity scores, using a null model specifically designed for the tree comparison problem. We show how this approach largely improves the quality and coverage (number of pairs that can be evaluated) of the detected coevolution in all the stages of the mirrortree workflow, independently of the starting genomic information. This not only leads to a better understanding of protein coevolution and its biological implications, but also yields a highly reliable and comprehensive network of predicted interactions, as well as information on the substructure of macromolecular complexes, using only genomic information. The software and datasets used in this work are freely available at: http://csbg.cnb.csic.es/pMT/. Contact: pazos@cnb.csic.es. Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press.
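    The idea of scoring tree similarity and attaching a P value to the score can be illustrated with a short sketch. It assumes each protein family is summarised by a species-by-species distance matrix, uses the Pearson correlation of the matrix entries as the mirrortree-style score, and estimates significance with a generic label-permutation test (Mantel-style). That permutation test is a stand-in chosen for illustration; it is not the null model designed by the authors, and all function names are hypothetical.

```python
import random
from itertools import combinations


def upper_triangle(dist, order):
    """Pairwise distances for all species pairs, with species taken in `order`."""
    return [dist[order[i]][order[j]]
            for i, j in combinations(range(len(order)), 2)]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def tree_similarity(dist_a, dist_b):
    """Mirrortree-style score: correlation between two distance matrices."""
    order = list(range(len(dist_a)))
    return pearson(upper_triangle(dist_a, order), upper_triangle(dist_b, order))


def permutation_p_value(dist_a, dist_b, n_perm=999, seed=0):
    """Score plus the fraction of species-label shuffles scoring at least as high."""
    rng = random.Random(seed)
    identity = list(range(len(dist_a)))
    a = upper_triangle(dist_a, identity)
    observed = pearson(a, upper_triangle(dist_b, identity))
    hits = 0
    for _ in range(n_perm):
        perm = identity[:]
        rng.shuffle(perm)
        if pearson(a, upper_triangle(dist_b, perm)) >= observed:
            hits += 1
    # The +1 terms count the observed arrangement itself, so p is never zero.
    return observed, (hits + 1) / (n_perm + 1)
```

    With few species the number of distinct permutations is small, so the attainable P values are coarse; this is one reason a purpose-built null model, as described in the abstract, may be preferable.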

  14. Prediction of thermodynamic instabilities of protein solutions from simple protein–protein interactions

    International Nuclear Information System (INIS)

    D’Agostino, Tommaso; Solana, José Ramón; Emanuele, Antonio

    2013-01-01

    Highlights: ► We propose a model of effective protein–protein interaction embedding solvent effects. ► A previous square-well model is enhanced by giving the interaction a free-energy character. ► The temperature dependence of the interaction is due to entropic effects of the solvent. ► The validity of the original SW model is extended to entropy-driven phase transitions. ► We get good fits for lysozyme and haemoglobin spinodal data taken from the literature. - Abstract: Statistical thermodynamics of protein solutions is often studied in terms of simple, microscopic models of particles interacting via pairwise potentials. Such modelling can reproduce the short-range structure of protein solutions at equilibrium and predict thermodynamic instabilities of these systems. We introduce a square-well model of effective protein–protein interaction that embeds the solvent’s action. We modify an existing model [45] by considering a well depth having an explicit dependence on temperature, i.e. an explicit free-energy character, thus encompassing the statistically relevant configurations of solvent molecules around proteins. We choose protein solutions exhibiting demixing upon temperature decrease (lysozyme, enthalpy driven) and upon temperature increase (haemoglobin, entropy driven). We obtain satisfactory fits of spinodal curves for both proteins without adding any mean-field term, thus extending the validity of the original model. Our results underline the solvent's role in modulating or stretching the interaction potential.
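    To make the notion of a temperature-dependent well depth concrete, the sketch below evaluates a square-well potential whose depth has the free-energy form eps(T) = eps_h - T * eps_s and reports the corresponding second virial coefficient, a standard measure of net attraction. The linear form of eps(T), the parameter names, and the reduced units (k_B = 1, particle diameter sigma = 1) are assumptions made for illustration; the abstract does not give the authors' parametrisation, and their spinodal calculation is not reproduced here.

```python
import math


def well_depth(temperature, eps_h, eps_s):
    """Assumed free-energy-like depth: enthalpic part minus T times entropic part."""
    return eps_h - temperature * eps_s


def square_well(r, depth, sigma=1.0, lam=1.5):
    """Pair potential: hard core below sigma, attractive well out to lam * sigma."""
    if r < sigma:
        return math.inf
    if r < lam * sigma:
        return -depth
    return 0.0


def second_virial(temperature, eps_h, eps_s, sigma=1.0, lam=1.5):
    """Square-well B2 = (2*pi*sigma^3/3) * [1 - (lam^3 - 1) * (exp(eps/T) - 1)]."""
    eps = well_depth(temperature, eps_h, eps_s)
    hard_sphere = 2.0 * math.pi * sigma ** 3 / 3.0
    return hard_sphere * (1.0 - (lam ** 3 - 1.0) * (math.exp(eps / temperature) - 1.0))


# Enthalpy-driven case (constant depth): attraction weakens on heating.
print(second_virial(1.0, eps_h=1.0, eps_s=0.0), second_virial(2.0, eps_h=1.0, eps_s=0.0))
# Entropy-driven case (depth grows with T): attraction strengthens on heating.
print(second_virial(1.0, eps_h=-1.0, eps_s=-1.0), second_virial(2.0, eps_h=-1.0, eps_s=-1.0))
```

    Under these assumptions, a more negative B2 at lower temperature is qualitatively consistent with demixing on cooling, and a more negative B2 at higher temperature with demixing on heating.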

  15. Greening Existing Tribal Buildings

    Science.gov (United States)

    Guidance about improving sustainability in existing tribal casinos and manufactured homes. Many steps can be taken to make existing buildings greener and healthier. They may also reduce utility and medical costs.

  16. Requirements for existing buildings

    DEFF Research Database (Denmark)

    Thomsen, Kirsten Engelund; Wittchen, Kim Bjarne

    This report collects energy performance requirements for existing buildings in European member states by June 2012.

  17. Determination of Dynamics of Plant Plasma Membrane Proteins with Fluorescence Recovery and Raster Image Correlation Spectroscopy.

    Science.gov (United States)

    Laňková, Martina; Humpolíčková, Jana; Vosolsobě, Stanislav; Cit, Zdeněk; Lacek, Jozef; Čovan, Martin; Čovanová, Milada; Hof, Martin; Petrášek, Jan

    2016-04-01

    A number of fluorescence microscopy techniques have been described for studying the dynamics of fluorescently labeled proteins, lipids, nucleic acids, and whole organelles. However, for studies of plant plasma membrane (PM) proteins, the number of these techniques is still limited because of the high complexity of the processes that determine the dynamics of PM proteins and the presence of the cell wall. Here, we report on the use of raster image correlation spectroscopy (RICS) for studies of integral PM proteins in suspension-cultured tobacco cells and show its potential in comparison with the more widely used fluorescence recovery after photobleaching method. For RICS, a set of microscopy images is obtained by single-photon confocal laser scanning microscopy (CLSM). Fluorescence fluctuations are subsequently correlated between individual pixels, and the information on protein mobility is extracted using a model that considers processes generating the fluctuations, such as diffusion and chemical binding reactions. As we show here using an example of two integral PM transporters of the plant hormone auxin, RICS uncovered their distinct short-distance lateral mobility within the PM, which is dependent on the cytoskeleton and sterol composition of the PM. RICS, which is routinely accessible on modern CLSM instruments, thus represents a valuable approach for studies of the dynamics of PM proteins in plants.
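    The pixel-to-pixel correlation step mentioned in this abstract can be illustrated with the normalised spatial autocorrelation of a single image, G(dx, dy) = <dI(x, y) dI(x + dx, y + dy)> / <I>^2, where dI is the deviation from the mean intensity. The sketch below is only that correlation step on a nested-list image, with hypothetical function names; averaging over an image stack, background correction, and fitting a diffusion or binding model are not shown.

```python
def spatial_autocorrelation(image, dx, dy):
    """Normalised autocorrelation of intensity fluctuations at pixel lag (dx, dy)."""
    rows, cols = len(image), len(image[0])
    mean = sum(map(sum, image)) / (rows * cols)
    total, count = 0.0, 0
    for y in range(rows - dy):
        for x in range(cols - dx):
            total += (image[y][x] - mean) * (image[y + dy][x + dx] - mean)
            count += 1
    return total / count / mean ** 2


def correlation_map(image, max_dx, max_dy):
    """Autocorrelation for every lag from (0, 0) up to (max_dx, max_dy)."""
    return [[spatial_autocorrelation(image, dx, dy) for dx in range(max_dx + 1)]
            for dy in range(max_dy + 1)]
```

    In a raster scan the horizontal lag corresponds to the pixel dwell time and the vertical lag to the line time, which is what allows mobility to be inferred from the shape of the correlation map.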

  18. Testing existing software for safety-related applications. Revision 7.1

    International Nuclear Information System (INIS)

    Scott, J.A.; Lawrence, J.D.

    1995-12-01

    The increasing use of commercial off-the-shelf (COTS) software products in digital safety-critical applications is raising concerns about the safety, reliability, and quality of these products. One of the factors involved in addressing these concerns is product testing. A tester's knowledge of the software product will vary, depending on the information available from the product vendor. In some cases, complete source listings, program structures, and other information from the software development may be available. In other cases, only the complete hardware/software package may exist, with the tester having no knowledge of the internal structure of the software. The type of testing that can be used will depend on the information available to the tester. This report describes six different types of testing, which differ in the information used to create the tests, the results that may be obtained, and the limitations of the test types. An Annex contains background information on types of faults encountered in testing, and a Glossary of pertinent terms is also included. This study is pertinent for safety-related software at reactors

  19. The DExH/D protein family database.

    Science.gov (United States)

    Jankowsky, E; Jankowsky, A

    2000-01-01

    DExH/D proteins are essential for all aspects of cellular RNA metabolism and processing, in the replication of many viruses and in DNA replication. DExH/D proteins are subject to current biological, biochemical and biophysical research which provides a continuous wealth of data. The DExH/D protein family database compiles this information and makes it available over the WWW (http://www.columbia.edu/ ej67/dbhome.htm ). The database can be fully searched by text based queries, facilitating fast access to specific information about this important class of enzymes.

  20. HCVpro: Hepatitis C virus protein interaction database

    KAUST Repository

    Kwofie, Samuel K.

    2011-12-01

    It is essential to catalog characterized hepatitis C virus (HCV) protein-protein interaction (PPI) data and the associated plethora of vital functional information to augment the search for therapies, vaccines and diagnostic biomarkers. In furtherance of these goals, we have developed the hepatitis C virus protein interaction database (HCVpro) by integrating manually verified hepatitis C virus-virus and virus-human protein interactions curated from literature and databases. HCVpro is a comprehensive and integrated HCV-specific knowledgebase housing consolidated information on PPIs, functional genomics and molecular data obtained from a variety of virus databases (VirHostNet, VirusMint, HCVdb and euHCVdb), and from BIND and other relevant biology repositories. HCVpro is further populated with information on hepatocellular carcinoma (HCC) related genes that are mapped onto their encoded cellular proteins. Incorporated proteins have been mapped onto Gene Ontologies, canonical pathways, Online Mendelian Inheritance in Man (OMIM) and extensively cross-referenced to other essential annotations. The database is enriched with exhaustive reviews on structure and functions of HCV proteins, current state of drug and vaccine development and links to recommended journal articles. Users can query the database using specific protein identifiers (IDs), chromosomal locations of a gene, interaction detection methods, indexed PubMed sources as well as HCVpro, BIND and VirusMint IDs. The use of HCVpro is free and the resource can be accessed via http://apps.sanbi.ac.za/hcvpro/ or http://cbrc.kaust.edu.sa/hcvpro/. © 2011 Elsevier B.V.

  1. Incorporating A Structured Writing Process into Existing CLS Curricula.

    Science.gov (United States)

    Honeycutt, Karen; Latshaw, Sandra

    2014-01-01

    Good communication and critical thinking are essential skills for all successful professionals, including Clinical Laboratory Science/Medical Laboratory Science (CLS/MLS) practitioners. Professional programs can incorporate writing assignments into their curricula to improve student written communication and critical thinking skills. Clearly defined, scenario-focused writing assignments provide student practice in clearly articulating responses to proposed problems or situations, researching and utilizing informational resources, and applying and synthesizing relevant information. Assessment rubrics, structured feedback, and revision writing methodologies help guide students through the writing process. This article describes how a CLS Program in a public academic medical center, located in the central United States (US) serving five centrally-located US states has incorporated writing intensive assignments into an existing 11-month academic year using formal, informal and reflective writing to improve student written communication and critical thinking skills. Faculty members and employers of graduates assert that incorporating writing intensive requirements have better prepared students for their professional role to effectively communicate and think critically.

  2. PaperBLAST: Text Mining Papers for Information about Homologs

    International Nuclear Information System (INIS)

    Price, Morgan N.; Arkin, Adam P.

    2017-01-01

    Large-scale genome sequencing has identified millions of protein-coding genes whose function is unknown. Many of these proteins are similar to characterized proteins from other organisms, but much of this information is missing from annotation databases and is hidden in the scientific literature. To make this information accessible, PaperBLAST uses EuropePMC to search the full text of scientific articles for references to genes. PaperBLAST also takes advantage of curated resources (Swiss-Prot, GeneRIF, and EcoCyc) that link protein sequences to scientific articles. PaperBLAST’s database includes over 700,000 scientific articles that mention over 400,000 different proteins. Given a protein of interest, PaperBLAST quickly finds similar proteins that are discussed in the literature and presents snippets of text from relevant articles or from the curators. With the recent explosion of genome sequencing data, there are now millions of uncharacterized proteins. If a scientist becomes interested in one of these proteins, it can be very difficult to find information as to its likely function. Often a protein whose sequence is similar, and which is likely to have a similar function, has been studied already, but this information is not available in any database. To help find articles about similar proteins, PaperBLAST searches the full text of scientific articles for protein identifiers or gene identifiers, and it links these articles to protein sequences. Then, given a protein of interest, it can quickly find similar proteins in its database by using standard software (BLAST), and it can show snippets of text from relevant papers. We hope that PaperBLAST will make it easier for biologists to predict proteins’ functions.

  3. A coevolution analysis for identifying protein-protein interactions by Fourier transform

    Science.gov (United States)

    Yin, Changchuan; Yau, Stephen S. -T.

    2017-01-01

    Protein-protein interactions (PPIs) play key roles in life processes, such as signal transduction, transcription regulation, and immune response. Identification of PPIs enables better understanding of the functional networks within a cell. Common experimental methods for identifying PPIs are time consuming and expensive. However, recent developments in computational approaches for inferring PPIs from protein sequences based on coevolution theory avoid these problems. In the coevolution theory model, interacting proteins may show coevolutionary mutations and have similar phylogenetic trees. The existing coevolution methods depend on multiple sequence alignments (MSA); however, the MSA-based coevolution methods often produce many false positive interactions. In this paper, we present a computational method using an alignment-free approach to accurately detect PPIs and reduce false positives. In the method, protein sequences are numerically represented by biochemical properties of amino acids, which reflect the structural and functional differences of proteins. Fourier transform is applied to the numerical representation of protein sequences to capture the dissimilarities of protein sequences in biophysical context. The method is assessed for predicting PPIs in Ebola virus. The results indicate strong coevolution between the protein pairs (NP-VP24, NP-VP30, NP-VP40, VP24-VP30, VP24-VP40, and VP30-VP40). The method is also validated for PPIs in influenza and E. coli genomes. Since our method can reduce false positives and increase the specificity of PPI prediction, it offers an effective tool to understand mechanisms of disease pathogens and find potential targets for drug design. The Python programs in this study are publicly available at https://github.com/cyinbox/PPI. PMID:28430779
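    The alignment-free comparison described in this abstract can be sketched in a few lines: encode each residue as a number, take the discrete Fourier transform, and compare power spectra. The sketch assumes the Kyte-Doolittle hydropathy scale as the numerical encoding and zero-padding to a common length before comparison; the abstract does not say which properties or length normalisation the authors use, so both choices, and all function names, are illustrative only and do not describe the code in the linked repository.

```python
import cmath
import math

# Kyte-Doolittle hydropathy index, used here as one possible numeric encoding.
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def encode(sequence):
    """Turn a protein sequence into a list of hydropathy values."""
    return [HYDROPATHY[residue] for residue in sequence.upper()]


def power_spectrum(signal, n=None):
    """Squared magnitudes of the n-point DFT (signal is zero-padded to n)."""
    n = n or len(signal)
    padded = list(signal) + [0.0] * (n - len(signal))
    spectrum = []
    for k in range(n):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(padded))
        spectrum.append(abs(coeff) ** 2)
    return spectrum


def spectral_distance(seq_a, seq_b):
    """Euclidean distance between the power spectra of two sequences."""
    n = max(len(seq_a), len(seq_b))
    return math.dist(power_spectrum(encode(seq_a), n),
                     power_spectrum(encode(seq_b), n))
```

    The direct O(n^2) transform keeps the example dependency-free; for real protein sets a fast Fourier transform from a numerical library would be the usual choice. Pairwise distances of this kind could then feed a distance-matrix comparison between protein families.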

  4. A coevolution analysis for identifying protein-protein interactions by Fourier transform.

    Directory of Open Access Journals (Sweden)

    Changchuan Yin

    Full Text Available Protein-protein interactions (PPIs) play key roles in life processes, such as signal transduction, transcription regulation, and immune response. Identification of PPIs enables better understanding of the functional networks within a cell. Common experimental methods for identifying PPIs are time consuming and expensive. However, recent developments in computational approaches for inferring PPIs from protein sequences based on coevolution theory avoid these problems. In the coevolution theory model, interacting proteins may show coevolutionary mutations and have similar phylogenetic trees. The existing coevolution methods depend on multiple sequence alignments (MSA); however, the MSA-based coevolution methods often produce many false positive interactions. In this paper, we present a computational method using an alignment-free approach to accurately detect PPIs and reduce false positives. In the method, protein sequences are numerically represented by biochemical properties of amino acids, which reflect the structural and functional differences of proteins. Fourier transform is applied to the numerical representation of protein sequences to capture the dissimilarities of protein sequences in biophysical context. The method is assessed for predicting PPIs in Ebola virus. The results indicate strong coevolution between the protein pairs (NP-VP24, NP-VP30, NP-VP40, VP24-VP30, VP24-VP40, and VP30-VP40). The method is also validated for PPIs in influenza and E. coli genomes. Since our method can reduce false positives and increase the specificity of PPI prediction, it offers an effective tool to understand mechanisms of disease pathogens and find potential targets for drug design. The Python programs in this study are publicly available at https://github.com/cyinbox/PPI.

  5. Characterising antimicrobial protein-membrane complexes

    International Nuclear Information System (INIS)

    Xun, Gloria; Dingley, Andrew; Tremouilhac, Pierre

    2009-01-01

    Full text: Antimicrobial proteins (AMPs) are host defence molecules that protect organisms from microbial infection. A number of hypotheses for AMP activity have been proposed which involve protein-membrane interactions. However, there is a paucity of information describing AMP-membrane complexes in detail. The aim of this project is to characterise the interactions of amoebapore-A (APA-1) with membrane models using primarily solution-state NMR spectroscopy. APA-1 is an AMP which is regulated by a pH-dependent dimerisation event. Based on the atomic resolution solution structure of monomeric APA-1, it is proposed that this dimerisation is a prerequisite for ring-like hexameric pore formation. Due to the cytotoxicity of APA-1, we have developed a cell-free system to produce this protein. To facilitate our studies, we have adapted the cell-free system to isotope label APA-1. A 13C/15N-enriched APA-1 sample was obtained and we have begun characterising APA-1 dimerisation and membrane interactions using NMR spectroscopy and other biochemical/biophysical methods. Neutron reflectometry is a surface-sensitive technique and therefore represents an ideal technique to probe how APA-1 interacts with membranes at the molecular level under different physiological conditions. Using Platypus, the pH-induced APA-1-membrane interactions should be detectable as an increase of the amount of protein adsorbed at the membrane surface and changes in the membrane properties. Specifically, detailed information on the structure and dimensions of the protein-membrane complex, the position and amount of the protein in the membrane, and the perturbation of the membrane phospholipids on protein incorporation can be extracted from the neutron reflectometry measurement. Such information will enable critical assessment of current proposed mechanisms of AMP activity in bacterial membranes and complement our NMR studies.

  6. Multi-level machine learning prediction of protein–protein interactions in Saccharomyces cerevisiae

    Directory of Open Access Journals (Sweden)

    Julian Zubek

    2015-07-01

    Full Text Available Accurate identification of protein–protein interactions (PPI) is the key step in understanding proteins’ biological functions, which are typically context-dependent. Many existing PPI predictors rely on aggregated features from protein sequences; however, only a few methods exploit local information about specific residue contacts. In this work we present a two-stage machine learning approach for prediction of protein–protein interactions. We start with the carefully filtered data on protein complexes available for Saccharomyces cerevisiae in the Protein Data Bank (PDB) database. First, we build linear descriptions of interacting and non-interacting sequence segment pairs based on their inter-residue distances. Secondly, we train machine learning classifiers to predict binary segment interactions for any two short sequence fragments. The final prediction of the protein–protein interaction is done using the 2D matrix representation of all-against-all possible interacting sequence segments of both analysed proteins. The level-I predictor achieves 0.88 AUC for micro-scale, i.e., residue-level prediction. The level-II predictor improves the results further by a more complex learning paradigm. We perform a 30-fold macro-scale, i.e., protein-level, cross-validation experiment. The level-II predictor using PSIPRED-predicted secondary structure reaches 0.70 precision, 0.68 recall, and 0.70 AUC, whereas other popular methods provide results below the 0.6 threshold (recall, precision, AUC). Our results demonstrate that the multi-scale sequence features aggregation procedure is able to improve the machine learning results by more than 10% as compared to other sequence representations. Prepared datasets and source code for our experimental pipeline are freely available for download from: http://zubekj.github.io/mlppi/ (open source Python implementation, OS independent).

  7. Fibronectin phosphorylation by ecto-protein kinase

    International Nuclear Information System (INIS)

    Imada, Sumi; Sugiyama, Yayoi; Imada, Masaru

    1988-01-01

    The presence of membrane-associated, extracellular protein kinase (ecto-protein kinase) and its substrate proteins was examined with serum-free cultures of Swiss 3T3 fibroblasts. When cells were incubated with [γ-32P]ATP for 10 min at 37 °C, four proteins with apparent molecular weights between 150 and 220 kDa were prominently phosphorylated. These proteins were also radiolabeled by lactoperoxidase-catalyzed iodination and were sensitive to mild tryptic digestion, suggesting that they localized on the cell surface or in the extracellular matrix. Phosphorylation of extracellular proteins with [γ-32P]ATP in intact cell culture is consistent with the existence of ecto-protein kinase. Anti-fibronectin antibody immunoprecipitated one of the phosphoproteins, which comigrated with a monomer and a dimer form of fibronectin under reducing and nonreducing conditions of electrophoresis, respectively. The protein had affinity for gelatin as demonstrated by retention with gelatin-conjugated agarose. This protein substrate of ecto-protein kinase was thus concluded to be fibronectin. The sites of phosphorylation by ecto-protein kinase were compared with those of intracellularly phosphorylated fibronectin by the analysis of radiolabeled amino acids and peptides. Ecto-protein kinase phosphorylated fibronectin at serine and threonine residues which were distinct from the sites of intracellular fibronectin phosphorylation.

  8. Structural behaviour characterization of existing adobe constructions in Aveiro

    OpenAIRE

    Varum, H.; Costa, A.; Martins, T.; Pereira, H.; Almeida, J.; Rodrigues, H.; Silveira, D.

    2007-01-01

    Adobe was a widely used construction material in Aveiro, Portugal, till the middle of the 20th century. Nowadays, adobe can still be found in varied types of constructions, many of which are of cultural, historical, and also architectural recognized value. The existing adobe buildings present an important level of structural damage and, in many cases, are even near to ruin, having the majority a high vulnerability to seismic actions. To face the lack of information concerning the mechanica...

  9. Maximal quantum Fisher information matrix

    International Nuclear Information System (INIS)

    Chen, Yu; Yuan, Haidong

    2017-01-01

    We study the existence of the maximal quantum Fisher information matrix in the multi-parameter quantum estimation, which bounds the ultimate precision limit. We show that when the maximal quantum Fisher information matrix exists, it can be directly obtained from the underlying dynamics. Examples are then provided to demonstrate the usefulness of the maximal quantum Fisher information matrix by deriving various trade-off relations in multi-parameter quantum estimation and obtaining the bounds for the scalings of the precision limit. (paper)

  10. Challenging a dogma: co-mutations exist in MAPK pathway genes in colorectal cancer.

    Science.gov (United States)

    Grellety, Thomas; Gros, Audrey; Pedeutour, Florence; Merlio, Jean-Philippe; Duranton-Tanneur, Valerie; Italiano, Antoine; Soubeyran, Isabelle

    2016-10-01

    Sequencing of genes encoding mitogen-activated protein kinase (MAPK) pathway proteins in colorectal cancer (CRC) has established as dogma that of the genes in a pathway only a single one is ever mutated. We searched for cases with a mutation in more than one MAPK pathway gene (co-mutations). Tumor tissue samples of all patients presenting with CRC, and referred between 01/01/2008 and 01/06/2015 to three French cancer centers for determination of mutation status of RAS/RAF+/-PIK3CA, were retrospectively screened for co-mutations using Sanger sequencing or next-generation sequencing. We found that of 1791 colorectal patients with mutations in the MAPK pathway, 20 had a co-mutation, 8 of KRAS/NRAS, and some even with a third mutation. More than half of the mutations were in codons 12 and 13. We also found 3 cases with a co-mutation of NRAS/BRAF and 9 with a co-mutation of KRAS/BRAF. In 2 patients with a co-mutation of KRAS/NRAS, the co-mutation existed in the primary as well as in a metastasis, which suggests that co-mutations occur early during carcinogenesis and are maintained when a tumor disseminates. We conclude that co-mutations exist in the MAPK genes but with low frequency and as yet with unknown outcome implications.

  11. Integration of multiple biological features yields high confidence human protein interactome.

    Science.gov (United States)

    Karagoz, Kubra; Sevimoglu, Tuba; Arga, Kazim Yalcin

    2016-08-21

    The biological function of a protein is usually determined by its physical interaction with other proteins. Protein-protein interactions (PPIs) are identified through various experimental methods and are stored in curated databases. The noisiness of the existing PPI data is evident, and it is essential that a more reliable data is generated. Furthermore, the selection of a set of PPIs at different confidence levels might be necessary for many studies. Although different methodologies were introduced to evaluate the confidence scores for binary interactions, a highly reliable, almost complete PPI network of Homo sapiens is not proposed yet. The quality and coverage of human protein interactome need to be improved to be used in various disciplines, especially in biomedicine. In the present work, we propose an unsupervised statistical approach to assign confidence scores to PPIs of H. sapiens. To achieve this goal PPI data from six different databases were collected and a total of 295,288 non-redundant interactions between 15,950 proteins were acquired. The present scoring system included the context information that was assigned to PPIs derived from eight biological attributes. A high confidence network, which included 147,923 binary interactions between 13,213 proteins, had scores greater than the cutoff value of 0.80, for which sensitivity, specificity, and coverage were 94.5%, 80.9%, and 82.8%, respectively. We compared the present scoring method with others for evaluation. Reducing the noise inherent in experimental PPIs via our scoring scheme increased the accuracy significantly. As it was demonstrated through the assessment of process and cancer subnetworks, this study allows researchers to construct and analyze context-specific networks via valid PPI sets and one can easily achieve subnetworks around proteins of interest at a specified confidence level. Copyright © 2016 Elsevier Ltd. All rights reserved.
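
    The cutoff-based selection and the sensitivity/specificity figures quoted in the abstract above can be illustrated with a minimal sketch. It does not reproduce the authors' scoring scheme: the function names, the toy scores, and the reference sets of known positive and negative pairs are all hypothetical, and only the strict "score greater than 0.80" rule is taken from the abstract.

```python
from typing import Dict, Set, Tuple

Pair = Tuple[str, str]  # an interaction, e.g. ("A", "B"); illustrative type only


def high_confidence(scores: Dict[Pair, float], cutoff: float = 0.80) -> Dict[Pair, float]:
    """Keep only interactions whose confidence score is strictly above the cutoff."""
    return {pair: s for pair, s in scores.items() if s > cutoff}


def sensitivity_specificity(
    scores: Dict[Pair, float],
    positives: Set[Pair],
    negatives: Set[Pair],
    cutoff: float = 0.80,
) -> Tuple[float, float]:
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP).

    `positives` and `negatives` are hypothetical reference sets and are
    assumed to be non-empty; unscored pairs are treated as score 0.0.
    """
    tp = sum(1 for p in positives if scores.get(p, 0.0) > cutoff)
    fn = len(positives) - tp
    tn = sum(1 for n in negatives if scores.get(n, 0.0) <= cutoff)
    fp = len(negatives) - tn
    return tp / (tp + fn), tn / (tn + fp)


# Toy data (invented for illustration, not from the paper).
toy_scores = {("A", "B"): 0.95, ("A", "C"): 0.60, ("B", "D"): 0.85, ("C", "D"): 0.30}
toy_positives = {("A", "B"), ("A", "C")}
toy_negatives = {("B", "D"), ("C", "D")}
```

    Raising the cutoff in such a scheme trades coverage for reliability, which is why the abstract reports coverage alongside sensitivity and specificity.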

  12. Lipid rafts exist as stable cholesterol-independent microdomains in the brush border membrane of enterocytes

    DEFF Research Database (Denmark)

    Hansen, Gert Helge; Immerdal, Lissi; Thorsen, Evy

    2001-01-01

    Glycosphingolipid/cholesterol-rich membranes ("rafts") can be isolated from many types of cells, but their existence as stable microdomains in the cell membrane has been elusive. Addressing this problem, we studied the distribution of galectin-4, a raft marker, and lactase, a protein excluded from rafts, on microvillar vesicles from the enterocyte brush border membrane. Magnetic beads coated with either anti-galectin-4 or anti-lactase antibodies were used for immunoisolation of vesicles followed by double immunogold labeling of the two proteins. A morphometric analysis revealed subpopulations of raft-rich and raft-poor vesicles by the following criteria: 1) the lactase/galectin-4 labeling ratio/vesicle captured by the anti-lactase beads was significantly higher (p

  13. Protein-Protein Interaction Network and Gene Ontology

    Science.gov (United States)

    Choi, Yunkyu; Kim, Seok; Yi, Gwan-Su; Park, Jinah

    Evolution of computer technologies makes it possible to access a large amount and various kinds of biological data via the internet, such as DNA sequences, proteomics data and information discovered about them. It is expected that the combination of various data could help researchers find further knowledge about them. Roles of a visualization system are to invoke human abilities to integrate information and to recognize certain patterns in the data. Thus, when the various kinds of data are examined and analyzed manually, an effective visualization system is an essential part. One instance of these integrated visualizations can be the combination of protein-protein interaction (PPI) data and Gene Ontology (GO), which could help enhance the analysis of the PPI network. We introduce a simple but comprehensive visualization system that integrates GO and PPI data, where GO and PPI graphs are visualized side-by-side, and that supports quick reference functions between them. Furthermore, the proposed system provides several interactive visualization methods for efficiently analyzing the PPI network and the GO directed acyclic graph, such as context-based browsing and common ancestors finding.

  14. Chatty Mitochondria: Keeping Balance in Cellular Protein Homeostasis.

    Science.gov (United States)

    Topf, Ulrike; Wrobel, Lidia; Chacinska, Agnieszka

    2016-08-01

    Mitochondria are multifunctional cellular organelles that host many biochemical pathways including oxidative phosphorylation (OXPHOS). Defective mitochondria pose a threat to cellular homeostasis and compensatory responses exist to curtail the source of stress and/or its consequences. The mitochondrial proteome comprises proteins encoded by the nuclear and mitochondrial genomes. Disturbances in protein homeostasis may originate from mistargeting of nuclear encoded mitochondrial proteins. Defective protein import and accumulation of mistargeted proteins leads to stress that triggers translation alterations and proteasomal activation. These cytosolic pathways are complementary to the mitochondrial unfolded protein response (UPRmt) that aims to increase the capacity of protein quality control mechanisms inside mitochondria. They constitute putative targets for interventions aimed at increasing the fitness, stress resistance, and longevity of cells and organisms. Copyright © 2016 Elsevier Ltd. All rights reserved.

  15. Existing and new techniques in uranium exploration

    International Nuclear Information System (INIS)

    Bowie, S.H.U.; Cameron, J.

    1976-01-01

    The demands on uranium exploration over the next 25 years will be very great indeed and will call for every possible means of improvement in exploration capability. The first essential is to increase geological knowledge of the mode of occurrence of uranium ore deposits. The second is to improve existing exploration techniques and instrumentation while, at the same time, promoting research and development on new methods to discover uranium ore bodies on the earth's surface and at depth. The present symposium is an effort to increase co-operation and the exchange of information in the critical field of uranium exploration techniques and instrumentation. As an introduction to the symposium a brief review is presented, firstly of what can be considered as existing techniques and, secondly, of techniques which have not yet been used on an appreciable scale. Some fourteen techniques used over the last 30 years are identified and their appropriate application, advantages and limitations are briefly summarized and the possibilities of their further development considered. The aim of future research on new techniques, in addition to finding new ways and means of identifying surface deposits, should be mainly directed to devising methods and instrumentation capable of detecting buried ore bodies that do not give a gamma signal at the surface. To achieve this aim, two contributory factors are essential: adequate financial support for research and development and increased specialized training in uranium exploration and instrumentation design. The papers in this symposium describe developments in the existing techniques, proposals for future research and development and case histories of exploration programmes

  16. PaperBLAST: Text Mining Papers for Information about Homologs.

    Science.gov (United States)

    Price, Morgan N; Arkin, Adam P

    2017-01-01

    Large-scale genome sequencing has identified millions of protein-coding genes whose function is unknown. Many of these proteins are similar to characterized proteins from other organisms, but much of this information is missing from annotation databases and is hidden in the scientific literature. To make this information accessible, PaperBLAST uses EuropePMC to search the full text of scientific articles for references to genes. PaperBLAST also takes advantage of curated resources (Swiss-Prot, GeneRIF, and EcoCyc) that link protein sequences to scientific articles. PaperBLAST's database includes over 700,000 scientific articles that mention over 400,000 different proteins. Given a protein of interest, PaperBLAST quickly finds similar proteins that are discussed in the literature and presents snippets of text from relevant articles or from the curators. PaperBLAST is available at http://papers.genomics.lbl.gov/. IMPORTANCE With the recent explosion of genome sequencing data, there are now millions of uncharacterized proteins. If a scientist becomes interested in one of these proteins, it can be very difficult to find information as to its likely function. Often a protein whose sequence is similar, and which is likely to have a similar function, has been studied already, but this information is not available in any database. To help find articles about similar proteins, PaperBLAST searches the full text of scientific articles for protein identifiers or gene identifiers, and it links these articles to protein sequences. Then, given a protein of interest, it can quickly find similar proteins in its database by using standard software (BLAST), and it can show snippets of text from relevant papers. We hope that PaperBLAST will make it easier for biologists to predict proteins' functions.

  17. Twisting, supercoiling and stretching in protein bound DNA

    Science.gov (United States)

    Lam, Pui-Man; Zhen, Yi

    2018-04-01

    We have calculated theoretical results for the torque and slope of the twisted DNA, with various proteins bound on it, using the Neukirch-Marko model, in the regime where plectonemes exist. We found that the torque in the protein bound DNA decreases compared to that in the bare DNA. This is caused by the decrease in the free energy g(f) , and hence the smaller persistence lengths, in the case of protein bound DNA. We hope our results will encourage experimental investigations of supercoiling in protein bound DNA, which can provide further tests of the Neukirch-Marko model.

  18. Improving protein fold recognition by extracting fold-specific features from predicted residue-residue contacts.

    Science.gov (United States)

    Zhu, Jianwei; Zhang, Haicang; Li, Shuai Cheng; Wang, Chao; Kong, Lupeng; Sun, Shiwei; Zheng, Wei-Mou; Bu, Dongbo

    2017-12-01

    Accurate recognition of protein fold types is a key step for template-based prediction of protein structures. The existing approaches to fold recognition mainly exploit the features derived from alignments of query protein against templates. These approaches have been shown to be successful for fold recognition at family level, but usually failed at superfamily/fold levels. To overcome this limitation, one of the key points is to explore more structurally informative features of proteins. Although residue-residue contacts carry abundant structural information, how to thoroughly exploit this information for fold recognition still remains a challenge. In this study, we present an approach (called DeepFR) to improve fold recognition at superfamily/fold levels. The basic idea of our approach is to extract fold-specific features from predicted residue-residue contacts of proteins using deep convolutional neural network (DCNN) technique. Based on these fold-specific features, we calculated similarity between query protein and templates, and then assigned query protein with fold type of the most similar template. DCNN has shown excellent performance in image feature extraction and image recognition; the rationale underlying the application of DCNN for fold recognition is that contact likelihood maps are essentially analogous to images, as they both display compositional hierarchy. Experimental results on the LINDAHL dataset suggest that even using the extracted fold-specific features alone, our approach achieved success rate comparable to the state-of-the-art approaches. When further combining these features with traditional alignment-related features, the success rate of our approach increased to 92.3%, 82.5% and 78.8% at family, superfamily and fold levels, respectively, which is about 18% higher than the state-of-the-art approach at fold level, 6% higher at superfamily level and 1% higher at family level.
An independent assessment on SCOP_TEST dataset showed consistent
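
    The final step described in the abstract above, assigning a query the fold type of its most similar template, can be sketched as a nearest-neighbour lookup over feature vectors. This is only an illustration under stated assumptions: the abstract does not name the similarity measure, so cosine similarity is assumed here, and the function names, fold labels, and toy vectors are hypothetical rather than DeepFR's actual features.

```python
import math
from typing import List, Sequence, Tuple


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def assign_fold(
    query: Sequence[float],
    templates: List[Tuple[str, Sequence[float]]],
) -> Tuple[str, float]:
    """Return (fold label, similarity) of the template most similar to the query.

    `templates` is a non-empty list of (fold_label, feature_vector) pairs.
    Cosine similarity is an assumption; the source does not specify the measure.
    """
    best_label, best_sim = templates[0][0], cosine(query, templates[0][1])
    for label, features in templates[1:]:
        sim = cosine(query, features)
        if sim > best_sim:
            best_label, best_sim = label, sim
    return best_label, best_sim


# Toy feature vectors and labels (invented for illustration).
toy_templates = [("fold_a", [1.0, 0.0, 0.0]), ("fold_b", [0.0, 1.0, 1.0])]
toy_query = [0.1, 0.9, 0.8]
```

    With the toy data, `assign_fold(toy_query, toy_templates)` selects `fold_b`, because the query points almost entirely along the second and third feature dimensions.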

  19. DichroMatch at the protein circular dichroism data bank (DM@PCDDB): A web-based tool for identifying protein nearest neighbors using circular dichroism spectroscopy.

    Science.gov (United States)

    Whitmore, Lee; Mavridis, Lazaros; Wallace, B A; Janes, Robert W

    2018-01-01

    Circular dichroism spectroscopy is a well-used, but simple method in structural biology for providing information on the secondary structure and folds of proteins. DichroMatch (DM@PCDDB) is an online tool that is newly available in the Protein Circular Dichroism Data Bank (PCDDB), which takes advantage of the wealth of spectral and metadata deposited therein, to enable identification of spectral nearest neighbors of a query protein based on four different methods of spectral matching. DM@PCDDB can potentially provide novel information about structural relationships between proteins and can be used in comparison studies of protein homologs and orthologs. © 2017 The Authors Protein Science published by Wiley Periodicals, Inc. on behalf of The Protein Society.

  20. Quantitative protein localization signatures reveal an association between spatial and functional divergences of proteins.

    Science.gov (United States)

    Loo, Lit-Hsin; Laksameethanasan, Danai; Tung, Yi-Ling

    2014-03-01

    Protein subcellular localization is a major determinant of protein function. However, this important protein feature is often described in terms of discrete and qualitative categories of subcellular compartments, and therefore it has limited applications in quantitative protein function analyses. Here, we present Protein Localization Analysis and Search Tools (PLAST), an automated analysis framework for constructing and comparing quantitative signatures of protein subcellular localization patterns based on microscopy images. PLAST produces human-interpretable protein localization maps that quantitatively describe the similarities in the localization patterns of proteins and major subcellular compartments, without requiring manual assignment or supervised learning of these compartments. Using the budding yeast Saccharomyces cerevisiae as a model system, we show that PLAST is more accurate than existing, qualitative protein localization annotations in identifying known co-localized proteins. Furthermore, we demonstrate that PLAST can reveal protein localization-function relationships that are not obvious from these annotations. First, we identified proteins that have similar localization patterns and participate in closely-related biological processes, but do not necessarily form stable complexes with each other or localize at the same organelles. Second, we found an association between spatial and functional divergences of proteins during evolution. Surprisingly, as proteins with common ancestors evolve, they tend to develop more diverged subcellular localization patterns, but still occupy similar numbers of compartments. This suggests that divergence of protein localization might be more frequently due to the development of more specific localization patterns over ancestral compartments than the occupation of new compartments. 
PLAST enables systematic and quantitative analyses of protein localization-function relationships, and will be useful to elucidate protein

  1. iPhos-PseEvo: Identifying Human Phosphorylated Proteins by Incorporating Evolutionary Information into General PseAAC via Grey System Theory.

    Science.gov (United States)

    Qiu, Wang-Ren; Sun, Bi-Qian; Xiao, Xuan; Xu, Dong; Chou, Kuo-Chen

    2017-05-01

    Protein phosphorylation plays a critical role in the human body by altering the structural conformation of a protein, causing it to become activated or deactivated, or modifying its function. Given an uncharacterized protein sequence, can we predict whether it may be phosphorylated or not? This is no doubt a very meaningful problem for both basic research and drug development. Unfortunately, to the best of our knowledge, so far no high-throughput bioinformatics tool whatsoever has been developed to address such a very basic but important problem, due to its extreme complexity and the lack of sufficient training data. Here we proposed a predictor called iPhos-PseEvo by (1) incorporating the protein sequence evolutionary information into the general pseudo amino acid composition (PseAAC) via the grey system theory, (2) balancing out the skewed training datasets by the asymmetric bootstrap approach, and (3) constructing an ensemble predictor by fusing an array of individual random forest classifiers through a voting system. Rigorous jackknife tests have indicated that very promising success rates have been achieved by iPhos-PseEvo even for such a difficult problem. A user-friendly web-server for iPhos-PseEvo has been established at http://www.jci-bioinfo.cn/iPhos-PseEvo, by which users can easily obtain their desired results without the need to go through the complicated mathematical equations involved. It has not escaped our notice that the formulation and approach presented here can be used to analyze many other problems in protein science as well. © 2017 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.

  2. Bibliography - Existing Guidance for External Hazard Modelling

    International Nuclear Information System (INIS)

    Decker, Kurt

    2015-01-01

    The bibliography of deliverable D21.1 includes existing international and national guidance documents and standards on external hazard assessment together with a selection of recent scientific papers, which are regarded to provide useful information on the state of the art of external event modelling. The literature database is subdivided into International Standards, National Standards, and Science Papers. The deliverable is treated as a 'living document' which is regularly updated as necessary during the lifetime of ASAMPSA-E. The current content of the database is about 140 papers. Most of the articles are available as full-text versions in PDF format. The deliverable is available as an EndNote X4 database and as text files. The database includes the following information: Reference, Key words, Abstract (if available), PDF file of the original paper (if available), Notes (comments by the ASAMPSA-E consortium if available) The database is stored at the ASAMPSA-E FTP server hosted by IRSN. PDF files of original papers are accessible through the EndNote software

  3. Arraying proteins by cell-free synthesis.

    Science.gov (United States)

    He, Mingyue; Wang, Ming-Wei

    2007-10-01

    Recent advances in life science have led to great motivation for the development of protein arrays to study functions of genome-encoded proteins. While traditional cell-based methods have been commonly used for generating protein arrays, they are usually a time-consuming process with a number of technical challenges. Cell-free protein synthesis offers an attractive system for making protein arrays: not only does it rapidly convert the genetic information into functional proteins without the need for DNA cloning, but it also presents a flexible environment amenable to production of folded proteins or proteins with defined modifications. Recent advancements have made it possible to rapidly generate protein arrays from PCR DNA templates through parallel on-chip protein synthesis. This article reviews current cell-free protein array technologies and their proteomic applications.

  4. Interactions among tobacco sieve element occlusion (SEO) proteins.

    Science.gov (United States)

    Jekat, Stephan B; Ernst, Antonia M; Zielonka, Sascia; Noll, Gundula A; Prüfer, Dirk

    2012-12-01

    Angiosperms transport their photoassimilates through sieve tubes, which comprise longitudinally-connected sieve elements. In dicots and also some monocots, the sieve elements contain parietal structural proteins known as phloem proteins or P-proteins. Following injury, P-proteins disperse and accumulate as viscous plugs at the sieve plates to prevent the loss of valuable transport sugars. Tobacco (Nicotiana tabacum) P-proteins are multimeric complexes comprising subunits encoded by members of the SEO (sieve element occlusion) gene family. The existence of multiple subunits suggests that P-protein assembly involves interactions between SEO proteins, but this process is largely uncharacterized and it is unclear whether the different subunits perform unique roles or are redundant. We therefore extended our analysis of the tobacco P-proteins NtSEO1 and NtSEO2 to investigate potential interactions between them, and found that both proteins can form homomeric and heteromeric complexes in planta.

  5. Analysis of long-range correlation in sequences data of proteins

    OpenAIRE

    ADRIANA ISVORAN; LAURA UNIPAN; DANA CRACIUN; VASILE MORARIU

    2007-01-01

    The results presented here suggest the existence of correlations in the sequence data of proteins. 32 proteins, both globular and fibrous, both monomeric and polymeric, were analyzed. The primary structures of these proteins were treated as time series. Three spatial series of data for each sequence of a protein were generated from numerical correspondences between each amino acid and a physical property associated with it, i.e., its electric charge, its polar character and its dipole moment....

  6. Analysis of deep learning methods for blind protein contact prediction in CASP12.

    Science.gov (United States)

    Wang, Sheng; Sun, Siqi; Xu, Jinbo

    2018-03-01

    Here we present the results of protein contact prediction achieved in CASP12 by our RaptorX-Contact server, which is an early implementation of our deep learning method for contact prediction. On a set of 38 free-modeling target domains with a median family size of around 58 effective sequences, our server obtained an average top L/5 long- and medium-range contact accuracy of 47% and 44%, respectively (L = length). A complete implementation has an average accuracy of 59% and 57%, respectively. Our deep learning method formulates contact prediction as a pixel-level image labeling problem and simultaneously predicts all residue pairs of a protein using a combination of two deep residual neural networks, taking as input the residue conservation information, predicted secondary structure and solvent accessibility, contact potential, and coevolution information. Our approach differs from existing methods mainly in (1) formulating contact prediction as a pixel-level image labeling problem instead of an image-level classification problem; (2) simultaneously predicting all contacts of an individual protein to make effective use of contact occurrence patterns; and (3) integrating both one-dimensional and two-dimensional deep convolutional neural networks to effectively learn complex sequence-structure relationship including high-order residue correlation. This paper discusses the RaptorX-Contact pipeline, both contact prediction and contact-based folding results, and finally the strength and weakness of our method. © 2017 Wiley Periodicals, Inc.
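
    The "top L/5" accuracy quoted in the abstract above can be made concrete with a short sketch. It is not the RaptorX-Contact or CASP evaluation code. The function name and arguments are hypothetical, and the sequence-separation ranges (medium range 12 to 23 residues, long range 24 or more) are the commonly used convention, assumed here because the abstract does not state them.

```python
from typing import List, Optional


def top_l_over_k_accuracy(
    prob: List[List[float]],
    truth: List[List[bool]],
    min_sep: int,
    max_sep: Optional[int] = None,
    k: int = 5,
) -> float:
    """Fraction of true contacts among the L/k highest-scoring residue pairs.

    prob[i][j]  : predicted contact probability for residues i < j
    truth[i][j] : True if residues i and j are in contact in the native structure
    min_sep, max_sep : allowed sequence separation |i - j| (assumed convention:
                       12..23 for medium range, >= 24 for long range)
    """
    length = len(prob)  # L, the protein length
    candidates = []
    for i in range(length):
        for j in range(i + 1, length):
            sep = j - i
            if sep >= min_sep and (max_sep is None or sep <= max_sep):
                candidates.append((prob[i][j], i, j))
    # Rank pairs by predicted probability, highest first (ties keep input order).
    candidates.sort(key=lambda item: item[0], reverse=True)
    top = candidates[: max(1, length // k)]
    if not top:
        return 0.0
    return sum(1 for _, i, j in top if truth[i][j]) / len(top)
```

    For a protein of length 30, for example, the function scores the 6 most confident pairs in the chosen separation range and reports how many of them are true contacts.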

  7. Fast loop modeling for protein structures

    Science.gov (United States)

    Zhang, Jiong; Nguyen, Son; Shang, Yi; Xu, Dong; Kosztin, Ioan

    2015-03-01

    X-ray crystallography is the main method for determining 3D protein structures. In many cases, however, flexible loop regions of proteins cannot be resolved by this approach. This leads to incomplete structures in the protein data bank, preventing further computational study and analysis of these proteins. For instance, all-atom molecular dynamics (MD) simulation studies of structure-function relationship require complete protein structures. To address this shortcoming, we have developed and implemented an efficient computational method for building missing protein loops. The method is database driven and uses deep learning and multi-dimensional scaling algorithms. We have implemented the method as a simple stand-alone program, which can also be used as a plugin in existing molecular modeling software, e.g., VMD. The quality and stability of the generated structures are assessed and tested via energy scoring functions and by equilibrium MD simulations. The proposed method can also be used in template-based protein structure prediction. Work supported by the National Institutes of Health [R01 GM100701]. Computer time was provided by the University of Missouri Bioinformatics Consortium.

  8. Roles of Apicomplexan protein kinases at each life cycle stage.

    Science.gov (United States)

    Kato, Kentaro; Sugi, Tatsuki; Iwanaga, Tatsuya

    2012-06-01

    Inhibitors of cellular protein kinases have been reported to inhibit the development of Apicomplexan parasites, suggesting that the functions of protozoan protein kinases are critical for their life cycle. However, the specific roles of these protein kinases cannot be determined using only these inhibitors without molecular analysis, including gene disruption. In this report, we describe the functions of Apicomplexan protein kinases in each parasite life stage and the potential of pre-existing protein kinase inhibitors as Apicomplexan drugs against, mainly, Plasmodium and Toxoplasma. Copyright © 2011 Elsevier Ireland Ltd. All rights reserved.

  9. Exploring the Spatiotemporal Organization of Membrane Proteins in Living Plant Cells.

    Science.gov (United States)

    Wang, Li; Xue, Yiqun; Xing, Jingjing; Song, Kai; Lin, Jinxing

    2018-04-29

    Plasma membrane proteins have important roles in transport and signal transduction. Deciphering the spatiotemporal organization of these proteins provides crucial information for elucidating the links between the behaviors of different molecules. However, monitoring membrane proteins without disrupting their membrane environment remains difficult. Over the past decade, many studies have developed single-molecule techniques, opening avenues for probing the stoichiometry and interactions of membrane proteins in their native environment by providing nanometer-scale spatial information and nanosecond-scale temporal information. In this review, we assess recent progress in the development of labeling and imaging technology for membrane protein analysis. We focus in particular on several single-molecule techniques for quantifying the dynamics and assembly of membrane proteins. Finally, we provide examples of how these new techniques are advancing our understanding of the complex biological functions of membrane proteins.

  10. Dynamics in electron transfer protein complexes

    NARCIS (Netherlands)

    Bashir, Qamar

    2010-01-01

    Recent studies have provided experimental evidence for the existence of an encounter complex, a transient intermediate in the formation of protein complexes. We have used paramagnetic relaxation enhancement NMR spectroscopy in combination with Monte Carlo simulations to characterize and visualize

  11. A collaborative filtering approach for protein-protein docking scoring functions.

    Science.gov (United States)

    Bourquard, Thomas; Bernauer, Julie; Azé, Jérôme; Poupon, Anne

    2011-04-22

    A protein-protein docking procedure traditionally consists of two successive tasks: a search algorithm generates a large number of candidate conformations mimicking the complex existing in vivo between two proteins, and a scoring function is used to rank them in order to extract a native-like one. We have already shown that, using Voronoi constructions and a well-chosen set of parameters, an accurate scoring function can be designed and optimized. However, to be able to perform large-scale in silico exploration of the interactome, a near-native solution has to be found in the ten best-ranked solutions. This cannot yet be guaranteed by any of the existing scoring functions. In this work, we introduce a new procedure for conformation ranking. We previously developed a set of scoring functions where learning was performed using a genetic algorithm. These functions were used to assign a rank to each possible conformation. We now refine this rank using different classifiers (decision trees, rules and support vector machines) in a collaborative filtering scheme. The newly obtained scoring function is evaluated using 10-fold cross-validation, and compared to the functions obtained using either genetic algorithms or collaborative filtering taken separately. This new approach was successfully applied to the CAPRI scoring ensembles. We show that for 10 targets out of 12, we are able to find a near-native conformation in the 10 best-ranked solutions. Moreover, for 6 of them, the near-native conformation selected is of high accuracy. Finally, we show that this function dramatically enriches the 100 best-ranking conformations in near-native structures.
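
    The "near-native conformation among the k best-ranked solutions" criterion can be illustrated with a small sketch. This is not the authors' scoring function or their collaborative filtering scheme: it substitutes a plain mean-rank aggregation over several hypothetical scorers, then checks whether a near-native conformation lands in the top k. All function names, conformation ids, and ranks below are invented for illustration.

```python
from statistics import mean


def aggregate_ranks(rank_lists):
    """Combine per-scorer rankings by mean rank (lower is better).

    rank_lists: list of dicts mapping conformation id -> rank (1 = best).
    Assumes every scorer ranks the same set of conformations.
    Returns conformation ids ordered from best to worst combined rank.
    """
    ids = list(rank_lists[0].keys())
    combined = {cid: mean(r[cid] for r in rank_lists) for cid in ids}
    # Ties in combined rank are broken by id so the order is deterministic.
    return sorted(ids, key=lambda cid: (combined[cid], cid))


def top_k_hit(ordered_ids, near_native_ids, k=10):
    """True if any near-native conformation is among the k best-ranked."""
    return any(cid in near_native_ids for cid in ordered_ids[:k])


# Hypothetical toy data: three scorers ranking four candidate conformations.
scorer_a = {"c1": 3, "c2": 1, "c3": 2, "c4": 4}
scorer_b = {"c1": 1, "c2": 3, "c3": 2, "c4": 4}
scorer_c = {"c1": 2, "c2": 4, "c3": 1, "c4": 3}

ordered = aggregate_ranks([scorer_a, scorer_b, scorer_c])
hit = top_k_hit(ordered, near_native_ids={"c3"}, k=2)
```

    Counting the targets for which `top_k_hit` returns true, out of all targets, would give a success rate in the style of the "10 targets out of 12" figure quoted above, under the assumption that near-native labels are known for each target.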

  12. Whirlin and PDZ domain-containing 7 (PDZD7) proteins are both required to form the quaternary protein complex associated with Usher syndrome type 2.

    Science.gov (United States)

    Chen, Qian; Zou, Junhuang; Shen, Zuolian; Zhang, Weiping; Yang, Jun

    2014-12-26

    Usher syndrome (USH) is the leading genetic cause of combined hearing and vision loss. Among the three USH clinical types, type 2 (USH2) occurs most commonly. USH2A, GPR98, and WHRN are three known causative genes of USH2, whereas PDZD7 is a modifier gene found in USH2 patients. The proteins encoded by these four USH genes have been proposed to form a multiprotein complex, the USH2 complex, due to interactions found among some of these proteins in vitro, their colocalization in vivo, and mutual dependence of some of these proteins for their normal in vivo localizations. However, evidence showing the formation of the USH2 complex is missing, and details on how this complex is formed remain elusive. Here, we systematically investigated interactions among the intracellular regions of the four USH proteins using colocalization, yeast two-hybrid, and pull-down assays. We show that multiple domains of the four USH proteins interact with one another. Importantly, both WHRN and PDZD7 are required for the complex formation with USH2A and GPR98. In this USH2 quaternary complex, WHRN prefers to bind to USH2A, whereas PDZD7 prefers to bind to GPR98. Interaction between WHRN and PDZD7 is the bridge between USH2A and GPR98. Additionally, the USH2 quaternary complex has a variable stoichiometry. These findings suggest that a non-obligate, short-term, and dynamic USH2 quaternary protein complex may exist in vivo. Our work provides valuable insight into the physiological role of the USH2 complex in vivo and informs possible reconstruction of the USH2 complex for future therapy. © 2014 by The American Society for Biochemistry and Molecular Biology, Inc.

  13. Kinome signaling through regulated protein-protein interactions in normal and cancer cells.

    Science.gov (United States)

    Pawson, Tony; Kofler, Michael

    2009-04-01

    The flow of molecular information through normal and oncogenic signaling pathways frequently depends on protein phosphorylation, mediated by specific kinases, and the selective binding of the resulting phosphorylation sites to interaction domains present on downstream targets. This physical and functional interplay of catalytic and interaction domains can be clearly seen in cytoplasmic tyrosine kinases such as Src, Abl, Fes, and ZAP-70. Although the kinase and SH2 domains of these proteins possess similar intrinsic properties of phosphorylating tyrosine residues or binding phosphotyrosine sites, they also undergo intramolecular interactions when linked together, in a fashion that varies from protein to protein. These cooperative interactions can have diverse effects on substrate recognition and kinase activity, and provide a variety of mechanisms to link the stimulation of catalytic activity to substrate recognition. Taken together, these data have suggested how protein kinases, and the signaling pathways in which they are embedded, can evolve complex properties through the stepwise linkage of domains within single polypeptides or multi-protein assemblies.

  14. Human Information Behaviour and Design, Development and Evaluation of Information Retrieval Systems

    Science.gov (United States)

    Keshavarz, Hamid

    2008-01-01

    Purpose: The purpose of this paper is to introduce the concept of human information behaviour and to explore the relationship between information behaviour of users and the existing approaches dominating design and evaluation of information retrieval (IR) systems and also to describe briefly new design and evaluation methods in which extensive…

  15. APPROACH ON THE EXISTENCE OF INNOVATION IN TOURISM

    Directory of Open Access Journals (Sweden)

    Cristina BURGHELEA

    2015-04-01

    This article aims to highlight the existence of innovation in tourism, based on the international literature. During the conceptualization stage of the research, it was found that the definition is universally valid and can be applied in all sectors of the economy, including the tertiary sector, where tourist services belong. Whether taken from English or French, "innovation" denotes both a process and its results. By adapting to the constantly varying wishes of customers, innovation is a key element underpinning survival amid competition in a dynamic, radically changing environment. Current studies reveal that innovation also brings indirect benefits such as image enhancement, improved customer loyalty, and the ability to attract new customers. This study pays special attention to the long-term prospects of the tourism sector in countries and regions such as Australia, Latin America, Africa, and China, and in emerging markets such as India and Indonesia. The result is a tourism expenditure forecast for the period 2013-2019, produced using information provided by the Ministry of Business, Innovation and Employment.

  16. Applying new safeguards technology to existing nuclear facilities

    International Nuclear Information System (INIS)

    Harris, W.J.; Wagner, E.P.

    1979-01-01

    The application and operation of safeguards instrumentation in a facility containing special nuclear material is most successful when the installation is designed for the operation of the specific facility. Experience at the Idaho National Engineering Laboratory demonstrates that installation designs must consider both safeguards and production requirements of specific facilities. Equipment selection and installation design influenced by the training and experience of production operations and safeguards personnel at a specific facility help assure successful installation, reliable operation, and minimal operator training. This minimizes impacts on existing plant production activities while maximizing utility of the safeguards information obtained.

  17. Applying new safeguards technology to existing nuclear facilities

    International Nuclear Information System (INIS)

    Johnson, C.E.; Wagner, E.P.

    1979-01-01

    The application and operation of safeguards instrumentation in a facility containing special nuclear material is most successful when the installation is designed for the operation of the specific facility. Experience at the Idaho National Engineering Laboratory demonstrates that installation designs must consider both Safeguards and Production requirements of specific facilities. Equipment selection and installation design influenced by the training and experience of production operations and safeguards personnel at a specific facility help assure successful installation, reliable operation, and minimal operator training. This minimizes impacts on existing plant production activities while maximizing utility of the safeguards information obtained.

  18. EXTERNALITIES IN EXCHANGE NETWORKS: AN ADAPTATION OF EXISTING THEORIES OF EXCHANGE NETWORKS

    NARCIS (Netherlands)

    Dijkstra, Jacob

    2009-01-01

    The present paper extends the focus of network exchange research to externalities in exchange networks. Externalities of exchange are defined as direct effects on an actor's utility, of an exchange in which this actor is not involved. Existing theories in the field of network exchange do not inform

  19. Local versus nonlocal information in quantum-information theory: Formalism and phenomena

    International Nuclear Information System (INIS)

    Horodecki, Michal; Horodecki, Ryszard; Synak-Radtke, Barbara; Horodecki, Pawel; Oppenheim, Jonathan; Sen, Aditi; Sen, Ujjwal

    2005-01-01

    In spite of many results in quantum information theory, the complex nature of compound systems is far from clear. In general the information is a mixture of local and nonlocal ('quantum') information. It is important from both pragmatic and theoretical points of view to know the relationships between the two components. To make this point more clear, we develop and investigate the quantum-information processing paradigm in which parties sharing a multipartite state distill local information. The amount of information which is lost because the parties must use a classical communication channel is the deficit. This scheme can be viewed as complementary to the notion of distilling entanglement. After reviewing the paradigm in detail, we show that the upper bound for the deficit is given by the relative entropy distance to so-called pseudoclassically correlated states; the lower bound is the relative entropy of entanglement. This implies, in particular, that any entangled state is informationally nonlocal - i.e., has nonzero deficit. We also apply the paradigm to defining the thermodynamical cost of erasing entanglement. We show the cost is bounded from below by relative entropy of entanglement. We demonstrate the existence of several other nonlocal phenomena which can be found using the paradigm of local information. For example, we prove the existence of a form of nonlocality without entanglement and with distinguishability. We analyze the deficit for several classes of multipartite pure states and obtain that in contrast to the GHZ state, the Aharonov state is extremely nonlocal. We also show that there do not exist states for which the deficit is strictly equal to the whole informational content (bound local information). We discuss the relation of the paradigm with measures of classical correlations introduced earlier. It is also proved that in the one-way scenario, the deficit is additive for Bell diagonal states. We then discuss complementary features of

  20. A feedback framework for protein inference with peptides identified from tandem mass spectra

    Directory of Open Access Journals (Sweden)

    Shi Jinhong

    2012-11-01

    Background: Protein inference is an important computational step in proteomics. There exists a natural nested relationship between protein inference and peptide identification, but these two steps are usually performed separately in existing methods. We believe that both peptide identification and protein inference can be improved by exploiting this nested relationship. Results: In this study, a feedback framework is proposed to process peptide identification reports from search engines, and an iterative method is implemented to exemplify the processing of Sequest peptide identification reports according to the framework. The iterative method is verified on two datasets with known validity of proteins and peptides, and compared with ProteinProphet and PeptideProphet. The results show that the iterative method not only infers more true positive and fewer false positive proteins than ProteinProphet, but also identifies more true positive and fewer false positive peptides than PeptideProphet. Conclusions: The proposed iterative method, implemented according to the feedback framework, can unify and improve the results of peptide identification and protein inference.