WorldWideScience

Sample records for alpha-helical protein networks

  1. On the diffusion of alpha-helical proteins in solvents

    Science.gov (United States)

    Barredo, Wilson I.; Bornales, Jinky B.; Bernido, Christopher C.; Aringa, Henry P.

    2015-01-01

    The winding probability function for a biopolymer diffusing in a crowded cell is obtained with the drift coefficient f(s) involving Bessel functions of the general form f(s) = k J_{2p+1}(νs). The variable s is the length along the chain and ν is a constant which can be used to simulate the frequency of appearance of a certain type of amino acid. Application of the particular case p = 3 to protein chains is carried out for different alpha helical proteins found in the Protein Data Bank (PDB). Analysis of our results leads us to an empirical formula that can be used to conveniently predict k/D and ν, where D is the diffusion coefficient of various α-helical proteins in solvents.
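    As a rough illustration of the drift coefficient described above, the sketch below evaluates f(s) = k J_{2p+1}(νs) for p = 3, i.e. with the Bessel function J_7. The power-series evaluation and the parameter values k = 1.0 and ν = 0.5 are assumptions chosen for the example; they are not taken from the paper.

```python
import math


def bessel_j(n: int, x: float, terms: int = 40) -> float:
    """Bessel function of the first kind J_n(x), integer n >= 0.

    Uses the standard power series, which is adequate for moderate |x|
    (roughly |x| < 15); it is not a production-quality implementation.
    """
    total = 0.0
    for m in range(terms):
        coeff = (-1) ** m / (math.factorial(m) * math.factorial(m + n))
        total += coeff * (x / 2.0) ** (2 * m + n)
    return total


def drift(s: float, k: float, nu: float, p: int = 3) -> float:
    """Drift coefficient f(s) = k * J_{2p+1}(nu * s) from the abstract."""
    return k * bessel_j(2 * p + 1, nu * s)


if __name__ == "__main__":
    # Hypothetical parameters, for illustration only.
    k, nu = 1.0, 0.5
    for s in (0.0, 5.0, 10.0, 20.0):
        print(f"s = {s:5.1f}  f(s) = {drift(s, k, nu): .6f}")
```

    Because J_7 vanishes at the origin, the drift is zero at s = 0 and only becomes appreciable once νs is of order the Bessel index.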

  2. Alternative function for the mitochondrial SAM complex in biogenesis of alpha-helical TOM proteins.

    Science.gov (United States)

    Stojanovski, Diana; Guiard, Bernard; Kozjak-Pavlovic, Vera; Pfanner, Nikolaus; Meisinger, Chris

    2007-12-03

    The mitochondrial outer membrane contains two preprotein translocases: the general translocase of outer membrane (TOM) and the beta-barrel-specific sorting and assembly machinery (SAM). TOM functions as the central entry gate for nuclear-encoded proteins. The channel-forming Tom40 is a beta-barrel protein, whereas all Tom receptors and small Tom proteins are membrane anchored by a transmembrane alpha-helical segment in their N- or C-terminal portion. Synthesis of Tom precursors takes place in the cytosol, and their import occurs via preexisting TOM complexes. The precursor of Tom40 is then transferred to SAM for membrane insertion and assembly. Unexpectedly, we find that the biogenesis of alpha-helical Tom proteins with a membrane anchor in the C-terminal portion is SAM dependent. Each SAM protein is necessary for efficient membrane integration of the receptor Tom22, whereas assembly of the small Tom proteins depends on Sam37. Thus, the substrate specificity of SAM is not restricted to beta-barrel proteins but also includes the majority of alpha-helical Tom proteins.

  3. Crystal structure of tetranectin, a trimeric plasminogen-binding protein with an alpha-helical coiled coil

    DEFF Research Database (Denmark)

    Nielsen, B B; Kastrup, J S; Rasmussen, H

    1997-01-01

    Tetranectin is a plasminogen kringle 4-binding protein. The crystal structure has been determined at 2.8 Å resolution using molecular replacement. Human tetranectin is a homotrimer forming a triple alpha-helical coiled coil. Each monomer consists of a carbohydrate recognition domain (CRD) connected...... the third is present only in long-form CRDs. Tetranectin represents the first structure of a long-form CRD with intact calcium-binding sites. In tetranectin, the third disulfide bridge tethers the CRD to the long helix in the coiled coil. The trimerization of tetranectin as well as the fixation of the CRDs...... relative to the helices in the coiled coil indicate a demand for high specificity in the recognition and binding of ligands....

  4. HMM_RA: An Improved Method for Alpha-Helical Transmembrane Protein Topology Prediction

    Directory of Open Access Journals (Sweden)

    Changhui Yan

    2008-01-01

    α-helical transmembrane (TM) proteins play important and diverse functional roles in cells. The ability to predict the topology of these proteins is important for identifying functional sites and inferring function of membrane proteins. This paper presents a Hidden Markov Model (referred to as HMM_RA) that can predict the topology of α-helical transmembrane proteins with improved performance. HMM_RA adopts the same structure as the HMMTOP method, which has five modules: inside loop, inside helix tail, membrane helix, outside helix tail and outside loop. Each module consists of one or multiple states. HMM_RA allows using reduced alphabets to encode protein sequences. Thus, each state of HMM_RA is associated with n emission probabilities, where n is the size of the reduced alphabet set. Direct comparisons using two standard data sets show that HMM_RA consistently outperforms HMMTOP and TMHMM in topology prediction. Specifically, on a high-quality data set of 83 proteins, HMM_RA outperforms HMMTOP by up to 7.6% in topology accuracy and 6.4% in α-helices location accuracy. On the same data set, HMM_RA outperforms TMHMM by up to 6.4% in topology accuracy and 2.9% in location accuracy. Comparison also shows that HMM_RA achieves performance comparable to that of Phobius, a recently published method.
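    The reduced-alphabet idea in this abstract can be sketched as follows. The four-group alphabet below is a hypothetical grouping chosen for illustration (the abstract does not say which reduced alphabets HMM_RA uses), and the emission table uses one state per module with uniform probabilities purely as a placeholder, whereas the abstract allows several states per module.

```python
# Hypothetical 4-symbol reduced alphabet (an assumption, not HMM_RA's):
# h = hydrophobic, p = polar/small, + = basic, - = acidic.
REDUCED_GROUPS = {
    "h": "AVLIMFWC",
    "p": "STNQGPYH",
    "+": "KR",
    "-": "DE",
}
RESIDUE_TO_SYMBOL = {
    residue: symbol
    for symbol, residues in REDUCED_GROUPS.items()
    for residue in residues
}

# The five modules named in the abstract (shared with HMMTOP).
MODULES = (
    "inside loop",
    "inside helix tail",
    "membrane helix",
    "outside helix tail",
    "outside loop",
)


def encode(sequence: str) -> str:
    """Re-encode a protein sequence in the reduced alphabet."""
    return "".join(RESIDUE_TO_SYMBOL[residue] for residue in sequence.upper())


def uniform_emissions() -> dict:
    """Placeholder emission table: n probabilities per state, n = alphabet size.

    Simplification: one state per module, all emissions equally likely.
    A trained model would replace these values.
    """
    n = len(REDUCED_GROUPS)
    return {m: {symbol: 1.0 / n for symbol in REDUCED_GROUPS} for m in MODULES}


if __name__ == "__main__":
    print(encode("MKTAYIAKQR"))
    print(uniform_emissions()["membrane helix"])
```

    Shrinking the alphabet from 20 residues to n symbols reduces the number of emission parameters per state from 20 to n, which is the practical appeal of the approach when training data are limited.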

  5. Plasmodium vivax antigen discovery based on alpha-helical coiled coil protein motif

    DEFF Research Database (Denmark)

    Céspedes, Nora; Habel, Catherine; Lopez-Perez, Mary

    2014-01-01

    Protein α-helical coiled coil structures that elicit antibody responses, which block critical functions of medically important microorganisms, represent a means for vaccine development. By using bioinformatics algorithms, a total of 50 antigens with α-helical coiled coil motifs orthologous to Pla...

  6. Plasmodium vivax antigen discovery based on alpha-helical coiled coil protein motif.

    Directory of Open Access Journals (Sweden)

    Nora Céspedes

    Protein α-helical coiled coil structures that elicit antibody responses, which block critical functions of medically important microorganisms, represent a means for vaccine development. By using bioinformatics algorithms, a total of 50 antigens with α-helical coiled coil motifs orthologous to Plasmodium falciparum were identified in the P. vivax genome. The peptides identified in silico were chemically synthesized; circular dichroism studies indicated partial or high α-helical content. Antigenicity was evaluated using human serum samples from malaria-endemic areas of Colombia and Papua New Guinea. Eight of these fragments were selected and used to assess immunogenicity in BALB/c mice. ELISA assays indicated strong reactivity of serum samples from individuals residing in malaria-endemic regions, and of sera of immunized mice, with the α-helical coiled coil structures. In addition, ex vivo production of IFN-γ by murine mononuclear cells confirmed the immunogenicity of these structures and the presence of T-cell epitopes in the peptide sequences. Moreover, sera of mice immunized with four of the eight antigens recognized native proteins on blood-stage P. vivax parasites, and antigenic cross-reactivity with three of the peptides was observed when reacted with both the P. falciparum orthologous fragments and whole parasites. Results here point to the α-helical coiled coil peptides as possible P. vivax malaria vaccine candidates, as was observed for P. falciparum. Fragments selected here warrant further study in humans and non-human primate models to assess their protective efficacy as single components or assembled as hybrid linear epitopes.

  7. Temperature-dependent structural changes in intrinsically disordered proteins: formation of alpha-helices or loss of polyproline II?

    DEFF Research Database (Denmark)

    Kjærgaard, Magnus; Nørholm, Ann-Beth; Hendus-Altenburger, Ruth

    2010-01-01

    temperature, which most likely reflects formation of transient alpha-helices or loss of polyproline II (PPII) content. Using three IDPs, ACTR, NHE1, and Spd1, we show that the temperature-induced structural change is common among IDPs and is accompanied by a contraction of the conformational ensemble...... with increasing temperature, and accordingly these were not responsible for the change in the CD spectra. In contrast, the nonhelical regions exhibited a general temperature-dependent structural change that was independent of long-range interactions. The temperature-dependent CD spectroscopic signature of IDPs...... that has been amply documented can be rationalized to represent redistribution of the statistical coil involving a general loss of PPII conformations....

  8. Integrability and soliton solutions for an inhomogeneous generalized fourth-order nonlinear Schrödinger equation describing the inhomogeneous alpha helical proteins and Heisenberg ferromagnetic spin chains

    International Nuclear Information System (INIS)

    Wang, Pan; Tian, Bo; Jiang, Yan; Wang, Yu-Feng

    2013-01-01

    For describing the dynamics of alpha helical proteins with internal molecular excitations, nonlinear couplings between lattice vibrations and molecular excitations, and spin excitations in one-dimensional isotropic biquadratic Heisenberg ferromagnetic spin with the octupole–dipole interactions, we consider an inhomogeneous generalized fourth-order nonlinear Schrödinger equation. Based on the Ablowitz–Kaup–Newell–Segur system, infinitely many conservation laws for the equation are derived. Through the auxiliary function, bilinear forms and N-soliton solutions for the equation are obtained. Interactions of solitons are discussed by means of the asymptotic analysis. Effects of linear inhomogeneity on the interactions of solitons are also investigated graphically and analytically. Since the inhomogeneous coefficient of the equation is h = αx + β, the soliton takes on the parabolic profile during the evolution. Soliton velocity is related to the parameter α, distance scale coefficient and biquadratic exchange coefficient, but has no relation with the parameter β. Soliton amplitude and width are only related to α. Soliton position is related to β.

  9. The C-terminal portion of the fibrinogen-binding protein of Streptococcus equi subsp. equi contains extensive alpha-helical coiled-coil structure and contributes to thermal stability.

    Science.gov (United States)

    Meehan, Mary; Kelly, Sharon M; Price, Nicholas C; Owen, Peter

    2002-01-02

    The major cell wall-associated protein of the equine pathogen Streptococcus equi subsp. equi is a fibrinogen-binding protein (FgBP) which binds horse fibrinogen and equine IgG-Fc avidly through residues located in the N-terminal half and central regions of the molecule, respectively. The molecule is a major virulence factor for the organism and displays protective potential. In the present study, we use circular dichroism spectroscopy to investigate the secondary structure of the protein and show through the analysis of a panel of recombinant FgBP truncates that the C-terminal portion of FgBP contains an extensive alpha-helical coiled-coil structure that contributes to the thermal stability of the molecule.

  10. General architecture of the alpha-helical globule.

    Science.gov (United States)

    Murzin, A G; Finkelstein, A V

    1988-12-05

    A model is presented for the arrangement of alpha-helices in globular proteins. In the model, helices are placed on certain ribs of "quasi-spherical" polyhedra. The polyhedra are chosen so as to allow the close packing of helices around a hydrophobic core and to stress the collective interactions of the individual helices. The model predicts a small set of stable architectures for alpha-helices in globular proteins and describes the geometries of the helix packings. Some of the predicted helix arrangements have already been observed in known protein structures; others are new. An analysis of the three-dimensional structures of all proteins for which co-ordinates are available shows that the model closely approximates the arrangements and packing of helices actually observed. The average deviation of the real helix axes from those in the model polyhedra is ±20° in orientation and ±2 Å in position (1 Å = 0.1 nm). We also show that for proteins that are not homologous, but whose helix arrangements are described by the same polyhedron, the root-mean-square difference in the position of the Cα atoms in the helices is 1.6 to 3.0 Å.

  11. Amphipathic alpha-helices and putative cholesterol binding domains of the influenza virus matrix M1 protein are crucial for virion structure organisation.

    Science.gov (United States)

    Tsfasman, Tatyana; Kost, Vladimir; Markushin, Stanislav; Lotte, Vera; Koptiaeva, Irina; Bogacheva, Elena; Baratova, Ludmila; Radyukhin, Victor

    2015-12-02

    The influenza virus matrix M1 protein is an amphitropic membrane-associated protein, forming the matrix layer immediately beneath the virus raft membrane, thereby ensuring the proper structure of the influenza virion. The objective of this study was to elucidate M1 fine structural characteristics, which determine amphitropic properties and raft membrane activities of the protein, via 3D in silico modelling with subsequent mutational analysis. Computer simulations suggest the amphipathic nature of the M1 α-helices and the existence of putative cholesterol binding (CRAC) motifs on six amphipathic α-helices. Our finding explains for the first time many features of this protein, particularly the amphitropic properties and raft/cholesterol binding potential. To verify these results, we generated mutants of the A/WSN/33 strain via reverse genetics. The M1 mutations included F32Y in the CRAC of α-helix 2, W45Y and W45F in the CRAC of α-helix 3, Y100S in the CRAC of α-helix 6, M128A and M128S in the CRAC of α-helix 8 and a double L103I/L130I mutation in both a putative cholesterol consensus motif and the nuclear localisation signal. All mutations resulted in viruses with unusual filamentous morphology. Previous experimental data regarding the morphology of M1-gene mutant influenza viruses can now be explained in structural terms and are consistent with the pivotal role of the CRAC-domains and amphipathic α-helices in M1-lipid interactions. Copyright © 2015 Elsevier B.V. All rights reserved.

  12. Examining the Conservation of Kinks in Alpha Helices.

    Directory of Open Access Journals (Sweden)

    Eleanor C Law

    Kinks are a structural feature of alpha-helices and many are known to have functional roles. Kinks have previously tended to be defined in a binary fashion. In this paper we have deliberately moved towards defining them on a continuum, which given the unimodal distribution of kink angles is a better description. From this perspective, we examine the conservation of kinks in proteins. We find that kink angles are not generally a conserved property of homologs, pointing either to their not being functionally critical or to their function being related to conformational flexibility. In the latter case, the different structures of homologs are providing snapshots of different conformations. Sequence identity between homologous helices is informative in terms of kink conservation, but almost equally so is the sequence identity of residues in spatial proximity to the kink. In the specific case of proline, which is known to be prevalent in kinked helices, loss of a proline from a kinked helix often also results in the loss of a kink or reduction in its kink angle. We carried out a study of the seven transmembrane helices in the GPCR family and found that changes in kinks could be related both to subfamilies of GPCRs and also, in a particular subfamily, to the binding of agonists or antagonists. These results suggest conformational change upon receptor activation within the GPCR family. We also found correlation between kink angles in different helices, and the possibility of concerted motion could be investigated further by applying our method to molecular dynamics simulations. These observations reinforce the belief that helix kinks are key, functional, flexible points in structures.

  13. Temperature profiling of polypeptides in reversed-phase liquid chromatography. I. Monitoring of dimerization and unfolding of amphipathic alpha-helical peptides.

    Science.gov (United States)

    Mant, Colin T; Chen, Yuxin; Hodges, Robert S

    2003-08-15

    The present study sets out to extend the utility of reversed-phase liquid chromatography (RP-HPLC) by demonstrating its ability to monitor dimerization and unfolding of de novo designed synthetic amphipathic alpha-helical peptides on stationary phases of varying hydrophobicity. Thus, we have compared the effect of temperature (5-80 °C) on the RP-HPLC (C8 or cyano columns) elution behaviour of mixtures of peptides encompassing amphipathic alpha-helical structure, amphipathic alpha-helical structure with L- or D-substitutions or non-amphipathic alpha-helical structure. By comparing the retention behaviour of the helical peptides to a peptide of negligible secondary structure (a random coil), we rationalize that "temperature profiling" by RP-HPLC can monitor association of peptide molecules, either through oligomerization or aggregation, or monitor unfolding of alpha-helical peptides with increasing temperature. We believe that the conformation-dependent response of peptides to RP-HPLC under changing temperature has implications both for general analysis and purification of peptides and for the de novo design of peptides and proteins.

  14. Effects of hydrophobicity on the antifungal activity of alpha-helical antimicrobial peptides.

    NARCIS (Netherlands)

    Jiang, Z.; Kullberg, B.J.; Lee, H. van der; Vasil, A.I.; Hale, J.D.; Mant, C.T.; Hancock, R.E.; Vasil, M.L.; Netea, M.G.; Hodges, R.S.

    2008-01-01

    We utilized a series of analogs of D-V13K (a 26-residue amphipathic alpha-helical antimicrobial peptide, denoted D1) to compare and contrast the role of hydrophobicity on antifungal and antibacterial activity to the results obtained previously with Pseudomonas aeruginosa strains. Antifungal activity

  15. The activation energy for insertion of transmembrane alpha-helices is dependent on membrane composition.

    Science.gov (United States)

    Meijberg, Wim; Booth, Paula J

    2002-06-07

    The physical mechanisms that govern the folding and assembly of integral membrane proteins are poorly understood. It appears that certain properties of the lipid bilayer affect membrane protein folding in vitro, either by modulating helix insertion or packing. In order to begin to understand the origin of this effect, we investigate the effect of lipid forces on the insertion of a transmembrane alpha-helix using a water-soluble, alanine-based peptide, KKAAAIAAAAAIAAWAAIAAAKKKK-amide. This peptide binds to preformed 1,2-dioleoyl-l-alpha-phosphatidylcholine (DOPC) vesicles at neutral pH, but spontaneous transmembrane helix insertion directly from the aqueous phase only occurs at high pH when the Lys residues are de-protonated. These results suggest that the translocation of charge is a major determinant of the activation energy for insertion. Time-resolved measurements of the insertion process at high pH indicate biphasic kinetics with time constants of ca. 30 and 430 seconds. The slower phase seems to correlate with formation of a predominantly transmembrane alpha-helical conformation, as determined from the transfer of the tryptophan residue to the hydrocarbon region of the membrane. Temperature-dependent measurements showed that insertion can proceed only above a certain threshold temperature and that the Arrhenius activation energy is of the order of 90 kJ mol⁻¹. The kinetics, threshold temperature and the activation energy change with the mole fraction of 1,2-dioleoyl-l-alpha-phosphatidylethanolamine (DOPE) introduced into the DOPC membrane. The activation energy increases with increasing DOPE content, which could reflect the fact that this lipid drives the bilayer towards a non-bilayer transition and increases the lateral pressure in the lipid chain region. This suggests that folding events involving the insertion of helical segments across the bilayer can be controlled by lipid forces. (c) 2002 Elsevier Science Ltd.
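    To give a feel for what an Arrhenius activation energy of about 90 kJ/mol implies, the sketch below computes the ratio of rate constants at two temperatures from k = A exp(-Ea / RT). The two temperatures (25 °C and 35 °C) are hypothetical values picked for the example, and the calculation assumes both lie above the threshold temperature mentioned in the abstract and that the pre-exponential factor A is the same at both.

```python
import math

R = 8.314462618  # molar gas constant, J mol^-1 K^-1


def rate_ratio(ea_j_per_mol: float, t1_kelvin: float, t2_kelvin: float) -> float:
    """Return k(T2) / k(T1) for Arrhenius kinetics with a common prefactor."""
    return math.exp(-ea_j_per_mol / R * (1.0 / t2_kelvin - 1.0 / t1_kelvin))


if __name__ == "__main__":
    ea = 90_000.0            # ~90 kJ/mol, as quoted in the abstract
    t1, t2 = 298.15, 308.15  # hypothetical temperatures: 25 and 35 degrees C
    print(f"k({t2} K) / k({t1} K) = {rate_ratio(ea, t1, t2):.2f}")
```

    Under these assumptions a 10 K increase speeds insertion up by a little more than a factor of three, which illustrates why the kinetics are so sensitive to temperature and to changes in the activation energy with DOPE content.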

  16. A recurring two-hydrogen-bond motif incorporating a serine or threonine residue is found both at alpha-helical N termini and in other situations.

    Science.gov (United States)

    Wan, W Y; Milner-White, E J

    1999-03-12

    Side-chain hydroxyl residues in protein crystal structures often form hydrogen bonds with main-chain atoms. The most common bond arrangement is a four to five residue motif in which a serine or threonine is the first residue forming two characteristic hydrogen bonds to residues ahead of it in sequence. We call them ST-motifs, by analogy with the term Asx-motif we suggested for the related motifs with aspartate and asparagine residues. ST-motifs are common, there being just under one and a half in a typical protein subunit. Asx-motifs are even more common, such that 9% of the residues of an average protein consist of Asx or ST-motifs. Of the ST-motifs, three-quarters are at helical N termini, and the rest occur by themselves or in conjunction with beta-bulge loops. A third of all alpha-helices have either ST-motifs or Asx-motifs at their N termini. Previous work has emphasised the occurrence of the capping box at alpha-helical N termini, but the capping box occurs in only 5% of alpha-helical N termini; also, we point out that it can be regarded as a subset of the ST-motif (or, occasionally, of the Asx-motif). By comparing related sequences, the rates at which amino acid residues at the first position of ST or Asx-motifs interchange during evolution are examined. Serine/threonine and aspartate/asparagine interchange is rapid; inter-pair exchange is slower, but much faster than exchange with other amino acid residues. This is consistent with the general similarity of ST-motifs and Asx-motifs combined with some subtle structural differences between them that are described. Copyright 1999 Academic Press.

  17. Chain length dependence of the helix orientation in Langmuir-Blodgett monolayers of alpha-helical diblock copolypeptides

    NARCIS (Netherlands)

    Nguyen, Le-Thu T.; Ardana, Aditya; Vorenkamp, Eltjo J.; ten Brinke, Gerrit; Schouten, Arend J.

    2010-01-01

    The effect of chain length on the helix orientation of alpha-helical diblock copolypeptides in Langmuir and Langmuir-Blodgett monolayers is reported for the first time. Amphiphilic diblock copolypeptides (PLGA-b-PMLGSLGs) of poly(alpha-L-glutamic acid) (PLGA) and

  18. Alpha-helical hydrophobic polypeptides form proton-selective channels in lipid bilayers

    Science.gov (United States)

    Oliver, A. E.; Deamer, D. W.

    1994-01-01

    Proton translocation is important in membrane-mediated processes such as ATP-dependent proton pumps, ATP synthesis, bacteriorhodopsin, and cytochrome oxidase function. The fundamental mechanism, however, is poorly understood. To test the theoretical possibility that bundles of hydrophobic alpha-helices could provide a low energy pathway for ion translocation through the lipid bilayer, polyamino acids were incorporated into extruded liposomes and planar lipid membranes, and proton translocation was measured. Liposomes with incorporated long-chain poly-L-alanine or poly-L-leucine were found to have proton permeability coefficients 5 to 7 times greater than control liposomes, whereas short-chain polyamino acids had relatively little effect. Potassium permeability was not increased markedly by any of the polyamino acids tested. Analytical thin layer chromatography measurements of lipid content and a fluorescamine assay for amino acids showed that there were approximately 135 polyleucine or 65 polyalanine molecules associated with each liposome. Fourier transform infrared spectroscopy indicated that a major fraction of the long-chain hydrophobic peptides existed in an alpha-helical conformation. Single-channel recording in both 0.1 N HCl and 0.1 M KCl was also used to determine whether proton-conducting channels formed in planar lipid membranes (phosphatidylcholine/phosphatidylethanolamine, 1:1). Poly-L-leucine and poly-L-alanine in HCl caused a 10- to 30-fold increase in frequency of conductive events compared to that seen in KCl or by the other polyamino acids in either solution. This finding correlates well with the liposome observations in which these two polyamino acids caused the largest increase in membrane proton permeability but had little effect on potassium permeability. Poly-L-leucine was considerably more conductive than poly-L-alanine due primarily to larger event amplitudes and, to a lesser extent, a higher event frequency. Poly-L-leucine caused two

  19. Alpha helical structures in the leader sequence of human GLUD2 glutamate dehydrogenase responsible for mitochondrial import.

    Science.gov (United States)

    Kotzamani, Dimitra; Plaitakis, Andreas

    2012-09-01

    Human glutamate dehydrogenase (hGDH) exists in two highly homologous isoforms with a distinct regulatory and tissue expression profile: a housekeeping hGDH1 isoprotein encoded by the GLUD1 gene and an hGDH2 isoenzyme encoded by the GLUD2 gene. There is evidence that both isoenzymes are synthesized as pro-enzymes containing a 53 amino acid long N-terminal leader peptide that is cleaved upon translocation into the mitochondria. However, this GDH signal peptide is substantially larger than that of most nuclear DNA-encoded mitochondrial proteins, the leader sequence of which typically contains 17-35 amino acids and they often form a single amphipathic α-helix. To decode the structural elements that are essential for the mitochondrial targeting of human GDHs, we performed secondary structure analyses of their leader sequence. These analyses predicted, with 82% accuracy, that both leader peptides are positively charged and that they form two to three α-helices, separated by intermediate loops. The first α-helix of hGDH2 is strongly amphipathic, displaying both a positively charged surface and a hydrophobic plane. We then constructed GLUD2-EGFP deletion mutants and used them to transfect three mammalian cell lines (HEK293, COS 7 and SHSY-5Y). Confocal laser scanning microscopy, following co-transfection with pDsRed2-Mito mitochondrial targeting vector, revealed that deletion of the entire leader sequence prevented the enzyme from entering the mitochondria, resulting in its retention in the cytoplasm. Deletion of the first strongly amphipathic α-helix only was also sufficient to prevent the mitochondrial localization of the truncated protein. Moreover, truncated leader sequences, retaining the second and/or the third putative α-helix, failed to restore the mitochondrial import of hGDH2. As such, the first N-terminal alpha helical structure is crucial for the mitochondrial import of hGDH2 and these findings may have implications in understanding the evolutionary

  20. Consequences of non-uniformity in the stoichiometry of component fractions within one and two loops models of alpha-helical peptides

    Science.gov (United States)

    Atoms in biomolecular structures like alpha helices contain an array of distances and angles which include abundant multiple patterns of redundancies. Thus all peptide backbones contain the three-atom sequence N-C*-C, whereas the repeating set of four-atom sequences (N-C*-C-N, C*-C-N-C*, and C-N-C...

  1. Functional and genomic analyses of alpha-solenoid proteins.

    Directory of Open Access Journals (Sweden)

    David Fournier

    Alpha-solenoids are flexible protein structural domains formed by ensembles of alpha-helical repeats (Armadillo and HEAT repeats among others). While homology can be used to detect many of these repeats, some alpha-solenoids have very little sequence homology to proteins of known structure and we expect that many remain undetected. We previously developed a method for detection of alpha-helical repeats based on a neural network trained on a dataset of protein structures. Here we improved the detection algorithm and updated the training dataset using recently solved structures of alpha-solenoids. Unexpectedly, we identified occurrences of alpha-solenoids in solved protein structures that escaped attention, for example within the core of the catalytic subunit of PI3KC. Our results expand the current set of known alpha-solenoids. Application of our tool to the protein universe allowed us to detect their significant enrichment in proteins interacting with many proteins, confirming that alpha-solenoids are generally involved in protein-protein interactions. We then studied the taxonomic distribution of alpha-solenoids to discuss an evolutionary scenario for the emergence of this type of domain, speculating that alpha-solenoids have emerged in multiple taxa in independent events by convergent evolution. We observe a higher rate of alpha-solenoids in eukaryotic genomes and in some prokaryotic families, such as Cyanobacteria and Planctomycetes, which could be associated with increased cellular complexity. The method is available at http://cbdm.mdc-berlin.de/~ard2/.

  2. Designed low amphipathic peptides with alpha-helical propensity exhibiting antimicrobial activity via a lipid domain formation mechanism.

    Science.gov (United States)

    Yamamoto, Naoki; Tamura, Atsuo

    2010-05-01

    Although several low amphipathic peptides have been known to exhibit antimicrobial activity, their mode of action has not been completely elucidated. In this study, using designed low amphipathic peptides that retain different alpha-helical content and hydrophobicity, we attempted to investigate the mechanism of these properties. Calorimetric and thermodynamic analyses demonstrated that the peptides induce formation of two lipid domains in an anionic liposome at a high peptide-to-lipid ratio. On the other hand, even at a low peptide-to-lipid ratio, they caused minimal membrane damage, such as flip-flop of membrane lipids or leakage of calcein molecules from liposomes, and never translocated across membranes. Interaction energies between the peptides and anionic liposomes showed good correlation with antimicrobial activity for both Escherichia coli and Bacillus subtilis. We thus propose that the domain formation mechanism in which antimicrobial peptides exhibit activity solely by forming lipid domains without membrane damage is a major determinant of the antimicrobial activity of low amphipathic peptides. These peptides appear to stiffen the membrane such that it is deprived of the fluidity necessary for biological functions. We also showed that to construct the lipid domains, peptides need not form stable and cooperative structures. Rather, it is essential for peptides to only interact tightly with the membrane interface via strong electrostatic interactions, and slight differences in binding strength are invoked by differences in hydrophobicity. The peptides thus designed might pave the way for "clean" antimicrobial reagents that never cause release of membrane elements and efflux of their inner components. Copyright (c) 2010 Elsevier Inc. All rights reserved.

  3. Spectral affinity in protein networks.

    Science.gov (United States)

    Voevodski, Konstantin; Teng, Shang-Hua; Xia, Yu

    2009-11-29

    Protein-protein interaction (PPI) networks enable us to better understand the functional organization of the proteome. We can learn a lot about a particular protein by querying its neighborhood in a PPI network to find proteins with similar function. A spectral approach that considers random walks between nodes of interest is particularly useful in evaluating closeness in PPI networks. Spectral measures of closeness are more robust to noise in the data and are more precise than simpler methods based on edge density and shortest path length. We develop a novel affinity measure for pairs of proteins in PPI networks, which uses personalized PageRank, a random walk based method used in context-sensitive search on the Web. Our measure of closeness, which we call PageRank Affinity, is proportional to the number of times the smaller-degree protein is visited in a random walk that restarts at the larger-degree protein. PageRank considers paths of all lengths in a network, therefore PageRank Affinity is a precise measure that is robust to noise in the data. PageRank Affinity is also provably related to cluster co-membership, making it a meaningful measure. In our experiments on protein networks we find that our measure is better at predicting co-complex membership and finding functionally related proteins than other commonly used measures of closeness. Moreover, our experiments indicate that PageRank Affinity is very resilient to noise in the network. In addition, based on our method we build a tool that quickly finds nodes closest to a queried protein in any protein network, and easily scales to much larger biological networks. We define a meaningful way to assess the closeness of two proteins in a PPI network, and show that our closeness measure is more biologically significant than other commonly used methods. We also develop a tool, accessible at http://xialab.bu.edu/resources/pnns, that allows the user to quickly find nodes closest to a queried vertex in any protein
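    The random-walk idea behind PageRank Affinity can be sketched with a small power-iteration implementation of personalized PageRank. This is a minimal illustration under stated assumptions, not the authors' tool: the restart probability of 0.15, the toy six-protein network, and the function names are all hypothetical, and the paper's exact normalization of the affinity score is not reproduced. Following the abstract, the walk restarts at the larger-degree protein and the score is read off at the smaller-degree protein.

```python
def personalized_pagerank(graph, source, restart=0.15, iterations=200):
    """Stationary visit probabilities of a random walk that, at every step,
    jumps back to `source` with probability `restart`.

    `graph` maps each node to a list of its neighbours (undirected network,
    no isolated nodes). Computed by plain power iteration.
    """
    rank = {node: 0.0 for node in graph}
    rank[source] = 1.0
    for _ in range(iterations):
        nxt = {node: 0.0 for node in graph}
        nxt[source] += restart  # restart mass (total probability is 1)
        for node, neighbours in graph.items():
            share = (1.0 - restart) * rank[node] / len(neighbours)
            for neighbour in neighbours:
                nxt[neighbour] += share
        rank = nxt
    return rank


def pagerank_affinity(graph, a, b, restart=0.15):
    """Visit probability of the smaller-degree protein in a walk that
    restarts at the larger-degree protein (ties keep the given order)."""
    if len(graph[a]) >= len(graph[b]):
        source, target = a, b
    else:
        source, target = b, a
    return personalized_pagerank(graph, source, restart)[target]


if __name__ == "__main__":
    # Hypothetical toy network: two triangles joined by the edge C-D.
    ppi = {
        "A": ["B", "C"], "B": ["A", "C"], "C": ["A", "B", "D"],
        "D": ["C", "E", "F"], "E": ["D", "F"], "F": ["D", "E"],
    }
    print("A-B:", round(pagerank_affinity(ppi, "A", "B"), 4))
    print("A-E:", round(pagerank_affinity(ppi, "A", "E"), 4))
```

    In the toy network, proteins in the same triangle score higher with each other than with proteins across the bridge, which is the kind of cluster co-membership signal the abstract describes.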

  4. Spectral affinity in protein networks

    Directory of Open Access Journals (Sweden)

    Teng Shang-Hua

    2009-11-01

    Full Text Available. Background: Protein-protein interaction (PPI) networks enable us to better understand the functional organization of the proteome. We can learn a lot about a particular protein by querying its neighborhood in a PPI network to find proteins with similar function. A spectral approach that considers random walks between nodes of interest is particularly useful in evaluating closeness in PPI networks. Spectral measures of closeness are more robust to noise in the data and are more precise than simpler methods based on edge density and shortest path length. Results: We develop a novel affinity measure for pairs of proteins in PPI networks, which uses personalized PageRank, a random-walk-based method used in context-sensitive search on the Web. Our measure of closeness, which we call PageRank Affinity, is proportional to the number of times the smaller-degree protein is visited in a random walk that restarts at the larger-degree protein. PageRank considers paths of all lengths in a network; therefore, PageRank Affinity is a precise measure that is robust to noise in the data. PageRank Affinity is also provably related to cluster co-membership, making it a meaningful measure. In our experiments on protein networks we find that our measure is better at predicting co-complex membership and finding functionally related proteins than other commonly used measures of closeness. Moreover, our experiments indicate that PageRank Affinity is very resilient to noise in the network. In addition, based on our method we build a tool that quickly finds nodes closest to a queried protein in any protein network, and easily scales to much larger biological networks. Conclusion: We define a meaningful way to assess the closeness of two proteins in a PPI network, and show that our closeness measure is more biologically significant than other commonly used methods. We also develop a tool, accessible at http://xialab.bu.edu/resources/pnns, that allows the user to
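
    The affinity described in this record can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the restart probability of 0.15 and the fixed iteration count are assumptions made for illustration, and the score is left unnormalized because the abstract only states that the measure is "proportional to" the visit count.

```python
def build_graph(edges):
    """Undirected adjacency sets from a list of (u, v) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def personalized_pagerank(adj, source, restart=0.15, iters=200):
    """Visit probabilities of a random walk that restarts at `source`.

    `restart` (assumed value) is the probability of jumping back to
    `source` at each step; plain power iteration is used.
    """
    rank = {v: 0.0 for v in adj}
    rank[source] = 1.0
    for _ in range(iters):
        nxt = {v: 0.0 for v in adj}
        nxt[source] += restart  # total rank is 1, so this is the restart mass
        for v, nbrs in adj.items():
            if not nbrs:  # isolated node: send its mass back to the source
                nxt[source] += (1 - restart) * rank[v]
                continue
            share = (1 - restart) * rank[v] / len(nbrs)
            for w in nbrs:
                nxt[w] += share
        rank = nxt
    return rank


def pagerank_affinity(adj, u, v, restart=0.15):
    """Restart at the larger-degree node, read off the smaller-degree one."""
    hi, lo = (u, v) if len(adj[u]) >= len(adj[v]) else (v, u)
    return personalized_pagerank(adj, hi, restart)[lo]
```

    On a toy graph such as a triangle a-b-c with a tail c-d-e, `pagerank_affinity(adj, "a", "b")` should exceed `pagerank_affinity(adj, "a", "e")`, since the walk restarting at a reaches its direct neighbour far more often than the end of the tail.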

  5. A natural grouping of motifs with an aspartate or asparagine residue forming two hydrogen bonds to residues ahead in sequence: their occurrence at alpha-helical N termini and in other situations.

    Science.gov (United States)

    Wan, W Y; Milner-White, E J

    1999-03-12

    Examination of the ways side-chain carboxylate and amide groups in high-resolution protein crystal structures form hydrogen bonds with main-chain atoms reveals that the most common category is a two-hydrogen-bond four to five residue motif with an aspartate or asparagine (Asx) at the first residue, for which we propose the name Asx-motif. Similar motifs with glutamate or glutamine residues at that position are rare. Asx-motifs occur typically as (1) a common feature of the N termini of alpha-helices called the Asx N-cap motif; (2) an independent motif, usually a beta-turn with an appropriately hydrogen-bonded Asx as the first residue; and (3) a motif incorporated in a beta-bulge loop. Asx-motifs are common, there being just under two-and-a-half in an average-sized protein subunit; of these, about 55 % are Asx N-cap motifs. Because they occur often in many situations, it seems that these motifs have an inherent propensity to form on their own rather than just being a feature stabilised at the end of a helix. Asx-motifs also occur in functionally interesting situations in aspartyl proteases, citrate synthase, EF hands, haemoglobins, lipocalins, glutathione reductase and the alpha/beta hydrolases. Copyright 1999 Academic Press.

  6. Toxoplasma gondii: Biochemical and biophysical characterization of recombinant soluble dense granule proteins GRA2 and GRA6

    Energy Technology Data Exchange (ETDEWEB)

    Bittame, Amina [CNRS, UMR 5163, 38042 Grenoble (France); Université Grenoble Alpes, 38042 Grenoble (France); Effantin, Grégory [Université Grenoble Alpes, Institut de Biologie Structurale (IBS), 38044 Grenoble (France); CNRS, IBS, 38044 Grenoble (France); CEA, IBS, 38044 Grenoble (France); Unit for Virus Host-Cell Interactions (UVHCI), UMI 3265 (UJF-EMBL-CNRS), 38027 Grenoble (France); Pètre, Graciane; Ruffiot, Pauline; Travier, Laetitia [CNRS, UMR 5163, 38042 Grenoble (France); Université Grenoble Alpes, 38042 Grenoble (France); Schoehn, Guy; Weissenhorn, Winfried [Université Grenoble Alpes, Institut de Biologie Structurale (IBS), 38044 Grenoble (France); CNRS, IBS, 38044 Grenoble (France); CEA, IBS, 38044 Grenoble (France); Unit for Virus Host-Cell Interactions (UVHCI), UMI 3265 (UJF-EMBL-CNRS), 38027 Grenoble (France); Cesbron-Delauw, Marie-France; Gagnon, Jean [CNRS, UMR 5163, 38042 Grenoble (France); Université Grenoble Alpes, 38042 Grenoble (France); Mercier, Corinne, E-mail: corinne.mercier@ujf-grenoble.fr [CNRS, UMR 5163, 38042 Grenoble (France); Université Grenoble Alpes, 38042 Grenoble (France)

    2015-03-27

    The most prominent structural feature of the parasitophorous vacuole (PV) in which the intracellular parasite Toxoplasma gondii proliferates is a membranous nanotubular network (MNN), which interconnects the parasites and the PV membrane. The MNN function remains unclear. The GRA2 and GRA6 proteins secreted from the parasite dense granules into the PV have been implicated in the MNN biogenesis. Amphipathic alpha-helices (AAHs) predicted in GRA2 and an alpha-helical hydrophobic domain predicted in GRA6 have been proposed to be responsible for their membrane association, thereby potentially molding the MNN in its structure. Here we report an analysis of the recombinant proteins (expressed in detergent-free conditions) by circular dichroism, which showed that full length GRA2 displays an alpha-helical secondary structure while recombinant GRA6 and GRA2 truncated of its AAHs are mainly random coiled. Dynamic light scattering and transmission electron microscopy showed that recombinant GRA6 and truncated GRA2 constitute a homogenous population of small particles (6–8 nm in diameter) while recombinant GRA2 corresponds to 2 populations of particles (∼8–15 nm and up to 40 nm in diameter, respectively). The unusual properties of GRA2 due to its AAHs are discussed.
    Highlights:
    • Toxoplasma gondii: soluble GRA2 forms 2 populations of particles.
    • T. gondii: the dense granule protein GRA2 folds intrinsically as an alpha-helix.
    • T. gondii: monomeric soluble GRA6 forms particles of 6–8 nm in diameter.
    • T. gondii: monomeric soluble GRA6 is random coiled.
    • Unusual biophysical properties of the dense granule protein GRA2 from T. gondii.

  7. Oligomeric protein structure networks: insights into protein-protein interactions

    Directory of Open Access Journals (Sweden)

    Brinda KV

    2005-12-01

    Full Text Available. Background: Protein-protein association is essential for a variety of cellular processes and hence a large number of investigations are being carried out to understand the principles of protein-protein interactions. In this study, oligomeric protein structures are viewed from a network perspective to obtain new insights into protein association. Structure graphs of proteins have been constructed from a non-redundant set of protein oligomer crystal structures by considering amino acid residues as nodes, with edges based on the strength of the non-covalent interactions between the residues. The analysis of such networks has been carried out in terms of amino acid clusters and hubs (highly connected residues), with special emphasis on protein interfaces. Results: A variety of interactions such as hydrogen bonds, salt bridges, and aromatic and hydrophobic interactions, which occur at the interfaces, are identified in a consolidated manner as amino acid clusters at the interface. Moreover, the characterization of the highly connected hub-forming residues at the interfaces and their comparison with the hubs from the non-interface regions and the non-hubs in the interface regions show that there is a predominance of charged interactions at the interfaces. Further, strong and weak interfaces are identified on the basis of the interaction strength between amino acid residues and the sizes of the interface clusters, which also show that many protein interfaces are stronger than their monomeric protein cores. The interface strengths evaluated based on the interface clusters and hubs also correlate well with experimentally determined dissociation constants for known complexes. Finally, the interface hubs identified using the present method correlate very well with experimentally determined hotspots in the interfaces of protein complexes obtained from the Alanine Scanning Energetics database (ASEdb). A few predictions of interface hot

  8. Scaffolds, levers, rods and springs: diverse cellular functions of long coiled-coil proteins.

    Science.gov (United States)

    Rose, A; Meier, I

    2004-08-01

    Long alpha-helical coiled-coil proteins are involved in a variety of organizational and regulatory processes in eukaryotic cells. They provide cables and networks in the cyto- and nucleoskeleton, molecular scaffolds that organize membrane systems, motors, levers, rotating arms and possibly springs. A growing number of human diseases are found to be caused by mutations in long coiled-coil proteins. This review summarizes our current understanding of the multifaceted group of long coiled-coil proteins in the cytoskeleton, nucleus, Golgi and cell division apparatus. The biophysical features of coiled-coil domains provide first clues toward their contribution to the diverse protein functions and promise potential future applications in the area of nanotechnology. Combining the power of fully sequenced genomes and structure prediction algorithms, it is now possible to comprehensively summarize and compare the complete inventory of coiled-coil proteins of different organisms.

  9. Structural flexibility of the G alpha s alpha-helical domain in the beta2-adrenoceptor Gs complex

    DEFF Research Database (Denmark)

    Westfield, Gerwin H; Rasmussen, Søren Gøgsig Faarup; Su, Min

    2011-01-01

    The active-state complex between an agonist-bound receptor and a guanine nucleotide-free G protein represents the fundamental signaling assembly for the majority of hormone and neurotransmitter signaling. We applied single-particle electron microscopy (EM) analysis to examine the architecture of agonist-occupied β2-adrenoceptor (β2AR) in complex with the heterotrimeric G protein Gs (Gαsβγ). EM 2D averages and 3D reconstructions of the detergent-solubilized complex reveal an overall architecture that is in very good agreement with the crystal structure of the active-state ternary complex...

  10. Alpha-Helical Fragaceatoxin C Nanopore Engineered for Double-Stranded and Single-Stranded Nucleic Acid Analysis

    NARCIS (Netherlands)

    Wloka, Carsten; Mutter, Natalie Lisa; Soskine, Misha; Maglia, Giovanni

    2016-01-01

    Nanopores are used in single-molecule DNA analysis and sequencing. Herein, we show that Fragaceatoxin C (FraC), an α-helical pore-forming toxin from an actinoporin protein family, can be reconstituted in sphingomyelin-free standard planar lipid bilayers. We engineered FraC for DNA analysis and show

  11. Design of a minimal protein oligomerization domain by a structural approach.

    Science.gov (United States)

    Burkhard, P; Meier, M; Lustig, A

    2000-12-01

    Because of the simplicity and regularity of the alpha-helical coiled coil relative to other structural motifs, it can be conveniently used to clarify the molecular interactions responsible for protein folding and stability. Here we describe the de novo design and characterization of a two heptad-repeat peptide stabilized by a complex network of inter- and intrahelical salt bridges. Circular dichroism spectroscopy and analytical ultracentrifugation show that this peptide is highly alpha-helical and 100% dimeric under physiological buffer conditions. Interestingly, the peptide was shown to switch its oligomerization state from a dimer to a trimer upon increasing ionic strength. The correctness of the rational design principles used here is supported by details of the atomic structure of the peptide deduced from X-ray crystallography. The structure of the peptide shows that it is not a molten globule but assumes a unique, native-like conformation. This de novo peptide thus represents an attractive model system for the design of a molecular recognition system.

  12. Light-activated DNA binding in a designed allosteric protein

    Energy Technology Data Exchange (ETDEWEB)

    Strickland, Devin; Moffat, Keith; Sosnick, Tobin R. (UC)

    2008-09-03

    An understanding of how allostery, the conformational coupling of distant functional sites, arises in highly evolvable systems is of considerable interest in areas ranging from cell biology to protein design and signaling networks. We reasoned that the rigidity and defined geometry of an α-helical domain linker would make it effective as a conduit for allosteric signals. To test this idea, we rationally designed 12 fusions between the naturally photoactive LOV2 domain from Avena sativa phototropin 1 and the Escherichia coli trp repressor. When illuminated, one of the fusions selectively binds operator DNA and protects it from nuclease digestion. The ready success of our rational design strategy suggests that the helical 'allosteric lever arm' is a general scheme for coupling the function of two proteins.

  13. Protein Networks in Alzheimer’s Disease

    DEFF Research Database (Denmark)

    Carlsen, Eva Maria Meier; Rasmussen, Rune

    2017-01-01

    Overlap of RNA and protein networks reveals glia cells as key players for the development of symptomatic Alzheimer’s disease in humans.

  14. Ontology integration to identify protein complex in protein interaction networks

    Directory of Open Access Journals (Sweden)

    Yang Zhihao

    2011-10-01

    Full Text Available. Background: Protein complexes can be identified from the protein interaction networks derived from experimental data sets. However, these analyses are challenging because of the presence of unreliable interactions and the complex connectivity of the network. The integration of protein-protein interactions with data from other sources can be leveraged to improve the effectiveness of protein complex detection algorithms. Methods: We have developed a novel semantic similarity method, which uses Gene Ontology (GO) annotations to measure the reliability of protein-protein interactions. The protein interaction networks can be converted into a weighted graph representation by assigning the reliability value of each interaction as its weight. Following the approach of the previously proposed clustering algorithm IPCA, which expands clusters starting from seeded vertices, we present a clustering algorithm, OIIP, based on the new weighted protein-protein interaction networks for identifying protein complexes. Results: The algorithm OIIP is applied to the protein interaction network of Saccharomyces cerevisiae and identifies many well-known complexes. Experimental results show that the algorithm OIIP has higher F-measure and accuracy compared to other competing approaches.
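
    The weighting step in this record, in which each interaction receives a reliability value derived from GO annotations, can be sketched as follows. The abstract does not specify the semantic similarity measure, so the Jaccard overlap of annotation sets is used here purely as a stand-in; the function names and the GO identifiers in the usage note are hypothetical.

```python
def jaccard(terms_a, terms_b):
    """Overlap of two annotation sets, in [0, 1] (stand-in similarity)."""
    a, b = set(terms_a), set(terms_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def weight_interactions(edges, go_terms):
    """Map each interaction (u, v) to a reliability weight.

    `go_terms` maps a protein name to its set of GO annotations;
    proteins without annotations get weight 0 on all their edges.
    """
    return {
        (u, v): jaccard(go_terms.get(u, ()), go_terms.get(v, ()))
        for u, v in edges
    }
```

    For example, with `go_terms = {"P1": {"GO:A", "GO:B"}, "P2": {"GO:B", "GO:C"}}`, the edge ("P1", "P2") receives weight 1/3, because one of the three distinct terms is shared. The resulting weighted graph is the kind of input a seed-expansion clustering step would then consume.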

  15. Hepatitis C virus infection protein network.

    Science.gov (United States)

    de Chassey, B; Navratil, V; Tafforeau, L; Hiet, M S; Aublin-Gex, A; Agaugué, S; Meiffren, G; Pradezynski, F; Faria, B F; Chantier, T; Le Breton, M; Pellet, J; Davoust, N; Mangeot, P E; Chaboud, A; Penin, F; Jacob, Y; Vidalain, P O; Vidal, M; André, P; Rabourdin-Combe, C; Lotteau, V

    2008-01-01

    A proteome-wide mapping of interactions between hepatitis C virus (HCV) and human proteins was performed to provide a comprehensive view of the cellular infection. A total of 314 protein-protein interactions between HCV and human proteins was identified by yeast two-hybrid and 170 by literature mining. Integration of this data set into a reconstructed human interactome showed that cellular proteins interacting with HCV are enriched in highly central and interconnected proteins. A global analysis on the basis of functional annotation highlighted the enrichment of cellular pathways targeted by HCV. A network of proteins associated with frequent clinical disorders of chronically infected patients was constructed by connecting the insulin, Jak/STAT and TGFbeta pathways with cellular proteins targeted by HCV. CORE protein appeared as a major perturbator of this network. Focal adhesion was identified as a new function affected by HCV, mainly by NS3 and NS5A proteins.

  16. Unraveling protein networks with power graph analysis.

    Directory of Open Access Journals (Sweden)

    Loïc Royer

    Full Text Available. Networks play a crucial role in computational biology, yet their analysis and representation is still an open problem. Power Graph Analysis is a lossless transformation of biological networks into a compact, less redundant representation, exploiting the abundance of cliques and bicliques as elementary topological motifs. We demonstrate with five examples the advantages of Power Graph Analysis. Investigating protein-protein interaction networks, we show how the catalytic subunits of the casein kinase II complex are distinguishable from the regulatory subunits, how interaction profiles and sequence phylogeny of SH3 domains correlate, and how false positive interactions among high-throughput interactions are spotted. Additionally, we demonstrate the generality of Power Graph Analysis by applying it to two other types of networks. We show how power graphs induce a clustering of both transcription factors and target genes in bipartite transcription networks, and how the erosion of a phosphatase domain in type 22 non-receptor tyrosine phosphatases is detected. We apply Power Graph Analysis to high-throughput protein interaction networks and show that up to 85% (56% on average) of the information is redundant. Experimental networks are more compressible than rewired ones of same degree distribution, indicating that experimental networks are rich in cliques and bicliques. Power Graphs are a novel representation of networks, which reduces network complexity by explicitly representing re-occurring network motifs. Power Graphs compress up to 85% of the edges in protein interaction networks and are applicable to all types of networks such as protein interactions, regulatory networks, or homology networks.
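
    The idea of replacing a biclique by a single edge between two node sets can be illustrated with a deliberately simplified sketch. It is not the Power Graph algorithm of this record: it only groups nodes that have exactly the same neighbours, so it captures bicliques but not cliques or nested motifs. The function name and return values are assumptions.

```python
from collections import defaultdict


def collapse_identical_neighbourhoods(adj):
    """Group nodes with identical neighbour sets and count group-level edges.

    `adj` maps each node to the set of its neighbours (undirected, no
    self-loops). Returns (power_edges, n_edges): the set of edges between
    groups and the number of edges in the original graph.
    """
    by_nbrs = defaultdict(set)
    for v, nbrs in adj.items():
        by_nbrs[frozenset(nbrs)].add(v)
    group_of = {}
    for members in by_nbrs.values():
        group = frozenset(members)
        for v in members:
            group_of[v] = group
    # Every original edge maps onto exactly one edge between two groups.
    power_edges = {
        frozenset((group_of[u], group_of[v])) for u in adj for v in adj[u]
    }
    n_edges = sum(len(nbrs) for nbrs in adj.values()) // 2
    return power_edges, n_edges
```

    For a complete bipartite graph with two nodes on one side and three on the other, the six original edges collapse into one edge between two groups, an edge reduction of 5/6.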

  17. Protein complexes predictions within protein interaction networks using genetic algorithms.

    Science.gov (United States)

    Ramadan, Emad; Naef, Ahmed; Ahmed, Moataz

    2016-07-25

    Protein-protein interaction networks are receiving increased attention due to their importance in understanding life at the cellular level. A major challenge in systems biology is to understand the modular structure of such biological networks. Although clustering techniques have been proposed for clustering protein-protein interaction networks, those techniques suffer from some drawbacks. The application of earlier clustering techniques to protein-protein interaction networks in order to predict protein complexes within the networks does not yield good results due to the small-world and power-law properties of these networks. In this paper, we construct a new clustering algorithm for predicting protein complexes through the use of genetic algorithms. We design an objective function for exclusive clustering and overlapping clustering. We assess the quality of our proposed clustering algorithm using two gold-standard data sets. Our algorithm can identify protein complexes that are significantly enriched in the gold-standard data sets. Furthermore, our method surpasses three competing methods: MCL, ClusterOne, and MCODE in terms of the quality of the predicted complexes. The source code and accompanying examples are freely available at http://faculty.kfupm.edu.sa/ics/eramadan/GACluster.zip .

  18. Enthalpic and entropic stages in alpha-helical peptide unfolding, from laser T-jump/UV Raman spectroscopy.

    Science.gov (United States)

    Balakrishnan, Gurusamy; Hu, Ying; Bender, Gretchen M; Getahun, Zelleka; DeGrado, William F; Spiro, Thomas G

    2007-10-24

    The alpha-helix is a ubiquitous structural element in proteins, and a number of studies have addressed the mechanism of helix formation and melting in simple peptides. However, fundamental issues remain to be resolved, particularly the temperature (T) dependence of the rate. In this work, we report application of a novel kHz repetition rate solid-state tunable NIR (pump) and deep UV Raman (probe) laser system to study the dynamics of helix unfolding in Ac-GSPEA3KA4KA4-CO-D-Arg-CONH2, a peptide designed for helix stabilization in aqueous solution. Its T-dependent UV resonance Raman (UVRR) spectra, excited at 197 nm for optimal enhancement of amide vibrations, were decomposed into variable contributions from helix and coil spectra. The helix fractions derived from the UVRR spectra and from far UV CD spectra were coincident at low T but deviated increasingly at high T, the UVRR curve giving higher helix content. This difference is consistent with the greater sensitivity of UVRR spectra to local conformation than CD. After a laser-induced T-jump, the UVRR-determined helix fractions defined monoexponential decays, with time-constants of approximately 120 ns, independent of the final T (Tf = 18-61 degrees C), provided the initial T (Ti) was held constant (6 degrees C). However, there was also a prompt loss of helicity, whose amplitude increased with increasing Tf, thereby defining an initial enthalpic phase, distinct from the subsequent entropic phase. These phases are attributed to disruption of H-bonds followed by reorientation of peptide links, as the chain is extended. When Ti was raised in parallel with Tf (10 degrees C T-jumps), the prompt phase merged into an accelerating slow phase, an effect attributable to the shifting distribution of initial helix lengths. Even greater acceleration with rising Ti has been reported in T-jump experiments monitored by IR and fluorescence spectroscopies. This difference is attributable to the longer range character of these probes

  19. Identifying hubs in protein interaction networks.

    Directory of Open Access Journals (Sweden)

    Ravishankar R Vallabhajosyula

    Full Text Available. In spite of the scale-free degree distribution that characterizes most protein interaction networks (PINs), it is common to define an ad hoc degree scale that defines "hub" proteins having special topological and functional significance. This raises the concern that some conclusions on the functional significance of proteins based on network properties may not be robust. In this paper we present three objective methods to define hub proteins in PINs: one is a purely topological method and two others are based on gene expression and function. By applying these methods to four distinct PINs, we examine the extent of agreement among these methods and implications of these results on network construction. We find that the methods agree well for networks that contain a balance between error-free and unbiased interactions, indicating that the hub concept is meaningful for such networks.

  20. Prediction of Protein-Protein Interactions Related to Protein Complexes Based on Protein Interaction Networks

    Directory of Open Access Journals (Sweden)

    Peng Liu

    2015-01-01

    Full Text Available A method for predicting protein-protein interactions based on detected protein complexes is proposed to repair deficient interactions derived from high-throughput biological experiments. Protein complexes are pruned and decomposed into small parts based on the adaptive k-cores method to predict protein-protein interactions associated with the complexes. The proposed method is adaptive to protein complexes with different structure, number, and size of nodes in a protein-protein interaction network. Based on different complex sets detected by various algorithms, we can obtain different prediction sets of protein-protein interactions. The reliability of the predicted interaction sets is proved by using estimations with statistical tests and direct confirmation of the biological data. In comparison with the approaches which predict the interactions based on the cliques, the overlap of the predictions is small. Similarly, the overlaps among the predicted sets of interactions derived from various complex sets are also small. Thus, every predicted set of interactions may complement and improve the quality of the original network data. Meanwhile, the predictions from the proposed method replenish protein-protein interactions associated with protein complexes using only the network topology.
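
    The pruning step this record builds on can be sketched with a plain k-core computation. The adaptive choice of k and the decomposition of complexes described in the abstract are not specified there and are not reproduced; the function name and the fixed `k` argument are assumptions.

```python
def k_core(adj, k):
    """Nodes of the k-core of an undirected graph.

    `adj` maps each node to the set of its neighbours. Nodes with fewer
    than k remaining neighbours are removed repeatedly until none is left.
    """
    alive = {v: set(nbrs) for v, nbrs in adj.items()}  # work on a copy
    queue = [v for v, nbrs in alive.items() if len(nbrs) < k]
    while queue:
        v = queue.pop()
        if v not in alive:  # already removed via an earlier queue entry
            continue
        for w in alive.pop(v):
            alive[w].discard(v)
            if len(alive[w]) < k:
                queue.append(w)
    return set(alive)
```

    Applied to the subgraph induced by one detected complex, a 3-core of a four-node clique with one pendant protein keeps the clique and drops the pendant node.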

  1. Human cancer protein-protein interaction network: a structural perspective.

    Directory of Open Access Journals (Sweden)

    Gozde Kar

    2009-12-01

    Full Text Available. Protein-protein interaction networks provide a global picture of cellular function and biological processes. Some proteins act as hub proteins, highly connected to others, whereas some others have few interactions. The dysfunction of some interactions causes many diseases, including cancer. Proteins interact through their interfaces. Therefore, studying the interface properties of cancer-related proteins will help explain their role in the interaction networks. Similar or overlapping binding sites should be used repeatedly in single interface hub proteins, making them promiscuous. Alternatively, multi-interface hub proteins make use of several distinct binding sites to bind to different partners. We propose a methodology to integrate protein interfaces into cancer interaction networks (ciSPIN, cancer structural protein interface network). The interactions in the human protein interaction network are replaced by interfaces, coming from either known or predicted complexes. We provide a detailed analysis of cancer related human protein-protein interfaces and the topological properties of the cancer network. The results reveal that cancer-related proteins have smaller, more planar, more charged and less hydrophobic binding sites than non-cancer proteins, which may indicate low affinity and high specificity of the cancer-related interactions. We also classified the genes in ciSPIN according to phenotypes. Within phenotypes, for breast cancer, colorectal cancer and leukemia, interface properties were found to be discriminating from non-cancer interfaces with an accuracy of 71%, 67%, 61%, respectively. In addition, cancer-related proteins tend to interact with their partners through distinct interfaces, corresponding mostly to multi-interface hubs, which comprise 56% of cancer-related proteins, and constituting the nodes with higher essentiality in the network (76%). We illustrate the interface related affinity properties of two cancer-related hub

  2. NAPS: Network Analysis of Protein Structures

    Science.gov (United States)

    Chakrabarty, Broto; Parekh, Nita

    2016-01-01

    Traditionally, protein structures have been analysed by the secondary structure architecture and fold arrangement. An alternative approach that has shown promise is modelling proteins as a network of non-covalent interactions between amino acid residues. The network representation of proteins provides a systems approach to topological analysis of complex three-dimensional structures irrespective of secondary structure and fold type, and provides insights into structure-function relationships. We have developed a web server for network based analysis of protein structures, NAPS, that facilitates quantitative and qualitative (visual) analysis of residue–residue interactions in single chains, protein complexes, modelled protein structures and trajectories (e.g. from molecular dynamics simulations). The user can specify atom type for network construction, distance range (in Å) and minimal amino acid separation along the sequence. NAPS lets users select node(s) and their neighbourhood based on centrality measures, physicochemical properties of amino acids or clusters of well-connected residues (k-cliques) for further analysis. Visual analysis of interacting domains and protein chains, and shortest path lengths between pairs of residues, are additional features that aid in functional analysis. NAPS supports various analyses and visualization views for identifying functional residues, and provides insight into mechanisms of protein folding, domain-domain and protein–protein interactions for understanding communication within and between proteins. URL: http://bioinf.iiit.ac.in/NAPS/. PMID:27151201
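
    The three construction parameters named in this record (atom type, distance range and minimal sequence separation) suggest a simple residue contact network, sketched below. This is not NAPS code: the function name, the default 0–7 Å range and the use of one coordinate per residue (for example the Cα atom) are assumptions for illustration.

```python
import math


def contact_network(coords, d_min=0.0, d_max=7.0, min_sep=1):
    """Edges (i, j, distance) between residues i < j.

    `coords` holds one (x, y, z) position per residue in sequence order,
    in Å. An edge is added when the distance lies in [d_min, d_max] and
    the residues are at least `min_sep` (>= 1) positions apart.
    """
    edges = []
    n = len(coords)
    for i in range(n):
        for j in range(i + min_sep, n):
            d = math.dist(coords[i], coords[j])
            if d_min <= d <= d_max:
                edges.append((i, j, d))
    return edges
```

    Raising `min_sep` removes trivial contacts between sequence neighbours, which is useful when the interest is in long-range contacts rather than in the backbone itself.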

  3. Evolution of protein-protein interaction networks in yeast.

    Directory of Open Access Journals (Sweden)

    Andrew Schoenrock

    Full Text Available. Interest in the evolution of protein-protein and genetic interaction networks has been rising in recent years, but the lack of large-scale high quality comparative datasets has acted as a barrier. Here, we carried out a comparative analysis of computationally predicted protein-protein interaction (PPI) networks from five closely related yeast species. We used the Protein-protein Interaction Prediction Engine (PIPE), which uses a database of known interactions to make sequence-based PPI predictions, to generate high quality predicted interactomes. Simulated proteomes and corresponding PPI networks were used to provide null expectations for the extent and nature of PPI network evolution. We found strong evidence for conservation of PPIs, with lower than expected levels of change in PPIs for about a quarter of the proteome. Furthermore, we found that changes in predicted PPI networks are poorly predicted by sequence divergence. Our analyses identified a number of functional classes experiencing fewer PPI changes than expected, suggestive of purifying selection on PPIs. Our results demonstrate the added benefit of considering predicted PPI networks when studying the evolution of closely related organisms.

  4. Neural Networks for protein Structure Prediction

    DEFF Research Database (Denmark)

    Bohr, Henrik

    1998-01-01

    This is a review about neural network applications in bioinformatics. Especially the applications to protein structure prediction, e.g. prediction of secondary structures, prediction of surface structure, fold class recognition and prediction of the 3-dimensional structure of protein backbones...

  5. Geometric de-noising of protein-protein interaction networks.

    Directory of Open Access Journals (Sweden)

    Oleksii Kuchaiev

    2009-08-01

    Full Text Available. Understanding complex networks of protein-protein interactions (PPIs) is one of the foremost challenges of the post-genomic era. Due to the recent advances in experimental bio-technology, including yeast-2-hybrid (Y2H), tandem affinity purification (TAP) and other high-throughput methods for protein-protein interaction (PPI) detection, huge amounts of PPI network data are becoming available. Of major concern, however, are the levels of noise and incompleteness. For example, for Y2H screens, it is thought that the false positive rate could be as high as 64%, and the false negative rate may range from 43% to 71%. TAP experiments are believed to have comparable levels of noise. We present a novel technique to assess the confidence levels of interactions in PPI networks obtained from experimental studies. We use it for predicting new interactions and thus for guiding future biological experiments. This technique is the first to utilize what is currently the best-fitting network model for PPI networks, geometric graphs. Our approach achieves specificity of 85% and sensitivity of 90%. We use it to assign confidence scores to physical protein-protein interactions in the human PPI network downloaded from BioGRID. Using our approach, we predict 251 interactions in the human PPI network, a statistically significant fraction of which correspond to protein pairs sharing common GO terms. Moreover, we validate a statistically significant portion of our predicted interactions in the HPRD database and the newer release of BioGRID. The data and Matlab code implementing the methods are freely available from the web site: http://www.kuchaev.com/Denoising.

  6. Influence of degree correlations on network structure and stability in protein-protein interaction networks

    Directory of Open Access Journals (Sweden)

    Zimmer Ralf

    2007-08-01

    Background: The existence of negative correlations between the degrees of interacting proteins has been under discussion since such negative degree correlations were found for the large-scale yeast protein-protein interaction (PPI) network of Ito et al. More recent studies observed no such negative correlations for high-confidence interaction sets. In this article, we analyzed a range of experimentally derived interaction networks to understand the role and prevalence of degree correlations in PPI networks. We investigated how degree correlations influence the structure of networks and their tolerance against perturbations such as the targeted deletion of hubs. Results: For each PPI network, we simulated uncorrelated, positively and negatively correlated reference networks. Here, a simple model was developed which can create different types of degree correlations in a network without changing the degree distribution. Differences in static properties associated with degree correlations were compared by analyzing the network characteristics of the original PPI and reference networks. Dynamics were compared by simulating the effect of a selective deletion of hubs in all networks. Conclusion: Considerable differences between the network types were found for the number of components in the original networks. Negatively correlated networks are fragmented into significantly fewer components than observed for positively correlated networks. On the other hand, the selective deletion of hubs showed an increased structural tolerance to these deletions for the positively correlated networks. This results in a lower rate of interaction loss in these networks compared to the negatively correlated networks and a decreased disintegration rate. Interestingly, real PPI networks are most similar to the randomly correlated references with respect to all properties analyzed. Thus, although structural properties of networks can be modified considerably by degree
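
    The abstract above mentions reference networks that keep the degree distribution fixed. The record does not describe the authors' model, so the sketch below substitutes a standard technique, the double edge swap, which rewires a simple undirected graph while leaving every node's degree unchanged. The function name, parameters and toy edge list are illustrative assumptions.

```python
import random


def double_edge_swap(edges, n_swaps, seed=0):
    """Rewire a simple undirected graph while preserving every node's degree.

    Assumes `edges` has no self-loops and no duplicate edges.
    This is the generic double edge swap, not the model from the paper.
    """
    rng = random.Random(seed)
    edge_list = [tuple(e) for e in edges]
    edge_set = {frozenset(e) for e in edge_list}
    done = 0
    attempts = 0
    while done < n_swaps and attempts < 100 * n_swaps:
        attempts += 1
        i, j = rng.sample(range(len(edge_list)), 2)
        a, b = edge_list[i]
        c, d = edge_list[j]
        if len({a, b, c, d}) < 4:
            continue  # the two edges share a node; swapping could create a self-loop
        new1, new2 = frozenset((a, d)), frozenset((c, b))
        if new1 in edge_set or new2 in edge_set:
            continue  # the swap would create a duplicate edge
        edge_set -= {frozenset((a, b)), frozenset((c, d))}
        edge_set |= {new1, new2}
        edge_list[i], edge_list[j] = (a, d), (c, b)
        done += 1
    return edge_list


def degrees(edges):
    """Return a dict mapping each node to its degree."""
    deg = {}
    for u, v in edges:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    return deg


# Hypothetical toy network: a path 0-1-2-3-4-5 plus two chords.
toy_edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (0, 3), (1, 4)]
rewired = double_edge_swap(toy_edges, n_swaps=10, seed=1)
print(degrees(toy_edges) == degrees(rewired))
```

    Biasing which swaps are accepted (for example, only those that connect nodes of similar degree) is one way such a procedure could be pushed toward positive or negative degree correlations; that extension is not shown here.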

  7. Network compression as a quality measure for protein interaction networks.

    Directory of Open Access Journals (Sweden)

    Loic Royer

    With the advent of large-scale protein interaction studies, there is much debate about data quality. Can different noise levels in the measurements be assessed by analyzing network structure? Because proteomic regulation is inherently co-operative, modular and redundant, it is inherently compressible when represented as a network. Here we propose that network compression can be used to compare false positive and false negative noise levels in protein interaction networks. We validate this hypothesis by first confirming the detrimental effect of false positives and false negatives. Second, we show that gold standard networks are more compressible. Third, we show that compressibility correlates with co-expression, co-localization, and shared function. Fourth, we also observe correlation with better protein tagging methods, physiological expression in contrast to over-expression of tagged proteins, and smart pooling approaches for yeast two-hybrid screens. Overall, this new measure is a proxy for both sensitivity and specificity and gives complementary information to standard measures such as average degree and clustering coefficients.

  8. Detection of protein complex from protein-protein interaction network using Markov clustering

    International Nuclear Information System (INIS)

    Ochieng, P J; Kusuma, W A; Haryanto, T

    2017-01-01

    Detection of complexes, or groups of functionally related proteins, is an important challenge when analysing biological networks. However, existing algorithms to identify protein complexes are insufficient when applied to dense networks of experimentally derived interaction data. Therefore, we introduced a graph clustering method based on the Markov clustering (MCL) algorithm to identify protein complexes within highly interconnected protein-protein interaction networks. A protein-protein interaction network was first constructed to develop a geometrical network; the network was then partitioned using Markov clustering to detect protein complexes. The utility of the proposed method was illustrated by its application to human proteins associated with type II diabetes mellitus. Flow simulation of the MCL algorithm was initially performed, and topological properties of the resultant network were analysed for detection of the protein complexes. The results indicated that the proposed method successfully detected a total of 34 complexes, with 11 complexes consisting of overlapping modules and 20 non-overlapping modules. The major complex consisted of 102 proteins and 521 interactions, with cluster modularity and density of 0.745 and 0.101 respectively. The comparison analysis revealed that MCL outperforms the AP, MCODE and SCPS algorithms, with a high clustering coefficient (0.751), network density and modularity index (0.630). This demonstrated that MCL was the most reliable and efficient graph clustering algorithm for detection of protein complexes from PPI networks.
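
    For readers unfamiliar with Markov clustering, the following is a minimal pure-Python sketch of the generic MCL loop (expansion by matrix squaring, inflation by element-wise powering, column re-normalisation). It is not the pipeline used in the study above; the function names, the default inflation value of 2 and the toy adjacency matrices are assumptions made for illustration.

```python
def _normalise_columns(m):
    """Scale each column of a square matrix so that it sums to 1."""
    n = len(m)
    col_sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / col_sums[j] for j in range(n)] for i in range(n)]


def _square(m):
    """Return the matrix product m * m."""
    n = len(m)
    return [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mcl(adj, inflation=2.0, max_iter=100, tol=1e-9, eps=1e-6):
    """Cluster an undirected graph given as a 0/1 adjacency matrix."""
    n = len(adj)
    # Self-loops are added so that every column has non-zero mass.
    m = [[float(adj[i][j]) + (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    m = _normalise_columns(m)
    for _ in range(max_iter):
        expanded = _square(m)                                   # expansion
        inflated = _normalise_columns(
            [[x ** inflation for x in row] for row in expanded])  # inflation
        delta = max(abs(inflated[i][j] - m[i][j])
                    for i in range(n) for j in range(n))
        m = inflated
        if delta < tol:
            break
    # Each row that still carries flow is an attractor; the columns it
    # reaches form one cluster. Identical clusters are merged by the set.
    clusters = {frozenset(j for j in range(n) if m[i][j] > eps)
                for i in range(n) if max(m[i]) > eps}
    return sorted(sorted(c) for c in clusters)


# Hypothetical toy input: two separate triangles, {0, 1, 2} and {3, 4, 5}.
two_triangles = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
print(mcl(two_triangles))  # [[0, 1, 2], [3, 4, 5]]
```

    Higher inflation values generally give finer clusters. Production work would normally rely on an established MCL implementation and sparse matrices rather than nested lists.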

  9. Data management of protein interaction networks

    CERN Document Server

    Cannataro, Mario

    2012-01-01

    Interactomics: a complete survey from data generation to knowledge extraction With the increasing use of high-throughput experimental assays, more and more protein interaction databases are becoming available. As a result, computational analysis of protein-to-protein interaction (PPI) data and networks, now known as interactomics, has become an essential tool to determine functionally associated proteins. From wet lab technologies to data management to knowledge extraction, this timely book guides readers through the new science of interactomics, giving them the tools needed to: Generate

  10. Finding local communities in protein networks.

    Science.gov (United States)

    Voevodski, Konstantin; Teng, Shang-Hua; Xia, Yu

    2009-09-18

    Protein-protein interactions (PPIs) play fundamental roles in nearly all biological processes, and provide major insights into the inner workings of cells. A vast amount of PPI data for various organisms is available from BioGRID and other sources. The identification of communities in PPI networks is of great interest because they often reveal previously unknown functional ties between proteins. A large number of global clustering algorithms have been applied to protein networks, where the entire network is partitioned into clusters. Here we take a different approach by looking for local communities in PPI networks. We develop a tool, named Local Protein Community Finder, which quickly finds a community close to a queried protein in any network available from BioGRID or specified by the user. Our tool uses two new local clustering algorithms Nibble and PageRank-Nibble, which look for a good cluster among the most popular destinations of a short random walk from the queried vertex. The quality of a cluster is determined by proportion of outgoing edges, known as conductance, which is a relative measure particularly useful in undersampled networks. We show that the two local clustering algorithms find communities that not only form excellent clusters, but are also likely to be biologically relevant functional components. We compare the performance of Nibble and PageRank-Nibble to other popular and effective graph partitioning algorithms, and show that they find better clusters in the graph. Moreover, Nibble and PageRank-Nibble find communities that are more functionally coherent. The Local Protein Community Finder, accessible at http://xialab.bu.edu/resources/lpcf, allows the user to quickly find a high-quality community close to a queried protein in any network available from BioGRID or specified by the user. We show that the communities found by our tool form good clusters and are functionally coherent, making our application useful for biologists who wish to
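
    The abstract above scores a cluster by its conductance. As a small illustration of that quantity only (not of the Nibble or PageRank-Nibble algorithms themselves), the sketch below computes conductance for a node set in an undirected graph stored as an adjacency dictionary. It uses the common definition of cut edges divided by the smaller of the two volumes; the function name and toy graph are assumptions.

```python
def conductance(graph, community):
    """Conductance of `community` in an undirected graph.

    `graph` maps each node to the set of its neighbours.
    Conductance = (edges leaving the set) / min(volume inside, volume outside),
    where volume is the sum of node degrees.
    """
    inside = set(community)
    cut = sum(1 for u in inside for v in graph[u] if v not in inside)
    vol_in = sum(len(graph[u]) for u in inside)
    vol_out = sum(len(graph[u]) for u in graph if u not in inside)
    denom = min(vol_in, vol_out)
    if denom == 0:
        raise ValueError("conductance is undefined for an empty or full set")
    return cut / denom


# Hypothetical toy graph: two triangles joined by the single edge 2-3.
barbell = {
    0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
    3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4},
}
print(conductance(barbell, {0, 1, 2}))  # one cut edge over volume 7
print(conductance(barbell, {2, 3}))     # four cut edges over volume 6
```

    Lower values indicate a better-separated community, which is why the triangle scores better than the pair of bridge nodes.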

  11. Finding local communities in protein networks

    Directory of Open Access Journals (Sweden)

    Teng Shang-Hua

    2009-09-01

    Background: Protein-protein interactions (PPIs) play fundamental roles in nearly all biological processes, and provide major insights into the inner workings of cells. A vast amount of PPI data for various organisms is available from BioGRID and other sources. The identification of communities in PPI networks is of great interest because they often reveal previously unknown functional ties between proteins. A large number of global clustering algorithms have been applied to protein networks, where the entire network is partitioned into clusters. Here we take a different approach by looking for local communities in PPI networks. Results: We develop a tool, named Local Protein Community Finder, which quickly finds a community close to a queried protein in any network available from BioGRID or specified by the user. Our tool uses two new local clustering algorithms, Nibble and PageRank-Nibble, which look for a good cluster among the most popular destinations of a short random walk from the queried vertex. The quality of a cluster is determined by the proportion of outgoing edges, known as conductance, which is a relative measure particularly useful in undersampled networks. We show that the two local clustering algorithms find communities that not only form excellent clusters, but are also likely to be biologically relevant functional components. We compare the performance of Nibble and PageRank-Nibble to other popular and effective graph partitioning algorithms, and show that they find better clusters in the graph. Moreover, Nibble and PageRank-Nibble find communities that are more functionally coherent. Conclusion: The Local Protein Community Finder, accessible at http://xialab.bu.edu/resources/lpcf, allows the user to quickly find a high-quality community close to a queried protein in any network available from BioGRID or specified by the user. We show that the communities found by our tool form good clusters and are functionally coherent

  12. DETECTION OF TOPOLOGICAL PATTERNS IN PROTEIN NETWORKS.

    Energy Technology Data Exchange (ETDEWEB)

    Maslov, S.; Sneppen, K.

    2003-11-17

    Complex networks appear in biology on many different levels: (1) All biochemical reactions taking place in a single cell constitute its metabolic network, where nodes are individual metabolites, and edges are metabolic reactions converting them to each other. (2) Virtually every one of these reactions is catalyzed by an enzyme, and the specificity of this catalytic function is ensured by the key and lock principle of its physical interaction with the substrate. Often the functional enzyme is formed by several mutually interacting proteins. Thus the structure of the metabolic network is shaped by the network of physical interactions of the cell's proteins with their substrates and each other. (3) The abundance and the level of activity of each of the proteins in the physical interaction network is in turn controlled by the regulatory network of the cell. This regulatory network includes all of the multiple mechanisms by which proteins in the cell control each other, including transcriptional and translational regulation, regulation of mRNA editing and its transport out of the nucleus, specific targeting of individual proteins for degradation, modification of their activity e.g. by phosphorylation/dephosphorylation or allosteric regulation, etc. To give some idea of the complexity and interconnectedness of protein-protein regulations in baker's yeast Saccharomyces cerevisiae, in Fig. 1 we show a part of the regulatory network corresponding to positive or negative regulations that regulatory proteins exert on each other. (4) On a yet higher level, individual cells of a multicellular organism exchange signals with each other. This gives rise to several new networks such as the nervous, hormonal, and immune systems of animals. The intercellular signaling network stages the development of a multicellular organism from the fertilized egg. (5) Finally, on the grandest scale, the interactions between individual species in ecosystems determine their food webs. An

  13. A conserved mammalian protein interaction network.

    Directory of Open Access Journals (Sweden)

    Åsa Pérez-Bercoff

    Physical interactions between proteins mediate a variety of biological functions, including signal transduction, physical structuring of the cell and regulation. While extensive catalogs of such interactions are known from model organisms, their evolutionary histories are difficult to study given the lack of interaction data from phylogenetic outgroups. Using phylogenomic approaches, we infer an upper bound on the time of origin for a large set of human protein-protein interactions, showing that most such interactions appear relatively ancient, dating no later than the radiation of placental mammals. By analyzing paired alignments of orthologous and putatively interacting protein-coding genes from eight mammals, we find evidence for weak but significant co-evolution, as measured by relative selective constraint, between pairs of genes with interacting proteins. However, we find no strong evidence for shared instances of directional selection within an interacting pair. Finally, we use a network approach to show that the distribution of selective constraint across the protein interaction network is non-random, with a clear tendency for interacting proteins to share similar selective constraints. Collectively, the results suggest that, on the whole, protein interactions in mammals are under selective constraint, presumably due to their functional roles.

  14. Inferring protein function by domain context similarities in protein-protein interaction networks

    Directory of Open Access Journals (Sweden)

    Sun Zhirong

    2009-12-01

    Background: Genome sequencing projects generate massive amounts of sequence data, but there are still many proteins whose functions remain unknown. The availability of large-scale protein-protein interaction data sets makes it possible to develop new function prediction methods based on protein-protein interaction (PPI) networks. Although several existing methods combine multiple information resources, there is no study that integrates protein domain information and PPI networks to predict protein functions. Results: The domain context similarity can be a useful index to predict protein function similarity. The prediction accuracy of our method in yeast is between 63% and 67%, which outperforms the other methods in terms of ROC curves. Conclusion: This paper presents a novel protein function prediction method that combines protein domain composition information and PPI networks. Performance evaluations show that this method outperforms existing methods.

  15. Hubs with network motifs organize modularity dynamically in the protein-protein interaction network of yeast.

    Science.gov (United States)

    Jin, Guangxu; Zhang, Shihua; Zhang, Xiang-Sun; Chen, Luonan

    2007-11-21

    It has been recognized that modular organization pervades biological complexity. Based on network analysis, 'party hubs' and 'date hubs' were proposed to understand the basic principle of module organization of biomolecular networks. However, recent study on hubs has suggested that there is no clear evidence for coexistence of 'party hubs' and 'date hubs'. Thus, an open question has been raised as to whether or not 'party hubs' and 'date hubs' truly exist in yeast interactome. In contrast to previous studies focusing on the partners of a hub or the individual proteins around the hub, our work aims to study the network motifs of a hub or interactions among individual proteins including the hub and its neighbors. Depending on the relationship between a hub's network motifs and protein complexes, we define two new types of hubs, 'motif party hubs' and 'motif date hubs', which have the same characteristics as the original 'party hubs' and 'date hubs' respectively. The network motifs of these two types of hubs display significantly different features in spatial distribution (or cellular localizations), co-expression in microarray data, controlling topological structure of network, and organizing modularity. By virtue of network motifs, we basically solved the open question about 'party hubs' and 'date hubs' which was raised by previous studies. Specifically, at the level of network motifs instead of individual proteins, we found two types of hubs, motif party hubs (mPHs) and motif date hubs (mDHs), whose network motifs display distinct characteristics on biological functions. In addition, in this paper we studied network motifs from a different viewpoint. That is, we show that a network motif should not be merely considered as an interaction pattern but be considered as an essential function unit in organizing modules of networks.

  16. Topology-function conservation in protein-protein interaction networks.

    Science.gov (United States)

    Davis, Darren; Yaveroğlu, Ömer Nebil; Malod-Dognin, Noël; Stojmirovic, Aleksandar; Pržulj, Nataša

    2015-05-15

    Proteins underlie the functioning of a cell, and the wiring of proteins in a protein-protein interaction network (PIN) relates to their biological functions. Proteins with similar wiring in the PIN (topology around them) have been shown to have similar functions. This property has been successfully exploited for predicting protein functions. Topological similarity is also used to guide network alignment algorithms that find similarly wired proteins between PINs of different species; these similarities are used to transfer annotation across PINs, e.g. from model organisms to human. To refine these functional predictions and annotation transfers, we need to gain insight into the variability of the topology-function relationships. For example, a function may be significantly associated with specific topologies, while another function may be weakly associated with several different topologies. Also, the topology-function relationships may differ between different species. To improve our understanding of topology-function relationships and of their conservation among species, we develop a statistical framework that is built upon canonical correlation analysis. Using the graphlet degrees to represent the wiring around proteins in PINs and gene ontology (GO) annotations to describe their functions, our framework: (i) characterizes statistically significant topology-function relationships in a given species, and (ii) uncovers the functions that have conserved topology in PINs of different species, which we term topologically orthologous functions. We apply our framework to PINs of yeast and human, identifying seven biological process and two cellular component GO terms to be topologically orthologous for the two organisms. © The Author 2015. Published by Oxford University Press.

  17. Essential Protein Detection by Random Walk on Weighted Protein-Protein Interaction Networks.

    Science.gov (United States)

    Xu, Bin; Guan, Jihong; Wang, Yang; Wang, Zewei

    2017-05-12

    Essential proteins are critical to the development and survival of cells. Identification of essential proteins is helpful for understanding the minimal set of required genes in a living cell and for designing new drugs. To detect essential proteins, various computational methods have been proposed based on protein-protein interaction (PPI) networks. However, protein interaction data obtained by high-throughput experiments usually contain many false positives, which negatively impact the accuracy of essential protein detection. Moreover, most existing studies have focused on the local information of proteins in PPI networks, while ignoring the influence of indirect protein interactions on essentiality. In this paper, we propose a novel method, called Essentiality Ranking (EssRank for short), to boost the accuracy of essential protein detection. To deal with the inaccuracy of PPI data, confidence scores of interactions are evaluated by integrating various biological information. A weighted edge clustering coefficient (WECC), considering both interaction confidence scores and network topology, is proposed to calculate edge weights in PPI networks. The weight of each node is evaluated as the sum of the WECC values of its linking edges. A random walk method, making use of both direct and indirect protein interactions, is then employed to calculate protein essentiality iteratively. Experimental results on the yeast PPI network show that EssRank outperforms most existing methods, including the most commonly used centrality measures (SC, DC, BC, CC, IC, EC), topology-based methods (DMNC and NC) and the data-integrating method IEW.
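
    The abstract above describes an iterative random walk over a weighted network. The record does not give the EssRank update rule, so the sketch below shows only a generic PageRank-style walk with uniform restart on a weighted undirected graph. The function name, the damping value of 0.85 and the toy weights are assumptions, and the WECC edge weighting is not reproduced.

```python
def random_walk_scores(weights, damping=0.85, tol=1e-12, max_iter=1000):
    """Stationary scores of a random walk with uniform restart.

    `weights[u][v]` is the weight of edge u-v. Assumes the graph is
    undirected (weights are symmetric) and every node has at least one edge.
    This is a generic walk, not the EssRank update itself.
    """
    nodes = sorted(weights)
    n = len(nodes)
    strength = {u: sum(weights[u].values()) for u in nodes}
    score = {u: 1.0 / n for u in nodes}
    for _ in range(max_iter):
        new = {u: (1.0 - damping) / n for u in nodes}
        for u in nodes:
            for v, w in weights[u].items():
                # u passes a share of its score to v, proportional to the edge weight
                new[v] += damping * score[u] * w / strength[u]
        change = sum(abs(new[u] - score[u]) for u in nodes)
        score = new
        if change < tol:
            break
    return score


# Hypothetical toy network: one hub connected to three leaves.
star = {
    "hub": {"a": 1.0, "b": 1.0, "c": 1.0},
    "a": {"hub": 1.0},
    "b": {"hub": 1.0},
    "c": {"hub": 1.0},
}
scores = random_walk_scores(star)
print(max(scores, key=scores.get))  # hub
```

    Because score flows along paths of any length, a node's value reflects indirect as well as direct interactions, which is the property the abstract highlights.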

  18. Comparative Study of Elastic Network Model and Protein Contact Network for Protein Complexes: The Hemoglobin Case

    Directory of Open Access Journals (Sweden)

    Guang Hu

    2017-01-01

    The overall topology and interfacial interactions play key roles in understanding structural and functional principles of protein complexes. The Elastic Network Model (ENM) and Protein Contact Network (PCN) are two widely used methods for high-throughput investigation of structures and interactions within protein complexes. In this work, the comparative analysis of ENM and PCN relative to hemoglobin (Hb) was taken as a case study. We examine four types of structural and dynamical paradigms, namely, conformational change between different states of Hbs, modular analysis, allosteric mechanism studies, and interface characterization of an Hb. The comparative study shows that ENM has an advantage in studying dynamical properties and protein-protein interfaces, while PCN is better for describing protein structures quantitatively at both local and global levels. We suggest that the integration of ENM and PCN would provide a potentially powerful tool in structural systems biology.

  19. HKC: An Algorithm to Predict Protein Complexes in Protein-Protein Interaction Networks

    Directory of Open Access Journals (Sweden)

    Xiaomin Wang

    2011-01-01

    With the availability of more and more genome-scale protein-protein interaction (PPI) networks, research interest is gradually shifting to systematic analysis of these large data sets. A key topic is to predict protein complexes in PPI networks by identifying clusters that are densely connected within themselves but sparsely connected with the rest of the network. In this paper, we present a new topology-based algorithm, HKC, to detect protein complexes in genome-scale PPI networks. HKC mainly uses the concepts of highest k-core and cohesion to predict protein complexes by identifying overlapping clusters. The experiments on two data sets and two benchmarks show that our algorithm has a relatively high F-measure and exhibits better performance compared with some other methods.
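
    The abstract above relies on the notion of a highest k-core. As an illustration of that concept only (the cohesion step and the HKC algorithm itself are not reproduced), the sketch below computes core numbers by repeatedly peeling the lowest-degree node and then extracts the nodes with the largest core number. Function names and the toy graph are assumptions.

```python
def core_numbers(graph):
    """Core number of every node in an undirected graph.

    `graph` maps each node to the set of its neighbours. Nodes are peeled
    in order of current degree; a node's core number is the largest minimum
    degree seen up to the moment it is removed.
    """
    degree = {v: len(nbrs) for v, nbrs in graph.items()}
    remaining = set(graph)
    core = {}
    k = 0
    while remaining:
        v = min(remaining, key=degree.get)
        k = max(k, degree[v])
        core[v] = k
        remaining.remove(v)
        for u in graph[v]:
            if u in remaining:
                degree[u] -= 1
    return core


def highest_k_core(graph):
    """Return (k, nodes) for the non-empty k-core with the largest k."""
    core = core_numbers(graph)
    k = max(core.values())
    return k, {v for v, c in core.items() if c == k}


# Hypothetical toy graph: a 4-clique {0, 1, 2, 3} with a tail 0-4-5.
toy = {
    0: {1, 2, 3, 4}, 1: {0, 2, 3}, 2: {0, 1, 3}, 3: {0, 1, 2},
    4: {0, 5}, 5: {4},
}
print(highest_k_core(toy))  # (3, {0, 1, 2, 3})
```

    This simple peeling loop is quadratic in the number of nodes; bucket-based implementations run in linear time and would be preferable for genome-scale networks.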

  20. Discriminating lysosomal membrane protein types using dynamic neural network.

    Science.gov (United States)

    Tripathi, Vijay; Gupta, Dwijendra Kumar

    2014-01-01

    This work presents a dynamic artificial neural network methodology that classifies proteins into their classes from their sequences alone: the lysosomal membrane protein classes and the various other membrane protein classes. In this paper, a neural-network-based lysosomal-associated membrane protein type prediction system is proposed. Different protein sequence representations are fused to extract the features of a protein sequence, comprising seven feature sets: amino acid (AA) composition, sequence length, hydrophobic group, electronic group, sum of hydrophobicity, R-group, and dipeptide composition. To reduce the dimensionality of the large feature vector, we applied principal component analysis. The probabilistic neural network, generalized regression neural network, and Elman regression neural network (RNN) are used as classifiers and compared with the layer recurrent network (LRN), a dynamic network. Dynamic networks have memory, i.e. their output depends not only on the input but also on the previous outputs. Thus, the accuracy of the LRN classifier comes out highest among all the artificial neural networks. The overall accuracy of jackknife cross-validation is 93.2% for the data set. These results suggest that the method can be effectively applied to discriminate lysosomal-associated membrane proteins from other membrane proteins (Type-I, outer membrane proteins, GPI-anchored) and globular proteins, and they also indicate that the protein sequence representation can better reflect the core features of membrane proteins than the classical AA composition.

  1. Analysis of core–periphery organization in protein contact networks ...

    Indian Academy of Sciences (India)

    2015-09-29

    The representation of proteins as networks of interacting amino acids, referred to as protein contact networks (PCN), and their subsequent analyses using graph theoretic tools, can provide novel insights into the key functional roles of specific groups of residues. We have characterized the networks ...

  2. Analysis of core–periphery organization in protein contact networks ...

    Indian Academy of Sciences (India)

    The representation of proteins as networks of interacting amino acids, referred to as protein contact networks (PCN), and their subsequent analyses using graph theoretic tools, can provide novel insights into the key functional roles of specific groups of residues. We have characterized the networks corresponding to the ...

  3. Protein function prediction using neighbor relativity in protein-protein interaction network.

    Science.gov (United States)

    Moosavi, Sobhan; Rahgozar, Masoud; Rahimi, Amir

    2013-04-01

    There is a large gap between the number of discovered proteins and the number of functionally annotated ones. Due to the high cost of determining protein function by wet-lab research, function prediction has become a major task for computational biology and bioinformatics. Some studies use protein interaction information to predict function for un-annotated proteins. In this paper, we propose a novel approach called "Neighbor Relativity Coefficient" (NRC), based on interaction network topology, which estimates the functional similarity between two proteins. NRC is calculated for each pair of proteins based on their graph-based features, including distance, common neighbors and the number of paths between them. In order to ascribe function to an un-annotated protein, NRC estimates a weight for each neighbor to transfer its annotation to the unknown protein. Finally, the unknown protein is annotated with the top-scoring transferred functions. We also investigate the effect of using different coefficients for various types of functions. The proposed method has been evaluated on Saccharomyces cerevisiae and Homo sapiens interaction networks. The performance analysis demonstrates that NRC yields better results in comparison with previous protein function prediction approaches that utilize interaction networks. Copyright © 2012 Elsevier Ltd. All rights reserved.

  4. Topology of membrane proteins-predictions, limitations and variations.

    Science.gov (United States)

    Tsirigos, Konstantinos D; Govindarajan, Sudha; Bassot, Claudio; Västermark, Åke; Lamb, John; Shu, Nanjiang; Elofsson, Arne

    2017-10-26

    Transmembrane proteins perform a variety of important biological functions necessary for the survival and growth of the cells. Membrane proteins are built up by transmembrane segments that span the lipid bilayer. The segments can either be in the form of hydrophobic alpha-helices or beta-sheets which create a barrel. A fundamental aspect of the structure of transmembrane proteins is the membrane topology, that is, the number of transmembrane segments, their position in the protein sequence and their orientation in the membrane. Along these lines, many predictive algorithms for the prediction of the topology of alpha-helical and beta-barrel transmembrane proteins exist. The newest algorithms obtain an accuracy close to 80% both for alpha-helical and beta-barrel transmembrane proteins. However, lately it has been shown that the simplified picture presented when describing a protein family by its topology is limited. To demonstrate this, we highlight examples where the topology is either not conserved in a protein superfamily or where the structure cannot be described solely by the topology of a protein. The prediction of these non-standard features from sequence alone was not successful until the recent revolutionary progress in 3D-structure prediction of proteins. Copyright © 2017 Elsevier Ltd. All rights reserved.

  5. Modeling protein network evolution under genome duplication and domain shuffling

    Directory of Open Access Journals (Sweden)

    Isambert Hervé

    2007-11-01

    Background: Successive whole genome duplications have recently been firmly established in all major eukaryote kingdoms. Such exponential evolutionary processes must have largely contributed to shaping the topology of protein-protein interaction (PPI) networks by outweighing, in particular, all time-linear network growths modeled so far. Results: We propose and solve a mathematical model of PPI network evolution under successive genome duplications. This demonstrates, from first principles, that evolutionary conservation and scale-free topology are intrinsically linked properties of PPI networks and emerge from (i) prevailing exponential network dynamics under duplication and (ii) asymmetric divergence of gene duplicates. While required, we argue that this asymmetric divergence arises, in fact, spontaneously at the level of protein-binding sites. This supports a refined model of PPI network evolution in terms of protein domains under exponential and asymmetric duplication/divergence dynamics, with multidomain proteins underlying the combinatorial formation of protein complexes. Genome duplication then provides a powerful source of PPI network innovation by promoting local rearrangements of multidomain proteins on a genome-wide scale. Yet, we show that the overall conservation and topology of PPI networks are robust to extensive domain shuffling of multidomain proteins as well as to finer details of protein interaction and evolution. Finally, large-scale features of direct and indirect PPI networks of S. cerevisiae are well reproduced numerically with only two adjusted parameters of clear biological significance (i.e. network effective growth rate and average number of protein-binding domains per protein). Conclusion: This study demonstrates the statistical consequences of genome duplication and domain shuffling on the conservation and topology of PPI networks over a broad evolutionary scale across eukaryote kingdoms. In particular, scale

  6. Protein-protein interaction network-based detection of functionally similar proteins within species.

    Science.gov (United States)

    Song, Baoxing; Wang, Fen; Guo, Yang; Sang, Qing; Liu, Min; Li, Dengyun; Fang, Wei; Zhang, Deli

    2012-07-01

    Although functionally similar proteins across species have been widely studied, functionally similar proteins within species showing low sequence similarity have not been examined in detail. Identification of these proteins is of significant importance for understanding biological functions, evolution of protein families, progression of co-evolution, convergent evolution and other aspects that cannot be obtained by detection of functionally similar proteins across species. Here, we explored a method of detecting functionally similar proteins within species based on graph theory. After representing protein-protein interaction networks as graphs, we split the graphs into subgraphs using the 1-hop method. Proteins with functional similarities in a species were detected using a modified shortest path method to compare these subgraphs and to find the eligible optimal results. Using seven protein-protein interaction networks and this method, some functionally similar proteins with low sequence similarity that cannot be detected by sequence alignment were identified. By analyzing the results, we found that it is sometimes difficult to separate homologous from convergent evolution. Evaluation of the performance of our method by gene ontology term overlap showed that the precision of our method was excellent. Copyright © 2012 Wiley Periodicals, Inc.

  7. Analysis of protein-protein interaction networks by means of annotated graph mining algorithms

    NARCIS (Netherlands)

    Rahmani, Hossein

    2012-01-01

    This thesis discusses solutions to several open problems in Protein-Protein Interaction (PPI) networks with the aid of Knowledge Discovery. PPI networks are usually represented as undirected graphs, with nodes corresponding to proteins and edges representing interactions among protein pairs. A large

  8. Building protein-protein interaction networks for Leishmania species through protein structural information.

    Science.gov (United States)

    Dos Santos Vasconcelos, Crhisllane Rafaele; de Lima Campos, Túlio; Rezende, Antonio Mauro

    2018-03-06

    Systematic analysis of a parasite interactome is a key approach to understanding different biological processes. It makes it possible to elucidate disease mechanisms, to predict protein functions and to select promising targets for drug development. Currently, several approaches for protein interaction prediction for non-model species incorporate only small fractions of the entire proteomes and their interactions. Based on this perspective, this study presents an integration of computational methodologies, protein network predictions and comparative analysis of the protozoan species Leishmania braziliensis and Leishmania infantum. These parasites cause Leishmaniasis, a neglected disease distributed worldwide, with limited treatment options using currently available drugs. The predicted interactions were obtained from a meta-approach, applying rigid body docking tests and template-based docking on protein structures predicted by different comparative modeling techniques. In addition, we trained a machine-learning algorithm (Gradient Boosting) using docking information performed on a curated set of positive and negative protein interaction data. Our final model obtained an AUC = 0.88, with recall = 0.69, specificity = 0.88 and precision = 0.83. Using this approach, it was possible to confidently predict 681 protein structures and 6198 protein interactions for L. braziliensis, and 708 protein structures and 7391 protein interactions for L. infantum. The predicted networks were integrated with protein interaction data already available, analyzed using several topological features and used to classify proteins as essential for network stability. The present study demonstrates the importance of integrating different methodologies of interaction prediction to increase the coverage of the protein interaction of the studied protocols; it also made available protein structures and interactions not previously reported.

  9. Detecting overlapping protein complexes by rough-fuzzy clustering in protein-protein interaction networks.

    Science.gov (United States)

    Wu, Hao; Gao, Lin; Dong, Jihua; Yang, Xiaofei

    2014-01-01

    In this paper, we present a novel rough-fuzzy clustering (RFC) method to detect overlapping protein complexes in protein-protein interaction (PPI) networks. RFC focuses on a fuzzy relation model rather than a graph model by integrating fuzzy sets and rough sets, employs the upper and lower approximations of rough sets to deal with overlapping complexes, and calculates the number of complexes automatically. A fuzzy relation between proteins is established and then transformed into a fuzzy equivalence relation. Non-overlapping complexes correspond to equivalence classes satisfying a certain equivalence relation. To obtain overlapping complexes, we calculate the similarity between one protein and each complex, and then determine whether the protein belongs to one or multiple complexes by computing the ratio of each similarity to the maximum similarity. To validate RFC quantitatively, we test it on the Gavin, Collins, Krogan and BioGRID datasets. Experimental results show that there is a good correspondence to reference complexes in the MIPS and SGD databases. We then compare RFC with several previous methods, including ClusterONE, CMC, MCL, GCE, OSLOM and CFinder. Results show that the precision, sensitivity and separation are 32.4%, 42.9% and 81.9% higher than the mean of the five methods in four weighted networks, and are 0.5%, 11.2% and 66.1% higher than the mean of the six methods in five unweighted networks. Our method RFC works well for protein complex detection and provides a new insight into network division, and it can also be applied to identify overlapping community structure in social networks and LFR benchmark networks.
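
    The overlap step described above (comparing each similarity with the maximum similarity) can be illustrated with a minimal sketch. The threshold value, the function name and the similarity scores are assumptions made for illustration; the abstract does not state how the ratio is turned into a membership decision.

```python
def assign_complexes(similarities, ratio_threshold=0.8):
    """Assign a protein to every complex whose similarity is close to the best one.

    similarities: dict mapping a complex id to the protein-complex similarity.
    ratio_threshold: hypothetical cutoff on similarity / max similarity.
    """
    max_sim = max(similarities.values())
    if max_sim <= 0:
        return []  # the protein is not similar to any complex
    return sorted(
        cid for cid, sim in similarities.items()
        if sim / max_sim >= ratio_threshold
    )


# Illustrative scores for one protein against three candidate complexes.
scores = {"c1": 0.9, "c2": 0.8, "c3": 0.2}
print(assign_complexes(scores))        # membership with the default cutoff
print(assign_complexes(scores, 0.95))  # a stricter cutoff keeps fewer complexes
```

    A protein returned with more than one complex id is an overlapping member; a protein returned with a single id belongs to one complex only.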

  10. RAIN: RNA-protein Association and Interaction Networks

    DEFF Research Database (Denmark)

    Junge, Alexander; Refsgaard, Jan Christian; Garde, Christian

    2017-01-01

    Protein association networks can be inferred from a range of resources including experimental data, literature mining and computational predictions. These types of evidence are emerging for non-coding RNAs (ncRNAs) as well. However, integration of ncRNAs into protein association networks...

  11. The architectural design of networks of protein domain architectures.

    Science.gov (United States)

    Hsu, Chia-Hsin; Chen, Chien-Kuo; Hwang, Ming-Jing

    2013-08-23

    Protein domain architectures (PDAs), in which single domains are linked to form multiple-domain proteins, are a major molecular form used by evolution for the diversification of protein functions. However, the design principles of PDAs remain largely uninvestigated. In this study, we constructed networks to connect domain architectures that had grown out from the same single domain for every single domain in the Pfam-A database and found that there are three main distinctive types of these networks, which suggests that evolution can exploit PDAs in three different ways. Further analysis showed that these three different types of PDA networks are each adopted by different types of protein domains, although many networks exhibit the characteristics of more than one of the three types. Our results shed light on nature's blueprint for protein architecture and provide a framework for understanding architectural design from a network perspective.

  12. Small world network strategies for studying protein structures and binding.

    Science.gov (United States)

    Taylor, Neil R

    2013-01-01

    Small world network concepts provide many new opportunities to investigate the complex three dimensional structures of protein molecules. This mini-review explores the published literature on using small-world network approaches to study protein structure, with emphasis on the different combinations of descriptors that have been tested, on studies involving ligand binding in protein-ligand complexes, and on protein-protein complexes. The benefits and success of small world network approaches, which change the focus from specific interactions to the local environment, even to non-local phenomenon, are described. The purpose is to show the different ways that small world network concepts have been used for building new computational models for studying protein structure and function, and for extending and improving existing modelling approaches.

  13. Evidence of probabilistic behaviour in protein interaction networks.

    Science.gov (United States)

    Ivanic, Joseph; Wallqvist, Anders; Reifman, Jaques

    2008-01-31

    Data from high-throughput experiments of protein-protein interactions are commonly used to probe the nature of biological organization and extract functional relationships between sets of proteins. What has not been appreciated is that the underlying mechanisms involved in assembling these networks may exhibit considerable probabilistic behaviour. We find that the probability of an interaction between two proteins is generally proportional to the numerical product of their individual interacting partners, or degrees. The degree-weighted behaviour is manifested throughout the protein-protein interaction networks studied here, except for the high-degree, or hub, interaction areas. However, we find that the probabilities of interaction between the hubs are still high. Further evidence is provided by path length analyses, which show that these hubs are separated by very few links. The results suggest that protein-protein interaction networks incorporate probabilistic elements that lead to scale-rich hierarchical architectures. These observations seem to be at odds with a biologically-guided organization. One interpretation of the findings is that we are witnessing the ability of proteins to indiscriminately bind rather than the protein-protein interactions that are actually utilized by the cell in biological processes. Therefore, the topological study of a degree-weighted network requires a more refined methodology to extract biological information about pathways, modules, or other inferred relationships among proteins.
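
    The degree-weighted behaviour described above can be sketched as follows. The abstract states only that the interaction probability is proportional to the product of the two degrees; the normalization by twice the number of edges used here is an assumption borrowed from standard random-graph models, and the toy network is hypothetical.

```python
from collections import Counter


def degrees(edges):
    """Count the number of interaction partners of each protein."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg


def degree_weighted_probability(deg, u, v):
    """Probability of a u-v interaction under the assumption
    p(u, v) = k_u * k_v / (2 * m), capped at 1."""
    two_m = sum(deg.values())  # each edge contributes to two degrees
    return min(1.0, deg[u] * deg[v] / two_m)


# Toy network: A is a small hub, D has a single partner.
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C")]
deg = degrees(edges)
print(degree_weighted_probability(deg, "A", "B"))
print(degree_weighted_probability(deg, "C", "D"))
```

    Comparing such expected probabilities with the observed fraction of interacting pairs in each degree class is one way to check whether a network is consistent with degree-weighted behaviour.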

  14. Evidence of probabilistic behaviour in protein interaction networks

    Directory of Open Access Journals (Sweden)

    Reifman Jaques

    2008-01-01

    Full Text Available Abstract Background: Data from high-throughput experiments of protein-protein interactions are commonly used to probe the nature of biological organization and extract functional relationships between sets of proteins. What has not been appreciated is that the underlying mechanisms involved in assembling these networks may exhibit considerable probabilistic behaviour. Results: We find that the probability of an interaction between two proteins is generally proportional to the numerical product of their individual interacting partners, or degrees. The degree-weighted behaviour is manifested throughout the protein-protein interaction networks studied here, except for the high-degree, or hub, interaction areas. However, we find that the probabilities of interaction between the hubs are still high. Further evidence is provided by path length analyses, which show that these hubs are separated by very few links. Conclusion: The results suggest that protein-protein interaction networks incorporate probabilistic elements that lead to scale-rich hierarchical architectures. These observations seem to be at odds with a biologically-guided organization. One interpretation of the findings is that we are witnessing the ability of proteins to indiscriminately bind rather than the protein-protein interactions that are actually utilized by the cell in biological processes. Therefore, the topological study of a degree-weighted network requires a more refined methodology to extract biological information about pathways, modules, or other inferred relationships among proteins.

  15. Probing the extent of randomness in protein interaction networks.

    Directory of Open Access Journals (Sweden)

    Joseph Ivanic

    Full Text Available Protein-protein interaction (PPI) networks are commonly explored for the identification of distinctive biological traits, such as pathways, modules, and functional motifs. In this respect, understanding the underlying network structure is vital to assess the significance of any discovered features. We recently demonstrated that PPI networks show degree-weighted behavior, whereby the probability of interaction between two proteins is generally proportional to the product of their numbers of interacting partners or degrees. It was surmised that degree-weighted behavior is a characteristic of randomness. We expand upon these findings by developing a random, degree-weighted, network model and show that eight PPI networks determined from single high-throughput (HT) experiments have global and local properties that are consistent with this model. The apparent random connectivity in HT PPI networks is counter-intuitive with respect to their observed degree distributions; however, we resolve this discrepancy by introducing a non-network-based model for the evolution of protein degrees or "binding affinities." This mechanism is based on duplication and random mutation, for which the degree distribution converges to a steady state that is identical to one obtained by averaging over the eight HT PPI networks. The results imply that the degrees and connectivities incorporated in HT PPI networks are characteristic of unbiased interactions between proteins that have varying individual binding affinities. These findings corroborate the observation that curated and high-confidence PPI networks are distinct from HT PPI networks and not consistent with a random connectivity. These results provide an avenue to discern indiscriminate organizations in biological networks and suggest caution in the analysis of curated and high-confidence networks.

  16. Probing the extent of randomness in protein interaction networks.

    Science.gov (United States)

    Ivanic, Joseph; Wallqvist, Anders; Reifman, Jaques

    2008-07-11

    Protein-protein interaction (PPI) networks are commonly explored for the identification of distinctive biological traits, such as pathways, modules, and functional motifs. In this respect, understanding the underlying network structure is vital to assess the significance of any discovered features. We recently demonstrated that PPI networks show degree-weighted behavior, whereby the probability of interaction between two proteins is generally proportional to the product of their numbers of interacting partners or degrees. It was surmised that degree-weighted behavior is a characteristic of randomness. We expand upon these findings by developing a random, degree-weighted, network model and show that eight PPI networks determined from single high-throughput (HT) experiments have global and local properties that are consistent with this model. The apparent random connectivity in HT PPI networks is counter-intuitive with respect to their observed degree distributions; however, we resolve this discrepancy by introducing a non-network-based model for the evolution of protein degrees or "binding affinities." This mechanism is based on duplication and random mutation, for which the degree distribution converges to a steady state that is identical to one obtained by averaging over the eight HT PPI networks. The results imply that the degrees and connectivities incorporated in HT PPI networks are characteristic of unbiased interactions between proteins that have varying individual binding affinities. These findings corroborate the observation that curated and high-confidence PPI networks are distinct from HT PPI networks and not consistent with a random connectivity. These results provide an avenue to discern indiscriminate organizations in biological networks and suggest caution in the analysis of curated and high-confidence networks.

  17. Convolutional LSTM Networks for Subcellular Localization of Proteins

    DEFF Research Database (Denmark)

    Sønderby, Søren Kaae; Sønderby, Casper Kaae; Nielsen, Henrik

    2015-01-01

    Machine learning is widely used to analyze biological sequence data. Non-sequential models such as SVMs or feed-forward neural networks are often used although they have no natural way of handling sequences of varying length. Recurrent neural networks such as the long short term memory (LSTM) model...... on the other hand are designed to handle sequences. In this study we demonstrate that LSTM networks predict the subcellular location of proteins given only the protein sequence with high accuracy (0.902) outperforming current state of the art algorithms. We further improve the performance by introducing...... the LSTM networks....

  18. Combining neural networks for protein secondary structure prediction

    DEFF Research Database (Denmark)

    Riis, Søren Kamaric

    1995-01-01

    In this paper structured neural networks are applied to the problem of predicting the secondary structure of proteins. A hierarchical approach is used where specialized neural networks are designed for each structural class and then combined using another neural network. The submodels are designed...... by using a priori knowledge of the mapping between protein building blocks and the secondary structure and by using weight sharing. Since none of the individual networks have more than 600 adjustable weights over-fitting is avoided. When ensembles of specialized experts are combined the performance...... is better than most secondary structure prediction methods based on single sequences even though this model contains much fewer parameters...

  19. The PAM domain, a multi-protein complex-associated module with an all-alpha-helix fold

    Directory of Open Access Journals (Sweden)

    Izaurralde Elisa

    2003-12-01

    Full Text Available Abstract Background: Multimeric protein complexes have a role in many cellular pathways and are highly interconnected with various other proteins. The characterization of their domain composition and organization provides useful information on the specific role of each region of their sequence. Results: We identified a new module, the PAM domain (PCI/PINT associated module), present in single subunits of well characterized multiprotein complexes, like the regulatory lid of the 26S proteasome, the COP-9 signalosome and the Sac3-Thp1 complex. This module is a domain of around 200 residues with a predicted TPR-like all-alpha-helical fold. Conclusions: The occurrence of the PAM domain in specific subunits of multimeric protein complexes, together with the role of other all-alpha-helical folds in protein-protein interactions, suggests a function for this domain in mediating transient binding to diverse target proteins.

  20. Evolution of a protein domain interaction network

    International Nuclear Information System (INIS)

    Li-Feng, Gao; Jian-Jun, Shi; Shan, Guan

    2010-01-01

    In this paper, we attempt to understand complex network evolution from the underlying evolutionary relationship between biological organisms. Firstly, we construct a Pfam domain interaction network for each of the 470 completely sequenced organisms, and therefore each organism is correlated with a specific Pfam domain interaction network; secondly, we infer the evolutionary relationship of these organisms with the nearest neighbour joining method; thirdly, we use the evolutionary relationship between organisms constructed in the second step as the evolutionary course of the Pfam domain interaction network constructed in the first step. This analysis of the evolutionary course shows: (i) there is a conserved sub-network structure in network evolution; in this sub-network, nodes with lower degree prefer to maintain their connectivity invariant, and hubs tend to maintain their role as hubs and attach preferentially to newly added nodes; (ii) few nodes are conserved as hubs; most of the other nodes are conserved as nodes with very low degree; (iii) in the course of network evolution, new nodes are added to the network either individually in most cases or as clusters with relatively high clustering coefficients in a very few cases.

  1. Analysis of protein folds using protein contact networks

    Indian Academy of Sciences (India)

    Proteins are important biomolecules, which perform diverse structural and functional roles in living systems. Starting from a .... even be extended up to the level of protein secondary structural elements, as seen in protein topology cartoons [13]. Even though ... chemical interactions [8]. This distance map is a 2D symmetric, ...

  2. CNNcon: improved protein contact maps prediction using cascaded neural networks.

    Directory of Open Access Journals (Sweden)

    Wang Ding

    Full Text Available BACKGROUNDS: Despite continuing progress in X-ray crystallography and high-field NMR spectroscopy for determination of three-dimensional protein structures, the number of unsolved and newly discovered sequences grows much faster than that of determined structures. Protein modeling methods can possibly bridge this huge sequence-structure gap with the development of computational science. A grand challenge is to predict three-dimensional protein structure from its primary structure (residue sequence) alone. However, predicting residue contact maps is a crucial and promising intermediate step towards final three-dimensional structure prediction. Better predictions of local and non-local contacts between residues can transform protein sequence alignment to structure alignment, which can finally improve template based three-dimensional protein structure predictors greatly. METHODS: CNNcon, an improved multiple neural networks based contact map predictor using six sub-networks and one final cascade-network, was developed in this paper. Both the sub-networks and the final cascade-network were trained and tested with their corresponding data sets. For testing, the target protein was first coded and then input to its corresponding sub-networks for prediction. After that, the intermediate results were input to the cascade-network to finish the final prediction. RESULTS: CNNcon can accurately predict 58.86% on average of contacts at a distance cutoff of 8 Å for proteins with lengths ranging from 51 to 450. The comparison results show that the present method performs better than the compared state-of-the-art predictors. In particular, the prediction accuracy stays steady as protein sequence length increases. This indicates that CNNcon overcomes the thin density problem, with which other current predictors have trouble. This advantage makes the method valuable for the prediction of long proteins. As a result, the effective
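
    A residue contact map at a distance cutoff of 8 Å, the quantity that predictors of this kind try to reproduce, can be computed from known coordinates as in the sketch below. The abstract does not say which atom represents each residue, so one coordinate per residue is assumed here, and the coordinates are invented for illustration.

```python
import math


def contact_map(coords, cutoff=8.0, min_separation=1):
    """Return a symmetric 0/1 matrix marking residue pairs within `cutoff` Å.

    coords: one (x, y, z) tuple per residue (assumed representative atom).
    min_separation: minimum sequence separation |i - j| for a pair to count.
    """
    n = len(coords)
    cmap = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + min_separation, n):
            if math.dist(coords[i], coords[j]) <= cutoff:
                cmap[i][j] = cmap[j][i] = 1
    return cmap


# Four residues on a straight line, 3.8 Å apart (illustrative geometry).
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
for row in contact_map(coords):
    print(row)
```

    Raising `min_separation` restricts the map to non-local contacts, which are the harder and more informative ones for structure prediction.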

  3. Fluctuations in Mass-Action Equilibrium of Protein Binding Networks

    Science.gov (United States)

    Yan, Koon-Kiu; Walker, Dylan; Maslov, Sergei

    2008-12-01

    We consider two types of fluctuations in the mass-action equilibrium in protein binding networks. The first type is driven by slow changes in total concentrations of interacting proteins. The second type (spontaneous) is caused by quickly decaying thermodynamic deviations away from equilibrium. We investigate the effects of network connectivity on fluctuations by comparing them to scenarios in which the interacting pair is isolated from the network and analytically derive bounds on fluctuations. Collective effects are shown to sometimes lead to large amplification of spontaneous fluctuations. The strength of both types of fluctuations is positively correlated with the complex connectivity and negatively correlated with complex concentration. Our general findings are illustrated using a curated network of protein interactions and multiprotein complexes in baker’s yeast, with empirical protein concentrations.
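
    For the reference scenario of an interacting pair isolated from the network, the mass-action equilibrium has a closed-form solution. The sketch below solves (A_total - D)(B_total - D) = K_d D for the dimer concentration D. The symbol names and the numerical values are assumptions for illustration and do not come from the paper.

```python
import math


def dimer_concentration(a_total, b_total, k_d):
    """Equilibrium concentration of the complex AB for an isolated pair.

    Solves (a_total - D) * (b_total - D) = k_d * D and returns the root
    that does not exceed either total concentration.
    """
    s = a_total + b_total + k_d
    return (s - math.sqrt(s * s - 4.0 * a_total * b_total)) / 2.0


# Illustrative values in arbitrary concentration units.
d = dimer_concentration(1.0, 1.0, 0.5)
free_a, free_b = 1.0 - d, 1.0 - d
print(d, free_a * free_b / d)  # the second number recovers k_d
```

    Recomputing D after a small change in one total concentration gives the response of an isolated pair, which is the baseline against which network-driven fluctuations are compared.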

  4. NatalieQ: A web server for protein-protein interaction network querying

    NARCIS (Netherlands)

    El-Kebir, M.; Brandt, B.W.; Heringa, J.; Klau, G.W.

    2014-01-01

    Background Molecular interactions need to be taken into account to adequately model the complex behavior of biological systems. These interactions are captured by various types of biological networks, such as metabolic, gene-regulatory, signal transduction and protein-protein interaction networks.

  5. NatalieQ: a web server for protein-protein interaction network querying

    NARCIS (Netherlands)

    El-Kebir, M.; Brandt, B.W.; Heringa, J.; Klau, G.W.

    2014-01-01

    Background: Molecular interactions need to be taken into account to adequately model the complex behavior of biological systems. These interactions are captured by various types of biological networks, such as metabolic, gene-regulatory, signal transduction and protein-protein interaction networks.

  6. Rapid Sampling of Hydrogen Bond Networks for Computational Protein Design.

    Science.gov (United States)

    Maguire, Jack B; Boyken, Scott E; Baker, David; Kuhlman, Brian

    2018-04-20

    Hydrogen bond networks play a critical role in determining the stability and specificity of biomolecular complexes, and the ability to design such networks is important for engineering novel structures, interactions, and enzymes. One key feature of hydrogen bond networks that makes them difficult to rationally engineer is that they are highly cooperative and are not energetically favorable until the hydrogen bonding potential has been satisfied for all buried polar groups in the network. Existing computational methods for protein design are ill-equipped for creating these highly cooperative networks because they rely on energy functions and sampling strategies that are focused on pairwise interactions. To enable the design of complex hydrogen bond networks, we have developed a new sampling protocol in the molecular modeling program Rosetta that explicitly searches for sets of amino acid mutations that can form self-contained hydrogen bond networks. For a given set of designable residues, the protocol often identifies many alternative sets of mutations/networks, and we show that it can readily be applied to large sets of residues at protein-protein interfaces or in the interior of proteins. The protocol builds on a recently developed method in Rosetta for designing hydrogen bond networks that has been experimentally validated for small symmetric systems but was not extensible to many larger protein structures and complexes. The sampling protocol we describe here not only recapitulates previously validated designs with performance improvements but also yields viable hydrogen bond networks for cases where the previous method fails, such as the design of large, asymmetric interfaces relevant to engineering protein-based therapeutics.

  7. Topological properties of complex networks in protein structures

    Science.gov (United States)

    Kim, Kyungsik; Jung, Jae-Won; Min, Seungsik

    2014-03-01

    We study topological properties of networks in structural classification of proteins. We model the native-state protein structure as a network made of its constituent amino-acids and their interactions. We treat four structural classes of proteins composed predominantly of α helices and β sheets and consider several proteins from each of these classes whose sizes range from amino acids of the Protein Data Bank. Particularly, we simulate and analyze the network metrics such as the mean degree, the probability distribution of degree, the clustering coefficient, the characteristic path length, the local efficiency, and the cost. This work was supported by the KMAR and DP under Grant WISE project (153-3100-3133-302-350).
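
    Three of the metrics listed above (mean degree, clustering coefficient and characteristic path length) can be computed for a small residue network as sketched below. The adjacency map is a hypothetical toy example, and the functions assume a connected, undirected graph with at least two nodes.

```python
from collections import deque


def mean_degree(adj):
    """Average number of neighbours per node."""
    return sum(len(nb) for nb in adj.values()) / len(adj)


def clustering_coefficient(adj):
    """Average over nodes of the fraction of neighbour pairs that are linked.
    Nodes with fewer than two neighbours contribute zero."""
    total = 0.0
    for nb in adj.values():
        k = len(nb)
        if k < 2:
            continue
        links = sum(1 for u in nb for v in nb if u < v and v in adj[u])
        total += 2.0 * links / (k * (k - 1))
    return total / len(adj)


def characteristic_path_length(adj):
    """Mean shortest-path length over all reachable node pairs (BFS)."""
    total, pairs = 0, 0
    for src in adj:
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs


# Toy network: a triangle (0, 1, 2) with a pendant node 3 attached to node 2.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
print(mean_degree(adj), clustering_coefficient(adj), characteristic_path_length(adj))
```

    In a protein contact network the nodes would be amino acids and the edges would come from a distance or interaction criterion, which the abstract does not specify.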

  8. NETAL: a new graph-based method for global alignment of protein-protein interaction networks.

    Science.gov (United States)

    Neyshabur, Behnam; Khadem, Ahmadreza; Hashemifar, Somaye; Arab, Seyed Shahriar

    2013-07-01

    The interactions among proteins and the resulting networks of such interactions have a central role in cell biology. Aligning these networks gives us important information, such as conserved complexes and evolutionary relationships. Although there have been several publications on the global alignment of protein networks, none of the proposed methods is able to produce a highly conserved and meaningful alignment. Moreover, the time complexity of current algorithms makes them impossible to use for multiple alignment of several large networks together. We present a novel algorithm for the global alignment of protein-protein interaction networks. It uses a greedy method, based on the alignment scoring matrix, which is derived from both biological and topological information of input networks to find the best global network alignment. NETAL outperforms other global alignment methods in terms of several measurements, such as Edge Correctness, Largest Common Connected Subgraphs and the number of common Gene Ontology terms between aligned proteins. As the running time of NETAL is much less than that of other available methods, NETAL can be easily expanded to a multiple alignment algorithm. Furthermore, NETAL outperforms all other existing algorithms in terms of performance, and its short running time allowed us to implement it as the first server for global alignment of protein-protein interaction networks. Binaries supported on Linux are freely available for download at http://www.bioinf.cs.ipm.ir/software/netal. Supplementary data are available at Bioinformatics online.

  9. A periodic table of coiled-coil protein structures.

    Science.gov (United States)

    Moutevelis, Efrosini; Woolfson, Derek N

    2009-01-23

    Coiled coils are protein structure domains with two or more alpha-helices packed together via interlacing of side chains known as knobs-into-holes packing. We analysed and classified a large set of coiled-coil structures using a combination of automated and manual methods. This led to a systematic classification that we termed a "periodic table of coiled coils," which we have made available at http://coiledcoils.chm.bris.ac.uk/ccplus/search/periodic_table. In this table, coiled-coil assemblies are arranged in columns with increasing numbers of alpha-helices and in rows of increased complexity. The table provides a framework for understanding possibilities in and limits on coiled-coil structures and a basis for future prediction, engineering and design studies.

  10. Analysis of protein folds using protein contact networks

    Indian Academy of Sciences (India)

    Proteins are important biomolecules, which perform diverse structural and functional roles in living systems. Starting from a linear chain of amino acids, proteins fold to different secondary structures, which then fold through short- and long-range interactions to give rise to the final three-dimensional shapes useful to carry out ...

  11. Predicting and validating protein interactions using network structure.

    Directory of Open Access Journals (Sweden)

    Pao-Yang Chen

    2008-07-01

    Full Text Available Protein interactions play a vital part in the function of a cell. As experimental techniques for detection and validation of protein interactions are time consuming, there is a need for computational methods for this task. Protein interactions appear to form a network with a relatively high degree of local clustering. In this paper we exploit this clustering by suggesting a score based on triplets of observed protein interactions. The score utilises both protein characteristics and network properties. Our score based on triplets is shown to complement existing techniques for predicting protein interactions, outperforming them on data sets which display a high degree of clustering. The predicted interactions score highly against test measures for accuracy. Compared to a similar score derived from pairwise interactions only, the triplet score displays higher sensitivity and specificity. By looking at specific examples, we show how an experimental set of interactions can be enriched and validated. As part of this work we also examine the effect of different prior databases upon the accuracy of prediction and find that the interactions from the same kingdom give better results than from across kingdoms, suggesting that there may be fundamental differences between the networks. These results all emphasize that network structure is important and helps in the accurate prediction of protein interactions. The protein interaction data set and the program used in our analysis, and a list of predictions and validations, are available at http://www.stats.ox.ac.uk/bioinfo/resources/PredictingInteractions.

  12. Advanced path sampling of the kinetic network of small proteins

    NARCIS (Netherlands)

    Du, W.

    2014-01-01

    This thesis is focused on developing advanced path sampling simulation methods to study protein folding and unfolding, and to build kinetic equilibrium networks describing these processes. In Chapter 1 the basic knowledge of protein structure and folding theories were introduced and a brief overview

  13. Dynamic rheology of food protein networks

    Science.gov (United States)

    Small amplitude oscillatory shear analyses of samples containing protein are useful for determining the nature of the protein matrix without damaging it. Elastic modulus, viscous modulus, and loss tangent (the ratio of viscous modulus to elastic modulus) give information on the strength of the netw...
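
    The loss tangent defined above is a plain ratio, so it can be written down directly. The sketch below assumes both moduli are given in the same units; the function name and the sample values are hypothetical.

```python
def loss_tangent(elastic_modulus, viscous_modulus):
    """Return tan(delta) = viscous modulus / elastic modulus."""
    if elastic_modulus <= 0:
        raise ValueError("elastic modulus must be positive")
    return viscous_modulus / elastic_modulus


# Made-up moduli in Pa: a value below 1 means the elastic response dominates.
tan_delta = loss_tangent(elastic_modulus=100.0, viscous_modulus=25.0)
```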

  14. Computational Modeling of Complex Protein Activity Networks

    NARCIS (Netherlands)

    Schivo, Stefano; Leijten, Jeroen; Karperien, Marcel; Post, Janine N.; Prignet, Claude

    2017-01-01

    Because of the numerous entities interacting, the complexity of the networks that regulate cell fate makes it impossible to analyze and understand them using the human brain alone. Computational modeling is a powerful method to unravel complex systems. We recently described the development of a

  15. Protein diffusion in photopolymerized poly(ethylene glycol) hydrogel networks

    Energy Technology Data Exchange (ETDEWEB)

    Engberg, Kristin; Frank, Curtis W, E-mail: curt.frank@stanford.edu [Department of Chemical Engineering, Stanford University, 381 North-South Mall, Stauffer III, Stanford, CA 94305 (United States)

    2011-10-15

    In this study, protein diffusion through swollen hydrogel networks prepared from end-linked poly(ethylene glycol)-diacrylate (PEG-DA) was investigated. Hydrogels were prepared via photopolymerization from PEG-DA macromonomer solutions of two molecular weights, 4600 Da and 8000 Da, with three initial solid contents: 20, 33 and 50 wt/wt% PEG. Diffusion coefficients for myoglobin traveling across the hydrogel membrane were determined for all PEG network compositions. The diffusion coefficient depended on PEG molecular weight and initial solid content, with the slowest diffusion occurring through lower molecular weight, high-solid-content networks (D_gel = 0.16 ± 0.02 × 10⁻⁸ cm² s⁻¹) and the fastest diffusion occurring through higher molecular weight, low-solid-content networks (D_gel = 11.05 ± 0.43 × 10⁻⁸ cm² s⁻¹). Myoglobin diffusion coefficients increased linearly with the increase of water content within the hydrogels. The permeability of three larger model proteins (horseradish peroxidase, bovine serum albumin and immunoglobulin G) through PEG(8000) hydrogel membranes was also examined, with the observation that globular molecules as large as 10.7 nm in hydrodynamic diameter can diffuse through the PEG network. Protein diffusion coefficients within the PEG hydrogels ranged from one to two orders of magnitude lower than the diffusion coefficients in free water. Network defects were determined to be a significant contributing factor to the observed protein diffusion.

  16. Protein diffusion in photopolymerized poly(ethylene glycol) hydrogel networks

    International Nuclear Information System (INIS)

    Engberg, Kristin; Frank, Curtis W

    2011-01-01

    In this study, protein diffusion through swollen hydrogel networks prepared from end-linked poly(ethylene glycol)-diacrylate (PEG-DA) was investigated. Hydrogels were prepared via photopolymerization from PEG-DA macromonomer solutions of two molecular weights, 4600 Da and 8000 Da, with three initial solid contents: 20, 33 and 50 wt/wt% PEG. Diffusion coefficients for myoglobin traveling across the hydrogel membrane were determined for all PEG network compositions. The diffusion coefficient depended on PEG molecular weight and initial solid content, with the slowest diffusion occurring through lower molecular weight, high-solid-content networks (D_gel = 0.16 ± 0.02 × 10⁻⁸ cm² s⁻¹) and the fastest diffusion occurring through higher molecular weight, low-solid-content networks (D_gel = 11.05 ± 0.43 × 10⁻⁸ cm² s⁻¹). Myoglobin diffusion coefficients increased linearly with the increase of water content within the hydrogels. The permeability of three larger model proteins (horseradish peroxidase, bovine serum albumin and immunoglobulin G) through PEG(8000) hydrogel membranes was also examined, with the observation that globular molecules as large as 10.7 nm in hydrodynamic diameter can diffuse through the PEG network. Protein diffusion coefficients within the PEG hydrogels ranged from one to two orders of magnitude lower than the diffusion coefficients in free water. Network defects were determined to be a significant contributing factor to the observed protein diffusion.

  17. Using the clustered circular layout as an informative method for visualizing protein-protein interaction networks.

    Science.gov (United States)

    Fung, David C Y; Wilkins, Marc R; Hart, David; Hong, Seok-Hee

    2010-07-01

    The force-directed layout is commonly used in computer-generated visualizations of protein-protein interaction networks. While it is good for providing a visual outline of the protein complexes and their interactions, it has two limitations when used as a visual analysis method. The first is poor reproducibility. Repeated running of the algorithm does not necessarily generate the same layout, therefore, demanding cognitive readaptation on the investigator's part. The second limitation is that it does not explicitly display complementary biological information, e.g. Gene Ontology, other than the protein names or gene symbols. Here, we present an alternative layout called the clustered circular layout. Using the human DNA replication protein-protein interaction network as a case study, we compared the two network layouts for their merits and limitations in supporting visual analysis.

  18. Emergence of modularity and disassortativity in protein-protein interaction networks.

    Science.gov (United States)

    Wan, Xi; Cai, Shuiming; Zhou, Jin; Liu, Zengrong

    2010-12-01

    In this paper, we present a simple evolution model of protein-protein interaction networks by introducing a rule of small-preference duplication of a node, meaning that the probability of a node being chosen to duplicate is inversely proportional to its degree, and subsequent divergence plus nonuniform heterodimerization based on some plausible mechanisms in biology. We show that our model can not only reproduce scale-free connectivity and small-world pattern, but also exhibit hierarchical modularity and disassortativity. After comparing the features of our model with those of real protein-protein interaction networks, we believe that our model can provide relevant insights into the mechanism underlying the evolution of protein-protein interaction networks. © 2010 American Institute of Physics.
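
    The growth rule described above (pick a node with probability inversely proportional to its degree, duplicate it, let the copy diverge, then possibly link copy and parent) can be sketched as follows. This is an illustrative simplification, not the authors' model: the parameters `p_keep` and `p_dimer` are hypothetical, heterodimerization is applied with a single uniform probability rather than nonuniformly, and isolated nodes are given weight 1 to avoid division by zero.

```python
import random


def small_preference_duplication(adj, steps, p_keep=0.6, p_dimer=0.3, seed=0):
    """Grow an undirected network by duplication and divergence.

    adj maps integer node ids to sets of neighbour ids. The input is copied,
    so the caller's graph is left untouched.
    """
    rng = random.Random(seed)
    adj = {n: set(nb) for n, nb in adj.items()}
    for _ in range(steps):
        nodes = sorted(adj)
        # Small preference: selection weight is 1 / degree (assumed 1 if isolated).
        weights = [1.0 / max(len(adj[n]), 1) for n in nodes]
        parent = rng.choices(nodes, weights=weights, k=1)[0]
        child = max(nodes) + 1
        # Divergence: the copy keeps each inherited link with probability p_keep.
        kept = {nb for nb in sorted(adj[parent]) if rng.random() < p_keep}
        adj[child] = set(kept)
        for nb in kept:
            adj[nb].add(child)
        # Heterodimerization (simplified): link copy and parent with probability p_dimer.
        if rng.random() < p_dimer:
            adj[child].add(parent)
            adj[parent].add(child)
    return adj


seed_graph = {0: {1}, 1: {0, 2}, 2: {1}}
grown = small_preference_duplication(seed_graph, steps=10)
```

    Degree distributions, clustering and assortativity of the grown graph could then be compared against a real interaction network, which is the kind of comparison the abstract reports.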

  19. Distinctive Behaviors of Druggable Proteins in Cellular Networks.

    Directory of Open Access Journals (Sweden)

    Costas Mitsopoulos

    2015-12-01

    The interaction environment of a protein in a cellular network is important in defining the role that the protein plays in the system as a whole, and thus its potential suitability as a drug target. Despite the importance of the network environment, it is neglected during target selection for drug discovery. Here, we present the first systematic, comprehensive computational analysis of topological, community and graphical network parameters of the human interactome and identify discriminatory network patterns that strongly distinguish drug targets from the interactome as a whole. Importantly, we identify striking differences in the network behavior of targets of cancer drugs versus targets from other therapeutic areas and explore how they may relate to successful drug combinations to overcome acquired resistance to cancer drugs. We develop, computationally validate and provide the first public domain predictive algorithm for identifying druggable neighborhoods based on network parameters. We also make available full predictions for 13,345 proteins to aid target selection for drug discovery. All target predictions are available through canSAR.icr.ac.uk. Underlying data and tools are available at https://cansar.icr.ac.uk/cansar/publications/druggable_network_neighbourhoods/.

  20. Distinctive Behaviors of Druggable Proteins in Cellular Networks.

    Science.gov (United States)

    Mitsopoulos, Costas; Schierz, Amanda C; Workman, Paul; Al-Lazikani, Bissan

    2015-12-01

    The interaction environment of a protein in a cellular network is important in defining the role that the protein plays in the system as a whole, and thus its potential suitability as a drug target. Despite the importance of the network environment, it is neglected during target selection for drug discovery. Here, we present the first systematic, comprehensive computational analysis of topological, community and graphical network parameters of the human interactome and identify discriminatory network patterns that strongly distinguish drug targets from the interactome as a whole. Importantly, we identify striking differences in the network behavior of targets of cancer drugs versus targets from other therapeutic areas and explore how they may relate to successful drug combinations to overcome acquired resistance to cancer drugs. We develop, computationally validate and provide the first public domain predictive algorithm for identifying druggable neighborhoods based on network parameters. We also make available full predictions for 13,345 proteins to aid target selection for drug discovery. All target predictions are available through canSAR.icr.ac.uk. Underlying data and tools are available at https://cansar.icr.ac.uk/cansar/publications/druggable_network_neighbourhoods/.

  1. Analysis of protein folds using protein contact networks

    Indian Academy of Sciences (India)

    range interactions to give rise to the final three-dimensional ... data. As defined by SCOP, there exist several hierarchies. The principal levels are family, superfamily, fold and class. According to SCOP, proteins clustered together into families are ...

  2. Response of the mosquito protein interaction network to dengue infection

    Directory of Open Access Journals (Sweden)

    Pike Andrew D

    2010-06-01

    Background: Two fifths of the world's population is at risk from dengue. The absence of effective drugs and vaccines leaves vector control as the primary intervention tool. Understanding dengue virus (DENV)-host interactions is essential for the development of novel control strategies. The availability of genome sequences for both human and mosquito host greatly facilitates genome-wide studies of DENV-host interactions. Results: We developed the first draft of the mosquito protein interaction network using a computational approach. The weighted network includes 4,214 Aedes aegypti proteins with 10,209 interactions, among which 3,500 proteins are connected into an interconnected scale-free network. We demonstrated the application of this network for the further annotation of mosquito proteins and dissection of pathway crosstalk. Using three datasets based on physical interaction assays, genome-wide RNA interference (RNAi) screens and microarray assays, we identified 714 putative DENV-associated mosquito proteins. An integrated analysis of these proteins in the network highlighted four regions consisting of highly interconnected proteins with closely related functions in each of replication/transcription/translation (RTT), immunity, transport and metabolism. Putative DENV-associated proteins were further selected for validation by RNAi-mediated gene silencing, and dengue viral titer in mosquito midguts was significantly reduced for five out of ten (50.0%) randomly selected genes. Conclusions: Our results indicate the presence of common host requirements for DENV in mosquitoes and humans. We discuss the significance of our findings for pharmacological intervention and genetic modification of mosquitoes for blocking dengue transmission.

  3. A scored human protein-protein interaction network to catalyze genomic interpretation

    DEFF Research Database (Denmark)

    Li, Taibo; Wernersson, Rasmus; Hansen, Rasmus B

    2017-01-01

    Genome-scale human protein-protein interaction networks are critical to understanding cell biology and interpreting genomic data, but challenging to produce experimentally. Through data integration and quality control, we provide a scored human protein-protein interaction network (InWeb_InBioMap, or InWeb_IM) with severalfold more interactions (>500,000) and better functional biological relevance than comparable resources. We illustrate that InWeb_InBioMap enables functional interpretation of >4,700 cancer genomes and genes involved in autism....

  4. Positive Selection and Centrality in the Yeast and Fly Protein-Protein Interaction Networks

    Directory of Open Access Journals (Sweden)

    Sandip Chakraborty

    2016-01-01

    Proteins within a molecular network are expected to be subject to different selective pressures depending on their relative hierarchical positions. However, it is not obvious what genes within a network should be more likely to evolve under positive selection. On one hand, only mutations at genes with a relatively high degree of control over adaptive phenotypes (such as those encoding highly connected proteins) are expected to be “seen” by natural selection. On the other hand, a high degree of pleiotropy at these genes is expected to hinder adaptation. Previous analyses of the human protein-protein interaction network have shown that genes under long-term, recurrent positive selection (as inferred from interspecific comparisons) tend to act at the periphery of the network. It is unknown, however, whether these trends apply to other organisms. Here, we show that long-term positive selection has preferentially targeted the periphery of the yeast interactome. Conversely, in flies, genes under positive selection encode significantly more connected and central proteins. These observations are not due to covariation of genes’ adaptability and centrality with confounding factors. Therefore, the distribution of proteins encoded by genes under recurrent positive selection across protein-protein interaction networks varies from one species to another.

  5. Positive Selection and Centrality in the Yeast and Fly Protein-Protein Interaction Networks.

    Science.gov (United States)

    Chakraborty, Sandip; Alvarez-Ponce, David

    2016-01-01

    Proteins within a molecular network are expected to be subject to different selective pressures depending on their relative hierarchical positions. However, it is not obvious what genes within a network should be more likely to evolve under positive selection. On one hand, only mutations at genes with a relatively high degree of control over adaptive phenotypes (such as those encoding highly connected proteins) are expected to be "seen" by natural selection. On the other hand, a high degree of pleiotropy at these genes is expected to hinder adaptation. Previous analyses of the human protein-protein interaction network have shown that genes under long-term, recurrent positive selection (as inferred from interspecific comparisons) tend to act at the periphery of the network. It is unknown, however, whether these trends apply to other organisms. Here, we show that long-term positive selection has preferentially targeted the periphery of the yeast interactome. Conversely, in flies, genes under positive selection encode significantly more connected and central proteins. These observations are not due to covariation of genes' adaptability and centrality with confounding factors. Therefore, the distribution of proteins encoded by genes under recurrent positive selection across protein-protein interaction networks varies from one species to another.

  6. Droplet networks with incorporated protein diodes show collective properties

    Science.gov (United States)

    Maglia, Giovanni; Heron, Andrew J.; Hwang, William L.; Holden, Matthew A.; Mikhailova, Ellina; Li, Qiuhong; Cheley, Stephen; Bayley, Hagan

    2009-07-01

    Recently, we demonstrated that submicrolitre aqueous droplets submerged in an apolar liquid containing lipid can be tightly connected by means of lipid bilayers to form networks. Droplet interface bilayers have been used for rapid screening of membrane proteins and to form asymmetric bilayers with which to examine the fundamental properties of channels and pores. Networks, meanwhile, have been used to form microscale batteries and to detect light. Here, we develop an engineered protein pore with diode-like properties that can be incorporated into droplet interface bilayers in droplet networks to form devices with electrical properties including those of a current limiter, a half-wave rectifier and a full-wave rectifier. The droplet approach, which uses unsophisticated components (oil, lipid, salt water and a simple pore), can therefore be used to create multidroplet networks with collective properties that cannot be produced by droplet pairs.

  7. Discovering disease-associated genes in weighted protein-protein interaction networks

    Science.gov (United States)

    Cui, Ying; Cai, Meng; Stanley, H. Eugene

    2018-04-01

    Although there have been many network-based attempts to discover disease-associated genes, most of them have not taken edge weight - which quantifies their relative strength - into consideration. We use connection weights in a protein-protein interaction (PPI) network to locate disease-related genes. We analyze the topological properties of both weighted and unweighted PPI networks and design an improved random forest classifier to distinguish disease genes from non-disease genes. We use a cross-validation test to confirm that weighted networks are better able to discover disease-associated genes than unweighted networks, which indicates that including link weight in the analysis of network properties provides a better model of complex genotype-phenotype associations.

  8. Functional module identification in protein interaction networks by interaction patterns

    Science.gov (United States)

    Wang, Yijie; Qian, Xiaoning

    2014-01-01

    Motivation: Identifying functional modules in protein–protein interaction (PPI) networks may shed light on cellular functional organization and thereafter underlying cellular mechanisms. Many existing module identification algorithms aim to detect densely connected groups of proteins as potential modules. However, based on this simple topological criterion of ‘higher than expected connectivity’, those algorithms may miss biologically meaningful modules of functional significance, in which proteins have similar interaction patterns to other proteins in networks but may not be densely connected to each other. A few blockmodel module identification algorithms have been proposed to address the problem but the lack of global optimum guarantee and the prohibitive computational complexity have been the bottleneck of their applications in real-world large-scale PPI networks. Results: In this article, we propose a novel optimization formulation LCP2 (low two-hop conductance sets) using the concept of Markov random walk on graphs, which enables simultaneous identification of both dense and sparse modules based on protein interaction patterns in given networks through searching for LCP2 by random walk. A spectral approximate algorithm SLCP2 is derived to identify non-overlapping functional modules. Based on a bottom-up greedy strategy, we further extend LCP2 to a new algorithm (greedy algorithm for LCP2) GLCP2 to identify overlapping functional modules. We compare SLCP2 and GLCP2 with a range of state-of-the-art algorithms on synthetic networks and real-world PPI networks. The performance evaluation based on several criteria with respect to protein complex prediction, high level Gene Ontology term prediction and especially sparse module detection, has demonstrated that our algorithms based on searching for LCP2 outperform all other compared algorithms. Availability and implementation: All data and code are available at http://www.cse.usf.edu/~xqian/fmi/slcp2hop

  9. Network approaches to the functional analysis of microbial proteins.

    Science.gov (United States)

    Hallinan, J S; James, K; Wipat, A

    2011-01-01

    Large amounts of detailed biological data have been generated over the past few decades. Much of this data is freely available in over 1000 online databases; an enticing, but frustrating resource for microbiologists interested in a systems-level view of the structure and function of microbial cells. The frustration engendered by the need to trawl manually through hundreds of databases in order to accumulate information about a gene, protein, pathway, or organism of interest can be alleviated by the use of computational data integration to generate network views of the system of interest. Biological networks can be constructed from a single type of data, such as protein-protein binding information, or from data generated by multiple experimental approaches. In an integrated network, nodes usually represent genes or gene products, while edges represent some form of interaction between the nodes. Edges between nodes may be weighted to represent the probability that the edge exists in vivo. Networks may also be enriched with ontological annotations, facilitating both visual browsing and computational analysis via web service interfaces. In this review, we describe the construction and analysis of both single-data source and integrated networks, and their application to the inference of protein function in microbes. Copyright © 2011 Elsevier Ltd. All rights reserved.

  10. Context-specific protein network miner - an online system for exploring context-specific protein interaction networks from the literature

    KAUST Repository

    Chowdhary, Rajesh

    2012-04-06

    Background: Protein interaction networks (PINs) specific within a particular context contain crucial information regarding many cellular biological processes. For example, PINs may include information on the type and directionality of interaction (e.g. phosphorylation), location of interaction (i.e. tissues, cells), and related diseases. Currently, very few tools are capable of deriving context-specific PINs for conducting exploratory analysis. Results: We developed a literature-based online system, Context-specific Protein Network Miner (CPNM), which derives context-specific PINs in real-time from the PubMed database based on a set of user-input keywords and enhanced PubMed query system. CPNM reports enriched information on protein interactions (with type and directionality), their network topology with summary statistics (e.g. most densely connected proteins in the network; most densely connected protein-pairs; and proteins connected by most inbound/outbound links) that can be explored via a user-friendly interface. Some of the novel features of the CPNM system include PIN generation, ontology-based PubMed query enhancement, real-time, user-queried, up-to-date PubMed document processing, and prediction of PIN directionality. Conclusions: CPNM provides a tool for biologists to explore PINs. It is freely accessible at http://www.biotextminer.com/CPNM/. © 2012 Chowdhary et al.

  11. Protein networks as logic functions in development and cancer.

    Directory of Open Access Journals (Sweden)

    Janusz Dutkowski

    2011-09-01

    Many biological and clinical outcomes are based not on single proteins, but on modules of proteins embedded in protein networks. A fundamental question is how the proteins within each module contribute to the overall module activity. Here, we study the modules underlying three representative biological programs related to tissue development, breast cancer metastasis, or progression of brain cancer, respectively. For each case we apply a new method, called Network-Guided Forests, to identify predictive modules together with logic functions which tie the activity of each module to the activity of its component genes. The resulting modules implement a diverse repertoire of decision logic which cannot be captured using the simple approximations suggested in previous work such as gene summation or subtraction. We show that in cancer, certain combinations of oncogenes and tumor suppressors exert competing forces on the system, suggesting that medical genetics should move beyond cataloguing individual cancer genes to cataloguing their combinatorial logic.

  12. Network analysis and cross species comparison of protein-protein interaction networks of human, mouse and rat cytochrome P450 proteins that degrade xenobiotics.

    Science.gov (United States)

    Karthikeyan, Bagavathy Shanmugam; Akbarsha, Mohammad Abdulkader; Parthasarathy, Subbiah

    2016-06-21

    Cytochrome P450 (CYP) enzymes that degrade xenobiotics play a critical role in the metabolism and biotransformation of drugs and xenobiotics in humans as well as experimental animal models such as mouse and rat. These proteins function as a network collectively as well as independently. Though there are several reports on the organization, regulation and functionality of various CYP enzymes at the molecular level, the understanding of organization and functionality of these proteins at the holistic level remain unclear. The objective of this study is to understand the organization and functionality of xenobiotic degrading CYP enzymes of human, mouse and rat using network theory approaches and to study species differences that exist among them at the holistic level. For our analysis, a protein-protein interaction (PPI) network for CYP enzymes of human, mouse and rat was constructed using the STRING database. Topology, centrality, modularity and robustness analyses were performed for our predicted CYP PPI networks that were then validated by comparison with randomly generated network models. Network centrality analyses of CYP PPI networks reveal the central/hub proteins in the network. Modular analysis of the CYP PPI networks of human, mouse and rat resulted in functional clusters. These clusters were subjected to ontology and pathway enrichment analysis. The analyses show that the cluster of the human CYP PPI network is enriched with pathways principally related to xenobiotic/drug metabolism. Endo-xenobiotic crosstalk dominated in mouse and rat CYP PPI networks, and they were highly enriched with endogenous metabolic and signaling pathways. Thus, cross-species comparisons and analyses of human, mouse and rat CYP PPI networks gave insights about species differences that existed at the holistic level. More investigations from both reductionist and holistic perspectives can help understand CYP metabolism and species extrapolation in a much better way.

  13. Differential variation patterns between hubs and bottlenecks in human protein-protein interaction networks.

    Science.gov (United States)

    Pang, Erli; Hao, Yu; Sun, Ying; Lin, Kui

    2016-12-01

    The identification, description and understanding of protein-protein networks are important in cell biology and medicine, especially for the study of system biology where the focus concerns the interaction of biomolecules. Hubs and bottlenecks refer to the important proteins of a protein interaction network. Until now, very little attention has been paid to differentiate these two protein groups. By integrating human protein-protein interaction networks and human genome-wide variations across populations, we described the differences between hubs and bottlenecks in this study. Our findings showed that similar to interspecies, hubs and bottlenecks changed significantly more slowly than non-hubs and non-bottlenecks. To distinguish hubs from bottlenecks, we extracted their special members: hub-non-bottlenecks and non-hub-bottlenecks. The differences between these two groups represent what is between hubs and bottlenecks. We found that the variation rate of hubs was significantly lower than that of bottlenecks. In addition, we verified that stronger constraint is exerted on hubs than on bottlenecks. We further observed fewer non-synonymous sites on the domains of hubs than on those of bottlenecks and different molecular functions between them. Based on these results, we conclude that in recent human history, different variation patterns exist in hubs and bottlenecks in protein interaction networks. By revealing the difference between hubs and bottlenecks, our results might provide further insights in the relationship between evolution and biological structure.
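
    The record does not state how hubs and bottlenecks were scored. A common convention, assumed here, is to call proteins with high degree hubs and proteins with high betweenness centrality bottlenecks. The sketch below computes both on a toy graph, using Brandes' algorithm for unweighted betweenness; the graph and node names are made up.

```python
from collections import deque


def betweenness(adj):
    """Brandes' algorithm for an unweighted, undirected graph."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        order = []                          # nodes in order of discovery
        preds = {v: [] for v in adj}        # predecessors on shortest paths
        sigma = {v: 0 for v in adj}         # number of shortest paths from s
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):           # accumulate dependencies back to s
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each unordered pair was counted from both endpoints.
    return {v: b / 2 for v, b in bc.items()}


def clique(names):
    return {n: set(names) - {n} for n in names}


# Two 4-cliques joined through a single connector protein "x".
adj = {**clique("abcd"), **clique("efgh"), "x": {"d", "e"}}
adj["d"].add("x")
adj["e"].add("x")

degree = {v: len(nb) for v, nb in adj.items()}
bc = betweenness(adj)
```

    In this toy graph "x" has the lowest degree but the highest betweenness, so it plays the role of a non-hub-bottleneck, while "a" has a higher degree than "x" but zero betweenness.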

  14. Deciphering the protein-protein interaction network regulating hepatocellular carcinoma metastasis.

    Science.gov (United States)

    Qin, Guoxuan; Dang, Mengjiao; Gao, Huajun; Wang, Hao; Luo, Fengting; Chen, Ruibing

    2017-09-01

    Hepatocellular carcinoma (HCC) is one of the leading causes of mortality related to cancer all over the world. To better understand the molecular mechanisms of HCC metastasis, we analyzed the proteome of three HCC cell lines with different metastasis potentials by quantitative proteomics and bioinformatics analysis. As a result, we identified 378 cellular proteins potentially associated to HCC metastasis, and constructed a highly connected protein-protein interaction (PPI) network. Functional annotation of the network uncovered prominent pathways and key roles of these proteins, suggesting that the metabolism and cytoskeleton biological processes are greatly involved with HCC metastasis. Furthermore, the integrative network analysis revealed a rich-club organization within the PPI network, indicating a hub center of connections. The rich-club nodes include several well-known cancer-related proteins, such as proto-oncogene non-receptor tyrosine kinase (SRC) and pyruvate kinase M2 (PKM2). Moreover, the differential expressions of two identified proteins, including PKM2 and actin-related protein 2/3 complex subunit 4 (ARPC4), were validated using Western blotting. These two proteins were revealed as potential prognostic markers for HCC as shown by survival rate analysis. Copyright © 2017. Published by Elsevier B.V.

  15. NatalieQ: a web server for protein-protein interaction network querying.

    Science.gov (United States)

    El-Kebir, Mohammed; Brandt, Bernd W; Heringa, Jaap; Klau, Gunnar W

    2014-04-01

    Molecular interactions need to be taken into account to adequately model the complex behavior of biological systems. These interactions are captured by various types of biological networks, such as metabolic, gene-regulatory, signal transduction and protein-protein interaction networks. We recently developed Natalie, which computes high-quality network alignments via advanced methods from combinatorial optimization. Here, we present NatalieQ, a web server for topology-based alignment of a specified query protein-protein interaction network to a selected target network using the Natalie algorithm. By incorporating similarity at both the sequence and the network level, we compute alignments that allow for the transfer of functional annotation as well as for the prediction of missing interactions. We illustrate the capabilities of NatalieQ with a biological case study involving the Wnt signaling pathway. We show that topology-based network alignment can produce results complementary to those obtained by using sequence similarity alone. We also demonstrate that NatalieQ is able to predict putative interactions. The server is available at: http://www.ibi.vu.nl/programs/natalieq/.

  16. The topology of the bacterial co-conserved protein network and its implications for predicting protein function

    Directory of Open Access Journals (Sweden)

    Leach Sonia M

    2008-06-01

    Background: Protein-protein interaction networks are most often generated from physical protein-protein interaction data. Co-conservation, also known as phylogenetic profiles, is an alternative source of information for generating protein interaction networks. Co-conservation methods generate interaction networks among proteins that are gained or lost together through evolution. Co-conservation is a particularly useful technique in the compact bacteria genomes. Prior studies in yeast suggest that the topology of protein-protein interaction networks generated from physical interaction assays can offer important insight into protein function. Here, we hypothesize that in bacteria, the topology of protein interaction networks derived via co-conservation information could similarly improve methods for predicting protein function. Since the topology of bacteria co-conservation protein-protein interaction networks has not previously been studied in depth, we first perform such an analysis for co-conservation networks in E. coli K12. Next, we demonstrate one way in which network connectivity measures and global and local function distribution can be exploited to predict protein function for previously uncharacterized proteins. Results: Our results showed that, like most biological networks, our bacteria co-conserved protein-protein interaction networks had scale-free topologies. Our results indicated that some properties of the physical yeast interaction network hold in our bacteria co-conservation networks, such as high connectivity for essential proteins. However, the high connectivity among protein complexes in the yeast physical network was not seen in the co-conservation network which uses all bacteria as the reference set. We found that the distribution of node connectivity varied by functional category and could be informative for function prediction. By integrating of functional information from different annotation sources and using the

  17. Deep recurrent conditional random field network for protein secondary prediction

    DEFF Research Database (Denmark)

    Johansen, Alexander Rosenberg; Sønderby, Søren Kaae; Sønderby, Casper Kaae

    2017-01-01

    Deep learning has become the state-of-the-art method for predicting protein secondary structure from only its amino acid residues and sequence profile. Building upon these results, we propose to combine a bi-directional recurrent neural network (biRNN) with a conditional random field (CRF), which...

  18. Analysis of core–periphery organization in protein contact networks ...

    Indian Academy of Sciences (India)

    2015-09-29

    Sep 29, 2015 ... Caenorhabditis elegans (Chatterjee and Sinha 2007) and the protein interaction network of Escherichia coli (Lin et al. 2009). Recently, this decomposition technique has been used to disentangle the hierarchical structure of Internet router-level connection topology (Zhang et al. 2009), to show that software ...

  19. Analysis of core–periphery organization in protein contact networks ...

    Indian Academy of Sciences (India)

    From mutation sensitivity analysis, we show that the probability of deleterious or intolerant mutations also increases with the core order. We also show that stabilization centre residues are in the innermost cores, suggesting that the network core is critically important in maintaining the structural stability of the protein.
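
The "core order" in the two records above can be illustrated with a small sketch. It assumes the decomposition meant is the standard k-core decomposition, in which the lowest-degree node is peeled away repeatedly. The function name `core_numbers` and the toy contact network are hypothetical and are not taken from the paper.

```python
def core_numbers(adj):
    """Return the core number of every node by repeatedly peeling the minimum-degree node."""
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        # Pick the node with the smallest remaining degree.
        v = min(remaining, key=lambda n: degree[n])
        # The core order never decreases as peeling proceeds.
        k = max(k, degree[v])
        core[v] = k
        remaining.remove(v)
        for u in adj[v]:
            if u in remaining:
                degree[u] -= 1
    return core


# Hypothetical toy contact network: a 4-residue clique (a-d) plus a pendant chain (e, f).
adj = {
    "a": {"b", "c", "d"},
    "b": {"a", "c", "d"},
    "c": {"a", "b", "d", "e"},
    "d": {"a", "b", "c"},
    "e": {"c", "f"},
    "f": {"e"},
}
core = core_numbers(adj)
```

In this toy graph the clique residues end up in the innermost (order 3) core, while the chain residues stay at order 1. This simple version rescans for the minimum at every step, so it is quadratic in the number of nodes; bucket-based implementations are faster for large networks.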

  20. Predicting protein complex in protein interaction network - a supervised learning based method.

    Science.gov (United States)

    Yu, Feng; Yang, Zhi; Tang, Nan; Lin, Hong; Wang, Jian; Yang, Zhi

    2014-01-01

    Protein complexes are important for understanding principles of cellular organization and function. High-throughput experimental techniques have produced a large number of protein interactions, making it possible to predict protein complexes from protein-protein interaction networks. However, most current methods are unsupervised and cannot utilize the information in the large number of available known complexes. We present a supervised learning-based method for predicting protein complexes in protein-protein interaction networks. The method extracts rich features from both the unweighted and weighted networks to train a regression model, which is then used for clique filtering, growth, and candidate complex filtering. The model utilizes additional "uncertainty" samples and, therefore, is more discriminative when used in the complex detection algorithm. In addition, our method uses the maximal cliques found by the Cliques algorithm as the initial cliques, which has been proven to be more effective than the method of expanding from the seeding proteins used in other methods. The experimental results on several PIN datasets show that in most cases the performance of our method is superior to comparable state-of-the-art protein complex detection techniques. The results demonstrate several advantages of our method over other state-of-the-art techniques. Firstly, our method is a supervised learning-based method that can make full use of the information of the available known complexes instead of being only based on the topological structure of the PIN. This also means that, if more training samples are provided, our method can achieve better performance than those unsupervised methods. Secondly, we design the rich feature set to describe the properties of the known complexes, which includes not only the features from the unweighted network, but also those from the weighted network built based on the Gene Ontology information.
Thirdly

  1. FACETS: multi-faceted functional decomposition of protein interaction networks

    Science.gov (United States)

    Seah, Boon-Siew; Bhowmick, Sourav S.; Forbes Dewey, C.

    2012-01-01

    Motivation: The availability of large-scale curated protein interaction datasets has given rise to the opportunity to investigate higher level organization and modularity within the protein–protein interaction (PPI) network using graph theoretic analysis. Despite the recent progress, systems level analysis of high-throughput PPIs remains a daunting task because of the amount of data they present. In this article, we propose a novel PPI network decomposition algorithm called FACETS in order to make sense of the deluge of interaction data using Gene Ontology (GO) annotations. FACETS finds not just a single functional decomposition of the PPI network, but a multi-faceted atlas of functional decompositions that portray alternative perspectives of the functional landscape of the underlying PPI network. Each facet in the atlas represents a distinct interpretation of how the network can be functionally decomposed and organized. Our algorithm maximizes interpretative value of the atlas by optimizing inter-facet orthogonality and intra-facet cluster modularity. Results: We tested our algorithm on the global networks from IntAct, and compared it with gold standard datasets from MIPS and KEGG. We demonstrated the performance of FACETS. We also performed a case study that illustrates the utility of our approach. Contact: seah0097@ntu.edu.sg or assourav@ntu.edu.sg Supplementary information: Supplementary data are available at Bioinformatics online. Availability: Our software is available freely for non-commercial purposes from: http://www.cais.ntu.edu.sg/∼assourav/Facets/ PMID:22908217

  2. Amyloid precursor protein interaction network in human testis: sentinel proteins for male reproduction.

    Science.gov (United States)

    Silva, Joana Vieira; Yoon, Sooyeon; Domingues, Sara; Guimarães, Sofia; Goltsev, Alexander V; da Cruz E Silva, Edgar Figueiredo; Mendes, José Fernando F; da Cruz E Silva, Odete Abreu Beirão; Fardilha, Margarida

    2015-01-16

    Amyloid precursor protein (APP) is widely recognized for playing a central role in Alzheimer's disease pathogenesis. Although APP is expressed in several tissues outside the human central nervous system, the functions of APP and its family members in other tissues are still poorly understood. APP is involved in several biological functions which might be potentially important for male fertility, such as cell adhesion, cell motility, signaling, and apoptosis. Furthermore, APP superfamily members are known to be associated with fertility. Knowledge of the protein networks of APP in human testis and spermatozoa will shed light on the function of APP in the male reproductive system. We performed a Yeast Two-Hybrid screen and a database search to study the interaction network of APP in human testis and sperm. To gain insights into the role of APP superfamily members in fertility, the study was extended to APP-like protein 2 (APLP2). We analyzed several topological properties of the APP interaction network, and the biological and physiological properties of the proteins in the APP interaction network were also specified by gene ontology and pathway analyses. We classified significant features related to human male reproduction for the APP interacting proteins and identified modules of proteins with similar functional roles which may show cooperative behavior for male fertility. The present work provides the first report on the APP interactome in human testis. Our approach allowed the identification of novel interactions and recognition of key APP interacting proteins for male reproduction, particularly in sperm-oocyte interaction.

  3. AtPIN: Arabidopsis thaliana Protein Interaction Network

    Directory of Open Access Journals (Sweden)

    Silva-Filho Marcio C

    2009-12-01

    Background: Protein-protein interactions (PPIs) constitute one of the most crucial conditions to sustain life in living organisms. To study PPIs in Arabidopsis thaliana, we have developed AtPIN, a database and web interface for searching and building interaction networks based on publicly available protein-protein interaction datasets. Description: All interactions were classified as experimentally demonstrated or predicted. The PPIs in the AtPIN database carry a cellular compartment classification (C3), which divides the PPIs into 4 classes according to interaction evidence and subcellular localization. It has been shown in the literature that a pair of genuinely interacting proteins is generally expected to have a common cellular role, and that proteins with common interaction partners have a high chance of sharing a common function. In AtPIN, due to its integrative profile, the reliability index for a reported PPI can be postulated in terms of the proportion of interaction partners that two proteins have in common. For this, we implement the Functional Similarity Weight (FSW) calculation for all first-level interactions present in the AtPIN database. In order to identify target proteins of cytosolic glutamyl-tRNA synthetase (Cyt-gluRS) (AT5G26710), we combined two approaches, AtPIN search and yeast two-hybrid screening. Interestingly, the proteins glutamine synthetase (AT5G35630), a disease resistance protein (AT3G50950) and a zinc finger protein (AT5G24930), which had been predicted as target proteins for Cyt-gluRS by AtPIN, were also detected in the experimental screening. Conclusions: AtPIN is a friendly and easy-to-use tool that aggregates information on Arabidopsis thaliana PPIs, ontology, and sub-cellular localization, and might be a useful and reliable strategy to map protein-protein interactions in Arabidopsis. AtPIN can be accessed at http://bioinfo.esalq.usp.br/atpin.
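
The idea of scoring a reported interaction by the proportion of partners two proteins share can be sketched as follows. This is deliberately not the FSW formula used by AtPIN, which is not given in the abstract. It is a simpler Jaccard-style stand-in, and the function name, the protein labels, and the choice to count each protein as a member of its own neighbourhood are all assumptions made for illustration.

```python
def shared_partner_fraction(adj, u, v):
    """Jaccard-style fraction of interaction partners shared by proteins u and v.

    Each protein is counted as part of its own neighbourhood, so two directly
    interacting proteins with no other partners still score above zero.
    """
    nu = adj[u] | {u}
    nv = adj[v] | {v}
    return len(nu & nv) / len(nu | nv)


# Hypothetical toy interaction network (symmetric adjacency sets).
adj = {
    "p1": {"p2", "p3", "p4"},
    "p2": {"p1", "p3", "p4"},
    "p3": {"p1", "p2"},
    "p4": {"p1", "p2", "p5"},
    "p5": {"p4"},
}
well_supported = shared_partner_fraction(adj, "p1", "p2")    # all partners shared
weakly_supported = shared_partner_fraction(adj, "p4", "p5")  # only each other
```

A higher value indicates that the two proteins sit in the same neighbourhood of the network, which is the intuition behind treating shared partners as evidence of reliability.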

  4. Prediction of Protein-Protein Interacting Sites: How to Bridge Molecular Events to Large Scale Protein Interaction Networks

    Science.gov (United States)

    Bartoli, Lisa; Martelli, Pier Luigi; Rossi, Ivan; Fariselli, Piero; Casadio, Rita

    Most cellular functions are the result of the concerted action of protein complexes forming pathways and networks. For this reason, efforts were devoted to the study of protein-protein interactions. Large-scale experiments on whole genomes allowed the identification of interacting protein pairs. However, the residues involved in the interaction are generally not known and the majority of the interactions still lack a structural characterization. A crucial step towards deciphering the interaction mechanism of proteins is the recognition of their interacting surfaces, particularly in those structures for which even the most recent interaction network resources contain no information. To this purpose, we developed a neural network-based method that is able to characterize protein complexes by predicting the amino acid residues that mediate the interactions. All the Protein Data Bank (PDB) chains, both in the unbound and in the complexed form, are predicted and the results are stored in a database of interaction surfaces (http://gpcr.biocomp.unibo.it/zenpatches). Finally, we performed a survey on the different computational methods for protein-protein interaction prediction and on their training/testing sets in order to highlight the most informative properties of protein interfaces.

  5. Evolution of an intricate J-protein network driving protein disaggregation in eukaryotes.

    Science.gov (United States)

    Nillegoda, Nadinath B; Stank, Antonia; Malinverni, Duccio; Alberts, Niels; Szlachcic, Anna; Barducci, Alessandro; De Los Rios, Paolo; Wade, Rebecca C; Bukau, Bernd

    2017-05-15

    Hsp70 participates in a broad spectrum of protein folding processes extending from nascent chain folding to protein disaggregation. This versatility in function is achieved through a diverse family of J-protein cochaperones that select substrates for Hsp70. Substrate selection is further tuned by transient complexation between different classes of J-proteins, which expands the range of protein aggregates targeted by metazoan Hsp70 for disaggregation. We assessed the prevalence and evolutionary conservation of J-protein complexation and cooperation in disaggregation. We find the emergence of a eukaryote-specific signature for interclass complexation of canonical J-proteins. Consistently, complexes exist in yeast and human cells, but not in bacteria, and correlate with cooperative action in disaggregation in vitro. Signature alterations exclude some J-proteins from networking, which ensures correct J-protein pairing, functional network integrity and J-protein specialization. This fundamental change in J-protein biology during the prokaryote-to-eukaryote transition allows for increased fine-tuning and broadening of Hsp70 function in eukaryotes.

  6. Prioritizing disease candidate proteins in cardiomyopathy-specific protein-protein interaction networks based on "guilt by association" analysis.

    Directory of Open Access Journals (Sweden)

    Wan Li

    The cardiomyopathies are a group of heart muscle diseases which can be inherited (familial). Identifying potential disease-related proteins is important to understand mechanisms of cardiomyopathies. Experimental identification of cardiomyopathies is costly and labour-intensive. In contrast, a bioinformatics approach has a competitive advantage over experimental methods. Based on "guilt by association" analysis, we prioritized candidate proteins involved in human cardiomyopathies. We first built weighted human cardiomyopathy-specific protein-protein interaction networks for three subtypes of cardiomyopathies using the known disease proteins from Online Mendelian Inheritance in Man as seeds. We then developed a method for prioritizing disease candidate proteins to rank candidate proteins in the network based on "guilt by association" analysis. It was found that most candidate proteins with high scores shared disease-related pathways with disease seed proteins. These top ranked candidate proteins were related to the corresponding disease subtypes, and were potential disease-related proteins. Cross-validation and comparison with other methods indicated that our approach could be used for the identification of potentially novel disease proteins, which may provide insights into cardiomyopathy-related mechanisms in a more comprehensive and integrated way.
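
A minimal sketch of "guilt by association" ranking is given below. The scoring rule (sum of edge weights linking a candidate to known seed proteins) is an assumption chosen for illustration; the abstract does not state the authors' scoring function. The protein names and weights are hypothetical.

```python
def rank_candidates(weighted_edges, seeds):
    """Rank non-seed proteins by the summed weight of their edges to seed proteins."""
    scores = {}
    for u, v, w in weighted_edges:
        # Each undirected edge is considered from both endpoints.
        for cand, other in ((u, v), (v, u)):
            if other in seeds and cand not in seeds:
                scores[cand] = scores.get(cand, 0.0) + w
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical weighted disease-specific network.
edges = [
    ("SEED1", "candA", 0.75),
    ("SEED2", "candA", 0.5),
    ("SEED1", "candB", 0.25),
    ("candB", "candC", 0.5),
    ("SEED1", "SEED2", 1.0),
]
seeds = {"SEED1", "SEED2"}
ranking = rank_candidates(edges, seeds)
```

Here `candC` receives no score because it has no direct link to a seed. Methods that propagate scores over longer paths (for example random walks) would also rank such indirect neighbours.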

  7. Completing sparse and disconnected protein-protein network by deep learning.

    Science.gov (United States)

    Huang, Lei; Liao, Li; Wu, Cathy H

    2018-03-22

    Protein-protein interaction (PPI) prediction remains a central task in systems biology to achieve a better and holistic understanding of cellular and intracellular processes. Recently, an increasing number of computational methods have shifted from pair-wise prediction to network level prediction. Many of the existing network level methods predict PPIs under the assumption that the training network should be connected. However, this assumption greatly affects the prediction power and limits the application area because the current gold-standard PPI networks are usually very sparse and disconnected. Therefore, how to effectively predict PPIs based on a training network that is sparse and disconnected remains a challenge. In this work, we developed a novel PPI prediction method based on a deep neural network and a regularized Laplacian kernel. We use a neural network with an autoencoder-like architecture to implicitly simulate the evolutionary processes of a PPI network. Neurons of the output layer correspond to proteins and are labeled with values (1 for interaction and 0 otherwise) from the adjacency matrix of a sparse disconnected training PPI network. Unlike in an autoencoder, neurons at the input layer are given all-zero input, reflecting an assumption of no a priori knowledge about PPIs, and hidden layers of smaller sizes mimic the ancient interactome at different times during evolution. After the training step, an evolved PPI network whose rows are outputs of the neural network can be obtained. We then predict PPIs by applying the regularized Laplacian kernel to the transition matrix that is built upon the evolved PPI network. The results from cross-validation experiments show that the PPI prediction accuracies for yeast data and human data measured as AUC are increased by up to 8.4% and 14.9%, respectively, as compared to the baseline.
Moreover, the evolved PPI network can also help us leverage complementary information from the disconnected training network
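
For readers unfamiliar with the regularized Laplacian kernel mentioned in this record, its usual textbook definition is written out below. The symbols $A$ (a symmetric, non-negative weight matrix over proteins), $D$, $L$ and $\alpha$ are notation introduced here. How the authors derive their matrix from the evolved network is not specified in the abstract, so this is a sketch of the general kernel only.

```latex
\begin{align}
  L &= D - A, \qquad D_{ii} = \sum_j A_{ij}, \\
  K &= (I + \alpha L)^{-1}
     = \sum_{k=0}^{\infty} (-\alpha)^k L^k
     \quad \text{(series valid when } \alpha\,\rho(L) < 1\text{)}.
\end{align}
```

Because $L$ is positive semidefinite, the inverse exists for every $\alpha > 0$. The entry $K_{ij}$ can then be read as a similarity score for the protein pair $(i, j)$ that accumulates contributions from paths of all lengths, which is why such kernels are used to score candidate interactions.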

  8. A Physical Interaction Network of Dengue Virus and Human Proteins

    Science.gov (United States)

    Khadka, Sudip; Vangeloff, Abbey D.; Zhang, Chaoying; Siddavatam, Prasad; Heaton, Nicholas S.; Wang, Ling; Sengupta, Ranjan; Sahasrabudhe, Sudhir; Randall, Glenn; Gribskov, Michael; Kuhn, Richard J.; Perera, Rushika; LaCount, Douglas J.

    2011-01-01

    Dengue virus (DENV), an emerging mosquito-transmitted pathogen capable of causing severe disease in humans, interacts with host cell factors to create a more favorable environment for replication. However, few interactions between DENV and human proteins have been reported to date. To identify DENV-human protein interactions, we used high-throughput yeast two-hybrid assays to screen the 10 DENV proteins against a human liver activation domain library. From 45 DNA-binding domain clones containing either full-length viral genes or partially overlapping gene fragments, we identified 139 interactions between DENV and human proteins, the vast majority of which are novel. These interactions involved 105 human proteins, including six previously implicated in DENV infection and 45 linked to the replication of other viruses. Human proteins with functions related to the complement and coagulation cascade, the centrosome, and the cytoskeleton were enriched among the DENV interaction partners. To determine if the cellular proteins were required for DENV infection, we used small interfering RNAs to inhibit their expression. Six of 12 proteins targeted (CALR, DDX3X, ERC1, GOLGA2, TRIP11, and UBE2I) caused a significant decrease in the replication of a DENV replicon. We further showed that calreticulin colocalized with viral dsRNA and with the viral NS3 and NS5 proteins in DENV-infected cells, consistent with a direct role for calreticulin in DENV replication. Human proteins that interacted with DENV had significantly higher average degree and betweenness than expected by chance, which provides additional support for the hypothesis that viruses preferentially target cellular proteins that occupy central position in the human protein interaction network. This study provides a valuable starting point for additional investigations into the roles of human proteins in DENV infection. PMID:21911577
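
The claim that targeted proteins have a higher average degree "than expected by chance" can be checked with a permutation test, sketched below. The abstract does not say which test the authors used, so the sampling scheme, the function names and the toy network are assumptions. Betweenness is omitted to keep the example short.

```python
import random


def build_adjacency(edges):
    """Build an undirected adjacency map from a list of (u, v) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def mean_degree(adj, nodes):
    return sum(len(adj[n]) for n in nodes) / len(nodes)


def degree_permutation_pvalue(adj, targets, trials=2000, seed=0):
    """Estimate how often a random node set of equal size matches the observed mean degree."""
    rng = random.Random(seed)
    observed = mean_degree(adj, targets)
    population = sorted(adj)
    hits = 0
    for _ in range(trials):
        sample = rng.sample(population, len(targets))
        if mean_degree(adj, sample) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return observed, (hits + 1) / (trials + 1)


# Hypothetical host network: two hubs (h1, h2) and ten low-degree proteins.
edges = [("h1", "h2")]
edges += [("h1", f"n{i}") for i in range(1, 7)]
edges += [("h2", f"n{i}") for i in range(5, 11)]
adj = build_adjacency(edges)
observed, p_value = degree_permutation_pvalue(adj, ["h1", "h2"])
```

In this toy case only one of the 66 possible node pairs reaches the observed mean degree, so the estimated p-value is small.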

  9. A physical interaction network of dengue virus and human proteins.

    Science.gov (United States)

    Khadka, Sudip; Vangeloff, Abbey D; Zhang, Chaoying; Siddavatam, Prasad; Heaton, Nicholas S; Wang, Ling; Sengupta, Ranjan; Sahasrabudhe, Sudhir; Randall, Glenn; Gribskov, Michael; Kuhn, Richard J; Perera, Rushika; LaCount, Douglas J

    2011-12-01

    Dengue virus (DENV), an emerging mosquito-transmitted pathogen capable of causing severe disease in humans, interacts with host cell factors to create a more favorable environment for replication. However, few interactions between DENV and human proteins have been reported to date. To identify DENV-human protein interactions, we used high-throughput yeast two-hybrid assays to screen the 10 DENV proteins against a human liver activation domain library. From 45 DNA-binding domain clones containing either full-length viral genes or partially overlapping gene fragments, we identified 139 interactions between DENV and human proteins, the vast majority of which are novel. These interactions involved 105 human proteins, including six previously implicated in DENV infection and 45 linked to the replication of other viruses. Human proteins with functions related to the complement and coagulation cascade, the centrosome, and the cytoskeleton were enriched among the DENV interaction partners. To determine if the cellular proteins were required for DENV infection, we used small interfering RNAs to inhibit their expression. Six of 12 proteins targeted (CALR, DDX3X, ERC1, GOLGA2, TRIP11, and UBE2I) caused a significant decrease in the replication of a DENV replicon. We further showed that calreticulin colocalized with viral dsRNA and with the viral NS3 and NS5 proteins in DENV-infected cells, consistent with a direct role for calreticulin in DENV replication. Human proteins that interacted with DENV had significantly higher average degree and betweenness than expected by chance, which provides additional support for the hypothesis that viruses preferentially target cellular proteins that occupy central position in the human protein interaction network. This study provides a valuable starting point for additional investigations into the roles of human proteins in DENV infection.

  10. What properties characterize the hub proteins of the protein-protein interaction network of Saccharomyces cerevisiae?

    OpenAIRE

    Ekman, Diana; Light, Sara; Björklund, Åsa K; Elofsson, Arne

    2006-01-01

    Background Most proteins interact with only a few other proteins while a small number of proteins (hubs) have many interaction partners. Hub proteins and non-hub proteins differ in several respects; however, understanding is not complete about what properties characterize the hubs and set them apart from proteins of low connectivity. Therefore, we have investigated what differentiates hubs from non-hubs and static hubs (party hubs) from dynamic hubs (date hubs) in the protein-protein interact...

  11. Convolutional LSTM Networks for Subcellular Localization of Proteins

    DEFF Research Database (Denmark)

    Nielsen, Henrik; Sønderby, Søren Kaae; Sønderby, Casper Kaae

    2015-01-01

    Machine learning is widely used to analyze biological sequence data. Non-sequential models such as SVMs or feed-forward neural networks are often used although they have no natural way of handling sequences of varying length. Recurrent neural networks such as the long short term memory (LSTM) model...... convolutional filters and experiment with an attention mechanism which lets the LSTM focus on specific parts of the protein. Lastly we introduce new visualizations of both the convolutional filters and the attention mechanisms and show how they can be used to extract biologically relevant knowledge from...

  12. Probing RNA-protein networks: biochemistry meets genomics.

    Science.gov (United States)

    Campbell, Zachary T; Wickens, Marvin

    2015-03-01

    RNA-protein interactions are pervasive. The specificity of these interactions dictates which RNAs are controlled by what protein. Here we describe a class of revolutionary new methods that enable global views of RNA-binding specificity in vitro, for both single proteins and multiprotein complexes. These methods provide insight into central issues in RNA regulation in living cells, including understanding the balance between free and bound components, the basis for exclusion of binding sites, detection of binding events in the absence of discernible regulatory elements, and new approaches to targeting endogenous transcripts by design. Comparisons of in vitro and in vivo binding provide a foundation for comprehensive understanding of the biochemistry of protein-mediated RNA regulatory networks. Copyright © 2015 Elsevier Ltd. All rights reserved.

  13. Protein Kinase C Epsilon and Genetic Networks in Osteosarcoma Metastasis

    International Nuclear Information System (INIS)

    Goudarzi, Atta; Gokgoz, Nalan; Gill, Mona; Pinnaduwage, Dushanthi; Merico, Daniele; Wunder, Jay S.; Andrulis, Irene L.

    2013-01-01

    Osteosarcoma (OS) is the most common primary malignant tumor of the bone, and pulmonary metastasis is the most frequent cause of OS mortality. The aim of this study was to discover and characterize genetic networks differentially expressed in metastatic OS. Expression profiling of OS tumors, and subsequent supervised network analysis, was performed to discover genetic networks differentially activated or organized in metastatic OS compared to localized OS. Broad trends among the profiles of metastatic tumors include aberrant activity of intracellular organization and translation networks, as well as disorganization of metabolic networks. The differentially activated PRKCε-RASGRP3-GNB2 network, which interacts with the disorganized DLG2 hub, was also found to be differentially expressed among OS cell lines with differing metastatic capacity in xenograft models. PRKCε transcript was more abundant in some metastatic OS tumors; however, the difference was not significant overall. In functional studies, PRKCε was not found to be involved in migration of M132 OS cells, but its protein expression was induced in M112 OS cells following IGF-1 stimulation.

  14. Detection of Locally Over-Represented GO Terms in Protein-Protein Interaction Networks

    Science.gov (United States)

    Lavallée-Adam, Mathieu; Coulombe, Benoit; Blanchette, Mathieu

    2015-01-01

    High-throughput methods for identifying protein-protein interactions produce increasingly complex and intricate interaction networks. These networks are extremely rich in information, but extracting biologically meaningful hypotheses from them and representing them in a human-readable manner is challenging. We propose a method to identify Gene Ontology terms that are locally over-represented in a subnetwork of a given biological network. Specifically, we propose several methods to evaluate the degree of clustering of proteins associated to a particular GO term in both weighted and unweighted PPI networks, and describe efficient methods to estimate the statistical significance of the observed clustering. We show, using Monte Carlo simulations, that our best approximation methods accurately estimate the true p-value, for random scale-free graphs as well as for actual yeast and human networks. When applied to these two biological networks, our approach recovers many known complexes and pathways, but also suggests potential functions for many subnetworks. Online Supplementary Material is available at www.liebertonline.com. PMID:20377456
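
One way to make "degree of clustering" and its significance concrete is sketched below: the clustering of a GO term is measured as the mean shortest-path distance between its annotated proteins, and a plain Monte Carlo estimate compares it with random protein sets of the same size. Both the statistic and the sampling are assumptions made for illustration. The paper describes its own measures and faster approximation methods, which are not reproduced here.

```python
import random
from collections import deque
from itertools import combinations


def bfs_distances(adj, source):
    """Unweighted shortest-path distances from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for u in adj[v]:
            if u not in dist:
                dist[u] = dist[v] + 1
                queue.append(u)
    return dist


def mean_pairwise_distance(adj, nodes):
    """Average shortest-path length over all pairs in nodes (connected graph assumed)."""
    nodes = sorted(nodes)
    dist = {n: bfs_distances(adj, n) for n in nodes}
    pairs = list(combinations(nodes, 2))
    return sum(dist[a][b] for a, b in pairs) / len(pairs)


def clustering_pvalue(adj, annotated, trials=1000, seed=0):
    """Fraction of random equal-size sets that are at least as tightly clustered."""
    rng = random.Random(seed)
    observed = mean_pairwise_distance(adj, annotated)
    population = sorted(adj)
    hits = sum(
        mean_pairwise_distance(adj, rng.sample(population, len(annotated))) <= observed
        for _ in range(trials)
    )
    return observed, (hits + 1) / (trials + 1)


# Hypothetical network: a path of ten proteins, three adjacent ones share a GO term.
adj = {i: set() for i in range(10)}
for i in range(9):
    adj[i].add(i + 1)
    adj[i + 1].add(i)
observed, p_value = clustering_pvalue(adj, {3, 4, 5})
```

Smaller mean distances indicate tighter clustering, so the test counts random sets whose mean distance is no larger than the observed one.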

  15. Evolution versus "intelligent design": comparing the topology of protein-protein interaction networks to the Internet.

    Science.gov (United States)

    Yang, Q; Siganos, G; Faloutsos, M; Lonardi, S

    2006-01-01

    Recent research efforts have made available genome-wide, high-throughput protein-protein interaction (PPI) maps for several model organisms. This has enabled the systematic analysis of PPI networks, which has become one of the primary challenges for the system biology community. In this study, we attempt to understand better the topological structure of PPI networks by comparing them against man-made communication networks, and more specifically, the Internet. Our comparative study is based on a comprehensive set of graph metrics. Our results exhibit an interesting dichotomy. On the one hand, both networks share several macroscopic properties such as scale-free and small-world properties. On the other hand, the two networks exhibit significant topological differences, such as the cliqueishness of the highest degree nodes. We attribute these differences to the distinct design principles and constraints that both networks are assumed to satisfy. We speculate that the evolutionary constraints that favor the survivability and diversification are behind the building process of PPI networks, whereas the leading force in shaping the Internet topology is a decentralized optimization process geared towards efficient node communication.
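
Among the graph metrics such a comparison relies on, the local clustering coefficient is one that captures the "cliqueishness" around a node. The sketch below computes it for a toy graph. The abstract does not list the authors' exact metric set, so this is one representative metric, and the graph is hypothetical.

```python
def local_clustering(adj, v):
    """Fraction of pairs of v's neighbours that are themselves connected."""
    nbrs = list(adj[v])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(
        1
        for i in range(k)
        for j in range(i + 1, k)
        if nbrs[j] in adj[nbrs[i]]
    )
    return 2.0 * links / (k * (k - 1))


# Hypothetical toy graph: a triangle (a, b, c) with one extra node d attached to a.
adj = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}
clustering = {v: local_clustering(adj, v) for v in adj}
```

Comparing this value for the highest-degree nodes of two networks is one simple way to see whether hubs tend to sit inside dense neighbourhoods or to connect otherwise unlinked nodes.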

  16. PROSHIFT: Protein chemical shift prediction using artificial neural networks

    International Nuclear Information System (INIS)

    Meiler, Jens

    2003-01-01

    The importance of protein chemical shift values for the determination of three-dimensional protein structure has increased in recent years because of the large databases of protein structures with assigned chemical shift data. These databases have allowed the investigation of the quantitative relationship between chemical shift values obtained by liquid state NMR spectroscopy and the three-dimensional structure of proteins. A neural network was trained to predict the ¹H, ¹³C, and ¹⁵N chemical shifts of proteins using their three-dimensional structure as well as experimental conditions as input parameters. It achieves root mean square deviations of 0.3 ppm for hydrogen, 1.3 ppm for carbon, and 2.6 ppm for nitrogen chemical shifts. The model reflects important influences of the covalent structure as well as of the conformation not only for backbone atoms (as, e.g., the chemical shift index) but also for side-chain nuclei. For protein models with an RMSD smaller than 5 Å, a correlation of the RMSD and the r.m.s. deviation between the predicted and the experimental chemical shift is obtained. Thus the method has the potential to not only support the assignment process of proteins but also help with the validation and the refinement of three-dimensional structural proposals. It is freely available for academic users at the PROSHIFT server: www.jens-meiler.de/proshift.html
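
The root mean square deviations quoted above follow the usual definition, which the short sketch below spells out. The shift values are invented for illustration and are not PROSHIFT output.

```python
import math


def rms_deviation(predicted, experimental):
    """Root mean square deviation between two equal-length lists of shifts (ppm)."""
    if len(predicted) != len(experimental) or not predicted:
        raise ValueError("need two non-empty lists of equal length")
    squared = sum((p - e) ** 2 for p, e in zip(predicted, experimental))
    return math.sqrt(squared / len(predicted))


# Hypothetical amide proton shifts in ppm.
predicted = [8.0, 7.0, 9.0, 8.0]
experimental = [8.5, 7.5, 8.5, 7.5]
rmsd_ppm = rms_deviation(predicted, experimental)
```

Every prediction here is off by 0.5 ppm, so the deviation is 0.5 ppm, which gives a feel for the scale of the 0.3 ppm figure reported for hydrogen.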

  17. Peptide microarrays to probe for competition for binding sites in a protein interaction network

    NARCIS (Netherlands)

    Sinzinger, M.D.S.; Ruttekolk, I.R.R.; Gloerich, J.; Wessels, H.; Chung, Y.D.; Adjobo-Hermans, M.J.W.; Brock, R.E.

    2013-01-01

    Cellular protein interaction networks are a result of the binding preferences of a particular protein and the entirety of interactors that mutually compete for binding sites. Therefore, the reconstruction of interaction networks by the accumulation of interaction networks for individual proteins

  18. Rbfox2 controls autoregulation in RNA-binding protein networks.

    Science.gov (United States)

    Jangi, Mohini; Boutz, Paul L; Paul, Prakriti; Sharp, Phillip A

    2014-03-15

    The tight regulation of splicing networks is critical for organismal development. To maintain robust splicing patterns, many splicing factors autoregulate their expression through alternative splicing-coupled nonsense-mediated decay (AS-NMD). However, as negative autoregulation results in a self-limiting window of splicing factor expression, it is unknown how variations in steady-state protein levels can arise in different physiological contexts. Here, we demonstrate that Rbfox2 cross-regulates AS-NMD events within RNA-binding proteins to alter their expression. Using individual nucleotide-resolution cross-linking immunoprecipitation coupled to high-throughput sequencing (iCLIP) and mRNA sequencing, we identified >200 AS-NMD splicing events that are bound by Rbfox2 in mouse embryonic stem cells. These "silent" events are characterized by minimal apparent splicing changes but appreciable changes in gene expression upon Rbfox2 knockdown due to degradation of the NMD-inducing isoform. Nearly 70 of these AS-NMD events fall within genes encoding RNA-binding proteins, many of which are autoregulated. As with the coding splicing events that we found to be regulated by Rbfox2, silent splicing events are evolutionarily conserved and frequently contain the Rbfox2 consensus UGCAUG. Our findings uncover an unexpectedly broad and multilayer regulatory network controlled by Rbfox2 and offer an explanation for how autoregulatory splicing networks are tuned.
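
Locating the Rbfox2 consensus UGCAUG in a transcript is a simple string search, sketched below. The function name and the example sequence are hypothetical. Real analyses would combine such motif hits with iCLIP binding data rather than rely on sequence alone.

```python
def find_motif(rna, motif="UGCAUG"):
    """Return 0-based start positions of every (possibly overlapping) motif occurrence."""
    # Accept DNA-style input by converting T to U.
    rna = rna.upper().replace("T", "U")
    positions = []
    start = rna.find(motif)
    while start != -1:
        positions.append(start)
        start = rna.find(motif, start + 1)
    return positions


# Hypothetical RNA fragment containing two copies of the consensus.
sequence = "AAUGCAUGCCGUGCAUGUU"
hits = find_motif(sequence)
```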

  19. The construction of an amino acid network for understanding protein structure and function.

    Science.gov (United States)

    Yan, Wenying; Zhou, Jianhong; Sun, Maomin; Chen, Jiajia; Hu, Guang; Shen, Bairong

    2014-06-01

    Amino acid networks (AANs) are undirected networks consisting of amino acid residues and their interactions in three-dimensional protein structures. The analysis of AANs provides novel insight into protein science, and several common amino acid network properties have revealed diverse classes of proteins. In this review, we first summarize methods for the construction and characterization of AANs. We then compare software tools for the construction and analysis of AANs. Finally, we review the application of AANs for understanding protein structure and function, including the identification of functional residues, the prediction of protein folding, analyzing protein stability and protein-protein interactions, and for understanding communication within and between proteins.
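
One common way to construct an AAN is to connect residues whose C-alpha atoms lie within a distance cutoff. The sketch below assumes that scheme and a 7 Å cutoff. The review covers several construction methods, and neither the cutoff nor the coordinates here are taken from it.

```python
import math
from itertools import combinations


def build_aan(ca_coords, cutoff=7.0):
    """Connect residues whose C-alpha atoms lie within `cutoff` angstroms of each other."""
    adj = {res: set() for res in ca_coords}
    for a, b in combinations(ca_coords, 2):
        if math.dist(ca_coords[a], ca_coords[b]) <= cutoff:
            adj[a].add(b)
            adj[b].add(a)
    return adj


# Hypothetical C-alpha coordinates: four residues on a line, 3.8 angstroms apart.
ca_coords = {
    1: (0.0, 0.0, 0.0),
    2: (3.8, 0.0, 0.0),
    3: (7.6, 0.0, 0.0),
    4: (11.4, 0.0, 0.0),
}
aan = build_aan(ca_coords)
```

With this extended toy chain only sequence neighbours are linked. In a folded protein the same cutoff also links residues that are distant in sequence but close in space, and those contacts are what make the network informative.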

  20. An automated approach to network features of protein structure ensembles

    Science.gov (United States)

    Bhattacharyya, Moitrayee; Bhat, Chanda R; Vishveshwara, Saraswathi

    2013-01-01

    Network theory applied to protein structures provides insights into numerous problems of biological relevance. The explosion in structural data available from PDB and simulations establishes a need to introduce a standalone, efficient program that assembles network concepts/parameters under one hood in an automated manner. Herein, we discuss the development/application of an exhaustive, user-friendly, standalone program package named PSN-Ensemble, which can handle structural ensembles generated through molecular dynamics (MD) simulation/NMR studies or from multiple X-ray structures. The novelty in network construction lies in the explicit consideration of side-chain interactions among amino acids. The program evaluates network parameters dealing with topological organization and long-range allosteric communication. The introduction of a flexible weighting scheme in terms of residue pairwise cross-correlation/interaction energy in PSN-Ensemble brings dynamical/chemical knowledge into the network representation. Also, the results are mapped on a graphical display of the structure, allowing easy access to network analysis for the general biological community. The potential of PSN-Ensemble toward examining structural ensembles is exemplified using MD trajectories of a ubiquitin-conjugating enzyme (UbcH5b). Furthermore, insights derived from network parameters evaluated using PSN-Ensemble for single static structures of active/inactive states of β2-adrenergic receptor and the ternary tRNA complexes of tyrosyl tRNA synthetases (from organisms across kingdoms) are discussed. PSN-Ensemble is freely available from http://vishgraph.mbu.iisc.ernet.in/PSN-Ensemble/psn_index.html. PMID:23934896

  1. Prediction and characterization of protein-protein interaction networks in swine

    Directory of Open Access Journals (Sweden)

    Wang Fen

    2012-01-01

    Full Text Available Abstract Background Studying the large-scale protein-protein interaction (PPI) network is important in understanding biological processes. The current research presents the first PPI map of swine, which aims to give new insights into understanding their biological processes. Results We used three methods, Interolog-based prediction of porcine PPI network, domain-motif interactions from structural topology-based prediction of porcine PPI network and motif-motif interactions from structural topology-based prediction of porcine PPI network, to predict porcine protein interactions among 25,767 porcine proteins. We predicted 20,213, 331,484, and 218,705 porcine PPIs respectively, merged the three results into 567,441 PPIs, constructed four PPI networks, and analyzed the topological properties of the porcine PPI networks. Our predictions were validated with Pfam domain annotations and GO annotations. Averages of 70, 10,495, and 863 interactions were related to the Pfam domain-interacting pairs in iPfam database. For comparison, randomized networks were generated, and averages of only 4.24, 66.79, and 44.26 interactions were associated with Pfam domain-interacting pairs in iPfam database. In GO annotations, we found 52.68%, 75.54%, and 27.20% of the predicted PPIs sharing GO terms, respectively. However, none of the 10,000 randomized networks reached these proportions (52.68%, 75.54%, and 27.20%) of PPI pairs sharing GO terms. Finally, we determined the accuracy and precision of the methods. The methods yielded accuracies of 0.92, 0.53, and 0.50 at precisions of about 0.93, 0.74, and 0.75, respectively. Conclusion The results reveal that the predicted PPI networks are considerably reliable. The present research is an important pioneering work on protein function research. The porcine PPI data set, the confidence score of each interaction and a list of related data are available at http://pppid.biositemap.com/.

  2. Small-strain dynamic rheology of food protein networks.

    Science.gov (United States)

    Tunick, Michael H

    2011-03-09

    Small-amplitude oscillatory shear analyses of samples containing protein are useful for determining the nature of the protein matrix without damaging it. G' (elastic or storage modulus), G'' (viscous or loss modulus), and tan δ (loss tangent, the ratio of G'' to G') give information on the properties of the network. Strain, frequency, time, and temperature sweeps provide information on the linear viscoelastic region, structural assembly, and thermal characteristics. The gelation point may be determined by locating the time at which tan δ is independent of frequency or the temperature at which G' becomes greater than G''. The logarithm of η* (complex viscosity) may be plotted against the reciprocal of the absolute temperature, with the slope being proportional to the activation energy. Dynamic tests of protein-containing samples reveal a great deal about their rheological characteristics.
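
    The quantities named in this abstract can be sketched numerically. The example below computes tan δ as G''/G', finds the first point in a temperature sweep at which G' exceeds G'', and fits the slope of ln η* against 1/T. All data are synthetic, the function names are hypothetical, and converting the slope to an activation energy via the gas constant assumes an Arrhenius form η* = A·exp(Ea/(RT)), which the abstract implies but does not spell out.

```python
import math

R_GAS = 8.314  # J/(mol K), universal gas constant

def loss_tangent(g_storage, g_loss):
    """tan(delta) is the ratio of the loss modulus G'' to the storage modulus G'."""
    return g_loss / g_storage

def gel_point(temps, g_storage, g_loss):
    """Return the first temperature in the sweep at which G' exceeds G''."""
    for temp, gp, gpp in zip(temps, g_storage, g_loss):
        if gp > gpp:
            return temp
    return None  # no crossover observed in this sweep

def arrhenius_slope(temps_kelvin, eta_star):
    """Least-squares slope of ln(eta*) plotted against 1/T."""
    xs = [1.0 / t for t in temps_kelvin]
    ys = [math.log(e) for e in eta_star]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

# Hypothetical temperature sweep (degrees Celsius, moduli in Pa).
temps_c = [30, 40, 50, 60]
g_prime = [1.0, 5.0, 20.0, 80.0]
g_double_prime = [4.0, 6.0, 15.0, 30.0]
gel_temp = gel_point(temps_c, g_prime, g_double_prime)

# Synthetic viscosities generated from an assumed activation energy of 50 kJ/mol.
temps_k = [293.15, 303.15, 313.15, 323.15]
eta = [1e-6 * math.exp(50_000 / (R_GAS * t)) for t in temps_k]
activation_energy = arrhenius_slope(temps_k, eta) * R_GAS  # J/mol
```

    With real data the crossover would normally be interpolated between sweep points rather than taken at the first sampled temperature; the coarse version here only illustrates the criterion.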

  3. Information theory in systems biology. Part II: protein-protein interaction and signaling networks.

    Science.gov (United States)

    Mousavian, Zaynab; Díaz, José; Masoudi-Nejad, Ali

    2016-03-01

    Since Claude Shannon developed information theory in 1948 to address problems in data storage and data communication over (noisy) communication channels, it has been successfully applied in many other research areas such as bioinformatics and systems biology. In this manuscript, we attempt to review some of the existing literature in systems biology that uses information theory measures in its calculations. As we have reviewed most of the existing information-theoretic methods in gene regulatory and metabolic networks in the first part of the review, in the second part of our study we survey the application of information theory in other types of biological networks, including protein-protein interaction and signaling networks.

  4. Validation of protein models by a neural network approach

    Directory of Open Access Journals (Sweden)

    Fantucci Piercarlo

    2008-01-01

    Full Text Available Abstract Background The development and improvement of reliable computational methods designed to evaluate the quality of protein models is relevant in the context of protein structure refinement, which has been recently identified as one of the bottlenecks limiting the quality and usefulness of protein structure prediction. Results In this contribution, we present a computational method (Artificial Intelligence Decoys Evaluator: AIDE) which is able to consistently discriminate between correct and incorrect protein models. In particular, the method is based on neural networks that use as input 15 structural parameters, which include energy, solvent accessible surface, hydrophobic contacts and secondary structure content. The results obtained with AIDE on a set of decoy structures were evaluated using statistical indicators such as Pearson correlation coefficients, Znat, fraction enrichment, as well as ROC plots. It turned out that AIDE's performance is comparable and often complementary to available state-of-the-art learning-based methods. Conclusion In light of the results obtained with AIDE, as well as its comparison with available learning-based methods, it can be concluded that AIDE can be successfully used to evaluate the quality of protein structures. The use of AIDE in combination with other evaluation tools is expected to further enhance protein refinement efforts.

  5. Protein interaction networks as metric spaces: a novel perspective on distribution of hubs

    Science.gov (United States)

    2014-01-01

    Background In the post-genomic era, a central and overarching question in the analysis of protein-protein interaction networks continues to be whether biological characteristics and functions of proteins such as lethality, physiological malfunctions and malignancy are intimately linked to the topological role proteins play in the network as a mathematical structure. One of the key features that have implicitly been presumed is the existence of hubs, highly connected proteins considered to play a crucial role in biological networks. We explore the structure of protein interaction networks of a number of organisms as metric spaces and show that hubs are non randomly positioned and, from a distance point of view, centrally located. Results By analysing how the human functional protein interaction network, the human signalling network, Saccharomyces cerevisiae, Arabidopsis thaliana and Escherichia coli protein-protein interaction networks from various databases are distributed as metric spaces, we found that proteins interact radially through a central node, high degree proteins coagulate in the centre of the network, and those far away from the centre have low degree. We further found that the distribution of proteins from the centre is in some hierarchy of importance and has biological significance. Conclusions We conclude that structurally, protein interaction networks are mathematical entities that share properties between organisms but not necessarily with other networks that follow power-law. We therefore conclude that (i) if there are hubs defined by degree, they are not distributed randomly; (ii) zones closest to the centre of the network are enriched for critically important proteins and are also functionally very specialised for specific 'house keeping’ functions; (iii) proteins closest to the network centre are functionally less dispensable and may present good targets for therapy development; and (iv) network biology requires its own network theory

  6. Protein interaction networks as metric spaces: a novel perspective on distribution of hubs.

    Science.gov (United States)

    Fadhal, Emad; Gamieldien, Junaid; Mwambene, Eric C

    2014-01-18

    In the post-genomic era, a central and overarching question in the analysis of protein-protein interaction networks continues to be whether biological characteristics and functions of proteins such as lethality, physiological malfunctions and malignancy are intimately linked to the topological role proteins play in the network as a mathematical structure. One of the key features that have implicitly been presumed is the existence of hubs, highly connected proteins considered to play a crucial role in biological networks. We explore the structure of protein interaction networks of a number of organisms as metric spaces and show that hubs are non randomly positioned and, from a distance point of view, centrally located. By analysing how the human functional protein interaction network, the human signalling network, Saccharomyces cerevisiae, Arabidopsis thaliana and Escherichia coli protein-protein interaction networks from various databases are distributed as metric spaces, we found that proteins interact radially through a central node, high degree proteins coagulate in the centre of the network, and those far away from the centre have low degree. We further found that the distribution of proteins from the centre is in some hierarchy of importance and has biological significance. We conclude that structurally, protein interaction networks are mathematical entities that share properties between organisms but not necessarily with other networks that follow power-law. We therefore conclude that (i) if there are hubs defined by degree, they are not distributed randomly; (ii) zones closest to the centre of the network are enriched for critically important proteins and are also functionally very specialised for specific 'house keeping' functions; (iii) proteins closest to the network centre are functionally less dispensable and may present good targets for therapy development; and (iv) network biology requires its own network theory modelled on actual biological evidence
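
    One way to picture the "network as metric space" view is to measure shortest-path distances from a central node and look at how degree changes with that distance. The sketch below is only an illustration under stated assumptions: it takes the centre to be the node(s) of minimum eccentricity (the usual graph-theoretic centre), which may differ from the authors' definition, and it uses a small hypothetical, connected toy graph rather than real interaction data.

```python
from collections import deque

def bfs_distances(adj, source):
    """Shortest-path (hop) distances from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def network_centre(adj):
    """Nodes of minimum eccentricity; assumes the graph is connected."""
    ecc = {v: max(bfs_distances(adj, v).values()) for v in adj}
    radius = min(ecc.values())
    return sorted(v for v, e in ecc.items() if e == radius)

def mean_degree_by_distance(adj, centre_node):
    """Average node degree in each shell of equal distance from the centre."""
    shells = {}
    for v, d in bfs_distances(adj, centre_node).items():
        shells.setdefault(d, []).append(len(adj[v]))
    return {d: sum(degs) / len(degs) for d, degs in sorted(shells.items())}

# Hypothetical toy network: H is a hub, D and F sit on the periphery.
edges = [("H", "A"), ("H", "B"), ("H", "C"), ("H", "E"), ("C", "D"), ("E", "F")]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

centre = network_centre(adj)
profile = mean_degree_by_distance(adj, centre[0])
```

    In this toy case the mean degree falls with distance from the centre, which mirrors the qualitative pattern the abstract reports; it says nothing about the real networks themselves.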

  7. Disease candidate gene identification and prioritization using protein interaction networks

    Directory of Open Access Journals (Sweden)

    Aronow Bruce J

    2009-02-01

    Full Text Available Abstract Background Although most of the current disease candidate gene identification and prioritization methods depend on functional annotations, the coverage of the gene functional annotations is a limiting factor. In the current study, we describe a candidate gene prioritization method that is entirely based on protein-protein interaction network (PPIN) analyses. Results For the first time, extended versions of the PageRank and HITS algorithms, and the K-Step Markov method are applied to prioritize disease candidate genes in a training-test schema. Using a list of known disease-related genes from our earlier study as a training set ("seeds"), and the rest of the known genes as a test list, we perform large-scale cross validation to rank the candidate genes and also evaluate and compare the performance of our approach. Under appropriate settings – for example, a back probability of 0.3 for PageRank with Priors and HITS with Priors, and step size 6 for K-Step Markov method – the three methods achieved a comparable AUC value, suggesting a similar performance. Conclusion Even though network-based methods are generally not as effective as integrated functional annotation-based methods for disease candidate gene prioritization, in a one-to-one comparison, PPIN-based candidate gene prioritization performs better than all other gene features or annotations. Additionally, we demonstrate that methods used for studying both social and Web networks can be successfully used for disease candidate gene prioritization.
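
    A minimal sketch of seed-biased ranking in the spirit of PageRank with Priors follows. It is not the authors' implementation: it assumes an undirected, unweighted network in which every node has at least one neighbour, a uniform prior over the seed genes, and plain power iteration. Only the back probability of 0.3 is taken from the abstract; the function name, gene labels and toy graph are hypothetical.

```python
def pagerank_with_priors(adj, seeds, back_prob=0.3, iterations=100):
    """Random walk that jumps back to the seed set with probability `back_prob`.

    adj   : dict mapping each node to a set of neighbours (no isolated nodes)
    seeds : set of known disease genes used as the prior
    """
    nodes = sorted(adj)
    prior = {v: (1.0 / len(seeds) if v in seeds else 0.0) for v in nodes}
    score = dict(prior)
    for _ in range(iterations):
        nxt = {v: back_prob * prior[v] for v in nodes}
        for u in nodes:
            # Spread the remaining probability mass evenly over u's neighbours.
            share = (1.0 - back_prob) * score[u] / len(adj[u])
            for v in adj[u]:
                nxt[v] += share
        score = nxt
    return score

# Hypothetical path-shaped network: seed S, then candidates A, B, C in order.
adj = {"S": {"A"}, "A": {"S", "B"}, "B": {"A", "C"}, "C": {"B"}}
scores = pagerank_with_priors(adj, seeds={"S"})
ranked_candidates = sorted((v for v in adj if v != "S"),
                           key=lambda v: scores[v], reverse=True)
```

    Candidates closer to the seed receive higher scores, which is the intuition behind using such walks for prioritization; the cross-validation and AUC evaluation described in the abstract are outside the scope of this sketch.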

  8. Construction and analysis of protein-protein interaction network correlated with ankylosing spondylitis.

    Science.gov (United States)

    Kanwal, Attiya; Fazal, Sahar

    2018-01-05

    Ankylosing spondylitis (AS), a systemic illness, is a form of progressive joint swelling that mostly affects the spine. However, it frequently causes inflammation in other joints far from the spine, as well as in organs such as the eyes, heart, lungs, and kidneys. It is an immune system disease that may be triggered by specific types of bacterial or viral infections that initiate an immune response that does not shut off after the infection has resolved. The specific cause of ankylosing spondylitis is unknown, but genetics plays a large part in this condition. The emerging tools of network medicine offer a platform to investigate a complex disease at the systems level. In this study, we aimed to identify the key proteins and the biological regulatory pathways involved in AS and to further investigate the molecular connectivity between these pathways by topological analysis of the protein-protein interaction (PPI) network. The extended network, retrieved from the STRING database, comprises 93 nodes and 199 interactions, along with some separate small networks. We identified 24 proteins with high betweenness centrality (BC) at a threshold of 0.01 and 55 proteins with large degree at a threshold of 1. CD4, with the highest BC and closeness centrality, is located in the centre of the network. The backbone network derived from high-BC proteins presents a clear visual overview that shows all important regulatory pathways for AS and the crosstalk between them. The findings of this research suggest that AS variation is orchestrated by an integrated PPI network of 93 nodes centered on CD4.

  9. Gene, protein and network of male sterility in rice

    Directory of Open Access Journals (Sweden)

    Wang eKun

    2013-04-01

    Full Text Available Rice is one of the most important model crop plants whose heterosis has been well exploited in commercial hybrid seed production via a variety of types of male sterile lines. Hybrid rice cultivation area is steadily expanding around the world, especially in Southern Asia. Characterization of genes and proteins related to male sterility aims to understand how and why male sterility occurs, and which proteins are the key players in microspore abortion. Recently, a series of genes and proteins related to cytoplasmic male sterility, photoperiod sensitive male sterility, self-incompatibility and other types of microspore deterioration have been characterized through genetics or proteomics. The latter in particular offers a powerful, high-throughput approach to discern novel proteins involved in male-sterile pathways, which may help us to breed artificial male-sterile systems. This represents an alternative tool to meet the critical challenge of further development of hybrid rice. In this paper, we review the recent developments in our understanding of male sterility in rice hybrid production across gene, protein and integrated network levels, and also present a perspective on the engineering of male sterile lines for hybrid rice production.

  10. Multivariate Entropy Characterizes the Gene Expression and Protein-Protein Networks in Four Types of Cancer

    Directory of Open Access Journals (Sweden)

    Angel Juarez-Flores

    2018-02-01

    Full Text Available There is an urgent need to detect cancer at early stages in order to treat it, to improve patients’ lifespans, and even to cure it. In this work, we determined the entropic contributions of genes in cancer networks. We detected sudden changes in entropy values in melanoma, hepatocellular carcinoma, pancreatic cancer, and squamous lung cell carcinoma associated with transitions from healthy controls to cancer. We also identified the most relevant genes involved in the carcinogenic process of the four types of cancer with the help of entropic changes in local networks. Their corresponding proteins could be used as potential targets for treatments and as biomarkers of cancer.

  11. Computational 3D imaging to quantify structural components and assembly of protein networks.

    Science.gov (United States)

    Asgharzadeh, Pouyan; Özdemir, Bugra; Reski, Ralf; Röhrle, Oliver; Birkhold, Annette I

    2018-03-15

    Traditionally, protein structures have been described by the secondary structure architecture and fold arrangement. However, the relatively novel method of 3D confocal microscopy of fluorescent-protein-tagged networks in living cells allows resolving the detailed spatial organization of these networks. This provides new possibilities to predict network functionality, as structure and function seem to be linked at various scales. Here, we propose a quantitative approach using 3D confocal microscopy image data to describe protein networks based on their nano-structural characteristics. This analysis is constructed in four steps: (i) Segmentation of the microscopic raw data into a volume model and extraction of a spatial graph representing the protein network. (ii) Quantifying protein network gross morphology using the volume model. (iii) Quantifying protein network components using the spatial graph. (iv) Linking these two scales to obtain insights into network assembly. Here, we quantitatively describe the filamentous temperature sensitive Z protein network of the moss Physcomitrella patens and elucidate relations between network size and assembly details. Future applications will link network structure and functionality by tracking dynamic structural changes over time and comparing different states or types of networks, possibly allowing more precise identification of (mal)functions or the design of protein-engineered biomaterials for applications in regenerative medicine. Protein networks are highly complex and dynamic structures that play various roles in biological environments. Analyzing the detailed spatial structure of these networks may lead to new insight into biological functions and malfunctions. Here, we propose a tool set that extracts structural information at two scales of the protein network and therefore makes it possible to address questions such as "how is the network built?" or "how do networks grow?".

  12. Deep recurrent conditional random field network for protein secondary prediction

    DEFF Research Database (Denmark)

    Johansen, Alexander Rosenberg; Sønderby, Søren Kaae; Sønderby, Casper Kaae

    2017-01-01

    Deep learning has become the state-of-the-art method for predicting protein secondary structure from only its amino acid residues and sequence profile. Building upon these results, we propose to combine a bi-directional recurrent neural network (biRNN) with a conditional random field (CRF), which...... of the labels for all time-steps. We condition the CRF on the output of biRNN, which learns a distributed representation based on the entire sequence. The biRNN-CRF is therefore close to ideally suited for the secondary structure task because a high degree of cross-talk between neighboring elements can...

  13. A membrane protein / signaling protein interaction network for Arabidopsis version AMPv2

    Directory of Open Access Journals (Sweden)

    Sylvie Lalonde

    2010-09-01

    Full Text Available Interactions between membrane proteins and the soluble fraction are essential for signal transduction and for regulating nutrient transport. To gain insights into the membrane-based interactome, 3,852 open reading frames (ORFs) out of a target list of 8,383 representing membrane and signaling proteins from Arabidopsis thaliana were cloned into a Gateway compatible vector. The mating-based split-ubiquitin system was used to screen for potential protein-protein interactions (pPPIs) among 490 Arabidopsis ORFs. A binary robotic screen between 142 receptor-like kinases, 72 transporters, 57 soluble protein kinases and phosphatases, 40 glycosyltransferases, 95 proteins of various functions and 89 proteins with unknown function detected 387 out of 90,370 possible PPIs. A secondary screen confirmed 343 (of 387) pPPIs between 179 proteins, yielding a scale-free network (r2=0.863). Eighty of 142 transmembrane receptor-like kinases (RLK) tested positive, identifying three homomers, 63 heteromers and 80 pPPIs with other proteins. Thirty-one out of 142 RLK interactors (including RLKs) had previously been found to be phosphorylated; thus interactors may be substrates for respective RLKs. None of the pPPIs described here had been reported in the major interactome databases, including potential interactors of G protein-coupled receptors, phospholipase C, and AMT ammonium transporters. Two RLKs found as putative interactors of AMT1;1 were independently confirmed using a split luciferase assay in Arabidopsis protoplasts. These RLKs may be involved in ammonium-dependent phosphorylation of the C-terminus and regulation of ammonium uptake activity. The robotic screening method established here will enable a systematic analysis of membrane protein interactions in fungi, plants and metazoa.
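
    The abstract reports a scale-free fit with an r2 value but does not say how it was obtained. A common approach, assumed here purely for illustration, is a linear regression of log P(k) on log k over the degree distribution. The function name and the degree list below are hypothetical and are not the AMPv2 data.

```python
import math
from collections import Counter

def loglog_fit(degrees):
    """Fit log10 P(k) = slope * log10 k + c and return (slope, r_squared).

    Assumes at least two distinct positive degrees are present.
    """
    counts = Counter(d for d in degrees if d > 0)
    total = sum(counts.values())
    xs = [math.log10(k) for k in counts]
    ys = [math.log10(c / total) for c in counts.values()]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, r_squared

# Hypothetical degree list following P(k) proportional to k^-2 exactly.
degrees = [1] * 64 + [2] * 16 + [4] * 4 + [8] * 1
slope, r_squared = loglog_fit(degrees)
```

    Real interaction data are noisier than this idealized list, so an r2 below 1, such as the value quoted in the abstract, is expected.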

  14. Dynamic circadian protein-protein interaction networks predict temporal organization of cellular functions.

    Directory of Open Access Journals (Sweden)

    Thomas Wallach

    2013-03-01

    Full Text Available Essentially all biological processes depend on protein-protein interactions (PPIs). Timing of such interactions is crucial for regulatory function. Although circadian (~24-hour) clocks constitute fundamental cellular timing mechanisms regulating important physiological processes, PPI dynamics on this timescale are largely unknown. Here, we identified 109 novel PPIs among circadian clock proteins via a yeast-two-hybrid approach. Among them, the interaction of protein phosphatase 1 and CLOCK/BMAL1 was found to result in BMAL1 destabilization. We constructed a dynamic circadian PPI network predicting the PPI timing using circadian expression data. Systematic circadian phenotyping (RNAi and overexpression) suggests a crucial role for components involved in dynamic interactions. Systems analysis of a global dynamic network in liver revealed that interacting proteins are expressed at similar times likely to restrict regulatory interactions to specific phases. Moreover, we predict that circadian PPIs dynamically connect many important cellular processes (signal transduction, cell cycle, etc.) contributing to temporal organization of cellular physiology in an unprecedented manner.

  15. The role of exon shuffling in shaping protein-protein interaction networks

    Directory of Open Access Journals (Sweden)

    França Gustavo S

    2010-12-01

    Full Text Available Abstract Background Physical protein-protein interaction (PPI) is a critical phenomenon for the function of most proteins in living organisms and a significant fraction of PPIs are the result of domain-domain interactions. Exon shuffling, intron-mediated recombination of exons from existing genes, is known to have been a major mechanism of domain shuffling in metazoans. Thus, we hypothesized that exon shuffling could have a significant influence in shaping the topology of PPI networks. Results We tested our hypothesis by compiling exon shuffling and PPI data from six eukaryotic species: Homo sapiens, Mus musculus, Drosophila melanogaster, Caenorhabditis elegans, Cryptococcus neoformans and Arabidopsis thaliana. For all four metazoan species, genes enriched in exon shuffling events presented on average higher vertex degree (number of interacting partners) in PPI networks. Furthermore, we verified that a set of protein domains that are simultaneously promiscuous (known to interact with multiple types of other domains), self-interacting (able to interact with another copy of themselves) and abundant in the genomes presents a stronger signal for exon shuffling. Conclusions Exon shuffling appears to have been a recurrent mechanism for the emergence of new PPIs along metazoan evolution. In metazoan genomes, exon shuffling also promoted the expansion of some protein domains. We speculate that their promiscuous and self-interacting properties may have been decisive for that expansion.

  16. Methionine sulfoxides on prion protein Helix-3 switch on the alpha-fold destabilization required for conversion.

    Directory of Open Access Journals (Sweden)

    Giorgio Colombo

    Full Text Available BACKGROUND: The conversion of the cellular prion protein (PrP(C)) into the infectious form (PrP(Sc)) is the key event in prion induced neurodegenerations. This process is believed to involve a multi-step conformational transition from an alpha-helical (PrP(C)) form to a beta-sheet-rich (PrP(Sc)) state. In addition to the conformational difference, PrP(Sc) exhibits as covalent signature the sulfoxidation of M213. To investigate whether such modification may play a role in the misfolding process we have studied the impact of methionine oxidation on the dynamics and energetics of the HuPrP(125-229) alpha-fold. METHODOLOGY/PRINCIPAL FINDINGS: Using molecular dynamics simulation, essential dynamics, correlated motions and signal propagation analysis, we have found that substitution of the sulfur atom of M213 by a sulfoxide group impacts on the stability of the native state, increasing the flexibility of regions preceding the site of the modification and perturbing the network of stabilizing interactions. Together, these changes favor the population of alternative states which may be essential in the productive pathway of the pathogenic conversion. These changes are also observed when the sulfoxidation is placed at M206 and at both M206 and M213. CONCLUSIONS/SIGNIFICANCE: Our results suggest that the sulfoxidation of Helix-3 methionines might be the switch for triggering the initial alpha-fold destabilization required for the productive pathogenic conversion.

  17. Visualization of protein interaction networks: problems and solutions

    Science.gov (United States)

    2013-01-01

    Background Visualization concerns the representation of data visually and is an important task in scientific research. Protein-protein interactions (PPI) are discovered using either wet lab techniques, such as mass spectrometry, or in silico prediction tools, resulting in large collections of interactions stored in specialized databases. The set of all interactions of an organism forms a protein-protein interaction network (PIN) and is an important tool for studying the behaviour of the cell machinery. Since graphic representation of PINs may highlight important substructures, e.g. protein complexes, visualization is increasingly used to study the underlying graph structure of PINs. Although graphs are well known data structures, there are different open problems regarding PINs visualization: the high number of nodes and connections, the heterogeneity of nodes (proteins) and edges (interactions), the possibility to annotate proteins and interactions with biological information extracted by ontologies (e.g. Gene Ontology) that enriches the PINs with semantic information, but complicates their visualization. Methods In recent years many software tools for the visualization of PINs have been developed. Initially designed for visualization only, some of them have subsequently been enriched with new functions for PPI data management and PIN analysis. The paper analyzes the main software tools for PINs visualization considering four main criteria: (i) technology, i.e. availability/license of the software and supported OS (Operating System) platforms; (ii) interoperability, i.e. ability to import/export networks in various formats, ability to export data in a graphic format, extensibility of the system, e.g. through plug-ins; (iii) visualization, i.e. supported layout and rendering algorithms and availability of parallel implementation; (iv) analysis, i.e. availability of network analysis functions, such as clustering or mining of the graph, and the possibility to

  18. Reconstruction of the yeast protein-protein interaction network involved in nutrient sensing and global metabolic regulation

    DEFF Research Database (Denmark)

    Nandy, Subir Kumar; Jouhten, Paula; Nielsen, Jens

    2010-01-01

    BACKGROUND: Several protein-protein interaction studies have been performed for the yeast Saccharomyces cerevisiae using different high-throughput experimental techniques. All these results are collected in the BioGRID database and the SGD database provide detailed annotation of the different......-sensing and metabolic regulatory signal transduction pathways (STP) operating in Saccharomyces cerevisiae. The reconstructed STP network includes a full protein-protein interaction network including the key nodes Snf1, Tor1, Hog1 and Pka1. The network includes a total of 623 structural open reading frames (ORFs...

  19. A human protein interaction network shows conservation of aging processes between human and invertebrate species.

    Directory of Open Access Journals (Sweden)

    Russell Bell

    2009-03-01

    Full Text Available We have mapped a protein interaction network of human homologs of proteins that modify longevity in invertebrate species. This network is derived from a proteome-scale human protein interaction Core Network generated through unbiased high-throughput yeast two-hybrid searches. The longevity network is composed of 175 human homologs of proteins known to confer increased longevity through loss of function in yeast, nematode, or fly, and 2,163 additional human proteins that interact with these homologs. Overall, the network consists of 3,271 binary interactions among 2,338 unique proteins. A comparison of the average node degree of the human longevity homologs with random sets of proteins in the Core Network indicates that human homologs of longevity proteins are highly connected hubs with a mean node degree of 18.8 partners. Shortest path length analysis shows that proteins in this network are significantly more connected than would be expected by chance. To examine the relationship of this network to human aging phenotypes, we compared the genes encoding longevity network proteins to genes known to be changed transcriptionally during aging in human muscle. In the case of both the longevity protein homologs and their interactors, we observed enrichments for differentially expressed genes in the network. To determine whether homologs of human longevity interacting proteins can modulate life span in invertebrates, homologs of 18 human FRAP1 interacting proteins showing significant changes in human aging muscle were tested for effects on nematode life span using RNAi. Of 18 genes tested, 33% extended life span when knocked-down in Caenorhabditis elegans. These observations indicate that a broad class of longevity genes identified in invertebrate models of aging have relevance to human aging. They also indicate that the longevity protein interaction network presented here is enriched for novel conserved longevity proteins.

  20. A Topology Potential-Based Method for Identifying Essential Proteins from PPI Networks.

    Science.gov (United States)

    Li, Min; Lu, Yu; Wang, Jianxin; Wu, Fang-Xiang; Pan, Yi

    2015-01-01

    Essential proteins are indispensable for cellular life. Identifying them helps us understand the minimal requirements for cellular life and is also very important for drug design. However, identification of essential proteins by experimental approaches is typically time-consuming and expensive. With the development of high-throughput technology in the post-genomic era, more and more protein-protein interaction data can be obtained, which makes it possible to study essential proteins at the network level. A series of computational approaches have been proposed for predicting essential proteins based on network topologies. Most of these topology-based methods rely on network centralities. In this paper, we investigate the topological characteristics of essential proteins from a completely new perspective. To our knowledge, this is the first time that topology potential has been used to identify essential proteins from a protein-protein interaction (PPI) network. The basic idea is that each protein in the network can be viewed as a material particle which creates a potential field around itself, and the interaction of all proteins forms a topological field over the network. By defining and computing the value of each protein's topology potential, we can obtain a more precise ranking which reflects the importance of proteins in the PPI network. The experimental results show that the topology potential-based methods TP and TP-NC outperform traditional topology measures: degree centrality (DC), betweenness centrality (BC), closeness centrality (CC), subgraph centrality (SC), eigenvector centrality (EC), information centrality (IC), and network centrality (NC) for predicting essential proteins. In addition, the performance of these centrality measures in identifying essential proteins in biological networks improves when they are controlled by topology potential.
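
    The abstract does not state the formula for topology potential. As a minimal sketch, the block below assumes the Gaussian form often used in the topology-potential literature, phi(v) = sum over u != v of m(u) * exp(-(d(v,u)/sigma)^2), where d is the hop distance, and ranks the nodes of a small hypothetical network. The function names, the unit masses, sigma = 1 and the toy edges are all illustrative assumptions, not the authors' TP or TP-NC implementation.

```python
import math
from collections import deque

def hop_distances(adj, source):
    """Breadth-first search: hop count from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def topology_potential(adj, sigma=1.0, mass=None):
    """Assumed Gaussian form: phi(v) = sum_{u != v} m(u) * exp(-(d(v,u)/sigma)^2)."""
    mass = mass or {v: 1.0 for v in adj}  # unit mass per protein (assumption)
    phi = {}
    for v in adj:
        dist = hop_distances(adj, v)
        phi[v] = sum(mass[u] * math.exp(-(d / sigma) ** 2)
                     for u, d in dist.items() if u != v)
    return phi

# Hypothetical toy network: "A" has three partners, "E" hangs off "D".
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("D", "E")]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

phi = topology_potential(adj, sigma=1.0)
ranking = sorted(phi, key=phi.get, reverse=True)  # highest potential first
```

    In this toy case the best-connected node "A" ranks first and the peripheral node "E" ranks last; sigma controls how far each node's influence reaches.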

  1. Effective comparative analysis of protein-protein interaction networks by measuring the steady-state network flow using a Markov model.

    Science.gov (United States)

    Jeong, Hyundoo; Qian, Xiaoning; Yoon, Byung-Jun

    2016-10-06

    Comparative analysis of protein-protein interaction (PPI) networks provides an effective means of detecting conserved functional network modules across different species. Such modules typically consist of orthologous proteins with conserved interactions, which can be exploited to computationally predict the modules through network comparison. In this work, we propose a novel probabilistic framework for comparing PPI networks and effectively predicting the correspondence between proteins, represented as network nodes, that belong to conserved functional modules across the given PPI networks. The basic idea is to estimate the steady-state network flow between nodes that belong to different PPI networks based on a Markov random walk model. The random walker is designed to make random moves to adjacent nodes within a PPI network as well as cross-network moves between potential orthologous nodes with high sequence similarity. Based on this Markov random walk model, we estimate the steady-state network flow - or the long-term relative frequency of the transitions that the random walker makes - between nodes in different PPI networks, which can be used as a probabilistic score measuring their potential correspondence. Subsequently, the estimated scores can be used for detecting orthologous proteins in conserved functional modules through network alignment. Through evaluations based on multiple real PPI networks, we demonstrate that the proposed scheme leads to improved alignment results that are biologically more meaningful at reduced computational cost, outperforming the current state-of-the-art algorithms. The source code and datasets can be downloaded from http://www.ece.tamu.edu/~bjyoon/CUFID .
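
    The steady-state flow idea can be illustrated with a toy example. The abstract does not give the walker's actual transition rule, so the sketch below simply assumes moves proportional to edge weight, with within-network edges of weight 1 and cross-network edges weighted by a hypothetical sequence-similarity score. All node names, weights and function names are assumptions, and this is not the CUFID code.

```python
def stationary_distribution(P, iters=2000):
    """Power iteration on the lazy chain (I + P) / 2, which shares the
    stationary distribution of P but cannot oscillate on bipartite graphs."""
    nodes = list(P)
    pi = {u: 1.0 / len(nodes) for u in nodes}
    for _ in range(iters):
        nxt = {u: 0.5 * pi[u] for u in nodes}  # stay put with probability 1/2
        for u in nodes:
            for v, p in P[u].items():
                nxt[v] += 0.5 * pi[u] * p       # otherwise follow P
        pi = nxt
    return pi

# Hypothetical joint graph: network A = {a1, a2}, network B = {b1, b2}.
weights = {
    ("a1", "a2"): 1.0,  # interaction inside network A
    ("b1", "b2"): 1.0,  # interaction inside network B
    ("a1", "b1"): 0.9,  # cross-network link, high sequence similarity
    ("a2", "b2"): 0.2,  # cross-network link, low sequence similarity
}
W = {}
for (u, v), w in weights.items():
    W.setdefault(u, {})[v] = w
    W.setdefault(v, {})[u] = w

# Row-normalise the weights into transition probabilities.
P = {u: {v: w / sum(nbrs.values()) for v, w in nbrs.items()}
     for u, nbrs in W.items()}
pi = stationary_distribution(P)

def flow(u, v):
    """Long-run frequency of transitions between u and v, in either direction."""
    return pi[u] * P[u].get(v, 0.0) + pi[v] * P[v].get(u, 0.0)

score_a1_b1 = flow("a1", "b1")
score_a2_b2 = flow("a2", "b2")
```

    With weight-proportional moves on an undirected graph, the flow across an edge is proportional to its weight, so this toy only reproduces the input similarities; the scheme described in the abstract is presumably more discriminating because its walker is designed specifically for cross-network comparison.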

  2. Towards a map of the Populus biomass protein-protein interaction network

    Energy Technology Data Exchange (ETDEWEB)

    Beers, Eric [Virginia Polytechnic Inst. and State Univ. (Virginia Tech), Blacksburg, VA (United States); Brunner, Amy [Virginia Polytechnic Inst. and State Univ. (Virginia Tech), Blacksburg, VA (United States); Helm, Richard [Virginia Polytechnic Inst. and State Univ. (Virginia Tech), Blacksburg, VA (United States); Dickerman, Allan [Virginia Polytechnic Inst. and State Univ. (Virginia Tech), Blacksburg, VA (United States)

    2015-07-31

    -depth characterizations. Characterizations involved both in vivo and in vitro independent methods to confirm protein-protein interactions and the evaluation of novel phenotypes resulting from creation of transgenic poplar and Arabidopsis plants engineered for increased or decreased expression of the selected genes. Transgenic poplar trees were studied in growth chamber, greenhouse, and two separate replicated field trials involving over 25 distinct wood-associated proteins. In-depth characterizations yielding positive results include the following. First, a NAC domain transcription factor (NAC154) that is a promoter of stress response and dormancy in trees was discovered. Increasing expression of NAC154 caused stunted growth and premature senescence, while decreasing expression led to both delayed bud and leaf expansion in spring and delayed leaf drop (i.e., prolonged leaf retention) in fall. Second, we discovered and characterized a new connection between a negative regulator of wood formation, the NAC domain transcription factor XND1, and an important regulator of cell division and cell differentiation, RBR. Third, we identified a new network of interacting wood-associated transcription factors belonging to the MYB and HD families. One of the HD family proteins, WOX13, was used to prepare transgenic poplar for high-level expression, resulting in significantly increased lateral branch growth. Finally, we modeled and performed in vitro analyses of the insect protein rubber resilin and we prepared transgenic Arabidopsis plants for expression of resilin to test the feasibility of using resilin to modify lignin cross-linking in wood and reduce recalcitrance and improve yield of fermentable sugars for biofuels production. Analysis of these and additional transgenics created with this support is continuing.

  3. Weighted Protein Interaction Network Analysis of Frontotemporal Dementia.

    Science.gov (United States)

    Ferrari, Raffaele; Lovering, Ruth C; Hardy, John; Lewis, Patrick A; Manzoni, Claudia

    2017-02-03

    The genetic analysis of complex disorders has undoubtedly led to the identification of a wealth of associations between genes and specific traits. However, moving from genetics to biochemistry one gene at a time has, to date, proved rather inefficient and under-powered to comprehensively explain the molecular basis of phenotypes. Here we present a novel approach, weighted protein-protein interaction network analysis (W-PPI-NA), to highlight key functional players within relevant biological processes associated with a given trait. This is exemplified in the current study by applying W-PPI-NA to frontotemporal dementia (FTD): we first built the state-of-the-art FTD protein network (FTD-PN) and then analyzed both its topological and functional features. The FTD-PN resulted from the sum of the individual interactomes built around FTD-spectrum genes, leading to a total of 4198 nodes. Twenty-nine of the 4198 nodes, called inter-interactome hubs (IIHs), represented those interactors able to bridge over 60% of the individual interactomes. Functional annotation analysis not only reiterated and reinforced previous findings from single genes and gene-coexpression analyses but also indicated a number of novel potential disease-related mechanisms, including DNA damage response, gene expression regulation, and cell waste disposal, and potential biomarkers or therapeutic targets, including EP300. These processes and targets likely represent the functional core impacted in FTD, reflecting the underlying genetic architecture contributing to disease. The approach presented in this study can be applied to other complex traits for which risk-causative genes are known, as it provides a promising tool for setting the foundations for collating genomics and wet laboratory data in a bidirectional manner. This is and will be critical to accelerate molecular target prioritization and drug discovery.
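
    The inter-interactome hub criterion (an interactor present in more than 60% of the individual interactomes) can be sketched as a simple membership count. The seed and protein labels below are placeholders, and the function name and data layout are assumptions for illustration rather than the authors' pipeline.

```python
def inter_interactome_hubs(interactomes, min_fraction=0.6):
    """Return proteins found in more than min_fraction of the interactomes.

    interactomes maps each seed gene to the set of its interactors.
    """
    counts = {}
    for members in interactomes.values():
        for protein in set(members):
            counts[protein] = counts.get(protein, 0) + 1
    n = len(interactomes)
    return sorted(p for p, c in counts.items() if c / n > min_fraction)

# Hypothetical interactomes built around five seed genes.
interactomes = {
    "seed1": {"P1", "P2", "P3"},
    "seed2": {"P1", "P2"},
    "seed3": {"P1", "P4"},
    "seed4": {"P1", "P2", "P5"},
    "seed5": {"P1", "P2"},
}
hubs = inter_interactome_hubs(interactomes)
```

    Here "P1" appears in all five interactomes and "P2" in four of five, so both pass the 60% threshold, while the proteins seen in a single interactome do not.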

  4. Category Theoretic Analysis of Hierarchical Protein Materials and Social Networks

    Science.gov (United States)

    Spivak, David I.; Giesa, Tristan; Wood, Elizabeth; Buehler, Markus J.

    2011-01-01

    Materials in biology span all the scales from Angstroms to meters and typically consist of complex hierarchical assemblies of simple building blocks. Here we describe an application of category theory to describe structural and resulting functional properties of biological protein materials by developing so-called ologs. An olog is like a “concept web” or “semantic network” except that it follows a rigorous mathematical formulation based on category theory. This key difference ensures that an olog is unambiguous, highly adaptable to evolution and change, and suitable for sharing concepts with other ologs. We consider simple cases of beta-helical and amyloid-like protein filaments subjected to axial extension and develop an olog representation of their structural and resulting mechanical properties. We also construct a representation of a social network in which people send text-messages to their nearest neighbors and act as a team to perform a task. We show that the olog for the protein and the olog for the social network feature identical category-theoretic representations, and we proceed to precisely explicate the analogy or isomorphism between them. The examples presented here demonstrate that the intrinsic nature of a complex system, which in particular includes a precise relationship between structure and function at different hierarchical levels, can be effectively represented by an olog. This, in turn, allows for comparative studies between disparate materials or fields of application, and results in novel approaches to derive functionality in the design of de novo hierarchical systems. We discuss opportunities and challenges associated with the description of complex biological materials by using ologs as a powerful tool for analysis and design in the context of materiomics, and we present the potential impact of this approach for engineering, life sciences, and medicine. PMID:21931622

  5. SLIDER: a generic metaheuristic for the discovery of correlated motifs in protein-protein interaction networks.

    Science.gov (United States)

    Boyen, Peter; Van Dyck, Dries; Neven, Frank; van Ham, Roeland C H J; van Dijk, Aalt D J

    2011-01-01

    Correlated motif mining (CMM) is the problem of finding overrepresented pairs of patterns, called motifs, in sequences of interacting proteins. Algorithmic solutions for CMM thereby provide a computational method for predicting binding sites for protein interaction. In this paper, we adopt a motif-driven approach where the support of candidate motif pairs is evaluated in the network. We experimentally establish the superiority of the Chi-square-based support measure over other support measures. Furthermore, we show that CMM is an NP-hard problem for a large class of support measures (including Chi-square) and reformulate the search for correlated motifs as a combinatorial optimization problem. We then present the generic metaheuristic SLIDER, which uses steepest ascent with a neighborhood function based on sliding motifs and employs the Chi-square-based support measure. We show that SLIDER outperforms existing motif-driven CMM methods and scales to large protein-protein interaction networks. The SLIDER implementation and the data used in the experiments are available at http://bioinformatics.uhasselt.be.

  6. Protein-protein interaction network and subcellular localization of the Arabidopsis thaliana ESCRT machinery

    Directory of Open Access Journals (Sweden)

    Lynn Richardson

    2011-06-01

    The Endosomal Sorting Complex Required for Transport (ESCRT) consists of several multi-protein subcomplexes which assemble sequentially at the endosomal surface and function in multivesicular body (MVB) biogenesis. While ESCRT has been relatively well characterized in yeasts and mammals, comparably little is known about ESCRT in plants. Here we explored the yeast two-hybrid protein interaction network and subcellular localization of the Arabidopsis thaliana ESCRT machinery. We show that the Arabidopsis ESCRT interactome possesses a number of protein-protein interactions that are either conserved in yeasts and mammals or distinct to plants. We also show that most of the Arabidopsis ESCRT proteins examined at least partially localize to MVBs in plant cells when ectopically expressed on their own or co-expressed with other interacting ESCRT proteins, and some also induce abnormal MVB phenotypes, consistent with their proposed functional roles in MVB biogenesis. Overall, our results help define the plant ESCRT machinery by highlighting both conserved and unique features when compared to ESCRT in other evolutionarily diverse organisms, providing a foundation for further exploration of ESCRT in plants.

  7. Discovering Protein-Protein Interactions within the Programmed Cell Death Network Using a Protein-Fragment Complementation Screen

    Directory of Open Access Journals (Sweden)

    Yuval Gilad

    2014-08-01

    Apoptosis and autophagy are distinct biological processes, each driven by a different set of protein-protein interactions, with significant crosstalk via direct interactions among apoptotic and autophagic proteins. To measure the global profile of these interactions, we adapted the Gaussia luciferase protein-fragment complementation assay (GLuc PCA), which monitors binding between proteins fused to complementary fragments of a luciferase reporter. A library encompassing 63 apoptotic and autophagic proteins was constructed for the analysis of ∼3,600 protein-pair combinations. This generated a detailed landscape of the apoptotic and autophagic modules and points of interface between them, identifying 46 previously unknown interactions. One of these interactions, between DAPK2, a Ser/Thr kinase that promotes autophagy, and 14-3-3τ, was further investigated. We mapped the region responsible for 14-3-3τ binding and proved that this interaction inhibits DAPK2 dimerization and activity. This proof of concept underscores the power of the GLuc PCA platform for the discovery of biochemical pathways within the cell death network.

  8. Mac-2 binding protein is a cell-adhesive protein of the extracellular matrix which self-assembles into ring-like structures and binds beta1 integrins, collagens and fibronectin

    DEFF Research Database (Denmark)

    Sasaki, T; Brakebusch, C; Engel, J

    1998-01-01

    Human Mac-2 binding protein (M2BP) was prepared in recombinant form from the culture medium of 293 kidney cells and consisted of a 92 kDa subunit. The protein was obtained in a native state as indicated by CD spectroscopy, demonstrating alpha-helical and beta-type structure, and by protease...... in solid-phase assays to collagens IV, V and VI, fibronectin and nidogen, but not to fibrillar collagens I and III or other basement membrane proteins. The protein also mediated adhesion of cell lines at comparable strength with laminin. Adhesion to M2BP was inhibited by antibodies to integrin beta1...

  9. Predicting human protein subcellular locations by the ensemble of multiple predictors via protein-protein interaction network with edge clustering coefficients.

    Directory of Open Access Journals (Sweden)

    Pufeng Du

    One of the fundamental tasks in biology is to identify the functions of all proteins to reveal the primary machinery of a cell. Knowledge of the subcellular locations of proteins will provide key hints to reveal their functions and to understand the intricate pathways that regulate biological processes at the cellular level. Protein subcellular location prediction has been extensively studied in the past two decades. Many methods have been developed based on protein primary sequences as well as on the protein-protein interaction network. In this paper, we propose to use the protein-protein interaction network as an infrastructure to integrate existing sequence-based predictors. When predicting the subcellular locations of a given protein, not only the protein itself but also all its interacting partners are considered. Unlike existing methods, our method requires neither comprehensive knowledge of the protein-protein interaction network nor experimentally annotated subcellular locations for most proteins in the network. Moreover, our method can be used as a framework to integrate multiple predictors. Our method achieved an absolute-true rate of 56% on the human proteome, which is higher than that of state-of-the-art methods.

  10. STRING v9.1: protein-protein interaction networks, with increased coverage and integration.

    Science.gov (United States)

    Franceschini, Andrea; Szklarczyk, Damian; Frankild, Sune; Kuhn, Michael; Simonovic, Milan; Roth, Alexander; Lin, Jianyi; Minguez, Pablo; Bork, Peer; von Mering, Christian; Jensen, Lars J

    2013-01-01

    Complete knowledge of all direct and indirect interactions between proteins in a given cell would represent an important milestone towards a comprehensive description of cellular mechanisms and functions. Although this goal is still elusive, considerable progress has been made, particularly for certain model organisms and functional systems. Currently, protein interactions and associations are annotated at various levels of detail in online resources, ranging from raw data repositories to highly formalized pathway databases. For many applications, a global view of all the available interaction data is desirable, including lower-quality data and/or computational predictions. The STRING database (http://string-db.org/) aims to provide such a global perspective for as many organisms as feasible. Known and predicted associations are scored and integrated, resulting in comprehensive protein networks covering >1100 organisms. Here, we describe the update to version 9.1 of STRING, introducing several improvements: (i) we extend the automated mining of scientific texts for interaction information, to now also include full-text articles; (ii) we entirely re-designed the algorithm for transferring interactions from one model organism to the other; and (iii) we provide users with statistical information on any functional enrichment observed in their networks.
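
    The abstract says only that associations are "scored and integrated" and does not give the integration rule. One simple way to combine per-channel confidence scores, assumed here purely for illustration, is a probabilistic OR over independent evidence channels: combined = 1 - product(1 - s_i). The function name and sample scores are hypothetical, and this simplified sketch omits any correction for the prior probability of a random association, so it should not be read as STRING's actual scoring code.

```python
def combine_scores(channel_scores):
    """Probabilistic OR: 1 - prod(1 - s_i), treating channels as independent."""
    remaining = 1.0  # probability that no channel supports the association
    for s in channel_scores:
        if not 0.0 <= s <= 1.0:
            raise ValueError("scores must lie in [0, 1]")
        remaining *= 1.0 - s
    return 1.0 - remaining

# Hypothetical evidence for one protein pair: experiments and text mining.
combined = combine_scores([0.5, 0.5])
```

    Under this rule, adding a further evidence channel can never lower the combined score, which matches the intuition that independent lines of evidence reinforce one another.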

  11. Interaction and localization diversities of global and local hubs in human protein-protein interaction networks.

    Science.gov (United States)

    Kiran, M; Nagarajaram, H A

    2016-08-16

    Hubs, the highly connected nodes in protein-protein interaction networks (PPINs), are associated with several characteristic properties and are known to perform vital roles in cells. We defined two classes of hubs, global (housekeeping) and local (tissue-specific) hubs. These two categories of hubs are distinct from each other with respect to their abundance, structure and function. However, how distinct the spatial expression patterns and other characteristics of their interacting partners are is still not known. Our investigations revealed that the partners of the local hubs, compared with those of global hubs, are conserved across the tissues in which they are expressed. Partners of local hubs show diverse subcellular localizations as compared with the partners of global hubs. We examined the nature of interacting domains in both categories of hubs and found that they are promiscuous in global hubs but not so in local hubs. Deletion of some of the local and global hubs has an impact on the characteristic path length of the network, indicating that those hubs are inter-modular in nature. Our present study has, therefore, shed further light on the characteristic features of the local and global hubs in the human PPIN. This knowledge of different topological aspects of hubs with regard to their types and subtypes is essential, as it helps in better understanding the roles of hub proteins in various cellular processes under various conditions, including those caused by host-pathogen interactions, and is therefore useful in prioritizing targets for drug design and repositioning.

  12. Global versus local hubs in human protein-protein interaction network.

    Science.gov (United States)

    Kiran, Manjari; Nagarajaram, Hampapathalu Adimurthy

    2013-12-06

    In this study, we have constructed tissue-specific protein-protein interaction networks for 70 human tissues and have identified three types of hubs based on their expression breadths: (a) tissue-specific hubs (TSHs) (proteins that are expressed in ≤ 10 tissues and also form hubs in ≤ 10 tissues), (b) tissue-preferred hubs (TPHs) (proteins expressed in ≥ 60 tissues but are highly connected in ≤ 10 tissues), and (c) housekeeping hubs (HKHs) (proteins that are expressed in ≥ 60 tissues and also form hubs in ≥ 60 tissues). Comparative analyses revealed significant differences between TSHs and HKHs and also revealed that TPHs behave more like HKHs. TSHs are lengthier, more disordered, and also quickly evolving proteins as compared with HKHs. Despite having a similar number of binding surfaces and interacting domains, TSHs are associated with a lower degree of centrality as compared with HKHs, suggesting that TSHs are "unsaturated" with regard to their binding capability and are perhaps evolving with regard to their interactions. TSHs are less abundantly expressed as compared with HKHs and are enriched with PEST motifs, indicating their tight regulation. All of these properties of TSHs and HKHs correlate with their distinct functional roles; TSHs are involved in tissue-specific functional roles, viz., secretors, receptors, and signaling proteins, whereas HKHs are involved in core-cellular functions such as transcription, translation, and so on. Our study, therefore, brings forth a clear and distinct classification of hubs simply based on their expression breadth and further assumes significance in the light of the highly debated dichotomy of date and party hubs, which is based on the coexpression pattern of hubs with their partners.
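
    The three hub classes are defined by two counts per protein: the number of tissues in which it is expressed and the number of tissues in which it is a hub. A minimal sketch of that rule, using the thresholds quoted in the abstract (≤ 10 and ≥ 60 of 70 tissues), is shown below; the function name and the return value for unclassified proteins are assumptions.

```python
def classify_hub(n_expressed, n_hub, low=10, high=60):
    """Classify a hub by expression breadth and hub breadth (tissue counts).

    TSH: expressed in <= low tissues and a hub in <= low tissues.
    TPH: expressed in >= high tissues but a hub in <= low tissues.
    HKH: expressed in >= high tissues and a hub in >= high tissues.
    Returns None when the protein fits none of the three classes.
    """
    if n_expressed <= low and n_hub <= low:
        return "TSH"
    if n_expressed >= high and n_hub <= low:
        return "TPH"
    if n_expressed >= high and n_hub >= high:
        return "HKH"
    return None

# Hypothetical proteins as (tissues expressed, tissues where it is a hub).
examples = {"p1": (4, 3), "p2": (68, 5), "p3": (70, 66), "p4": (35, 20)}
labels = {name: classify_hub(*counts) for name, counts in examples.items()}
```

    Proteins with intermediate breadth, such as the hypothetical "p4", fall outside all three classes and are left unlabelled.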

  13. "Hot cores" in proteins: Comparative analysis of the apolar contact area in structures from hyper/thermophilic and mesophilic organisms

    Directory of Open Access Journals (Sweden)

    Bossa Francesco

    2008-02-01

    Background: A wide variety of stabilizing factors have been invoked so far to elucidate the structural basis of protein thermostability. These include, among others, a higher number of ion-pair interactions and hydrogen bonds, together with better packing of hydrophobic residues. It has been frequently observed that packing of hydrophobic side chains is improved in hyperthermophilic proteins when compared to their mesophilic counterparts. In this work, protein crystal structures from hyper/thermophilic organisms and their mesophilic homologs have been compared in order to quantify the difference in apolar contact area and to assess the role played by hydrophobic contacts in the stabilization of the protein core at high temperatures. Results: The construction of two datasets was carried out so as to satisfy several restrictive criteria, such as minimum redundancy, resolution and R-value thresholds, and lack of any structural defect in the collected structures. This approach made it possible to quantify with relatively high precision the apolar contact area between interacting residues, reducing the uncertainty due to the position of atoms in the crystal structures, the redundancy of data and the size of the dataset. To identify the common core regions of these proteins, the study focused on segments that conserve a similar main chain conformation in the structures analyzed, excluding the intervening regions whose structure differs markedly. The results indicated that hyperthermophilic proteins underwent a significant increase of the hydrophobic contact area contributed by those residues composing the alpha-helices of the structurally conserved regions. Conclusion: This study indicates the decreased flexibility of alpha-helices in the protein core as a major factor contributing to the enhanced thermostability of a number of hyperthermophilic proteins. This effect, in turn, may be due to an increased number of buried methyl groups in

  14. Insight into bacterial virulence mechanisms against host immune response via the Yersinia pestis-human protein-protein interaction network.

    Science.gov (United States)

    Yang, Huiying; Ke, Yuehua; Wang, Jian; Tan, Yafang; Myeni, Sebenzile K; Li, Dong; Shi, Qinghai; Yan, Yanfeng; Chen, Hui; Guo, Zhaobiao; Yuan, Yanzhi; Yang, Xiaoming; Yang, Ruifu; Du, Zongmin

    2011-11-01

    A Yersinia pestis-human protein interaction network is reported here to improve our understanding of its pathogenesis. Up to 204 interactions between 66 Y. pestis bait proteins and 109 human proteins were identified by yeast two-hybrid assay and then combined with 23 previously published interactions to construct a protein-protein interaction network. Topological analysis of the interaction network revealed that human proteins targeted by Y. pestis were significantly enriched in the proteins that are central in the human protein-protein interaction network. Analysis of this network showed that signaling pathways important for host immune responses were preferentially targeted by Y. pestis, including the pathways involved in focal adhesion, regulation of cytoskeleton, leukocyte transendothelial migration, and Toll-like receptor (TLR) and mitogen-activated protein kinase (MAPK) signaling. Cellular pathways targeted by Y. pestis are highly relevant to its pathogenesis. Interactions with host proteins involved in focal adhesion and cytoskeleton regulation pathways could account for resistance of Y. pestis to phagocytosis. Interference with TLR and MAPK signaling pathways by Y. pestis reflects a common characteristic of pathogen-host interaction: bacterial pathogens have evolved to evade the host innate immune response by interacting with proteins in those signaling pathways. Interestingly, a large portion of human proteins interacting with Y. pestis (16/109) also interacted with viral proteins (Epstein-Barr virus [EBV] and hepatitis C virus [HCV]), suggesting that viral and bacterial pathogens attack common cellular functions to facilitate infections. In addition, we identified vasodilator-stimulated phosphoprotein (VASP) as a novel interaction partner of YpkA and showed that YpkA could inhibit in vitro actin assembly mediated by VASP.

  15. Domain distribution and intrinsic disorder in hubs in the human protein-protein interaction network.

    Science.gov (United States)

    Patil, Ashwini; Kinoshita, Kengo; Nakamura, Haruki

    2010-08-01

    Intrinsic disorder and distributed surface charge have been previously identified as some of the characteristics that differentiate hubs (proteins with a large number of interactions) from non-hubs in protein-protein interaction networks. In this study, we investigated the differences in the quantity, diversity, and functional nature of Pfam domains, and their relationship with intrinsic disorder, in hubs and non-hubs. We found that proteins with a more diverse domain composition were over-represented in hubs when compared with non-hubs, with the number of interactions in hubs increasing with domain diversity. Conversely, the fraction of intrinsic disorder in hubs decreased with increasing number of ordered domains. The difference in the levels of disorder was more prominent in hubs and non-hubs with fewer domains. Functional analysis showed that hubs were enriched in kinase and adaptor domains acting primarily in signal transduction and transcription regulation, whereas non-hubs had more DNA-binding domains and were involved in catalytic activity. Consistent with the differences in the functional nature of their domains, hubs with two or more domains were more likely to connect distinct functional modules in the interaction network when compared with single domain hubs. We conclude that the availability of greater number and diversity of ordered domains, in addition to the tendency to have promiscuous domains, differentiates hubs from non-hubs and provides an additional means of achieving interaction promiscuity. Further, hubs with fewer domains use greater levels of intrinsic disorder to facilitate interaction promiscuity with the prevalence of disorder decreasing with increasing number of ordered domains.

  16. Perturbation waves in proteins and protein networks: applications of percolation and game theories in signaling and drug design.

    Science.gov (United States)

    Antal, Miklós A; Böde, Csaba; Csermely, Peter

    2009-04-01

    The network paradigm is increasingly used to describe the dynamics of complex systems. Here we review the current results and propose future development areas in the assessment of perturbation waves, i.e. propagating structural changes in amino acid networks building individual protein molecules and in protein-protein interaction networks (interactomes). We assess the possibilities and critically review the initial attempts for the application of game theory to the often rather complicated process, when two protein molecules approach each other, mutually adjust their conformations via multiple communication steps and finally, bind to each other. We also summarize available data on the application of percolation theory for the prediction of amino acid network- and interactome-dynamics. Furthermore, we give an overview of the dissection of signals and noise in the cellular context of various perturbations. Finally, we propose possible applications of the reviewed methodologies in drug design.

  17. Arabidopsis protein phosphatase DBP1 nucleates a protein network with a role in regulating plant defense.

    Directory of Open Access Journals (Sweden)

    José Luis Carrasco

    Arabidopsis thaliana DBP1 belongs to the plant-specific family of DNA-binding protein phosphatases. Although recently identified as a novel host factor mediating susceptibility to potyvirus, little is known about DBP1 targets and partners and the molecular mechanisms underlying its function. Analyzing changes in the phosphoproteome of a loss-of-function dbp1 mutant enabled the identification of the 14-3-3λ isoform (GRF6), a previously reported DBP1 interactor, and the MAP kinase (MAPK) MPK11 as components of a small protein network nucleated by DBP1, in which GRF6 stability is modulated by MPK11 through phosphorylation, while DBP1 in turn negatively regulates MPK11 activity. Interestingly, grf6 and mpk11 loss-of-function mutants showed altered response to infection by the potyvirus Plum pox virus (PPV), and the described molecular mechanism controlling GRF6 stability was recapitulated upon PPV infection. These results not only contribute to a better knowledge of the biology of DBP factors, but also of MAPK signalling in plants, with the identification of GRF6 as a likely MPK11 substrate and of DBP1 as a protein phosphatase regulating MPK11 activity, and unveil the implication of this protein module in the response to PPV infection in Arabidopsis.

  18. Structure and inhibition of the SARS coronavirus envelope protein ion channel.

    Directory of Open Access Journals (Sweden)

    Konstantin Pervushin

    2009-07-01

    The envelope (E) protein from coronaviruses is a small polypeptide that contains at least one alpha-helical transmembrane domain. Absence, or inactivation, of E protein results in attenuated viruses, due to alterations in either virion morphology or tropism. Apart from its morphogenetic properties, protein E has been reported to have membrane permeabilizing activity. Further, the drug hexamethylene amiloride (HMA), but not amiloride, inhibited in vitro ion channel activity of some synthetic coronavirus E proteins, and also viral replication. We have previously shown for the coronavirus species responsible for severe acute respiratory syndrome (SARS-CoV) that the transmembrane domain of E protein (ETM) forms pentameric alpha-helical bundles that are likely responsible for the observed channel activity. Herein, using solution NMR in dodecylphosphatidylcholine micelles and energy minimization, we have obtained a model of this channel which features regular alpha-helices that form a pentameric left-handed parallel bundle. The drug HMA was found to bind inside the lumen of the channel, at both the C-terminal and the N-terminal openings, and, in contrast to amiloride, induced additional chemical shifts in ETM. Full length SARS-CoV E displayed channel activity when transiently expressed in human embryonic kidney 293 (HEK-293) cells in a whole-cell patch clamp set-up. This activity was significantly reduced by hexamethylene amiloride (HMA), but not by amiloride. The channel structure presented herein provides a possible rationale for inhibition, and a platform for future structure-based drug design of this potential pharmacological target.

  19. Crystal Structure of Menin Reveals Binding Site for Mixed Lineage Leukemia (MLL) Protein

    Energy Technology Data Exchange (ETDEWEB)

    Murai, Marcelo J.; Chruszcz, Maksymilian; Reddy, Gireesh; Grembecka, Jolanta; Cierpicki, Tomasz (Michigan); (UV)

    2014-10-02

    Menin is a tumor suppressor protein that is encoded by the MEN1 (multiple endocrine neoplasia 1) gene and controls cell growth in endocrine tissues. Importantly, menin also serves as a critical oncogenic cofactor of MLL (mixed lineage leukemia) fusion proteins in acute leukemias. Direct association of menin with MLL fusion proteins is required for MLL fusion protein-mediated leukemogenesis in vivo, and this interaction has been validated as a new potential therapeutic target for development of novel anti-leukemia agents. Here, we report the first crystal structure of a menin homolog from Nematostella vectensis. Due to a very high sequence similarity, the Nematostella menin is a close homolog of human menin, and these two proteins likely have very similar structures. Menin is predominantly an α-helical protein with the protein core comprising three tetratricopeptide motifs that are flanked by two α-helical bundles and covered by a β-sheet motif. A very interesting feature of menin structure is the presence of a large central cavity that is highly conserved between Nematostella and human menin. By employing site-directed mutagenesis, we have demonstrated that this cavity constitutes the binding site for MLL. Our data provide a structural basis for understanding the role of menin as a tumor suppressor protein and as an oncogenic co-factor of MLL fusion proteins. They also provide essential structural information for development of inhibitors targeting the menin-MLL interaction as a novel therapeutic strategy in MLL-related leukemias.

  20. Large-scale identification of potential drug targets based on the topological features of human protein-protein interaction network.

    Science.gov (United States)

    Li, Zhan-Chao; Zhong, Wen-Qian; Liu, Zhi-Qing; Huang, Meng-Hua; Xie, Yun; Dai, Zong; Zou, Xiao-Yong

    2015-04-29

    Identifying potential drug target proteins is a crucial step in the process of drug discovery and plays a key role in the study of the molecular mechanisms of disease. Based on the fact that the majority of proteins exert their functions through interacting with each other, we propose a method to recognize target proteins by using the human protein-protein interaction network and graph theory. In the network, vertexes and edges are weighted by using the confidence scores of interactions and descriptors of protein primary structure, respectively. The novel network topological features are defined and employed to characterize proteins using existing databases. The widely used minimum redundancy maximum relevance and random forests algorithms are utilized to select the optimal feature subset and construct a model for the identification of potential drug target proteins at the proteome scale. The accuracies of the training set and test set are 89.55% and 85.23%. Using the constructed model, 2127 potential drug target proteins have been recognized and 156 drug target proteins have been validated in the database of drug targets. In addition, some new drug target proteins can be considered as targets for treating diseases of mucopolysaccharidosis, non-arteritic anterior ischemic optic neuropathy, Bernard-Soulier syndrome and pseudo-von Willebrand, etc. It is anticipated that the proposed method may become a powerful high-throughput virtual screening tool for drug targets. Copyright © 2015 Elsevier B.V. All rights reserved.

  1. Sequence and expression pattern of a novel human orphan G-protein-coupled receptor, GPRC5B, a family C receptor with a short amino-terminal domain

    DEFF Research Database (Denmark)

    Bräuner-Osborne, Hans; Krogsgaard-Larsen, P

    2000-01-01

    Query of GenBank with the amino acid sequence of human metabotropic glutamate receptor subtype 2 (mGluR2) identified a predicted gene product of unknown function on BAC clone CIT987SK-A-69G12 (located on chromosome band 16p12) as a homologous protein. The transcript, entitled GPRC5B, was cloned...... from an expressed sequence tag clone that contained the entire open reading frame of the transcript encoding a protein of 395 amino acids. Analysis of the protein sequence reveals that GPRC5B contains a signal peptide and seven transmembrane alpha-helices, which is a hallmark of G-protein...

  2. Characterization of dry globular proteins and protein fibrils by synchrotron radiation vacuum UV circular dichroism

    DEFF Research Database (Denmark)

    Nesgaard, Lise W.; Hoffmann, Søren Vrønning; Andersen, Christian Beyschau

    2008-01-01

    different types of protein fibrils, highlighting that bona fide fibrils formed by lysozyme are structurally more similar to the nonclassical fibrillar aggregates formed by the SerADan peptide than with the amyloid formed by alpha-synuclein. Thus, despite the lack of direct structural conclusions......Circular dichroism using synchrotron radiation (SRCD) can extend the spectral range down to approximately 130 nm for dry proteins, potentially providing new structural information. Using a selection of dried model proteins, including alpha-helical, beta-sheet, and mixed-structure proteins, we...... observe a low-wavelength band in the range 130-160 nm, whose intensity and peak position is sensitive to the secondary structure of the protein and may also reflect changes in super-secondary structure. This band has previously been observed for peptides but not for globular proteins, and is compatible...

  3. Cost Function Network-based Design of Protein-Protein Interactions: predicting changes in binding affinity.

    Science.gov (United States)

    Viricel, Clément; de Givry, Simon; Schiex, Thomas; Barbe, Sophie

    2018-02-20

    Accurate and economic methods to predict change in protein binding free energy upon mutation are imperative to accelerate the design of proteins for a wide range of applications. Free energy is defined by enthalpic and entropic contributions. Following the recent progress of Artificial Intelligence-based algorithms for guaranteed NP-hard energy optimization and partition function computation, it becomes possible to quickly compute minimum energy conformations and to reliably estimate the entropic contribution of side-chains in the change of free energy of large protein interfaces. Using guaranteed Cost Function Network algorithms, Rosetta energy functions and Dunbrack's rotamer library, we developed and assessed EasyE and JayZ, two methods for binding affinity estimation that ignore or include conformational entropic contributions on a large benchmark of binding affinity experimental measures. While both approaches outperform most established tools, we observe that side-chain conformational entropy brings little or no improvement on most systems but becomes crucial in some rare cases. Availability: as open-source Python/C++ code at sourcesup.renater.fr/projects/easy-jayz. Contact: thomas.schiex@inra.fr and sophie.barbe@insa-toulouse.fr. Supplementary data are available at Bioinformatics online.

  4. Reconstituting Protein Interaction Networks Using Parameter-Dependent Domain-Domain Interactions

    Science.gov (United States)

    2013-05-07

    that approximately 80% of eukaryotic proteins and 67% of prokaryotic proteins have multiple domains [13,14]. Most annotation databases characterize...domain annotations, Domain-domain interactions, Protein-protein interaction networks Background The living cell is a dynamic, interconnected system...detailed in Methods. Here, we illustrate its application on a well-annotated single-cell organism. We created a merged set of protein-domain annotations

  5. Combining sequence and Gene Ontology for protein module detection in the Weighted Network.

    Science.gov (United States)

    Yu, Yang; Liu, Jie; Feng, Nuan; Song, Bo; Zheng, Zeyu

    2017-01-07

    Studies of protein modules in a Protein-Protein Interaction (PPI) network contribute greatly to the understanding of biological mechanisms. With the development of computing science, computational approaches have played an important role in locating protein modules. In this paper, a new approach combining Gene Ontology and amino acid background frequency is introduced to detect the protein modules in the weighted PPI networks. The proposed approach mainly consists of three parts: the feature extraction, the weighted graph construction and the protein complex detection. Firstly, the topology-sequence information is utilized to present the feature of protein complex. Secondly, six types of the weighted graph are constructed by combining PPI network and Gene Ontology information. Lastly, a protein complex algorithm is applied to the weighted graph, which locates the clusters based on three conditions, including density, network diameter and the included angle cosine. Experiments have been conducted on two protein complex benchmark sets for yeast and the results show that the approach is more effective compared to five typical algorithms in terms of f-measure and precision. The combination of protein interaction network with sequence and gene ontology data is helpful to improve the performance and provides an optional method for protein module detection. Copyright © 2016 Elsevier Ltd. All rights reserved.

  6. A network biology approach to understanding the importance of chameleon proteins in human physiology and pathology.

    Science.gov (United States)

    Bahramali, Golnaz; Goliaei, Bahram; Minuchehr, Zarrin; Marashi, Sayed-Amir

    2017-02-01

    Chameleon proteins are proteins which include sequences that can adopt α-helix-β-strand (HE-chameleon) or α-helix-coil (HC-chameleon) or β-strand-coil (CE-chameleon) structures to operate their crucial biological functions. In this study, using a network-based approach, we examined the chameleon proteins to gain better knowledge of these proteins. We focused on proteins with identical chameleon sequences at least seven residues long in different PDB entries, which adopt HE-chameleon, HC-chameleon, and CE-chameleon structures in the same protein. One hundred and ninety-one human chameleon proteins were identified via our in-house program. Then, protein-protein interaction (PPI) networks, Gene ontology (GO) enrichment, disease network, and pathway enrichment analyses were performed for our derived data set. We discovered that there are chameleon sequences which reside in protein-protein interaction regions between two proteins critical for their dual function. Analysis of the PPI networks for chameleon proteins introduced five hub proteins, namely TP53, EGFR, HSP90AA1, PPARA, and HIF1A, which were presented in four PPI clusters. The outcomes demonstrate that the chameleon regions are in critical domains of these proteins and are important in the development and treatment of human cancers. The present report is the first network-based functional study of chameleon proteins using computational approaches and might provide a new perspective for understanding the mechanisms of diseases, helping us in developing new medical therapies along with discovering new proteins with chameleon properties which are highly important in cancer.

  7. Creating a specialist protein resource network: a meeting report for the protein bioinformatics and community resources retreat

    DEFF Research Database (Denmark)

    Babbitt, Patricia C.; Bagos, Pantelis G.; Bairoch, Amos

    2015-01-01

    protein databases from the large Bioinformatics centres (including UniProt and RefSeq). The retreat was divided into five sessions: (1) key challenges, (2) the databases represented, (3) best practices for maintenance and curation, (4) information flow to and from large data centers and (5) communication...... and funding. An important outcome of this meeting was the creation of a Specialist Protein Resource Network that we believe will improve coordination of the activities of its member resources. We invite further protein database resources to join the network and continue the dialogue....

  8. Protein Signaling Networks from Single Cell Fluctuations and Information Theory Profiling

    Science.gov (United States)

    Shin, Young Shik; Remacle, F.; Fan, Rong; Hwang, Kiwook; Wei, Wei; Ahmad, Habib; Levine, R.D.; Heath, James R.

    2011-01-01

    Protein signaling networks among cells play critical roles in a host of pathophysiological processes, from inflammation to tumorigenesis. We report on an approach that integrates microfluidic cell handling, in situ protein secretion profiling, and information theory to determine an extracellular protein-signaling network and the role of perturbations. We assayed 12 proteins secreted from human macrophages that were subjected to lipopolysaccharide challenge, which emulates the macrophage-based innate immune responses against Gram-negative bacteria. We characterize the fluctuations in protein secretion of single cells, and of small cell colonies (n = 2, 3,···), as a function of colony size. Measuring the fluctuations permits a validation of the conditions required for the application of a quantitative version of the Le Chatelier's principle, as derived using information theory. This principle provides a quantitative prediction of the role of perturbations and allows a characterization of a protein-protein interaction network. PMID:21575571

  9. MetaGO: Predicting Gene Ontology of non-homologous proteins through low-resolution protein structure prediction and protein-protein network mapping.

    Science.gov (United States)

    Zhang, Chengxin; Zheng, Wei; Freddolino, Peter L; Zhang, Yang

    2018-03-10

    Homology-based transferal remains the major approach to computational protein function annotations, but it becomes increasingly unreliable when the sequence identity between query and template decreases below 30%. We propose a novel pipeline, MetaGO, to deduce Gene Ontology attributes of proteins by combining sequence homology-based annotation with low-resolution structure prediction and comparison, and partner's-homology based protein-protein network mapping. The pipeline was tested on a large-scale set of 1000 non-redundant proteins from the CAFA3 experiment. Under the stringent benchmark conditions where templates with >30% sequence identity to the query are excluded, MetaGO achieves average F-measures of 0.487, 0.408, and 0.598, for Molecular Function, Biological Process, and Cellular Component, respectively, which are significantly higher than those achieved by other state-of-the-art function annotations methods. Detailed data analysis shows that the major advantage of the MetaGO lies in the new functional homolog detections from partner's-homology based network mapping and structure-based local and global structure alignments, the confidence scores of which can be optimally combined through logistic regression. These data demonstrate the power of using a hybrid model incorporating protein structure and interaction networks to deduce new functional insights beyond traditional sequence-homology based referrals, especially for proteins that lack homologous function templates. The MetaGO pipeline is available at http://zhanglab.ccmb.med.umich.edu/MetaGO/. Copyright © 2018. Published by Elsevier Ltd.
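
    As a rough illustration of the F-measure reported above, the sketch below scores one protein's predicted term set against its annotated set. The function name f_measure and the placeholder term labels are assumptions made for this example; the benchmark protocol actually used for MetaGO may differ in detail, for instance in how confidence thresholds are chosen.

```python
def f_measure(predicted, actual):
    """Harmonic mean of precision and recall for two sets of term labels."""
    predicted, actual = set(predicted), set(actual)
    true_positives = len(predicted & actual)
    if true_positives == 0:
        return 0.0  # also covers empty inputs, avoiding division by zero
    precision = true_positives / len(predicted)
    recall = true_positives / len(actual)
    return 2 * precision * recall / (precision + recall)


# Placeholder labels (hypothetical), not real Gene Ontology identifiers.
predicted_terms = {"term_b", "term_c", "term_x"}
annotated_terms = {"term_a", "term_b", "term_c", "term_d"}
print(round(f_measure(predicted_terms, annotated_terms), 3))
```

    Averaging such per-protein scores over a benchmark set would give one "average F-measure" per ontology aspect, under the assumptions stated above.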

  10. Stoichiometric balance of protein copy numbers is measurable and functionally significant in a protein-protein interaction network for yeast endocytosis.

    Science.gov (United States)

    Holland, David O; Johnson, Margaret E

    2018-03-01

    Stoichiometric balance, or dosage balance, implies that proteins that are subunits of obligate complexes (e.g. the ribosome) should have copy numbers expressed to match their stoichiometry in that complex. Establishing balance (or imbalance) is an important tool for inferring subunit function and assembly bottlenecks. We show here that these correlations in protein copy numbers can extend beyond complex subunits to larger protein-protein interactions networks (PPIN) involving a range of reversible binding interactions. We develop a simple method for quantifying balance in any interface-resolved PPINs based on network structure and experimentally observed protein copy numbers. By analyzing such a network for the clathrin-mediated endocytosis (CME) system in yeast, we found that the real protein copy numbers were significantly more balanced in relation to their binding partners compared to randomly sampled sets of yeast copy numbers. The observed balance is not perfect, highlighting both under and overexpressed proteins. We evaluate the potential cost and benefits of imbalance using two criteria. First, a potential cost to imbalance is that 'leftover' proteins without remaining functional partners are free to misinteract. We systematically quantify how this misinteraction cost is most dangerous for strong-binding protein interactions and for network topologies observed in biological PPINs. Second, a more direct consequence of imbalance is that the formation of specific functional complexes depends on relative copy numbers. We therefore construct simple kinetic models of two sub-networks in the CME network to assess multi-protein assembly of the ARP2/3 complex and a minimal, nine-protein clathrin-coated vesicle forming module. We find that the observed, imperfectly balanced copy numbers are less effective than balanced copy numbers in producing fast and complete multi-protein assemblies. However, we speculate that strategic imbalance in the vesicle forming module

  11. Identifying potential survival strategies of HIV-1 through virus-host protein interaction networks

    Directory of Open Access Journals (Sweden)

    Boucher Charles AB

    2010-07-01

    Background: The National Institute of Allergy and Infectious Diseases has launched the HIV-1 Human Protein Interaction Database in an effort to catalogue all published interactions between HIV-1 and human proteins. In order to systematically investigate these interactions functionally and dynamically, we have constructed an HIV-1 human protein interaction network. This network was analyzed for important proteins and processes that are specific for the HIV life-cycle. In order to expose viral strategies, network motif analysis was carried out showing reoccurring patterns in virus-host dynamics. Results: Our analyses show that human proteins interacting with HIV form a densely connected and central sub-network within the total human protein interaction network. The evaluation of this sub-network for connectivity and centrality resulted in a set of proteins essential for the HIV life-cycle. Remarkably, we were able to associate proteins involved in RNA polymerase II transcription with hubs and proteasome formation with bottlenecks. Inferred network motifs show significant over-representation of positive and negative feedback patterns between virus and host. Strikingly, such patterns have never been reported in combined virus-host systems. Conclusions: HIV infection results in a reprioritization of cellular processes reflected by an increase in the relative importance of transcriptional machinery and proteasome formation. We conclude that during the evolution of HIV, some patterns of interaction have been selected for, resulting in a system where virus proteins preferably interact with central human proteins for direct control and with proteasomal proteins for indirect control over the cellular processes. Finally, the patterns described by network motifs illustrate how virus and host interact with one another.

  12. Similar pathogen targets in Arabidopsis thaliana and Homo sapiens protein networks.

    Directory of Open Access Journals (Sweden)

    Paulo Shakarian

    We study the behavior of pathogens on host protein networks for humans and Arabidopsis - noting striking similarities. Specifically, we perform k-shell decomposition analysis on these networks - which groups the proteins into various "shells" based on network structure. We observe that shells with a higher average degree are more highly targeted (with a power-law relationship) and that highly targeted nodes lie in shells closer to the inner-core of the network. Additionally, we also note that the inner core of the network is significantly under-targeted. We show that these core proteins may have a role in intra-cellular communication and hypothesize that they are less attacked to ensure survival of the host. This may explain why certain high-degree proteins are not significantly attacked.
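
    The shell grouping described in this abstract can be sketched with the standard peeling procedure for k-shell decomposition: repeatedly remove the nodes of lowest remaining degree and record the level at which each node is removed. The function name k_shell_decomposition and the toy edge list are assumptions for illustration only and are not taken from the study.

```python
from collections import defaultdict


def k_shell_decomposition(edges):
    """Return {node: shell index} for an undirected graph given as edge pairs."""
    adjacency = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            adjacency[u].add(v)
            adjacency[v].add(u)
    degree = {node: len(nbrs) for node, nbrs in adjacency.items()}
    remaining = set(adjacency)
    shell = {}
    k = 0
    while remaining:
        # Raise k to the smallest degree still present in the graph.
        k = max(k, min(degree[node] for node in remaining))
        queue = [node for node in remaining if degree[node] <= k]
        while queue:
            node = queue.pop()
            if node not in remaining:
                continue  # already peeled via a duplicate queue entry
            remaining.discard(node)
            shell[node] = k
            for nbr in adjacency[node]:
                if nbr in remaining:
                    degree[nbr] -= 1
                    if degree[nbr] <= k:
                        queue.append(nbr)
    return shell


# Hypothetical toy network: a triangle (a, b, c) with a two-node tail (d, e).
toy_edges = [("a", "b"), ("b", "c"), ("a", "c"), ("a", "d"), ("d", "e")]
print(k_shell_decomposition(toy_edges))
```

    In this toy graph the tail nodes fall in shell 1 and the triangle in shell 2, so the triangle plays the role of the inner core.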

  13. Exploring hierarchical and overlapping modular structure in the yeast protein interaction network

    Directory of Open Access Journals (Sweden)

    Zhao Yi

    2010-12-01

    Background: Developing effective strategies to reveal modular structures in protein interaction networks is crucial for better understanding of molecular mechanisms of underlying biological processes. In this paper, we propose a new density-based algorithm (ADHOC) for clustering vertices of a protein interaction network using a novel subgraph density measurement. Results: By statistically evaluating several independent criteria, we found that ADHOC could significantly improve the outcome as compared with five previously reported density-dependent methods. We further applied ADHOC to investigate the hierarchical and overlapping modular structure in the yeast PPI network. Our method could effectively detect both protein modules and the overlaps between them, and thus greatly promote the precise prediction of protein functions. Moreover, by further assaying the intermodule layer of the yeast PPI network, we classified hubs into two types, module hubs and inter-module hubs. Each type presents distinct characteristics both in network topology and biological functions, which could contribute to a better understanding of the relationship between network architecture and biological implications. Conclusions: Our proposed algorithm based on the novel subgraph density measurement makes it possible to more precisely detect hierarchical and overlapping modular structures in protein interaction networks. In addition, our method also shows strong robustness against noise in the network, which is quite critical for analyzing such a high-noise network.
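
    The abstract does not define the novel subgraph density measurement used by ADHOC. For orientation only, the sketch below computes the conventional edge density of an induced subgraph (internal edges divided by the number of possible node pairs), which is the usual starting point for density-based clustering. The function name subgraph_density and the toy data are assumptions, and this is not the ADHOC measure itself.

```python
def subgraph_density(edges, nodes):
    """Conventional edge density of the subgraph induced by `nodes`."""
    nodes = set(nodes)
    n = len(nodes)
    if n < 2:
        return 0.0  # density is undefined for fewer than two nodes; use 0.0
    # Collect undirected internal edges once each, skipping self-loops.
    internal = {
        frozenset((u, v))
        for u, v in edges
        if u != v and u in nodes and v in nodes
    }
    return 2 * len(internal) / (n * (n - 1))


# Hypothetical toy network: a triangle (a, b, c) with one extra node d.
toy_edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]
print(subgraph_density(toy_edges, {"a", "b", "c"}))
print(subgraph_density(toy_edges, {"a", "b", "c", "d"}))
```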

  14. The organisational structure of protein networks: revisiting the centrality-lethality hypothesis.

    Science.gov (United States)

    Raman, Karthik; Damaraju, Nandita; Joshi, Govind Krishna

    2014-03-01

    Protein networks, describing physical interactions as well as functional associations between proteins, have been unravelled for many organisms in the recent past. Databases such as STRING provide excellent resources for the analysis of such networks. In this contribution, we revisit the organisation of protein networks, particularly the centrality-lethality hypothesis, which hypothesises that nodes with higher centrality in a network are more likely to produce lethal phenotypes on removal, compared to nodes with lower centrality. We consider the protein networks of a diverse set of 20 organisms, with essentiality information available in the Database of Essential Genes, and assess the relationship between centrality measures and lethality. For each of these organisms, we obtained networks of high-confidence interactions from the STRING database, and computed network parameters such as degree, betweenness centrality, closeness centrality and pairwise disconnectivity indices. We observe that the networks considered here are predominantly disassortative. Further, we observe that essential nodes in a network have a significantly higher average degree and betweenness centrality, compared to the network average. Most previous studies have evaluated the centrality-lethality hypothesis for Saccharomyces cerevisiae and Escherichia coli; we here observe that the centrality-lethality hypothesis holds good for a large number of organisms, with certain limitations. Betweenness centrality may also be a useful measure to identify essential nodes, but measures like closeness centrality and pairwise disconnectivity are not significantly higher for essential nodes.
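
    The simplest comparison mentioned in this abstract, average degree of essential nodes versus the network average, can be sketched as follows. The function names, the toy edge list and the choice of essential node are all hypothetical; the study itself used STRING networks and the Database of Essential Genes, and also assessed statistical significance and other centrality measures, which this sketch omits.

```python
from collections import defaultdict
from statistics import mean


def degree_by_node(edges):
    """Count the number of edges incident to each node of an undirected graph."""
    degree = defaultdict(int)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return dict(degree)


def mean_degree_essential_vs_all(edges, essential):
    """Return (mean degree of essential nodes, mean degree of all nodes).

    Assumes at least one essential node appears in the edge list.
    """
    degree = degree_by_node(edges)
    essential_degrees = [d for node, d in degree.items() if node in essential]
    return mean(essential_degrees), mean(degree.values())


# Hypothetical toy network: hub h linked to a, b and c, plus an a-b edge.
toy_edges = [("h", "a"), ("h", "b"), ("h", "c"), ("a", "b")]
print(mean_degree_essential_vs_all(toy_edges, essential={"h"}))
```

    Here the single essential node has degree 3 against a network average of 2, the pattern the centrality-lethality hypothesis predicts.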

  15. PADPIN: protein-protein interaction networks of angiogenesis, arteriogenesis, and inflammation in peripheral arterial disease

    Science.gov (United States)

    Vijay, Chaitanya G.; Annex, Brian H.; Bader, Joel S.; Popel, Aleksander S.

    2015-01-01

    Peripheral arterial disease (PAD) results from an obstruction of blood flow in the arteries other than the heart, most commonly the arteries that supply the legs. The complexity of the known signaling pathways involved in PAD, including various growth factor pathways and their cross talks, suggests that analyses of high-throughput experimental data could lead to a new level of understanding of the disease as well as novel and heretofore unanticipated potential targets. Such bioinformatic analyses have not been systematically performed for PAD. We constructed global protein-protein interaction networks of angiogenesis (Angiome), immune response (Immunome), and arteriogenesis (Arteriome) using our previously developed algorithm GeneHits. The term “PADPIN” refers to the angiome, immunome, and arteriome in PAD. Here we analyze four microarray gene expression datasets from ischemic and nonischemic gastrocnemius muscles at day 3 posthindlimb ischemia (HLI) in two genetically different C57BL/6 and BALB/c mouse strains that display differential susceptibility to HLI to identify potential targets and signaling pathways in angiogenesis, immune, and arteriogenesis networks. We hypothesize that identification of the differentially expressed genes in ischemic and nonischemic muscles between the strains that recovers better (C57BL/6) vs. the strain that recovers more poorly (BALB/c) will help for the prediction of target genes in PAD. Our bioinformatics analysis identified several genes that are differentially expressed between the two mouse strains with known functions in PAD including TLR4, THBS1, and PRKAA2 and several genes with unknown functions in PAD including EphA4, TSPAN7, SLC22A4, and EIF2a. PMID:26058837

  16. Automatic extraction of gene ontology annotation and its correlation with clusters in protein networks

    Directory of Open Access Journals (Sweden)

    Mazo Ilya

    2007-07-01

    Background: Uncovering cellular roles of a protein is a task of tremendous importance and complexity that requires dedicated experimental work as well as often sophisticated data mining and processing tools. Protein functions, often referred to as its annotations, are believed to manifest themselves through topology of the networks of inter-protein interactions. In particular, there is a growing body of evidence that proteins performing the same function are more likely to interact with each other than with proteins with other functions. However, since functional annotation and protein network topology are often studied separately, the direct relationship between them has not been comprehensively demonstrated. In addition to having general biological significance, such a demonstration would further validate the data extraction and processing methods used to compose protein annotation and protein-protein interaction datasets. Results: We developed a method for automatic extraction of protein functional annotation from scientific text based on Natural Language Processing (NLP) technology. For the protein annotation extracted from the entire PubMed, we evaluated the precision and recall rates, and compared the performance of the automatic extraction technology to that of manual curation used in public Gene Ontology (GO) annotation. In the second part of our presentation, we reported a large-scale investigation into the correspondence between communities in the literature-based protein networks and GO annotation groups of functionally related proteins. We found a comprehensive two-way match: proteins within biological annotation groups form significantly denser linked network clusters than expected by chance and, conversely, densely linked network communities exhibit a pronounced non-random overlap with GO groups. We also expanded the publicly available GO biological process annotation using the relations extracted by our NLP technology

  17. Discovering overlapped protein complexes from weighted PPI networks by removing inter-module hubs.

    Science.gov (United States)

    Maddi, A M A; Eslahchi, Ch

    2017-06-12

    Detecting known protein complexes and predicting undiscovered protein complexes from protein-protein interaction (PPI) networks help us to understand principles of cell organization and its functions. Nevertheless, the discovery of protein complexes based on experiment still needs to be explored. Therefore, computational methods are useful approaches to overcome the experimental limitations. Nevertheless, extraction of protein complexes from PPI network is often nontrivial. Two major constraints are large amount of noise and ignorance of occurrence time of different interactions in PPI network. In this paper, an efficient algorithm, Inter Module Hub Removal Clustering (IMHRC), is developed based on inter-module hub removal in the weighted PPI network which can detect overlapped complexes. By removing some of the inter-module hubs and module hubs, IMHRC eliminates high amount of noise in dataset and implicitly considers different occurrence time of the PPI in network. The performance of the IMHRC was evaluated on several benchmark datasets and results were compared with some of the state-of-the-art models. The protein complexes discovered with the IMHRC method show significantly better agreement with the real complexes than other current methods. Our algorithm provides an accurate and scalable method for detecting and predicting protein complexes from PPI networks.

  18. Creating a specialist protein resource network: a meeting report for the protein bioinformatics and community resources retreat.

    Science.gov (United States)

    Babbitt, Patricia C; Bagos, Pantelis G; Bairoch, Amos; Bateman, Alex; Chatonnet, Arnaud; Chen, Mark Jinan; Craik, David J; Finn, Robert D; Gloriam, David; Haft, Daniel H; Henrissat, Bernard; Holliday, Gemma L; Isberg, Vignir; Kaas, Quentin; Landsman, David; Lenfant, Nicolas; Manning, Gerard; Nagano, Nozomi; Srinivasan, Narayanaswamy; O'Donovan, Claire; Pruitt, Kim D; Sowdhamini, Ramanathan; Rawlings, Neil D; Saier, Milton H; Sharman, Joanna L; Spedding, Michael; Tsirigos, Konstantinos D; Vastermark, Ake; Vriend, Gerrit

    2015-01-01

    During 11-12 August 2014, a Protein Bioinformatics and Community Resources Retreat was held at the Wellcome Trust Genome Campus in Hinxton, UK. This meeting brought together the principal investigators of several specialized protein resources (such as CAZy, TCDB and MEROPS) as well as those from protein databases from the large Bioinformatics centres (including UniProt and RefSeq). The retreat was divided into five sessions: (1) key challenges, (2) the databases represented, (3) best practices for maintenance and curation, (4) information flow to and from large data centers and (5) communication and funding. An important outcome of this meeting was the creation of a Specialist Protein Resource Network that we believe will improve coordination of the activities of its member resources. We invite further protein database resources to join the network and continue the dialogue.

  19. Landscape mapping of functional proteins in insulin signal transduction and insulin resistance: a network-based protein-protein interaction analysis.

    Directory of Open Access Journals (Sweden)

    Chiranjib Chakraborty

    Full Text Available Type 2 diabetes has increased rapidly throughout the world in recent years. Disruption of the insulin signal transduction mechanism is known as insulin resistance, and it is one of the primary causes associated with type 2 diabetes. The signaling mechanism involves several proteins, including 7 major functional proteins: INS, INSR, IRS1, IRS2, PIK3CA, Akt2, and GLUT4. Using these 7 principal proteins, a multiple sequence alignment was created, and the scores between sequences were also computed. We constructed a phylogenetic tree and modified it with node and distance. In addition, we generated sequence logos and ultimately developed the protein-protein interaction network. The small insulin signal transduction protein arrangement shows a complex network between the functional proteins.

  20. Semantic integration to identify overlapping functional modules in protein interaction networks

    Directory of Open Access Journals (Sweden)

    Ramanathan Murali

    2007-07-01

    Full Text Available Abstract Background The systematic analysis of protein-protein interactions can enable a better understanding of cellular organization, processes and functions. Functional modules can be identified from the protein interaction networks derived from experimental data sets. However, these analyses are challenging because of the presence of unreliable interactions and the complex connectivity of the network. The integration of protein-protein interactions with the data from other sources can be leveraged for improving the effectiveness of functional module detection algorithms. Results We have developed novel metrics, called semantic similarity and semantic interactivity, which use Gene Ontology (GO) annotations to measure the reliability of protein-protein interactions. The protein interaction networks can be converted into a weighted graph representation by assigning the reliability values to each interaction as a weight. We presented a flow-based modularization algorithm to efficiently identify overlapping modules in the weighted interaction networks. The experimental results show that the semantic similarity and semantic interactivity of interacting pairs were positively correlated with functional co-occurrence. The effectiveness of the algorithm for identifying modules was evaluated using functional categories from the MIPS database. We demonstrated that our algorithm had higher accuracy compared to other competing approaches. Conclusion The integration of protein interaction networks with GO annotation data and the capability of detecting overlapping modules substantially improve the accuracy of module identification.
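
    The conversion of an interaction list into a weighted graph can be sketched as follows. This is only an illustration: the abstract does not define semantic similarity or semantic interactivity, so a plain Jaccard overlap of GO-term sets is used as a stand-in, and all protein names and GO identifiers are hypothetical placeholders.

```python
def jaccard(terms_a, terms_b):
    """Jaccard overlap of two GO-term sets (a stand-in, not the paper's metric)."""
    union = terms_a | terms_b
    return len(terms_a & terms_b) / len(union) if union else 0.0


def weight_interactions(interactions, annotations):
    """Assign each interacting pair a reliability weight in [0, 1]."""
    weights = {}
    for a, b in interactions:
        terms_a = annotations.get(a, set())
        terms_b = annotations.get(b, set())
        weights[(a, b)] = jaccard(terms_a, terms_b)
    return weights


# Hypothetical toy data: names and GO identifiers are placeholders.
annotations = {
    "P1": {"GO:1", "GO:2"},
    "P2": {"GO:2", "GO:3"},
    "P3": {"GO:9"},
}
interactions = [("P1", "P2"), ("P1", "P3")]
weighted = weight_interactions(interactions, annotations)
```

    In this toy example the pair sharing an annotation receives a positive weight, while the pair with no shared annotation receives zero and would be treated as unreliable.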

  1. Novel insights through the integration of structural and functional genomics data with protein networks.

    Science.gov (United States)

    Clarke, Declan; Bhardwaj, Nitin; Gerstein, Mark B

    2012-09-01

    In recent years, major advances in genomics, proteomics, macromolecular structure determination, and the computational resources capable of processing and disseminating the large volumes of data generated by each have played major roles in advancing a more systems-oriented appreciation of biological organization. One product of systems biology has been the delineation of graph models for describing genome-wide protein-protein interaction networks. The network organization and topology which emerges in such models may be used to address fundamental questions in an array of cellular processes, as well as biological features intrinsic to the constituent proteins (or "nodes") themselves. However, graph models alone constitute an abstraction which neglects the underlying biological and physical reality that the network's nodes and edges are highly heterogeneous entities. Here, we explore some of the advantages of introducing a protein structural dimension to such models, as the marriage of conventional network representations with macromolecular structural data helps to place static node and edge constructs in a biologically more meaningful context. We emphasize that 3D protein structures constitute a valuable conceptual and predictive framework by discussing examples of the insights provided, such as enabling in silico predictions of protein-protein interactions, providing rational and compelling classification schemes for network elements, as well as revealing interesting intrinsic differences between distinct node types, such as disorder and evolutionary features, which may then be rationalized in light of their respective functions within networks. Copyright © 2012 Elsevier Inc. All rights reserved.

  2. Protein Secondary Structure Prediction Using AutoEncoder Network and Bayes Classifier

    Science.gov (United States)

    Wang, Leilei; Cheng, Jinyong

    2018-03-01

    Protein secondary structure prediction belongs to bioinformatics and is an important research area. In this paper, we propose a new method for predicting protein secondary structure using a Bayes classifier and an autoencoder network. Our experiments cover the construction of the model, the classification of parameters and so on. The data set is the typical CB513 protein data set. Accuracy is assessed by 3-fold cross-validation, from which we obtain the Q3 accuracy. The results illustrate that the autoencoder network improved the prediction accuracy of protein secondary structure.
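
    For readers unfamiliar with the Q3 measure, a minimal sketch is given below. It assumes the usual three-state labelling (H for helix, E for strand, C for coil) and computes the fraction of residues whose predicted state matches the observed one; the function name and the two example strings are hypothetical and are not taken from the paper.

```python
def q3_accuracy(predicted, observed):
    """Fraction of residues whose predicted 3-state label matches the observed one."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("sequences must not be empty")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)


# Hypothetical 8-residue example; two of the eight states are mispredicted.
observed = "CHHHHEEC"
predicted = "CHHHCEEE"
q3 = q3_accuracy(predicted, observed)  # 6 matches out of 8 residues
```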

  3. Protein-lipid interactions: from membrane domains to cellular networks

    National Research Council Canada - National Science Library

    Tamm, Lukas K

    2005-01-01

    ... membranes is the lipid bilayer. Embedded in the fluid lipid bilayer are proteins of various shapes and traits. This volume illuminates from physical, chemical and biological angles the numerous - mostly quite weak - interactions between lipids, proteins, and proteins and lipids that define the delicate, highly dynamic and yet so stable fabri...

  4. A Web server for predicting proteins involved in pluripotent network

    Indian Academy of Sciences (India)

    2016-11-04

    Nov 4, 2016 ... the self-renewal property. PluriPred predicts whether a protein is involved in pluripotency from primary protein sequence using manually curated pluripotent proteins as training datasets. Machine learning techniques (MLTs) such as Support Vector Machine (SVM), Naïve Bayes (NB), Random Forest (RF), ...

  5. msiDBN: A Method of Identifying Critical Proteins in Dynamic PPI Networks

    Directory of Open Access Journals (Sweden)

    Yuan Zhang

    2014-01-01

    Full Text Available The dynamics of protein-protein interactions (PPIs) reveal the recondite principles of biological processes inside a cell. As a wealth of studies has shown, just a small group of proteins, rather than the majority, plays the more essential roles at crucial points of biological processes. The present work focuses on identifying these critical proteins, which exhibit dramatic structural changes in dynamic PPI networks. First, a comprehensive way of modeling the dynamic PPIs is presented which simultaneously analyzes the activity of proteins and assembles the dynamic coregulation correlation between proteins at each time point. Second, a novel method is proposed, named msiDBN, which models a common representation of multiple PPI networks using a deep belief network framework and analyzes the reconstruction errors and the variabilities across the time courses in the biological process. Experiments were implemented on data of yeast cell cycles. We evaluated our network construction method by comparing the functional representations of the derived networks with two other traditional construction methods. The ranking results of critical proteins in msiDBN were compared with the results from the baseline methods. The comparison showed that msiDBN had a better reconstruction rate and identified more proteins of critical value to the yeast cell cycle process.

  6. A Type-2 fuzzy data fusion approach for building reliable weighted protein interaction networks with application in protein complex detection.

    Science.gov (United States)

    Mehranfar, Adele; Ghadiri, Nasser; Kouhsar, Morteza; Golshani, Ashkan

    2017-09-01

    Detecting protein complexes is an important task in analyzing protein interaction networks. Although many algorithms predict protein complexes in different ways, surveys on the interaction networks indicate that about 50% of detected interactions are false positives. Consequently, the accuracy of existing methods needs to be improved. In this paper we propose a novel algorithm to detect the protein complexes in 'noisy' protein interaction data. First, we integrate several biological data sources to determine the reliability of each interaction and determine more accurate weights for the interactions. A data fusion component is used for this step, based on the interval type-2 fuzzy voter that provides an efficient combination of the information sources. This fusion component detects the errors and diminishes their effect on the detection of protein complexes. Thus, in the first step, reliability scores are assigned to every interaction in the network. In the second step, we propose a general protein complex detection algorithm by exploiting and adopting the strong points of other algorithms and existing hypotheses regarding real complexes. Finally, the proposed method has been applied to the yeast interaction datasets for predicting the interactions. The results show that our framework performs better in terms of precision and F-measure than the existing approaches. Copyright © 2017 Elsevier Ltd. All rights reserved.

  7. Prediction of protein hydration sites from sequence by modular neural networks

    DEFF Research Database (Denmark)

    Ehrlich, L.; Reczko, M.; Bohr, Henrik

    1998-01-01

    The hydration properties of a protein are important determinants of its structure and function. Here, modular neural networks are employed to predict ordered hydration sites using protein sequence information. First, secondary structure and solvent accessibility are predicted from sequence with two...... separate neural networks. These predictions are used as input together with protein sequences for networks predicting hydration of residues, backbone atoms and sidechains. These networks are trained with protein crystal structures. The prediction of hydration is improved by adding information on secondary...... structure and solvent accessibility and, using actual values of these properties, residue hydration can be predicted to 77% accuracy with a Matthews coefficient of 0.43. However, predicted property data with an accuracy of 60-70% result in less than half the improvement in predictive performance observed...

  8. Defining the protein interaction network of human malaria parasite Plasmodium falciparum

    KAUST Repository

    Ramaprasad, Abhinay

    2012-02-01

    Malaria, caused by the protozoan parasite Plasmodium falciparum, affects around 225 million people yearly and a huge international effort is directed towards combating this grave threat to world health and economic development. Considerable advances have been made in malaria research triggered by the sequencing of its genome in 2002, followed by several high-throughput studies defining the malaria transcriptome and proteome. A protein-protein interaction (PPI) network seeks to trace the dynamic interactions between proteins, thereby elucidating their local and global functional relationships. Experimentally derived PPI networks from high-throughput methods such as yeast two hybrid (Y2H) screens are inherently noisy, but combining these independent datasets by computational methods tends to give greater accuracy and coverage. This review aims to discuss the computational approaches used to date to construct a malaria protein interaction network and to catalog the functional predictions and biological inferences made from analysis of the PPI network. © 2011 Elsevier Inc.

  9. Topology and weights in a protein domain interaction network – a novel way to predict protein interactions

    Directory of Open Access Journals (Sweden)

    Wuchty Stefan

    2006-05-01

    Full Text Available Abstract Background While the analysis of unweighted biological webs as diverse as genetic, protein and metabolic networks allowed spectacular insights into the inner workings of a cell, biological networks are not only determined by their static grid of links. In fact, we expect that the heterogeneity in the utilization of connections has a major impact on the organization of cellular activities as well. Results We consider a web of interactions between protein domains of the Protein Family database (PFAM), which are weighted by a probability score. We apply metrics that combine the static layout and the weights of the underlying interactions. We observe that unweighted measures as well as their weighted counterparts largely share the same trends in the underlying domain interaction network. However, we only find weak signals that weights and the static grid of interactions are connected entities. Therefore, assuming that a protein interaction is governed by a single domain interaction, we observe strong and significant correlations of the highest scoring domain interaction and the confidence of protein interactions in the underlying interactions of yeast and fly. Modeling an interaction between proteins if we find a high scoring protein domain interaction, we obtain 1,428 protein interactions among 361 proteins in the human malaria parasite Plasmodium falciparum. Assessing their quality by a logistic regression method, we observe that increasing confidence of predicted interactions is accompanied by high scoring domain interactions and elevated levels of functional similarity and evolutionary conservation. Conclusion Our results indicate that probability scores are randomly distributed, allowing us to treat the static grid and the weights of domain interactions as separate entities. In particular, this finding confirms earlier observations that a protein interaction is a matter of a single interaction event on domain level. As an immediate application, we
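
    The prediction rule described above, in which a protein pair is scored by its highest scoring domain pair, can be sketched roughly as follows. The threshold value, the data layout and all protein and domain names are assumptions made for illustration; the abstract does not state how the cutoff was chosen.

```python
from itertools import product


def best_domain_score(domains_a, domains_b, domain_scores):
    """Highest score over all domain pairs spanning the two proteins (0.0 if none)."""
    best = 0.0
    for da, db in product(domains_a, domains_b):
        score = domain_scores.get(frozenset((da, db)), 0.0)
        best = max(best, score)
    return best


def predict_interactions(protein_domains, domain_scores, threshold):
    """Predict a protein interaction when the best domain-pair score meets the threshold."""
    proteins = sorted(protein_domains)
    predicted = {}
    for i, a in enumerate(proteins):
        for b in proteins[i + 1:]:
            score = best_domain_score(protein_domains[a], protein_domains[b], domain_scores)
            if score >= threshold:
                predicted[(a, b)] = score
    return predicted


# Hypothetical toy data: domain names, scores and the 0.5 cutoff are placeholders.
domain_scores = {
    frozenset(("D1", "D2")): 0.9,
    frozenset(("D2", "D3")): 0.4,
}
protein_domains = {"A": ["D1"], "B": ["D2", "D3"], "C": ["D3"]}
predicted = predict_interactions(protein_domains, domain_scores, threshold=0.5)
```

    Only the pair (A, B) passes the cutoff in this toy example, because its best domain pair scores 0.9, whereas the best domain pair for (B, C) scores 0.4.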

  10. Emergence of Complexity in Protein Functions and Metabolic Networks

    Science.gov (United States)

    Pohorille, Andrzej

    2009-01-01

    In modern organisms proteins perform a majority of cellular functions, such as chemical catalysis, energy transduction and transport of material across cell walls. Although great strides have been made towards understanding protein evolution, a meaningful extrapolation from contemporary proteins to their earliest ancestors is virtually impossible. In an alternative approach, the origin of water-soluble proteins was probed through the synthesis of very large libraries of random amino acid sequences and subsequently subjecting them to in vitro evolution. In combination with computer modeling and simulations, these experiments allow us to address a number of fundamental questions about the origins of proteins. Can functionality emerge from random sequences of proteins? How did the initial repertoire of functional proteins diversify to facilitate new functions? Did this diversification proceed primarily through drawing novel functionalities from random sequences or through evolution of already existing proto-enzymes? Did protein evolution start from a pool of proteins defined by a frozen accident and other collections of proteins could start a different evolutionary pathway? Although we do not have definitive answers to these questions, important clues have been uncovered. Considerable progress has been also achieved in understanding the origins of membrane proteins. We will address this issue in the example of ion channels - proteins that mediate transport of ions across cell walls. Remarkably, despite overall complexity of these proteins in contemporary cells, their structural motifs are quite simple, with α-helices being most common. By combining results of experimental and computer simulation studies on synthetic models and simple, natural channels, I will show that, even though architectures of membrane proteins are not nearly as diverse as those of water-soluble proteins, they are sufficiently flexible to adapt readily to the functional demands arising during

  11. Topological, functional, and dynamic properties of the protein interaction networks rewired by benzo(a)pyrene

    Energy Technology Data Exchange (ETDEWEB)

    Ba, Qian [Key Laboratory of Food Safety Research, Institute for Nutritional Sciences, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai (China); Key Laboratory of Food Safety Risk Assessment, Ministry of Health, Beijing (China); Li, Junyang; Huang, Chao [Key Laboratory of Food Safety Research, Institute for Nutritional Sciences, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai (China); Li, Jingquan; Chu, Ruiai [Key Laboratory of Food Safety Research, Institute for Nutritional Sciences, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai (China); Key Laboratory of Food Safety Risk Assessment, Ministry of Health, Beijing (China); Wu, Yongning, E-mail: wuyongning@cfsa.net.cn [Key Laboratory of Food Safety Risk Assessment, Ministry of Health, Beijing (China); Wang, Hui, E-mail: huiwang@sibs.ac.cn [Key Laboratory of Food Safety Research, Institute for Nutritional Sciences, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai (China); Key Laboratory of Food Safety Risk Assessment, Ministry of Health, Beijing (China); School of Life Science and Technology, ShanghaiTech University, Shanghai (China)

    2015-03-01

    Benzo(a)pyrene is a common environmental and foodborne pollutant that has been identified as a human carcinogen. Although the carcinogenicity of benzo(a)pyrene has been extensively reported, its precise molecular mechanisms and the influence on system-level protein networks are not well understood. To investigate the system-level influence of benzo(a)pyrene on protein interactions and regulatory networks, a benzo(a)pyrene-rewired protein interaction network was constructed based on 769 key proteins derived from more than 500 literature reports. The protein interaction network rewired by benzo(a)pyrene was a scale-free, highly-connected biological system. Ten modules were identified, and 25 signaling pathways were enriched, most of which belong to the human diseases category, especially cancer and infectious disease. In addition, two lung-specific and two liver-specific pathways were identified. Three pathways were specific in short and medium-term networks (< 48 h), and five pathways were enriched only in the medium-term network (6 h–48 h). Finally, the expression of linker genes in the network was validated by Western blotting. These findings establish the overall, tissue- and time-specific benzo(a)pyrene-rewired protein interaction networks and provide insights into the biological effects and molecular mechanisms of action of benzo(a)pyrene. - Highlights: • Benzo(a)pyrene induced scale-free, highly-connected protein interaction networks. • 25 signaling pathways were enriched through modular analysis. • Tissue- and time-specific pathways were identified.

  12. Protein-Protein Interaction Article Classification Using a Convolutional Recurrent Neural Network with Pre-trained Word Embeddings.

    Science.gov (United States)

    Matos, Sérgio; Antunes, Rui

    2017-12-13

    Curation of protein interactions from scientific articles is an important task, since interaction networks are essential for the understanding of biological processes associated with disease or pharmacological action, for example. However, the increase in the number of publications that potentially contain relevant information turns this into a very challenging and expensive task. In this work we used a convolutional recurrent neural network for identifying relevant articles for extracting information regarding protein interactions. Using the BioCreative III Article Classification Task dataset, we achieved an area under the precision-recall curve of 0.715 and a Matthews correlation coefficient of 0.600, which represents an improvement over previous works.
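
    As a reminder of what the Matthews correlation coefficient measures, the sketch below computes it from the four cells of a binary confusion matrix using the standard formula. The counts in the example are hypothetical and are unrelated to the BioCreative III results quoted above; returning 0.0 when a marginal count is zero is a common convention adopted here as an assumption.

```python
from math import sqrt


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    Returns 0.0 when any marginal total is zero (a convention, since the
    ratio is undefined in that case).
    """
    denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator


# Hypothetical counts for a relevant/irrelevant article classifier.
mcc = matthews_corrcoef(tp=40, tn=45, fp=5, fn=10)
```

    The coefficient ranges from -1 (total disagreement) through 0 (no better than chance) to 1 (perfect prediction), which makes it more informative than plain accuracy on imbalanced article collections.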

  13. A graph modification approach for finding core-periphery structures in protein interaction networks.

    Science.gov (United States)

    Bruckner, Sharon; Hüffner, Falk; Komusiewicz, Christian

    2015-01-01

    The core-periphery model for protein interaction (PPI) networks assumes that protein complexes in these networks consist of a dense core and a possibly sparse periphery that is adjacent to vertices in the core of the complex. In this work, we aim at uncovering a global core-periphery structure for a given PPI network. We propose two exact graph-theoretic formulations for this task, which aim to fit the input network to a hypothetical ground truth network by a minimum number of edge modifications. In one model each cluster has its own periphery, and in the other the periphery is shared. We first analyze both models from a theoretical point of view, showing their NP-hardness. Then, we devise efficient exact and heuristic algorithms for both models and finally perform an evaluation on subnetworks of the S. cerevisiae PPI network.

  14. Structure and evolution of protein interaction networks: a statistical model for link dynamics and gene duplications

    Directory of Open Access Journals (Sweden)

    Wagner Andreas

    2004-11-01

    Full Text Available Abstract Background The structure of molecular networks derives from dynamical processes on evolutionary time scales. For protein interaction networks, global statistical features of their structure can now be inferred consistently from several large-throughput datasets. Understanding the underlying evolutionary dynamics is crucial for discerning random parts of the network from biologically important properties shaped by natural selection. Results We present a detailed statistical analysis of the protein interactions in Saccharomyces cerevisiae based on several large-throughput datasets. Protein pairs resulting from gene duplications are used as tracers into the evolutionary past of the network. From this analysis, we infer rate estimates for two key evolutionary processes shaping the network: (i) gene duplications and (ii) gain and loss of interactions through mutations in existing proteins, which are referred to as link dynamics. Importantly, the link dynamics is asymmetric, i.e., the evolutionary steps are mutations in just one of the binding partners. The link turnover is shown to be much faster than gene duplications. Both processes are assembled into an empirically grounded, quantitative model for the evolution of protein interaction networks. Conclusions According to this model, the link dynamics is the dominant evolutionary force shaping the statistical structure of the network, while the slower gene duplication dynamics mainly affects its size. Specifically, the model predicts (i) a broad distribution of the connectivities (i.e., the number of binding partners of a protein) and (ii) correlations between the connectivities of interacting proteins, a specific consequence of the asymmetry of the link dynamics. Both features have been observed in the protein interaction network of S. cerevisiae.

  15. Detection of secondary structure elements in proteins by hydrophobic cluster analysis.

    Science.gov (United States)

    Woodcock, S; Mornon, J P; Henrissat, B

    1992-10-01

    Hydrophobic cluster analysis (HCA) is a protein sequence comparison method based on alpha-helical representations of the sequences where the size, shape and orientation of the clusters of hydrophobic residues are primarily compared. The effectiveness of HCA has been suggested to originate from its potential ability to focus on the residues forming the hydrophobic core of globular proteins. We have addressed the robustness of the bidimensional representation used for HCA in its ability to detect the regular secondary structure elements of proteins. Various parameters have been studied such as those governing cluster size and limits, the hydrophobic residues constituting the clusters as well as the potential shift of the cluster positions with respect to the position of the regular secondary structure elements. The following results have been found to support the alpha-helical bidimensional representation used in HCA: (i) there is a positive correlation (clearly above background noise) between the hydrophobic clusters and the regular secondary structure elements in proteins; (ii) the hydrophobic clusters are centred on the regular secondary structure elements; (iii) the pitch of the helical representation which gives the best correspondence is that of an alpha-helix. The correspondence between hydrophobic clusters and regular secondary structure elements suggests a way to implement variable gap penalties during the automatic alignment of protein sequences.

  16. Prioritizing Disease Candidate Proteins in Cardiomyopathy-Specific Protein-Protein Interaction Networks Based on “Guilt by Association” Analysis

    Science.gov (United States)

    He, Weiming; Li, Weiguo; Qu, Xiaoli; Liang, Binhua; Gao, Qianping; Feng, Chenchen; Jia, Xu; Lv, Yana; Zhang, Siya; Li, Xia

    2013-01-01

    The cardiomyopathies are a group of heart muscle diseases which can be inherited (familial). Identifying potential disease-related proteins is important to understand mechanisms of cardiomyopathies. Experimental identification of cardiomyopathies is costly and labour-intensive. In contrast, a bioinformatics approach has a competitive advantage over experimental methods. Based on “guilt by association” analysis, we prioritized candidate proteins involved in human cardiomyopathies. We first built weighted human cardiomyopathy-specific protein-protein interaction networks for three subtypes of cardiomyopathies using the known disease proteins from Online Mendelian Inheritance in Man as seeds. We then developed a method for prioritizing disease candidate proteins that ranks candidate proteins in the network based on “guilt by association” analysis. It was found that most candidate proteins with high scores shared disease-related pathways with disease seed proteins. These top ranked candidate proteins were related to the corresponding disease subtypes, and were potential disease-related proteins. Cross-validation and comparison with other methods indicated that our approach could be used for the identification of potentially novel disease proteins, which may provide insights into cardiomyopathy-related mechanisms in a more comprehensive and integrated way. PMID:23940716
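
    A very simple form of “guilt by association” ranking is sketched below: each non-seed protein is scored by the summed weight of its edges to known disease (seed) proteins. This scoring rule is an assumption chosen for illustration and is not necessarily the scheme used by the authors; all names and weights are hypothetical.

```python
def rank_candidates(weighted_edges, seeds):
    """Rank non-seed proteins by the total weight of their edges to seed proteins."""
    scores = {}
    for (a, b), weight in weighted_edges.items():
        if a in seeds and b not in seeds:
            scores[b] = scores.get(b, 0.0) + weight
        elif b in seeds and a not in seeds:
            scores[a] = scores.get(a, 0.0) + weight
    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical weighted network with two seed proteins, S1 and S2.
weighted_edges = {
    ("S1", "X"): 0.8,
    ("S2", "X"): 0.5,
    ("S1", "Y"): 0.6,
    ("Y", "Z"): 0.9,
    ("S1", "S2"): 1.0,
}
ranking = rank_candidates(weighted_edges, seeds={"S1", "S2"})
```

    Here X ranks first because it is linked to both seeds, Y second, and Z is not scored at all because it has no direct edge to a seed.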

  17. Analysis of hepatocellular carcinoma and metastatic hepatic carcinoma via functional modules in a protein-protein interaction network

    Directory of Open Access Journals (Sweden)

    Jun Pan

    2014-01-01

    Full Text Available Introduction: This study aims to identify protein clusters with potential functional relevance in the pathogenesis of hepatocellular carcinoma (HCC) and metastatic hepatic carcinoma using network analysis. Materials and Methods: We used human protein interaction data to build a protein-protein interaction network with Cytoscape and then derived functional clusters using MCODE. Combining the gene expression profiles, we calculated the functional scores for the clusters and selected statistically significant clusters. Meanwhile, Gene Ontology was used to assess the functionality of these clusters. Finally, a support vector machine was trained on the gold standard data sets. Results: The differentially expressed genes (DEGs) of HCC were mainly involved in metabolic and signaling processes. We acquired 13 significant modules from the gene expression profiles. The area under the curve value based on the differentially expressed modules was 98.31%, which outperformed the classification with DEGs. Conclusions: Differentially expressed modules are valuable to screen biomarkers combined with functional modules.

  18. Scale-space measures for graph topology link protein network architecture to function

    NARCIS (Netherlands)

    Hulsman, M.; Dimitrakopoulos, C.; De Ridder, J.

    2014-01-01

    MOTIVATION: The network architecture of physical protein interactions is an important determinant for the molecular functions that are carried out within each cell. To study this relation, the network architecture can be characterized by graph topological characteristics such as shortest paths and

  19. Scale-space measures for graph topology link protein network architecture to function.

    Science.gov (United States)

    Hulsman, Marc; Dimitrakopoulos, Christos; de Ridder, Jeroen

    2014-06-15

    The network architecture of physical protein interactions is an important determinant for the molecular functions that are carried out within each cell. To study this relation, the network architecture can be characterized by graph topological characteristics such as shortest paths and network hubs. These characteristics have an important shortcoming: they do not take into account that interactions occur across different scales. This is important because some cellular functions may involve a single direct protein interaction (small scale), whereas others require more and/or indirect interactions, such as protein complexes (medium scale) and interactions between large modules of proteins (large scale). In this work, we derive generalized scale-aware versions of known graph topological measures based on diffusion kernels. We apply these to characterize the topology of networks across all scales simultaneously, generating a so-called graph topological scale-space. The comprehensive physical interaction network in yeast is used to show that scale-space based measures consistently give superior performance when distinguishing protein functional categories and three major types of functional interactions: genetic interaction, co-expression and perturbation interactions. Moreover, we demonstrate that graph topological scale spaces capture biologically meaningful features that provide new insights into the link between function and protein network architecture. Matlab(TM) code to calculate the scale-aware topological measures (STMs) is available at http://bioinformatics.tudelft.nl/TSSA © The Author 2014. Published by Oxford University Press.

  20. Efficient identification of critical residues based only on protein structure by network analysis.

    Directory of Open Access Journals (Sweden)

    Michael P Cusack

    2007-05-01

    Despite the increasing number of published protein structures, and the fact that each protein's function relies on its three-dimensional structure, there is limited access to automatic programs used for the identification of critical residues from the protein structure, compared with those based on protein sequence. Here we present a new algorithm based on network analysis applied exclusively to protein structures to identify critical residues. Our results show that this method identifies critical residues for protein function with high reliability and improves on automatic sequence-based approaches and previous network-based approaches. The reliability of the method depends on the conformational diversity screened for the protein of interest. We have designed a web site to give access to this software at http://bis.ifc.unam.mx/jamming/. In summary, a new method is presented that relates critical residues for protein function with the most traversed residues in networks derived from protein structures. A unique feature of the method is the inclusion of the conformational diversity of proteins in the prediction, thus reproducing a basic feature of the structure/function relationship of proteins.
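
    The notion of "most traversed residues" can be approximated, as a rough sketch, by shortest-path betweenness centrality on a residue contact graph. Betweenness is used here as a stand-in; the abstract does not specify how traversal is counted, so this may differ from the published algorithm. The contact graph and residue labels are hypothetical.

```python
from collections import deque


def betweenness(adj):
    """Brandes' algorithm for an unweighted, undirected graph.

    adj maps each node to an iterable of its neighbours. Returns, for each
    node, the number of shortest paths between other node pairs that pass
    through it (fractional when several shortest paths exist).
    """
    bc = {v: 0.0 for v in adj}
    for s in adj:
        order = []                       # nodes in order of discovery
        preds = {v: [] for v in adj}     # predecessors on shortest paths
        sigma = {v: 0 for v in adj}      # number of shortest paths from s
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):        # accumulate dependencies
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each unordered pair was counted once from each endpoint.
    return {v: b / 2.0 for v, b in bc.items()}


# Hypothetical five-residue chain of contacts.
contacts = {"R1": ["R2"], "R2": ["R1", "R3"], "R3": ["R2", "R4"],
            "R4": ["R3", "R5"], "R5": ["R4"]}
scores = betweenness(contacts)
most_traversed = max(scores, key=scores.get)
```

    To reflect conformational diversity, one option would be to average such scores over contact graphs built from several conformations of the same protein; that averaging step is an assumption, not a description of the web server.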

  1. Protein Network Signatures Associated with Exogenous Biofuels Treatments in Cyanobacterium Synechocystis sp. PCC 6803.

    Science.gov (United States)

    Pei, Guangsheng; Chen, Lei; Wang, Jiangxin; Qiao, Jianjun; Zhang, Weiwen

    2014-01-01

    Although cyanobacteria are recognized as promising microbial cell factories for producing biofuels, current productivity in cyanobacterial systems is low. To make the processes economically feasible, one of the hurdles that needs to be overcome is the low tolerance of hosts to toxic biofuels. Meanwhile, little information is available regarding the cellular responses to biofuels stress in cyanobacteria, which makes tolerance engineering challenging. Using large proteomic datasets of Synechocystis under various biofuels stress and environmental perturbation, a protein co-expression network was first constructed and then combined with the experimentally determined protein-protein interaction network. Proteins with statistically higher topological overlap in the integrated network were identified as common responsive proteins to both biofuels stress and environmental perturbations. In addition, a weighted gene co-expression network analysis was performed to distinguish unique responses to biofuels from those to environmental perturbations and to uncover metabolic modules and proteins uniquely associated with biofuels stress. The results showed that biofuel-specific proteins and modules were enriched in several functional categories, including photosynthesis, carbon fixation, and amino acid metabolism, which may represent potential key signatures for biofuels stress responses in Synechocystis. Network-based analysis allowed determination of the responses specifically related to biofuels stress, and the results constituted an important knowledge foundation for tolerance engineering against biofuels in Synechocystis.
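
    The topological overlap mentioned above can be sketched for the simplest case of an unweighted network. The formula used is the standard one from weighted gene co-expression network analysis, TOM(i, j) = (shared neighbours + a_ij) / (min(k_i, k_j) + 1 - a_ij); whether the study applied exactly this variant to its integrated network is an assumption, and the toy adjacency matrix is hypothetical.

```python
def topological_overlap(adj):
    """Topological overlap matrix for a symmetric 0/1 adjacency matrix.

    adj[i][j] is 1 if proteins i and j are linked, else 0; the diagonal is
    assumed to be 0. Two proteins score high when they are linked and share
    most of their neighbours.
    """
    n = len(adj)
    degree = [sum(row) for row in adj]
    tom = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            shared = sum(adj[i][u] * adj[u][j]
                         for u in range(n) if u != i and u != j)
            a_ij = adj[i][j]
            denom = min(degree[i], degree[j]) + 1 - a_ij
            tom[i][j] = tom[j][i] = (shared + a_ij) / denom
    return tom


# Hypothetical toy network: a triangle (0, 1, 2) with node 3 attached to 2.
toy = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]
tom = topological_overlap(toy)
```

    For a weighted co-expression network the same expression applies with a_ij taken as a weight between 0 and 1 rather than a 0/1 indicator.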

  2. Inference of a Geminivirus-Host Protein-Protein Interaction Network through Affinity Purification and Mass Spectrometry Analysis.

    Science.gov (United States)

    Wang, Liping; Ding, Xue; Xiao, Jiajing; Jiménez-Góngora, Tamara; Liu, Renyi; Lozano-Durán, Rosa

    2017-09-25

    Viruses reshape the intracellular environment of their hosts, largely through protein-protein interactions, to co-opt processes necessary for viral infection and interference with antiviral defences. Due to genome size constraints and the concomitant limited coding capacity of viruses, viral proteins are generally multifunctional and have evolved to target diverse host proteins. Inference of the virus-host interaction network can be instrumental for understanding how viruses manipulate the host machinery and how re-wiring of specific pathways can contribute to disease. Here, we use affinity purification and mass spectrometry analysis (AP-MS) to define the global landscape of interactions between the geminivirus Tomato yellow leaf curl virus (TYLCV) and its host Nicotiana benthamiana. For this purpose, we expressed tagged versions of each of the TYLCV-encoded proteins (C1/Rep, C2/TrAP, C3/REn, C4, V2, and CP) in planta in the presence of the virus. Using a quantitative scoring system, 728 high-confidence plant interactors were identified, and the interaction network of each viral protein was inferred; TYLCV-targeted proteins are more connected than average, and connect with other proteins through shorter paths, which would allow the virus to exert large effects with few interactions. Comparative analyses of divergence patterns between N. benthamiana and potato, a non-host Solanaceae, showed evolutionary constraints on TYLCV-targeted proteins. Our results provide a comprehensive overview of plant proteins targeted by TYLCV during the viral infection, which may contribute to uncovering the underlying molecular mechanisms of plant viral diseases and provide novel potential targets for anti-viral strategies and crop engineering. Interestingly, some of the TYLCV-interacting proteins appear to be convergently targeted by other pathogen effectors, which suggests a central role for these proteins in plant-pathogen interactions, and pinpoints them as potential targets to

  3. Minimum curvilinearity to enhance topological prediction of protein interactions by network embedding

    KAUST Repository

    Cannistraci, Carlo

    2013-06-21

    Motivation: Most functions within the cell emerge thanks to protein-protein interactions (PPIs), yet experimental determination of PPIs is both expensive and time-consuming. PPI networks present significant levels of noise and incompleteness. Predicting interactions using only PPI-network topology (topological prediction) is difficult but essential when prior biological knowledge is absent or unreliable. Methods: Network embedding emphasizes the relations between network proteins embedded in a low-dimensional space, in which protein pairs that are closer to each other represent good candidate interactions. To achieve network denoising, which boosts prediction performance, we first applied minimum curvilinear embedding (MCE), and then adopted shortest path (SP) in the reduced space to assign likelihood scores to candidate interactions. Furthermore, we introduce (i) a new valid variation of MCE, named non-centred MCE (ncMCE); (ii) two automatic strategies for selecting the appropriate embedding dimension; and (iii) two new randomized procedures for evaluating predictions. Results: We compared our method against several unsupervised and supervisedly tuned embedding approaches and node neighbourhood techniques. Despite its computational simplicity, ncMCE-SP was the overall leader, outperforming the current methods in topological link prediction. Conclusion: Minimum curvilinearity is a valuable non-linear framework that we successfully applied to the embedding of protein networks for the unsupervised prediction of novel PPIs. The rationale for our approach is that biological and evolutionary information is imprinted in the non-linear patterns hidden behind the protein network topology, and can be exploited for predicting new protein links. The predicted PPIs represent good candidates for testing in high-throughput experiments or for exploitation in systems biology tools such as those used for network-based inference and prediction of disease-related functional modules.

  4. Scoring protein relationships in functional interaction networks predicted from sequence data.

    Directory of Open Access Journals (Sweden)

    Gaston K Mazandu

    The abundance of diverse biological data from various sources constitutes a rich source of knowledge, which has the power to advance our understanding of organisms. This requires computational methods in order to integrate and exploit these data effectively and elucidate local and genome-wide functional connections between protein pairs, thus enabling functional inferences for uncharacterized proteins. These biological data are primarily in the form of sequences, which determine functions, although functional properties of a protein can often be predicted from just the domains it contains. Thus, protein sequences and domains can be used to predict protein pair-wise functional relationships, and thus contribute to the function prediction process of uncharacterized proteins in order to ensure that knowledge is gained from sequencing efforts. In this work, we introduce information-theoretic approaches to score protein-protein functional interaction pairs predicted from protein sequence similarity and conserved protein signature matches. The proposed schemes are effective for data-driven scoring of connections between protein pairs. We applied these schemes to the Mycobacterium tuberculosis proteome to produce a homology-based functional network of the organism with high confidence and coverage. We use the network for predicting functions of uncharacterised proteins. Availability: Protein pair-wise functional relationship scores for Mycobacterium tuberculosis strain CDC1551 sequence data and Python scripts to compute these scores are available at http://web.cbio.uct.ac.za/~gmazandu/scoringschemes.

  5. The human-bacterial pathogen protein interaction networks of Bacillus anthracis, Francisella tularensis, and Yersinia pestis.

    Directory of Open Access Journals (Sweden)

    Matthew D Dyer

    2010-08-01

    Bacillus anthracis, Francisella tularensis, and Yersinia pestis are bacterial pathogens that can cause anthrax, lethal acute pneumonic disease, and bubonic plague, respectively, and are listed as NIAID Category A priority pathogens for possible use as biological weapons. However, the interactions between human proteins and proteins in these bacteria remain poorly characterized, leading to an incomplete understanding of their pathogenesis and mechanisms of immune evasion. In this study, we used a high-throughput yeast two-hybrid assay to identify physical interactions between human proteins and proteins from each of these three pathogens. From more than 250,000 screens performed, we identified 3,073 human-B. anthracis, 1,383 human-F. tularensis, and 4,059 human-Y. pestis protein-protein interactions including interactions involving 304 B. anthracis, 52 F. tularensis, and 330 Y. pestis proteins that are uncharacterized. Computational analysis revealed that pathogen proteins preferentially interact with human proteins that are hubs and bottlenecks in the human PPI network. In addition, we computed modules of human-pathogen PPIs that are conserved amongst the three networks. Functionally, such conserved modules reveal commonalities between how the different pathogens interact with crucial host pathways involved in inflammation and immunity. These data constitute the first extensive protein interaction networks constructed for bacterial pathogens and their human hosts. This study provides novel insights into host-pathogen interactions.

  6. The function of communities in protein interaction networks at multiple scales

    Directory of Open Access Journals (Sweden)

    Jones Nick S

    2010-07-01

    Background: If biology is modular then clusters, or communities, of proteins derived using only protein interaction network structure should define protein modules with similar biological roles. We investigate the link between biological modules and network communities in yeast and its relationship to the scale at which we probe the network. Results: Our results demonstrate that the functional homogeneity of communities depends on the scale selected, and that almost all proteins lie in a functionally homogeneous community at some scale. We judge functional homogeneity using a novel test and three independent characterizations of protein function, and find a high degree of overlap between these measures. We show that a high mean clustering coefficient of a community can be used to identify those that are functionally homogeneous. By tracing the community membership of a protein through multiple scales we demonstrate how our approach could be useful to biologists focusing on a particular protein. Conclusions: We show that there is no one scale of interest in the community structure of the yeast protein interaction network, but we can identify the range of resolution parameters that yield the most functionally coherent communities, and predict which communities are most likely to be functionally homogeneous.
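
    The mean clustering coefficient of a community can be computed as in the following minimal sketch. It assumes the coefficient is evaluated on the subgraph induced by the community's members; the abstract does not say whether the authors restrict neighbours in this way, and the toy network and names are hypothetical.

```python
def local_clustering(adj, v):
    """Fraction of pairs of v's neighbours that are themselves linked."""
    nbrs = sorted(adj[v])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for i, a in enumerate(nbrs)
                for b in nbrs[i + 1:] if b in adj[a])
    return 2.0 * links / (k * (k - 1))


def mean_clustering(adj, community):
    """Mean local clustering coefficient within the induced subgraph."""
    members = set(community)
    sub = {v: adj[v] & members for v in members}
    return sum(local_clustering(sub, v) for v in members) / len(members)


# Hypothetical toy network: a triangle a-b-c with a tail c-d-e.
network = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c", "e"},
    "e": {"d"},
}
tight = mean_clustering(network, ["a", "b", "c"])
loose = mean_clustering(network, ["c", "d", "e"])
```

    Under the abstract's finding, a community such as the triangle, whose mean coefficient is high, would be the better candidate for functional homogeneity.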

  7. Identifying protein complex by integrating characteristic of core-attachment into dynamic PPI network.

    Directory of Open Access Journals (Sweden)

    Xianjun Shen

    Identifying protein complexes is an important and challenging task in proteomics, and it would contribute greatly to our knowledge of the molecular mechanisms of cellular activity. However, the inherent organization and dynamic characteristics of the cell system have rarely been incorporated into existing algorithms for detecting protein complexes because of the limitations of protein-protein interaction (PPI) data produced by high-throughput techniques. The availability of time-course gene expression profiles enables us to uncover the dynamics of molecular networks and improve the detection of protein complexes. To achieve this goal, this paper proposes a novel algorithm, DCA (Dynamic Core-Attachment). It detects protein-complex cores comprising continually expressed and highly connected proteins in a dynamic PPI network, and then forms the protein complex by including the attachments with high adhesion to the core. The integration of the core-attachment feature into the dynamic PPI network is responsible for the superiority of our algorithm. DCA has been applied to two different yeast dynamic PPI networks and the experimental results show that it performs significantly better than the state-of-the-art techniques in terms of prediction accuracy, hF-measure and statistical significance in biology. In addition, the identified complexes with strong biological significance provide potential candidate complexes for biologists to validate.

  8. Identification of phosphorylation sites in protein kinase A substrates using artificial neural networks and mass spectrometry

    DEFF Research Database (Denmark)

    Hjerrild, Majbrit; Stensballe, Allan; Rasmussen, Thomas E

    2011-01-01

    Protein phosphorylation plays a key role in cell regulation and identification of phosphorylation sites is important for understanding their functional significance. Here, we present an artificial neural network algorithm: NetPhosK (http://www.cbs.dtu.dk/services/NetPhosK/) that predicts protein...

  9. A Web server for predicting proteins involved in pluripotent network

    Indian Academy of Sciences (India)

    2016-11-04

    Nov 4, 2016 ... Furthermore, PluriPred gives the confidence of the prediction from training dataset's. SVM score distribution ... Machine learning techniques; Pluripotency; primary protein sequence; self-renewal; sequence alignment technique. Supplementary ... sets by discarding the proteins used in 5-fold cross validation.

  10. Topological and functional properties of the small GTPases protein interaction network.

    Directory of Open Access Journals (Sweden)

    Anna Delprato

    Small GTP binding proteins of the Ras superfamily (Ras, Rho, Rab, Arf, and Ran) regulate key cellular processes such as signal transduction, cell proliferation, cell motility, and vesicle transport. A great deal of experimental evidence supports the existence of signaling cascades and feedback loops within and among the small GTPase subfamilies, suggesting that these proteins function in a coordinated and cooperative manner. The interplay occurs largely through association with bi-partite regulatory and effector proteins but can also occur through the active form of the small GTPases themselves. In order to understand the connectivity of the small GTPases signaling routes, a systems-level approach that analyzes data describing direct and indirect interactions was used to construct the small GTPases protein interaction network. The data were curated from the Search Tool for the Retrieval of Interacting Genes (STRING) database and include only experimentally validated interactions. The network method enables the conceptualization of the overall structure as well as the underlying organization of the protein-protein interactions. The interaction network described here is comprised of 778 nodes and 1943 edges and has a scale-free topology. Rac1, Cdc42, RhoA, and HRas are identified as the hubs. Ten sub-network motifs are also identified in this study with themes in apoptosis, cell growth/proliferation, vesicle traffic, cell adhesion/junction dynamics, the nicotinamide adenine dinucleotide phosphate (NADPH) oxidase response, transcription regulation, receptor-mediated endocytosis, gene silencing, and growth factor signaling. Bottleneck proteins that bridge signaling paths and proteins that overlap in multiple small GTPase networks are described along with the functional annotation of all proteins in the network.

  11. Integration and visualization of non-coding RNA and protein interaction networks

    DEFF Research Database (Denmark)

    Junge, Alexander; Refsgaard, Jan Christian; Garde, Christian

    Association and Interaction Networks) - a database that combines ncRNA-ncRNA, ncRNA-mRNA and ncRNA-protein interactions with large-scale protein association networks available in the STRING database. By integrating ncRNA and protein networks, RAIN provides a more complete picture of the cell’s complex...... interaction network. RAIN aggregates associations and (predicted) interactions of a vast collection of ncRNA classes, including microRNAs and long ncRNAs, collected from a wide range of resources: a) curated knowledge, b) experimentally supported interactions, c) predicted microRNA-target interactions, and d......) co-occurrences found by text mining Medline abstracts. Each resource was assigned a reliability score by assessing its agreement with a gold standard set of microRNA-target interactions. RAIN is available at: http://rth.dk/resources/rain...

  12. Identification of phosphorylation sites in protein kinase A substrates using artificial neural networks and mass spectrometry

    DEFF Research Database (Denmark)

    Hjerrild, M.; Stensballe, A.; Rasmussen, T.E.

    2004-01-01

    Protein phosphorylation plays a key role in cell regulation and identification of phosphorylation sites is important for understanding their functional significance. Here, we present an artificial neural network algorithm: NetPhosK (http://www.cbs.dtu.dk/services/NetPhosK/) that predicts protein...... kinase A (PKA) phosphorylation sites. The neural network was trained with a positive set of 258 experimentally verified PKA phosphorylation sites. The predictions by NetPhosK were validated using four novel PKA substrates: Necdin, RFX5, En-2, and Wee 1. The four proteins were phosphorylated by PKA...

  13. Protein Network Signatures Associated with Exogenous Biofuels Treatments in Cyanobacterium Synechocystis sp. PCC 6803

    International Nuclear Information System (INIS)

    Pei, Guangsheng; Chen, Lei; Wang, Jiangxin; Qiao, Jianjun; Zhang, Weiwen

    2014-01-01

    Although cyanobacteria are recognized as promising microbial cell factories for producing biofuels, current productivity in cyanobacterial systems is low. To make the processes economically feasible, one of the hurdles that needs to be overcome is the low tolerance of hosts to toxic biofuels. Meanwhile, little information is available regarding the cellular responses to biofuels stress in cyanobacteria, which makes tolerance engineering challenging. Using large proteomic datasets of Synechocystis under various biofuels stress and environmental perturbation, a protein co-expression network was first constructed and then combined with the experimentally determined protein–protein interaction network. Proteins with statistically higher topological overlap in the integrated network were identified as common responsive proteins to both biofuels stress and environmental perturbations. In addition, a weighted gene co-expression network analysis was performed to distinguish unique responses to biofuels from those to environmental perturbations and to uncover metabolic modules and proteins uniquely associated with biofuels stress. The results showed that biofuel-specific proteins and modules were enriched in several functional categories, including photosynthesis, carbon fixation, and amino acid metabolism, which may represent potential key signatures for biofuels stress responses in Synechocystis. Network-based analysis allowed determination of the responses specifically related to biofuels stress, and the results constituted an important knowledge foundation for tolerance engineering against biofuels in Synechocystis.

  14. Domain distribution and intrinsic disorder in hubs in the human protein–protein interaction network

    OpenAIRE

    Patil, Ashwini; Kinoshita, Kengo; Nakamura, Haruki

    2010-01-01

    Intrinsic disorder and distributed surface charge have been previously identified as some of the characteristics that differentiate hubs (proteins with a large number of interactions) from non-hubs in protein–protein interaction networks. In this study, we investigated the differences in the quantity, diversity, and functional nature of Pfam domains, and their relationship with intrinsic disorder, in hubs and non-hubs. We found that proteins with a more diverse domain composition were over-re...

  15. Neuroplasticity pathways and protein-interaction networks are modulated by vortioxetine in rodents

    DEFF Research Database (Denmark)

    Waller, Jessica A.; Nygaard, Sara Holm; Li, Yan

    2017-01-01

    and rat in response to distinct treatment regimens and in different brain regions. Furthermore, analysis of complexes of physically-interacting proteins reveal that biomarkers involved in transcriptional regulation, neurodevelopment, neuroplasticity, and endocytosis are modulated by vortioxetine....... A subsequent qPCR study examining the expression of targets in the protein-protein interactome space in response to chronic vortioxetine treatment over a range of doses provides further biological validation that vortioxetine engages neuroplasticity networks. Thus, the same biology is regulated in different...

  16. Protein and signaling networks in vertebrate photoreceptor cells

    Directory of Open Access Journals (Sweden)

    Karl-Wilhelm Koch

    2015-11-01

    Vertebrate photoreceptor cells are exquisite light detectors operating under very dim and bright illumination. The photoexcitation and adaptation machinery in photoreceptor cells consists of protein complexes that can form highly ordered supramolecular structures and control the homeostasis and mutual dependence of the secondary messengers cGMP and Ca2+. The visual pigment in rod photoreceptors, the G protein-coupled receptor rhodopsin, is organized in tracks of dimers, thereby providing a signaling platform for the dynamic scaffolding of the G protein transducin. Illuminated rhodopsin is turned off by phosphorylation catalyzed by rhodopsin kinase GRK1 under control of Ca2+-recoverin. The GRK1 protein complex partly assembles in lipid raft structures, where shutting off rhodopsin seems to be more effective. Re-synthesis of cGMP is another crucial step in the recovery of the photoresponse after illumination. It is catalyzed by membrane bound sensory guanylate cyclases and is regulated by specific neuronal Ca2+-sensor proteins called GCAPs. At least one guanylate cyclase (ROS-GC1) was shown to be part of a multiprotein complex having strong interactions with the cytoskeleton and being controlled in a multimodal Ca2+-dependent fashion. The final target of the cGMP signaling cascade is a cyclic nucleotide-gated channel that is a hetero-oligomeric protein located in the plasma membrane and interacting with accessory proteins in highly organized microdomains. We summarize results and interpretations of findings related to the inhomogeneous organization of signaling units in photoreceptor outer segments.

  17. Why do hubs in the yeast protein interaction network tend to be essential: reexamining the connection between the network topology and essentiality.

    Directory of Open Access Journals (Sweden)

    Elena Zotenko

    2008-08-01

    The centrality-lethality rule, which notes that high-degree nodes in a protein interaction network tend to correspond to proteins that are essential, suggests that the topological prominence of a protein in a protein interaction network may be a good predictor of its biological importance. Even though the correlation between degree and essentiality was confirmed by many independent studies, the reason for this correlation remains elusive. Several hypotheses about putative connections between essentiality of hubs and the topology of protein-protein interaction networks have been proposed, but as we demonstrate, these explanations are not supported by the properties of protein interaction networks. To identify the main topological determinant of essentiality and to provide a biological explanation for the connection between the network topology and essentiality, we performed a rigorous analysis of six variants of the genomewide protein interaction network for Saccharomyces cerevisiae obtained using different techniques. We demonstrated that the majority of hubs are essential due to their involvement in Essential Complex Biological Modules, a group of densely connected proteins with shared biological function that are enriched in essential proteins. Moreover, we rejected two previously proposed explanations for the centrality-lethality rule, one relating the essentiality of hubs to their role in the overall network connectivity and another relying on the recently published essential protein interactions model.
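
    The centrality-lethality rule itself is easy to check on a given network, as in this minimal sketch. It assumes hubs are defined as the top fraction of proteins ranked by degree, which is a common but arbitrary convention; the cut-off, the function names and the toy data are all hypothetical and not taken from the study.

```python
def degrees(edges):
    """Number of interaction partners for every protein in an edge list."""
    deg = {}
    for u, v in edges:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    return deg


def essential_fraction(group, essential):
    """Share of proteins in group that appear in the essential set."""
    if not group:
        return 0.0
    return sum(1 for p in group if p in essential) / len(group)


def hubs_vs_non_hubs(edges, essential, hub_fraction=0.2):
    """Compare essentiality of the highest-degree proteins with the rest."""
    deg = degrees(edges)
    # Rank by decreasing degree; ties are broken by name for determinism.
    ranked = sorted(deg, key=lambda p: (-deg[p], p))
    n_hubs = max(1, round(hub_fraction * len(ranked)))
    hubs, others = ranked[:n_hubs], ranked[n_hubs:]
    return essential_fraction(hubs, essential), essential_fraction(others, essential)


# Hypothetical toy interactome and essentiality annotation.
toy_edges = [("H", "A"), ("H", "B"), ("H", "C"), ("H", "D"), ("A", "B")]
toy_essential = {"H", "A"}
hub_rate, non_hub_rate = hubs_vs_non_hubs(toy_edges, toy_essential)
```

    A higher essential fraction among hubs than among non-hubs is the pattern the rule describes; explaining that pattern, as the study does through densely connected essential modules, requires more than a degree count.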

  18. Identification of putative drug targets for human sperm-egg interaction defect using protein network approach.

    Science.gov (United States)

    Sabetian, Soudabeh; Shamsir, Mohd Shahir

    2015-07-18

    Sperm-egg interaction defect is a significant cause of in-vitro fertilization failure for infertile cases. Numerous molecular interactions in the form of protein-protein interactions mediate the sperm-egg membrane interaction process. Recent studies have demonstrated that in addition to experimental techniques, computational methods, namely the protein interaction network approach, can address protein-protein interactions between human sperm and egg. Up to now, no drugs have been detected to treat sperm-egg interaction disorder, and the initial step in drug discovery research is finding out essential proteins or drug targets for a biological process. The main purpose of this study is to identify putative drug targets for human sperm-egg interaction deficiency and consider if the detected essential proteins are targets for any known drugs using protein-protein interaction network and ingenuity pathway analysis. We have created human sperm-egg protein interaction networks with high confidence, including 106 nodes and 415 interactions. Through topological analysis of the network with calculation of some metrics, such as connectivity and betweenness centrality, we have identified 13 essential proteins as putative drug targets. The potential drug targets are from the integrin, fibronectin, epidermal growth factor receptor, collagen and tetraspanin protein families. We evaluated these targets by ingenuity pathway analysis, and the known drugs for the targets have been detected, and the possible effective role of the drugs on sperm-egg interaction defect has been considered. These results showed that the drugs ocriplasmin (Jetrea©), gefitinib (Iressa©), erlotinib hydrochloride (Tarceva©), cilengitide, cetuximab (Erbitux©) and panitumumab (Vectibix©) are possible candidates for efficacy testing for the treatment of sperm-egg interaction deficiency. Further experimental validation can be carried out to confirm these results. We have identified the first potential list of

  19. Prediction of Protein Thermostability by an Efficient Neural Network Approach

    Directory of Open Access Journals (Sweden)

    Jalal Rezaeenour

    2016-10-01

    Introduction: Manipulation of protein stability is important for understanding the principles that govern protein thermostability, both in basic research and industrial applications. Various data mining techniques exist for prediction of thermostable proteins. Furthermore, ANN methods have attracted significant attention for prediction of thermostability, because they constitute an appropriate approach to mapping the non-linear input-output relationships and massive parallel computing. Method: An Extreme Learning Machine (ELM) was applied to estimate the thermal behavior of 1289 proteins. In the proposed algorithm, the parameters of ELM were optimized using a Genetic Algorithm (GA), which tuned a set of input variables, hidden layer biases, and input weights to enhance the prediction performance. The method was executed on a set of amino acids, yielding a total of 613 protein features. A number of feature selection algorithms were used to build subsets of the features. A total of 1289 protein samples and 613 protein features were calculated from the UniProt database to understand features contributing to the enzymes’ thermostability and find out the main features that influence this valuable characteristic. Results: At the primary structure level, Gln, Glu and polar were the features that mostly contributed to protein thermostability. At the secondary structure level, Helix_S, Coil, and charged_Coil were the most important features affecting protein thermostability. These results suggest that the thermostability of proteins is mainly associated with primary structural features of the protein. According to the results, the influence of primary structure on the thermostability of a protein was more important than that of the secondary structure. It is shown that the prediction accuracy of ELM (mean square error) can improve dramatically using GA, with error rates RMSE=0.004 and MAPE=0.1003. Conclusion: The proposed approach for forecasting problem

  20. Mining protein interactomes to improve their reliability and support the advancement of network medicine

    KAUST Repository

    Alanis Lobato, Gregorio

    2015-09-23

    High-throughput detection of protein interactions has had a major impact in our understanding of the intricate molecular machinery underlying the living cell, and has permitted the construction of very large protein interactomes. The protein networks that are currently available are incomplete and a significant percentage of their interactions are false positives. Fortunately, the structural properties observed in good quality social or technological networks are also present in biological systems. This has encouraged the development of tools, to improve the reliability of protein networks and predict new interactions based merely on the topological characteristics of their components. Since diseases are rarely caused by the malfunction of a single protein, having a more complete and reliable interactome is crucial in order to identify groups of inter-related proteins involved in disease etiology. These system components can then be targeted with minimal collateral damage. In this article, an important number of network mining tools is reviewed, together with resources from which reliable protein interactomes can be constructed. In addition to the review, a few representative examples of how molecular and clinical data can be integrated to deepen our understanding of pathogenesis are discussed.

  1. Exploring overlapping functional units with various structure in protein interaction networks.

    Directory of Open Access Journals (Sweden)

    Xiao-Fei Zhang

    Full Text Available Revealing functional units in protein-protein interaction (PPI) networks is important for understanding cellular functional organization. Current algorithms for identifying functional units mainly focus on cohesive protein complexes which have more internal interactions than external interactions. Most of these approaches do not handle overlaps among complexes since they usually allow a protein to belong to only one complex. Moreover, recent studies have shown that other non-cohesive structural functional units beyond complexes also exist in PPI networks. Thus previous algorithms that just focus on non-overlapping cohesive complexes are not able to present the biological reality fully. Here, we develop a new regularized sparse random graph model (RSRGM) to explore overlapping and various structural functional units in PPI networks. RSRGM is principally dominated by two model parameters. One is used to define the functional units as groups of proteins that have similar patterns of connections to others, which allows RSRGM to detect non-cohesive structural functional units. The other one is used to represent the degree of proteins belonging to the units, which supports a protein belonging to more than one revealed unit. We also propose a regularizer to control the smoothness between the estimators of these two parameters. Experimental results on four S. cerevisiae PPI networks show that the performance of RSRGM on detecting cohesive complexes and overlapping complexes is superior to that of previous competing algorithms. Moreover, RSRGM has the ability to discover biologically significant functional units besides complexes.

  2. Control of Cellular Structural Networks Through Unstructured Protein Domains

    Science.gov (United States)

    2016-07-01

    Final Report: WHITEPAPER; Research Area 8; Control of cellular structural networks... (Dates on the report form: 01-07-2016; 1-Oct-2009; 30-Sep-2015. Distribution Unlimited. Projects Office, 2150 Shattuck Avenue, Suite 300, Berkeley, CA 94704-5940.)

  3. Toward a rigorous network of protein-protein interactions of the model sulfate reducer Desulfovibrio vulgaris Hildenborough

    Energy Technology Data Exchange (ETDEWEB)

    Chhabra, S.R.; Joachimiak, M.P.; Petzold, C.J.; Zane, G.M.; Price, M.N.; Gaucher, S.; Reveco, S.A.; Fok, V.; Johanson, A.R.; Batth, T.S.; Singer, M.; Chandonia, J.M.; Joyner, D.; Hazen, T.C.; Arkin, A.P.; Wall, J.D.; Singh, A.K.; Keasling, J.D.

    2011-05-01

    Protein–protein interactions offer an insight into cellular processes beyond what may be obtained by the quantitative functional genomics tools of proteomics and transcriptomics. The aforementioned tools have been extensively applied to study E. coli and other aerobes and more recently to study the stress response behavior of Desulfovibrio vulgaris Hildenborough, a model anaerobe and sulfate reducer. In this paper we present the first attempt to identify protein-protein interactions in an obligate anaerobic bacterium. We used suicide vector-assisted chromosomal modification of 12 open reading frames encoded by this sulfate reducer to append an eight amino acid affinity tag to the carboxy-terminus of the chosen proteins. Three biological replicates of the 10 ‘pulled-down’ proteins were separated and analyzed using liquid chromatography-mass spectrometry. Replicate agreement ranged between 35% and 69%. An interaction network among 12 bait and 90 prey proteins was reconstructed based on 134 bait-prey interactions computationally identified to be of high confidence. We discuss the biological significance of several unique metabolic features of D. vulgaris revealed by this protein-protein interaction data and protein modifications that were observed. These include the distinct role of the putative carbon monoxide-induced hydrogenase, unique electron transfer routes associated with different oxidoreductases, and the possible role of methylation in regulating sulfate reduction.

  4. A combinatorial approach to detect coevolved amino acid networks in protein families of variable divergence.

    Directory of Open Access Journals (Sweden)

    Julie Baussand

    2009-09-01

    Full Text Available Communication between distant sites often defines the biological role of a protein: amino acid long-range interactions are as important in binding specificity, allosteric regulation and conformational change as residues directly contacting the substrate. The maintaining of functional and structural coupling of long-range interacting residues requires coevolution of these residues. Networks of interaction between coevolved residues can be reconstructed, and from the networks, one can possibly derive insights into functional mechanisms for the protein family. We propose a combinatorial method for mapping conserved networks of amino acid interactions in a protein which is based on the analysis of a set of aligned sequences, the associated distance tree and the combinatorics of its subtrees. The degree of coevolution of all pairs of coevolved residues is identified numerically, and networks are reconstructed with a dedicated clustering algorithm. The method drops the constraints on high sequence divergence limiting the range of applicability of the statistical approaches previously proposed. We apply the method to four protein families where we show an accurate detection of functional networks and the possibility to treat sets of protein sequences of variable divergence.

  5. Exploring the Ligand-Protein Networks in Traditional Chinese Medicine: Current Databases, Methods, and Applications

    Directory of Open Access Journals (Sweden)

    Mingzhu Zhao

    2013-01-01

    Full Text Available Traditional Chinese medicine (TCM), which has thousands of years of clinical application in China and other Asian countries, is the pioneer of the “multicomponent-multitarget” approach and network pharmacology. Although there is no doubt of its efficacy, it is difficult to elucidate a convincing underlying mechanism of TCM due to its complex composition and unclear pharmacology. The use of ligand-protein networks has been gaining significant value in the history of drug discovery while its application in TCM is still in its early stage. This paper firstly surveys TCM databases for virtual screening that have been greatly expanded in size and data diversity in recent years. On that basis, different screening methods and strategies for identifying active ingredients and targets of TCM are outlined based on the amount of network information available, on the sides of both ligand bioactivity and protein structures. Furthermore, applications of successful in silico target identification attempts are discussed in detail along with experiments in exploring the ligand-protein networks of TCM. Finally, it is concluded that ligand-protein networks can be used not only to predict protein targets of a small molecule, but also to explore the mode of action of TCM.

  6. Discovery of intramolecular signal transduction network based on a new protein dynamics model of energy dissipation.

    Directory of Open Access Journals (Sweden)

    Cheng-Wei Ma

    Full Text Available A novel approach to reveal intramolecular signal transduction network is proposed in this work. To this end, a new algorithm of network construction is developed, which is based on a new protein dynamics model of energy dissipation. A key feature of this approach is that direction information is specified after inferring protein residue-residue interaction network involved in the process of signal transduction. This enables fundamental analysis of the regulation hierarchy and identification of regulation hubs of the signaling network. A well-studied allosteric enzyme, E. coli aspartokinase III, is used as a model system to demonstrate the new method. Comparison with experimental results shows that the new approach is able to predict all the sites that have been experimentally proved to desensitize allosteric regulation of the enzyme. In addition, the signal transduction network shows a clear preference for specific structural regions, secondary structural types and residue conservation. Occurrence of super-hubs in the network indicates that allosteric regulation tends to gather residues with high connection ability to collectively facilitate the signaling process. Furthermore, a new parameter of propagation coefficient is defined to determine the propagation capability of residues within a signal transduction network. In conclusion, the new approach is useful for fundamental understanding of the process of intramolecular signal transduction and thus has significant impact on rational design of novel allosteric proteins.

  7. Molecular Principles of Gene Fusion Mediated Rewiring of Protein Interaction Networks in Cancer.

    Science.gov (United States)

    Latysheva, Natasha S; Oates, Matt E; Maddox, Louis; Flock, Tilman; Gough, Julian; Buljan, Marija; Weatheritt, Robert J; Babu, M Madan

    2016-08-18

    Gene fusions are common cancer-causing mutations, but the molecular principles by which fusion protein products affect interaction networks and cause disease are not well understood. Here, we perform an integrative analysis of the structural, interactomic, and regulatory properties of thousands of putative fusion proteins. We demonstrate that genes that form fusions (i.e., parent genes) tend to be highly connected hub genes, whose protein products are enriched in structured and disordered interaction-mediating features. Fusion often results in the loss of these parental features and the depletion of regulatory sites such as post-translational modifications. Fusion products disproportionately connect proteins that did not previously interact in the protein interaction network. In this manner, fusion products can escape cellular regulation and constitutively rewire protein interaction networks. We suggest that the deregulation of central, interaction-prone proteins may represent a widespread mechanism by which fusion proteins alter the topology of cellular signaling pathways and promote cancer. Copyright © 2016 The Authors. Published by Elsevier Inc. All rights reserved.

  8. Dynamic changes in protein functional linkage networks revealed by integration with gene expression data.

    Directory of Open Access Journals (Sweden)

    Shubhada R Hegde

    2008-11-01

    Full Text Available Response of cells to changing environmental conditions is governed by the dynamics of intricate biomolecular interactions. It may be reasonable to assume, proteins being the dominant macromolecules that carry out routine cellular functions, that understanding the dynamics of protein:protein interactions might yield useful insights into the cellular responses. The large-scale protein interaction data sets are, however, unable to capture the changes in the profile of protein:protein interactions. In order to understand how these interactions change dynamically, we have constructed conditional protein linkages for Escherichia coli by integrating functional linkages and gene expression information. As a case study, we have chosen to analyze UV exposure in wild-type and SOS deficient E. coli at 20 minutes post irradiation. The conditional networks exhibit similar topological properties. Although the global topological properties of the networks are similar, many subtle local changes are observed, which are suggestive of the cellular response to the perturbations. Some such changes correspond to differences in the path lengths among the nodes of carbohydrate metabolism correlating with its loss in efficiency in the UV treated cells. Similarly, expression of hubs under unique conditions reflects the importance of these genes. Various centrality measures applied to the networks indicate increased importance for replication, repair, and other stress proteins for the cells under UV treatment, as anticipated. We thus propose a novel approach for studying an organism at the systems level by integrating genome-wide functional linkages and the gene expression data.

  9. Bio::Homology::InterologWalk--a Perl module to build putative protein-protein interaction networks through interolog mapping.

    Science.gov (United States)

    Gallone, Giuseppe; Simpson, T Ian; Armstrong, J Douglas; Jarman, Andrew P

    2011-07-18

    Protein-protein interaction (PPI) data are widely used to generate network models that aim to describe the relationships between proteins in biological systems. The fidelity and completeness of such networks is primarily limited by the paucity of protein interaction information and by the restriction of most of these data to just a few widely studied experimental organisms. In order to extend the utility of existing PPIs, computational methods can be used that exploit functional conservation between orthologous proteins across taxa to predict putative PPIs or 'interologs'. To date most interolog prediction efforts have been restricted to specific biological domains with fixed underlying data sources and there are no software tools available that provide a generalised framework for 'on-the-fly' interolog prediction. We introduce Bio::Homology::InterologWalk, a Perl module to retrieve, prioritise and visualise putative protein-protein interactions through an orthology-walk method. The module uses orthology and experimental interaction data to generate putative PPIs and optionally collates meta-data into an Interaction Prioritisation Index that can be used to help prioritise interologs for further analysis. We show the application of our interolog prediction method to the genomic interactome of the fruit fly, Drosophila melanogaster. We analyse the resulting interaction networks and show that the method proposes new interactome members and interactions that are candidates for future experimental investigation. Our interolog prediction tool employs the Ensembl Perl API and PSICQUIC enabled protein interaction data sources to generate up to date interologs 'on-the-fly'. This represents a significant advance on previous methods for interolog prediction as it allows the use of the latest orthology and protein interaction data for all of the genomes in Ensembl. The module outputs simple text files, making it easy to customise the results by post-processing, allowing the
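
    The interolog idea described above, transferring a known interaction to another species when both partners have orthologs there, can be sketched in a few lines. This is an illustration only and is not the API of Bio::Homology::InterologWalk; the function name, the data layout and the protein identifiers in the usage note are all hypothetical.

```python
def predict_interologs(source_interactions, orthologs):
    """Project interactions from a source species onto a target species.

    source_interactions: iterable of (protein_a, protein_b) pairs observed
        experimentally in the source species.
    orthologs: dict mapping a source-species protein to the set of its
        orthologs in the target species.
    Returns a set of unordered target-species pairs (stored as sorted tuples).
    """
    predicted = set()
    for a, b in source_interactions:
        # A pair is transferred only if both partners have an ortholog.
        for target_a in orthologs.get(a, ()):
            for target_b in orthologs.get(b, ()):
                predicted.add(tuple(sorted((target_a, target_b))))
    return predicted
```

    For example, `predict_interologs([("s1", "s2")], {"s1": {"t1"}, "s2": {"t2", "t2b"}})` proposes two putative pairs, one per ortholog of `s2`. A real pipeline would also score each prediction, which is the role the abstract assigns to its Interaction Prioritisation Index.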

  10. Bio::Homology::InterologWalk - A Perl module to build putative protein-protein interaction networks through interolog mapping

    Directory of Open Access Journals (Sweden)

    Armstrong J Douglas

    2011-07-01

    Full Text Available Abstract Background: Protein-protein interaction (PPI) data are widely used to generate network models that aim to describe the relationships between proteins in biological systems. The fidelity and completeness of such networks is primarily limited by the paucity of protein interaction information and by the restriction of most of these data to just a few widely studied experimental organisms. In order to extend the utility of existing PPIs, computational methods can be used that exploit functional conservation between orthologous proteins across taxa to predict putative PPIs or 'interologs'. To date most interolog prediction efforts have been restricted to specific biological domains with fixed underlying data sources and there are no software tools available that provide a generalised framework for 'on-the-fly' interolog prediction. Results: We introduce Bio::Homology::InterologWalk, a Perl module to retrieve, prioritise and visualise putative protein-protein interactions through an orthology-walk method. The module uses orthology and experimental interaction data to generate putative PPIs and optionally collates meta-data into an Interaction Prioritisation Index that can be used to help prioritise interologs for further analysis. We show the application of our interolog prediction method to the genomic interactome of the fruit fly, Drosophila melanogaster. We analyse the resulting interaction networks and show that the method proposes new interactome members and interactions that are candidates for future experimental investigation. Conclusions: Our interolog prediction tool employs the Ensembl Perl API and PSICQUIC enabled protein interaction data sources to generate up to date interologs 'on-the-fly'. This represents a significant advance on previous methods for interolog prediction as it allows the use of the latest orthology and protein interaction data for all of the genomes in Ensembl. The module outputs simple text files, making it easy

  11. Systematic discovery of new recognition peptides mediating protein interaction networks

    DEFF Research Database (Denmark)

    Neduva, Victor; Linding, Rune; Su-Angrand, Isabelle

    2005-01-01

    Many aspects of cell signalling, trafficking, and targeting are governed by interactions between globular protein domains and short peptide segments. These domains often bind multiple peptides that share a common sequence pattern, or "linear motif" (e.g., SH3 binding to PxxP). Many domains are kn...

  12. Functional protein networks unifying limb girdle muscular dystrophy

    NARCIS (Netherlands)

    Morrée, Antoine de

    2011-01-01

    Limb Girdle Muscular Dystrophy (LGMD) is a rare progressive heterogeneous disorder that can be caused by mutations in at least 21 different genes. These genes are often widely expressed and encode proteins with highly differing functions. And yet mutations in all of them give rise to a similar

  13. Analysis of core–periphery organization in protein contact networks ...

    Indian Academy of Sciences (India)

    Cartoon tube representation of the tertiary structure of Inositol monophosphatase (PDB Id: 1g0h) protein showing binding with a ligand molecule IPD (shown in blue) and two carbon atoms (shown in yellow). Most of the residues interacting with the ligand, indicated in spear format, belong to the innermost core (shown in red) ...

  14. Insights into biological information processing: structural and dynamical analysis of a human protein signalling network

    Energy Technology Data Exchange (ETDEWEB)

    Fuente, Alberto de la; Fotia, Giorgio; Maggio, Fabio; Mancosu, Gianmaria; Pieroni, Enrico [CRS4 Bioinformatica, Parco Tecnologico POLARIS, Ed.1, Loc Piscinamanna, Pula (Italy)], E-mail: alf@crs4.it

    2008-06-06

    We present an investigation on the structural and dynamical properties of a 'human protein signalling network' (HPSN). This biological network is composed of nodes that correspond to proteins and directed edges that represent signal flows. In order to gain insight into the organization of cell information processing this network is analysed taking into account explicitly the edge directions. We explore the topological properties of the HPSN at the global and the local scale, further applying the generating function formalism to provide a suitable comparative model. The relationship between the node degrees and the distribution of signals through the network is characterized using degree correlation profiles. Finally, we analyse the dynamical properties of small sub-graphs showing high correlation between their occurrence and dynamic stability.

  15. When the Web meets the cell: using personalized PageRank for analyzing protein interaction networks.

    Science.gov (United States)

    Iván, Gábor; Grolmusz, Vince

    2011-02-01

    An enormous and constantly increasing quantity of biological information is represented in metabolic and protein interaction network databases. Most of these data are freely accessible through large public depositories. The robust analysis of these resources needs novel technologies, which are being developed today. Here we demonstrate a technique, originating from the PageRank computation for the World Wide Web, for analyzing large interaction networks. The method is fast, scalable and robust, and its capabilities are demonstrated on metabolic network data of the tuberculosis bacterium and the proteomics analysis of the blood of melanoma patients. The Perl script for computing the personalized PageRank in protein networks is available for non-profit research applications (together with sample input files) at the address: http://uratim.com/pp.zip.
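
    For readers unfamiliar with personalized PageRank, the following is a minimal sketch of the standard power-iteration formulation, in which the random walk restarts at a chosen set of seed proteins rather than uniformly. It is not the authors' Perl script. Treating the network as undirected, the damping factor of 0.85 and the fixed iteration count are assumptions made for illustration.

```python
def personalized_pagerank(edges, seeds, damping=0.85, iterations=100):
    """Power iteration for personalized PageRank on an undirected graph.

    edges: iterable of (u, v) protein pairs.
    seeds: dict mapping a protein to a non-negative restart weight.
    Returns a dict mapping each protein to its score; scores sum to 1.
    """
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)

    missing = [s for s in seeds if s not in neighbors]
    if missing:
        raise ValueError(f"seed proteins not in network: {missing}")
    total = sum(seeds.values())
    if total <= 0:
        raise ValueError("seed weights must sum to a positive value")
    # Restart (teleport) distribution concentrated on the seeds.
    teleport = {n: seeds.get(n, 0.0) / total for n in neighbors}

    rank = dict(teleport)
    for _ in range(iterations):
        new_rank = {n: (1.0 - damping) * teleport[n] for n in neighbors}
        for u, nbrs in neighbors.items():
            # Each node passes its score equally to all of its neighbors.
            share = damping * rank[u] / len(nbrs)
            for v in nbrs:
                new_rank[v] += share
        rank = new_rank
    return rank
```

    Because every node in an edge list has at least one neighbor, there are no dangling nodes and the total score is conserved at each step. Proteins close to the seeds receive higher scores, which is what makes the ranking "personalized".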

  16. Transmembrane protein topology prediction using support vector machines

    Directory of Open Access Journals (Sweden)

    Nugent Timothy

    2009-05-01

    Full Text Available Abstract Background: Alpha-helical transmembrane (TM) proteins are involved in a wide range of important biological processes such as cell signaling, transport of membrane-impermeable molecules, cell-cell communication, cell recognition and cell adhesion. Many are also prime drug targets, and it has been estimated that more than half of all drugs currently on the market target membrane proteins. However, due to the experimental difficulties involved in obtaining high quality crystals, this class of protein is severely under-represented in structural databases. In the absence of structural data, sequence-based prediction methods allow TM protein topology to be investigated. Results: We present a support vector machine-based (SVM) TM protein topology predictor that integrates both signal peptide and re-entrant helix prediction, benchmarked with full cross-validation on a novel data set of 131 sequences with known crystal structures. The method achieves topology prediction accuracy of 89%, while signal peptides and re-entrant helices are predicted with 93% and 44% accuracy respectively. An additional SVM trained to discriminate between globular and TM proteins detected zero false positives, with a low false negative rate of 0.4%. We present the results of applying these tools to a number of complete genomes. Source code, data sets and a web server are freely available from http://bioinf.cs.ucl.ac.uk/psipred/. Conclusion: The high accuracy of TM topology prediction, which includes detection of both signal peptides and re-entrant helices, combined with the ability to effectively discriminate between TM and globular proteins, makes this method ideally suited to whole genome annotation of alpha-helical transmembrane proteins.

  17. Dynamic Proteomic Characteristics and Network Integration Revealing Key Proteins for Two Kernel Tissue Developments in Popcorn.

    Directory of Open Access Journals (Sweden)

    Yongbin Dong

    Full Text Available The formation and development of maize kernel is a complex dynamic physiological and biochemical process that involves the temporal and spatial expression of many proteins and the regulation of metabolic pathways. In this study, the protein profiles of the endosperm and pericarp at three important developmental stages were analyzed by isobaric tags for relative and absolute quantification (iTRAQ) labeling coupled with LC-MS/MS in popcorn inbred N04. Comparative quantitative proteomic analyses among developmental stages and between tissues were performed, and the protein networks were integrated. A total of 6,876 proteins were identified, of which 1,396 were nonredundant. Specific proteins and different expression patterns were observed across developmental stages and tissues. The functional annotation of the identified proteins revealed the importance of metabolic and cellular processes, and binding and catalytic activities for the development of the tissues. The whole, endosperm-specific and pericarp-specific protein networks integrated 125, 9 and 77 proteins, respectively, which were involved in 54 KEGG pathways and reflected their complex metabolic interactions. Confirmation for the iTRAQ endosperm proteins by two-dimensional gel electrophoresis showed that 44.44% of proteins were commonly found. However, the concordance between mRNA level and the protein abundance varied across different proteins, stages, tissues and inbred lines, according to the gene cloning and expression analyses of four relevant proteins with important functions and different expression levels. Western blot results nevertheless showed the same expression tendency for the four proteins as iTRAQ. These results could provide new insights into the developmental mechanisms of endosperm and pericarp, and grain formation in maize.

  18. Building and analyzing protein interactome networks by cross-species comparisons

    Directory of Open Access Journals (Sweden)

    Blackman Barron

    2010-03-01

    Full Text Available Abstract Background: A genomic catalogue of protein-protein interactions is a rich source of information, particularly for exploring the relationships between proteins. Numerous systems-wide and small-scale experiments have been conducted to identify interactions; however, our knowledge of all interactions for any one species is incomplete, and alternative means to expand these network maps are needed. We therefore took a comparative biology approach to predict protein-protein interactions across five species (human, mouse, fly, worm, and yeast) and developed InterologFinder for research biologists to easily navigate this data. We also developed a confidence score for interactions based on available experimental evidence and conservation across species. Results: The connectivity of the resultant networks was determined to have scale-free distribution, small-world properties, and increased local modularity, indicating that the added interactions do not disrupt our current understanding of protein network structures. We show examples of how these improved interactomes can be used to analyze a genome-scale dataset (RNAi screen) and to assign new function to proteins. Predicted interactions within this dataset were tested by co-immunoprecipitation, resulting in a high rate of validation, suggesting the high quality of networks produced. Conclusions: Protein-protein interactions were predicted in five species, based on orthology. An InteroScore, a score accounting for homology, number of orthologues with evidence of interactions, and number of unique observations of interactions, is given to each known and predicted interaction. Our website http://www.interologfinder.org provides research biologists intuitive access to this data.

  19. Revisiting date and party hubs: novel approaches to role assignment in protein interaction networks.

    Science.gov (United States)

    Agarwal, Sumeet; Deane, Charlotte M; Porter, Mason A; Jones, Nick S

    2010-06-17

    The idea of "date" and "party" hubs has been influential in the study of protein-protein interaction networks. Date hubs display low co-expression with their partners, whilst party hubs have high co-expression. It was proposed that party hubs are local coordinators whereas date hubs are global connectors. Here, we show that the reported importance of date hubs to network connectivity can in fact be attributed to a tiny subset of them. Crucially, these few, extremely central, hubs do not display particularly low expression correlation, undermining the idea of a link between this quantity and hub function. The date/party distinction was originally motivated by an approximately bimodal distribution of hub co-expression; we show that this feature is not always robust to methodological changes. Additionally, topological properties of hubs do not in general correlate with co-expression. However, we find significant correlations between interaction centrality and the functional similarity of the interacting proteins. We suggest that thinking in terms of a date/party dichotomy for hubs in protein interaction networks is not meaningful, and it might be more useful to conceive of roles for protein-protein interactions rather than for individual proteins.
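
    The quantity at the centre of the date/party discussion is a hub's co-expression with its interaction partners. A minimal sketch of one common way to compute it, the average Pearson correlation between the hub's expression profile and each partner's profile, is given below. The function names and data layout are hypothetical, and no classification cutoff is included because the abstract gives none and argues that the dichotomy itself is not robust.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / (norm_x * norm_y)


def average_coexpression(hub, partners, expression):
    """Mean Pearson correlation between a hub and its interaction partners.

    expression: dict mapping a protein to its expression profile, i.e. a
        list of measurements taken over the same conditions for every protein.
    """
    if not partners:
        raise ValueError("hub has no partners")
    values = [pearson(expression[hub], expression[p]) for p in partners]
    return sum(values) / len(values)
```

    Under the original proposal, a hub with a high average would be called a party hub and one with a low average a date hub; where to draw that line is exactly the methodological choice the abstract reports the bimodality to be sensitive to.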

  20. Revisiting date and party hubs: novel approaches to role assignment in protein interaction networks.

    Directory of Open Access Journals (Sweden)

    Sumeet Agarwal

    2010-06-01

    Full Text Available The idea of "date" and "party" hubs has been influential in the study of protein-protein interaction networks. Date hubs display low co-expression with their partners, whilst party hubs have high co-expression. It was proposed that party hubs are local coordinators whereas date hubs are global connectors. Here, we show that the reported importance of date hubs to network connectivity can in fact be attributed to a tiny subset of them. Crucially, these few, extremely central, hubs do not display particularly low expression correlation, undermining the idea of a link between this quantity and hub function. The date/party distinction was originally motivated by an approximately bimodal distribution of hub co-expression; we show that this feature is not always robust to methodological changes. Additionally, topological properties of hubs do not in general correlate with co-expression. However, we find significant correlations between interaction centrality and the functional similarity of the interacting proteins. We suggest that thinking in terms of a date/party dichotomy for hubs in protein interaction networks is not meaningful, and it might be more useful to conceive of roles for protein-protein interactions rather than for individual proteins.

  1. Mass-action equilibrium and non-specific interactions in protein binding networks

    Science.gov (United States)

    Maslov, Sergei

    2009-03-01

    Large-scale protein binding networks serve as a paradigm of complex properties of living cells. These networks are naturally weighted with edges characterized by binding strength and protein-nodes -- by their concentrations. However, the state-of-the-art high-throughput experimental techniques generate just a binary (yes or no) information about individual interactions. As a result, most of the previous research concentrated just on topology of these networks. In a series of recent publications [1-4] my collaborators and I went beyond purely topological studies and calculated the mass-action equilibrium of a genome-wide binding network using experimentally determined protein concentrations, localizations, and reliable binding interactions in baker's yeast. We then studied how this equilibrium responds to large perturbations [1-2] and noise [3] in concentrations of proteins. We demonstrated that the change in the equilibrium concentration of a protein exponentially decays (and sign-alternates) with its network distance away from the perturbed node. This explains why, despite a globally connected topology, individual functional modules in such networks are able to operate fairly independently. In a separate study [4] we quantified the interplay between specific and non-specific binding interactions under crowded conditions inside living cells. We show how the need to limit the waste of resources constrains the number of types and concentrations of proteins that are present at the same time and at the same place in yeast cells. [1] S Maslov, I. Ispolatov, PNAS 104:13655 (2007). [2] S. Maslov, K. Sneppen, I. Ispolatov, New J. of Phys. 9: 273 (2007). [3] K-K. Yan, D. Walker, S. Maslov, PRL accepted (2008). [4] J. Zhang, S. Maslov, and E. I. Shakhnovich, Mol Syst Biol 4, 210 (2008).
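
    To make the notion of a mass-action equilibrium on a binding network concrete, here is a minimal sketch, not the authors' code. It assumes that only heterodimers form, so that the law of mass action gives each dimer concentration as D_ij = F_i F_j / K_ij, and conservation of mass gives the free concentration F_i = C_i / (1 + sum_j F_j / K_ij), where C_i is the total concentration and K_ij the dissociation constant. The damped fixed-point iteration, the function name and the numbers in the usage note are illustrative choices.

```python
def free_concentrations(total, dissociation, iterations=500, damping=0.5):
    """Solve F_i = C_i / (1 + sum_j F_j / K_ij) by damped iteration.

    total: dict mapping a protein to its total concentration C_i.
    dissociation: dict mapping an unordered pair (a, b), with a != b,
        to the dissociation constant K_ab in the same units as C_i.
    Returns a dict mapping each protein to its free concentration F_i.
    """
    partners = {name: [] for name in total}
    for (a, b), k in dissociation.items():
        if a == b:
            raise ValueError("this sketch handles heterodimers only")
        partners[a].append((b, k))
        partners[b].append((a, k))

    free = dict(total)  # start with all protein unbound
    for _ in range(iterations):
        updated = {}
        for name, c_total in total.items():
            bound_factor = sum(free[j] / k for j, k in partners[name])
            target = c_total / (1.0 + bound_factor)
            # Averaging old and new values damps oscillations.
            updated[name] = damping * free[name] + (1.0 - damping) * target
        free = updated
    return free
```

    With two proteins at total concentration 1 and K = 1, the free concentration of each settles near 0.618, and the dimer accounts for the remaining 0.382. Perturbing one total concentration and re-solving is the kind of experiment the abstract describes when it studies how the equilibrium responds to perturbations.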

  2. Identification and network of outer membrane proteins regulating streptomysin resistance in Escherichia coli.

    Science.gov (United States)

    Li, Hui; Wang, Bao-Cheng; Xu, Wen-Jiao; Lin, Xiang-Min; Peng, Xuan-Xian

    2008-09-01

    Bacterial outer membrane (OM) proteins involved in antibiotic resistance have been reported. However, little is known about the OM proteins and their interaction network regulating streptomycin (SM) resistance. In the present study, a subproteomic approach was utilized to characterize OM proteins of Escherichia coli with SM resistance. TolC, OmpT and LamB were found to be up-regulated, and FadL, OmpW and a location-unknown protein Dps were down-regulated in the SM-resistant E. coli strain. These changes at the level of protein expression were validated using Western blotting. The possible roles of the altered proteins involved in the SM resistance were investigated using genetically modified strains with the deletion of these altered genes. It is found that decreased and elevated minimum inhibitory concentrations and survival capabilities of the gene deleted strains and their resistant strains, Delta tolC, Delta ompT, Delta dps, Delta tolC-R, Delta ompT-R, Delta dps-R and Delta fadL-R, were correlated with the changes of TolC, OmpT, Dps and FadL at the protein expression levels detected by 2-DE gels, respectively. The results may suggest that these proteins are the key OM proteins and play important roles in the regulation of SM resistance in E. coli. Furthermore, an interaction network of altered OM proteins involved in the SM resistance was proposed in this report. Of the six altered proteins, TolC may play a central role in the network. These findings may provide novel insights into mechanisms of SM resistance in E. coli.

  3. Sequence similarity network reveals common ancestry of multidomain proteins.

    Directory of Open Access Journals (Sweden)

    Nan Song

    2008-05-01

    We address the problem of homology identification in complex multidomain families with varied domain architectures. The challenge is to distinguish sequence pairs that share common ancestry from pairs that share an inserted domain but are otherwise unrelated. This distinction is essential for accuracy in gene annotation, function prediction, and comparative genomics. There are two major obstacles to multidomain homology identification: lack of a formal definition and lack of curated benchmarks for evaluating the performance of new methods. We offer preliminary solutions to both problems: (1) an extension of the traditional model of homology to include domain insertions; and (2) a manually curated benchmark of well-studied families in mouse and human. We further present Neighborhood Correlation, a novel method that exploits the local structure of the sequence similarity network to identify homologs with great accuracy, based on the observation that gene duplication and domain shuffling leave distinct patterns in the sequence similarity network. In a rigorous, empirical comparison using our curated data, Neighborhood Correlation outperforms sequence similarity, alignment length, and domain architecture comparison. Neighborhood Correlation is well suited for automated, genome-scale analyses. It is easy to compute, does not require explicit knowledge of domain architecture, and classifies both single and multidomain homologs with high accuracy. Homolog predictions obtained with our method, as well as our manually curated benchmark and a web-based visualization tool for exploratory analysis of the network neighborhood structure, are available at http://www.neighborhoodcorrelation.org. Our work represents a departure from the prevailing view that the concept of homology cannot be applied to genes that have undergone domain shuffling. In contrast to current approaches that either focus on the homology of individual domains or consider only families with
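
    The idea of comparing two sequences by their neighborhoods in the similarity network can be sketched as follows. The abstract does not give the exact score definition, so this sketch assumes a plain Pearson correlation between rows of a similarity-score matrix as a stand-in; the scores themselves are hypothetical. Sequences A and B hit the same neighbors, while C shares only a single strong hit with them, as an inserted domain might produce.

```python
import numpy as np

def neighborhood_correlation(scores):
    """Pearson correlation between every pair of rows of a similarity matrix."""
    return np.corrcoef(scores)

# Hypothetical similarity scores: rows are query sequences,
# columns are sequences in the database.
scores = np.array([
    [9.0, 8.0, 7.0, 0.0, 0.0, 0.0],   # sequence A
    [8.0, 9.0, 6.0, 0.0, 0.0, 0.0],   # sequence B: same neighbours as A
    [0.0, 0.0, 7.0, 9.0, 8.0, 7.0],   # sequence C: shares only column 2 with A
])

nc = neighborhood_correlation(scores)
print("A vs B:", round(nc[0, 1], 3))
print("A vs C:", round(nc[0, 2], 3))
```

    In this toy case the A-B pair scores close to 1 while the A-C pair scores below zero, even though A and C share one hit of equal strength.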

  4. Characterization of the CLASP2 Protein Interaction Network Identifies SOGA1 as a Microtubule-Associated Protein

    DEFF Research Database (Denmark)

    Sørensen, Rikke Kruse; Krantz, James; Barker, Natalie

    2017-01-01

    and built a CLASP2 protein network in 3T3-L1 adipocytes. Using two different commercially available antibodies for CLASP2 and an antibody for epitope-tagged, overexpressed CLASP2, we performed multiple affinity purification coupled with mass spectrometry (AP-MS) experiments in combination with label-free......, glycogen synthase, and glycogenin. Investigating the SOGA1 interactome confirmed SOGA1 can reciprocal co-IP both CLASP2 and MARK2 as well as glycogen synthase and glycogenin. SOGA1 was confirmed to colocalize with CLASP2 and also with tubulin, which identifies SOGA1 as a new microtubule-associated protein...

  5. Protein-protein networks construction and their relevance measurement based on multi-epitope-ligand-kartographie and gene ontology data of T-cell surface proteins for polymyositis.

    Science.gov (United States)

    Li, Fang-Zhen; Gao, Feng

    2012-08-01

    Polymyositis is an inflammatory myopathy characterized by muscle invasion of T-cells penetrating the basal lamina and displacing the plasma membrane of normal muscle fibers. In order to understand the different adhesive mechanisms at the T-cell surface, Schubert randomly selected 19 proteins expressed at the T-cell surface and studied them using the MELK technique [4]; we selected 15 of these proteins for further study. Two types of functional similarity networks are constructed for these proteins. The first type is the MELK similarity network, which is constructed from their MELK data using McNemar's test [24]. The second type is the GO similarity network, which is constructed from their GO annotation data using the RSS method to measure functional similarity. The subset surprisology theory is then employed to measure the degree of similarity between the two networks. Our computational results show that these two types of networks are highly related. This conclusion adds value to the MELK technique and greatly expands its applications.

  6. Dynamic modular architecture of protein-protein interaction networks beyond the dichotomy of 'date' and 'party' hubs.

    Science.gov (United States)

    Chang, Xiao; Xu, Tao; Li, Yun; Wang, Kai

    2013-01-01

    The protein-protein interaction (PPI) networks are dynamically organized as modules, and are typically described by hub dichotomy: 'party' hubs act as intramodule hubs and are coexpressed with their partners, yet 'date' hubs act as coordinators among modules and are incoherently expressed with their partners. However, there remains skepticism about the existence of hub dichotomy. Since different algorithms and data sets were used in previous studies to test the model of hub classification, the conclusions may be largely influenced by the potential inherent biases. In this study, we evaluated two data sets of yeast interactome, and systematically investigated the behavior of hubs from multiple perspectives including co-expression patterns, topological roles and functional classifications. Our results revealed consistency between the two data sets, confirming the presence of hub dichotomy. Furthermore, we analyzed a human interactome data set, and demonstrated that the modular architecture of the PPI networks was more complicated than hub dichotomy.

  7. Similar Pathogen Targets in Arabidopsis thaliana and Homo sapiens Protein Networks

    Science.gov (United States)

    2012-09-21

    the average degree (kav) and average number of pathogen effectors per protein (pav) in each shell for both networks - despite being associated with...species from different kingdoms (for humans, power-law regression produces pav = 0.033·kav^0.744, r² = 0.788, p = 3.039×10^-10, MIC = 0.820; for Arabidopsis, pav ...pathogen interactions (pav) per node in a shell (log-log scale) with power-law fits. The core of each network is circled. (A) Human protein interaction
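
    A power-law regression of this kind is commonly done as a straight-line fit on log-transformed data. The sketch below assumes that approach (the snippet does not say how the fit was performed) and uses synthetic, noise-free points generated from the human-network prefactor and exponent in the snippet above, so it only checks that the procedure recovers them.

```python
import numpy as np

def fit_power_law(k, p):
    """Fit p = a * k**b by least squares on log-transformed data."""
    slope, intercept = np.polyfit(np.log(k), np.log(p), 1)
    return np.exp(intercept), slope

# Synthetic data (not the study's measurements), generated from the
# prefactor 0.033 and exponent 0.744 quoted for the human network.
k_av = np.array([1.0, 2.0, 5.0, 10.0, 20.0, 50.0])
p_av = 0.033 * k_av ** 0.744

a, b = fit_power_law(k_av, p_av)
print(f"a = {a:.3f}, b = {b:.3f}")
```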

  8. A Bayesian framework for cell-level protein network analysis for multivariate proteomics image data

    Science.gov (United States)

    Kovacheva, Violet N.; Sirinukunwattana, Korsuk; Rajpoot, Nasir M.

    2014-03-01

    The recent development of multivariate imaging techniques, such as the Toponome Imaging System (TIS), has facilitated the analysis of multiple co-localisation of proteins. This could hold the key to understanding complex phenomena such as protein-protein interaction in cancer. In this paper, we propose a Bayesian framework for cell level network analysis allowing the identification of several protein pairs having significantly higher co-expression levels in cancerous tissue samples when compared to normal colon tissue. It involves segmenting the DAPI-labeled image into cells and determining the cell phenotypes according to their protein-protein dependence profile. The cells are phenotyped using Gaussian Bayesian hierarchical clustering (GBHC) after feature selection is performed. The phenotypes are then analysed using Difference in Sums of Weighted cO-dependence Profiles (DiSWOP), which detects differences in the co-expression patterns of protein pairs. We demonstrate that the pairs highlighted by the proposed framework have high concordance with recent results using a different phenotyping method. This demonstrates that the results are independent of the clustering method used. In addition, the highlighted protein pairs are further analysed via protein interaction pathway databases and by considering the localization of high protein-protein dependence within individual samples. This suggests that the proposed approach could identify potentially functional protein complexes active in cancer progression and cell differentiation.

  9. Modification of gene duplicability during the evolution of protein interaction network.

    Directory of Open Access Journals (Sweden)

    Matteo D'Antonio

    2011-04-01

    Duplications of genes encoding highly connected and essential proteins are selected against in several species but not in human, where duplicated genes encode highly connected proteins. To understand when and how gene duplicability changed in evolution, we compare gene and network properties in four species (Escherichia coli, yeast, fly, and human) that are representative of the increase in evolutionary complexity, defined as progressive growth in the number of genes, cells, and cell types. We find that the origin and conservation of a gene significantly correlate with the properties of the encoded protein in the protein-protein interaction network. All four species preserve a core of singleton and central hubs that originated early in evolution, are highly conserved, and accomplish basic biological functions. Another group of hubs appeared in metazoans and duplicated in vertebrates, mostly through vertebrate-specific whole genome duplication. Such recent and duplicated hubs are frequently targets of microRNAs and show tissue-selective expression, suggesting that these are alternative mechanisms to control their dosage. Our study shows how networks were modified during evolution and contributes to explaining the occurrence of somatic genetic diseases, such as cancer, in terms of network perturbations.

  10. AtPID: the overall hierarchical functional protein interaction network interface and analytic platform for Arabidopsis.

    Science.gov (United States)

    Li, Peng; Zang, Weidong; Li, Yuhua; Xu, Feng; Wang, Jigang; Shi, Tieliu

    2011-01-01

    Protein interactions are involved in important cellular functions and biological processes that are the fundamentals of all life activities. With improvements in experimental techniques and progress in research, the overall protein interaction network frameworks of several model organisms have been created through data collection and integration. However, most of the processed networks show only simple relationships without boundary, weight or direction, and so do not truly reflect the biological reality. In vivo, different types of protein interactions, such as the assembly of protein complexes or phosphorylation, often have their specific functions and qualifications. Ignoring these features introduces considerable bias into network analysis and application. Therefore, we annotate the Arabidopsis proteins in the AtPID database with further information (e.g. functional annotation, subcellular localization, tissue-specific expression, phosphorylation information, SNP phenotype and mutant phenotype) and interaction qualifications (e.g. transcriptional regulation, complex assembly, functional collaboration) via further literature text mining and integration of other resources. The related information is displayed to users through comprehensive, newly developed display and analytical tools. The system allows the construction of tissue-specific interaction networks with display of canonical pathways. The latest updated AtPID database is available at http://www.megabionet.org/atpid/.

  11. HPIminer: A text mining system for building and visualizing human protein interaction networks and pathways.

    Science.gov (United States)

    Subramani, Suresh; Kalpana, Raja; Monickaraj, Pankaj Moses; Natarajan, Jeyakumar

    2015-04-01

    Knowledge of protein-protein interactions (PPI) and of their related pathways is equally important for understanding the biological functions of the living cell. Such information on human proteins is highly desirable for understanding the mechanisms of several diseases such as cancer, diabetes, and Alzheimer's disease. Because much of that information is buried in biomedical literature, an automated text mining system for visualizing human PPI and pathways is highly desirable. In this paper, we present HPIminer, a text mining system for visualizing human protein interactions and pathways from biomedical literature. HPIminer extracts human PPI information and PPI pairs from biomedical literature, and visualizes their associated interactions, networks and pathways using two curated databases, HPRD and KEGG. To our knowledge, HPIminer is the first system to build interaction networks from literature as well as curated databases. Further, interactions mined only from the literature and not previously reported in databases are highlighted as new. A comparative study with other similar tools shows that the resultant network is more informative and provides additional information on interacting proteins and their associated networks.

  12. Coevolution analysis of Hepatitis C virus genome to identify the structural and functional dependency network of viral proteins

    Science.gov (United States)

    Champeimont, Raphaël; Laine, Elodie; Hu, Shuang-Wei; Penin, Francois; Carbone, Alessandra

    2016-05-01

    A novel computational approach of coevolution analysis allowed us to reconstruct the protein-protein interaction network of the Hepatitis C Virus (HCV) at the residue resolution. For the first time, coevolution analysis of an entire viral genome was realized, based on a limited set of protein sequences with high sequence identity within genotypes. The identified coevolving residues constitute highly relevant predictions of protein-protein interactions for further experimental identification of HCV protein complexes. The method can be used to analyse other viral genomes and to predict the associated protein interaction networks.

  13. How curved membranes recruit amphipathic helices and protein anchoring motifs.

    Science.gov (United States)

    Hatzakis, Nikos S; Bhatia, Vikram K; Larsen, Jannik; Madsen, Kenneth L; Bolinger, Pierre-Yves; Kunding, Andreas H; Castillo, John; Gether, Ulrik; Hedegård, Per; Stamou, Dimitrios

    2009-11-01

    Lipids and several specialized proteins are thought to be able to sense the curvature of membranes (MC). Here we used quantitative fluorescence microscopy to measure curvature-selective binding of amphipathic motifs on single liposomes 50-700 nm in diameter. Our results revealed that sensing is predominantly mediated by a higher density of binding sites on curved membranes instead of higher affinity. We proposed a model based on curvature-induced defects in lipid packing that related these findings to lipid sorting and accurately predicted the existence of a new ubiquitous class of curvature sensors: membrane-anchored proteins. The fact that unrelated structural motifs such as alpha-helices and alkyl chains sense MC led us to propose that MC sensing is a generic property of curved membranes rather than a property of the anchoring molecules. We therefore anticipate that MC will promote the redistribution of proteins that are anchored in membranes through other types of hydrophobic moieties.

  14. Do natural proteins differ from random sequences polypeptides? Natural vs. random proteins classification using an evolutionary neural network.

    Directory of Open Access Journals (Sweden)

    Davide De Lucrezia

    Are extant proteins the exquisite result of natural selection or are they random sequences slightly edited by evolution? This question has puzzled biochemists for a long time, and several groups have addressed it by comparing natural protein sequences to completely random ones, reaching contradictory conclusions. Previous work in the literature focused on the analysis of primary structure in an attempt to identify a possible signature of evolutionary editing. In contrast, in this work we compare a set of 762 natural proteins with an average length of 70 amino acids and an equal number of completely random ones of comparable length on the basis of their structural features. We use an ad hoc Evolutionary Neural Network Algorithm (ENNA) to assess whether and to what extent natural proteins are edited from random polypeptides, employing 11 different structure-related variables (i.e. net charge, volume, surface area, coil, alpha helix, beta sheet, percentage of coil, percentage of alpha helix, percentage of beta sheet, percentage of secondary structure and surface hydrophobicity). The ENNA algorithm is capable of correctly distinguishing natural proteins from random ones with an accuracy of 94.36%. Furthermore, we study the structural features of 32 random polypeptides misclassified as natural ones to unveil any structural similarity to natural proteins. Results show that random proteins misclassified by the ENNA algorithm exhibit a significant fold similarity to portions or subdomains of extant proteins at atomic resolution. Altogether, our results suggest that natural proteins are significantly edited from random polypeptides and that evolutionary editing can be readily detected by analyzing structural features. Furthermore, we also show that the ENNA, employing simple structural descriptors, can predict whether a protein chain is natural or random.

  15. Supervised maximum-likelihood weighting of composite protein networks for complex prediction

    Directory of Open Access Journals (Sweden)

    Yong Chern Han

    2012-12-01

    Background: Protein complexes participate in many important cellular functions, so finding the set of existent complexes is essential for understanding the organization and regulation of processes in the cell. With the availability of large amounts of high-throughput protein-protein interaction (PPI) data, many algorithms have been proposed to discover protein complexes from PPI networks. However, such approaches are hindered by the high rate of noise in high-throughput PPI data, including spurious and missing interactions. Furthermore, many transient interactions are detected between proteins that are not from the same complex, while not all proteins from the same complex may actually interact. As a result, predicted complexes often do not match true complexes well, and many true complexes go undetected. Results: We address these challenges by integrating PPI data with other heterogeneous data sources to construct a composite protein network, and using a supervised maximum-likelihood approach to weight each edge based on its posterior probability of belonging to a complex. We then use six different clustering algorithms, and an aggregative clustering strategy, to discover complexes in the weighted network. We test our method on Saccharomyces cerevisiae and Homo sapiens, and show that complex discovery is improved: compared to previously proposed supervised and unsupervised weighting approaches, our method recalls more known complexes, achieves higher precision at all recall levels, and generates novel complexes of greater functional similarity. Furthermore, our maximum-likelihood approach allows learned parameters to be used to visualize and evaluate the evidence of novel predictions, aiding human judgment of their credibility. Conclusions: Our approach integrates multiple data sources with supervised learning to create a weighted composite protein network, and uses six clustering algorithms with an aggregative clustering strategy to

  16. A multilayer protein-protein interaction network analysis of different life stages in Caenorhabditis elegans

    Science.gov (United States)

    Shinde, Pramod; Jalan, Sarika

    2015-12-01

    Molecular networks act as the backbone of cellular activities, providing an excellent opportunity to understand the developmental changes in an organism. While network data usually constitute only stationary network graphs, constructing a multilayer PPI network may provide clues to the particular developmental role at each stage of life and may unravel the importance of these developmental changes. The developmental biology model of Caenorhabditis elegans analyzed here provides a ripe platform to understand the patterns of evolution during the life stages of an organism. In the present study, the widely studied network properties exhibit overall similar statistics for all the PPI layers. Further, the analysis of the degree-degree correlation and spectral properties not only reveals crucial differences in each PPI layer but also indicates the presence of the varying complexity among them. The PPI layer of the nematode life stage exhibits various network properties different to the rest of the PPI layers, indicating the specific role of cellular diversity and developmental transitions at this stage. The framework presented here provides a direction to explore and understand the developmental changes occurring in the different life stages of an organism.

  17. The Oncogenic Palmitoyl-Protein Network in Prostate Cancer

    Science.gov (United States)

    2015-06-01

    SILAC amino acids in parallel. One group of control cells were cultured in “light” medium containing natural lysine (Lys0) and arginine (Arg0), DHHC3...knockdown cells were cultured in “heavy” medium containing 13C6,15N2- lysine (Lys8) and 13C6,15N4- arginine (Arg10), and the other group of control...cells were cultured in “medium” medium containing 4,4,5,5-D4- lysine (Lys4) and 13C6- arginine (Arg6). After six doublings, when cellular proteins were

  18. Salt-bridge networks within globular and disordered proteins: characterizing trends for designable interactions.

    Science.gov (United States)

    Basu, Sankar; Mukharjee, Debasish

    2017-07-01

    There has been considerable debate about the contribution of salt bridges to the stabilization of protein folds, in spite of their participation in crucial protein functions. Salt bridges appear to contribute to the activity-stability trade-off within proteins by bringing high-entropy charged amino acids into close contacts during the course of their functions. The current study analyzes the modes of association of salt bridges (in terms of networks) within globular proteins and at protein-protein interfaces. While the most common and trivial type of salt bridge is the isolated salt bridge, bifurcated salt bridge appears to be a distinct salt-bridge motif having a special topology and geometry. Bifurcated salt bridges are found ubiquitously in proteins and interprotein complexes. Interesting and attractive examples presenting different modes of interaction are highlighted. Bifurcated salt bridges appear to function as molecular clips that are used to stitch together large surface contours at interacting protein interfaces. The present work also emphasizes the key role of salt-bridge-mediated interactions in the partial folding of proteins containing long stretches of disordered regions. Salt-bridge-mediated interactions seem to be pivotal to the promotion of "disorder-to-order" transitions in small disordered protein fragments and their stabilization upon binding. The results obtained in this work should help to guide efforts to elucidate the modus operandi of these partially disordered proteins, and to conceptualize how these proteins manage to maintain the required amount of disorder even in their bound forms. This work could also potentially facilitate explorations of geometrically specific designable salt bridges through the characterization of composite salt-bridge networks.

  19. Integration of relational and hierarchical network information for protein function prediction

    Directory of Open Access Journals (Sweden)

    Jiang Xiaoyu

    2008-08-01

    Background: In the current climate of high-throughput computational biology, the inference of a protein's function from related measurements, such as protein-protein interaction relations, has become a canonical task. Most existing technologies pursue this task as a classification problem, on a term-by-term basis, for each term in a database, such as the Gene Ontology (GO) database, a popular rigorous vocabulary for biological functions. However, ontology structures are essentially hierarchies, with certain top-to-bottom annotation rules which protein function predictions should in principle follow. Currently, the most common approach to imposing these hierarchical constraints on network-based classifiers is to apply transitive closure to the predictions. Results: We propose a probabilistic framework to integrate information in relational data, in the form of a protein-protein interaction network, and a hierarchically structured database of terms, in the form of the GO database, for the purpose of protein function prediction. At the heart of our framework is a factorization of local neighborhood information in the protein-protein interaction network across successive ancestral terms in the GO hierarchy. We introduce a classifier within this framework, with computationally efficient implementation, that produces GO-term predictions that naturally obey a hierarchical 'true-path' consistency from root to leaves, without the need for further post-processing. Conclusion: A cross-validation study, using data from the yeast Saccharomyces cerevisiae, shows our method offers substantial improvements over both standard 'guilt-by-association' (i.e., Nearest-Neighbor) and more refined Markov random field methods, whether in their original form or when post-processed to artificially impose 'true-path' consistency. Further analysis of the results indicates that these improvements are associated with increased predictive capabilities (i.e., increased
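
    The transitive-closure post-processing mentioned in this abstract as the common baseline can be sketched in a few lines: whenever a term is predicted, every ancestor of that term is predicted as well, which makes the result 'true-path' consistent. The hierarchy and term names below are a hypothetical toy example, not real GO data, and this is the baseline the authors compare against rather than their probabilistic classifier.

```python
def ancestors(term, parents):
    """Return every ancestor of `term` in a DAG given as {child: [parents]}."""
    seen = set()
    stack = list(parents.get(term, []))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(parents.get(node, []))
    return seen

def close_predictions(predicted, parents):
    """Add all ancestors of each predicted term (transitive closure)."""
    closed = set(predicted)
    for term in predicted:
        closed |= ancestors(term, parents)
    return closed

# Hypothetical toy hierarchy: child term -> list of parent terms.
parents = {
    "kinase activity": ["catalytic activity"],
    "catalytic activity": ["molecular function"],
    "binding": ["molecular function"],
}

closed = close_predictions({"kinase activity"}, parents)
print(sorted(closed))
```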

  20. Atomic resolution structure of cucurmosin, a novel type 1 ribosome-inactivating protein from the sarcocarp of Cucurbita moschata

    Energy Technology Data Exchange (ETDEWEB)

    Hou, Xiaomin; Meehan, Edward J.; Xie, Jieming; Huang, Mingdong; Chen, Minghuang; Chen, Liqing (UAH); (Fujian); (Chinese Aca. Sci.)

    2008-10-27

    A novel type 1 ribosome-inactivating protein (RIP) designated cucurmosin was isolated from the sarcocarp of Cucurbita moschata (pumpkin). Besides rRNA N-glycosidase activity, cucurmosin exhibits strong cytotoxicities to three cancer cell lines of both human and murine origins, but low toxicity to normal cells. Plant genomic DNA extracted from the tender leaves was amplified by PCR between primers based on the N-terminal sequence and X-ray sequence of the C-terminal. The complete mature protein sequence was obtained from N-terminal protein sequencing and partial DNA sequencing, confirmed by high resolution crystal structure analysis. The crystal structure of cucurmosin has been determined at 1.04 Å, a resolution that has never been achieved before for any RIP. The structure contains two domains: a large N-terminal domain composed of seven α-helices and eight β-strands, and a smaller C-terminal domain consisting of three α-helices and two β-strands. The high resolution structure established a glycosylation pattern of GlcNAc2Man3Xyl. Asn225 was identified as a glycosylation site. Residues Tyr70, Tyr109, Glu158 and Arg161 define the active site of cucurmosin as an RNA N-glycosidase. The structural basis of cytotoxicity difference between cucurmosin and trichosanthin is discussed.

  1. SLIDER: A Generic Metaheuristic for the Discovery of Correlated Motifs in Protein-Protein Interaction Networks

    NARCIS (Netherlands)

    Boyen, P.; Dyck, van D.; Neven, F.; Ham, van R.C.H.J.; Dijk, van A.D.J.

    2011-01-01

    Correlated motif mining (CMM) is the problem of finding overrepresented pairs of patterns, called motifs, in sequences of interacting proteins. Algorithmic solutions for CMM thereby provide a computational method for predicting binding sites for protein interaction. In this paper, we adopt a

  2. Global Alignment of Pairwise Protein Interaction Networks for Maximal Common Conserved Patterns

    Directory of Open Access Journals (Sweden)

    Wenhong Tian

    2013-01-01

    A number of tools for the alignment of protein-protein interaction (PPI) networks have laid the foundation for PPI network analysis. Most alignment tools focus on finding conserved interaction regions across the PPI networks through either local or global mapping of similar sequences. Researchers are still trying to improve the speed, scalability, and accuracy of network alignment. In view of this, we introduce a fast algorithm based on connected components, HopeMap, for network alignment. Observing that the number of true orthologs across species is small compared to the total number of proteins in all species, we take a different approach based on a precompiled list of homologs identified by KO terms. Applying this approach to S. cerevisiae (yeast) and D. melanogaster (fly), E. coli K12 and S. typhimurium, and E. coli K12 and C. crescentus, we analyze all clusters identified in the alignment. The results are evaluated through up-to-date known gene annotations, gene ontology (GO), and KEGG ortholog groups (KO). Compared to existing tools, our approach is fast with linear computational cost, highly accurate in terms of KO and GO term specificity and sensitivity, and can be extended to multiple alignments easily.

  3. Predicting adverse drug reaction profiles by integrating protein interaction networks with drug structures.

    Science.gov (United States)

    Huang, Liang-Chin; Wu, Xiaogang; Chen, Jake Y

    2013-01-01

    The prediction of adverse drug reactions (ADRs) has become increasingly important, due to rising concern about serious ADRs that can cause drugs to fail to reach or stay in the market. We proposed a framework for predicting ADR profiles by integrating protein-protein interaction (PPI) networks with drug structures. We compared ADR prediction performance over 18 ADR categories using four feature groups: drug targets only, drug targets with PPI networks, drug structures, and drug targets with PPI networks plus drug structures. The results showed that the integration of PPI networks and drug structures can significantly improve ADR prediction performance. The median AUC values for the four groups were 0.59, 0.61, 0.65, and 0.70. We used the protein features in the best two models, "Cardiac disorders" (median AUC: 0.82) and "Psychiatric disorders" (median AUC: 0.76), to build ADR-specific PPI networks with literature support. For validation, we examined 30 drugs withdrawn from the U.S. market to see whether our approach can predict their ADR profiles and explain why they were withdrawn. Except for three drugs having ADRs in categories we did not predict, 25 out of 27 withdrawn drugs (92.6%) having severe ADRs were successfully predicted by our approach.

  4. PerturbationAnalyzer: a tool for investigating the effects of concentration perturbation on protein interaction networks.

    Science.gov (United States)

    Li, Fei; Li, Peng; Xu, Wenjian; Peng, Yuxing; Bo, Xiaochen; Wang, Shengqi

    2010-01-15

    The propagation of perturbations in protein concentration through a protein interaction network (PIN) can shed light on network dynamics and function. In order to facilitate this type of study, PerturbationAnalyzer, which is an open source plugin for Cytoscape, has been developed. PerturbationAnalyzer can be used in manual mode for simulating user-defined perturbations, as well as in batch mode for evaluating network robustness and identifying significant proteins that cause large propagation effects in the PINs when their concentrations are perturbed. Results from PerturbationAnalyzer can be represented in an intuitive and customizable way and can also be exported for further exploration. PerturbationAnalyzer has great potential in mining the design principles of protein networks, and may be a useful tool for identifying drug targets. PerturbationAnalyzer can be accessed from the Cytoscape web site http://www.cytoscape.org/plugins/index.php or http://biotech.bmi.ac.cn/PerturbationAnalyzer. Supplementary data are available at Bioinformatics online.

  5. The effect of oil type on network formation by protein aggregates into oleogels

    NARCIS (Netherlands)

    Vries, de Auke; Lopez Gomez, Yuly; Linden, van der Erik; Scholten, Elke

    2017-01-01

    The aim of this study was to assess the effect of oil type on the network formation of heat-set protein aggregates in liquid oil. The gelling properties of such aggregates to structure oil into so-called ‘oleogels’ are related to both the particle-particle and particle-solvent interactions. To

  6. Salivary Defense Proteins: Their Network and Role in Innate and Acquired Oral Immunity

    Directory of Open Access Journals (Sweden)

    Gábor Fábián

    2012-04-01

    There are numerous defense proteins present in the saliva. Although some of these molecules are present in rather low concentrations, their effects are additive and/or synergistic, resulting in an efficient molecular defense network of the oral cavity. Moreover, local concentrations of these proteins near the mucosal surfaces (mucosal transudate), periodontal sulcus (gingival crevicular fluid) and oral wounds and ulcers (transudate) may be much greater, and in many cases reinforced by immune and/or inflammatory reactions of the oral mucosa. Some defense proteins, like salivary immunoglobulins and salivary chaperokine HSP70/HSPAs (70 kDa heat shock proteins), are involved in both innate and acquired immunity. Cationic peptides and other defense proteins like lysozyme, bactericidal/permeability increasing protein (BPI), BPI-like proteins, PLUNC (palate lung and nasal epithelial clone) proteins, salivary amylase, cystatins, proline-rich proteins, mucins, peroxidases, statherin and others are primarily responsible for innate immunity. In this paper, this complex system and function of the salivary defense proteins will be reviewed.

  7. A topology-constrained distance network algorithm for protein structure determination from NOESY data.

    Science.gov (United States)

    Huang, Yuanpeng Janet; Tejero, Roberto; Powers, Robert; Montelione, Gaetano T

    2006-03-15

    This article formulates the multidimensional nuclear Overhauser effect spectroscopy (NOESY) interpretation problem using graph theory and presents a novel, bottom-up, topology-constrained distance network analysis algorithm for NOESY cross peak interpretation using assigned resonances. AutoStructure is a software suite that implements this topology-constrained distance network analysis algorithm and iteratively generates structures using the three-dimensional (3D) protein structure calculation programs XPLOR/CNS or DYANA. The minimum input for AutoStructure includes the amino acid sequence, a list of resonance assignments, and lists of 2D, 3D, and/or 4D-NOESY cross peaks. AutoStructure can also analyze homodimeric proteins when X-filtered NOESY experiments are available. The quality of input data and final 3D structures is evaluated using recall, precision, and F-measure (RPF) scores, a statistical measure of goodness of fit with the input data. AutoStructure has been tested on three protein NMR data sets for which high-quality structures have previously been solved by an expert, and yields comparable high-quality distance constraint lists and 3D protein structures in hours. We also compare several protein structures determined using AutoStructure with corresponding homologous proteins determined with other independent methods. The program has been used in more than two dozen protein structure determinations, several of which have already been published. (c) 2005 Wiley-Liss, Inc.
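
    The recall, precision, and F-measure mentioned above follow the standard information-retrieval definitions. The sketch below is only an illustration of those definitions, assuming that observed and back-calculated peaks can be represented as sets of hypothetical identifiers; the function name and the toy peak labels are assumptions, and AutoStructure's actual RPF computation over NOESY peak lists is more involved and is not reproduced here.

```python
def rpf_scores(observed, predicted):
    """Return (recall, precision, f_measure) for two collections of peak identifiers.

    recall    = fraction of observed peaks that are also predicted
    precision = fraction of predicted peaks that are also observed
    f_measure = harmonic mean of recall and precision
    """
    observed, predicted = set(observed), set(predicted)
    matched = len(observed & predicted)
    recall = matched / len(observed) if observed else 0.0
    precision = matched / len(predicted) if predicted else 0.0
    if recall + precision == 0:
        return recall, precision, 0.0
    f_measure = 2 * recall * precision / (recall + precision)
    return recall, precision, f_measure


# Toy data (hypothetical peak labels, not taken from any real spectrum).
observed_peaks = {"p1", "p2", "p3", "p4"}
predicted_peaks = {"p2", "p3", "p4", "p5", "p6"}
recall, precision, f_measure = rpf_scores(observed_peaks, predicted_peaks)
print(recall, precision, round(f_measure, 3))
```

    With three of four observed peaks explained and three of five predicted peaks observed, recall is 0.75 and precision is 0.6; the F-measure combines the two so that neither can be inflated at the expense of the other.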

  8. The STRING database in 2017: quality-controlled protein-protein association networks, made broadly accessible.

    Science.gov (United States)

    Szklarczyk, Damian; Morris, John H; Cook, Helen; Kuhn, Michael; Wyder, Stefan; Simonovic, Milan; Santos, Alberto; Doncheva, Nadezhda T; Roth, Alexander; Bork, Peer; Jensen, Lars J; von Mering, Christian

    2017-01-04

    A system-wide understanding of cellular function requires knowledge of all functional interactions between the expressed proteins. The STRING database aims to collect and integrate this information, by consolidating known and predicted protein-protein association data for a large number of organisms. The associations in STRING include direct (physical) interactions, as well as indirect (functional) interactions, as long as both are specific and biologically meaningful. Apart from collecting and reassessing available experimental data on protein-protein interactions, and importing known pathways and protein complexes from curated databases, interaction predictions are derived from the following sources: (i) systematic co-expression analysis, (ii) detection of shared selective signals across genomes, (iii) automated text-mining of the scientific literature and (iv) computational transfer of interaction knowledge between organisms based on gene orthology. In the latest version 10.5 of STRING, the biggest changes are concerned with data dissemination: the web frontend has been completely redesigned to reduce dependency on outdated browser technologies, and the database can now also be queried from inside the popular Cytoscape software framework. Further improvements include automated background analysis of user inputs for functional enrichments, and streamlined download options. The STRING resource is available online, at http://string-db.org/. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

  9. Isotachophoresis of proteins in a networked microfluidic chip: experiment and 2-D simulation.

    Science.gov (United States)

    Cui, Huanchun; Dutta, Prashanta; Ivory, Cornelius F

    2007-04-01

    This paper reports both the experimental application and 2-D simulation of ITP of proteins in a networked microfluidic chip. Experiments demonstrate that a mixture of three fluorescent proteins can be concentrated and stacked into adjacent zones of pure protein under a constant voltage of 100 V over a 2 cm long microchannel. Measurements of the isotachophoretic velocity of the moving zones demonstrate that, during ITP under a constant voltage, the zone velocity decreases as more of the channel is occupied by the terminating electrolyte. A 2-D ITP model based on the Nernst-Planck equations illustrates the stacking and separation features of ITP using simulations of three virtual proteins. The self-sharpening behavior of ITP zones dispersed by a T-junction is clearly demonstrated both by experiment and by simulation. Comparison of 2-D simulations of ITP and zone electrophoresis (ZE) confirms that ZE lacks the ability to resharpen protein zones after they pass through a T-junction.

  10. Transport Vesicle Tethering at the Trans Golgi Network: Coiled Coil Proteins in Action.

    Science.gov (United States)

    Cheung, Pak-Yan P; Pfeffer, Suzanne R

    2016-01-01

    The Golgi complex is decorated with so-called Golgin proteins that share a common feature: a large proportion of their amino acid sequences are predicted to form coiled-coil structures. The possible presence of extensive coiled coils implies that these proteins are highly elongated molecules that can extend a significant distance from the Golgi surface. This property would help them to capture or trap inbound transport vesicles and to tether Golgi mini-stacks together. This review will summarize our current understanding of coiled coil tethers that are needed for the receipt of transport vesicles at the trans Golgi network (TGN). How do long tethering proteins actually catch vesicles? Golgi-associated, coiled coil tethers contain numerous binding sites for small GTPases, SNARE proteins, and vesicle coat proteins. How are these interactions coordinated and are any or all of them important for the tethering process? Progress toward understanding these questions and remaining, unresolved mysteries will be discussed.

  11. Generating functional analysis of complex formation and dissociation in large protein interaction networks

    International Nuclear Information System (INIS)

    Coolen, A C C; Rabello, S

    2009-01-01

    We analyze large systems of interacting proteins, using techniques from the non-equilibrium statistical mechanics of disordered many-particle systems. Apart from protein production and removal, the most relevant microscopic processes in the proteome are complex formation and dissociation, and the microscopic degrees of freedom are the evolving concentrations of unbound proteins (in multiple post-translational states) and of protein complexes. Here we only include dimer-complexes, for mathematical simplicity, and we draw the network that describes which proteins are reaction partners from an ensemble of random graphs with an arbitrary degree distribution. We show how generating functional analysis methods can be used successfully to derive closed equations for dynamical order parameters, representing an exact macroscopic description of the complex formation and dissociation dynamics in the infinite system limit. We end this paper with a discussion of the possible routes towards solving the nontrivial order parameter equations, either exactly (in specific limits) or approximately.

  12. Maximum flow approach to prioritize potential drug targets of Mycobacterium tuberculosis H37Rv from protein-protein interaction network.

    Science.gov (United States)

    Melak, Tilahun; Gakkhar, Sunita

    2015-12-01

    In spite of the implementation of several strategies, tuberculosis (TB) remains a serious global public health problem, causing millions of infections and deaths every year. This is mainly due to the emergence of drug-resistant varieties of TB. The current treatment strategies for drug-resistant TB are of longer duration, are more expensive, and have side effects. This highlights the importance of identifying and prioritizing targets for new drugs. This study has been carried out to prioritize potential drug targets of Mycobacterium tuberculosis H37Rv based on their flow to resistance genes. The weighted proteome interaction network of the pathogen was constructed using a dataset from the STRING database. Only the subset of the dataset with interactions that have a combined score value ≥770 was considered. A maximum flow approach has been used to prioritize potential drug targets. The potential drug targets were obtained through comparative genome and network centrality analysis. The curated set of resistance genes was retrieved from the literature. A detailed literature review and additional assessment of the method were also carried out for validation. A list of 537 proteins which are essential to the pathogen and non-homologous with human was obtained from the comparative genome analysis. Through network centrality measures, 131 of them were found within the close neighborhood of the centre of gravity of the proteome network. These proteins were further prioritized based on their maximum flow value to resistance genes and they are proposed as reliable drug targets of the pathogen. Proteins which interact with the host were also identified in order to understand the infection mechanism. Potential drug targets of Mycobacterium tuberculosis H37Rv were successfully prioritized based on their flow to resistance genes of existing drugs which is believed to increase the druggability of the targets since inhibition of a protein that has a maximum flow to
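
    The score filtering and maximum flow steps described in this record can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes that the STRING combined score of each retained interaction is used directly as an undirected edge capacity, it uses the textbook Edmonds-Karp algorithm, and the node names ("T" for a candidate target, "R" for a resistance gene product, "A", "B", "C" for intermediates) and scores are hypothetical.

```python
from collections import defaultdict, deque


def build_capacity(edges, min_score=770):
    """Keep interactions with score >= min_score; each undirected edge becomes two arcs."""
    capacity = defaultdict(dict)
    for u, v, score in edges:
        if score >= min_score:
            capacity[u][v] = score
            capacity[v][u] = score
    return capacity


def max_flow(capacity, source, sink):
    """Edmonds-Karp: repeatedly augment along shortest paths found by BFS."""
    if source == sink:
        raise ValueError("source and sink must differ")
    residual = {u: dict(nbrs) for u, nbrs in capacity.items()}
    flow = 0
    while True:
        # Breadth-first search for an augmenting path in the residual graph.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, cap in residual.get(u, {}).items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return flow
        # Find the bottleneck capacity along the path.
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        # Push flow along the path and update reverse arcs.
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] = residual[v].get(u, 0) + bottleneck
            v = u
        flow += bottleneck


# Hypothetical toy interactions: (protein, protein, combined score).
toy_edges = [
    ("T", "A", 900), ("T", "B", 800), ("A", "R", 850), ("B", "R", 780),
    ("A", "B", 700),  # dropped by the >= 770 filter
    ("T", "C", 990), ("C", "R", 600),  # second edge dropped by the filter
]
network = build_capacity(toy_edges)
flow_to_resistance = max_flow(network, "T", "R")
print(flow_to_resistance)
```

    In this toy network the flow from "T" to "R" is limited by the two edges entering "R" (850 + 780), so candidates could be ranked by such values; how the study combines flows to several resistance genes is not stated in the truncated abstract and is left open here.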

  13. Structural and thermodynamic studies of the tobacco calmodulin-like rgs-CaM protein.

    Science.gov (United States)

    Makiyama, Rodrigo K; Fernandes, Carlos A H; Dreyer, Thiago R; Moda, Bruno S; Matioli, Fabio F; Fontes, Marcos R M; Maia, Ivan G

    2016-11-01

    The tobacco calmodulin-like protein rgs-CaM is involved in host defense against virus and is reported to possess an associated RNA silencing suppressor activity. Rgs-CaM is also believed to act as an antiviral factor by interacting and targeting viral silencing suppressors for autophagic degradation. Despite these functional data, calcium interplay in the modulation of rgs-CaM is still poorly understood. Here we show that rgs-CaM displays a prevalent alpha-helical conformation and possesses three functional Ca2+-binding sites. Using computational modeling and molecular dynamics simulation, we demonstrate that Ca2+ binding to rgs-CaM triggers expansion of its tertiary structure with reorientation of alpha-helices within the EF-hands. This conformational change leads to the exposure of a large negatively charged region that may be implicated in the electrostatic interactions between rgs-CaM and viral suppressors. Moreover, the Kd values obtained for Ca2+ binding to the three functional sites are not within the affinity range of a typical Ca2+ sensor. Copyright © 2016 Elsevier B.V. All rights reserved.

  14. POINeT: protein interactome with sub-network analysis and hub prioritization

    Directory of Open Access Journals (Sweden)

    Lai Jin-Mei

    2009-04-01

    Background: Protein-protein interactions (PPIs) are critical to every aspect of biological processes. Expansion of all PPIs from a set of given queries often results in a complex PPI network lacking spatiotemporal consideration. Moreover, the reliability of available PPI resources, which consist of low- and high-throughput data, for network construction remains a significant challenge. Even though a number of software tools are available to facilitate PPI network analysis, an integrated tool is crucial to alleviate the burden of querying across multiple web servers and software tools. Results: We have constructed an integrated web service, POINeT, to simplify the process of PPI searching, analysis, and visualization. POINeT merges PPI and tissue-specific expression data from multiple resources. The tissue-specific PPIs and the numbers of research papers supporting the PPIs can be filtered with user-adjustable threshold values and are dynamically updated in the viewer. The network constructed in POINeT can be readily analyzed with, for example, the built-in centrality calculation module and an integrated network viewer. Nodes in global networks can also be ranked and filtered using various network analysis formulas, i.e., centralities. To prioritize the sub-network, we developed a ranking filtered method (S3) to uncover potential novel mediators in the midbody network. Several examples are provided to illustrate the functionality of POINeT. The network constructed from four schizophrenia risk markers suggests that EXOC4 might be a novel marker for this disease. Finally, a liver-specific PPI network has been filtered with adult and fetal liver expression profiles. Conclusion: The functionalities provided by POINeT are highly improved compared to the previous version, POINT. POINeT enables the identification and ranking of potential novel genes involved in a sub-network. Combining with tissue-specific gene expression profiles, PPIs specific to

  15. The G protein-coupled receptor heterodimer network (GPCR-HetNet) and its hub components.

    Science.gov (United States)

    Borroto-Escuela, Dasiel O; Brito, Ismel; Romero-Fernandez, Wilber; Di Palma, Michael; Oflijan, Julia; Skieterska, Kamila; Duchou, Jolien; Van Craenenbroeck, Kathleen; Suárez-Boomgaard, Diana; Rivera, Alicia; Guidolin, Diego; Agnati, Luigi F; Fuxe, Kjell

    2014-05-14

    G protein-coupled receptor (GPCR) oligomerization has emerged as a vital characteristic of receptor structure. Substantial experimental evidence supports the existence of GPCR-GPCR interactions in a coordinated and cooperative manner. However, despite the current development of experimental techniques for large-scale detection of GPCR heteromers, in order to understand their connectivity it is necessary to develop novel tools to study the global heteroreceptor networks. To provide insight into the overall topology of the GPCR heteromers and identify key players, a collective interaction network was constructed. Experimental interaction data for each of the individual human GPCR protomers were obtained manually from the STRING and SCOPUS databases. The interaction data were used to build and analyze the network using Cytoscape software. The network was treated as undirected throughout the study. It comprises 156 nodes and 260 edges and has a scale-free topology. Connectivity analysis reveals a significant dominance of intrafamily versus interfamily connections. Most of the receptors within the network are linked to each other by a small number of edges. DRD2, OPRM, ADRB2, AA2AR, AA1R, OPRK, OPRD and GHSR are identified as hubs. In a network representation 10 modules/clusters also appear as a highly interconnected group of nodes. Information on this GPCR network can improve our understanding of molecular integration. GPCR-HetNet has been implemented in Java and is freely available at http://www.iiia.csic.es/~ismel/GPCR-Nets/index.html.
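
    Hub identification in an undirected network is often done by ranking nodes by degree. The sketch below illustrates that idea only: the abstract does not state the hub criterion used for GPCR-HetNet, so the degree threshold is an assumption, and the receptor labels "R1" to "R5" and their edges are hypothetical placeholders rather than real interaction data.

```python
from collections import Counter


def degree_table(edges):
    """Count the degree of each node, treating edges as undirected and ignoring duplicates."""
    degree = Counter()
    seen = set()
    for a, b in edges:
        pair = frozenset((a, b))
        if a == b or pair in seen:
            continue  # skip self-loops and repeated edges such as (b, a)
        seen.add(pair)
        degree[a] += 1
        degree[b] += 1
    return degree


def find_hubs(edges, min_degree):
    """Return nodes with degree >= min_degree, highest degree first."""
    degree = degree_table(edges)
    selected = [node for node, d in degree.items() if d >= min_degree]
    return sorted(selected, key=lambda node: (-degree[node], node))


# Hypothetical toy heteromer network.
toy_edges = [
    ("R1", "R2"), ("R1", "R3"), ("R1", "R4"),
    ("R2", "R3"), ("R3", "R1"),  # ("R3", "R1") repeats ("R1", "R3")
    ("R5", "R4"),
]
degrees = degree_table(toy_edges)
hub_nodes = find_hubs(toy_edges, min_degree=3)
print(dict(degrees), hub_nodes)
```

    Deduplicating with frozenset pairs matters because manually curated interaction lists frequently record the same pair in both orders, which would otherwise inflate degrees.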

  16. IsoBase: a database of functionally related proteins across PPI networks.

    Science.gov (United States)

    Park, Daniel; Singh, Rohit; Baym, Michael; Liao, Chung-Shou; Berger, Bonnie

    2011-01-01

    We describe IsoBase, a database identifying functionally related proteins across five major eukaryotic model organisms: Saccharomyces cerevisiae, Drosophila melanogaster, Caenorhabditis elegans, Mus musculus and Homo sapiens. Nearly all existing algorithms for orthology detection are based on sequence comparison. Although these have been successful in orthology prediction to some extent, we seek to go beyond these methods by the integration of sequence data and protein-protein interaction (PPI) networks to help in identifying true functionally related proteins. With that motivation, we introduce IsoBase, the first publicly available ortholog database that focuses on functionally related proteins. The groupings were computed using the IsoRankN algorithm that uses spectral methods to combine sequence and PPI data and produce clusters of functionally related proteins. These clusters compare favorably with those from existing approaches: proteins within an IsoBase cluster are more likely to share similar Gene Ontology (GO) annotation. A total of 48,120 proteins were clustered into 12,693 functionally related groups. The IsoBase database may be browsed for functionally related proteins across two or more species and may also be queried by accession numbers, species-specific identifiers, gene name or keyword. The database is freely available for download at http://isobase.csail.mit.edu/.

  17. Improving protein disorder prediction by deep bidirectional long short-term memory recurrent neural networks.

    Science.gov (United States)

    Hanson, Jack; Yang, Yuedong; Paliwal, Kuldip; Zhou, Yaoqi

    2017-03-01

    Capturing long-range interactions between structural but not sequence neighbors of proteins is a long-standing challenging problem in bioinformatics. Recently, long short-term memory (LSTM) networks have significantly improved the accuracy of speech and image classification problems by remembering useful past information in long sequential events. Here, we have implemented deep bidirectional LSTM recurrent neural networks in the problem of protein intrinsic disorder prediction. The new method, named SPOT-Disorder, has steadily improved over a similar method using a traditional, window-based neural network (SPINE-D) in all datasets tested without separate training on short and long disordered regions. Independent tests on four other datasets, including the datasets from critical assessment of structure prediction (CASP) techniques and >10 000 annotated proteins from MobiDB, confirmed SPOT-Disorder as one of the best methods in disorder prediction. Moreover, initial studies indicate that the method is more accurate in predicting functional sites in disordered regions. These results highlight the usefulness of combining LSTM with deep bidirectional recurrent neural networks in capturing non-local, long-range interactions for bioinformatics applications. SPOT-Disorder is available as a web server and as a standalone program at: http://sparks-lab.org/server/SPOT-disorder/index.php. Contact: j.hanson@griffith.edu.au or yuedong.yang@griffith.edu.au or yaoqi.zhou@griffith.edu.au. Supplementary data are available at Bioinformatics online.

  18. The Prediction of Key Cytoskeleton Components Involved in Glomerular Diseases Based on a Protein-Protein Interaction Network

    Science.gov (United States)

    Ju, Wenjun; Li, Xuejuan; Li, Shao; Ding, Jie

    2016-01-01

    Maintenance of the physiological morphologies of different types of cells and tissues is essential for the normal functioning of each system in the human body. Dynamic variations in cell and tissue morphologies depend on accurate adjustments of the cytoskeletal system. The cytoskeletal system in the glomerulus plays a key role in the normal process of kidney filtration. To enhance the understanding of the possible roles of the cytoskeleton in glomerular diseases, we constructed the Glomerular Cytoskeleton Network (GCNet), which shows the protein-protein interaction network in the glomerulus, and identified several possible key cytoskeletal components involved in glomerular diseases. In this study, genes/proteins annotated to the cytoskeleton were detected by Gene Ontology analysis, and glomerulus-enriched genes were selected from nine available glomerular expression datasets. Then, the GCNet was generated by combining these two sets of information. To predict the possible key cytoskeleton components in glomerular diseases, we then examined the common regulation of the genes in GCNet in the context of five glomerular diseases based on their transcriptomic data. As a result, twenty-one cytoskeleton components were highlighted as potential candidates because they were consistently down- or up-regulated in all five glomerular diseases. These candidates were then examined in relation to known glomerular diseases and genes to determine their possible functions and interactions. In addition, the mRNA levels of these candidates were validated in a puromycin aminonucleoside (PAN)-induced rat nephropathy model and were also matched with existing Diabetic Nephropathy (DN) transcriptomic data. As a result, 15 of the 21 candidates in the PAN-induced nephropathy model were consistent with our prediction, and 12 of the 21 candidates matched differentially expressed genes in the DN transcriptomic data. By providing a novel interaction network and prediction, GCNet

  19. The Prediction of Key Cytoskeleton Components Involved in Glomerular Diseases Based on a Protein-Protein Interaction Network.

    Science.gov (United States)

    Ding, Fangrui; Tan, Aidi; Ju, Wenjun; Li, Xuejuan; Li, Shao; Ding, Jie

    2016-01-01

    Maintenance of the physiological morphologies of different types of cells and tissues is essential for the normal functioning of each system in the human body. Dynamic variations in cell and tissue morphologies depend on accurate adjustments of the cytoskeletal system. The cytoskeletal system in the glomerulus plays a key role in the normal process of kidney filtration. To enhance the understanding of the possible roles of the cytoskeleton in glomerular diseases, we constructed the Glomerular Cytoskeleton Network (GCNet), which shows the protein-protein interaction network in the glomerulus, and identified several possible key cytoskeletal components involved in glomerular diseases. In this study, genes/proteins annotated to the cytoskeleton were detected by Gene Ontology analysis, and glomerulus-enriched genes were selected from nine available glomerular expression datasets. Then, the GCNet was generated by combining these two sets of information. To predict the possible key cytoskeleton components in glomerular diseases, we then examined the common regulation of the genes in GCNet in the context of five glomerular diseases based on their transcriptomic data. As a result, twenty-one cytoskeleton components were highlighted as potential candidates because they were consistently down- or up-regulated in all five glomerular diseases. These candidates were then examined in relation to known glomerular diseases and genes to determine their possible functions and interactions. In addition, the mRNA levels of these candidates were validated in a puromycin aminonucleoside (PAN)-induced rat nephropathy model and were also matched with existing Diabetic Nephropathy (DN) transcriptomic data. As a result, 15 of the 21 candidates in the PAN-induced nephropathy model were consistent with our prediction, and 12 of the 21 candidates matched differentially expressed genes in the DN transcriptomic data. By providing a novel interaction network and prediction, GCNet

  20. Protein network analysis - A new approach for quantifying wheat dough microstructure.

    Science.gov (United States)

    Bernklau, Isabelle; Lucas, Lars; Jekle, Mario; Becker, Thomas

    2016-11-01

    Clarification of wheat dough functionalities by visualizing the protein microstructure demands a precise image analysis, which is still challenging. Thus, a novel method for quantifying dough microstructure called protein network analysis (PNA) was established in this study. Here, absolute morphological attributes such as junction density, branching rate, end-point rate, and lacunarity quantify and characterize the strength of a network. The method was validated over a large range of microstructural shapes obtained by increasing the bulk water concentration. In addition, the effect of two different magnifications (objectives with different numerical apertures) was studied. Resulting values of the branching rate showed a significant linear decrease (R² = 0.97) by ~40% for both magnifications, indicating a decrease in connectivity and cohesion within the network. Rheological measurements, used as reference methods, confirmed the loss of a network structure with increasing water addition (e.g. G* decreased by 89%). Additionally, significant correlations between both methods validated the innovative image analysis PNA. With this new approach of image analysis, effects of additives, varying dough ingredients or changing process conditions on gluten network - the most structure-relevant component in wheat dough - can be quantitatively identified, and targeted functionalities can be controlled. Copyright © 2016 Elsevier Ltd. All rights reserved.

  1. Increased signaling entropy in cancer requires the scale-free property of protein interaction networks

    Science.gov (United States)

    Teschendorff, Andrew E.; Banerji, Christopher R. S.; Severini, Simone; Kuehn, Reimer; Sollich, Peter

    2015-01-01

    One of the key characteristics of cancer cells is an increased phenotypic plasticity, driven by underlying genetic and epigenetic perturbations. However, at a systems level it is unclear how these perturbations give rise to the observed increased plasticity. Elucidating such systems-level principles is key for an improved understanding of cancer. Recently, it has been shown that signaling entropy, an overall measure of signaling pathway promiscuity that is computable by integrating a sample's gene expression profile with a protein interaction network, correlates with phenotypic plasticity and is increased in cancer compared to normal tissue. Here we develop a computational framework for studying the effects of network perturbations on signaling entropy. We demonstrate that the increased signaling entropy of cancer is driven by two factors: (i) the scale-free (or near scale-free) topology of the interaction network, and (ii) a subtle positive correlation between differential gene expression and node connectivity. Indeed, we show that if protein interaction networks were random graphs, described by Poisson degree distributions, cancer would generally not exhibit an increased signaling entropy. In summary, this work exposes a deep connection between cancer, signaling entropy and interaction network topology. PMID:25919796
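
    One way to make the notion of signaling entropy concrete is a random walk on the interaction network in which the probability of stepping from a protein to a neighbor is proportional to that neighbor's expression level; the Shannon entropy of each node's outgoing distribution then measures how promiscuous signaling from that node is. The sketch below illustrates this local entropy under that assumption. The weighting scheme, the normalization by the logarithm of the degree, and the toy network are assumptions made here for illustration and are not claimed to reproduce the exact definition used in the paper.

```python
import math


def local_signaling_entropy(node, neighbors, expression):
    """Normalized Shannon entropy of the expression-weighted step distribution at `node`.

    Transition probability to neighbor j is expression[j] / sum of neighbor expression.
    Returns a value in [0, 1]; 1 means all neighbors are equally likely.
    """
    nbrs = neighbors[node]
    total = sum(expression[j] for j in nbrs)
    if len(nbrs) < 2 or total <= 0:
        return 0.0  # a single neighbor (or no signal) leaves no uncertainty
    probs = [expression[j] / total for j in nbrs]
    entropy = -sum(p * math.log(p) for p in probs if p > 0)
    return entropy / math.log(len(nbrs))


# Hypothetical hub "H" with four neighbors.
toy_neighbors = {"H": ["A", "B", "C", "D"]}
uniform_expression = {"A": 5.0, "B": 5.0, "C": 5.0, "D": 5.0}
skewed_expression = {"A": 97.0, "B": 1.0, "C": 1.0, "D": 1.0}

entropy_uniform = local_signaling_entropy("H", toy_neighbors, uniform_expression)
entropy_skewed = local_signaling_entropy("H", toy_neighbors, skewed_expression)
print(entropy_uniform, entropy_skewed)
```

    When all neighbors are equally expressed the normalized entropy reaches its maximum, whereas a single dominant neighbor channels most of the signal along one edge and the entropy falls; highly connected nodes contribute most to such differences, which is in line with the role the abstract assigns to scale-free topology.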

  2. Protein functional properties prediction in sparsely-label PPI networks through regularized non-negative matrix factorization.

    Science.gov (United States)

    Wu, Qingyao; Wang, Zhenyu; Li, Chunshan; Ye, Yunming; Li, Yueping; Sun, Ning

    2015-01-01

    Predicting functional properties of proteins in protein-protein interaction (PPI) networks presents a challenging problem and has important implications in computational biology. Collective classification (CC), which utilizes both attribute features and relational information to jointly classify related proteins in PPI networks, has been shown to be a powerful computational method for this problem setting. Enabling CC usually increases accuracy when given a fully-labeled PPI network with a large amount of labeled data. However, such labels can be difficult to obtain in many real-world PPI networks, in which there are usually only a limited number of labeled proteins and a large number of unlabeled proteins. In this case, most of the unlabeled proteins may not be connected to the labeled ones, so the supervision knowledge cannot be obtained effectively from local network connections. As a consequence, learning a CC model in sparsely-labeled PPI networks can lead to poor performance. We investigate a latent graph approach for finding an integration latent graph by exploiting various latent linkages and judiciously integrating the investigated linkages to link (separate) the proteins with similar (different) functions. We develop a regularized non-negative matrix factorization (RNMF) algorithm for CC to make protein functional properties prediction by utilizing various data sources that are available in this problem setting, including attribute features, latent graph, and unlabeled data information. In RNMF, a label matrix factorization term and a network regularization term are incorporated into the non-negative matrix factorization (NMF) objective function to seek a matrix factorization that respects the network structure and label information for classification prediction. Experimental results on KDD Cup tasks predicting the localization and functions of proteins to yeast genes demonstrate the effectiveness of the proposed RNMF method for predicting the protein

  3. A novel member of the split betaalphabeta fold: Solution structure of the hypothetical protein YML108W from Saccharomyces cerevisiae.

    Science.gov (United States)

    Pineda-Lucena, Antonio; Liao, Jack C C; Cort, John R; Yee, Adelinda; Kennedy, Michael A; Edwards, Aled M; Arrowsmith, Cheryl H

    2003-05-01

    As part of the Northeast Structural Genomics Consortium pilot project focused on small eukaryotic proteins and protein domains, we have determined the NMR structure of the protein encoded by ORF YML108W from Saccharomyces cerevisiae. YML108W belongs to one of the numerous structural proteomics targets whose biological function is unknown. Moreover, this protein does not have sequence similarity to any other protein. The NMR structure of YML108W consists of a four-stranded beta-sheet with strand order 2143 and two alpha-helices, with an overall topology of betabetaalphabetabetaalpha. Strand beta1 runs parallel to beta4, and beta2:beta1 and beta4:beta3 pairs are arranged in an antiparallel fashion. Although this fold belongs to the split betaalphabeta family, it appears to be unique among this family; it is a novel arrangement of secondary structure, thereby expanding the universe of protein folds.

  4. A mathematical model for generating bipartite graphs and its application to protein networks

    International Nuclear Information System (INIS)

    Nacher, J C; Ochiai, T; Hayashida, M; Akutsu, T

    2009-01-01

    Complex systems arise in many different contexts from large communication systems and transportation infrastructures to molecular biology. Most of these systems can be organized into networks composed of nodes and interacting edges. Here, we present a theoretical model that constructs bipartite networks with the particular feature that the degree distribution can be tuned depending on the probability rate of fundamental processes. We then use this model to investigate protein-domain networks. A protein can be composed of up to hundreds of domains. Each domain represents a conserved sequence segment with specific functional tasks. We analyze the distribution of domains in Homo sapiens and Arabidopsis thaliana organisms and the statistical analysis shows that while (a) the number of domain types shared by k proteins exhibits a power-law distribution, (b) the number of proteins composed of k types of domains decays as an exponential distribution. The proposed mathematical model generates bipartite graphs and predicts the emergence of this mixing of (a) power-law and (b) exponential distributions. Our theoretical and computational results show that this model requires (1) growth process and (2) copy mechanism.

  5. A mathematical model for generating bipartite graphs and its application to protein networks

    Energy Technology Data Exchange (ETDEWEB)

    Nacher, J C [Department of Complex Systems, Future University-Hakodate (Japan); Ochiai, T [Faculty of Engineering, Toyama Prefectural University (Japan); Hayashida, M; Akutsu, T [Bioinformatics Center, Institute for Chemical Research, Kyoto University (Japan)

    2009-12-04

    Complex systems arise in many different contexts from large communication systems and transportation infrastructures to molecular biology. Most of these systems can be organized into networks composed of nodes and interacting edges. Here, we present a theoretical model that constructs bipartite networks with the particular feature that the degree distribution can be tuned depending on the probability rate of fundamental processes. We then use this model to investigate protein-domain networks. A protein can be composed of up to hundreds of domains. Each domain represents a conserved sequence segment with specific functional tasks. We analyze the distribution of domains in Homo sapiens and Arabidopsis thaliana organisms and the statistical analysis shows that while (a) the number of domain types shared by k proteins exhibits a power-law distribution, (b) the number of proteins composed of k types of domains decays as an exponential distribution. The proposed mathematical model generates bipartite graphs and predicts the emergence of this mixing of (a) power-law and (b) exponential distributions. Our theoretical and computational results show that this model requires (1) growth process and (2) copy mechanism.
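
    The abstract names two ingredients, a growth process and a copy mechanism. The following is a minimal toy sketch of how such a process can be simulated; it is not the authors' model. The parameters `p_copy` and `p_gain`, the function names, and the specific update rules are illustrative assumptions.

```python
import random
from collections import Counter


def grow_bipartite(steps, p_copy=0.7, p_gain=0.2, seed=0):
    """Toy growth-and-copy process (assumed rules, not the published model).

    Returns a list of sets; each set holds the domain types of one protein.
    """
    rng = random.Random(seed)
    proteins = [{0}]          # start with one protein carrying domain type 0
    next_domain = 1
    for _ in range(steps):
        if rng.random() < p_copy:
            # copy step: duplicate the domain set of a random existing protein
            child = set(rng.choice(proteins))
            if rng.random() < p_gain:
                # occasionally gain one domain from another random protein
                donor = rng.choice(proteins)
                child.add(rng.choice(sorted(donor)))
            proteins.append(child)
        else:
            # growth step: a new protein with a brand-new domain type
            proteins.append({next_domain})
            next_domain += 1
    return proteins


def degree_counts(proteins):
    """Return the two histograms discussed in the abstract.

    (a) number of domain types shared by k proteins
    (b) number of proteins composed of k domain types
    """
    domain_degree = Counter(d for p in proteins for d in p)
    domains_shared_by_k = Counter(domain_degree.values())
    proteins_with_k = Counter(len(p) for p in proteins)
    return domains_shared_by_k, proteins_with_k
```

    Calling `degree_counts(grow_bipartite(10000))` yields the two histograms, which can then be compared against power-law and exponential fits. Whether this toy variant reproduces the reported mix of distributions is not claimed here.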

  6. Distinct configurations of protein complexes and biochemical pathways revealed by epistatic interaction network motifs

    LENUS (Irish Health Repository)

    Casey, Fergal

    2011-08-22

    Background: Gene and protein interactions are commonly represented as networks, with the genes or proteins comprising the nodes and the relationship between them as edges. Motifs, or small local configurations of edges and nodes that arise repeatedly, can be used to simplify the interpretation of networks. Results: We examined triplet motifs in a network of quantitative epistatic genetic relationships, and found a non-random distribution of particular motif classes. Individual motif classes were found to be associated with different functional properties, suggestive of an underlying biological significance. These associations were apparent not only for motif classes, but for individual positions within the motifs. As expected, NNN (all negative) motifs were strongly associated with previously reported genetic (i.e. synthetic lethal) interactions, while PPP (all positive) motifs were associated with protein complexes. The two other motif classes (NNP: a positive interaction spanned by two negative interactions, and NPP: a negative spanned by two positives) showed very distinct functional associations, with physical interactions dominating for the former but alternative enrichments, typical of biochemical pathways, dominating for the latter. Conclusion: We present a model showing how NNP motifs can be used to recognize supportive relationships between protein complexes, while NPP motifs often identify opposing or regulatory behaviour between a gene and an associated pathway. The ability to use motifs to point toward underlying biological organizational themes is likely to be increasingly important as more extensive epistasis mapping projects in higher organisms begin.
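
    The four triplet classes named in the abstract (NNN, NNP, NPP, PPP) can be counted as in the minimal sketch below. It assumes the quantitative epistasis scores have already been reduced to a sign per gene pair ('N' or 'P'); that thresholding step, the function name, and the gene labels in the usage note are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def classify_triplets(signed_edges):
    """Count closed triplets by sign pattern.

    signed_edges maps frozenset({gene_a, gene_b}) to 'N' (negative)
    or 'P' (positive). Only triplets with all three edges present count.
    """
    nodes = sorted({gene for edge in signed_edges for gene in edge})
    counts = {"NNN": 0, "NNP": 0, "NPP": 0, "PPP": 0}
    for a, b, c in combinations(nodes, 3):
        pairs = (frozenset((a, b)), frozenset((a, c)), frozenset((b, c)))
        if all(p in signed_edges for p in pairs):
            # sorting the three signs gives one of NNN, NNP, NPP, PPP
            label = "".join(sorted(signed_edges[p] for p in pairs))
            counts[label] += 1
    return counts
```

    For example, three genes joined by negative edges plus a fourth gene linked positively to two of them gives one NNN and one NPP triplet. This brute-force enumeration is cubic in the number of genes, so it only suits small networks.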

  7. Perturbed human sub-networks by Fusobacterium nucleatum candidate virulence proteins.

    Science.gov (United States)

    Zanzoni, Andreas; Spinelli, Lionel; Braham, Shérazade; Brun, Christine

    2017-08-10

    Fusobacterium nucleatum is a gram-negative anaerobic species residing in the oral cavity and implicated in several inflammatory processes in the human body. Although F. nucleatum abundance is increased in inflammatory bowel disease subjects and is prevalent in colorectal cancer patients, the causal role of the bacterium in gastrointestinal disorders and the mechanistic details of host cell functions subversion are not fully understood. We devised a computational strategy to identify putative secreted F. nucleatum proteins (FusoSecretome) and to infer their interactions with human proteins based on the presence of host molecular mimicry elements. FusoSecretome proteins share similar features with known bacterial virulence factors thereby highlighting their pathogenic potential. We show that they interact with human proteins that participate in infection-related cellular processes and localize in established cellular districts of the host-pathogen interface. Our network-based analysis identified 31 functional modules in the human interactome preferentially targeted by 138 FusoSecretome proteins, among which we selected 26 as main candidate virulence proteins, representing both putative and known virulence proteins. Finally, six of the preferentially targeted functional modules are implicated in the onset and progression of inflammatory bowel diseases and colorectal cancer. Overall, our computational analysis identified candidate virulence proteins potentially involved in the F. nucleatum-human cross-talk in the context of gastrointestinal diseases.

  8. Creating and analyzing pathway and protein interaction compendia for modelling signal transduction networks

    Directory of Open Access Journals (Sweden)

    Kirouac Daniel C

    2012-05-01

    Background: Understanding the information-processing capabilities of signal transduction networks, how those networks are disrupted in disease, and rationally designing therapies to manipulate diseased states require systematic and accurate reconstruction of network topology. Data on networks central to human physiology, such as the inflammatory signalling networks analyzed here, are found in a multiplicity of on-line resources of pathway and interactome databases (Cancer CellMap, GeneGo, KEGG, NCI-Pathway Interactome Database (NCI-PID), PANTHER, Reactome, I2D, and STRING). We sought to determine whether these databases contain overlapping information and whether they can be used to construct high reliability prior knowledge networks for subsequent modeling of experimental data. Results: We have assembled an ensemble network from multiple on-line sources representing a significant portion of all machine-readable and reconcilable human knowledge on proteins and protein interactions involved in inflammation. This ensemble network has many features expected of complex signalling networks assembled from high-throughput data: a power law distribution of both node degree and edge annotations, and topological features of a “bow tie” architecture in which diverse pathways converge on a highly conserved set of enzymatic cascades focused around PI3K/AKT, MAPK/ERK, JAK/STAT, NFκB, and apoptotic signaling. Individual pathways exhibit “fuzzy” modularity that is statistically significant but still involving a majority of “cross-talk” interactions. However, we find that the most widely used pathway databases are highly inconsistent with respect to the actual constituents and interactions in this network. Using a set of growth factor signalling networks as examples (epidermal growth factor, transforming growth factor-beta, tumor necrosis factor, and wingless), we find a multiplicity of network topologies in which receptors couple to downstream

  9. Functional equivalency inferred from "authoritative sources" in networks of homologous proteins.

    Science.gov (United States)

    Natarajan, Shreedhar; Jakobsson, Eric

    2009-06-12

    A one-on-one mapping of protein functionality across different species is a critical component of comparative analysis. This paper presents a heuristic algorithm for discovering the Most Likely Functional Counterparts (MoLFunCs) of a protein, based on simple concepts from network theory. A key feature of our algorithm is utilization of the user's knowledge to assign high confidence to selected functional identification. We show use of the algorithm to retrieve functional equivalents for 7 membrane proteins, from an exploration of almost 40 genomes from multiple online resources. We verify the functional equivalency of our dataset through a series of tests that include sequence, structure and function comparisons. Comparison is made to the OMA methodology, which also identifies one-on-one mapping between proteins from different species. Based on that comparison, we believe that incorporation of the user's knowledge as a key aspect of the technique adds value to purely statistical formal methods.

  10. Predicting Essential Genes and Proteins Based on Machine Learning and Network Topological Features: A Comprehensive Review

    Science.gov (United States)

    Zhang, Xue; Acencio, Marcio Luis; Lemke, Ney

    2016-01-01

    Essential proteins/genes are indispensable to the survival or reproduction of an organism, and the deletion of such essential proteins will result in lethality or infertility. The identification of essential genes is very important not only for understanding the minimal requirements for survival of an organism, but also for finding human disease genes and new drug targets. Experimental methods for identifying essential genes are costly, time-consuming, and laborious. With the accumulation of sequenced genomes data and high-throughput experimental data, many computational methods for identifying essential proteins are proposed, which are useful complements to experimental methods. In this review, we show the state-of-the-art methods for identifying essential genes and proteins based on machine learning and network topological features, point out the progress and limitations of current methods, and discuss the challenges and directions for further research. PMID:27014079

  11. The membrane stress response buffers lethal effects of lipid disequilibrium by reprogramming the protein homeostasis network.

    Science.gov (United States)

    Thibault, Guillaume; Shui, Guanghou; Kim, Woong; McAlister, Graeme C; Ismail, Nurzian; Gygi, Steven P; Wenk, Markus R; Ng, Davis T W

    2012-10-12

    Lipid composition can differ widely among organelles and even between leaflets of a membrane. Lipid homeostasis is critical because disequilibrium can have disease outcomes. Despite their importance, mechanisms maintaining lipid homeostasis remain poorly understood. Here, we establish a model system to study the global effects of lipid imbalance. Quantitative lipid profiling was integral to monitor changes to lipid composition and for system validation. Applying global transcriptional and proteomic analyses, a dramatically altered biochemical landscape was revealed from adaptive cells. The resulting composite regulation we term the "membrane stress response" (MSR) confers compensation, not through restoration of lipid composition, but by remodeling the protein homeostasis network. To validate its physiological significance, we analyzed the unfolded protein response (UPR), one facet of the MSR and a key regulator of protein homeostasis. We demonstrate that the UPR maintains protein biogenesis, quality control, and membrane integrity, functions otherwise lethally compromised in lipid dysregulated cells. Copyright © 2012 Elsevier Inc. All rights reserved.

  12. Identification of phosphorylation sites in protein kinase A substrates using artificial neural networks and mass spectrometry

    DEFF Research Database (Denmark)

    Hjerrild, M.; Stensballe, A.; Rasmussen, T.E.

    2004-01-01

    kinase A (PKA) phosphorylation sites. The neural network was trained with a positive set of 258 experimentally verified PKA phosphorylation sites. The predictions by NetPhosK were validated using four novel PKA substrates: Necdin, RFX5, En-2, and Wee 1. The four proteins were phosphorylated by PKA...... in vitro and 13 PKA phosphorylation sites were identified by mass spectrometry. NetPhosK was 100% sensitive and 41% specific in predicting PKA sites in the four proteins. These results demonstrate the potential of using integrated computational and experimental methods for detailed investigations...

  13. K-core decomposition of a protein domain co-occurrence network reveals lower cancer mutation rates for interior cores.

    Science.gov (United States)

    Emerson, Arnold I; Andrews, Simeon; Ahmed, Ikhlak; Azis, Thasni Ka; Malek, Joel A

    2015-01-01

    Network biology currently focuses primarily on metabolic pathways, gene regulatory, and protein-protein interaction networks. While these approaches have yielded critical information, alternative methods to network analysis will offer new perspectives on biological information. A little explored area is the interactions between domains that can be captured using domain co-occurrence networks (DCN). A DCN can be used to study the function and interaction of proteins by representing protein domains and their co-existence in genes and by mapping cancer mutations to the individual protein domains to identify signals. The domain co-occurrence network was constructed for the human proteome based on PFAM domains in proteins. Highly connected domains in the central cores were identified using the k-core decomposition technique. Here we show that these domains were found to be more evolutionarily conserved than the peripheral domains. The somatic mutations for ovarian, breast and prostate cancer diseases were obtained from the TCGA database. We mapped the somatic mutations to the individual protein domains and the local false discovery rate was used to identify significantly mutated domains in each cancer type. Significantly mutated domains were found to be enriched in cancer disease pathways. However, we found that the inner cores of the DCN did not contain any of the significantly mutated domains. We observed that the inner core protein domains are highly conserved and these domains co-exist in large numbers with other protein domains. Mutations and domain co-occurrence networks provide a framework for understanding hierarchical designs in protein function from a network perspective. This study provides evidence that a majority of protein domains in the inner core of the DCN have a lower mutation frequency and that protein domains present in the peripheral regions of the k-core contribute more heavily to the disease. These findings may contribute further to drug development.
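
    As a rough illustration of the k-core decomposition step mentioned above, the sketch below builds a domain co-occurrence graph from a protein-to-domain mapping and assigns each domain its core number by repeatedly peeling off the lowest-degree node. The function names and the input format are assumptions made for this example; the study's actual pipeline is not described at this level of detail.

```python
from itertools import combinations


def cooccurrence_graph(protein_domains):
    """Link two domains whenever they occur together in the same protein.

    protein_domains maps a protein identifier to an iterable of domain names.
    Returns an adjacency dict: domain -> set of co-occurring domains.
    """
    adj = {}
    for domains in protein_domains.values():
        unique = sorted(set(domains))
        for d in unique:
            adj.setdefault(d, set())
        for a, b in combinations(unique, 2):
            adj[a].add(b)
            adj[b].add(a)
    return adj


def core_numbers(adj):
    """Core number of every node, computed by min-degree peeling."""
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        # remove the node with the smallest remaining degree
        v = min(remaining, key=lambda n: degree[n])
        k = max(k, degree[v])     # core level never decreases while peeling
        core[v] = k
        remaining.remove(v)
        for u in adj[v]:
            if u in remaining:
                degree[u] -= 1
    return core
```

    Domains with the highest core number form the innermost core; lower values correspond to the periphery. This simple version is quadratic in the number of domains, which is adequate for a sketch but not tuned for proteome-scale graphs.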

  14. Characterizing genes with distinct methylation patterns in the context of protein-protein interaction network: application to human brain tissues.

    Science.gov (United States)

    Li, Yongsheng; Xu, Juan; Chen, Hong; Zhao, Zheng; Li, Shengli; Bai, Jing; Wu, Aiwei; Jiang, Chunjie; Wang, Yuan; Su, Bin; Li, Xia

    2013-01-01

    DNA methylation is an essential epigenetic mechanism involved in transcriptional control. However, how genes with different methylation patterns are assembled in the protein-protein interaction network (PPIN) remains a mystery. In the present study, we systematically characterized genes with different methylation patterns in the PPIN. A negative association was detected between the methylation levels in the brain tissues and topological centralities. By focusing on two classes of genes with considerably different methylation levels in the brain tissues, namely the low methylated genes (LMGs) and high methylated genes (HMGs), we found that their organizing principles in the PPIN are distinct. The LMGs tend to be the center of the PPIN, and attacking them causes a more deleterious effect on the network integrity. Furthermore, the LMGs express their functions in a modular pattern and substantial differences in functions are observed between the two types of genes. The LMGs are enriched in the basic biological functions, such as binding activity and regulation of transcription. More importantly, cancer genes, especially recessive cancer genes, essential genes, and aging-related genes were all found more often in the LMGs. Additionally, our analysis showed that intra-class communications are enhanced, whereas inter-class communications are repressed. Finally, a functional complementation was revealed between methylation and miRNA regulation in the human genome. We have elucidated the assembling principles of genes with different methylation levels in the context of the PPIN, providing key insights into the complex epigenetic regulation mechanisms.

  15. Protein backbone and sidechain torsion angles predicted from NMR chemical shifts using artificial neural networks

    Energy Technology Data Exchange (ETDEWEB)

    Shen Yang; Bax, Ad, E-mail: bax@nih.gov [National Institutes of Health, Laboratory of Chemical Physics, National Institute of Diabetes and Digestive and Kidney Diseases (United States)

    2013-07-15

    A new program, TALOS-N, is introduced for predicting protein backbone torsion angles from NMR chemical shifts. The program relies far more extensively on the use of trained artificial neural networks than its predecessor, TALOS+. Validation on an independent set of proteins indicates that backbone torsion angles can be predicted for a larger, ≥90 % fraction of the residues, with an error rate smaller than ca 3.5 %, using an acceptance criterion that is nearly two-fold tighter than that used previously, and a root mean square difference between predicted and crystallographically observed (φ, ψ) torsion angles of ca 12°. TALOS-N also reports sidechain χ1 rotameric states for about 50 % of the residues, and a consistency with reference structures of 89 %. The program includes a neural network trained to identify secondary structure from residue sequence and chemical shifts.

  16. Protein backbone and sidechain torsion angles predicted from NMR chemical shifts using artificial neural networks

    International Nuclear Information System (INIS)

    Shen Yang; Bax, Ad

    2013-01-01

    A new program, TALOS-N, is introduced for predicting protein backbone torsion angles from NMR chemical shifts. The program relies far more extensively on the use of trained artificial neural networks than its predecessor, TALOS+. Validation on an independent set of proteins indicates that backbone torsion angles can be predicted for a larger, ≥90 % fraction of the residues, with an error rate smaller than ca 3.5 %, using an acceptance criterion that is nearly two-fold tighter than that used previously, and a root mean square difference between predicted and crystallographically observed (φ, ψ) torsion angles of ca 12°. TALOS-N also reports sidechain χ1 rotameric states for about 50 % of the residues, and a consistency with reference structures of 89 %. The program includes a neural network trained to identify secondary structure from residue sequence and chemical shifts.

  17. STITCH 2: an interaction network database for small molecules and proteins

    DEFF Research Database (Denmark)

    Kuhn, Michael; Szklarczyk, Damian; Franceschini, Andrea

    2010-01-01

    Over the last years, the publicly available knowledge on interactions between small molecules and proteins has been steadily increasing. To create a network of interactions, STITCH aims to integrate the data dispersed over the literature and various databases of biological pathways, drug......-target relationships and binding affinities. In STITCH 2, the number of relevant interactions is increased by incorporation of BindingDB, PharmGKB and the Comparative Toxicogenomics Database. The resulting network can be explored interactively or used as the basis for large-scale analyses. To facilitate links to other...... chemical databases, we adopt InChIKeys that allow identification of chemicals with a short, checksum-like string. STITCH 2.0 connects proteins from 630 organisms to over 74,000 different chemicals, including 2200 drugs. STITCH can be accessed at http://stitch.embl.de/....

  18. Dissecting spatio-temporal protein networks driving human heart development and related disorders

    DEFF Research Database (Denmark)

    Hansen, Kasper Lage; Mølgård, Kjeld; Greenway, Steven

    2010-01-01

    development, we combined detailed phenotype information from deleterious mutations in 255 genes with high-confidence experimental interactome data, and coupled the results to thorough experimental validation. Hereby, we made the first systematic analysis of spatio-temporal protein networks driving many stages......Aberrant organ development is associated with a wide spectrum of disorders, from schizophrenia to congenital heart disease, but systems-level insight into the underlying processes is very limited. Using heart morphogenesis as general model for dissecting the functional architecture of organ...... of a developing organ identifying several novel signaling modules. Our results show that organ development relies on surprisingly few, extensively recycled, protein modules that integrate into complex higher-order networks. This design allows the formation of a complicated organ using simple building blocks...

  19. Bayesian network model for identification of pathways by integrating protein interaction with genetic interaction data.

    Science.gov (United States)

    Fu, Changhe; Deng, Su; Jin, Guangxu; Wang, Xinxin; Yu, Zu-Guo

    2017-09-21

    Molecular interaction data at the proteomic and genetic levels provide physical and functional insights into a molecular biosystem and are complementary in the construction of pathway structures. Despite advances in inferring biological pathways using genetic interaction data, weaknesses remain in existing models, such as activity pathway networks (APN), when integrating data from the proteomic and genetic levels. It is necessary to develop new methods to infer pathway structure from both types of interaction data. We utilized a probabilistic graphical model to develop a new method that integrates genetic interaction and protein interaction data and infers exquisitely detailed pathway structure. We modeled the pathway network as a Bayesian network and applied this model to infer pathways for the coherent subsets of the global genetic interaction profiles, and the available data set of endoplasmic reticulum genes. The protein interaction data were derived from the BioGRID database. Our method can accurately reconstruct known cellular pathway structures, including the SWR complex, ER-Associated Degradation (ERAD) pathway, N-Glycan biosynthesis pathway, Elongator complex, Retromer complex, and Urmylation pathway. By comparing the N-Glycan biosynthesis pathway and Urmylation pathway identified by our approach with those from APN, we found that our method is able to overcome its weakness (certain edges are inexplicable). According to the underlying protein interaction network, we defined a simple scoring function that only adopts genetic interaction information to avoid the balance difficulty in the APN. Using the effective stochastic simulation algorithm, the performance of our proposed method is significantly high. We developed a new method based on a Bayesian network to infer detailed pathway structures from interaction data at the proteomic and genetic levels. The results indicate that the developed method performs better in predicting signaling pathways than previously

  20. The driving regulators of the connectivity protein network of brain malignancies

    Science.gov (United States)

    Tahmassebi, Amirhessam; Pinker-Domenig, Katja; Wengert, Georg; Lobbes, Marc; Stadlbauer, Andreas; Wildburger, Norelle C.; Romero, Francisco J.; Morales, Diego P.; Castillo, Encarnacion; Garcia, Antonio; Botella, Guillermo; Meyer-Bäse, Anke

    2017-05-01

    An important problem in modern therapeutics at the proteomic level remains to identify therapeutic targets in a plentitude of high-throughput data from experiments relevant to a variety of diseases. This paper presents the application of novel modern control concepts, such as pinning controllability and observability, applied to the glioma cancer stem cells (GSCs) protein graph network with known and novel association to glioblastoma (GBM). The theoretical framework provides us with the minimal number of "driver nodes", which are necessary, and their location to determine the full control over the obtained graph network in order to provide a change in the network's dynamics from an initial state (disease) to a desired state (non-disease). The achieved results will provide biochemists with techniques to identify more metabolic regions and biological pathways for complex diseases, to design and test novel therapeutic solutions.

  1. Annotating gene sets by mining large literature collections with protein networks.

    Science.gov (United States)

    Wang, Sheng; Ma, Jianzhu; Yu, Michael Ku; Zheng, Fan; Huang, Edward W; Han, Jiawei; Peng, Jian; Ideker, Trey

    2018-01-01

    Analysis of patient genomes and transcriptomes routinely recognizes new gene sets associated with human disease. Here we present an integrative natural language processing system which infers common functions for a gene set through automatic mining of the scientific literature with biological networks. This system links genes with associated literature phrases and combines these links with protein interactions in a single heterogeneous network. Multiscale functional annotations are inferred based on network distances between phrases and genes and then visualized as an ontology of biological concepts. To evaluate this system, we predict functions for gene sets representing known pathways and find that our approach achieves substantial improvement over the conventional text-mining baseline method. Moreover, our system discovers novel annotations for gene sets or pathways without previously known functions. Two case studies demonstrate how the system is used in discovery of new cancer-related pathways with ontological annotations.

  2. A network model to correlate conformational change and the impedance spectrum of single proteins

    Science.gov (United States)

    Alfinito, Eleonora; Pennetta, Cecilia; Reggiani, Lino

    2008-02-01

    Integrated nanodevices based on proteins or biomolecules are attracting increasing interest in today's research. In fact, it has been shown that proteins such as azurin and bacteriorhodopsin manifest some electrical properties that are promising for the development of active components of molecular electronic devices. Here we focus on two relevant kinds of protein: bovine rhodopsin, prototype of G-protein-coupled-receptor (GPCR) proteins, and the enzyme acetylcholinesterase (AChE), whose inhibition is one of the most qualified treatments of Alzheimer's disease. Both these proteins exert their function starting with a conformational change of their native structure. Our guess is that such a change should be accompanied with a detectable variation of their electrical properties. To investigate this conjecture, we present an impedance network model of proteins, able to estimate the different impedance spectra associated with the different configurations. The distinct types of conformational change of rhodopsin and AChE agree with their dissimilar electrical responses. In particular, for rhodopsin the model predicts variations of the impedance spectra up to about 30%, while for AChE the same variations are limited to about 10%, which supports the existence of a dynamical equilibrium between its native and complexed states.

  3. Revealing the potential pathogenesis of glioma by utilizing a glioma associated protein-protein interaction network.

    Science.gov (United States)

    Pan, Weiran; Li, Gang; Yang, Xiaoxiao; Miao, Jinming

    2015-04-01

    This study aims to explore the potential mechanism of glioma through bioinformatic approaches. The gene expression profile (GSE4290) of glioma tumor and non-tumor samples was downloaded from Gene Expression Omnibus database. A total of 180 samples were available, including 23 non-tumor and 157 tumor samples. Then the raw data were preprocessed using robust multiarray analysis, and 8,890 differentially expressed genes (DEGs) were identified by using t-test (false discovery rate [...] What's more, for the top 10 sub-networks, Gene Ontology (GO) enrichment analysis (p value [...] tissue-specific genes were calculated (p value = 1.0, 1.0, and 0.00014, respectively) and visualized by the Venn Diagram package in R. About 61% of human tissue-specific genes were DEGs as well. This research shed new light on the pathogenesis of glioma based on DEGs and GAPN, and our findings might provide potential targets for clinical glioma treatment.

  4. GH32 family activity: a topological approach through protein contact networks.

    Science.gov (United States)

    Cimini, Sara; Di Paola, Luisa; Giuliani, Alessandro; Ridolfi, Alessandra; De Gara, Laura

    2016-11-01

    The application of the Protein Contact Networks methodology made it possible to highlight a novel response of the border region between the two domains to substrate binding. Glycoside hydrolases (GH) are enzymes that mainly hydrolyze the glycosidic bond between two carbohydrates or a carbohydrate and a non-carbohydrate moiety. These enzymes are involved in many fundamental and diverse biological processes in plants. We have focused on the GH32 family, including enzymes very similar in both sequence and structure, each having however clear specificities of substrate preferences and kinetic properties. Structural and topological differences among proteins of the GH32 family have been here identified by means of an emerging approach (Protein Contact Network, PCN) based on the formalization of 3D structures as contact networks among amino-acid residues. The PCN approach proved successful in both reconstructing the already known functional domains and in identifying the structural counterpart of the properties of GH32 enzymes, which remain uncertain, like their allosteric character. The main outcome of the study was the discovery of the activation upon binding of the border (cleft) region between the two domains. This reveals the allosteric nature of the enzymatic activity for all the analyzed forms in the GH32 family, a character yet to be highlighted in biochemical studies. Furthermore, we have been able to recognize a topological signature (graph energy) of the different affinity of the enzymes towards small and large substrates.

  5. Features analysis for identification of date and party hubs in protein interaction network of Saccharomyces cerevisiae.

    Science.gov (United States)

    Mirzarezaee, Mitra; Araabi, Babak N; Sadeghi, Mehdi

    2010-12-19

    It has been understood that biological networks have modular organizations which are the sources of their observed complexity. Analysis of networks and motifs has shown that two types of hubs, party hubs and date hubs, are responsible for this complexity. Party hubs are local coordinators because of their high co-expressions with their partners, whereas date hubs display low co-expressions and are assumed to be global connectors. However, there is no mutual agreement on these concepts in related literature with different studies reporting their results on different data sets. We investigated whether there is a relation between the biological features of Saccharomyces cerevisiae's proteins and their roles as non-hubs, intermediately connected, party hubs, and date hubs. We propose a classifier that separates these four classes. We extracted different biological characteristics including amino acid sequences, domain contents, repeated domains, functional categories, biological processes, cellular compartments, disordered regions, and position specific scoring matrix from various sources. Several classifiers are examined and the best feature-sets based on average correct classification rate and correlation coefficients of the results are selected. We show that fusion of five feature-sets including domains, Position Specific Scoring Matrix-400, cellular compartments level one, and composition pairs with two and one gaps provide the best discrimination with an average correct classification rate of 77%. We study a variety of known biological feature-sets of the proteins and show that there is a relation between domains, Position Specific Scoring Matrix-400, cellular compartments level one, composition pairs with two and one gaps of Saccharomyces cerevisiae's proteins, and their roles in the protein interaction network as non-hubs, intermediately connected, party hubs and date hubs. This study also confirms the possibility of predicting non-hubs, party hubs and date hubs

  6. Network reconstruction based on proteomic data and prior knowledge of protein connectivity using graph theory.

    Directory of Open Access Journals (Sweden)

    Vassilis Stavrakas

    Full Text Available Modeling of signal transduction pathways is instrumental for understanding cells' function. Researchers have been modeling signaling pathways in order to accurately represent the signaling events inside cells' biochemical microenvironment in a way meaningful for scientists in a biological field. In this article, we propose a method to interrogate such pathways in order to produce cell-specific signaling models. We integrate available prior knowledge of protein connectivity, in the form of a Prior Knowledge Network (PKN), with phosphoproteomic data to construct predictive models of the protein connectivity of the interrogated cell type. Several computational methodologies focusing on pathways' logic modeling using optimization formulations or machine learning algorithms have been published on this front over the past few years. Here, we introduce a light and fast approach that uses a breadth-first traversal of the graph to identify the shortest pathways and score proteins in the PKN, fitting the dependencies extracted from the experimental design. The pathways are then combined through a heuristic formulation to produce a final topology handling inconsistencies between the PKN and the experimental scenarios. Our results show that the algorithm we developed is efficient and accurate for the construction of medium and large scale signaling networks. We demonstrate the applicability of the proposed approach by interrogating a manually curated interaction graph model of EGF/TNFA stimulation against made-up experimental data. To avoid the possibility of erroneous predictions, we performed a cross-validation analysis. Finally, we validate that the introduced approach generates predictive topologies, comparable to the ILP formulation. Overall, an efficient approach based on graph theory is presented herein to interrogate protein-protein interaction networks and to provide meaningful biological insights.
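
    The breadth-first step described above can be illustrated with a minimal Python sketch. It only shows how a breadth-first traversal recovers one shortest pathway between a stimulus and a measured protein in a directed graph; the scoring and heuristic combination steps of the abstract are not reproduced, and the function name `shortest_path` and the toy network `toy_pkn` are hypothetical and not taken from the article.

```python
from collections import deque


def shortest_path(pkn, source, target):
    """Breadth-first search over a directed graph given as {node: [successors]}.

    Returns one shortest path from source to target as a list of nodes,
    or None when the target cannot be reached.
    """
    parents = {source: None}          # also serves as the visited set
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:   # walk back through the parent links
                path.append(node)
                node = parents[node]
            return path[::-1]
        for successor in pkn.get(node, ()):
            if successor not in parents:
                parents[successor] = node
                queue.append(successor)
    return None


# Hypothetical toy prior knowledge network (illustrative edges only).
toy_pkn = {
    "EGF": ["EGFR"],
    "EGFR": ["RAS", "PI3K"],
    "RAS": ["RAF"],
    "RAF": ["MEK"],
    "MEK": ["ERK"],
    "PI3K": ["AKT"],
    "TNFA": ["TNFR"],
    "TNFR": ["IKK"],
}

print(shortest_path(toy_pkn, "EGF", "ERK"))
print(shortest_path(toy_pkn, "TNFA", "ERK"))
```

    Because breadth-first search visits nodes in order of distance from the source, the first time the target is dequeued the recovered path has the fewest possible edges.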

  7. Resemblance of actin-binding protein/actin gels to covalently crosslinked networks

    Science.gov (United States)

    Janmey, Paul A.; Hvidt, Søren; Lamb, Jennifer; Stossel, Thomas P.

    1990-05-01

    The maintenance of the shape of cells is often due to their surface elasticity, which arises mainly from an actin-rich cytoplasmic cortex [1,2]. On locomotion, phagocytosis or fission, however, these cells become partially fluid-like. The finding of proteins that can bind to actin and control the assembly of, or crosslink, actin filaments, and of intracellular messages that regulate the activities of some of these actin-binding proteins, indicates that such 'gel-sol' transformations result from the rearrangement of cortical actin-rich networks [3]. Alternatively, on the basis of a study of the mechanical properties of mixtures of actin filaments and an Acanthamoeba actin-binding protein, α-actinin, it has been proposed that these transformations can be accounted for by rapid exchange of crosslinks between actin filaments [4]: the cortical network would be solid when the deformation rate is greater than the rate of crosslink exchange, but would deform or 'creep' when deformation is slow enough to permit crosslinker molecules to rearrange. Here we report, however, that mixtures of actin filaments and actin-binding protein (ABP), an actin crosslinking protein of many higher eukaryotes, form gels rheologically equivalent to covalently crosslinked networks. These gels do not creep in response to applied stress on a time scale compatible with most cell-surface movements. These findings support a more complex and controlled mechanism underlying the dynamic mechanical properties of cortical cytoplasm, and can explain why cells do not collapse under the constant shear forces that often exist in tissues.

  8. Knowledge base and neural network approach for protein secondary structure prediction.

    Science.gov (United States)

    Patel, Maulika S; Mazumdar, Himanshu S

    2014-11-21

    Protein structure prediction is of great relevance given the abundant genomic and proteomic data generated by the genome sequencing projects. Protein secondary structure prediction is addressed as a sub-task in determining the protein tertiary structure and function. In this paper, a novel algorithm, KB-PROSSP-NN, which is a combination of knowledge base and modeling of the exceptions in the knowledge base using neural networks for protein secondary structure prediction (PSSP), is proposed. The knowledge base is derived from a proteomic sequence-structure database and consists of the statistics of association between the 5-residue words and corresponding secondary structure. The predicted results obtained using the knowledge base are refined with a backpropagation neural network algorithm. The neural net models the exceptions of the knowledge base. Q3 accuracies of 90% and 82% are achieved on the RS126 and CB396 test sets respectively, which suggests an improvement over existing state-of-the-art methods. Copyright © 2014 Elsevier Ltd. All rights reserved.
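
    A knowledge base of 5-residue word statistics can be sketched in a few lines of Python. The sketch below is an illustration under stated assumptions, not the KB-PROSSP-NN implementation: it assumes each word votes for the structure of its central residue, it falls back to coil ("C") for unseen words and chain ends, and it omits the neural-network refinement entirely. The names `build_knowledge_base`, `predict` and `q3`, and the toy training pair, are hypothetical.

```python
from collections import Counter, defaultdict

WORD = 5  # word length stated in the abstract


def build_knowledge_base(examples):
    """Count, for every 5-residue word, the structure label of its central residue.

    examples is an iterable of (sequence, structure) string pairs of equal length.
    Assigning the label of the central residue is an assumption of this sketch.
    """
    kb = defaultdict(Counter)
    for seq, ss in examples:
        for i in range(len(seq) - WORD + 1):
            kb[seq[i:i + WORD]][ss[i + WORD // 2]] += 1
    return kb


def predict(kb, seq, default="C"):
    """Label each residue with the most frequent structure of the word centred on it."""
    half = WORD // 2
    labels = []
    for i in range(len(seq)):
        inside = half <= i < len(seq) - half
        counts = kb.get(seq[i - half:i + half + 1]) if inside else None
        labels.append(counts.most_common(1)[0][0] if counts else default)
    return "".join(labels)


def q3(predicted, observed):
    """Fraction of residues whose three-state label (H, E, C) is predicted correctly."""
    return sum(p == o for p, o in zip(predicted, observed)) / len(observed)


# Hypothetical toy training pair (sequence, secondary structure).
train = [("MAEELLKKG", "CHHHHHHHC")]
kb = build_knowledge_base(train)
pred = predict(kb, "MAEELLKKG")
print(pred, q3(pred, "CHHHHHHHC"))
```

    The two residues at each chain end have no complete centred word, which is why they receive the default label here; a real method would need an explicit rule for them.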

  9. Uncovering the protein lysine and arginine methylation network in Arabidopsis chloroplasts.

    Science.gov (United States)

    Alban, Claude; Tardif, Marianne; Mininno, Morgane; Brugière, Sabine; Gilgen, Annabelle; Ma, Sheng; Mazzoleni, Meryl; Gigarel, Océane; Martin-Laffon, Jacqueline; Ferro, Myriam; Ravanel, Stéphane

    2014-01-01

    Post-translational modification of proteins by the addition of methyl groups to the side chains of Lys and Arg residues is proposed to play important roles in many cellular processes. In plants, identification of non-histone methylproteins at a cellular or subcellular scale is still missing. To gain insights into the extent of this modification in chloroplasts we used a bioinformatics approach to identify protein methyltransferases targeted to plastids and set up a workflow to specifically identify Lys and Arg methylated proteins from proteomic data used to produce the Arabidopsis chloroplast proteome. With this approach we could identify 31 high-confidence Lys and Arg methylation sites from 23 chloroplastic proteins, of which only two were previously known to be methylated. These methylproteins are split between the stroma, thylakoids and envelope sub-compartments. They belong to essential metabolic processes, including photosynthesis, and to the chloroplast biogenesis and maintenance machinery (translation, protein import, division). Also, the in silico identification of nine protein methyltransferases that are known or predicted to be targeted to plastids provided a foundation to build the enzymes/substrates relationships that govern methylation in chloroplasts. Thereby, using in vitro methylation assays with chloroplast stroma as a source of methyltransferases we confirmed the methylation sites of two targets, plastid ribosomal protein L11 and the β-subunit of ATP synthase. Furthermore, a biochemical screening of recombinant chloroplastic protein Lys methyltransferases allowed us to identify the enzymes involved in the modification of these substrates. The present study provides a useful resource to build the methyltransferases/methylproteins network and to elucidate the role of protein methylation in chloroplast biology.

  10. Uncovering the protein lysine and arginine methylation network in Arabidopsis chloroplasts.

    Directory of Open Access Journals (Sweden)

    Claude Alban

    Full Text Available Post-translational modification of proteins by the addition of methyl groups to the side chains of Lys and Arg residues is proposed to play important roles in many cellular processes. In plants, identification of non-histone methylproteins at a cellular or subcellular scale is still missing. To gain insights into the extent of this modification in chloroplasts we used a bioinformatics approach to identify protein methyltransferases targeted to plastids and set up a workflow to specifically identify Lys and Arg methylated proteins from proteomic data used to produce the Arabidopsis chloroplast proteome. With this approach we could identify 31 high-confidence Lys and Arg methylation sites from 23 chloroplastic proteins, of which only two were previously known to be methylated. These methylproteins are split between the stroma, thylakoids and envelope sub-compartments. They belong to essential metabolic processes, including photosynthesis, and to the chloroplast biogenesis and maintenance machinery (translation, protein import, division). Also, the in silico identification of nine protein methyltransferases that are known or predicted to be targeted to plastids provided a foundation to build the enzymes/substrates relationships that govern methylation in chloroplasts. Thereby, using in vitro methylation assays with chloroplast stroma as a source of methyltransferases we confirmed the methylation sites of two targets, plastid ribosomal protein L11 and the β-subunit of ATP synthase. Furthermore, a biochemical screening of recombinant chloroplastic protein Lys methyltransferases allowed us to identify the enzymes involved in the modification of these substrates. The present study provides a useful resource to build the methyltransferases/methylproteins network and to elucidate the role of protein methylation in chloroplast biology.

  11. Signaling by Small GTPases at Cell-Cell junctions: Protein Interactions Building Control and Networks.

    Science.gov (United States)

    Braga, Vania

    2017-09-11

    A number of interesting reports highlight the intricate network of signaling proteins that coordinate formation and maintenance of cell-cell contacts. We have much yet to learn about how the in vitro binding data are translated into protein association inside the cells and whether such interaction modulates the signaling properties of the protein. What emerges from recent studies is the importance of carefully considering small GTPase activation in the context of where its activation occurs, which upstream regulators are involved in the activation/inactivation cycle and the GTPase interacting partners that determine the intracellular niche and extent of signaling. Data discussed here unravel unparalleled cooperation and coordination of functions among GTPases and their regulators in supporting strong adhesion between cells. Copyright © 2017 Cold Spring Harbor Laboratory Press; all rights reserved.

  12. Simplified Swarm Optimization-Based Function Module Detection in Protein–Protein Interaction Networks

    Directory of Open Access Journals (Sweden)

    Xianghan Zheng

    2017-04-01

    Full Text Available Proteomics research has become one of the most important topics in the field of life science and natural science. At present, research on protein–protein interaction networks (PPIN) mainly focuses on detecting protein complexes or function modules. However, existing approaches are either ineffective or incomplete. In this paper, we investigate detection mechanisms of functional modules in PPIN, including open databases, existing detection algorithms, and recent solutions. After that, we describe the proposed approach based on the simplified swarm optimization (SSO) algorithm and the knowledge of Gene Ontology (GO). The proposed solution implements the SSO algorithm for clustering proteins with similar function, and imports biological gene ontology knowledge for further identifying function complexes and improving detection accuracy. Furthermore, we use four different categories of species datasets for experiment: fruitfly, mouse, scere, and human. The testing and analysis results show that the proposed solution is feasible, efficient, and could achieve a higher accuracy of prediction than existing approaches.

  13. Evolutionary Conservation and Emerging Functional Diversity of the Cytosolic Hsp70:J Protein Chaperone Network of Arabidopsis thaliana.

    Science.gov (United States)

    Verma, Amit K; Diwan, Danish; Raut, Sandeep; Dobriyal, Neha; Brown, Rebecca E; Gowda, Vinita; Hines, Justin K; Sahi, Chandan

    2017-06-07

    Heat shock proteins of 70 kDa (Hsp70s) partner with structurally diverse Hsp40s (J proteins), generating distinct chaperone networks in various cellular compartments that perform myriad housekeeping and stress-associated functions in all organisms. Plants, being sessile, need to constantly maintain their cellular proteostasis in response to external environmental cues. In these situations, the Hsp70:J protein machines may play an important role in fine-tuning cellular protein quality control. Although ubiquitous, the functional specificity and complexity of the plant Hsp70:J protein network has not been studied. Here, we analyzed the J protein network in the cytosol of Arabidopsis thaliana and, using yeast genetics, show that the functional specificities of most plant J proteins in fundamental chaperone functions are conserved across long evolutionary timescales. Detailed phylogenetic and functional analysis revealed that increased number, regulatory differences, and neofunctionalization in J proteins together contribute to the emerging functional diversity and complexity in the Hsp70:J protein network in higher plants. Based on the data presented, we propose that higher plants have orchestrated their "chaperome," especially their J protein complement, according to their specialized cellular and physiological stipulations. Copyright © 2017 Verma et al.

  14. Understanding gene essentiality by finely characterizing hubs in the yeast protein interaction network.

    Science.gov (United States)

    Pang, Kaifang; Sheng, Huanye; Ma, Xiaotu

    2010-10-08

    The centrality-lethality rule, i.e., high-degree proteins or hubs tend to be more essential than low-degree proteins in the yeast protein interaction network, reveals that a protein's central position indicates its important function, but whether and why hubs tend to be more essential has been heavily debated. Here, we integrated gene expression and functional module data to classify hubs into four types: non-co-expressed non-co-cluster hubs, non-co-expressed co-cluster hubs, co-expressed non-co-cluster hubs and co-expressed co-cluster hubs. We found that all four hub types are more essential than non-hubs, but they also show different enrichments in essential proteins. Non-co-expressed non-co-cluster hubs play a key role in organizing different modules formed by the other three hub types, but they are less important to the survival of the yeast cell. Among the four hub types, co-expressed co-cluster hubs, which likely correspond to the core components of stable protein complexes, are the most essential. These results demonstrated that our classification of hubs into four types could improve the understanding of gene essentiality. Copyright © 2010 Elsevier Inc. All rights reserved.
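
    The four-way hub classification can be written as a small decision rule. The Python sketch below is only an illustration of the two binary criteria named in the abstract (co-expression with partners and membership of a shared functional module); the function name `classify_hub`, the degree cutoff and the co-expression cutoff are hypothetical placeholders, since the abstract does not state the thresholds the authors used.

```python
def classify_hub(degree, mean_coexpression, shares_cluster,
                 degree_cutoff=10, coexpr_cutoff=0.5):
    """Assign a protein to 'non-hub' or to one of the four hub types.

    degree            -- number of interaction partners
    mean_coexpression -- average expression correlation with the partners
    shares_cluster    -- True if the protein sits in the same functional
                         module as its partners
    Both cutoffs are illustrative assumptions, not values from the study.
    """
    if degree < degree_cutoff:
        return "non-hub"
    expression = ("co-expressed" if mean_coexpression >= coexpr_cutoff
                  else "non-co-expressed")
    cluster = "co-cluster" if shares_cluster else "non-co-cluster"
    return f"{expression} {cluster} hub"


print(classify_hub(25, 0.7, True))
print(classify_hub(12, 0.1, False))
print(classify_hub(3, 0.9, True))
```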

  15. Amino Acid Flux from Metabolic Network Benefits Protein Translation: the Role of Resource Availability.

    Science.gov (United States)

    Hu, Xiao-Pan; Yang, Yi; Ma, Bin-Guang

    2015-06-09

    Protein translation is a central step in gene expression and is affected by many factors such as codon usage bias, mRNA folding energy and tRNA abundance. Despite intensive previous studies, how metabolic amino acid supply correlates with protein translation efficiency remains unknown. In this work, we estimated the amino acid flux from the metabolic network for each protein in Escherichia coli and Saccharomyces cerevisiae by using Flux Balance Analysis. Integrating these estimates with mRNA expression level, protein abundance and ribosome profiling data, we provided a detailed description of the role of amino acid supply in protein translation. Our results showed that amino acid supply positively correlates with translation efficiency and ribosome density. Moreover, with the rank-based regression model, we found that metabolic amino acid supply facilitates ribosome utilization. Based on the fact that the ribosome density change of well-amino-acid-supplied genes is smaller than that of poorly-amino-acid-supplied genes under amino acid starvation, we reached the conclusion that amino acid supply may buffer ribosome density change against amino acid starvation and help maintain a relatively stable translation environment. Our work provided new insights into the connection between metabolic amino acid supply and the protein translation process by revealing a new regulation strategy that is dependent on resource availability.

  16. Prediction of protein function using a deep convolutional neural network ensemble

    Directory of Open Access Journals (Sweden)

    Evangelia I. Zacharaki

    2017-07-01

    Full Text Available Background: The availability of large databases containing high resolution three-dimensional (3D) models of proteins in conjunction with functional annotation allows the exploitation of advanced supervised machine learning techniques for automatic protein function prediction. Methods: In this work, novel shape features are extracted representing protein structure in the form of local (per amino acid) distribution of angles and amino acid distances, respectively. Each of the multi-channel feature maps is introduced into a deep convolutional neural network (CNN) for function prediction and the outputs are fused through support vector machines or a correlation-based k-nearest neighbor classifier. Two different architectures are investigated employing either one CNN per multi-channel feature set, or one CNN per image channel. Results: Cross validation experiments on single-functional enzymes (n = 44,661) from the PDB database achieved 90.1% correct classification, demonstrating an improvement over previous results on the same dataset when sequence similarity was not considered. Discussion: The automatic prediction of protein function can provide quick annotations on extensive datasets opening the path for relevant applications, such as pharmacological target identification. The proposed method shows promise for structure-based protein function prediction, but sufficient data may not yet be available to properly assess the method’s performance on non-homologous proteins and thus reduce the confounding factor of evolutionary relationships.

  17. Atomic interaction networks in the core of protein domains and their native folds.

    Science.gov (United States)

    Soundararajan, Venkataramanan; Raman, Rahul; Raguram, S; Sasisekharan, V; Sasisekharan, Ram

    2010-02-23

    Vastly divergent sequences populate a majority of protein folds. In the quest to identify features that are conserved within protein domains belonging to the same fold, we set out to examine the entire protein universe on a fold-by-fold basis. We report that the atomic interaction network in the solvent-unexposed core of protein domains is fold-conserved, extraordinary sequence divergence notwithstanding. Further, we find that this feature, termed protein core atomic interaction network (or PCAIN), is significantly distinguishable across different folds, thus appearing to be a "signature" of a domain's native fold. As part of this study, we computed the PCAINs for 8698 representative protein domains from families across the 1018 known protein folds to construct our seed database and an automated framework was developed for PCAIN-based characterization of the protein fold universe. A test set of randomly selected domains that are not in the seed database was classified with over 97% accuracy, independent of sequence divergence. As an application of this novel fold signature, a PCAIN-based scoring scheme was developed for comparative (homology-based) structure prediction, with 1-2 Å (mean 1.61 Å) C(alpha) RMSD generally observed between computed structures and reference crystal structures. Our results are consistent across the full spectrum of test domains including those from recent CASP experiments and most notably in the 'twilight' and 'midnight' zones wherein <30% and <10% target-template sequence identity prevails (mean twilight RMSD of 1.69 Å). We further demonstrate the utility of the PCAIN protocol to derive biological insight into protein structure-function relationships, by modeling the structure of the YopM effector novel E3 ligase (NEL) domain from plague-causative bacterium Yersinia pestis and discussing its implications for host adaptive and innate immune modulation by the pathogen. Considering the several high-throughput, sequence

  18. Amino acid code of protein secondary structure.

    Science.gov (United States)

    Shestopalov, B V

    2003-01-01

    The calculation of protein three-dimensional structure from the amino acid sequence is a fundamental problem to be solved. This paper presents principles of the code theory of protein secondary structure, and their consequence--the amino acid code of protein secondary structure. The doublet code model of protein secondary structure, developed earlier by the author (Shestopalov, 1990), is part of this theory. The bases of the theory are: 1) the name secondary structure is assigned to the conformation, stabilized only by the nearest (intraresidual) and middle-range (at a distance no more than that between residues i and i + 5) interactions; 2) the secondary structure consists of regular (alpha-helical and beta-structural) and irregular (coil) segments; 3) the alpha-helices, beta-strands and coil segments are encoded, respectively, by residue pairs (i, i + 4), (i, i + 2), (i, i + 1), according to the numbers of residues per period, 3.6, 2, 1; 4) all such pairs in the amino acid sequence are codons for elementary structural elements, or structurons; 5) the codons are divided into 21 types depending on their strength, i.e. their encoding capability; 6) overlappings of structurons of one and the same structure generate the longer segments of this structure; 7) overlapping of structurons of different structures is forbidden, and therefore selection of codons is required, the codon selection is hierarchic; 8) the code theory of protein secondary structure generates six variants of the amino acid code of protein secondary structure. There are two possible kinds of model construction based on the theory: the physical one using physical properties of amino acid residues, and the statistical one using results of statistical analysis of a great body of structural data. Some evident consequences of the theory are: a) the theory can be used for calculating the secondary structure from the amino acid sequence as a partial solution of the problem of calculation of protein three

  19. An assessment of machine and statistical learning approaches to inferring networks of protein-protein interactions

    Directory of Open Access Journals (Sweden)

    Browne Fiona

    2006-12-01

    Full Text Available Protein-protein interactions (PPI) play a key role in many biological systems. Over the past few years, an explosion in availability of functional biological data obtained from high-throughput technologies to infer PPI has been observed. However, results obtained from such experiments show high rates of false-positive and false-negative predictions as well as systematic predictive bias. Recent research has revealed that several machine and statistical learning methods applied to integrate relatively weak, diverse sources of large-scale functional data may provide improved predictive accuracy and coverage of PPI. In this paper we describe the effects of applying different computational, integrative methods to predict PPI in Saccharomyces cerevisiae. We investigated the predictive ability of combining different sets of relatively strong and weak predictive datasets. We analysed several genomic datasets ranging from mRNA co-expression to marginal essentiality. Moreover, we expanded an existing multi-source dataset from S. cerevisiae by constructing a new set of putative interactions extracted from Gene Ontology (GO)-driven annotations in the Saccharomyces Genome Database. Different classification techniques: Simple Naive Bayesian (SNB), Multilayer Perceptron (MLP) and K-Nearest Neighbors (KNN) were evaluated. Relatively simple classification methods (i.e. less computing intensive and mathematically complex), such as SNB, have been proven to be proficient at predicting PPI. SNB produced the “highest” predictive quality obtaining an area under Receiver Operating Characteristic (ROC) curve (AUC) value of 0.99. The lowest AUC value of 0.90 was obtained by the KNN classifier. This assessment also demonstrates the strong predictive power of GO-driven models, which offered predictive performance above 0.90 using the different machine learning and statistical techniques. As the predictive power of single-source datasets became weaker MLP and SNB performed

  20. Protein structure and neutral theory of evolution.

    Science.gov (United States)

    Ptitsyn, O B; Volkenstein, M V

    1986-08-01

    The neutral theory of evolution is extended to the origin of protein molecules. Arguments are presented which suggest that the amino acid sequences of many globular proteins mainly represent "memorized" random sequences while biological evolution reduces to the "editing" of these random sequences. Physical requirements for a functional globular protein are formulated and it is shown that many of these requirements do not involve strategical selection of amino acid sequences during biological evolution but are inherent also for typical random sequences. In particular, it is shown that random sequences of polar and nonpolar amino acid residues can form alpha-helices and beta-strands with lengths and arrangement along the chain similar to those in real globular proteins. These alpha- and beta-regions in random sequences can form three-dimensional folding patterns also similar to those in proteins. Arguments are presented suggesting that even the tight packing of side groups inside the protein core does not require very strong biological selection of amino acid sequences either. Thus many structural features of real proteins can exist also in random sequences and biological selection is needed mainly for the creation of the active site of a protein and for its stability under physiological conditions.

  1. Using likelihood-free inference to compare evolutionary dynamics of the protein networks of H. pylori and P. falciparum.

    Directory of Open Access Journals (Sweden)

    Oliver Ratmann

    2007-11-01

    Full Text Available Gene duplication with subsequent interaction divergence is one of the primary driving forces in the evolution of genetic systems. Yet little is known about the precise mechanisms and the role of duplication divergence in the evolution of protein networks from the prokaryote and eukaryote domains. We developed a novel, model-based approach for Bayesian inference on biological network data that centres on approximate Bayesian computation, or likelihood-free inference. Instead of computing the intractable likelihood of the protein network topology, our method summarizes key features of the network and, based on these, uses an MCMC algorithm to approximate the posterior distribution of the model parameters. This allowed us to reliably fit a flexible mixture model that captures hallmarks of evolution by gene duplication and subfunctionalization to protein interaction network data of Helicobacter pylori and Plasmodium falciparum. The 80% credible intervals for the duplication-divergence component are [0.64, 0.98] for H. pylori and [0.87, 0.99] for P. falciparum. The remaining parameter estimates are not inconsistent with sequence data. An extensive sensitivity analysis showed that incompleteness of PIN data does not largely affect the analysis of models of protein network evolution, and that the degree sequence alone barely captures the evolutionary footprints of protein networks relative to other statistics. Our likelihood-free inference approach enables a fully Bayesian analysis of a complex and highly stochastic system that is otherwise intractable at present. Modelling the evolutionary history of PIN data, it transpires that only the simultaneous analysis of several global aspects of protein networks enables credible and consistent inference to be made from available datasets. Our results indicate that gene duplication has played a larger part in the network evolution of the eukaryote than in the prokaryote, and suggest that single gene
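
    The idea of likelihood-free inference can be illustrated with a much simpler scheme than the one in the abstract. The Python sketch below uses plain rejection sampling rather than the MCMC variant the authors describe, a bare duplication-divergence growth model rather than their mixture model, and a single summary statistic (mean degree) even though the abstract stresses that several statistics are needed; all function names, the uniform prior and the tolerance are assumptions made for illustration.

```python
import random


def simulate_network(n_nodes, p_keep, rng):
    """Grow a graph by duplication-divergence.

    Each new node copies a randomly chosen existing node and keeps each of
    the parent's edges independently with probability p_keep.
    """
    adj = {0: {1}, 1: {0}}                      # start from a single edge
    while len(adj) < n_nodes:
        parent = rng.choice(list(adj))
        new = len(adj)
        adj[new] = set()
        for neighbour in list(adj[parent]):
            if rng.random() < p_keep:
                adj[new].add(neighbour)
                adj[neighbour].add(new)
    return adj


def mean_degree(adj):
    """Summary statistic used in place of the intractable likelihood."""
    return sum(len(neighbours) for neighbours in adj.values()) / len(adj)


def abc_rejection(observed_stat, n_nodes, n_draws, tolerance, seed=0):
    """Keep prior draws whose simulated summary lies within tolerance of the data."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_draws):
        p_keep = rng.random()                   # uniform prior on [0, 1)
        simulated = mean_degree(simulate_network(n_nodes, p_keep, rng))
        if abs(simulated - observed_stat) <= tolerance:
            accepted.append(p_keep)
    return accepted


posterior_sample = abc_rejection(observed_stat=1.0, n_nodes=30,
                                 n_draws=200, tolerance=0.2)
print(len(posterior_sample))
```

    The accepted values of `p_keep` form an approximate posterior sample; shrinking the tolerance tightens the approximation at the cost of fewer accepted draws.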

  2. Gene promoter evolution targets the center of the human protein interaction network.

    Directory of Open Access Journals (Sweden)

    Jordi Planas

    Full Text Available Assessing the contribution of promoters and coding sequences to gene evolution is an important step toward discovering the major genetic determinants of human evolution. Many specific examples have revealed the evolutionary importance of cis-regulatory regions. However, the relative contribution of regulatory and coding regions to the evolutionary process and whether systemic factors differentially influence their evolution remains unclear. To address these questions, we carried out an analysis at the genome scale to identify signatures of positive selection in human proximal promoters. Next, we examined whether genes with positively selected promoters (Prom+ genes) show systemic differences with respect to a set of genes with positively selected protein-coding regions (Cod+ genes). We found that the number of genes in each set was not significantly different (8.1% and 8.5%, respectively). Furthermore, a functional analysis showed that, in both cases, positive selection affects almost all biological processes and only a few genes of each group are located in enriched categories, indicating that promoters and coding regions are not evolutionarily specialized with respect to gene function. On the other hand, we show that the topology of the human protein network has a different influence on the molecular evolution of proximal promoters and coding regions. Notably, Prom+ genes have an unexpectedly high centrality when compared with a reference distribution (P=0.008, for Eigenvalue centrality). Moreover, the frequency of Prom+ genes increases from the periphery to the center of the protein network (P=0.02, for the logistic regression coefficient). This means that gene centrality does not constrain the evolution of proximal promoters, unlike the case with coding regions, and further indicates that the evolution of proximal promoters is more efficient in the center of the protein network than in the periphery. These results show that proximal promoters

  3. Analyzing the pathways enriched in genes associated with nicotine dependence in the context of human protein-protein interaction network.

    Science.gov (United States)

    Hu, Ying; Fang, Zhonghai; Yang, Yichen; Fan, Ting; Wang, Ju

    2018-03-16

    Nicotine dependence is the primary addictive stage of cigarette smoking. Although many studies have been performed to explore the molecular mechanism underlying nicotine dependence, our understanding of this disorder is still far from complete. Over the past decades, an increasing number of candidate genes involved in nicotine dependence have been identified by different technical approaches, including genetic association analysis. In this study, we performed a comprehensive collection of candidate genes reported to be genetically associated with nicotine dependence. Then, the biochemical pathways enriched in these genes were identified by considering each gene's propensity to be related to nicotine dependence. One of the most widely used pathway enrichment analysis approaches, over-representation analysis, ignores the functional non-equivalence of genes in the candidate gene set and may have low discriminative power in identifying some dysfunctional pathways. To overcome such drawbacks, we constructed a comprehensive human protein-protein interaction network, and then assigned a function weighting score to each candidate gene based on its network topological features. Evaluation indicated the function weighting score scheme was consistent with available evidence. Finally, the function weighting scores of the candidate genes were incorporated into pathway analysis to identify the dysfunctional pathways involved in nicotine dependence, and the interactions between pathways were detected by pathway crosstalk analysis. Compared to a conventional over-representation based pathway analysis tool, the modified method exhibited improved discriminative power and detected some novel pathways potentially underlying nicotine dependence. In summary, we conducted a comprehensive collection of genes associated with nicotine dependence and then detected the biochemical pathways enriched in these genes using a modified pathway enrichment analysis approach with function weighting

  4. What befalls the proteins and water in a living cell when the cell dies?

    Science.gov (United States)

    Ling, Gilbert N; Fu, Ya-zhen

    2005-01-01

    The solvency of solutes of varying molecular size in the intracellular water of freshly-killed Ehrlich carcinoma cells fits the same theoretical curve that describes the solvency of similar solutes in a 36% solution of native bovine hemoglobin--a protein found only in red blood cells and making up 97.3% of the red cell's total intracellular proteins. The merging of the two sets of data confirms the prediction of the AI Hypothesis that key intracellular protein(s) in dying cells undergo(es) a transition from: (1) one in which the polypeptide NHCO groups assume a fully-extended conformation with relatively strong power of polarizing and orienting the bulk-phase water in multilayers; to (2) one in which most of the polypeptide NHCO groups are engaged in alpha-helical and other "introvert" conformations (see below for definition) with much weaker power in polarizing-orienting multilayers of bulk-phase water. This concordance of the two sets of data also shows that what we now call native hemoglobin--supposedly denoting hemoglobin found in its natural state in living red blood cells--, in fact, more closely resembles the water-polarizing, and -orienting intracellular proteins in dead cells. Although in the dead Ehrlich carcinoma cells as well as in the 36% solution of native hemoglobin, much of the protein's polypeptide NHCO groups are engaged in alpha-helical and other "introvert" conformation (Perutz 1969; Weissbluth 1974), both systems produce a weak but nonetheless pervasive and "long-range" water polarization and orientation. It is suggested that in both the dead Ehrlich carcinoma ascites cells and in the 36% native bovine hemoglobin solution, enough polypeptide NHCO groups assume the fully-extended conformation to produce the weak but far-reaching multilayer water polarization and orientation observed.

  5. The heat-shock protein/chaperone network and multiple stress resistance.

    Science.gov (United States)

    Jacob, Pierre; Hirt, Heribert; Bendahmane, Abdelhafid

    2017-04-01

    Crop yield has been greatly enhanced during the last century. However, most elite cultivars are adapted to temperate climates and are not well suited to more stressful conditions. In the context of climate change, stress resistance is a major concern. To overcome these difficulties, scientists may help breeders by providing genetic markers associated with stress resistance. However, multistress resistance cannot be obtained from the simple addition of single stress resistance traits. In the field, stresses are unpredictable and several may occur at once. Consequently, the use of single stress resistance traits is often inadequate. Although it has been historically linked with the heat stress response, the heat-shock protein (HSP)/chaperone network is a major component of multiple stress responses. Among the HSP/chaperone 'client proteins', many are primary metabolism enzymes and signal transduction components with essential roles for the proper functioning of a cell. HSPs/chaperones are controlled by the action of diverse heat-shock factors, which are recruited under stress conditions. In this review, we give an overview of the regulation of the HSP/chaperone network with a focus on Arabidopsis thaliana. We illustrate the role of HSPs/chaperones in regulating diverse signalling pathways and discuss several basic principles that should be considered for engineering multiple stress resistance in crops through the HSP/chaperone network. © 2016 The Authors. Plant Biotechnology Journal published by Society for Experimental Biology and The Association of Applied Biologists and John Wiley & Sons Ltd.

  6. Hub nodes in the network of human Mitogen-Activated Protein Kinase (MAPK) pathways: Characteristics and potential as drug targets

    Directory of Open Access Journals (Sweden)

    V.K. MD Aksam

    2017-01-01

    Proteins involved in the cross-talk between ERK1/2, ERK5, JNK, and P38 signalling pathways integrate the network of Mitogen-Activated Protein Kinase (MAPK) pathways. A graph theory-based approach is used to construct the network of MAPK pathways and to observe the network's organisational principles. The connectivity pattern reveals a rich club among the hubs, enabling structural ordering. A positive correlation between the degree of the nodes and the percentage of essential proteins showed that hubs are central to the network architecture and function. Furthermore, attributes like connectivity, inter/intra-pathway class, position in the pathway, protein type and subcellular localization of the essential and non-essential proteins characterize complex functional roles. Shared properties of 34 cancerous essential proteins lack to be drug targets. We identified seven nodes with the overlapping properties of being hubs, being essential, and causing side effects when targeted. We exploited the strategy of treating cancerous, non-hub and non-essential proteins as potential drug targets and identified 4EBP1, BAD, CHOP10, GADD45, HSP27, MKP1, RNPK, MLTKa/b, cPLA2, eEF2K and eIF4E. We have illustrated the implications of targeting hub nodes and proposed network-based drug targets that would cause fewer side effects.
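
    The two graph measures named in this abstract, node degree and the rich club among hubs, can be sketched as follows. This is a minimal illustration on a hypothetical toy graph with placeholder node names, using the standard unnormalised rich-club coefficient; the abstract does not say which hub threshold or normalisation the authors used.

```python
from collections import defaultdict


def degrees(edges):
    """Degree of every node in an undirected graph given as (a, b) pairs."""
    deg = defaultdict(int)
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    return dict(deg)


def hubs(edges, min_degree):
    """Nodes whose degree is at least min_degree (threshold is an assumption)."""
    return sorted(n for n, d in degrees(edges).items() if d >= min_degree)


def rich_club_coefficient(edges, k):
    """Fraction of possible links realised among nodes with degree > k.

    Assumes the edge list has no self-loops and no duplicate edges.
    """
    deg = degrees(edges)
    rich = {n for n, d in deg.items() if d > k}
    if len(rich) < 2:
        return 0.0
    links = sum(1 for a, b in edges if a in rich and b in rich)
    return 2.0 * links / (len(rich) * (len(rich) - 1))


# Hypothetical toy network: a triangle of hubs, each with one extra neighbour.
toy_edges = [("A", "B"), ("B", "C"), ("A", "C"),
             ("A", "D"), ("B", "E"), ("C", "F")]
toy_hubs = hubs(toy_edges, min_degree=3)
toy_phi = rich_club_coefficient(toy_edges, k=1)
```

    A coefficient close to 1 means the high-degree nodes are almost fully interconnected, which is the pattern the abstract calls a rich club.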

  7. The nucleoporin Nup98 associates with the intranuclear filamentous protein network of TPR

    Science.gov (United States)

    Fontoura, Beatriz M. A.; Dales, Samuel; Blobel, Günter; Zhong, Hualin

    2001-01-01

    The Nup98 gene codes for several alternatively spliced protein precursors. Two in vitro translated and autoproteolytically cleaved precursors yielded heterodimers of Nup98-6kDa peptide and Nup98-Nup96. TPR (translocated promoter region) is a protein that forms filamentous structures extending from nuclear pore complexes (NPCs) to intranuclear sites. We found that in vitro translated TPR bound to in vitro translated Nup98 and, via Nup98, to Nup96. Double-immunofluorescence microscopy with antibodies to TPR and Nup98 showed colocalization. In confocal sections the nucleolus itself was only weakly stained but there was intensive perinucleolar staining. Striking spike-like structures emanated from this perinucleolar ring and attenuated into thinner structures as they extended to the nuclear periphery. This characteristic staining pattern of the TPR network was considerably enhanced when a myc-tagged pyruvate kinase-6kDa fusion protein was overexpressed in HeLa cells. Double-immunoelectron microscopy of these cells using anti-myc and anti-TPR antibodies and secondary gold-coupled antibodies yielded row-like arrangements of gold particles. Taken together, the immunolocalization data support previous electron microscopical data, suggesting that TPR forms filaments that extend from the NPC to the nucleolus. We discuss the possible implications of the association of Nup98 with this intranuclear TPR network for an intranuclear phase of transport. PMID:11248057

  8. SynGAP regulates protein synthesis and homeostatic synaptic plasticity in developing cortical networks.

    Directory of Open Access Journals (Sweden)

    Chih-Chieh Wang

    Disrupting the balance between excitatory and inhibitory neurotransmission in the developing brain has been causally linked with intellectual disability (ID) and autism spectrum disorders (ASD). Excitatory synapse strength is regulated in the central nervous system by controlling the number of postsynaptic α-amino-3-hydroxy-5-methyl-4-isoxazolepropionic acid receptors (AMPARs). De novo genetic mutations of the synaptic GTPase-activating protein (SynGAP) are associated with ID and ASD. SynGAP is enriched at excitatory synapses and genetic suppression of SynGAP increases excitatory synaptic strength. However, exactly how SynGAP acts to maintain synaptic AMPAR content is unclear. We show here that SynGAP limits excitatory synaptic strength, in part, by suppressing protein synthesis in cortical neurons. The data presented here from in vitro, rat and mouse cortical networks, demonstrate that regulation of translation by SynGAP involves ERK, mTOR, and the small GTP-binding protein Rheb. Furthermore, these data show that GluN2B-containing NMDARs and the cognitive kinase CaMKII act upstream of SynGAP and that this signaling cascade is required for proper translation-dependent homeostatic synaptic plasticity of excitatory synapses in developing cortical networks.

  9. Gravitropism and lateral root emergence are dependent on the trans-Golgi network protein TNO1

    Directory of Open Access Journals (Sweden)

    Rahul Roy

    2015-11-01

    The trans-Golgi network (TGN) is a dynamic organelle that functions as a relay station for receiving endocytosed cargo, directing secretory cargo, and trafficking to the vacuole. TGN-LOCALIZED SYP41-INTERACTING PROTEIN (TNO1) is a large, TGN-localized, coiled-coil protein that associates with the membrane fusion protein SYP41, a t-SNARE, and is required for efficient protein trafficking to the vacuole. Here, we show that a tno1 mutant has auxin transport-related defects. Mutant roots have delayed lateral root emergence, decreased gravitropic bending of plant organs and increased sensitivity to the auxin analog 2,4-Dichlorophenoxyacetic acid. Auxin asymmetry at the tips of elongating stage II lateral roots was reduced in the tno1 mutant, suggesting a role for TNO1 in cellular auxin transport during lateral root emergence. During gravistimulation, tno1 roots exhibited delayed auxin transport from the columella to the basal epidermal cells. Endocytosis to the TGN was unaffected in the mutant, indicating that bulk endocytic defects are not responsible for the observed phenotypes. Together these studies demonstrate a role for TNO1 in mediating auxin responses during root development and gravistimulation, potentially through trafficking of auxin transport proteins.

  10. GBNV encoded movement protein (NSm) remodels ER network via C-terminal coiled coil domain

    Energy Technology Data Exchange (ETDEWEB)

    Singh, Pratibha; Savithri, H.S., E-mail: bchss@biochem.iisc.ernet.in

    2015-08-15

    Plant viruses exploit the host machinery for targeting the viral genome–movement protein complex to plasmodesmata (PD). The mechanism by which the non-structural protein m (NSm) of Groundnut bud necrosis virus (GBNV) is targeted to PD was investigated using Agrobacterium mediated transient expression of NSm and its fusion proteins in Nicotiana benthamiana. GFP:NSm formed punctate structures that colocalized with mCherry:plasmodesmata localized protein 1a (PDLP 1a) confirming that GBNV NSm localizes to PD. Unlike in other movement proteins, the C-terminal coiled coil domain of GBNV NSm was shown to be involved in the localization of NSm to PD, as deletion of this domain resulted in the cytoplasmic localization of NSm. Treatment with Brefeldin A demonstrated the role of ER in targeting GFP:NSm to PD. Furthermore, mCherry:NSm co-localized with ER–GFP (endoplasmic reticulum targeting peptide (HDEL peptide) fused with GFP). Co-expression of NSm with ER–GFP showed that the ER-network was transformed into vesicles indicating that NSm interacts with ER and remodels it. Mutations in the conserved hydrophobic region of NSm (residues 130–138) did not abolish the formation of vesicles. Additionally, the conserved prolines at positions 140 and 142 were found to be essential for targeting the vesicles to the cell membrane. Further, systematic deletion of amino acid residues from N- and C-terminus demonstrated that N-terminal 203 amino acids are dispensable for the vesicle formation. On the other hand, the C-terminal coiled coil domain when expressed alone could also form vesicles. These results suggest that GBNV NSm remodels the ER network by forming vesicles via its interaction through the C-terminal coiled coil domain. Interestingly, NSm interacts with NP in vitro and coexpression of these two proteins in planta resulted in the relocalization of NP to PD and this relocalization was abolished when the N-terminal unfolded region of NSm was deleted. Thus, the NSm

  11. Water molecule network and active site flexibility of apo protein tyrosine phosphatase 1B

    DEFF Research Database (Denmark)

    Pedersen, A.K.; Peters, Günther H.J.; Møller, K.B.

    2004-01-01

    Protein tyrosine phosphatase 1B (PTP1B) plays a key role as a negative regulator of insulin and leptin signalling and is therefore considered to be an important molecular target for the treatment of type 2 diabetes and obesity. Detailed structural information about the structure of PTP1B, including the conformation and flexibility of active-site residues as well as the water-molecule network, is a key issue in understanding ligand binding and enzyme kinetics and in structure-based drug design. A 1.95 Angstrom apo PTP1B structure has been obtained, showing four highly coordinated water molecules in the active-site pocket of the enzyme; hence, the active site is highly solvated in the apo state. Three of the water molecules are located at positions that approximately correspond to the positions of the phosphate O atoms of the natural substrate phosphotyrosine and form a similar network of hydrogen bonds. The active...

  12. Predicting dihedral angle probability distributions for protein coil residues from primary sequence using neural networks

    DEFF Research Database (Denmark)

    Helles, Glennie; Fonseca, Rasmus

    2009-01-01

    Predicting the three-dimensional structure of a protein from its amino acid sequence is currently one of the most challenging problems in bioinformatics. The internal structure of helices and sheets is highly recurrent and helps reduce the search space significantly. However, random coil segments... done previously, none have, to our knowledge, presented comparable results for the probability distribution of dihedral angles. Results: In this paper we develop an artificial neural network that uses an input-window of amino acids to predict a dihedral angle probability distribution for the middle residue in the input-window. The trained neural network shows a significant improvement (4-68%) in predicting the most probable bin (covering a 30°×30° area of the dihedral angle space) for all amino acids in the data set compared to first order statistics. An accuracy comparable to that of secondary...
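
    The data layout implied by this abstract can be sketched as follows: each residue is represented by a window of neighbouring amino acids, and the prediction target is one of the 30°×30° bins of dihedral angle space (12 × 12 = 144 bins). This is a minimal illustration only. The function names, the padding character, the odd window size and the row-major bin numbering are assumptions, and the neural network itself is not shown.

```python
def dihedral_bin(phi, psi, width=30):
    """Map (phi, psi) in degrees to a bin index on a grid of width-degree cells.

    Angles are taken in [-180, 180]; +180 wraps around to the same cell as -180.
    Bins are numbered row by row (psi selects the row, phi the column).
    """
    per_axis = 360 // width
    col = int((phi + 180) // width) % per_axis
    row = int((psi + 180) // width) % per_axis
    return row * per_axis + col


def windows(sequence, size, pad="-"):
    """One window per residue, centred on it; ends are padded with `pad`.

    `size` is assumed to be odd so that each window has a middle residue.
    """
    half = size // 2
    padded = pad * half + sequence + pad * half
    return [padded[i:i + size] for i in range(len(sequence))]


# Hypothetical five-residue fragment and a typical helical (phi, psi) pair.
example_windows = windows("ACDEF", 3)
example_bin = dihedral_bin(-60.0, -45.0)
```

    A classifier trained on such pairs would output a probability for each of the 144 bins, and the most probable bin is the quantity the abstract reports accuracy for.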

  13. Network single-walled carbon nanotube biosensors for fast and highly sensitive detection of proteins.

    Science.gov (United States)

    Hu, Pingán; Zhang, Jia; Wen, Zhenzhong; Zhang, Can

    2011-08-19

    Protein detection is a powerful assay in the diagnosis of diseases. A strategy for the development of an ultrahigh sensitivity biosensor based on a network single-walled carbon nanotube (SWNT) field-effect transistor (FET) has been demonstrated. Metallic SWNTs (m-SWNTs) in the network nanotube FET were selectively removed or cut via a carefully controlled procedure of electrical break-down (BD), and left non-conducting m-SWNTs which magnified the Schottky barrier (SB) area. This nanotube FET exhibited ultrahigh sensitivity and fast response to biomolecules. The lowest detection limit of 0.5 pM was achieved by exploiting streptavidin (SA) or a biotin/SA pair as the research model, and BD-treated nanotube biosensors had a 2 × 10^4-fold lower minimum detectable concentration than the device without BD treatment. The response time is in the range of 0.3-3 min.

  14. Application of serum protein fingerprinting coupled with artificial neural network model in diagnosis of hepatocellular carcinoma.

    Science.gov (United States)

    Wang, Jia-xiang; Zhang, Bo; Yu, Jie-kai; Liu, Jian; Yang, Mei-qin; Zheng, Shu

    2005-08-05

    Hepatocellular carcinoma tends to present at a late clinical stage with poor prognosis. Therefore, it is urgent to explore and develop a simple, rapid diagnostic method, which has high sensitivity and specificity for hepatocellular carcinoma at an early stage. In this study, the serum proteins in patients with hepatocellular carcinoma or liver cirrhosis and in normal controls were analysed. Surface-enhanced laser desorption/ionization time-of-flight mass spectrometry (SELDI-TOF-MS) was used to fingerprint serum protein using the protein chip technique and explore the value of the fingerprint, coupled with artificial neural network, to diagnose hepatocellular carcinoma. Of the 106 serum samples obtained, 52 were from patients with hepatocellular carcinoma, 22 from patients with liver cirrhosis and 32 from healthy volunteers. The samples were randomly assigned into a training group (n = 70, 35 patients with hepatocellular carcinoma, 14 with liver cirrhosis, and 21 normal controls) and a testing group (n = 36, 17 patients with hepatocellular carcinoma, 8 with liver cirrhosis, and 11 normal controls). An artificial neural network was trained on data from 70 individuals in the training group to develop an artificial neural network diagnostic model and this model was tested. The 36 sera in the testing group were analysed with blind prediction by using the same flowchart and procedure of data collection. The 36 serum protein spectra were clustered with the preset clustering method and the same mass/charge (M/Z) peak values as those in the training group. Matrix transfer was performed after data were output. Then the data were input into the previously built artificial neural network model to get the prediction value. The M/Z peaks of the samples with more than 2000 M/Z were normalized with biomarker wizard of ProteinChip Software version 3.1 for noise filtering. The first threshold for noise filtering was set at 5, and the second was set at 2. The 10% was the minimum

  15. Purification, crystallization and preliminary X-ray diffraction analysis of the CBS-domain pair from the Methanococcus jannaschii protein MJ0100.

    Science.gov (United States)

    Lucas, María; Kortazar, Danel; Astigarraga, Egoitz; Fernández, José A; Mato, Jose M; Martínez-Chantar, María Luz; Martínez-Cruz, Luis Alfonso

    2008-10-01

    CBS domains are small protein motifs consisting of a three-stranded beta-sheet and two alpha-helices that are present in proteins of all kingdoms of life and in proteins with completely different functions. Several genetic diseases in humans have been associated with mutations in their sequence, which has made them promising targets for rational drug design. The C-terminal domain of the Methanococcus jannaschii protein MJ0100 includes a CBS-domain pair and has been overexpressed, purified and crystallized. Crystals of selenomethionine-substituted (SeMet) protein were also grown. The space group of both the native and SeMet crystals was determined to be orthorhombic P2₁2₁2₁, with unit-cell parameters a = 80.9, b = 119.5, c = 173.3 Å. Preliminary analysis of the X-ray data indicated that there were eight molecules per asymmetric unit in both cases.

  16. Sparse networks of directly coupled, polymorphic, and functional side chains in allosteric proteins.

    Science.gov (United States)

    Soltan Ghoraie, Laleh; Burkowski, Forbes; Zhu, Mu

    2015-03-01

    Recent studies have highlighted the role of coupled side-chain fluctuations alone in the allosteric behavior of proteins. Moreover, examination of X-ray crystallography data has recently revealed new information about the prevalence of alternate side-chain conformations (conformational polymorphism), and attempts have been made to uncover the hidden alternate conformations from X-ray data. Hence, new computational approaches are required that consider the polymorphic nature of the side chains, and incorporate the effects of this phenomenon in the study of information transmission and functional interactions of residues in a molecule. These studies can provide a more accurate understanding of the allosteric behavior. In this article, we first present a novel approach to generate an ensemble of conformations and an efficient computational method to extract direct couplings of side chains in allosteric proteins, and provide sparse network representations of the couplings. We take the side-chain conformational polymorphism into account, and show that by studying the intrinsic dynamics of an inactive structure, we are able to construct a network of functionally crucial residues. Second, we show that the proposed method is capable of providing a magnified view of the coupled and conformationally polymorphic residues. This model reveals couplings between the alternate conformations of a coupled residue pair. To the best of our knowledge, this is the first computational method for extracting networks of side chains' alternate conformations. Such networks help in providing a detailed image of side-chain dynamics in functionally important and conformationally polymorphic sites, such as binding and/or allosteric sites. © 2014 Wiley Periodicals, Inc.

  17. Identification of Top-ranked Proteins within a Directional Protein Interaction Network using the PageRank Algorithm: Applications in Humans and Plants.

    Science.gov (United States)

    Li, Xiu-Qing; Xing, Tim; Du, Donglei

    2016-01-01

    Somatic mutation of signal transduction genes or key nodes of the cellular protein network can cause severe diseases in humans but can sometimes genetically improve plants, likely because growth is determinate in animals but indeterminate in plants. This article reviews protein networks; human protein ranking; the mitogen-activated protein kinase (MAPK) and insulin (phosphoinositide 3-kinase [PI3K]/phosphatase and tensin homolog [PTEN]/protein kinase B [AKT]) signaling pathways; human diseases caused by somatic mutations to the PI3K/PTEN/AKT pathway; use of the MAPK pathway in plant molecular breeding; and protein domain evolution. Casitas B-lineage lymphoma (CBL), PTEN, MAPK1 and PIK3CA are among the top-ranked proteins in directional rankings. Eight proteins (ACVR1, CDC42, RAC1, RAF1, RHOA, TGFBR1, TRAF2, and TRAF6) are ranked in the top 50 key players in both signal emission and signal reception and in interaction with many other proteins. Top-ranked proteins likely have major impacts on the network function. Such proteins are targets for drug discovery, because their mutations are implicated in various cancers and overgrowth syndromes. Appropriately managing food intake may help reduce the growth of tumors or malformation of tissues. The role of the protein kinase C/fatty acid synthase pathway in fat deposition in PTEN/PI3K patients should be investigated. Both the MAPK and insulin signaling pathways exist in plants, and MAPK pathway engineering can improve plant tolerance to biotic and abiotic stresses such as salinity.
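
    As a rough illustration of directional ranking with PageRank, the sketch below runs a plain power iteration on a small directed graph. The node names are placeholders rather than real proteins, the damping factor of 0.85 is the conventional default and not a value taken from the article, and ranking "signal emission" by running PageRank on the reversed graph is an assumption about how a two-directional ranking could be obtained.

```python
def pagerank(edges, damping=0.85, tol=1e-10, max_iter=200):
    """Power-iteration PageRank on a directed graph of (source, target) pairs."""
    nodes = sorted({n for edge in edges for n in edge})
    out_links = {n: [] for n in nodes}
    for src, dst in edges:
        out_links[src].append(dst)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(max_iter):
        # Rank held by nodes without outgoing edges is spread uniformly.
        dangling = sum(rank[v] for v in nodes if not out_links[v])
        new_rank = {v: (1.0 - damping) / n + damping * dangling / n
                    for v in nodes}
        for src in nodes:
            if out_links[src]:
                share = damping * rank[src] / len(out_links[src])
                for dst in out_links[src]:
                    new_rank[dst] += share
        change = sum(abs(new_rank[v] - rank[v]) for v in nodes)
        rank = new_rank
        if change < tol:
            break
    return rank


def reverse(edges):
    """Flip every edge so that sources become targets."""
    return [(dst, src) for src, dst in edges]


# Hypothetical signalling graph: an edge (a, b) means "a signals to b".
toy_edges = [("P1", "P2"), ("P1", "P3"), ("P2", "P3"),
             ("P4", "P3"), ("P3", "P5")]
reception = pagerank(toy_edges)          # high score: receives much signal
emission = pagerank(reverse(toy_edges))  # high score: emits much signal
```

    On the original graph a node scores highly when many well-ranked nodes point to it; on the reversed graph the same procedure favours nodes that sit upstream of many others.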

  18. Phylogeny, Functional Annotation, and Protein Interaction Network Analyses of the Xenopus tropicalis Basic Helix-Loop-Helix Transcription Factors

    Directory of Open Access Journals (Sweden)

    Wuyi Liu

    2013-01-01

    The previous survey identified 70 basic helix-loop-helix (bHLH) proteins, but it was proved to be incomplete, and the functional information and regulatory networks of frog bHLH transcription factors were not fully known. Therefore, we conducted an updated genome-wide survey in the Xenopus tropicalis genome project databases and identified 105 bHLH sequences. Among the retrieved 105 sequences, phylogenetic analyses revealed that 103 bHLH proteins belonged to 43 families or subfamilies with 46, 26, 11, 3, 15, and 4 members in the corresponding supergroups. Next, gene ontology (GO) enrichment analyses showed 65 significant GO annotations of biological processes and molecular functions and KEGG pathways counted in frequency. To explore the functional pathways, regulatory gene networks, and/or related gene groups coding for Xenopus tropicalis bHLH proteins, the identified bHLH genes were put into the databases KOBAS and STRING to get the signaling information of pathways and protein interaction networks according to available public databases and known protein interactions. From the genome annotation and pathway analysis using KOBAS, we identified 16 pathways in the Xenopus tropicalis genome. From the STRING interaction analysis, 68 hub proteins were identified, and many hub proteins created a tight network or a functional module within the protein families.

  19. Hydrogen bond networks determine emergent mechanical and thermodynamic properties across a protein family

    Directory of Open Access Journals (Sweden)

    Dallakyan Sargis

    2008-08-01

    Background: Gram-negative bacteria use periplasmic-binding proteins (bPBP) to transport nutrients through the periplasm. Despite immense diversity within the recognized substrates, all members of the family share a common fold that includes two domains that are separated by a conserved hinge. The hinge allows the protein to cycle between open (apo) and closed (ligated) conformations. Conformational changes within the proteins depend on a complex interplay of mechanical and thermodynamic response, which is manifested as an increase in thermal stability and decrease of flexibility upon ligand binding. Results: We use a distance constraint model (DCM) to quantify the give and take between thermodynamic stability and mechanical flexibility across the bPBP family. Quantitative stability/flexibility relationships (QSFR) are readily evaluated because the DCM links mechanical and thermodynamic properties. We have previously demonstrated that QSFR is moderately conserved across a mesophilic/thermophilic RNase H pair, whereas the observed variance indicated that different enthalpy-entropy mechanisms allow similar mechanical response at their respective melting temperatures. Our predictions of heat capacity and free energy show marked diversity across the bPBP family. While backbone flexibility metrics are mostly conserved, cooperativity correlations (long-range couplings) also demonstrate a considerable amount of variation. Upon ligand removal, heat capacity, melting point, and mechanical rigidity are, as expected, lowered. Nevertheless, significant differences are found in molecular cooperativity correlations that can be explained by the detailed nature of the hydrogen bond network. Conclusion: Non-trivial mechanical and thermodynamic variation across the family is explained by differences within the underlying H-bond networks. The mechanism is simple; variation within the H-bond networks results in altered mechanical linkage properties that directly affect

  20. GIS: a comprehensive source for protein structure similarities.

    Science.gov (United States)

    Guerler, Aysam; Knapp, Ernst-Walter

    2010-07-01

    A web service for analysis of protein structures that are sequentially or non-sequentially similar was generated. Recently, the non-sequential structure alignment algorithm GANGSTA+ was introduced. GANGSTA+ can detect non-sequential structural analogs for proteins stated to possess novel folds. Since GANGSTA+ ignores the polypeptide chain connectivity of secondary structure elements (i.e. alpha-helices and beta-strands), it is able to detect structural similarities also between proteins whose sequences were reshuffled during evolution. GANGSTA+ was applied in an all-against-all comparison on the ASTRAL40 database (SCOP version 1.75), which consists of >10,000 protein domains yielding about 55 × 10^6 possible protein structure alignments. Here, we provide the resulting protein structure alignments as a public web-based service, named GANGSTA+ Internet Services (GIS). We also allow users to browse the ASTRAL40 database of protein structures with GANGSTA+ relative to an externally given protein structure using different constraints to select specific results. GIS allows us to analyze protein structure families according to the SCOP classification scheme. Additionally, users can upload their own protein structures for pairwise protein structure comparison, alignment against all protein structures of the ASTRAL40 database (SCOP version 1.75) or symmetry analysis. GIS is publicly available at http://agknapp.chemie.fu-berlin.de/gplus.

  1. Near perfect protein multi-label classification with deep neural networks.

    Science.gov (United States)

    Szalkai, Balázs; Grolmusz, Vince

    2018-01-01

    Biological sequences can be considered as data items of high-, non-fixed dimensions, corresponding to the length of those sequences. The comparison and the classification of biological sequences in their relations to large databases are important areas of research today. Artificial neural networks (ANNs) have gained a well-deserved popularity among machine learning tools upon their recent successful applications in image- and sound processing and classification problems. ANNs have also been applied for predicting the family or function of a protein, knowing its residue sequence. Here we present two new ANNs with multi-label classification ability, showing impressive accuracy when classifying protein sequences into 698 UniProt families (AUC=99.99%) and 983 Gene Ontology classes (AUC=99.45%). Copyright © 2017 Elsevier Inc. All rights reserved.

  2. Features analysis for identification of date and party hubs in protein interaction network of Saccharomyces Cerevisiae

    Directory of Open Access Journals (Sweden)

    Araabi Babak N

    2010-12-01

    Background: It has been understood that biological networks have modular organizations which are the sources of their observed complexity. Analysis of networks and motifs has shown that two types of hubs, party hubs and date hubs, are responsible for this complexity. Party hubs are local coordinators because of their high co-expressions with their partners, whereas date hubs display low co-expressions and are assumed as global connectors. However, there is no mutual agreement on these concepts in related literature with different studies reporting their results on different data sets. We investigated whether there is a relation between the biological features of Saccharomyces Cerevisiae's proteins and their roles as non-hubs, intermediately connected, party hubs, and date hubs. We propose a classifier that separates these four classes. Results: We extracted different biological characteristics including amino acid sequences, domain contents, repeated domains, functional categories, biological processes, cellular compartments, disordered regions, and position specific scoring matrix from various sources. Several classifiers are examined and the best feature-sets based on average correct classification rate and correlation coefficients of the results are selected. We show that fusion of five feature-sets including domains, Position Specific Scoring Matrix-400, cellular compartments level one, and composition pairs with two and one gaps provide the best discrimination with an average correct classification rate of 77%. Conclusions: We study a variety of known biological feature-sets of the proteins and show that there is a relation between domains, Position Specific Scoring Matrix-400, cellular compartments level one, composition pairs with two and one gaps of Saccharomyces Cerevisiae's proteins, and their roles in the protein interaction network as non-hubs, intermediately connected, party hubs and date hubs. This study also confirms the

  3. Learning sparse models for a dynamic Bayesian network classifier of protein secondary structure

    Directory of Open Access Journals (Sweden)

    Bilmes Jeff

    2011-05-01

    Background: Protein secondary structure prediction provides insight into protein function and is a valuable preliminary step for predicting the 3D structure of a protein. Dynamic Bayesian networks (DBNs) and support vector machines (SVMs) have been shown to provide state-of-the-art performance in secondary structure prediction. As the size of the protein database grows, it becomes feasible to use a richer model in an effort to capture subtle correlations among the amino acids and the predicted labels. In this context, it is beneficial to derive sparse models that discourage over-fitting and provide biological insight. Results: In this paper, we first show that we are able to obtain accurate secondary structure predictions. Our per-residue accuracy on a well established and difficult benchmark (CB513) is 80.3%, which is comparable to the state-of-the-art evaluated on this dataset. We then introduce an algorithm for sparsifying the parameters of a DBN. Using this algorithm, we can automatically remove up to 70-95% of the parameters of a DBN while maintaining the same level of predictive accuracy on the SD576 set. At 90% sparsity, we are able to compute predictions three times faster than a fully dense model evaluated on the SD576 set. We also demonstrate, using simulated data, that the algorithm is able to recover true sparse structures with high accuracy, and using real data, that the sparse model identifies known correlation structure (local and non-local) related to different classes of secondary structure elements. Conclusions: We present a secondary structure prediction method that employs dynamic Bayesian networks and support vector machines. We also introduce an algorithm for sparsifying the parameters of the dynamic Bayesian network. The sparsification approach yields a significant speed-up in generating predictions, and we demonstrate that the amino acid correlations identified by the algorithm correspond to several known features of

  4. Free energy of contact formation in proteins: Efficient computation in the elastic network approximation

    Science.gov (United States)

    Hamacher, Kay

    2011-07-01

    Biomolecular simulations have become a major tool in understanding biomolecules and their complexes. However, one can typically only investigate a few mutants or scenarios due to the severe computational demands of such simulations, leading to a great interest in method development to overcome this restriction. One way to achieve this is to reduce the complexity of the systems by an approximation of the forces acting upon the constituents of the molecule. The harmonic approximation used in elastic network models simplifies the physical complexity to the most reduced dynamics of these molecular systems. The reduced polymer modeled this way is typically comprised of mass points representing coarse-grained versions of, e.g., amino acids. In this work, we show how the computation of free energy contributions of contacts between two residues within the molecule can be reduced to a simple lookup operation in a precomputable matrix. Being able to compute such contributions is of great importance: protein design or molecular evolution changes introduce perturbations to these pair interactions, so we need to understand their impact. Perturbation to the interactions occurs due to randomized and fixated changes (in molecular evolution) or designed modifications of the protein structures (in bioengineering). These perturbations are modifications in the topology and the strength of the interactions modeled by the elastic network models. We apply the new algorithm to (1) the bovine trypsin inhibitor, a well-known enzyme in biomedicine, and show the connection to folding properties and the hydrophobic collapse hypothesis and (2) the serine proteinase inhibitor CI-2 and show the correlation to Φ values to characterize folding importance. Furthermore, we discuss the computational complexity and show empirical results for the average case, sampled over a library of 77 structurally diverse proteins. We found a relative speedup of up to 10 000-fold for large proteins with respect to

  5. Evolutionary rate heterogeneity between multi- and single-interface hubs across human housekeeping and tissue-specific protein interaction network: Insights from proteins' and its partners' properties.

    Science.gov (United States)

    Biswas, Kakali; Acharya, Debarun; Podder, Soumita; Ghosh, Tapash Chandra

    2017-12-02

    Integrating gene expression into protein-protein interaction network (PPIN) leads to the construction of tissue-specific (TS) and housekeeping (HK) sub-networks, with distinctive TS- and HK-hubs. All such hub proteins are divided into multi-interface (MI) hubs and single-interface (SI) hubs, where MI hubs evolve slower than SI hubs. Here we explored the evolutionary rate difference between MI and SI proteins within TS- and HK-PPIN and observed that this difference is present only in TS, but not in HK-class. Next, we explored whether proteins' own properties or its partners' properties are more influential in such evolutionary discrepancy. Statistical analyses revealed that this evolutionary rate correlates negatively with protein's own properties like expression level, miRNA count, conformational diversity and functional properties and with its partners' properties like protein disorder and tissue expression similarity. Moreover, partial correlation and regression analysis revealed that both proteins' and its partners' properties have independent effects on protein evolutionary rate. Copyright © 2017 Elsevier Inc. All rights reserved.

  6. Ab initio and homology based prediction of protein domains by recursive neural networks

    Directory of Open Access Journals (Sweden)

    Mooney Catherine

    2009-06-01

    Background: Proteins, especially larger ones, are often composed of individual evolutionary units, domains, which have their own function and structural fold. Predicting domains is an important intermediate step in protein analyses, including the prediction of protein structures. Results: We describe novel systems for the prediction of protein domain boundaries powered by Recursive Neural Networks. The systems rely on a combination of primary sequence and evolutionary information, predictions of structural features such as secondary structure, solvent accessibility and residue contact maps, and structural templates, both annotated for domains (from the SCOP dataset) and unannotated (from the PDB). We gauge the contribution of contact maps, and PDB and SCOP templates independently and for different ranges of template quality. We find that accurately predicted contact maps are informative for the prediction of domain boundaries, while the same is not true for contact maps predicted ab initio. We also find that gap information from PDB templates is informative, but, not surprisingly, less than SCOP annotations. We test both systems trained on templates of all qualities, and systems trained only on templates of marginal similarity to the query (less than 25% sequence identity). While the first batch of systems produces near perfect predictions in the presence of fair to good templates, the second batch outperforms or matches ab initio predictors down to essentially any level of template quality. We test all systems in 5-fold cross-validation on a large non-redundant set of multi-domain and single domain proteins. The final predictors are state-of-the-art, with a template-less prediction boundary recall of 50.8% (precision 38.7%) within ± 20 residues and a single domain recall of 80.3% (precision 78.1%). The SCOP-based predictors achieve a boundary recall of 74% (precision 77.1%) again within ± 20 residues, and classify single domain proteins as

  7. A Network of Multi-Tasking Proteins at the DNA Replication Fork Preserves Genome Stability.

    Directory of Open Access Journals (Sweden)

    2005-12-01

    To elucidate the network that maintains high fidelity genome replication, we have introduced two conditional mutant alleles of DNA2, an essential DNA replication gene, into each of the approximately 4,700 viable yeast deletion mutants and determined the fitness of the double mutants. Fifty-six DNA2-interacting genes were identified. Clustering analysis of genomic synthetic lethality profiles of each of 43 of the DNA2-interacting genes defines a network (consisting of 322 genes and 876 interactions) whose topology provides clues as to how replication proteins coordinate regulation and repair to protect genome integrity. The results also shed new light on the functions of the query gene DNA2, which, despite many years of study, remain controversial, especially its proposed role in Okazaki fragment processing and the nature of its in vivo substrates. Because of the multifunctional nature of virtually all proteins at the replication fork, the meaning of any single genetic interaction is inherently ambiguous. The multiplexing nature of the current studies, however, combined with follow-up supporting experiments, reveals most if not all of the unique pathways requiring Dna2p. These include not only Okazaki fragment processing and DNA repair but also chromatin dynamics.

  8. An Integrative Analysis of Preeclampsia Based on the Construction of an Extended Composite Network Featuring Protein-Protein Physical Interactions and Transcriptional Relationships.

    Directory of Open Access Journals (Sweden)

    Daniel Vaiman

    Preeclampsia (PE) is a pregnancy disorder defined by hypertension and proteinuria. This disease remains a major cause of maternal and fetal morbidity and mortality. Defective placentation is generally described as being at the root of the disease. The characterization of the transcriptome signature of the preeclamptic placenta has allowed the identification of differentially expressed genes (DEGs). However, we still lack detailed knowledge of how these DEGs impact the function of the placenta. The tools of network biology offer a methodology to explore complex diseases at a systems level. In this study we performed a cross-platform meta-analysis of seven publicly available gene expression datasets comparing non-pathological and preeclamptic placentas. Using the rank product algorithm we identified a total of 369 DEGs consistently modified in PE. The DEGs were used as seeds to build both an extended physical protein-protein interaction network and a transcription factor regulatory network. Topological and clustering analyses were conducted to analyze the connectivity properties of the networks. Finally, both networks were merged into a composite network which presents an integrated view of the regulatory pathways involved in preeclampsia and the crosstalk between them. This network is a useful tool to explore the relationships between the DEGs and enables hypothesis generation for functional experimentation.

  9. An Analysis of Central Residues Between Ligand-Bound and Ligand-Free Protein Structures Based on Network Approach.

    Science.gov (United States)

    Amala, Arumugam; Emerson, Isacc Arnold

    2017-08-01

    Depiction of protein structures as networks of interacting residues has enabled us to understand the structure and function of the protein. Previous investigations on closeness centrality have identified protein functional sites from three-dimensional structures. It is well recognized that ligand binding to a receptor protein induces a wide range of structural changes. An interesting question is how central residues function during the conformational changes triggered by ligand binding. The aim of this study is to determine to what extent central residues change during ligand binding to receptor proteins. To this end, we examined 37 pairs of protein structures consisting of ligand-bound and ligand-free forms. These protein structures were modelled as undirected networks and significant central residues were obtained using residue centrality measures. In addition, the basic network parameters were also analysed. On analysing the residue centrality measures, we observed that 60% of central residues were common to both the ligand-bound and ligand-free states. The geometry of the central residues revealed that they were situated closer to the protein center of mass. Finally, we demonstrated the effectiveness of central residues in amino acid substitutions and in evolution itself. The closeness centrality was also analyzed among different protein domain sizes and the values gradually declined from single-domain to multi-domain proteins, suggesting that the network has potential for hierarchical organization. The betweenness centrality measure was also used to determine the central residues, and 31% of these residues were common between the holo/apo states. Findings reveal that central residues play a significant role in determining the functional properties of proteins. These results have implications in predicting binding/active site residues, specifically in the context of drug design, if additional information concerning ligand binding is
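
    The residue-network analysis described in this record can be illustrated with a minimal Python sketch. It computes closeness centrality by breadth-first search on an undirected contact graph and then measures how many top-ranked residues two states share. The contact lists, the helper names and the top-k selection rule are hypothetical illustrations, not the authors' data or significance criterion.

```python
from collections import deque

def build_contact_graph(contacts):
    """Build an undirected adjacency map from (residue, residue) pairs."""
    adj = {}
    for a, b in contacts:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj

def closeness_centrality(adj):
    """Closeness = (reachable nodes) / (sum of shortest-path lengths)."""
    scores = {}
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # plain BFS, since edges are unweighted
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total = sum(dist.values())
        scores[source] = (len(dist) - 1) / total if total else 0.0
    return scores

def top_central(scores, k):
    """Top-k residues by score (assumed rule; ties broken by residue id)."""
    return set(sorted(scores, key=lambda n: (-scores[n], n))[:k])

# Hypothetical toy contact lists for a ligand-bound (holo) and a
# ligand-free (apo) form of the same five-residue chain.
holo_contacts = [(1, 2), (2, 3), (3, 4), (4, 5), (2, 4)]
apo_contacts = [(1, 2), (2, 3), (3, 4), (4, 5)]

holo_scores = closeness_centrality(build_contact_graph(holo_contacts))
apo_scores = closeness_centrality(build_contact_graph(apo_contacts))

holo_top = top_central(holo_scores, 2)
apo_top = top_central(apo_scores, 2)
shared_fraction = len(holo_top & apo_top) / len(holo_top)
print(holo_top, apo_top, shared_fraction)
```

    In a real analysis the contact lists would come from a distance cutoff applied to PDB coordinates, and the choice of cutoff changes the resulting centralities.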

  10. Uncovering the Molecular Mechanism of Actions between Pharmaceuticals and Proteins on the AD Network.

    Directory of Open Access Journals (Sweden)

    Shujuan Cao

    This study begins by constructing the mini metabolic networks (MMNs) of beta amyloid (Aβ) and acetylcholine (ACh), which stimulate Alzheimer's Disease (AD). We then generate the AD network by incorporating the MMNs of Aβ and ACh and the MMNs of other stimuli of AD. The protein panel contains 49 enzymes/receptors on the AD network that have 3D structures in the PDB. The drug panel consists of 5 AD drugs, 5 AD nutraceutical drugs, and 20 non-AD drugs. All complexes formed by these 30 drugs and 49 proteins are transformed into dyadic arrays. Using the prior knowledge learned from the drug panel, we propose a statistical classification (dry-lab). Based on wet-lab results for the complex of amiloride and insulin degrading enzyme and the complex of amiloride and neutral endopeptidase, we are confident that this dry-lab is reliable. The dry-lab results have many interesting implications. In particular, we show possible reasons why tacrine, donepezil, galantamine and huperzine A cannot improve the level of ACh, contrary to their original design purpose, yet still prevent AD from worsening once Aβ deposition has appeared. On the other hand, we recommend Miglitol and Atenolol as safe and potent drugs to improve the level of ACh before Aβ deposition appears. Moreover, some nutrients such as NADH and Vitamin E should be controlled because they may harm health if used in the wrong way and at the wrong time. The insights shown in this study are valuable for further development.

  11. IntNetDB v1.0: an integrated protein-protein interaction network database generated by a probabilistic model

    Directory of Open Access Journals (Sweden)

    Han Jing-Dong J

    2006-11-01

    Background: Although protein-protein interaction (PPI) networks have been explored by various experimental methods, the maps so built are still limited in coverage and accuracy. To further expand the PPI network and to extract more accurate information from existing maps, studies have been carried out to integrate various types of functional relationship data. A frequently updated database of computationally analyzed potential PPIs to provide biological researchers with rapid and easy access to analyze original data as a biological network is still lacking. Results: By applying a probabilistic model, we integrated 27 heterogeneous genomic, proteomic and functional annotation datasets to predict PPI networks in human. In addition to previously studied data types, we show that phenotypic distances and genetic interactions can also be integrated to predict PPIs. We further built an easy-to-use, updatable integrated PPI database, the Integrated Network Database (IntNetDB) online, to provide automatic prediction and visualization of PPI network among genes of interest. The networks can be visualized in SVG (Scalable Vector Graphics) format for zooming in or out. IntNetDB also provides a tool to extract topologically highly connected network neighborhoods from a specific network for further exploration and research. Using the MCODE (Molecular Complex Detections) algorithm, 190 such neighborhoods were detected among all the predicted interactions. The predicted PPIs can also be mapped to worm, fly and mouse interologs. Conclusion: IntNetDB includes 180,010 predicted protein-protein interactions among 9,901 human proteins and represents a useful resource for the research community. Our study has increased prediction coverage by five-fold. IntNetDB also provides easy-to-use network visualization and analysis tools that allow biological researchers unfamiliar with computational biology to access and analyze data over the internet. The web interface of

  12. StaRProtein, A Web Server for Prediction of the Stability of Repeat Proteins

    Science.gov (United States)

    Xu, Yongtao; Zhou, Xu; Huang, Meilan

    2015-01-01

    Repeat proteins have become increasingly important due to their capability to bind to almost any protein and their potential as an alternative therapy to monoclonal antibodies. In the past decade repeat proteins have been designed to mediate specific protein-protein interactions. The tetratricopeptide and ankyrin repeat proteins are two classes of helical repeat proteins that form different binding pockets to accommodate various partners. It is important to understand the factors that define folding and stability of repeat proteins in order to prioritize the most stable designed repeat proteins to further explore their potential binding affinities. Here we developed distance-dependent statistical potentials using two classes of alpha-helical repeat proteins, tetratricopeptide and ankyrin repeat proteins respectively, and evaluated their efficiency in predicting the stability of repeat proteins. We demonstrated that the repeat-specific statistical potentials based on these two classes of repeat proteins showed superior accuracy compared with non-specific statistical potentials in (1) discriminating correct from incorrect models and (2) ranking the stability of designed repeat proteins. In particular, the statistical scores correlate closely with the equilibrium unfolding free energies of repeat proteins and therefore would serve as a novel tool for quickly prioritizing the designed repeat proteins with high stability. The StaRProtein web server was developed for predicting the stability of repeat proteins. PMID:25807112

  13. Domain distribution and intrinsic disorder in hubs in the human protein–protein interaction network

    Science.gov (United States)

    Patil, Ashwini; Kinoshita, Kengo; Nakamura, Haruki

    2010-01-01

    Intrinsic disorder and distributed surface charge have been previously identified as some of the characteristics that differentiate hubs (proteins with a large number of interactions) from non-hubs in protein–protein interaction networks. In this study, we investigated the differences in the quantity, diversity, and functional nature of Pfam domains, and their relationship with intrinsic disorder, in hubs and non-hubs. We found that proteins with a more diverse domain composition were over-represented in hubs when compared with non-hubs, with the number of interactions in hubs increasing with domain diversity. Conversely, the fraction of intrinsic disorder in hubs decreased with increasing number of ordered domains. The difference in the levels of disorder was more prominent in hubs and non-hubs with fewer domains. Functional analysis showed that hubs were enriched in kinase and adaptor domains acting primarily in signal transduction and transcription regulation, whereas non-hubs had more DNA-binding domains and were involved in catalytic activity. Consistent with the differences in the functional nature of their domains, hubs with two or more domains were more likely to connect distinct functional modules in the interaction network when compared with single domain hubs. We conclude that the availability of greater number and diversity of ordered domains, in addition to the tendency to have promiscuous domains, differentiates hubs from non-hubs and provides an additional means of achieving interaction promiscuity. Further, hubs with fewer domains use greater levels of intrinsic disorder to facilitate interaction promiscuity with the prevalence of disorder decreasing with increasing number of ordered domains. PMID:20509167

  14. Role of long- and short-range hydrophobic, hydrophilic and charged residues contact network in protein's structural organization.

    Science.gov (United States)

    Sengupta, Dhriti; Kundu, Sudip

    2012-06-21

    The three-dimensional structure of a protein can be described as a graph where nodes represent residues and the strengths of non-covalent interactions between them are edges. These protein contact networks can be separated into long- and short-range interaction networks depending on the positions of amino acids in primary structure. Long-range interactions play a distinct role in determining the tertiary structure of a protein while short-range interactions could largely contribute to secondary structure formation. In addition, the physicochemical properties and the linear arrangement of amino acids in the primary structure of a protein determine its three-dimensional structure. Here, we present an extensive analysis of protein contact subnetworks based on the London van der Waals interactions of amino acids at different length scales. We further subdivided those networks into hydrophobic, hydrophilic and charged residue networks and tried to correlate their influence with the overall topology and organization of a protein. The largest connected component (LCC) of long (LRN)-, short (SRN)- and all-range (ARN) networks within proteins exhibits a transition behaviour when plotted against different interaction strengths of edges among amino acid nodes. While short-range networks, having chain-like structures, exhibit a highly cooperative transition, long- and all-range networks, which are more similar to each other, have non-chain-like structures and show less cooperativity. Further, the hydrophobic residue subnetworks in long- and all-range networks have transition behaviours similar to those of the all-residue all-range networks, but the hydrophilic and charged residue networks do not. While the nature of the transitions of LCC sizes is the same in SRNs for thermophiles and mesophiles, there exists a clear difference in LRNs. The presence of larger size of interconnected long-range interactions in thermophiles than mesophiles, even at higher interaction strength between amino acids

  15. Molecular dynamics studies of protein folding and aggregation

    Science.gov (United States)

    Ding, Feng

    that globular proteins under a denaturing environment partially unfold and aggregate by forming stabilizing hydrogen bonds between the backbones of the partial folded substructures. Proteins or peptides rich in alpha-helices also aggregate into beta-rich amyloid fibrils. Upon aggregation, the protein or peptide undergoes a conformational transition from alpha-helices to beta-sheets. The transition of alpha-helix to beta-hairpin (two-stranded beta-sheet) is studied in an all-heavy-atom discrete molecular dynamics model of a polyalanine chain. An entropical driving scenario for the alpha-helix to beta-hairpin transition is discovered.

  16. Modelling human protein interaction networks as metric spaces has potential in disease research and drug target discovery.

    Science.gov (United States)

    Fadhal, Emad; Mwambene, Eric C; Gamieldien, Junaid

    2014-06-14

    By formally modelling human protein interaction networks (PINs) as metric spaces and classifying proteins into zones based on their distance from the topological centre, we recently showed that hub proteins are primarily centrally located. We also showed that zones closest to the network centre are enriched for critically important proteins and are also functionally very specialised for specific 'house keeping' functions. We proposed that proteins closest to the network centre may present good therapeutic targets. Here, we present multiple pieces of novel functional evidence that provide strong support for this hypothesis. We found that the human PINs have a highly connected signalling core, with the majority of proteins involved in signalling located in the two zones closest to the topological centre. The majority of essential, disease related, tumour suppressor, oncogenic and approved drug target proteins were found to be centrally located. Similarly, the majority of proteins consistently expressed in 13 types of cancer are also predominantly located in zones closest to the centre. Proteins from zones 1 and 2 were also found to comprise the majority of proteins in key KEGG pathways such as MAPK-signalling, the cell cycle, apoptosis and also pathways in cancer, with very similar patterns seen in pathways that lead to cancers such as melanoma and glioma, and non-neoplastic diseases such as measles, inflammatory bowel disease and Alzheimer's disease. Based on the diversity of evidence uncovered, we propose that when considered holistically, proteins located centrally in the human PINs that also have similar functions to existing drug targets are good candidate targets for novel therapeutics. Similarly, since disease pathways are dominated by centrally located proteins, candidates shortlisted in genome scale disease studies can be further prioritized and contextualised based on whether they occupy central positions in the human PINs.
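
    The idea of zoning a network by distance from its topological centre can be sketched in Python. The sketch below assumes that the centre is the set of nodes with minimum eccentricity in a connected, unweighted graph and that a protein's zone is its shortest-path distance to that set. Both definitions, and the toy network with placeholder protein names, are assumptions for illustration; the authors' exact construction and zone numbering may differ.

```python
from collections import deque

def bfs_distances(adj, sources):
    """Shortest-path distance from the nearest source to every node."""
    dist = {s: 0 for s in sources}
    queue = deque(sources)
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def zones_from_centre(adj):
    """Return (centre nodes, zone of each node), assuming a connected graph."""
    # Eccentricity: the largest shortest-path distance from a node.
    ecc = {n: max(bfs_distances(adj, [n]).values()) for n in adj}
    radius = min(ecc.values())
    centre = sorted(n for n in adj if ecc[n] == radius)
    return centre, bfs_distances(adj, centre)

# Hypothetical toy interaction network; node names are placeholders.
edges = [("A", "B"), ("B", "C"), ("C", "D"), ("C", "E"), ("E", "F")]
adj = {}
for a, b in edges:
    adj.setdefault(a, set()).add(b)
    adj.setdefault(b, set()).add(a)

centre, zones = zones_from_centre(adj)
print(centre, zones)
```

    Computing eccentricity for every node costs one breadth-first search per node, which is fine for a sketch but may need approximation on a genome-scale network.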

  17. Estimation of adsorption isotherm and mass transfer parameters in protein chromatography using artificial neural networks.

    Science.gov (United States)

    Wang, Gang; Briskot, Till; Hahn, Tobias; Baumann, Pascal; Hubbuch, Jürgen

    2017-03-03

    Mechanistic modeling has been repeatedly successfully applied in process development and control of protein chromatography. For each combination of adsorbate and adsorbent, the mechanistic models have to be calibrated. Some of the model parameters, such as system characteristics, can be determined reliably by applying well-established experimental methods, whereas others cannot be measured directly. In common practice of protein chromatography modeling, these parameters are identified by applying time-consuming methods such as frontal analysis combined with gradient experiments, curve-fitting, or combined Yamamoto approach. For new components in the chromatographic system, these traditional calibration approaches require to be conducted repeatedly. In the presented work, a novel method for the calibration of mechanistic models based on artificial neural network (ANN) modeling was applied. An in silico screening of possible model parameter combinations was performed to generate learning material for the ANN model. Once the ANN model was trained to recognize chromatograms and to respond with the corresponding model parameter set, it was used to calibrate the mechanistic model from measured chromatograms. The ANN model's capability of parameter estimation was tested by predicting gradient elution chromatograms. The time-consuming model parameter estimation process itself could be reduced down to milliseconds. The functionality of the method was successfully demonstrated in a study with the calibration of the transport-dispersive model (TDM) and the stoichiometric displacement model (SDM) for a protein mixture. Copyright © 2017 The Author(s). Published by Elsevier B.V. All rights reserved.

  18. Two-dimensional ¹H NMR studies on HPr protein from Staphylococcus aureus: Complete sequential assignments and secondary structure

    Energy Technology Data Exchange (ETDEWEB)

    Kalbitzer, H.R.; Neidig, K.P. (Max-Planck-Inst. for Medical Research, Heidelberg (West Germany)); Hengstenberg, W. (Univ. of Bochum (West Germany))

    1991-11-19

    Complete sequence-specific assignments of the ¹H NMR spectrum of HPr protein from Staphylococcus aureus were obtained by two-dimensional NMR methods. Important secondary structure elements that can be derived from the observed nuclear Overhauser effects are a large antiparallel β-pleated sheet consisting of four strands, A, B, C, D, a segment S_AB consisting of an extended region around the active-center histidine (His-15) and an α-helix, a half-turn between strands B and C, a segment S_CD which shows no typical secondary structure, and the α-helical, C-terminal segment S_term. These general structural features are similar to those found earlier in HPr proteins from different microorganisms such as Escherichia coli, Bacillus subtilis, and Streptococcus faecalis.

  19. DNCON2: Improved protein contact prediction using two-level deep convolutional neural networks.

    Science.gov (United States)

    Adhikari, Badri; Hou, Jie; Cheng, Jianlin

    2017-12-08

    Significant improvements in the prediction of protein residue-residue contacts have been observed in recent years. These contacts, predicted using a variety of coevolution-based and machine learning methods, are the key contributors to the recent progress in ab initio protein structure prediction, as demonstrated in the recent CASP experiments. Continuing the development of new methods to reliably predict contact maps is essential to further improve ab initio structure prediction. In this paper we discuss DNCON2, an improved protein contact map predictor based on two-level deep convolutional neural networks. It consists of six convolutional neural networks: the first five predict contacts at 6, 7.5, 8, 8.5, and 10 Å distance thresholds, and the last one uses these five predictions as additional features to predict final contact maps. On the free-modeling datasets in the CASP10, 11, and 12 experiments, DNCON2 achieves mean precisions of 35%, 50%, and 53.4%, respectively, higher than 30.6% by MetaPSICOV on the CASP10 dataset, 34% by MetaPSICOV on the CASP11 dataset, and 46.3% by Raptor-X on the CASP12 dataset, when the top L/5 long-range contacts are evaluated. We attribute the improved performance of DNCON2 to the inclusion of short- and medium-range contacts in training, the two-level approach to prediction, the use of state-of-the-art optimization and activation functions, and a novel deep learning architecture that allows each filter in a convolutional layer to access all the input features of a protein of arbitrary length. The web server of DNCON2 is at http://sysbio.rnet.missouri.edu/dncon2/ where training and testing datasets as well as the predictions for CASP10, 11, and 12 free-modeling datasets can also be downloaded. Its source code is available at https://github.com/multicom-toolbox/DNCON2/. Contact: chengji@missouri.edu. Supplementary data are available at Bioinformatics online. © The Author(s) 2017. Published by Oxford University Press.
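
    The "top L/5 long-range" evaluation mentioned in this record can be illustrated with a short Python sketch. It keeps predicted pairs whose sequence separation is at least 24 residues (a commonly used convention for long-range contacts, assumed here), ranks them by predicted probability, takes the top L/5 for a protein of length L, and reports the fraction that are true contacts. The function name and the toy predictions are hypothetical, and this is not the DNCON2 evaluation code.

```python
def top_l5_long_range_precision(predicted, true_contacts, length,
                                min_separation=24):
    """Precision of the top L/5 long-range predicted contacts.

    predicted:     iterable of (i, j, probability)
    true_contacts: set of (i, j) pairs with i < j
    length:        protein length L
    """
    # Keep only long-range pairs (assumed: sequence separation >= 24).
    long_range = [(i, j, p) for i, j, p in predicted
                  if abs(i - j) >= min_separation]
    long_range.sort(key=lambda item: item[2], reverse=True)
    top = long_range[:max(1, length // 5)]
    if not top:
        return 0.0
    hits = sum(1 for i, j, _ in top
               if (min(i, j), max(i, j)) in true_contacts)
    # Assumption: divide by the number of pairs actually evaluated.
    return hits / len(top)

# Hypothetical predictions for a 30-residue protein (so L/5 = 6).
predicted = [
    (0, 24, 0.90), (0, 25, 0.80), (1, 26, 0.85), (2, 27, 0.70),
    (3, 28, 0.60), (4, 29, 0.50), (0, 29, 0.40), (5, 29, 0.30),
    (0, 3, 0.99),  # short-range pair, ignored by this metric
]
true_contacts = {(0, 24), (1, 26), (2, 27), (4, 29), (0, 29), (0, 3)}

precision = top_l5_long_range_precision(predicted, true_contacts, length=30)
print(round(precision, 3))
```

    Some evaluations divide by L/5 even when fewer long-range predictions are available; that choice is left as a stated assumption in the code comment.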

  20. Conformational intermediate of the amyloidogenic protein beta 2-microglobulin at neutral pH

    DEFF Research Database (Denmark)

    Heegaard, N H; Sen, J W; Kaarsholm, N C

    2001-01-01

    electrophoresis that two conformers spontaneously exist in aqueous buffers at neutral pH. Upon treatment of wild-type beta(2)-microglobulin with acetonitrile or trifluoroethanol, two conformations were also observed. These conformations were in equilibrium dependent on the sample temperature and the percentage...... of organic solvent present. Circular dichroism showed a loss of beta-structures and gain of alpha-helices. Reversal to the native conformation occurred when removing the organics. Affinity capillary electrophoresis experiments showed increased specific interactions of the nonnative beta(2)-microglobulin...... conformation with the dyes 8-anilino-1-naphthalene sulfonic acid and Congo red. The observations may relate to early folding events prior to amyloid fibrillation and facilitate the development of methods to detect and inhibit pro-amyloid protein and peptide conformations....

  1. Domain architecture and oligomerization properties of the paramyxovirus PIV 5 hemagglutinin-neuraminidase (HN) protein.

    Science.gov (United States)

    Yuan, Ping; Leser, George P; Demeler, Borries; Lamb, Robert A; Jardetzky, Theodore S

    2008-09-01

    The mechanism by which the paramyxovirus hemagglutinin-neuraminidase (HN) protein couples receptor binding to activation of virus entry remains to be fully understood, but the HN stalk is thought to play an important role in the process. We have characterized ectodomain constructs of the parainfluenza virus 5 HN to understand better the underlying architecture and oligomerization properties that may influence HN functions. The PIV 5 neuraminidase (NA) domain is monomeric whereas the ectodomain forms a well-defined tetramer. The HN stalk also forms tetramers and higher order oligomers with high alpha-helical content. Together, the data indicate that the globular NA domains form weak intersubunit interactions at the end of the HN stalk tetramer, while stabilizing the stalk and overall oligomeric state of the ectodomain. Electron microscopy of the HN ectodomain reveals flexible arrangements of the NA and stalk domains, which may be important for understanding how these two HN domains impact virus entry.

  2. The heat shock protein/chaperone network and multiple stress resistance

    KAUST Repository

    Jacob, Pierre

    2016-11-15

    Crop yield has been greatly enhanced during the last century. However, most elite cultivars are adapted to temperate climates and are not well suited to more stressful conditions. In the context of climate change, stress resistance is a major concern. To overcome these difficulties, scientists may help breeders by providing genetic markers associated with stress resistance. However, multi-stress resistance cannot be obtained from the simple addition of single stress resistance traits. In the field, stresses are unpredictable and several may occur at once. Consequently, the use of single stress resistance traits is often inadequate. Although it has been historically linked with the heat stress response, the heat shock protein (HSP)/chaperone network is a major component of multiple stress responses. Among the HSP/chaperone

  3. Using co-occurrence network structure to extract synonymous gene and protein names from MEDLINE abstracts

    Directory of Open Access Journals (Sweden)

    Spackman K

    2005-04-01

    Full Text Available Abstract Background Text-mining can assist biomedical researchers in reducing information overload by extracting useful knowledge from large collections of text. We developed a novel text-mining method based on analyzing the network structure created by symbol co-occurrences as a way to extend the capabilities of knowledge extraction. The method was applied to the task of automatic gene and protein name synonym extraction. Results Performance was measured on a test set consisting of about 50,000 abstracts from one year of MEDLINE. Synonyms retrieved from curated genomics databases were used as a gold standard. The system obtained a maximum F-score of 22.21% (23.18% precision and 21.36% recall), with high efficiency in the use of seed pairs. Conclusion The method performs comparably with other studied methods, does not rely on sophisticated named-entity recognition, and requires little initial seed knowledge.
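
    The F-score quoted in this abstract combines precision and recall into a single number. A minimal sketch of the standard F-beta formula is given here for orientation; the function name `f_score` is chosen for this example and is not taken from the paper. With the reported precision and recall it yields roughly 22.2%, close to the reported maximum.

```python
def f_score(precision: float, recall: float, beta: float = 1.0) -> float:
    """Weighted harmonic mean of precision and recall (F-beta).

    beta = 1 gives the balanced F1 score. Inputs are fractions in [0, 1].
    """
    if precision == 0.0 and recall == 0.0:
        return 0.0  # avoid division by zero when nothing was retrieved correctly
    b2 = beta * beta
    return (1.0 + b2) * precision * recall / (b2 * precision + recall)


if __name__ == "__main__":
    # Precision and recall as reported in the abstract (converted to fractions).
    print(f"F1 = {f_score(0.2318, 0.2136):.4f}")
```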

  4. Towards the identification of protein complexes and functional modules by integrating PPI network and gene expression data

    Directory of Open Access Journals (Sweden)

    Li Min

    2012-05-01

    Full Text Available Abstract Background Identification of protein complexes and functional modules from protein-protein interaction (PPI) networks is crucial to understanding the principles of cellular organization and predicting protein functions. In the past few years, many computational methods have been proposed. However, most of them considered the PPI networks as static graphs and overlooked the dynamics inherent within these networks. Moreover, few of them can distinguish between protein complexes and functional modules. Results In this paper, a new framework is proposed to distinguish between protein complexes and functional modules by integrating gene expression data into protein-protein interaction (PPI) data. A series of time-sequenced subnetworks (TSNs) is constructed according to the time that the interactions were activated. The algorithm TSN-PCD was then developed to identify protein complexes from these TSNs. As protein complexes are significantly related to functional modules, a new algorithm DFM-CIN is proposed to discover functional modules based on the identified complexes. The experimental results show that the combination of temporal gene expression data with PPI data contributes to identifying protein complexes more precisely. A quantitative comparison based on f-measure reveals that our algorithm TSN-PCD outperforms the other previous protein complex discovery algorithms. Furthermore, we evaluate the identified functional modules by using “Biological Process” annotated in GO (Gene Ontology). The validation shows that the identified functional modules are statistically significant in terms of “Biological Process”. More importantly, the relationship between protein complexes and functional modules is studied. Conclusions The proposed framework based on the integration of PPI data and gene expression data makes it possible to identify protein complexes and functional modules more effectively. Moreover, the proposed new framework and

  5. Nuclear phosphoproteome of developing chickpea seedlings (Cicer arietinum L.) and protein-kinase interaction network.

    Science.gov (United States)

    Kumar, Rajiv; Kumar, Amit; Subba, Pratigya; Gayali, Saurabh; Barua, Pragya; Chakraborty, Subhra; Chakraborty, Niranjan

    2014-06-13

    Nucleus, the control centre of eukaryotic cell, houses most of the genetic machineries required for gene expression and their regulation. Post translational modifications of proteins, particularly phosphorylation control a wide variety of cellular processes but its functional connectivity, in plants, is still elusive. This study profiled the nuclear phosphoproteome of a grain legume, chickpea, to gain better understanding of such event. Intact nuclei were isolated from 3-week-old seedlings using two independent methods, and nuclear proteins were resolved by 2-DE. In a separate set of experiments, phosphoproteins were enriched using IMAC method and resolved by 1-DE. The separated proteins were stained with phosphospecific Pro-Q Diamond stain. Proteomic analyses led to the identification of 107 putative phosphoproteins, of which 86 were non-redundant. Multiple sites of phosphorylation were predicted on several key elements, which included both regulatory and functional proteins. The analysis revealed an array of phosphoproteins, presumably involved in a variety of cellular functions, viz., protein folding (24%), signalling and gene regulation (22%), DNA replication, repair and modification (16%), and metabolism (13%), among others. These results represent the first nucleus-specific phosphoproteome map of a non-model legume, which would provide insights into the possible function of protein phosphorylation in plants. Chickpea is grown over 10 million hectares of land worldwide, and global production hovers around 8.5 million metric tons annually. Despite its nutritional merits, it is often referred to as 'orphan' legume and has remained outside the realm of large-scale functional genomics studies. While current chickpea genome initiative has primarily focused on sequence information and functional annotation, proteomics analyses are limited. It is thus important to study the proteome of the cell organelle particularly the nucleus, which harbors most of the genetic

  6. Relating diseases by integrating gene associations and information flow through protein interaction network.

    Science.gov (United States)

    Hamaneh, Mehdi Bagheri; Yu, Yi-Kuo

    2014-01-01

    Identifying similar diseases could potentially provide deeper understanding of their underlying causes, and may even hint at possible treatments. For this purpose, it is necessary to have a similarity measure that reflects the underpinning molecular interactions and biological pathways. We have thus devised a network-based measure that can partially fulfill this goal. Our method assigns weights to all proteins (and consequently their encoding genes) by using information flow from a disease to the protein interaction network and back. Similarity between two diseases is then defined as the cosine of the angle between their corresponding weight vectors. The proposed method also provides a way to suggest disease-pathway associations by using the weights assigned to the genes to perform enrichment analysis for each disease. By calculating pairwise similarities between 2534 diseases, we show that our disease similarity measure is strongly correlated with the probability of finding the diseases in the same disease family and, more importantly, sharing biological pathways. We have also compared our results to those of MimMiner, a text-mining method that assigns pairwise similarity scores to diseases. We find the results of the two methods to be complementary. It is also shown that clustering diseases based on their similarities and performing enrichment analysis for the cluster centers significantly increases the term association rate, suggesting that the cluster centers are better representatives for biological pathways than the diseases themselves. This lends support to the view that our similarity measure is a good indicator of relatedness of biological processes involved in causing the diseases. Although not needed for understanding this paper, the raw results are available for download for further study at ftp://ftp.ncbi.nlm.nih.gov/pub/qmbpmn/DiseaseRelations/.
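
    The similarity measure in this abstract is the cosine of the angle between two disease weight vectors. The sketch below illustrates only that final step; how the weights are obtained from information flow is not reproduced here. The function name and the dictionary layout (protein identifier mapped to weight) are assumptions made for this example.

```python
import math


def cosine_similarity(weights_a: dict, weights_b: dict) -> float:
    """Cosine of the angle between two sparse weight vectors.

    Each argument maps a protein (or gene) identifier to its weight for one
    disease; identifiers missing from a dictionary are treated as weight 0.
    """
    shared = set(weights_a) & set(weights_b)
    dot = sum(weights_a[p] * weights_b[p] for p in shared)
    norm_a = math.sqrt(sum(w * w for w in weights_a.values()))
    norm_b = math.sqrt(sum(w * w for w in weights_b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # similarity is undefined for an all-zero vector; report 0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Hypothetical weights for two diseases over three proteins.
    disease_1 = {"P1": 0.7, "P2": 0.2, "P3": 0.1}
    disease_2 = {"P1": 0.6, "P3": 0.4}
    print(cosine_similarity(disease_1, disease_2))
```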

  7. Integration of structural dynamics and molecular evolution via protein interaction networks: a new era in genomic medicine.

    Science.gov (United States)

    Kumar, Avishek; Butler, Brandon M; Kumar, Sudhir; Ozkan, S Banu

    2015-12-01

    Sequencing technologies are revealing many new non-synonymous single nucleotide variants (nsSNVs) in each personal exome. To assess their functional impacts, comparative genomics is frequently employed to predict if they are benign or not. However, evolutionary analysis alone is insufficient, because it misdiagnoses many disease-associated nsSNVs, such as those at positions involved in protein interfaces, and because evolutionary predictions do not provide mechanistic insights into functional change or loss. Structural analyses can aid in overcoming both of these problems by incorporating conformational dynamics and allostery in nsSNV diagnosis. Finally, protein-protein interaction networks using systems-level methodologies shed light onto disease etiology and pathogenesis. Bridging these network approaches with structurally resolved protein interactions and dynamics will advance genomic medicine. Copyright © 2015 Elsevier Ltd. All rights reserved.

  8. A Bipartite Network-based Method for Prediction of Long Non-coding RNA–protein Interactions

    Directory of Open Access Journals (Sweden)

    Mengqu Ge

    2016-02-01

    Full Text Available As one large class of non-coding RNAs (ncRNAs), long ncRNAs (lncRNAs) have gained considerable attention in recent years. Mutations and dysfunction of lncRNAs have been implicated in human disorders. Many lncRNAs exert their effects through interactions with the corresponding RNA-binding proteins. Several computational approaches have been developed, but only a few are able to perform the prediction of these interactions from a network-based point of view. Here, we introduce a computational method named lncRNA–protein bipartite network inference (LPBNI). LPBNI aims to identify potential lncRNA–interacting proteins, by making full use of the known lncRNA–protein interactions. Leave-one-out cross validation (LOOCV) test shows that LPBNI significantly outperforms other network-based methods, including random walk (RWR) and protein-based collaborative filtering (ProCF). Furthermore, a case study was performed to demonstrate the performance of LPBNI using real data in predicting potential lncRNA–interacting proteins.

  9. Artificial neural networks to model the production of blood protein hydrolysates for plant fertilisation.

    Science.gov (United States)

    Gálvez, Raúl Pérez; Carpio, Francisco Javier Espejo; Guadix, Emilia M; Guadix, Antonio

    2016-01-15

    Amino acid-based fertilisers increase the bioavailability of nitrogen in plants and help withstand stress conditions. Additionally, porcine blood protein hydrolysates are able to supply iron, which is involved in chlorophyll synthesis and improves the availability of nutrients in soil. A high degree of hydrolysis is desirable when producing a protein hydrolysate intended for fertilisation, since it assures a high supply of free amino acids. Given the complexity of enzyme reactions, empirical approaches such as artificial neural networks (ANNs) are preferred for modelisation. Porcine blood meal was hydrolysed for 3 h with subtilisin. The time evolution of the degree of hydrolysis was successfully modelled by means of a feedforward ANN comprising 10 neurons in the hidden layer and trained by the Levenberg-Marquardt algorithm. The ANN model described adequately the influence of pH, temperature, enzyme concentration and reaction time upon the degree of hydrolysis, and was used to estimate the optimal operation conditions (pH 6.67, 56.9 °C, enzyme to substrate ratio of 10 g kg(-1) and 3 h of reaction) leading to the maximum degree of hydrolysis (35.12%). ANN modelling was a useful tool to model enzymatic reactions and was successfully employed to optimise the degree of hydrolysis. © 2015 Society of Chemical Industry.

  10. Root cause investigation of deviations in protein chromatography based on mechanistic models and artificial neural networks.

    Science.gov (United States)

    Wang, Gang; Briskot, Till; Hahn, Tobias; Baumann, Pascal; Hubbuch, Jürgen

    2017-09-15

    In protein chromatography, process variations, such as aging of column or process errors, can result in deviations of the product and impurity levels. Consequently, the process performance described by purity, yield, or production rate may decrease. Based on visual inspection of the UV signal, it is hard to identify the source of the error and almost unfeasible to determine the quantity of deviation. The problem becomes even more pronounced, if multiple root causes of the deviation are interconnected and lead to an observable deviation. In the presented work, a novel method based on the combination of mechanistic chromatography models and the artificial neural networks is suggested to solve this problem. In a case study using a model protein mixture, the determination of deviations in column capacity and elution gradient length was shown. Maximal errors of 1.5% and 4.90% for the prediction of deviation in column capacity and elution gradient length respectively demonstrated the capability of this method for root cause investigation. Copyright © 2017 The Author(s). Published by Elsevier B.V. All rights reserved.

  11. A new framework for computational protein design through cost function network optimization.

    Science.gov (United States)

    Traoré, Seydou; Allouche, David; André, Isabelle; de Givry, Simon; Katsirelos, George; Schiex, Thomas; Barbe, Sophie

    2013-09-01

    The main challenge for structure-based computational protein design (CPD) remains the combinatorial nature of the search space. Even in its simplest fixed-backbone formulation, CPD encompasses a computationally difficult NP-hard problem that prevents the exact exploration of complex systems defining large sequence-conformation spaces. We present here a CPD framework, based on cost function network (CFN) solving, a recent exact combinatorial optimization technique, to efficiently handle highly complex combinatorial spaces encountered in various protein design problems. We show that the CFN-based approach is able to solve to optimality a variety of complex designs that could often not be solved using a usual CPD-dedicated tool or state-of-the-art exact operations research tools. Beyond the identification of the optimal solution, the global minimum-energy conformation, the CFN-based method is also able to quickly enumerate large ensembles of suboptimal solutions of interest to rationally build experimental enzyme mutant libraries. The combined pipeline used to generate energetic models (based on a patched version of the open source solver Osprey 2.0), the conversion to CFN models (based on Perl scripts) and CFN solving (based on the open source solver toulbar2) are all available at http://genoweb.toulouse.inra.fr/~tschiex/CPD

  12. Inference of a Geminivirus-Host Protein-Protein Interaction Network through Affinity Purification and Mass Spectrometry Analysis

    Directory of Open Access Journals (Sweden)

    Liping Wang

    2017-09-01

    Full Text Available Viruses reshape the intracellular environment of their hosts, largely through protein-protein interactions, to co-opt processes necessary for viral infection and interference with antiviral defences. Due to genome size constraints and the concomitant limited coding capacity of viruses, viral proteins are generally multifunctional and have evolved to target diverse host proteins. Inference of the virus-host interaction network can be instrumental for understanding how viruses manipulate the host machinery and how re-wiring of specific pathways can contribute to disease. Here, we use affinity purification and mass spectrometry analysis (AP-MS) to define the global landscape of interactions between the geminivirus Tomato yellow leaf curl virus (TYLCV) and its host Nicotiana benthamiana. For this purpose, we expressed tagged versions of each of the TYLCV-encoded proteins (C1/Rep, C2/TrAP, C3/REn, C4, V2, and CP) in planta in the presence of the virus. Using a quantitative scoring system, 728 high-confidence plant interactors were identified, and the interaction network of each viral protein was inferred; TYLCV-targeted proteins are more connected than average, and connect with other proteins through shorter paths, which would allow the virus to exert large effects with few interactions. Comparative analyses of divergence patterns between N. benthamiana and potato, a non-host Solanaceae, showed evolutionary constraints on TYLCV-targeted proteins. Our results provide a comprehensive overview of plant proteins targeted by TYLCV during the viral infection, which may contribute to uncovering the underlying molecular mechanisms of plant viral diseases and provide novel potential targets for anti-viral strategies and crop engineering. Interestingly, some of the TYLCV-interacting proteins appear to be convergently targeted by other pathogen effectors, which suggests a central role for these proteins in plant-pathogen interactions, and pinpoints them as potential

  13. ProLanGO: Protein Function Prediction Using Neural Machine Translation Based on a Recurrent Neural Network

    Directory of Open Access Journals (Sweden)

    Renzhi Cao

    2017-10-01

    Full Text Available With the development of next generation sequencing techniques, it is fast and cheap to determine protein sequences but relatively slow and expensive to extract useful information from protein sequences because of limitations of traditional biological experimental techniques. Protein function prediction has been a long standing challenge to fill the gap between the huge amount of protein sequences and the known function. In this paper, we propose a novel method that converts the protein function problem into a language translation problem, from the newly proposed protein sequence language “ProLan” to the protein function language “GOLan”, and build a neural machine translation model based on recurrent neural networks to translate “ProLan” language to “GOLan” language. We blindly tested our method by participating in the third Critical Assessment of Function Annotation (CAFA 3) in 2016, and also evaluated the performance of our methods on selected proteins whose function was released after the CAFA competition. The good performance on the training and testing datasets demonstrates that our newly proposed method is a promising direction for protein function prediction. In summary, we propose, for the first time, a method which converts the protein function prediction problem to a language translation problem and applies a neural machine translation model for protein function prediction.

  14. ProLanGO: Protein Function Prediction Using Neural Machine Translation Based on a Recurrent Neural Network.

    Science.gov (United States)

    Cao, Renzhi; Freitas, Colton; Chan, Leong; Sun, Miao; Jiang, Haiqing; Chen, Zhangxin

    2017-10-17

    With the development of next generation sequencing techniques, it is fast and cheap to determine protein sequences but relatively slow and expensive to extract useful information from protein sequences because of limitations of traditional biological experimental techniques. Protein function prediction has been a long standing challenge to fill the gap between the huge amount of protein sequences and the known function. In this paper, we propose a novel method that converts the protein function problem into a language translation problem, from the newly proposed protein sequence language "ProLan" to the protein function language "GOLan", and build a neural machine translation model based on recurrent neural networks to translate "ProLan" language to "GOLan" language. We blindly tested our method by participating in the third Critical Assessment of Function Annotation (CAFA 3) in 2016, and also evaluated the performance of our methods on selected proteins whose function was released after the CAFA competition. The good performance on the training and testing datasets demonstrates that our newly proposed method is a promising direction for protein function prediction. In summary, we propose, for the first time, a method which converts the protein function prediction problem to a language translation problem and applies a neural machine translation model for protein function prediction.

  15. Prioritization of potential drug targets against P. aeruginosa by core proteomic analysis using computational subtractive genomics and Protein-Protein interaction network.

    Science.gov (United States)

    Uddin, Reaz; Jamil, Faiza

    2018-03-08

    Pseudomonas aeruginosa is an opportunistic gram-negative bacterium that has the capability to acquire resistance under hostile conditions and become a threat worldwide. It is involved in nosocomial infections. In the current study, potential novel drug targets against P. aeruginosa have been identified using core proteomic analysis and Protein-Protein Interactions (PPIs) studies. The non-redundant reference proteomes of 68 P. aeruginosa strains with complete genomes and the latest assembly versions were downloaded from the NCBI RefSeq FTP server in October 2016. The standalone CD-HIT tool was used to cluster ortholog proteins (having >=80% amino acid identity) present in all strains. The pan-proteome was clustered into 12,380 Clusters of Orthologous Proteins (COPs). By using in-house shell scripts, 3252 common COPs were extracted and designated as clusters of the core proteome. The core proteome of the PAO1 strain was selected by fetching PAO1's proteome from the common COPs. As a result, 1212 proteins were shortlisted that are non-homologous to human proteins but essential for the survival of the pathogen. Among these 1212 proteins, 321 are conserved hypothetical proteins. Considering their potential as drug targets, those 321 hypothetical proteins were selected and their probable functions were characterized. Based on the druggability criteria, 18 proteins were shortlisted. The interacting partners were identified by investigating the PPI network using the STRING v10 database. Subsequently, 8 proteins were shortlisted as 'hub proteins' and proposed as potential novel drug targets against P. aeruginosa. The study is of interest to the scientific community working to identify novel drug targets against MDR pathogens, particularly P. aeruginosa. Copyright © 2018 Elsevier Ltd. All rights reserved.
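
    The core-proteome step in this abstract keeps only those clusters that contain a protein from every strain. The sketch below illustrates that filter; it is not the authors' in-house shell scripts. The function name `core_clusters` and the input layout (cluster ID mapped to a list of (strain, protein ID) pairs) are assumptions for this example, and CD-HIT output would first have to be parsed into that layout.

```python
def core_clusters(clusters: dict, strains: list) -> list:
    """Return the IDs of clusters that have at least one member in every strain.

    clusters maps a cluster ID to a list of (strain, protein_id) pairs.
    strains is the full list of strain names under comparison.
    """
    all_strains = set(strains)
    core = []
    for cluster_id, members in clusters.items():
        strains_present = {strain for strain, _protein in members}
        if all_strains <= strains_present:  # every strain is represented
            core.append(cluster_id)
    return sorted(core)


if __name__ == "__main__":
    # Hypothetical toy input: three strains, three clusters.
    strains = ["PAO1", "PA14", "LESB58"]
    clusters = {
        "COP1": [("PAO1", "a1"), ("PA14", "b1"), ("LESB58", "c1")],
        "COP2": [("PAO1", "a2"), ("PA14", "b2")],
        "COP3": [("PAO1", "a3"), ("PAO1", "a4"), ("PA14", "b3"), ("LESB58", "c3")],
    }
    print(core_clusters(clusters, strains))
```

    Clusters with several members from the same strain still count once per strain, which is why the check is done on the set of strains rather than on the member count.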

  16. Uncovering packaging features of co-regulated modules based on human protein interaction and transcriptional regulatory networks

    Directory of Open Access Journals (Sweden)

    He Weiming

    2010-07-01

    Full Text Available Abstract Background Network co-regulated modules are believed to have the functionality of packaging multiple biological entities, and can thus be assumed to coordinate many biological functions in their network neighbouring regions. Results Here, we weighted edges of a human protein interaction network and a transcriptional regulatory network to construct an integrated network, and introduced a probabilistic model and a bipartite graph framework to exploit human co-regulated modules and uncover their specific features in packaging different biological entities (genes, protein complexes or metabolic pathways). Finally, we identified 96 human co-regulated modules based on this method, and evaluated its effectiveness by comparing it with four other methods. Conclusions Dysfunctions in co-regulated interactions often occur in the development of cancer. Therefore, we focussed on an example co-regulated module and found that it could integrate a number of cancer-related genes. This was extended to causal dysfunctions of some complexes maintained by several physically interacting proteins, thus coordinating several metabolic pathways that directly underlie cancer.

  17. Using Coexpression Protein Interaction Network Analysis to Identify Mechanisms of Danshensu Affecting Patients with Coronary Heart Disease

    Directory of Open Access Journals (Sweden)

    Mengqi Huo

    2017-06-01

    Full Text Available Salvia miltiorrhiza, known as Danshen, has attracted worldwide interest for its substantial effects on coronary heart disease (CHD). Danshensu (DSS) is one of the main active ingredients of Danshen on CHD. Although it has been proven to have a good clinical effect on CHD, the action mechanisms remain elusive. In the current study, a coexpression network-based approach was used to illustrate the beneficial properties of DSS in the context of CHD. By integrating the gene expression profile data and protein-protein interactions (PPIs) data, two coexpression protein interaction networks (CePINs), one in a CHD state (CHD CePIN) and one in a non-CHD state (non-CHD CePIN), were generated. Then, shared nodes and unique nodes in CHD CePIN were attained by conducting a comparison between CHD CePIN and non-CHD CePIN. By calculating the topological parameters of each shared node and unique node in the networks, and comparing the differentially expressed genes, target proteins involved in disease regulation were attained. Then, Gene Ontology (GO) enrichment was utilized to identify biological processes associated with target proteins. Consequently, it turned out that the treatment of CHD with DSS may be partly attributed to the regulation of immunization and blood circulation. Also, it indicated that sodium/hydrogen exchanger 3 (SLC9A3), Prostaglandin G/H synthase 2 (PTGS2), Oxidized low-density lipoprotein receptor 1 (OLR1), and fibrinogen gamma chain (FGG) may be potential therapeutic targets for CHD. In summary, this study provided a novel coexpression protein interaction network approach to provide an explanation of the mechanisms of DSS on CHD and identify key proteins which may be the potential therapeutic targets for CHD.

  18. Inhibitory effects of nontoxic protein volvatoxin A1 on pore-forming cardiotoxic protein volvatoxin A2 by interaction with amphipathic alpha-helix.

    Science.gov (United States)

    Wu, Pei-Tzu; Lin, Su-Chang; Hsu, Chyong-Ing; Liaw, Yen-Chywan; Lin, Jung-Yaw

    2006-07-01

    Volvatoxin A2, a pore-forming cardiotoxic protein, was isolated from the edible mushroom Volvariella volvacea. Previous studies have demonstrated that volvatoxin A consists of volvatoxin A2 and volvatoxin A1, and the hemolytic activity of volvatoxin A2 is completely abolished by volvatoxin A1 at a volvatoxin A2/volvatoxin A1 molar ratio of 2. In this study, we investigated the molecular mechanism by which volvatoxin A1 inhibits the cytotoxicity of volvatoxin A2. Volvatoxin A1 by itself was found to be nontoxic, and furthermore, it inhibited the hemolytic and cytotoxic activities of volvatoxin A2 at molar ratios of 2 or lower. Interestingly, volvatoxin A1 contains 393 amino acid residues that closely resemble a tandem repeat of volvatoxin A2. Volvatoxin A1 contains two pairs of amphipathic alpha-helices but it lacks a heparin-binding site. This suggests that volvatoxin A1 may interact with volvatoxin A2 but not with the cell membrane. By using confocal microscopy, it was demonstrated that volvatoxin A1 could not bind to the cell membrane; however, volvatoxin A1 could inhibit binding of volvatoxin A2 to the cell membrane at a molar ratio of 2. Via peptide competition assay and in conjunction with pull-down and co-pull-down experiments, we demonstrated that volvatoxin A1 and volvatoxin A2 may form a complex. Our results suggest that this occurs via the interaction of one molecule of volvatoxin A1, which contains two amphipathic alpha-helices, with two molecules of volvatoxin A2, each of which contains one amphipathic alpha-helix. Taken together, the results of this study reveal a novel mechanism by which volvatoxin A1 regulates the cytotoxicity of volvatoxin A2 via direct interaction, and potentially provide an exciting new strategy for chemotherapy.

  19. Pathway Detection from Protein Interaction Networks and Gene Expression Data Using Color-Coding Methods and A* Search Algorithms

    Directory of Open Access Journals (Sweden)

    Cheng-Yu Yeh

    2012-01-01

    Full Text Available Given the wide availability of protein interaction networks and supporting microarray data, identifying biologically significant linear paths in the search for a potential pathway is a challenging problem. We propose a color-coding method based on the characteristics of biological network topology and apply heuristic search to speed it up. In the experiments, we tested our methods on two datasets: yeast and human prostate cancer networks with their gene expression data. Comparisons of our method with existing methods on known yeast MAPK pathways, in terms of precision and recall, show that it finds the maximum number of proteins and performs comparably well. Moreover, our method is more efficient than previous ones and detects paths of length 10 within 40 seconds on a 1.73 GHz Intel CPU with 1 GB of main memory running the Windows operating system.
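
    Color-coding, the technique named in this abstract, finds simple paths of a fixed length by randomly coloring the vertices and searching only for paths whose vertices all have different colors. The sketch below shows the plain randomized version on an unweighted graph. It does not include the topology-based coloring or the A* speed-up described by the authors, and the function name, the `trials` parameter and the adjacency-dictionary input are assumptions for this example.

```python
import random


def find_simple_path(adj: dict, k: int, trials: int = 200, seed: int = 0):
    """Search for a simple path on k vertices using randomized color-coding.

    adj maps each vertex to an iterable of its neighbours.
    Returns a list of k vertices forming a path, or None if none was found.
    """
    rng = random.Random(seed)
    nodes = list(adj)
    for _ in range(trials):
        # Assign each vertex one of k colors uniformly at random.
        color = {v: rng.randrange(k) for v in nodes}
        # layer[(v, mask)] = one path ending at v whose colors are exactly mask.
        layer = {(v, 1 << color[v]): [v] for v in nodes}
        for _ in range(k - 1):
            nxt = {}
            for (v, mask), path in layer.items():
                for w in adj[v]:
                    bit = 1 << color[w]
                    if mask & bit:
                        continue  # color already used, so w cannot extend the path
                    nxt.setdefault((w, mask | bit), path + [w])
            layer = nxt
            if not layer:
                break
        if layer:
            # All colors on the path differ, so all its vertices differ too.
            return next(iter(layer.values()))
    return None


if __name__ == "__main__":
    # Hypothetical toy network: a chain of four proteins.
    graph = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
    print(find_simple_path(graph, 4))
```

    A single random coloring makes a given k-vertex path colorful with probability k!/k^k, which is why the coloring is repeated over several trials.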

  20. A protein interaction atlas for the nuclear receptors: properties and quality of a hub-based dimerisation network

    Directory of Open Access Journals (Sweden)

    De Graaf David

    2007-07-01

    Full Text Available Abstract Background The nuclear receptors are a large family of eukaryotic transcription factors that constitute major pharmacological targets. They exert their combinatorial control through homotypic heterodimerisation. Elucidation of this dimerisation network is vital in order to understand the complex dynamics and potential cross-talk involved. Results Phylogeny, protein-protein interactions, protein-DNA interactions and gene expression data have been integrated to provide a comprehensive and up-to-date description of the topology and properties of the nuclear receptor interaction network in humans. We discriminate between DNA-binding and non-DNA-binding dimers, and provide a comprehensive interaction map, that identifies potential cross-talk between the various pathways of nuclear receptors. Conclusion We infer that the topology of this network is hub-based, and much more connected than previously thought. The hub-based topology of the network and the wide tissue expression pattern of NRs create a highly competitive environment for the common heterodimerising partners. Furthermore, a significant number of negative feedback loops is present, with the hub protein SHP [NR0B2] playing a major role. We also compare the evolution, topology and properties of the nuclear receptor network with the hub-based dimerisation network of the bHLH transcription factors in order to identify both unique themes and ubiquitous properties in gene regulation. In terms of methodology, we conclude that such a comprehensive picture can only be assembled by semi-automated text-mining, manual curation and integration of data from various sources.

  1. Pleiotropy constrains the evolution of protein but not regulatory sequences in a transcription regulatory network influencing complex social behaviours

    Directory of Open Access Journals (Sweden)

    Daria Molodtsova

    2014-12-01

    Full Text Available It is increasingly apparent that genes and networks that influence complex behaviour are evolutionary conserved, which is paradoxical considering that behaviour is labile over evolutionary timescales. How does adaptive change in behaviour arise if behaviour is controlled by conserved, pleiotropic, and likely evolutionary constrained genes? Pleiotropy and connectedness are known to constrain the general rate of protein evolution, prompting some to suggest that the evolution of complex traits, including behaviour, is fuelled by regulatory sequence evolution. However, we seldom have data on the strength of selection on mutations in coding and regulatory sequences, and this hinders our ability to study how pleiotropy influences coding and regulatory sequence evolution. Here we use population genomics to estimate the strength of selection on coding and regulatory mutations for a transcriptional regulatory network that influences complex behaviour of honey bees. We found that replacement mutations in highly connected transcription factors and target genes experience significantly stronger negative selection relative to weakly connected transcription factors and targets. Adaptively evolving proteins were significantly more likely to reside at the periphery of the regulatory network, while proteins with signs of negative selection were near the core of the network. Interestingly, connectedness and network structure had minimal influence on the strength of selection on putative regulatory sequences for both transcription factors and their targets. Our study indicates that adaptive evolution of complex behaviour can arise because of positive selection on protein-coding mutations in peripheral genes, and on regulatory sequence mutations in both transcription factors and their targets throughout the network.

  2. Dynamic modular architecture of protein-protein interaction networks beyond the dichotomy of ‘date’ and ‘party’ hubs

    Science.gov (United States)

    Chang, Xiao; Xu, Tao; Li, Yun; Wang, Kai

    2013-01-01

    The protein-protein interaction (PPI) networks are dynamically organized as modules, and are typically described by hub dichotomy: ‘party’ hubs act as intramodule hubs and are coexpressed with their partners, yet ‘date’ hubs act as coordinators among modules and are incoherently expressed with their partners. However, there remains skepticism about the existence of hub dichotomy. Since different algorithms and data sets were used in previous studies to test the model of hub classification, the conclusions may be largely influenced by the potential inherent biases. In this study, we evaluated two data sets of yeast interactome, and systematically investigated the behavior of hubs from multiple perspectives including co-expression patterns, topological roles and functional classifications. Our results revealed consistency between the two data sets, confirming the presence of hub dichotomy. Furthermore, we analyzed a human interactome data set, and demonstrated that the modular architecture of the PPI networks was more complicated than hub dichotomy. PMID:23603706

  3. Identification of T1D susceptibility genes within the MHC region by combining protein interaction networks and SNP genotyping data

    DEFF Research Database (Denmark)

    Brorsson, C.; Hansen, Niclas Tue; Hansen, Kasper Lage

    2009-01-01

    genes. We have developed a novel method that combines single nucleotide polymorphism (SNP) genotyping data with protein-protein interaction (ppi) networks to identify disease-associated network modules enriched for proteins encoded from the MHC region. Approximately 2500 SNPs located in the 4 Mb MHC......To develop novel methods for identifying new genes that contribute to the risk of developing type 1 diabetes within the Major Histocompatibility Complex (MHC) region on chromosome 6, independently of the known linkage disequilibrium (LD) between human leucocyte antigen (HLA)-DRB1, -DQA1, -DQB1...... are well known in the pathogenesis of T1D, but the modules also contain additional candidates that have been implicated in beta-cell development and diabetic complications. The extensive LD within the MHC region makes it important to develop new methods for analysing genotyping data for identification...

  4. Structure and Function Study of HIV and Influenza Fusion Proteins

    Science.gov (United States)

    Liang, Shuang

    Human immunodeficiency virus (HIV) and influenza virus are membrane-enveloped viruses causing acquired immunodeficiency syndrome (AIDS) and flu, respectively. The initial step of HIV and influenza virus infection is fusion between the viral and host cell membranes, catalyzed by the viral fusion proteins gp41 and hemagglutinin (HA), respectively. However, the structures of gp41 and HA as well as the infection mechanism are still not fully understood. This work addresses (1) full-length gp41 ectodomain and TM domain structure and function and (2) IFP membrane location and IFP-membrane interaction. My studies of gp41 protein and IFP can provide a better understanding of the membrane fusion mechanism and may aid development of anti-viral therapeutics and vaccines. The full-length ectodomain and transmembrane domain of gp41 and shorter constructs were expressed, purified and solubilized at physiological conditions. The constructs adopt an overall alpha helical structure in SDS and DPC detergents, and showed hyperthermostability with Tm > 90 °C. The oligomeric states of these proteins vary in different detergent buffers: predominant trimer for all constructs and some hexamer fraction for HM and HM_TM protein in SDS at pH 7.4; and mixtures of monomer, trimer, and higher-order oligomer protein in DPC at pH 4.0 and 7.4. Substantial protein-induced vesicle fusion was observed, including fusion of neutral vesicles at neutral pH, which are conditions similar to those of HIV/cell fusion. Vesicle fusion by a gp41 ectodomain construct has rarely been observed under these conditions, and is aided by inclusion of both the FP and TM, and by protein which is predominantly trimer rather than monomer. Current data were integrated with existing data, and a structural model was proposed. The IFP adopts a helix-turn-helix structure in membrane. However, there have been arguments about the IFP membrane location. 13C-2H REDOR solid-state NMR is used to solve this problem. The IFP adopts major alpha

  5. A homologous mapping method for three-dimensional reconstruction of protein networks reveals disease-associated mutations.

    Science.gov (United States)

    Huang, Sing-Han; Lo, Yu-Shu; Luo, Yong-Chun; Tseng, Yu-Yao; Yang, Jinn-Moon

    2018-03-19

    One of the crucial steps toward understanding the associations among molecular interactions, pathways, and diseases in a cell is to investigate detailed atomic protein-protein interactions (PPIs) in the structural interactome. Despite the availability of large-scale methods for analyzing PPI networks, these methods often focus on PPI networks built from genome-scale data and/or known experimental PPIs. As a result, they are unable to provide structurally resolved interaction residues and their conservation in PPI networks. Here, we reconstructed a human three-dimensional (3D) structural PPI network (hDiSNet) with detailed atomic binding models and disease-associated mutations by enhancing our PPI families and 3D-domain interologs from 60,618 structural complexes and a complete genome database with 6,352,363 protein sequences across 2274 species. hDiSNet is a scale-free network (γ = 2.05), which consists of 5177 proteins and 19,239 PPIs with 5843 mutations. These 19,239 structurally resolved PPIs not only expanded the number of PPIs compared with the present structural PPI network, but also achieved higher agreement with gene ontology similarities and higher co-expression correlation than the 181,868 experimental PPIs recorded in public databases. Among the 5843 mutations, 1653 and 790 mutations involved in interacting domains and contacting residues, respectively, are highly related to diseases. hDiSNet can provide detailed atomic interactions of human disease and their associated proteins with mutations. Our results show that the disease-related mutations are often located at the contacting residues forming hydrogen bonds or are conserved in the PPI family. In addition, hDiSNet provides insights into the FGFR (EGFR)-MAPK pathway for interpreting the mechanisms of breast cancer and the ErbB signaling pathway in brain cancer. Our results demonstrate that hDiSNet can provide structure-based interaction insights for understanding the mechanisms of disease

  6. Quantitative and systems pharmacology 2. In silico polypharmacology of G protein-coupled receptor ligands via network-based approaches.

    Science.gov (United States)

    Wu, Zengrui; Lu, Weiqiang; Yu, Weiwei; Wang, Tianduanyi; Li, Weihua; Liu, Guixia; Zhang, Hankun; Pang, Xiufeng; Huang, Jin; Liu, Mingyao; Cheng, Feixiong; Tang, Yun

    2018-03-01

    G protein-coupled receptors (GPCRs) are the largest superfamily of membrane receptors, with more than 800 members. Currently, over 30% of the approved drugs target human GPCRs. However, only approximately 30 human GPCRs have had their three-dimensional crystal structures resolved, which limits traditional structure-based drug discovery. Recent advances in network-based systems pharmacology approaches have demonstrated powerful strategies for identifying new targets of GPCR ligands. In this study, we proposed a network-based systems pharmacology framework for comprehensive identification of new drug-target interactions on GPCRs. Specifically, we reconstructed both global and local drug-target interaction networks for human GPCRs. Network analysis on the known drug-target networks showed rational strategies for designing new GPCR ligands and evaluating side effects of the approved GPCR drugs. We further built global and local network-based models for predicting new targets of the known GPCR ligands. An area under the receiver operating characteristic curve of more than 0.96 was obtained for the best network-based models in cross validation. In case studies, we identified that several network-predicted GPCR off-targets (e.g. ADRA2A, ADRA2C and CHRM2) were associated with cardiovascular complications (e.g. bradycardia and palpitations) of the approved GPCR drugs via an integrative analysis of drug-target and off-target-adverse drug event networks. Importantly, we experimentally validated that two newly predicted compounds, AM966 and Ki16425, showed high binding affinities on prostaglandin E2 receptor EP4 subtype with IC50 = 2.67 μM and 6.34 μM, respectively. In summary, this study offers powerful network-based tools for identifying polypharmacology of GPCR ligands in drug discovery and development. Copyright © 2017 Elsevier Ltd. All rights reserved.
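
    The abstract above reports model quality as an area under the receiver operating characteristic curve (AUC). As a minimal sketch of what that number measures, the AUC can be read as the probability that a randomly chosen true drug-target pair scores higher than a randomly chosen non-interacting pair. The function name `roc_auc` and the toy scores are illustrative assumptions, not taken from the study.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    scores: predicted scores, higher means "more likely to interact".
    labels: 1 for a known interaction, 0 for a non-interaction.
    Ties between a positive and a negative count as half a correct pair.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical scores for four candidate drug-target pairs.
toy_scores = [0.9, 0.8, 0.4, 0.3]
toy_labels = [1, 0, 1, 0]
toy_auc = roc_auc(toy_scores, toy_labels)
```

    This pairwise form is quadratic in the number of examples, which is fine for illustration; larger evaluations would normally use a sort-based computation.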

  7. Rational drug repositioning guided by an integrated pharmacological network of protein, disease and drug

    Directory of Open Access Journals (Sweden)

    Lee Hee

    2012-07-01

    Background: The process of drug discovery and development is time-consuming and costly, and the probability of success is low. Therefore, there is rising interest in repositioning existing drugs for new medical indications. When successful, this process reduces the risk of failure and costs associated with de novo drug development. However, in many cases, new indications of existing drugs have been found serendipitously. Thus there is a clear need for establishment of rational methods for drug repositioning. Results: In this study, we have established a database we call “PharmDB” which integrates data associated with disease indications, drug development, and associated proteins, and known interactions extracted from various established databases. To explore linkages of known drugs to diseases of interest from within PharmDB, we designed the Shared Neighborhood Scoring (SNS) algorithm. And to facilitate exploration of tripartite (Drug-Protein-Disease) network, we developed a graphical data visualization software program called phExplorer, which allows us to browse PharmDB data in an interactive and dynamic manner. We validated this knowledge-based tool kit, by identifying a potential application of a hypertension drug, benzthiazide (TBZT), to induce lung cancer cell death. Conclusions: By combining PharmDB, an integrated tripartite database, with Shared Neighborhood Scoring (SNS) algorithm, we developed a knowledge platform to rationally identify new indications for known FDA approved drugs, which can be customized to specific projects using manual curation. The data in PharmDB is open access and can be easily explored with phExplorer and accessed via BioMart web service (http://www.i-pharm.org/, http://biomart.i-pharm.org/).

  8. Rational drug repositioning guided by an integrated pharmacological network of protein, disease and drug.

    Science.gov (United States)

    Lee, Hee Sook; Bae, Taejeong; Lee, Ji-Hyun; Kim, Dae Gyu; Oh, Young Sun; Jang, Yeongjun; Kim, Ji-Tea; Lee, Jong-Jun; Innocenti, Alessio; Supuran, Claudiu T; Chen, Luonan; Rho, Kyoohyoung; Kim, Sunghoon

    2012-07-02

    The process of drug discovery and development is time-consuming and costly, and the probability of success is low. Therefore, there is rising interest in repositioning existing drugs for new medical indications. When successful, this process reduces the risk of failure and costs associated with de novo drug development. However, in many cases, new indications of existing drugs have been found serendipitously. Thus there is a clear need for establishment of rational methods for drug repositioning. In this study, we have established a database we call "PharmDB" which integrates data associated with disease indications, drug development, and associated proteins, and known interactions extracted from various established databases. To explore linkages of known drugs to diseases of interest from within PharmDB, we designed the Shared Neighborhood Scoring (SNS) algorithm. And to facilitate exploration of tripartite (Drug-Protein-Disease) network, we developed a graphical data visualization software program called phExplorer, which allows us to browse PharmDB data in an interactive and dynamic manner. We validated this knowledge-based tool kit, by identifying a potential application of a hypertension drug, benzthiazide (TBZT), to induce lung cancer cell death. By combining PharmDB, an integrated tripartite database, with Shared Neighborhood Scoring (SNS) algorithm, we developed a knowledge platform to rationally identify new indications for known FDA approved drugs, which can be customized to specific projects using manual curation. The data in PharmDB is open access and can be easily explored with phExplorer and accessed via BioMart web service (http://www.i-pharm.org/, http://biomart.i-pharm.org/).

  9. Comparative studies of mitochondrial proteomics reveal an intimate protein network of male sterility in wheat (Triticum aestivum L.)

    Science.gov (United States)

    Wang, Shuping; Zhang, Gaisheng; Zhang, Yingxin; Song, Qilu; Chen, Zheng; Wang, Junsheng; Guo, Jialin; Niu, Na; Wang, Junwei; Ma, Shoucai

    2015-01-01

    Plant male sterility has often been associated with mitochondrial dysfunction; however, the mechanism in wheat (Triticum aestivum L.) has not been elucidated. This study set out to probe the mechanism of physiological male sterility (PHYMS) induced by the chemical hybridizing agent (CHA)-SQ-1, and cytoplasmic male sterility (CMS) of wheat at the proteomic level. A total of 71 differentially expressed mitochondrial proteins were found to be involved in pollen abortion and further identified by MALDI-TOF/TOF MS (matrix-assisted laser desorption/ionization-time of flight/time of flight mass spectrometry). These proteins were implicated in different cellular responses and metabolic processes, with obvious functional tendencies toward the tricarboxylic acid cycle, the mitochondrial electron transport chain, protein synthesis and degradation, oxidation stress, the cell division cycle, and epigenetics. Interactions between identified proteins were demonstrated by bioinformatics analysis, enabling a more complete insight into biological pathways involved in anther abortion and pollen defects. Accordingly, a mitochondria-mediated male sterility protein network in wheat is proposed; this network was further confirmed by physiological data, RT-PCR (real-time PCR), and TUNEL (terminal deoxynucleotidyl transferase-mediated dUTP nick end labelling) assay. The results provide intriguing insights into the metabolic pathway of anther abortion induced by CHA-SQ-1 and also give useful clues to identify the crucial proteins of PHYMS and CMS in wheat. PMID:26136264

  10. Exploration of the pathways and interaction network involved in bladder cancer cell line with knockdown of Opa interacting protein 5.

    Science.gov (United States)

    He, Xuefeng; Ding, Xiang; Wen, Duangai; Hou, Jianquan; Ping, Jigen; He, Jun

    2017-09-01

    In our previous study, we showed that knockdown of Opa interacting protein 5 (OIP5) inhibited cell growth, disturbed the cell cycle and increased cell apoptosis in a bladder cancer (BC) cell line. Our present study aimed to explore the underlying pathways and interaction network involved in the roles of OIP5 in BC. Microarray analysis was conducted to obtain mRNA expression profiles of OIP5 knockdown (shOIP5) and control (shCtrl) BC cell lines. Bioinformatics analyses were performed, including identification of differentially expressed mRNAs (DEGs), protein-protein interaction (PPI) network construction, prediction of biological functions and ingenuity pathways analysis (IPA). Western blotting (WB) was used to validate the protein expression levels of candidate DEGs in the shOIP5 BC cell line. In total, 255 up-regulated and 184 down-regulated DEGs were identified in the shOIP5 group compared with the shCtrl group. In the PPI network, CAND1 and MYC had the highest connectivity with DEGs. The 439 DEGs were significantly enriched in inflammatory response, regulation of cell proliferation, Toll-like receptor signaling pathway, cytokine-cytokine receptor interaction and bladder cancer. In the disease and function enrichment, DEGs were involved in cellular movement, cellular growth and proliferation, cancer, inflammatory response, cell death and survival. In the OIP5 regulatory network, CDH2, IRS1, IRAK3, ID1, TNF, IL6, ITGA6, MYC and SOD2 interacted with OIP5. The WB validation results were compatible with our bioinformatics analyses. The OIP5 interaction network might function as an oncogene in BC progression based on aberrant inflammatory responses. Our study might provide valuable information for investigation of the tumorigenesis mechanism in BC. Copyright © 2017 Elsevier GmbH. All rights reserved.

  11. Diffusion of information throughout the host interactome reveals gene expression variations in network proximity to target proteins of hepatitis C virus.

    Directory of Open Access Journals (Sweden)

    Ettore Mosca

    Hepatitis C virus infection is one of the most common chronic infections in the world, and hepatitis associated with HCV infection is a major risk factor for the development of cirrhosis and hepatocellular carcinoma (HCC). The rapidly growing number of viral-host and host protein-protein interactions is enabling more and more reliable network-based analyses of viral infection supported by omics data. The study of molecular interaction networks helps to elucidate the mechanistic pathways linking HCV molecular activities and the host response that modulates the stepwise hepatocarcinogenic process from preneoplastic lesions (cirrhosis and dysplasia) to HCC. Simulating the impact of HCV-host molecular interactions throughout the host protein-protein interaction (PPI) network, we ranked the host proteins in relation to their network proximity to viral targets. We observed that the set of proteins in the neighborhood of HCV targets in the host interactome is enriched in key players of the host response to HCV infection. In contrast to the HCV targets themselves, subnetworks of proteins in network proximity to HCV targets are significantly enriched in proteins reported as differentially expressed in preneoplastic and neoplastic liver samples by two independent studies. Using multi-objective optimization, we extracted subnetworks that are simultaneously "guilt-by-association" with HCV proteins and enriched in differentially expressed proteins. These subnetworks contain established, recently proposed and novel candidate proteins for the regulation of the mechanisms of the liver cell response to chronic HCV infection.
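
    The abstract above ranks host proteins by their network proximity to viral targets but does not state which diffusion scheme was used. The sketch below assumes one common choice, a random walk with restart, purely for illustration. The function names, the restart probability of 0.3, and the four-node toy network are all hypothetical.

```python
def random_walk_with_restart(adj, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Stationary visiting probabilities of a walker that restarts at the seeds.

    adj: dict mapping each node to a list of its neighbours (undirected graph).
    seeds: set of nodes where the walker starts and restarts (e.g. viral targets).
    """
    nodes = sorted(adj)
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        # With probability `restart` the walker jumps back to a seed.
        nxt = {n: restart * p0[n] for n in nodes}
        for u in nodes:
            mass = (1.0 - restart) * p[u]
            if adj[u]:
                # Otherwise it moves to a uniformly chosen neighbour.
                share = mass / len(adj[u])
                for v in adj[u]:
                    nxt[v] += share
            else:
                # Isolated node: send its mass back to the seeds.
                for n in nodes:
                    nxt[n] += mass * p0[n]
        change = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if change < tol:
            break
    return p


def rank_by_proximity(adj, seeds):
    """Non-seed nodes ordered from closest to farthest from the seeds."""
    scores = random_walk_with_restart(adj, seeds)
    others = [n for n in adj if n not in seeds]
    return sorted(others, key=lambda n: scores[n], reverse=True)


# Toy host network: T is a hypothetical viral target; A, B, C lie on a chain.
toy_ppi = {"T": ["A"], "A": ["T", "B"], "B": ["A", "C"], "C": ["B"]}
toy_ranking = rank_by_proximity(toy_ppi, {"T"})
```

    On this chain the ranking simply follows distance from the target; on a real interactome the scores also reflect how many short paths connect a protein to the seed set.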

  12. Creating a specialist protein resource network: a meeting report for the protein bioinformatics and community resources retreat

    NARCIS (Netherlands)

    Babbitt, P.C.; Bagos, P.G.; Bairoch, A.; Bateman, A.; Chatonnet, A.; Chen, M.J.; Craik, D.J.; Finn, R.D.; Gloriam, D.; Haft, D.H.; Henrissat, B.; Holliday, G.L.; Isberg, V.; Kaas, Q.; Landsman, D.; Lenfant, N.; Manning, G.; Nagano, N.; Srinivasan, N.; O'Donovan, C.; Pruitt, K.D.; Sowdhamini, R.; Rawlings, N.D.; Saier, M.H., Jr.; Sharman, J.L.; Spedding, M.; Tsirigos, K.D.; Vastermark, A.; Vriend, G.

    2015-01-01

    During 11-12 August 2014, a Protein Bioinformatics and Community Resources Retreat was held at the Wellcome Trust Genome Campus in Hinxton, UK. This meeting brought together the principal investigators of several specialized protein resources (such as CAZy, TCDB and MEROPS) as well as those from

  13. RegPhos 2.0: an updated resource to explore protein kinase-substrate phosphorylation networks in mammals.

    Science.gov (United States)

    Huang, Kai-Yao; Wu, Hsin-Yi; Chen, Yi-Ju; Lu, Cheng-Tsung; Su, Min-Gang; Hsieh, Yun-Chung; Tsai, Chih-Ming; Lin, Kuo-I; Huang, Hsien-Da; Lee, Tzong-Yi; Chen, Yu-Ju

    2014-01-01

    Protein phosphorylation catalyzed by kinases plays crucial roles in regulating a variety of intracellular processes. Owing to an increasing number of in vivo phosphorylation sites that have been identified by mass spectrometry (MS)-based proteomics, RegPhos, available online at http://csb.cse.yzu.edu.tw/RegPhos2/, was developed to explore protein phosphorylation networks in human. In this update, we not only enhance the data content in human but also investigate kinase-substrate phosphorylation networks in mouse and rat. The experimentally validated phosphorylation sites as well as their catalytic kinases were extracted from public resources, and MS/MS phosphopeptides were manually curated from research articles. RegPhos 2.0 aims to provide a more comprehensive view of intracellular signaling networks by integrating the information of metabolic pathways and protein-protein interactions. In a case study analyzing the phosphoproteome profile of time-dependent cell activation obtained from liquid chromatography-mass spectrometry (LC-MS/MS) analysis, RegPhos deciphered not only the consistent scheme in the B cell receptor (BCR) signaling pathway but also novel regulatory molecules that may be involved in it. To help users efficiently identify candidate biomarkers in cancers, 30 microarray experiments, including 39 cancerous versus normal cells, were analyzed for detecting cancer-specific expressed genes coding for kinases and their substrates. Furthermore, this update features an improved web interface to facilitate convenient access to the exploration of phosphorylation networks for a group of genes/proteins. Database URL: http://csb.cse.yzu.edu.tw/RegPhos2/

  14. Riboproteomics: A versatile approach for the identification of host protein interaction network in plant pathogenic noncoding RNAs.

    Directory of Open Access Journals (Sweden)

    Sonali Chaturvedi

    Pathogenic or non-pathogenic small (17 to 30 nt) and long (>200 nt) non-coding RNAs (ncRNAs) have been implicated in the regulation of gene expression at the transcriptional, post-transcriptional and epigenetic levels by interacting with host proteins. However, the lack of a suitable experimental system precludes the identification and evaluation of the functional significance of host proteins interacting with ncRNAs. In this study, we present a first report on the application of riboproteomics to identify host proteins interacting with a small, highly pathogenic, noncoding satellite RNA (sat-RNA) associated with Cucumber mosaic virus, the helper virus (HV). RNA affinity beads containing sat-RNA transcripts of (+)- or (-)-sense covalently coupled to cyanogen bromide activated sepharose beads were incubated with total protein extracts from either healthy or HV-infected Nicotiana benthamiana leaves. RNA-protein complexes bound to the beads were eluted and subjected to MudPIT analysis. The bioinformatics programs PANTHER classification and WoLF-PSORT were used to further classify the identified host proteins in each case based on their functionality and subcellular distribution. Finally, we observed that the host protein network interacting with plus- and minus-strand transcripts of sat-RNA, in the presence or absence of HV, is distinct, and the global interactome of host proteins interacting with sat-RNA in either of the orientations is very different.

  15. Recovering kinetics from a simplified protein folding model using replica exchange simulations: a kinetic network and effective stochastic dynamics.

    Science.gov (United States)

    Zheng, Weihua; Andrec, Michael; Gallicchio, Emilio; Levy, Ronald M

    2009-08-27

    We present an approach to recover kinetics from a simplified protein folding model at different temperatures using the combined power of replica exchange (RE), a kinetic network, and effective stochastic dynamics. While RE simulations generate a large set of discrete states with the correct thermodynamics, kinetic information is lost due to the random exchange of temperatures. We show how we can recover the kinetics of a 2D continuous potential with an entropic barrier by using RE-generated discrete states as nodes of a kinetic network. By choosing the neighbors and the microscopic rates between the neighbors appropriately, the correct kinetics of the system can be recovered by running a kinetic simulation on the network. We fine-tune the parameters of the network by comparison with the effective drift velocities and diffusion coefficients of the system determined from short-time stochastic trajectories. One of the advantages of the kinetic network model is that the network can be built on a high-dimensional discretized state space, which can consist of multiple paths not consistent with a single reaction coordinate.
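
    The kinetic simulation on a network described above can be pictured as a continuous-time jump process between discrete states. The following is a minimal sketch assuming a Gillespie-style (kinetic Monte Carlo) scheme; the three-state network and its rates are invented for illustration and are not the states or rates constructed in the paper.

```python
import random


def first_passage_time(rates, start, target, rng):
    """Simulate one trajectory on a kinetic network until `target` is reached.

    rates: dict mapping each state to {neighbour: rate constant}.
    Returns the elapsed time of the trajectory.
    """
    state, t = start, 0.0
    while state != target:
        out = rates[state]
        total = sum(out.values())
        # Waiting time in the current state is exponential with the total exit rate.
        t += rng.expovariate(total)
        # Pick the next state with probability proportional to its rate.
        r = rng.random() * total
        acc = 0.0
        for nbr, k in out.items():
            acc += k
            if r < acc:
                state = nbr
                break
        else:
            state = nbr  # guard against floating-point round-off
    return t


def mean_first_passage_time(rates, start, target, n_traj=20000, seed=0):
    """Average first passage time over many independent trajectories."""
    rng = random.Random(seed)
    total = sum(first_passage_time(rates, start, target, rng) for _ in range(n_traj))
    return total / n_traj


# Hypothetical three-state chain 0 <-> 1 -> 2 with made-up rate constants.
toy_rates = {0: {1: 2.0}, 1: {0: 1.0, 2: 1.0}, 2: {}}
toy_mfpt = mean_first_passage_time(toy_rates, start=0, target=2)
```

    For this toy chain the mean first passage time can be solved by hand (t0 = 1/2 + t1 and t1 = 1/2 + t0/2, giving t0 = 2), which offers a quick check that the simulated kinetics match the rates placed on the network.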

  16. An approach to improve kernel-based Protein-Protein Interaction extraction by learning from large-scale network data.

    Science.gov (United States)

    Li, Lishuang; Guo, Rui; Jiang, Zhenchao; Huang, Degen

    2015-07-15

    Protein-Protein Interaction extraction (PPIe) from the biomedical literature is an important task in biomedical text mining and has achieved desirable results on annotated datasets. However, traditional machine learning methods for PPIe suffer badly from vocabulary gaps and data sparseness, which weaken classification performance. In this work, an approach capturing external information from web-based data is introduced to address these problems and boost the existing methods. The approach involves three kinds of word representation techniques: distributed representation, vector clustering and Brown clusters. Experimental results show that our method outperforms the state-of-the-art methods on five publicly available corpora. Our code and data are available at: http://chaoslog.com/improving-kernel-based-protein-protein-interaction-extraction-by-unsupervised-word-representation-codes-and-data.html. Copyright © 2015 Elsevier Inc. All rights reserved.

  17. A yeast-based genomic strategy highlights the cell protein networks altered by FTase inhibitor peptidomimetics

    Directory of Open Access Journals (Sweden)

    Porcu Giampiero

    2010-07-01

    Background: Farnesyltransferase inhibitors (FTIs) are anticancer agents developed to inhibit Ras oncoprotein activities. FTIs of different chemical structure act via a conserved mechanism in eukaryotic cells. They have low toxicity and are active on a wide range of tumors in cellular and animal models, independently of the Ras activation state. Their ultimate mechanism of action, however, remains undetermined. FTase has hundreds of substrates in human cells, many of which play a pivotal role in either tumorigenesis or in pro-survival pathways. This lack of knowledge probably accounts for the failure of FTIs at clinical stage III for most of the malignancies treated, with the notable exception of haematological malignancies. Understanding which cellular pathways are the ultimate targets of FTIs in different tumor types and the basis of FTI resistance is required to improve the efficacy of FTIs in cancer treatment. Results: Here we used a yeast-based cellular assay to define the transcriptional changes consequent to FTI peptidomimetic administration in conditions that do not substantially change Ras membrane/cytosol distribution. Yeast and cancer cell lines were used to validate the results of the network analysis. The transcriptome of yeast cells treated with FTase inhibitor I was compared with that of untreated cells and with an isogenic strain genetically inhibited for FTase activity (Δram1). Cells treated with GGTI-298 were analyzed in a parallel study to validate the specificity of the FTI response. Network analysis, based on gene ontology criteria, identified a cell cycle gene cluster up-regulated by FTI treatment that has the Aurora A kinase IPL1 and the checkpoint protein MAD2 as hubs. Moreover, TORC1-S6K-downstream effectors were found to be down-regulated in yeast and mammalian FTI-treated cells. Notably only FTIs, but not genetic inhibition of FTase, elicited up-regulation of ABC/transporters. Conclusions: This work provides a view

  18. Distinct roles of a tyrosine-associated hydrogen-bond network in fine-tuning the structure and function of heme proteins: two cases designed for myoglobin.

    Science.gov (United States)

    Liao, Fei; Yuan, Hong; Du, Ke-Jie; You, Yong; Gao, Shu-Qin; Wen, Ge-Bo; Lin, Ying-Wu; Tan, Xiangshi

    2016-10-20

    A hydrogen-bond (H-bond) network, specifically a Tyr-associated H-bond network, plays key roles in regulating the structure and function of proteins, as exemplified by abundant heme proteins in nature. To explore an approach for fine-tuning the structure and function of artificial heme proteins, we herein used myoglobin (Mb) as a model protein and introduced a Tyr residue in the secondary sphere of the heme active site at two different positions (107 and 138). We performed X-ray crystallography, UV-Vis spectroscopy, stopped-flow kinetics, and electron paramagnetic resonance (EPR) studies for the two single mutants, I107Y Mb and F138Y Mb, and compared the results with those of wild-type Mb under the same conditions. The results showed that both Tyr107 and Tyr138 form a distinct H-bond network involving water molecules and neighboring residues, which fine-tunes ligand binding to the heme iron and enhances the protein stability, respectively. Moreover, the Tyr107-associated H-bond network was shown to fine-tune both H2O2 binding and activation. With two cases demonstrated for Mb, this study suggests that the Tyr-associated H-bond network has distinct roles in regulating the protein structure, properties and functions, depending on its location in the protein scaffold. Therefore, it is possible to design a Tyr-associated H-bond network in general to create other artificial heme proteins with improved properties and functions.

  19. Rational Design of Thermodynamic and Kinetic Binding Profiles by Optimizing Surface Water Networks Coating Protein-Bound Ligands.

    Science.gov (United States)

    Krimmer, Stefan G; Cramer, Jonathan; Betz, Michael; Fridh, Veronica; Karlsson, Robert; Heine, Andreas; Klebe, Gerhard

    2016-12-08

    A previously studied congeneric series of thermolysin inhibitors addressing the solvent-accessible S2′ pocket with different hydrophobic substituents showed modulations of the surface water layers coating the protein-bound inhibitors. Increasing stabilization of water molecules resulted in an enthalpically more favorable binding signature, overall enhancing affinity. Based on this observation, we optimized the series by designing tailored P2′ substituents to improve and further stabilize the surface water network. MD simulations were applied to predict the putative water pattern around the bound ligands. Subsequently, the inhibitors were synthesized and characterized by high-resolution crystallography, microcalorimetry, and surface plasmon resonance. One of the designed inhibitors established the most pronounced water network of all inhibitors tested so far, composed of several fused water polygons, and showed 50-fold affinity enhancement with respect to the original methylated parent ligand. Notably, the inhibitor forming the most perfect water network also showed significantly prolonged residence time compared to the other tested inhibitors.

  20. Analysis of Drug Design for a Selection of G Protein-Coupled Neuro-Receptors Using Neural Network Techniques

    DEFF Research Database (Denmark)

    Agerskov, Claus; Mortensen, Rasmus M.; Bohr, Henrik G.

    2015-01-01

    A study is presented on how well possible drug-molecules can be predicted with respect to their function and binding to a selection of neuro-receptors by the use of artificial neural networks. The ligands investigated in this study are chosen to be corresponding to the G protein-coupled receptors...... computational tools, able to aid in drug-design in a fast and cheap fashion, compared to conventional pharmacological techniques....

  1. Cooperation of the ER-shaping proteins atlastin, lunapark, and reticulons to generate a tubular membrane network

    OpenAIRE

    Wang, Songyu; Tukachinsky, Hanna; Romano, Fabian B; Rapoport, Tom A

    2016-01-01

    eLife digest The endoplasmic reticulum is a compartment within the cells of plants, animals and other eukaryotes. This compartment plays a number of roles within cells, for example, serving as the site where many proteins and fat molecules are built. Most often the endoplasmic reticulum exists as a network of thin tubules. However, this shape changes during the lifetime of a single cell, and the endoplasmic reticulum converts into flattened structures known as sheets when the cell divides. Th...

  2. Nine steps to proteomic wisdom: A practical guide to using protein-protein interaction networks and molecular pathways as a framework for interpreting disease proteomic profiles.

    Science.gov (United States)

    Isserlin, Ruth; Emili, Andrew

    2007-09-01

    A major aim of proteomic profiling of disease is to uncover the mechanistic basis of a given pathology. High-throughput experimental techniques continue to advance rapidly, but are still plagued by high rates of false negatives, false positives, and other spurious findings. By reducing a disease profile to a subset of differentially expressed proteins and determining functional over-representation, one can often make a reasonable first-pass assessment as to what might be happening in disease. Integrating mRNA expression patterns together with prior knowledge of protein-protein interaction networks and biological pathway information goes a step further, providing clues into the core processes that are aberrant in the disease state, and indicating which cellular functions are activated or repressed as a maladaptive pathophysiological response. This multi-step framework allows one to hypothesize as to possible cause and effect of pathology, and highlights potentially instructive pathways or sub-networks for subsequent experimental validation. Indeed, efficiently exploiting data regarding the myriad of physical and genetic interactions among expressed gene products, in parallel with the systematic sampling of genetic variation among diverse human populations, promises to revolutionize our current understanding of disease action at a deeper molecular level. Copyright © 2007 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
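
    The "functional over-representation" step mentioned above is commonly scored with a one-sided hypergeometric test. The following is a minimal sketch under that assumption; the abstract does not say which statistic the authors use, and the function name and the example counts are hypothetical.

```python
from math import comb


def enrichment_p_value(population: int, annotated: int, selected: int, overlap: int) -> float:
    """One-sided hypergeometric tail P(X >= overlap).

    population: number of proteins in the background set
    annotated:  background proteins carrying the functional annotation
    selected:   differentially expressed proteins in the disease profile
    overlap:    selected proteins that carry the annotation
    """
    upper = min(annotated, selected)
    total = comb(population, selected)
    # Sum the probability of every outcome at least as extreme as the observed overlap.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: 40 of 200 selected proteins share an annotation
# carried by 300 of 5000 background proteins.
p = enrichment_p_value(population=5000, annotated=300, selected=200, overlap=40)
print(f"p = {p:.3g}")
```

    When many annotations are tested at once, the resulting p-values would normally be corrected for multiple testing before interpretation.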

  3. The protein network surrounding the human telomere repeat binding factors TRF1, TRF2, and POT1

    Energy Technology Data Exchange (ETDEWEB)

    Giannone, Richard J [ORNL; McDonald, W Hayes [ORNL; Hurst, Gregory {Greg} B [ORNL; Shen, Rong-Fong [National Institute on Aging, National Institutes of Health; Wang, Yisong [ORNL; Liu, Yie [National Institute on Aging, Baltimore

    2010-01-01

    Telomere integrity (including telomere length and capping) is critical in overall genomic stability. Telomere repeat binding factors and their associated proteins play vital roles in telomere length regulation and end protection. In this study, we explore the protein network surrounding telomere repeat binding factors, TRF1, TRF2, and POT1 using dual-tag affinity purification in combination with multidimensional protein identification technology liquid chromatography - tandem mass spectrometry (MudPIT LC-MS/MS). After control subtraction and data filtering, we found that TRF2 and POT1 co-purified all six members of the telomere protein complex, while TRF1 identified five of six components at frequencies that lend evidence towards the currently accepted telomere architecture. Many of the known TRF1 or TRF2 interacting proteins were also identified. Moreover, putative associating partners identified for each of the three core components fell into functional categories such as DNA damage repair, ubiquitination, chromosome cohesion, chromatin modification/remodeling, DNA replication, cell cycle and transcription regulation, nucleotide metabolism, RNA processing, and nuclear transport. These putative protein-protein associations may participate in different biological processes at telomeres or, intriguingly, outside telomeres.

  4. The protein network surrounding the human telomere repeat binding factors TRF1, TRF2, and POT1.

    Directory of Open Access Journals (Sweden)

    Richard J Giannone

    2010-08-01

    Telomere integrity (including telomere length and capping) is critical in overall genomic stability. Telomere repeat binding factors and their associated proteins play vital roles in telomere length regulation and end protection. In this study, we explore the protein network surrounding telomere repeat binding factors, TRF1, TRF2, and POT1 using dual-tag affinity purification in combination with multidimensional protein identification technology liquid chromatography-tandem mass spectrometry (MudPIT LC-MS/MS). After control subtraction and data filtering, we found that TRF2 and POT1 co-purified all six members of the telomere protein complex, while TRF1 identified five of six components at frequencies that lend evidence towards the currently accepted telomere architecture. Many of the known TRF1 or TRF2 interacting proteins were also identified. Moreover, putative associating partners identified for each of the three core components fell into functional categories such as DNA damage repair, ubiquitination, chromosome cohesion, chromatin modification/remodeling, DNA replication, cell cycle and transcription regulation, nucleotide metabolism, RNA processing, and nuclear transport. These putative protein-protein associations may participate in different biological processes at telomeres or, intriguingly, outside telomeres.

  5. A network of 2-4 nm filaments found in sea urchin smooth muscle. Protein constituents and in situ localization.

    Science.gov (United States)

    Pureur, R P; Coffe, G; Soyer-Gobillard, M O; de Billy, F; Pudles, J

    1986-01-01

    In this report, the co-isolation of two proteins from sea urchin smooth muscle, with apparent molecular weights (Mr) of 54 and 56 kD respectively as determined by SDS-PAGE, is described. Like the intermediate filament proteins, these two proteins are insoluble in high ionic strength buffer solution. Two-dimensional gel electrophoresis and immunological methods show that these proteins are not related (by these criteria) to rat smooth muscle desmin (54 kD) or vimentin (56 kD). Furthermore, in conditions where both desmin and vimentin assemble in vitro into 10 nm filaments, the sea urchin smooth muscle proteins do not assemble into filaments. Ultrastructural studies on the sea urchin smooth muscle cell show that the organization of the thin and thick filaments resembles that described in vertebrate smooth muscle. However, instead of 10 nm filaments, a network of filaments 2-4 nm in diameter is revealed upon removal of the thin and thick filaments by 0.6 M KCl treatment. By indirect immunofluorescence microscopy, and in particular by immunocytochemical electron microscopy studies on the sea urchin smooth muscle cell, it is shown that the antibodies raised against both the 54 and 56 kD proteins appear to specifically label these 2-4 nm filaments. These findings indicate that both the 54 and 56 kD proteins might be constituents of this category of filaments. The possible significance of this new cytoskeletal element, which we have named echinonematin filaments, is discussed.

  6. Protein contact order prediction from primary sequences

    Directory of Open Access Journals (Sweden)

    Wishart David S

    2008-05-01

    Background: Contact order is a topological descriptor that has been shown to be correlated with several interesting protein properties such as protein folding rates and protein transition state placements. Contact order has also been used to select for viable protein folds from ab initio protein structure prediction programs. For proteins of known three-dimensional structure, their contact order can be calculated directly. However, for proteins with unknown three-dimensional structure, there is no effective prediction method currently available. Results: In this paper, we propose several simple yet very effective methods to predict contact order from the amino acid sequence only. One set of methods is based on a weighted linear combination of predicted secondary structure content and amino acid composition. Depending on the number of components used in these equations it is possible to achieve a correlation coefficient of 0.857–0.870 between the observed and predicted contact order. A second method, based on sequence similarity to known three-dimensional structures, is able to achieve a correlation coefficient of 0.977. We have also developed a much more robust implementation for calculating contact order directly from PDB coordinates that works for > 99% of PDB files. All of these contact order predictors and calculators have been implemented as a web server (see Availability and requirements section for URL). Conclusion: Protein contact order can be effectively predicted from the primary sequence, in the absence of three-dimensional structure. Three factors, percentage of residues in alpha helices, percentage of residues in beta strands, and sequence length, appear to be strongly correlated with the absolute contact order.
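
    For proteins of known structure, the direct calculation mentioned above can be sketched as follows. This is a simplified illustration, not the authors' implementation: it assumes one coordinate per residue (for example the C-alpha atom) and a hypothetical 8 Å cutoff, takes absolute contact order as the mean sequence separation over all contacting residue pairs, and takes relative contact order as that value divided by the chain length.

```python
from math import dist


def contact_order(coords, cutoff=8.0, min_separation=1):
    """Return (absolute, relative) contact order for one chain.

    coords: list of (x, y, z) tuples, one per residue, in sequence order.
    cutoff: distance in angstroms below which two residues count as a contact
            (assumed value; published definitions differ in atom choice and cutoff).
    """
    n = len(coords)
    separations = [
        j - i
        for i in range(n)
        for j in range(i + min_separation, n)
        if dist(coords[i], coords[j]) <= cutoff
    ]
    if not separations:
        return 0.0, 0.0
    absolute = sum(separations) / len(separations)
    return absolute, absolute / n


# Toy example: four residues spaced 3.8 angstroms apart on a straight line.
toy_chain = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
absolute_co, relative_co = contact_order(toy_chain)
print(absolute_co, relative_co)
```

    A production calculator would also need to handle missing residues, alternate locations and multi-chain files, which is presumably where the robustness described in the abstract comes in.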

  7. Cooperation of the ER-shaping proteins atlastin, lunapark, and reticulons to generate a tubular membrane network.

    Science.gov (United States)

    Wang, Songyu; Tukachinsky, Hanna; Romano, Fabian B; Rapoport, Tom A

    2016-09-13

    In higher eukaryotes, the endoplasmic reticulum (ER) contains a network of membrane tubules, which transitions into sheets during mitosis. Network formation involves curvature-stabilizing proteins, including the reticulons (Rtns), as well as the membrane-fusing GTPase atlastin (ATL) and the lunapark protein (Lnp). Here, we have analyzed how these proteins cooperate. ATL is needed to not only form, but also maintain, the ER network. Maintenance requires a balance between ATL and Rtn, as too little ATL activity or too high Rtn4a concentrations cause ER fragmentation. Lnp only affects the abundance of three-way junctions and tubules. We suggest a model in which ATL-mediated fusion counteracts the instability of free tubule ends. ATL tethers and fuses tubules stabilized by the Rtns, and transiently sits in newly formed three-way junctions. Lnp subsequently moves into the junctional sheets and forms oligomers. Lnp is inactivated by mitotic phosphorylation, which contributes to the tubule-to-sheet conversion of the ER.

  8. Networking

    OpenAIRE

    Rauno Lindholm, Daniel; Boisen Devantier, Lykke; Nyborg, Karoline Lykke; Høgsbro, Andreas; Fries, de; Skovlund, Louise

    2016-01-01

    The purpose of this project was to examine which factors have influenced the presumed increase in the use of networking among academics on the labour market, and how this influence is expressed. On the basis of the influence of globalization on the labour market, it can be concluded that globalization has transformed the labour market into a market based on the organization of networks. In this new organization there is a greater emphasis on employees having social qualificati...

  9. Crystal structure of an endotoxin-neutralizing protein from the horseshoe crab, Limulus anti-LPS factor, at 1.5 A resolution.

    Science.gov (United States)

    Hoess, A; Watson, S; Siber, G R; Liddington, R

    1993-09-01

    Lipopolysaccharide (LPS), or endotoxin, is the major mediator of septic shock, a serious complication of Gram-negative bacterial infections in humans. Molecules that bind LPS and neutralize its biological effects or enhance its clearance could have important clinical applications. Limulus anti-LPS factor (LALF) binds LPS tightly, and, in animal models, reduces mortality when administered before or after LPS challenge or bacterial infection. Here we present the high resolution structure of a recombinant LALF. It has a single domain consisting of three alpha-helices packed against a four-stranded beta-sheet. The wedge-shaped molecule has a striking charge distribution and amphipathicity that suggest how it can insert into membranes. The binding site for LPS probably involves an extended amphipathic loop, and we propose that two mammalian LPS-binding proteins will have a similar loop. The amphipathic loop structure may be used in the design of molecules with therapeutic properties against septic shock.

  10. A Novel Type III Endosome Transmembrane Protein, TEMP

    Directory of Open Access Journals (Sweden)

    Rohan D. Teasdale

    2012-11-01

    As part of a high-throughput subcellular localisation project, the protein encoded by the RIKEN mouse cDNA 2610528J11 was expressed and identified to be associated with both endosomes and the plasma membrane. Based on this, we have assigned the name TEMP for Type III Endosome Membrane Protein. TEMP encodes a short protein of 111 amino acids with a single, alpha-helical transmembrane domain. Experimental analysis of its membrane topology demonstrated it is a Type III membrane protein with the amino-terminus in the lumenal, or extracellular region, and the carboxy-terminus in the cytoplasm. In addition to the plasma membrane, TEMP was localized to Rab5 positive early endosomes, Rab5/Rab11 positive recycling endosomes but not Rab7 positive late endosomes. Video microscopy in living cells confirmed TEMP's plasma membrane localization and identified the intracellular endosome compartments to be tubulovesicular. Overexpression of TEMP resulted in the early/recycling endosomes clustering at the cell periphery that was dependent on the presence of intact microtubules. The cellular function of TEMP cannot be inferred based on bioinformatics comparison, but its cellular distribution between early/recycling endosomes and the plasma membrane suggests a role in membrane transport.

  11. Neural Network Enhanced Structure Determination of Osteoporosis, Immune System, and Radiation Repair Proteins, Phase I

    Data.gov (United States)

    National Aeronautics and Space Administration — The proposed innovation will utilize self learning neural network technology to determine the structure of osteoporosis, immune system disease, and excess radiation...

  12. Gene expression profiles and protein–protein interaction network analysis in AIDS patients with HIV-associated encephalitis and dementia

    Directory of Open Access Journals (Sweden)

    Shityakov S

    2015-11-01

    Sergey Shityakov (1), Thomas Dandekar (2), Carola Förster (1). (1) Department of Anesthesia and Critical Care, (2) Department of Bioinformatics, University of Würzburg, Würzburg, Germany. Abstract: Central nervous system dysfunction is an important cause of morbidity and mortality in patients with human immunodeficiency virus type 1 (HIV-1) infection and acquired immunodeficiency virus syndrome (AIDS). Patients with AIDS are usually affected by HIV-associated encephalitis (HIVE) with viral replication limited to cells of monocyte origin. To examine the molecular mechanisms underlying HIVE-induced dementia, the GSE4755 Affymetrix data were obtained from the Gene Expression Omnibus database and the differentially expressed genes (DEGs) between the samples from AIDS patients with and without apparent features of HIVE-induced dementia were identified. In addition, protein–protein interaction networks were constructed by mapping DEGs into protein–protein interaction data to identify the pathways that these DEGs are involved in. The results revealed that the expression of 1,528 DEGs is mainly involved in the immune response, regulation of cell proliferation, cellular response to inflammation, signal transduction, and viral replication cycle. Heat-shock protein alpha, class A member 1 (HSP90AA1), and fibronectin 1 were detected as hub nodes with degree values >130. In conclusion, the results indicate that HSP90A and fibronectin 1 play important roles in HIVE pathogenesis. Keywords: microarray, human immunodeficiency virus, differentially expressed genes, protein–protein interaction network, gene ontology, encephalitis, dementia

  13. A dynamic and adaptive network of cytosolic interactions governs protein export by the T3SS injectisome.

    Science.gov (United States)

    Diepold, Andreas; Sezgin, Erdinc; Huseyin, Miles; Mortimer, Thomas; Eggeling, Christian; Armitage, Judith P

    2017-06-27

    Many bacteria use a type III secretion system (T3SS) to inject effector proteins into host cells. Selection and export of the effectors is controlled by a set of soluble proteins at the cytosolic interface of the membrane spanning type III secretion 'injectisome'. Combining fluorescence microscopy, biochemical interaction studies and fluorescence correlation spectroscopy, we show that in live Yersinia enterocolitica bacteria these soluble proteins form complexes both at the injectisome and in the cytosol. Binding to the injectisome stabilizes these cytosolic complexes, whereas the free cytosolic complexes, which include the type III secretion ATPase, constitute a highly dynamic and adaptive network. The extracellular calcium concentration, which triggers activation of the T3SS, directly influences the cytosolic complexes, possibly through the essential component SctK/YscK, revealing a potential mechanism involved in the regulation of type III secretion.

  14. Prediction of the anti-inflammatory mechanisms of curcumin by module-based protein interaction network analysis.

    Science.gov (United States)

    Gan, Yanxiong; Zheng, Shichao; Baak, Jan P A; Zhao, Silei; Zheng, Yongfeng; Luo, Nini; Liao, Wan; Fu, Chaomei

    2015-11-01

    Curcumin, the medically active component from Curcuma longa (Turmeric), is widely used to treat inflammatory diseases. Protein interaction network (PIN) analysis was used to predict its mechanisms of molecular action. Targets of curcumin were obtained based on ChEMBL and STITCH databases. Protein-protein interactions (PPIs) were extracted from the String database. The PIN of curcumin was constructed by Cytoscape and the function modules identified by gene ontology (GO) enrichment analysis based on molecular complex detection (MCODE). A PIN of curcumin with 482 nodes and 1688 interactions was constructed, which has scale-free, small world and modular properties. Based on analysis of these function modules, the mechanism of curcumin is proposed. Two modules were found to be intimately associated with inflammation. With function modules analysis, the anti-inflammatory effects of curcumin were related to SMAD, ERG and mediation by the TLR family. TLR9 may be a potential target of curcumin to treat inflammation.

  15. MOCASSIN-prot: A multi-objective clustering approach for protein similarity networks

    Science.gov (United States)

    Proteins often include multiple conserved domains. Various evolutionary events including duplication and loss of domains, domain shuffling, as well as sequence divergence contribute to generating complexities in protein structures. A large variation exists, for example, in the numbers, combinations,...

  16. Near-atomic cryo-EM imaging of a small protein displayed on a designed scaffolding system.

    Science.gov (United States)

    Liu, Yuxi; Gonen, Shane; Gonen, Tamir; Yeates, Todd O

    2018-03-27

    Current single-particle cryo-electron microscopy (cryo-EM) techniques can produce images of large protein assemblies and macromolecular complexes at atomic level detail without the need for crystal growth. However, proteins of smaller size, typical of those found throughout the cell, are not presently amenable to detailed structural elucidation by cryo-EM. Here we use protein design to create a modular, symmetrical scaffolding system to make protein molecules of typical size suitable for cryo-EM. Using a rigid continuous alpha helical linker, we connect a small 17-kDa protein (DARPin) to a protein subunit that was designed to self-assemble into a cage with cubic symmetry. We show that the resulting construct is amenable to structural analysis by single-particle cryo-EM, allowing us to identify and solve the structure of the attached small protein at near-atomic detail, ranging from 3.5- to 5-Å resolution. The result demonstrates that proteins considerably smaller than the theoretical limit of 50 kDa for cryo-EM can be visualized clearly when arrayed in a rigid fashion on a symmetric designed protein scaffold. Furthermore, because the amino acid sequence of a DARPin can be chosen to confer tight binding to various other protein or nucleic acid molecules, the system provides a future route for imaging diverse macromolecules, potentially broadening the application of cryo-EM to proteins of typical size in the cell.

  17. [Pharmacological mechanism analysis of oligopeptide from Pinctada fucata based on in silico proteolysis and protein interaction network].

    Science.gov (United States)

    Chen, Yan-Kun; Qiao, Lian-Sheng; Huo, Xiao-Qian; Zhang, Xu; Han, Na; Zhang, Yan-Ling

    2017-09-01

    Pinctada fucata oligopeptide is one of key pharmaceutical effective constituents of P. fucata. It is significant to analyze its pharmacological effect and mechanism. This study aims to discover the potential oligopeptides from P. fucata and analyze the mechanism of P. fucata oligopeptide based on in silico technologies and protein interaction network(PIN). First, main protein sequences of P. fucata were collected, and oligopeptides were obtained using in silico gastrointestinal tract proteolysis. Then, key potential targets of P. fucata oligopeptides were obtained through pharmacophore screening. The protein-protein interaction(PPI) of targets was achieved and implemented to construct PIN and analyze the mechanism of P. fucata oligopeptides. P. fucata oligopeptide database was constructed based on in silico technologies, including 458 oligopeptides. Twelve modules were identified from PIN by a graph theoretic clustering algorithm Molecular Complex Detection(MCODE) and analyzed by Gene ontology(GO) enrichment. The results indicated that P. fucata oligopeptides have an effect in treating neurological diseases, such as Alzheimer's disease. In silico proteolysis could be used to analyze the protein sequences of traditional Chinese medicine(TCM). According to the combination of in silico proteolysis and PIN, the biological activity of oligopeptides could be interpreted rapidly based on the known TCM protein sequence. The study provides the methodology basis for rapidly and efficiently implementing the mechanism analysis of TCM oligopeptides. Copyright© by the Chinese Pharmaceutical Association.
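
    The in silico proteolysis step described above amounts to cutting each protein sequence at enzyme-specific cleavage sites. The sketch below illustrates the idea with a single simplified trypsin-like rule (cleave after K or R unless the next residue is P); the abstract does not state which gastrointestinal enzymes or rules were applied, so the rule, the function name and the example sequence are all assumptions.

```python
def digest(sequence: str, cleave_after: str = "KR", blocked_by: str = "P") -> list[str]:
    """Split a protein sequence at every position that matches the cleavage rule.

    A bond is cut after residue i when sequence[i] is in `cleave_after`
    and the following residue is not in `blocked_by`.
    """
    peptides = []
    start = 0
    for i in range(len(sequence) - 1):
        if sequence[i] in cleave_after and sequence[i + 1] not in blocked_by:
            peptides.append(sequence[start:i + 1])
            start = i + 1
    # Whatever remains after the last cut is the C-terminal peptide.
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


# Hypothetical sequence: the R-P bond is left intact, the K-R and K-M bonds are cut.
fragments = digest("AKRPGKM")
print(fragments)
```

    Simulating a full gastrointestinal digest would chain several such rules (one per enzyme) and then keep only fragments in the oligopeptide length range of interest.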

  18. Deep Convolutional Neural Network Analysis of Flow Imaging Microscopy Data to Classify Subvisible Particles in Protein Formulations.

    Science.gov (United States)

    Calderon, Christopher P; Daniels, Austin L; Randolph, Theodore W

    2018-04-01

    Flow-imaging microscopy (FIM) is commonly used to characterize subvisible particles in therapeutic protein formulations. Although pharmaceutical companies often collect large repositories of FIM images of protein therapeutic products, current state-of-the-art methods for analyzing these images rely on low-dimensional lists of "morphological features" to characterize particles that ignore much of the information encoded in the existing image databases. Deep convolutional neural networks (sometimes referred to as "CNNs or ConvNets") have demonstrated the ability to extract predictive information from raw macroscopic image data without requiring the selection or specification of "morphological features" in a variety of tasks. However, the inherent heterogeneity of protein therapeutics and optical phenomena associated with subvisible FIM particle measurements introduces new challenges regarding the application of ConvNets to FIM image analysis. We demonstrate a supervised learning technique leveraging ConvNets to extract information from raw images in order to predict the process conditions or stress states (freeze-thawing, mechanical shaking, etc.) that produced a variety of different protein particles. We demonstrate that our new classifier, in combination with a "data pooling" strategy, can nearly perfectly differentiate between protein formulations in a variety of scenarios of relevance to protein therapeutics quality control and process monitoring using as few as 20 particles imaged via FIM. Copyright © 2018 American Pharmacists Association®. Published by Elsevier Inc. All rights reserved.

  19. Analysis of the Yeast Kinome Reveals a Network of Regulated Protein Localization during Filamentous Growth

    OpenAIRE

    Bharucha, Nikë; Ma, Jun; Dobry, Craig J.; Lawson, Sarah K.; Yang, Zhifen; Kumar, Anuj

    2008-01-01

    The subcellular distribution of kinases and other signaling proteins is regulated in response to cellular cues; however, the extent of this regulation has not been investigated for any gene set in any organism. Here, we present a systematic analysis of protein kinases in the budding yeast, screening for differential localization during filamentous growth. Filamentous growth is an important stress response involving mitogen-activated protein kinase and cAMP-dependent protein kinase signaling m...

  20. NeBcon: protein contact map prediction using neural network training coupled with naïve Bayes classifiers.

    Science.gov (United States)

    He, Baoji; Mortuza, S M; Wang, Yanting; Shen, Hong-Bin; Zhang, Yang

    2017-08-01

    Recent CASP experiments have witnessed exciting progress on folding large-size non-homologous proteins with the assistance of co-evolution based contact predictions. The success is however anecdotal due to the requirement of the contact prediction methods for the high volume of sequence homologs that are not available to most of the non-homologous protein targets. Development of efficient methods that can generate balanced and reliable contact maps for different types of protein targets is essential to enhance the success rate of the ab initio protein structure prediction. We developed a new pipeline, NeBcon, which uses the naïve Bayes classifier (NBC) theorem to combine eight state of the art contact methods that are built from co-evolution and machine learning approaches. The posterior probabilities of the NBC model are then trained with intrinsic structural features through neural network learning for the final contact map prediction. NeBcon was tested on 98 non-redundant proteins, which improves the accuracy of the best co-evolution based meta-server predictor by 22%; the magnitude of the improvement increases to 45% for the hard targets that lack sequence and structural homologs in the databases. Detailed data analysis showed that the major contribution to the improvement is due to the optimized NBC combination of the complementary information from both co-evolution and machine learning predictions. The neural network training also helps to improve the coupling of the NBC posterior probability and the intrinsic structural features, which were found particularly important for the proteins that do not have sufficient number of homologous sequences to derive reliable co-evolution profiles. On-line server and standalone package of the program are available at http://zhanglab.ccmb.med.umich.edu/NeBcon/ . Supplementary data are available at Bioinformatics online. © The Author (2017). Published by Oxford University Press. All rights reserved.
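
    The idea of combining several contact predictors through a naïve Bayes posterior can be illustrated with a small sketch. This is not the NeBcon implementation: it assumes each predictor gives a binary call for a residue pair, that the calls are conditionally independent, and that each predictor's true-positive and false-positive rates are known. All names and numbers are hypothetical.

```python
def naive_bayes_posterior(votes, true_positive_rates, false_positive_rates, prior):
    """Posterior probability that a residue pair is in contact.

    votes: one boolean per predictor (True = predictor calls a contact)
    true_positive_rates:  P(call contact | real contact), one per predictor
    false_positive_rates: P(call contact | no contact), one per predictor
    prior: P(contact) before seeing any predictor output
    """
    weight_contact = prior
    weight_no_contact = 1.0 - prior
    for vote, tpr, fpr in zip(votes, true_positive_rates, false_positive_rates):
        # Naive Bayes: multiply the per-predictor likelihoods as if independent.
        weight_contact *= tpr if vote else (1.0 - tpr)
        weight_no_contact *= fpr if vote else (1.0 - fpr)
    return weight_contact / (weight_contact + weight_no_contact)


# Two hypothetical predictors that both call a contact for the same residue pair.
posterior = naive_bayes_posterior(
    votes=[True, True],
    true_positive_rates=[0.8, 0.6],
    false_positive_rates=[0.1, 0.2],
    prior=0.1,
)
print(round(posterior, 3))
```

    In the pipeline described by the abstract, posteriors of this kind are then passed, together with intrinsic structural features, to a neural network for the final prediction.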

  1. In silico Sequence Analysis, Structure Prediction and Function Annotation of Human Bcl-X Beta Protein

    Directory of Open Access Journals (Sweden)

    Anjali Singh

    2014-03-01

    Bcl-X proteins are among the best-characterized members of the Bcl-2 protein family, which act as primary regulators of apoptosis in mammalian cells. The Bcl-X proteins are potential anti-cancer drug targets. In this study, the tertiary structure of the beta isoform of the apoptosis regulator Bcl-X in humans (h-Bcl-Xβ) has been predicted by a fold-recognition (threading) approach. In silico assessment of the h-Bcl-Xβ protein revealed the characteristic structural features of the anti-apoptotic Bcl-2 protein family in the h-Bcl-Xβ protein. The predicted model comprised BH1-BH4 domains, seven alpha-helices and a C-terminal transmembrane domain for membrane localization and sub-cellular targeting. Quality assessment of the predicted model confirmed its reliability as a fairly good model. Active sites of the h-Bcl-Xβ protein were identified using the CASTp server. Future work can be directed towards drug design for cancer treatment by regulating the activity of h-Bcl-Xβ proteins.

  2. Bayesian Markov random field analysis for integrated network-based protein function prediction

    NARCIS (Netherlands)

    Kourmpetis, Y.I.A.

    2011-01-01

    Unravelling the functions of proteins is one of the most important aims of modern biology. Experimental inference of protein function is expensive and not scalable to large datasets. In this thesis a probabilistic method for protein function prediction is presented that integrates different

  3. Equal opportunity for low-degree network nodes: a PageRank-based method for protein target identification in metabolic graphs.

    Directory of Open Access Journals (Sweden)

    Dániel Bánky

    Biological network data, such as metabolic-, signaling- or physical interaction graphs of proteins are increasingly available in public repositories for important species. Tools for the quantitative analysis of these networks are being developed today. Protein network-based drug target identification methods usually return protein hubs with large degrees in the networks as potentially important targets. Some known, important protein targets, however, are not hubs at all, and perturbing protein hubs in these networks may have several unwanted physiological effects, due to their interaction with numerous partners. Here, we show a novel method applicable in networks with directed edges (such as metabolic networks) that compensates for the low degree (non-hub) vertices in the network, and identifies important nodes, regardless of their hub properties. Our method computes the PageRank for the nodes of the network, and divides the PageRank by the in-degree (i.e., the number of incoming edges) of the node. This quotient is the same in all nodes in an undirected graph (even for large- and low-degree nodes, that is, for hubs and non-hubs as well), but may differ significantly from node to node in directed graphs. We suggest to assign importance to non-hub nodes with large PageRank/in-degree quotient. Consequently, our method gives high scores to nodes with large PageRank, relative to their degrees: therefore non-hub important nodes can easily be identified in large networks. We demonstrate that these relatively high PageRank scores have biological relevance: the method correctly finds numerous already validated drug targets in distinct organisms (Mycobacterium tuberculosis, Plasmodium falciparum and MRSA Staphylococcus aureus), and consequently, it may suggest new possible protein targets as well. Additionally, our scoring method was not chosen arbitrarily: its value for all nodes of all undirected graphs is constant; therefore its high value captures

  4. Equal opportunity for low-degree network nodes: a PageRank-based method for protein target identification in metabolic graphs.

    Science.gov (United States)

    Bánky, Dániel; Iván, Gábor; Grolmusz, Vince

    2013-01-01

    Biological network data, such as metabolic-, signaling- or physical interaction graphs of proteins are increasingly available in public repositories for important species. Tools for the quantitative analysis of these networks are being developed today. Protein network-based drug target identification methods usually return protein hubs with large degrees in the networks as potentially important targets. Some known, important protein targets, however, are not hubs at all, and perturbing protein hubs in these networks may have several unwanted physiological effects, due to their interaction with numerous partners. Here, we show a novel method applicable in networks with directed edges (such as metabolic networks) that compensates for the low degree (non-hub) vertices in the network, and identifies important nodes, regardless of their hub properties. Our method computes the PageRank for the nodes of the network, and divides the PageRank by the in-degree (i.e., the number of incoming edges) of the node. This quotient is the same in all nodes in an undirected graph (even for large- and low-degree nodes, that is, for hubs and non-hubs as well), but may differ significantly from node to node in directed graphs. We suggest to assign importance to non-hub nodes with large PageRank/in-degree quotient. Consequently, our method gives high scores to nodes with large PageRank, relative to their degrees: therefore non-hub important nodes can easily be identified in large networks. We demonstrate that these relatively high PageRank scores have biological relevance: the method correctly finds numerous already validated drug targets in distinct organisms (Mycobacterium tuberculosis, Plasmodium falciparum and MRSA Staphylococcus aureus), and consequently, it may suggest new possible protein targets as well. Additionally, our scoring method was not chosen arbitrarily: its value for all nodes of all undirected graphs is constant; therefore its high value captures importance in the
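
    The scoring described above (PageRank divided by in-degree) can be sketched in a few lines. This is an illustrative version using plain power iteration, not the authors' code; the damping factor, iteration count, function names and toy graph are assumptions.

```python
def pagerank(edges, damping=0.85, iterations=100):
    """PageRank by power iteration on a directed edge list of (source, target) pairs."""
    nodes = sorted({node for edge in edges for node in edge})
    out_links = {node: [] for node in nodes}
    for source, target in edges:
        out_links[source].append(target)
    rank = {node: 1.0 / len(nodes) for node in nodes}
    for _ in range(iterations):
        updated = {node: (1.0 - damping) / len(nodes) for node in nodes}
        for node in nodes:
            # A node without outgoing edges spreads its rank over all nodes.
            targets = out_links[node] or nodes
            share = damping * rank[node] / len(targets)
            for target in targets:
                updated[target] += share
        rank = updated
    return rank


def pagerank_per_in_degree(edges, **kwargs):
    """Score each node by PageRank / in-degree; nodes with no incoming edge are skipped."""
    rank = pagerank(edges, **kwargs)
    in_degree = {node: 0 for node in rank}
    for _, target in edges:
        in_degree[target] += 1
    return {node: rank[node] / in_degree[node] for node in rank if in_degree[node] > 0}


# Toy directed graph: "d" has a single incoming edge, but that edge comes
# from the well-connected node "hub".
toy_edges = [("a", "hub"), ("b", "hub"), ("c", "hub"), ("hub", "d"), ("d", "a")]
scores = pagerank_per_in_degree(toy_edges)
for node, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(node, round(score, 4))
```

    Running the same function with damping=1.0 on an undirected graph (each edge listed in both directions) that is connected and not bipartite gives the same quotient for every node, which is the constant-value property the abstract relies on.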

  5. Proteomic, cellular, and network analyses reveal new DUSP3 interactions with nucleolar proteins in HeLa cells.

    Science.gov (United States)

    Panico, Karine; Forti, Fabio Luis

    2013-12-06

    DUSP3 (or Vaccinia virus phosphatase VH1-related; VHR) is a small dual-specificity phosphatase known to dephosphorylate c-Jun N-terminal kinases and extracellular signal-regulated kinases. In human cervical cancer cells, DUSP3 is overexpressed, localizes preferentially to the nucleus, and plays a key role in cellular proliferation and senescence triggering. Other DUSP3 functions are still unknown, as illustrated by recent and unpublished results from our group showing that this enzyme mediates DNA damage response or repair processes. In this study, we sought to identify new interactions between DUSP3 and proteins directly or indirectly involved in or correlated with its biological roles in HeLa cells exposed to gamma or UV radiation. By using GST-DUSP as bait, we pulled down interacting proteins and identified them by LC-MS/MS. Of the 46 proteins obtained, six hits were extensively validated by immune techniques; the proteins Nucleophosmin, HnRNP C1/C2, and Nucleolin were the most promising targets found to directly interact with DUSP3. We then analyzed the DUSP3 interactomes using physical protein-protein interaction networks, with our hits as the seed list. The validated hits as well as unvalidated hits fluctuated on the DUSP3 interactomes of HeLa cells, independent of the time post radiation, which confirmed our proteomic and experimental data and clearly showed the proximity of DUSP3 to proteins involved in processes intimately related to DNA repair and senescence, such as Ku70 and Tert, via interactions with nucleolar proteins, which were identified in this study, that regulate DNA/RNA structure and functions.

  6. Conformational Flexibility of Proteins Involved in Ribosome Biogenesis: Investigations via Small Angle X-ray Scattering (SAXS)

    Directory of Open Access Journals (Sweden)

    Dritan Siliqi

    2018-02-01

    The dynamism of proteins is central to their function, and several proteins have been described as flexible, as consisting of multiple domains joined by flexible linkers, and even as intrinsically disordered. Several techniques exist to study protein structures, but small angle X-ray scattering (SAXS) has proven to be particularly powerful for the quantitative analysis of such flexible systems. In the present report, we have used SAXS in combination with X-ray crystallography to highlight their usefulness in characterizing flexible proteins, using as examples two proteins involved in different steps of ribosome biogenesis. The yeast BRCA2 and CDKN1A-interacting protein, Bcp1, is a chaperone for Rpl23 of unknown structure. We showed that it is a rigid, slightly elongated protein, with a secondary structure comprising a mixture of alpha helices and beta sheets. As an example of a flexible molecule, we studied the SBDS (Shwachman-Bodian-Diamond Syndrome) protein that is involved in the cytoplasmic maturation of the 60S subunit and constitutes the mutated target in the Shwachman-Diamond Syndrome. In solution, this protein coexists in an ensemble of three main conformations, with the N- and C-terminal ends adopting different orientations with respect to the central domain. The structure observed in the protein crystal corresponds to an average of those predicted by the SAXS flexibility analysis.

  7. Modeling of allergen proteins found in sea food products

    Directory of Open Access Journals (Sweden)

    Nataly Galán-Freyle

    2012-06-01

    Shellfish are a source of food allergens, and their consumption is the cause of severe allergic reactions in humans. Tropomyosins, a family of muscle proteins, have been identified as the major allergens in shellfish and mollusk species. Nevertheless, few experimentally determined three-dimensional structures are available in the Protein Data Bank (PDB). In this study, 3D models of several tropomyosin homologs present in marine shellfish and mollusk species (Chaf 1, Met e1, Hom a1, Per v1, and Pen a1) were constructed and validated, and their immunoglobulin E binding epitopes were identified using bioinformatics tools. All protein models for these allergens consisted of long alpha-helices. Chaf 1, Met e1, and Hom a1 had six conserved regions with sequence similarities to known epitopes, whereas Per v1 and Pen a1 contained only one. Lipophilic potentials of identified epitopes revealed a high propensity of hydrophobic amino acids in the immunoglobulin E binding site. This information could be useful to design tropomyosin-specific immunotherapy for seafood allergies.

  8. Folding dynamics of a family of beta-sheet proteins

    Science.gov (United States)

    Rousseau, Denis

    2008-03-01

    Fatty acid binding proteins (FABP) consist of ten anti-parallel beta strands and two small alpha helices. The beta strands are arranged into two nearly orthogonal five-strand beta sheets that surround the interior cavity, which binds unsaturated long-chain fatty acids. In the brain isoform (BFABP), these are very important for the development of the central nervous system and neuron differentiation. Furthermore, BFABP is implicated in the p