WorldWideScience

Sample records for coat protein sequences

  1. RNA2 of grapevine fanleaf virus: sequence analysis and coat protein cistron location.

    Science.gov (United States)

    Serghini, M A; Fuchs, M; Pinck, M; Reinbolt, J; Walter, B; Pinck, L

    1990-07-01

    The nucleotide sequence of the genomic RNA2 (3774 nucleotides) of grapevine fanleaf virus strain F13 was determined from overlapping cDNA clones and its genetic organization was deduced. Two rapid and efficient methods were used for cDNA cloning of the 5' region of RNA2. The complete sequence contained only one long open reading frame of 3555 nucleotides (1184 codons, 131K product). The analysis of the N-terminal sequence of purified coat protein (CP) and identification of its C-terminal residue have allowed the CP cistron to be precisely positioned within the polyprotein. The CP produced by proteolytic cleavage at the Arg/Gly site between residues 680 and 681 contains 504 amino acids (Mr 56019) and has hydrophobic properties. The Arg/Gly cleavage site deduced by N-terminal amino acid sequence analysis is the first for a nepovirus coat protein and for plant viruses expressing their genomic RNAs by polyprotein synthesis. Comparison of GFLV RNA2 with M RNA of cowpea mosaic comovirus and with RNA2 of two closely related nepoviruses, tomato black ring virus and Hungarian grapevine chrome mosaic virus, showed strong similarities among the 3' non-coding regions but less similarity among the 5' end non-coding sequences than reported among other nepovirus RNAs.
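    The figures quoted in this abstract can be cross-checked with a few lines of arithmetic. The sketch below is only a consistency check, assuming that the 3555-nucleotide open reading frame includes one stop codon and that the coat protein extends from the cleavage site to the C-terminus of the polyprotein; the helper name `cp_span` is hypothetical.

```python
# Figures quoted in the abstract above.
ORF_NT = 3555          # length of the single long open reading frame
CODONS = 1184          # sense codons reported for the polyprotein
CLEAVAGE_AFTER = 680   # Arg/Gly cleavage lies between residues 680 and 681
CP_LENGTH = 504        # residues reported for the mature coat protein


def cp_span(polyprotein_len: int, cleavage_after: int) -> tuple[int, int]:
    """Return (first, last) residue, 1-based, of a C-terminal cleavage product.

    Assumption: the product runs from the cleavage site to the last residue.
    """
    return cleavage_after + 1, polyprotein_len


# 3555 nt make 1185 triplets: 1184 sense codons plus (assumed) one stop codon.
triplets, remainder = divmod(ORF_NT, 3)
first, last = cp_span(CODONS, CLEAVAGE_AFTER)
print(triplets, remainder, first, last, last - first + 1)
```

    Under these assumptions the coat protein spans residues 681 to 1184, which matches the reported length of 504 amino acids.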

  2. Nucleotide sequence of the coat protein gene of the Skierniewice isolate of plum pox virus (PPV)

    International Nuclear Information System (INIS)

    Wypijewski, K.; Musial, W.; Augustyniak, J.; Malinowski, T.

    1994-01-01

    The coat protein (CP) gene of the Skierniewice isolate of plum pox virus (PPV-S) has been amplified using the reverse transcription - polymerase chain reaction (RT-PCR), cloned and sequenced. The nucleotide sequence of the gene and the deduced amino-acid sequences of PPV-S CP were compared with those of other PPV strains. The nucleotide sequence showed very high homology to most of the published sequences. The motif: Asp-Ala-Gly (DAG), important for the aphid transmissibility, was present in the amino-acid sequence. Our isolate did not react in ELISA with monoclonal antibodies MAb06 supposed to be specific for PPV-D. (author). 32 refs, 1 fig., 2 tabs

  3. Partial characterization of the lettuce infectious yellows virus genomic RNAs, identification of the coat protein gene and comparison of its amino acid sequence with those of other filamentous RNA plant viruses.

    Science.gov (United States)

    Klaassen, V A; Boeshore, M; Dolja, V V; Falk, B W

    1994-07-01

    Purified virions of lettuce infectious yellows virus (LIYV), a tentative member of the closterovirus group, contained two RNAs of approximately 8500 and 7300 nucleotides (RNAs 1 and 2 respectively) and a single coat protein species with M(r) of approximately 28,000. LIYV-infected plants contained multiple dsRNAs. The two largest were the correct size for the replicative forms of LIYV virion RNAs 1 and 2. To assess the relationships between LIYV RNAs 1 and 2, cDNAs corresponding to the virion RNAs were cloned. Northern blot hybridization analysis showed no detectable sequence homology between these RNAs. A partial amino acid sequence obtained from purified LIYV coat protein was found to align in the most upstream of four complete open reading frames (ORFs) identified in a LIYV RNA 2 cDNA clone. The identity of this ORF was confirmed as the LIYV coat protein gene by immunological analysis of the gene product expressed in vitro and in Escherichia coli. Computer analysis of the LIYV coat protein amino acid sequence indicated that it belongs to a large family of proteins forming filamentous capsids of RNA plant viruses. The LIYV coat protein appears to be most closely related to the coat proteins of two closteroviruses, beet yellows virus and citrus tristeza virus.

  4. Nucleotide sequence of the coat protein gene of Lettuce big-vein virus.

    Science.gov (United States)

    Sasaya, T; Ishikawa, K; Koganezawa, H

    2001-06-01

    A sequence of 1425 nt was established that included the complete coat protein (CP) gene of Lettuce big-vein virus (LBVV). The LBVV CP gene encodes a 397 amino acid protein with a predicted M(r) of 44486. Antisera raised against synthetic peptides corresponding to N-terminal or C-terminal parts of the LBVV CP reacted in Western blot analysis with a protein with an M(r) of about 48000. RNA extracted from purified particles of LBVV by using proteinase K, SDS and phenol migrated in gels as two single-stranded RNA species of approximately 7.3 kb (ss-1) and 6.6 kb (ss-2). After denaturation by heat and annealing at room temperature, the RNA migrated as four species, ss-1, ss-2 and two additional double-stranded RNAs (ds-1 and ds-2). The Northern blot hybridization analysis using riboprobes from a full-length clone of the LBVV CP gene indicated that ss-2 has a negative-sense nature and contains the LBVV CP gene. Moreover, ds-2 is a double-stranded form of ss-2. Database searches showed that the LBVV CP most resembled the nucleocapsid proteins of rhabdoviruses. These results indicate that it would be appropriate to classify LBVV as a negative-sense single-stranded RNA virus rather than as a double-stranded RNA virus.

  5. Amino acid sequences mediating vascular cell adhesion molecule 1 binding to integrin alpha 4: homologous DSP sequence found for JC polyoma VP1 coat protein

    Directory of Open Access Journals (Sweden)

    Michael Andrew Meyer

    2013-07-01

    The JC polyoma viral coat protein VP1 was analyzed for amino acid sequence homologies to the IDSP sequence, which mediates binding of VLA-4 (integrin alpha 4) to vascular cell adhesion molecule 1. Although the full sequence was not found, a DSP sequence was located near the critical arginine residue linked to infectivity of the virus and binding to sialic acid containing molecules such as integrins (3). For the JC polyoma virus, a DSP sequence was found at residues 70, 71 and 72, with homology also noted for the mouse polyoma virus and SV40 virus. Three dimensional modeling of the VP1 molecule suggests that the DSP loop has an accessible site for interaction from the external side of the assembled viral capsid pentamer.
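    A short-motif search of this kind can be sketched as a plain substring scan. The example below is illustrative only: the function name `find_motif` is hypothetical, and the peptide is a made-up toy string, not the real VP1 sequence.

```python
def find_motif(sequence: str, motif: str) -> list[int]:
    """Return the 1-based start position of every (possibly overlapping) match."""
    hits = []
    start = sequence.find(motif)
    while start != -1:
        hits.append(start + 1)          # convert 0-based index to residue number
        start = sequence.find(motif, start + 1)
    return hits


# Hypothetical toy peptide, NOT the real VP1 sequence.
toy = "MAPTKRGDSPLLIDSPQ"
print(find_motif(toy, "DSP"))   # partial motif
print(find_motif(toy, "IDSP"))  # full motif
```

    Reporting 1-based positions keeps the output comparable with residue numbering as used in the abstract (residues 70, 71 and 72).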

  6. Functional analysis of bipartite begomovirus coat protein promoter sequences

    International Nuclear Information System (INIS)

    Lacatus, Gabriela; Sunter, Garry

    2008-01-01

    We demonstrate that the AL2 gene of Cabbage leaf curl virus (CaLCuV) activates the CP promoter in mesophyll and acts to derepress the promoter in vascular tissue, similar to that observed for Tomato golden mosaic virus (TGMV). Binding studies indicate that sequences mediating repression and activation of the TGMV and CaLCuV CP promoter specifically bind different nuclear factors common to Nicotiana benthamiana, spinach and tomato. However, chromatin immunoprecipitation demonstrates that TGMV AL2 can interact with both sequences independently. Binding of nuclear protein(s) from different crop species to viral sequences conserved in both bipartite and monopartite begomoviruses, including TGMV, CaLCuV, Pepper golden mosaic virus and Tomato yellow leaf curl virus suggests that bipartite begomoviruses bind common host factors to regulate the CP promoter. This is consistent with a model in which AL2 interacts with different components of the cellular transcription machinery that bind viral sequences important for repression and activation of begomovirus CP promoters.

  7. High genetic diversity in the coat protein and 3' untranslated regions

    Indian Academy of Sciences (India)

    The 3′ terminal region consisting of the coat protein (CP) coding sequence and 3′ untranslated region (3′UTR) was cloned and sequenced from seven isolates. Sequence comparisons revealed considerable genetic diversity among the isolates in their CP and 3′UTR, making CdMV one of the highly variable members ...

  8. Crystal structure of the bacteriophage Qβ coat protein in complex with the RNA operator of the replicase gene.

    Science.gov (United States)

    Rumnieks, Janis; Tars, Kaspars

    2014-03-06

    The coat proteins of single-stranded RNA bacteriophages specifically recognize and bind to a hairpin structure in their genome at the beginning of the replicase gene. The interaction serves to repress the synthesis of the replicase enzyme late in infection and contributes to the specific encapsidation of phage RNA. While this mechanism is conserved throughout the Leviviridae family, the coat protein and operator sequences from different phages show remarkable variation, serving as prime examples for the co-evolution of protein and RNA structure. To better understand the protein-RNA interactions in this virus family, we have determined the three-dimensional structure of the coat protein from bacteriophage Qβ bound to its cognate translational operator. The RNA binding mode of Qβ coat protein shares several features with that of the widely studied phage MS2, but only one nucleotide base in the hairpin loop makes sequence-specific contacts with the protein. Unlike in other RNA phages, the Qβ coat protein does not utilize an adenine-recognition pocket for binding a bulged adenine base in the hairpin stem but instead uses a stacking interaction with a tyrosine side chain to accommodate the base. The extended loop between β strands E and F of Qβ coat protein makes contacts with the lower part of the RNA stem, explaining the greater length dependence of the RNA helix for optimal binding to the protein. Consequently, the complex structure allows the proposal of a mechanism by which the Qβ coat protein recognizes and discriminates in favor of its cognate RNA.

  9. Shotgun protein sequencing.

    Energy Technology Data Exchange (ETDEWEB)

    Faulon, Jean-Loup Michel; Heffelfinger, Grant S.

    2009-06-01

    A novel experimental and computational technique based on multiple enzymatic digestion of a protein or protein mixture that reconstructs protein sequences from sequences of overlapping peptides is described in this SAND report. This approach, analogous to shotgun sequencing of DNA, is to be used to sequence alternative spliced proteins, to identify post-translational modifications, and to sequence genetically engineered proteins.
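    The idea of rebuilding a sequence from overlapping peptides can be illustrated with a greedy suffix-prefix merge. This is a minimal sketch of the general overlap principle, not the algorithm from the report; the function names and the toy peptides are hypothetical, and the input is assumed to be error-free and non-empty.

```python
def overlap(a: str, b: str) -> int:
    """Length of the longest suffix of `a` that is also a prefix of `b`."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def assemble(peptides: list[str]) -> str:
    """Greedily merge the pair with the largest overlap until one contig remains."""
    unique = sorted(set(peptides))
    # Drop peptides that are fully contained in another peptide.
    contigs = [p for p in unique if not any(p != q and p in q for q in unique)]
    while len(contigs) > 1:
        best_k, best_i, best_j = 0, 0, 1
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        # Join the best pair, writing the shared residues only once.
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [c for n, c in enumerate(contigs) if n not in (best_i, best_j)]
        contigs.append(merged)
    return contigs[0]


# Hypothetical peptides, as might come from digests with different enzymes.
peptides = ["KQRQISFVK", "MKTAYIAK", "AYIAKQRQ"]
print(assemble(peptides))
```

    A greedy merge can go wrong on repetitive sequences or when two peptides share no overlap, so real data would need a more careful strategy.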

  10. Experimental Rugged Fitness Landscape in Protein Sequence Space

    OpenAIRE

    HAYASHI, Yuuki; 相田, 拓洋; TOYOTA, Hitoshi; 伏見, 譲; URABE, Itaru; YOMO, Tetsuya

    2006-01-01

    The fitness landscape in sequence space determines the process of biomolecular evolution. To plot the fitness landscape of protein function, we carried out in vitro molecular evolution beginning with a defective fd phage carrying a random polypeptide of 139 amino acids in place of the g3p minor coat protein D2 domain, which is essential for phage infection. After 20 cycles of random substitution at sites 12-130 of the initial random polypeptide and selection for infectivity, the selected phag...

  11. Prunus necrotic ringspot ilarvirus: nucleotide sequence of RNA3 and the relationship to other ilarviruses based on coat protein comparison.

    Science.gov (United States)

    Guo, D; Maiss, E; Adam, G; Casper, R

    1995-05-01

    The RNA3 of prunus necrotic ringspot ilarvirus (PNRSV) has been cloned and its entire sequence determined. The RNA3 consists of 1943 nucleotides (nt) and possesses two large open reading frames (ORFs) separated by an intergenic region of 74 nt. The 5' proximal ORF is 855 nt in length and codes for a protein of molecular mass 31.4 kDa which has homologies with the putative movement protein of other members of the Bromoviridae. The 3' proximal ORF of 675 nt is the cistron for the coat protein (CP) and has a predicted molecular mass of 24.9 kDa. The sequence of the 3' non-coding region (NCR) of PNRSV RNA3 showed a high degree of similarity with those of tobacco streak virus (TSV), prune dwarf virus (PDV), apple mosaic virus (ApMV) and also alfalfa mosaic virus (AlMV). In addition it contained potential stem-loop structures with interspersed AUGC motifs characteristic for ilar- and alfamoviruses. This conserved primary and secondary structure in all 3' NCRs may be responsible for the interaction with homologous and heterologous CPs and subsequent activation of genome replication. The CP gene of an ApMV isolate (ApMV-G) of 657 nt has also been cloned and sequenced. Although ApMV and PNRSV have a distant serological relationship, the deduced amino acid sequences of their CPs have an identity of only 51.8%. The N termini of PNRSV and ApMV CPs have in common a zinc-finger motif and the potential to form an amphipathic helix.

  12. Molecular identification based on coat protein sequences of the Barley yellow dwarf virus from Brazil

    Directory of Open Access Journals (Sweden)

    Talita Bernardon Mar

    2013-12-01

    Yellow dwarf disease, one of the most important diseases of cereal crops worldwide, is caused by virus species belonging to the Luteoviridae family. Forty-two virus isolates obtained from oat (Avena sativa L.), wheat (Triticum aestivum L.), barley (Hordeum vulgare L.), corn (Zea mays L.), and ryegrass (Lolium multiflorum Lam.) collected between 2007 and 2008 from winter cereal crop regions in southern Brazil were screened by polymerase chain reaction (PCR) with primers designed on ORF 3 (coat protein, CP) for the presence of Barley yellow dwarf virus and Cereal yellow dwarf virus (B/CYDV). PCR products of the expected size (~357 bp for subgroup II and ~831 bp for subgroup I) were obtained for three and 39 samples, respectively. These products were cloned and sequenced. The subgroup II 3' partial CP deduced amino acid sequences were identified as BYDV-RMV (92-93 % identity with the "Illinois" Z14123 isolate). The complete CP deduced amino acid sequences of subgroup I isolates were confirmed as BYDV-PAV (94-99 % identity) and formed a very homogeneous group (identity higher than 99 %). These results support the prevalence of BYDV-PAV in southern Brazil as previously diagnosed by Enzyme-Linked Immunosorbent Assay (ELISA) and suggest that this population is very homogeneous. To our knowledge, this is the first report of BYDV-RMV in Brazil and the first genetic diversity study on B/CYDV in South America.
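    Identity percentages like those quoted above are computed from aligned sequences. The sketch below shows one common convention, counting matches over the aligned columns in which neither sequence has a gap; tools differ in their choice of denominator, and the abstract does not state which was used. The function name and both sequence fragments are hypothetical.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    matches = sum(x == y for x, y in pairs)
    return 100.0 * matches / len(pairs)


# Hypothetical pre-aligned fragments ("-" marks an alignment gap).
ref = "MNSVGRRGPRRANQNGTRRR"
query = "MNSVGRKGPRRANQ-GTRRR"
print(round(percent_identity(ref, query), 1))
```

    Here 18 of the 19 gap-free columns match, so the toy pair scores about 94.7 % identity.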

  13. Screening of Potential Inhibitor against Coat Protein of Apple Chlorotic Leaf Spot Virus.

    Science.gov (United States)

    Purohit, Rituraj; Kumar, Sachin; Hallan, Vipin

    2018-06-01

    In this study, we analyzed the coat protein (CP) of Apple chlorotic leaf spot virus (ACLSV), an important latent virus of apple. Incidence of the virus is up to 60% in various apple cultivars, causing yield losses of the order of 10-40% (depending upon the cultivar). CP plays an important role as the sole building block of the viral capsid. A homology approach was used to model the 193 amino acid sequence of the coat protein. We used various servers such as ConSurf, TargetS, OSML, COACH and COFACTOR for the prediction of active site residues in the coat protein. A virtual screening strategy was employed to search for potential inhibitors of CP. The top twenty screened molecules were considered for druggability and toxicity analysis, and one potential molecule was further analyzed by docking. Here, we report a potent molecule which could inhibit virion assembly by targeting the CP of the virus.

  14. Evidence for lysine acetylation in the coat protein of a polerovirus.

    Science.gov (United States)

    Cilia, Michelle; Johnson, Richard; Sweeney, Michelle; DeBlasio, Stacy L; Bruce, James E; MacCoss, Michael J; Gray, Stewart M

    2014-10-01

    Virions of the RPV strain of Cereal yellow dwarf virus-RPV were purified from infected oat tissue and analysed by MS. Two conserved residues, K147 and K181, in the virus coat protein, were confidently identified to contain epsilon-N-acetyl groups. While no functional data are available for K147, K181 lies within an interfacial region critical for virion assembly and stability. The signature immonium ion at m/z 126.0919 demonstrated the presence of N-acetyllysine, and the sequence fragment ions enabled an unambiguous assignment of the epsilon-N-acetyl modification on K181. We hypothesize that selection favours acetylation of K181 in a fraction of coat protein monomers to stabilize the capsid by promoting intermonomer salt bridge formation.

  15. Polyglycerol coatings of glass vials for protein resistance.

    Science.gov (United States)

    Höger, Kerstin; Becherer, Tobias; Qiang, Wei; Haag, Rainer; Friess, Wolfgang; Küchler, Sarah

    2013-11-01

    Proteins are surface active molecules which undergo non-specific adsorption when getting in contact with surfaces such as the primary packaging material. This process is critical as it may cause a loss of protein content or protein aggregation. To prevent unspecific adsorption, protein repellent coatings are of high interest. We describe the coating of industrial relevant borosilicate glass vials with linear methoxylated polyglycerol, hyperbranched polyglycerol, and hyperbranched methoxylated polyglycerol. All coatings provide excellent protein repellent effects. The hyperbranched, non-methoxylated coating performed best. The protein repellent properties were maintained also after applying industrial relevant sterilization methods (≥200 °C). Marginal differences in antibody stability between formulations stored in bare glass vials and coated vials were detected after 3 months storage; the protein repellent effect remained largely stable. Here, we describe a new material suitable for the coating of primary packaging material of proteins which significantly reduces the protein adsorption and thus could present an interesting new possibility for biomedical applications.

  16. Zucchini yellow mosaic virus: biological properties, detection procedures and comparison of coat protein gene sequences.

    Science.gov (United States)

    Coutts, B A; Kehoe, M A; Webster, C G; Wylie, S J; Jones, R A C

    2011-12-01

    Between 2006 and 2010, 5324 samples from at least 34 weed, two cultivated legume and 11 native species were collected from three cucurbit-growing areas in tropical or subtropical Western Australia. Two new alternative hosts of zucchini yellow mosaic virus (ZYMV) were identified, the Australian native cucurbit Cucumis maderaspatanus, and the naturalised legume species Rhynchosia minima. Low-level (0.7%) seed transmission of ZYMV was found in seedlings grown from seed collected from zucchini (Cucurbita pepo) fruit infected with isolate Cvn-1. Seed transmission was absent in >9500 pumpkin (C. maxima and C. moschata) seedlings from fruit infected with isolate Knx-1. Leaf samples from symptomatic cucurbit plants collected from fields in five cucurbit-growing areas in four Australian states were tested for the presence of ZYMV. When 42 complete coat protein (CP) nucleotide (nt) sequences from the new ZYMV isolates obtained were compared to those of 101 complete CP nt sequences from five other continents, phylogenetic analysis of the 143 ZYMV sequences revealed three distinct groups (A, B and C), with four subgroups in A (I-IV) and two in B (I-II). The new Australian sequences grouped according to collection location, fitting within A-I, A-II and B-II. The 16 new sequences from one isolated location in tropical northern Western Australia all grouped into subgroup B-II, which contained no other isolates. In contrast, the three sequences from the Northern Territory fitted into A-II with 94.6-99.0% nt identities with isolates from the United States, Iran, China and Japan. The 23 new sequences from the central west coast and two east coast locations all fitted into A-I, with 95.9-98.9% nt identities to sequences from Europe and Japan. These findings suggest that (i) there have been at least three separate ZYMV introductions into Australia and (ii) there are few changes to local isolate CP sequences following their establishment in remote growing areas. Isolates from A-I and B

  17. Roles for the coat protein telokin-like domain and the scaffolding protein amino-terminus

    Science.gov (United States)

    Suhanovsky, Margaret M.; Teschke, Carolyn M.

    2011-01-01

    Assembly of icosahedral capsids of proper size and symmetry is not understood. Residue F170 in bacteriophage P22 coat protein is critical for conformational switching during assembly. Substitutions at this site cause assembly of tubes of hexamerically arranged coat protein. Intragenic suppressors of the ts phenotype of F170A and F170K coat protein mutants were isolated. Suppressors were repeatedly found in the coat protein telokin-like domain at position 285, which caused coat protein to assemble into petite procapsids and capsids. Petite capsid assembly strongly correlated to the side chain volume of the substituted amino acid. We hypothesize that larger side chains at position 285 torque the telokin-like domain, changing flexibility of the subunit and intercapsomer contacts. Thus, a single amino acid substitution in coat protein is sufficient to change capsid size. In addition, the products of assembly of the variant coat proteins were affected by the size of the internal scaffolding protein. PMID:21784500

  18. A Sequence-Independent, Unstructured Internal Ribosome Entry Site Is Responsible for Internal Expression of the Coat Protein of Turnip Crinkle Virus.

    Science.gov (United States)

    May, Jared; Johnson, Philip; Saleem, Huma; Simon, Anne E

    2017-04-15

    To maximize the coding potential of viral genomes, internal ribosome entry sites (IRES) can be used to bypass the traditional requirement of a 5' cap and some/all of the associated translation initiation factors. Although viral IRES typically contain higher-order RNA structure, an unstructured sequence of about 84 nucleotides (nt) immediately upstream of the Turnip crinkle virus (TCV) coat protein (CP) open reading frame (ORF) has been found to promote internal expression of the CP from the genomic RNA (gRNA) both in vitro and in vivo. An absence of extensive RNA structure was predicted using RNA folding algorithms and confirmed by selective 2'-hydroxyl acylation analyzed by primer extension (SHAPE) RNA structure probing. Analysis of the IRES region in vitro by use of both the TCV gRNA and reporter constructs did not reveal any sequence-specific elements but rather suggested that an overall lack of structure was an important feature for IRES activity. The CP IRES is A-rich, independent of orientation, and strongly conserved among viruses in the same genus. The IRES was dependent on eIF4G, but not eIF4E, for activity. Low levels of CP accumulated in vivo in the absence of detectable TCV subgenomic RNAs, strongly suggesting that the IRES was active in the gRNA in vivo. Since the TCV CP also serves as the viral silencing suppressor, early translation of the CP from the viral gRNA is likely important for countering host defenses. Cellular mRNA IRES also lack extensive RNA structures or sequence conservation, suggesting that this viral IRES and cellular IRES may have similar strategies for internal translation initiation. IMPORTANCE Cap-independent translation is a common strategy among positive-sense, single-stranded RNA viruses for bypassing the host cell requirement of a 5' cap structure. Viral IRES, in general, contain extensive secondary structure that is critical for activity. In contrast, we demonstrate that a region of viral RNA devoid of extensive secondary

  19. Engineered Protein Coatings to Improve the Osseointegration of Dental and Orthopaedic Implants

    Science.gov (United States)

    Raphel, Jordan; Karlsson, Johan; Galli, Silvia; Wennerberg, Ann; Lindsay, Christopher; Haugh, Matthew; Pajarinen, Jukka; Goodman, Stuart B.; Jimbo, Ryo; Andersson, Martin; Heilshorn, Sarah C.

    2016-01-01

    Here we present the design of an engineered, elastin-like protein (ELP) that is chemically modified to enable stable coatings on the surfaces of titanium-based dental and orthopaedic implants by novel photocrosslinking and solution processing steps. The ELP includes an extended RGD sequence to confer bio-signaling and an elastin-like sequence for mechanical stability. ELP thin films were fabricated on cp-Ti and Ti6Al4V surfaces using scalable spin and dip coating processes with photoactive covalent crosslinking through a carbene insertion mechanism. The coatings withstood procedures mimicking dental screw and hip replacement stem implantations, a key metric for clinical translation. They promoted rapid adhesion of MG63 osteoblast-like cells, with over 80% adhesion after 24 hours, compared to 38% adhesion on uncoated Ti6Al4V. MG63 cells produced significantly more mineralization on ELP coatings compared to uncoated Ti6Al4V. Human bone marrow mesenchymal stem cells (hMSCs) had an earlier increase in alkaline phosphatase activity, indicating more rapid osteogenic differentiation and mineral deposition on adhesive ELP coatings. Rat tibia and femur in vivo studies demonstrated that cell-adhesive ELP-coated implants increased bone-implant contact area and interfacial strength after one week. These results suggest that ELP coatings withstand surgical implantation and promote rapid osseointegration, enabling earlier implant loading and potentially preventing micromotion that leads to aseptic loosening and premature implant failure. PMID:26790146

  20. Spore coat protein of Bacillus subtilis. Structure and precursor synthesis.

    Science.gov (United States)

    Munoz, L; Sadaie, Y; Doi, R H

    1978-10-10

    The coat protein of Bacillus subtilis spores comprises about 10% of the total dry weight of spores and 25% of the total spore protein. One protein with a molecular weight of 13,000 to 15,000 comprises a major portion of the spore coat. This mature spore coat protein has histidine at its NH2 terminus and is relatively rich in hydrophobic amino acids. Netropsin, an antibiotic which binds to A-T-rich regions of DNA and inhibits sporulation, but not growth, decreased the synthesis of this spore coat protein by 75%. A precursor spore coat protein with a molecular weight of 25,000 is made initially at t1 of sporulation and is converted to the mature spore coat protein with a molecular weight of 13,500 at t2 - t3. These data indicate that the spore coat protein gene is expressed very early in sporulation prior to the modifications of RNA polymerase which have been noted.

  1. Nanostructured Mineral Coatings Stabilize Proteins for Therapeutic Delivery.

    Science.gov (United States)

    Yu, Xiaohua; Biedrzycki, Adam H; Khalil, Andrew S; Hess, Dalton; Umhoefer, Jennifer M; Markel, Mark D; Murphy, William L

    2017-09-01

    Proteins tend to lose their biological activity due to their fragile structural conformation during formulation, storage, and delivery. Thus, the inability to stabilize proteins in controlled-release systems represents a major obstacle in drug delivery. Here, a bone mineral inspired protein stabilization strategy is presented, which uses nanostructured mineral coatings on medical devices. Proteins bound within the nanostructured coatings demonstrate enhanced stability against extreme external stressors, including organic solvents, proteases, and ethylene oxide gas sterilization. The protein stabilization effect is attributed to the maintenance of protein conformational structure, which is closely related to the nanoscale feature sizes of the mineral coatings. Basic fibroblast growth factor (bFGF) released from a nanostructured mineral coating maintains its biological activity for weeks during release, while it maintains activity for less than 7 d during release from commonly used polymeric microspheres. Delivery of the growth factors bFGF and vascular endothelial growth factor using a mineral coated surgical suture significantly improves functional Achilles tendon healing in a rabbit model, resulting in increased vascularization, more mature collagen fiber organization, and a twofold improvement in mechanical properties. The findings of this study demonstrate that biomimetic interactions between proteins and nanostructured minerals provide a new, broadly applicable mechanism to stabilize proteins in the context of drug delivery and regenerative medicine.

  2. Biomimetic surface coatings from modular amphiphilic proteins

    Science.gov (United States)

    Harden, James; Wan, Fan; Fischer, Stephen; Dick, Scott

    2010-03-01

    Recombinant DNA methods have been used to develop a library of diblock protein polymers for creating designer biofunctional interfaces. These proteins are composed of a surface-active, amphiphilic block joined to a disordered, water soluble block with an end terminal bioactive domain. The amphiphilic block has a strong affinity for many synthetic polymer surfaces, providing a facile means of imparting biological functionality to otherwise bio-neutral materials through physical self-assembly. We have incorporated a series of bioactive end domains into this diblock motif, including sequences that encode specific cell binding and signaling functions of extracellular matrix constituents (e.g. RGD and YIGSR). In this talk, we show that these diblock constructs self-assemble into biofunctional surface coatings on several model synthetic polymer materials. We demonstrate that surface adsorption of the proteins has minimal impacts on the presentation of the bioactive domains in the soluble block, and through the use of microscopic and cell proliferation assays, we show that the resulting biofunctional interfaces are capable of inducing appropriate cellular responses in a variety of human cell types.

  3. Nucleotide and amino acid sequences of a coat protein of an Ukrainian isolate of Potato virus Y: comparison with homologous sequences of other isolates and phylogenetic analysis

    Directory of Open Access Journals (Sweden)

    Budzanivska I. G.

    2014-03-01

    Aim. Identification of the widespread Ukrainian isolate(s) of PVY (Potato virus Y) in different potato cultivars and subsequent phylogenetic analysis of detected PVY isolates based on NA and AA sequences of coat protein. Methods. ELISA, RT-PCR, DNA sequencing and phylogenetic analysis. Results. PVY has been identified serologically in potato cultivars of Ukrainian selection. In this work we have optimized a method for total RNA extraction from potato samples and offered a sensitive and specific PCR-based test system of own design for diagnostics of the Ukrainian PVY isolates. Part of the CP gene of the Ukrainian PVY isolate has been sequenced and analyzed phylogenetically. It is demonstrated that the Ukrainian isolate of Potato virus Y (CP gene) has a higher percentage of homology with the recombinant isolates (strains) of this pathogen (approx. 98.8–99.8 % of homology for both nucleotide and translated amino acid sequences of the CP gene). The Ukrainian isolate of PVY is positioned in the separate cluster together with the isolates found in Syria, Japan and Iran; these isolates possibly have common origin. The Ukrainian PVY isolate is confirmed to be recombinant. Conclusions. This work underlines the need and provides the means for accurate monitoring of Potato virus Y in the agroecosystems of Ukraine. Most importantly, the phylogenetic analysis demonstrated the recombinant nature of this PVY isolate which has been attributed to the strain group O, subclade N:O.

  4. The Generation of Turnip Crinkle Virus-Like Particles in Plants by the Transient Expression of Wild-Type and Modified Forms of Its Coat Protein.

    Science.gov (United States)

    Saunders, Keith; Lomonossoff, George P

    2015-01-01

    Turnip crinkle virus (TCV), a member of the genus carmovirus of the Tombusviridae family, has a genome consisting of a single positive-sense RNA molecule that is encapsidated in an icosahedral particle composed of 180 copies of a single type of coat protein. We have employed the CPMV-HT transient expression system to investigate the formation of TCV-like particles following the expression of the wild-type coat protein or modified forms of it that contain either deletions and/or additions. Transient expression of the coat protein in plants results in the formation of capsid structures that morphologically resemble TCV virions (T = 3 structure) but encapsidate heterogeneous cellular RNAs, rather than the specific TCV coat protein messenger RNA. Expression of an amino-terminal deleted form of the coat protein resulted in the formation of smaller T = 1 structures that are free of RNA. The possibility of utilizing TCV as a carrier for the presentation of foreign proteins on the particle surface was also explored by fusing the sequence of GFP to the C-terminus of the coat protein. The expression of coat protein-GFP hybrids permitted the formation of VLPs but the yield of particles is diminished compared to the yield obtained with unmodified coat protein. Our results confirm the importance of the N-terminus of the coat protein for the encapsidation of RNA and show that the coat protein's exterior P domain plays a key role in particle formation.
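    The T = 3 and T = 1 particles mentioned in this record follow the Caspar-Klug rule for icosahedral capsids, in which the triangulation number is T = h² + hk + k² and a capsid contains 60T subunits. A short sketch (function names are illustrative, not taken from the record) reproduces the 180 copies quoted for the wild-type particle:

```python
def triangulation_number(h, k):
    """Caspar-Klug triangulation number for lattice indices h and k."""
    return h * h + h * k + k * k


def subunits_per_capsid(t):
    """An icosahedral capsid with triangulation number t has 60 * t subunits."""
    return 60 * t


# T = 3 (h = 1, k = 1) gives 180 subunits; T = 1 (h = 1, k = 0) gives 60.
t3 = subunits_per_capsid(triangulation_number(1, 1))
t1 = subunits_per_capsid(triangulation_number(1, 0))
```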

  5. A protein with amino acid sequence homology to bovine insulin is present in the legume Vigna unguiculata (cowpea)

    Directory of Open Access Journals (Sweden)

    Venâncio T.M.

    2003-01-01

    Since the discovery of bovine insulin in plants, much effort has been devoted to the characterization of these proteins and elucidation of their functions. We report here the isolation of a protein with similar molecular mass and the same amino acid sequence as bovine insulin from developing fruits of cowpea (Vigna unguiculata) genotype Epace 10. Insulin was measured by ELISA using an anti-human insulin antibody and was detected both in empty pods and seed coats but not in the embryo. The highest concentrations (about 0.5 ng/µg of protein) of the protein were detected in seed coats at 16 and 18 days after pollination, and the values were 1.6 to 4.0 times higher than those found for isolated pods tested on any day. N-terminal amino acid sequencing of insulin was performed on the protein purified by C4-HPLC. The significance of the presence of insulin in these plant tissues is not fully understood but we speculate that it may be involved in the transport of carbohydrate to the fruit.

  6. Detection of spore coat protein of Bacillus subtilis by immunological method

    International Nuclear Information System (INIS)

    Uchida, Aritsune; Kadota, Hajime

    1976-01-01

    The spore coat protein of Bacillus subtilis was separated, and the qualitative assay for the spore coat protein was made by use of the immunological technique. The immunological method was found to be useful for judging the maturation of spore coat in the course of sporulation. The spore coat protein antigen appeared at the t2 stage of sporulation. The addition of rifampicin at the earlier stages of sporulation inhibited the increase in content of the spore coat antigen. (auth.)

  7. Enhanced protein adsorption and patterning on nanostructured latex-coated paper.

    Science.gov (United States)

    Juvonen, Helka; Määttänen, Anni; Ihalainen, Petri; Viitala, Tapani; Sarfraz, Jawad; Peltonen, Jouko

    2014-06-01

    Specific interactions of extracellular matrix proteins with cells and their adhesion to the substrate are important for cell growth. A nanopatterned latex-coated paper substrate previously shown to be an excellent substrate for cell adhesion and 2D growth was studied for directed immobilization of proteins. The nanostructured latex surface was formed by short-wavelength IR irradiation of a two-component latex coating consisting of a hydrophilic film-forming styrene butadiene acrylonitrile copolymer and hydrophobic polystyrene particles. The hydrophobic regions of the IR-treated latex coating showed strong adhesion of bovine serum albumin (cell repelling protein), fibronectin (cell adhesive protein) and streptavidin. Opposite to the IR-treated surface, fibronectin and streptavidin had a poor affinity toward the untreated pristine latex coating. Detailed characterization of the physicochemical surface properties of the latex-coated substrates revealed that the observed differences in protein affinity were mainly due to the presence or absence of the protein repelling polar and charged surface groups. The protein adsorption was assisted by hydrophobic (dehydration) interactions. Copyright © 2014 Elsevier B.V. All rights reserved.

  8. Protein sequence comparison and protein evolution

    Energy Technology Data Exchange (ETDEWEB)

    Pearson, W.R. [Univ. of Virginia, Charlottesville, VA (United States). Dept. of Biochemistry]

    1995-12-31

    This tutorial was one of eight tutorials selected to be presented at the Third International Conference on Intelligent Systems for Molecular Biology which was held in the United Kingdom from July 16 to 19, 1995. This tutorial examines how the information conserved during the evolution of a protein molecule can be used to reliably infer homology, and thus a shared protein fold and possibly a shared active site or function. The authors start by reviewing a geological/evolutionary time scale. Next they look at the evolution of several protein families. During the tutorial, these families will be used to demonstrate that homologous protein ancestry can be inferred with confidence. They also examine different modes of protein evolution and consider some hypotheses that have been presented to explain the very earliest events in protein evolution. The next part of the tutorial will examine the technical aspects of protein sequence comparison. Both optimal and heuristic algorithms and their associated parameters that are used to characterize protein sequence similarities are discussed. Perhaps more importantly, they survey the statistics of local similarity scores, and how these statistics can be used both to improve the selectivity of a search and to evaluate the significance of a match. They then examine distantly related members of three protein families, the serine proteases, the glutathione transferases, and the G-protein-coupled receptors (GCRs). Finally, they discuss how sequence similarity can be used to examine internal repeated or mosaic structures in proteins.
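    The "optimal" algorithms for local similarity referred to in this record are dynamic-programming methods of the Smith-Waterman type. The sketch below computes only the best local alignment score, using a flat match/mismatch score and a linear gap penalty. These scoring parameters are illustrative assumptions: practical protein searches use substitution matrices such as BLOSUM and affine gap penalties, and the tutorial's own parameter choices are not given in the abstract.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score between sequences a and b."""
    # prev holds the previous row of the dynamic-programming matrix.
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap        # gap in b
            left = curr[j - 1] + gap  # gap in a
            # Flooring at zero is what makes the alignment local.
            curr[j] = max(0, diag, up, left)
            best = max(best, curr[j])
        prev = curr
    return best
```

    Evaluating the significance of such a score, the second topic of the tutorial, requires a statistical model of scores for unrelated sequences and is outside this sketch.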

  9. Prediction of protein-protein interactions in dengue virus coat proteins guided by low resolution cryoEM structures

    Directory of Open Access Journals (Sweden)

    Srinivasan Narayanaswamy

    2010-06-01

    Background: Dengue virus, along with the other members of the Flaviviridae family, has reemerged as a deadly human pathogen. Understanding the mechanistic details of these infections can be highly rewarding in developing effective antivirals. During maturation of the virus inside the host cell, the coat proteins E and M undergo conformational changes, altering the morphology of the viral coat. However, due to the low-resolution nature of the available 3-D structures of viral assemblies, the atomic details of these changes are still elusive. Results: In the present analysis, starting from Cα positions of low-resolution cryo-electron microscopic structures, the residue-level details of protein-protein interaction interfaces of dengue virus coat proteins have been predicted. By comparing the preexisting structures of the virus in different phases of its life cycle, the changes taking place in these predicted protein-protein interaction interfaces were followed as a function of the maturation process of the virus. Besides changing the current notion about the presence of only homodimers in the mature viral coat, the present analysis indicated the presence of a proline-rich motif at the protein-protein interaction interface of the coat protein. Investigating the conservation status of these seemingly functionally crucial residues across other members of the Flaviviridae family enabled dissecting common mechanisms used for infections by these viruses. Conclusions: Thus, using a computational approach, the present analysis has provided better insights into the preexisting low-resolution structures of virus assemblies, the findings of which can be used in designing effective antivirals against these deadly human pathogens.

  10. Putative recombination events and evolutionary history of five economically important viruses of fruit trees based on coat protein-encoding gene sequence analysis.

    Science.gov (United States)

    Boulila, Moncef

    2010-06-01

    To enhance the knowledge of recombination as an evolutionary process, 267 accessions retrieved from GenBank were investigated, all belonging to five economically important viruses infecting fruit crops (Plum pox, Apple chlorotic leaf spot, Apple mosaic, Prune dwarf, and Prunus necrotic ringspot viruses). Putative recombinational events were detected in the coat protein (CP)-encoding gene using RECCO and RDP version 3.31beta algorithms. Based on RECCO results, all five viruses were shown to contain potential recombination signals in the CP gene. Reconstructed trees with modified topologies were proposed. Furthermore, RECCO performed better than the RDP package in detecting recombination events and exhibiting their evolution rate along the sequences of the five viruses. RDP, however, provided the possible major and minor parents of the recombinants. Thus, the two methods should be considered complementary.

  11. Exploring the interaction network of the Bacillus subtilis outer coat and crust proteins.

    Science.gov (United States)

    Krajčíková, Daniela; Forgáč, Vladimír; Szabo, Adam; Barák, Imrich

    2017-11-01

    Bacillus subtilis spores, representatives of an exceptionally resistant dormant cell type, are encircled by a thick proteinaceous layer called the spore coat. More than 80 proteins assemble into four distinct coat layers: a basement layer, an inner coat, an outer coat and a crust. As the spore develops inside the mother cell, spore coat proteins synthesized in the cytoplasm are gradually deposited onto the prespore surface. A small set of morphogenetic proteins necessary for spore coat morphogenesis are thought to form a scaffold to which the rest of the coat proteins are attached. Extensive localization and proteomic studies using wild type and mutant spores have revealed the arrangement of individual proteins within the spore coat layers. In this study we examined the interactions between the proteins localized to the outer coat and crust using a bacterial two hybrid system. These two layers are composed of at least 25 components. Self-interactions were observed for most proteins and numerous novel interactions were identified. The most interesting contacts are those made with the morphogenetic proteins CotE, CotY and CotZ; these could serve as a basis for understanding the specific roles of particular proteins in spore coat morphogenesis. Copyright © 2017 Elsevier GmbH. All rights reserved.

  12. HIV protein sequence hotspots for crosstalk with host hub proteins.

    Directory of Open Access Journals (Sweden)

    Mahdi Sarmady

    HIV proteins target host hub proteins for transient binding interactions. The presence of viral proteins in the infected cell results in out-competition of host proteins in their interaction with hub proteins, drastically affecting cell physiology. Functional genomics and interactome datasets can be used to quantify the sequence hotspots on the HIV proteome mediating interactions with host hub proteins. In this study, we used the HIV and human interactome databases to identify HIV targeted host hub proteins and their host binding partners (H2). We developed a high throughput computational procedure utilizing motif discovery algorithms on sets of protein sequences, including sequences of HIV and H2 proteins. We identified as HIV sequence hotspots those linear motifs that are highly conserved on HIV sequences and at the same time have a statistically enriched presence on the sequences of H2 proteins. The HIV protein motifs discovered in this study are expressed by subsets of H2 host proteins potentially outcompeted by HIV proteins. A large subset of these motifs is involved in cleavage, nuclear localization, phosphorylation, and transcription factor binding events. Many such motifs are clustered on an HIV sequence in the form of hotspots. The sequential positions of these hotspots are consistent with the curated literature on phenotype altering residue mutations, as well as with existing binding site data. The hotspot map produced in this study is the first global portrayal of HIV motifs involved in altering the host protein network at highly connected hub nodes.
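    The record does not state which statistic was used to call a motif "statistically enriched" among H2 proteins. One common choice, shown here purely as an assumed stand-in, is a one-sided hypergeometric test: given a background of `population` sequences of which `successes` carry the motif, it gives the probability of seeing at least `hits` motif carriers in a set of `sample` sequences. All names are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(population, successes, sample, hits):
    """P(X >= hits) when `sample` sequences are drawn without replacement."""
    upper = min(successes, sample)
    total = comb(population, sample)
    # Sum the upper tail of the hypergeometric distribution exactly.
    tail = sum(
        comb(successes, x) * comb(population - successes, sample - x)
        for x in range(hits, upper + 1)
    )
    return tail / total
```

    A small p-value would indicate that the motif occurs in the H2 set more often than expected from the background frequency. In a screen over many motifs, a multiple-testing correction would also be needed.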

  13. Structure-function relationship of viral coat proteins : a site-directed spectroscopic study of M13 coat protein

    NARCIS (Netherlands)

    Stopar, D.

    1997-01-01

    This thesis describes the results of a spectroscopic study of the major coat protein of bacteriophage M13. During the infection process this protein is incorporated into the cytoplasmic membrane of Escherichia coli host cells. To specifically monitor the local structural changes

  14. 'Let the phage do the work': Using the phage P22 coat protein structures as a framework to understand its folding and assembly mutants

    International Nuclear Information System (INIS)

    Teschke, Carolyn M.; Parent, Kristin N.

    2010-01-01

    The amino acid sequence of viral capsid proteins contains information about their folding, structure and self-assembly processes. While some viruses assemble from small preformed oligomers of coat proteins, other viruses such as phage P22 and herpesvirus assemble from monomeric proteins (Fuller and King, 1980). The subunit assembly process is strictly controlled through protein:protein interactions such that icosahedral structures are formed with specific symmetries, rather than aberrant structures. dsDNA viruses commonly assemble by first forming a precursor capsid that serves as a DNA packaging machine. DNA packaging is accompanied by a conformational transition of the small precursor procapsid into a larger capsid for isometric viruses. Here we highlight the pseudo-atomic structures of phage P22 coat protein and rationalize several decades of data about P22 coat protein folding, assembly and maturation generated from a combination of genetics and biochemistry.

  15. Simple Coatings to Render Polystyrene Protein Resistant

    Directory of Open Access Journals (Sweden)

    Marcelle Hecker

    2018-02-01

    Non-specific protein adsorption is detrimental to the performance of many biomedical devices. Polystyrene is a commonly used material in devices and thin films. Simple reliable surface modification of polystyrene to render it protein resistant is desired in particular for device fabrication and orthogonal functionalisation schemes. This report details modifications carried out on a polystyrene surface to prevent protein adsorption. The trialed surfaces included Pluronic F127 and PLL-g-PEG, adsorbed on polystyrene, using a polydopamine-assisted approach. Quartz crystal microbalance with dissipation (QCM-D) results showed only short-term anti-fouling success of the polystyrene surface modified with F127, and the subsequent failure of the polydopamine intermediary layer in improving its stability. In stark contrast, QCM-D analysis proved the success of the polydopamine assisted PLL-g-PEG coating in preventing bovine serum albumin adsorption. This modified surface is equally as protein-rejecting after 24 h in buffer, and thus a promising simple coating for long term protein rejection of polystyrene.

  16. Novel algorithms for protein sequence analysis

    NARCIS (Netherlands)

    Ye, Kai

    2008-01-01

    Each protein is characterized by its unique sequential order of amino acids, the so-called protein sequence. Biology's paradigm is that this order of amino acids determines the protein's architecture and function. In this thesis, we introduce novel algorithms to analyze protein sequences. Chapter 1

  17. Adaptive covariation between the coat and movement proteins of prunus necrotic ringspot virus.

    Science.gov (United States)

    Codoñer, Francisco M; Fares, Mario A; Elena, Santiago F

    2006-06-01

    The relative functional and/or structural importance of different amino acid sites in a protein can be assessed by evaluating the selective constraints to which they have been subjected during the course of evolution. Here we explore such constraints at the linear and three-dimensional levels for the movement protein (MP) and coat protein (CP) encoded by RNA 3 of prunus necrotic ringspot ilarvirus (PNRSV). By a maximum-parsimony approach, the nucleotide sequences from 46 isolates of PNRSV varying in symptomatology, host tree, and geographic origin have been analyzed and sites under different selective pressures have been identified in both proteins. We have also performed covariation analyses to explore whether changes in certain amino acid sites condition subsequent variation in other sites of the same protein or the other protein. These covariation analyses shed light on which particular amino acids should be involved in the physical and functional interaction between MP and CP. Finally, we discuss these findings in the light of what is already known about the implication of certain sites and domains in structure and protein-protein and RNA-protein interactions.
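    The abstract does not name the covariation statistic used. As an illustration of the general idea only, the sketch below scores two alignment columns by their mutual information, a widely used covariation measure that is assumed here and is not necessarily the authors' method. The function name and the toy alignment are hypothetical.

```python
from collections import Counter
from math import log2


def column_mutual_information(alignment, i, j):
    """Mutual information (in bits) between columns i and j of an alignment."""
    n = len(alignment)
    col_i = [seq[i] for seq in alignment]
    col_j = [seq[j] for seq in alignment]
    freq_i = Counter(col_i)
    freq_j = Counter(col_j)
    freq_ij = Counter(zip(col_i, col_j))
    mi = 0.0
    for (a, b), count in freq_ij.items():
        p_ab = count / n
        # Compare the joint frequency with the product of the marginals.
        mi += p_ab * log2(p_ab / ((freq_i[a] / n) * (freq_j[b] / n)))
    return mi


# Toy alignment: columns 0 and 1 change together, column 2 is invariant.
toy = ["AKL", "AKL", "GEL", "GEL"]
```

    For the toy alignment, columns 0 and 1 score one bit and any pair involving the invariant column scores zero. With real data, shared phylogenetic history inflates such scores, so a correction for that is usually needed before interpreting high-scoring pairs as functional coupling between MP and CP sites.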

  18. Spore coat protein synthesis in cell-free systems from sporulating cells of Bacillus subtilis.

    Science.gov (United States)

    Nakayama, T; Munoz, L E; Sadaie, Y; Doi, R H

    1978-09-01

    Cell-free systems for protein synthesis were prepared from Bacillus subtilis 168 cells at several stages of sporulation. Immunological methods were used to determine whether spore coat protein could be synthesized in the cell-free systems prepared from sporulating cells. Spore coat protein synthesis first occurred in extracts from stage t2 cells. The proportion of spore coat protein to total proteins synthesized in the cell-free systems was 2.4 and 3.9% at stages t2 and t4, respectively. The sodium dodecyl sulfate-urea-polyacrylamide gel electrophoresis patterns of immunoprecipitates from the cell-free systems showed the complete synthesis of an apparent spore coat protein precursor (molecular weight, 25,000). A polypeptide of this weight was previously identified in studies in vivo (L.E. Munoz, Y. Sadaie, and R.H. Doi, J. Biol. Chem., in press). The synthesis in vitro of polysome-associated nascent spore coat polypeptides with varying molecular weights up to 23,000 was also detected. These results indicate that the spore coat protein may be synthesized as a precursor protein. The removal of proteases in the crude extracts by treatment with hemoglobin-Sepharose affinity techniques may be preventing the conversion of the large 25,000-dalton precursor to the 12,500-dalton mature spore coat protein.

  19. Molecular characterization and intermolecular interaction of coat protein of Prunus necrotic ringspot virus: implications for virus assembly.

    Science.gov (United States)

    Kulshrestha, Saurabh; Hallan, Vipin; Sharma, Anshul; Seth, Chandrika Attri; Chauhan, Anjali; Zaidi, Aijaz Asghar

    2013-09-01

    Coat protein (CP) and RNA3 from Prunus necrotic ringspot virus (PNRSV-rose), the most prevalent virus infecting rose in India, were characterized, and regions in the coat protein important for self-interaction during dimer formation were identified. The sequence analysis of CP and partial RNA 3 revealed that the rose isolate of PNRSV in India belongs to the PV-32 group of PNRSV isolates. Apart from the already established group-specific features of PV-32 group members, additional group-specific and host-specific features were also identified. Presence of methionine at position 90 in the amino acid sequence alignment of PNRSV CP gene (belonging to PV-32 group) was identified as the specific conserved feature for the rose isolates of PNRSV. As protein-protein interaction plays a vital role in the infection process, an attempt was made to identify the portions of PNRSV CP responsible for self-interaction using the yeast two-hybrid system. It was found (after analysis of the deletion clones) that the C-terminal region of PNRSV CP (amino acids 153-226) plays a vital role in this interaction during dimer formation. The N-terminal region of PNRSV CP was previously known to be involved in CP-RNA interactions, but our results also suggested that the N-terminal region represented by amino acids 1-77 interacts with the C-terminal region (amino acids 153-226) in the yeast two-hybrid system, suggesting its probable involvement in the CP-CP interaction.

  20. Covalently coating dextran on macroporous polyglycidyl methacrylate microsphere enabled rapid protein chromatographic separation

    International Nuclear Information System (INIS)

    Zhang, Rongyue; Li, Qiang; Li, Juan; Zhou, Weiqing; Ye, Peili; Gao, Yang; Ma, Guanghui; Su, Zhiguo

    2012-01-01

    Protein denaturation and nonspecific adsorption on polymer media as a chromatographic support have been a problem which needs to be overcome. Macroporous poly(glycidyl methacrylate–divinylbenzene) (PGMA–DVB) microspheres prepared in this study were firstly covalently coated with dextran through a three-step method. The dextran was firstly adsorbed onto the microspheres and then covalently bound to the PGMA–DVB microsphere through ether bonds which were formed by hydroxyl groups reacting with epoxy groups in the presence of 4-(dimethylamino)pyridine. Finally, the coating dextran layer was crosslinked by ethylene glycol diglycidyl ether to form the continuous network coating. The coated microspheres were characterized by Fourier transform infrared spectra, scanning electron microscope, mercury porosimetry measurements, laser scanning confocal microscope, and protein adsorption experiments. Results showed that PGMA–DVB microspheres coated with dextran successfully maintained the macroporous structure and high permeability. The backpressure was only 1.69 MPa at a high flow rate of 2891 cm/h. Consequently, the hydrophilicity and biocompatibility of modified microspheres were greatly improved, and the contact angle decreased from 184° to 13°, and nonspecific adsorption of proteins was decreased to little or none. The clad dextran coating with large amounts of hydroxyl group was easily derived to be various functional groups. The derived media have great potential applications in rapid protein chromatography. - Highlights: ► Macroporous PGMA–DVB microspheres were covalently coated with dextran. ► The hydrophilicity of the coated microspheres was significantly improved. ► The irreversible adsorption of proteins was reduced to zero. ► The coated microspheres can maintain the macropore structure. ► The coated microspheres were applied to rapid protein separation.

  1. Covalently coating dextran on macroporous polyglycidyl methacrylate microsphere enabled rapid protein chromatographic separation

    Energy Technology Data Exchange (ETDEWEB)

    Zhang, Rongyue; Li, Qiang; Li, Juan; Zhou, Weiqing; Ye, Peili; Gao, Yang; Ma, Guanghui, E-mail: ghma@home.ipe.ac.cn; Su, Zhiguo

    2012-12-01

    Protein denaturation and nonspecific adsorption on polymer media as a chromatographic support have been a problem which needs to be overcome. Macroporous poly(glycidyl methacrylate-divinylbenzene) (PGMA-DVB) microspheres prepared in this study were firstly covalently coated with dextran through a three-step method. The dextran was firstly adsorbed onto the microspheres and then covalently bound to the PGMA-DVB microsphere through ether bonds which were formed by hydroxyl groups reacting with epoxy groups in the presence of 4-(dimethylamino)pyridine. Finally, the coating dextran layer was crosslinked by ethylene glycol diglycidyl ether to form the continuous network coating. The coated microspheres were characterized by Fourier transform infrared spectra, scanning electron microscope, mercury porosimetry measurements, laser scanning confocal microscope, and protein adsorption experiments. Results showed that PGMA-DVB microspheres coated with dextran successfully maintained the macroporous structure and high permeability. The backpressure was only 1.69 MPa at a high flow rate of 2891 cm/h. Consequently, the hydrophilicity and biocompatibility of modified microspheres were greatly improved, and the contact angle decreased from 184° to 13°, and nonspecific adsorption of proteins was decreased to little or none. The clad dextran coating with large amounts of hydroxyl group was easily derived to be various functional groups. The derived media have great potential applications in rapid protein chromatography. - Highlights: ► Macroporous PGMA-DVB microspheres were covalently coated with dextran. ► The hydrophilicity of the coated microspheres was significantly improved. ► The irreversible adsorption of proteins was reduced to zero. ► The coated microspheres can maintain the macropore structure. ► The coated microspheres

  2. Reinforcement of Bacillus subtilis spores by cross-linking of outer coat proteins during maturation.

    Science.gov (United States)

    Abhyankar, Wishwas; Pandey, Rachna; Ter Beek, Alexander; Brul, Stanley; de Koning, Leo J; de Koster, Chris G

    2015-02-01

    Resistance characteristics of bacterial endospores towards various environmental stresses such as chemicals and heat are in part attributed to their coat proteins. Heat resistance is developed in a late stage of sporulation and during maturation of released spores. Using our gel-free proteomic approach and LC-FT-ICR-MS/MS analysis we have monitored the efficiency of the tryptic digestion of proteins in the coat during spore maturation over a period of eight days, using metabolically 15N-labeled mature spores as reference. The results showed that during spore maturation the loss of digestion efficiency of outer coat and crust proteins synchronized with the increase in heat resistance. This implies that spore maturation involves chemical cross-linking of outer coat and crust layer proteins, leaving the inner coat layer proteins unmodified. It appears that digestion efficiencies of spore surface proteins can be linked to their location within the coat and crust layers. We also attempted to study a possible link between spore maturation and the observed heterogeneity in spore germination. Copyright © 2014 Elsevier Ltd. All rights reserved.

  3. Experimental Rugged Fitness Landscape in Protein Sequence Space

    Science.gov (United States)

    Hayashi, Yuuki; Aita, Takuyo; Toyota, Hitoshi; Husimi, Yuzuru; Urabe, Itaru; Yomo, Tetsuya

    2006-01-01

    The fitness landscape in sequence space determines the process of biomolecular evolution. To plot the fitness landscape of protein function, we carried out in vitro molecular evolution beginning with a defective fd phage carrying a random polypeptide of 139 amino acids in place of the g3p minor coat protein D2 domain, which is essential for phage infection. After 20 cycles of random substitution at sites 12–130 of the initial random polypeptide and selection for infectivity, the selected phage showed a 1.7×10^4-fold increase in infectivity, defined as the number of infected cells per ml of phage suspension. Fitness was defined as the logarithm of infectivity, and we analyzed (1) the dependence of stationary fitness on library size, which increased gradually, and (2) the time course of changes in fitness in transitional phases, based on an original theory regarding the evolutionary dynamics in Kauffman's n-k fitness landscape model. In the landscape model, single mutations at single sites among n sites affect the contribution of k other sites to fitness. Based on the results of these analyses, k was estimated to be 18–24. According to the estimated parameters, the landscape was plotted as a smooth surface up to a relative fitness of 0.4 of the global peak, whereas the landscape had a highly rugged surface with many local peaks above this relative fitness value. Based on the landscapes of these two different surfaces, it appears possible for adaptive walks with only random substitutions to climb with relative ease up to the middle region of the fitness landscape from any primordial or random sequence, whereas an enormous range of sequence diversity is required to climb further up the rugged surface above the middle region. PMID:17183728
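    Kauffman's n-k model described in this record can be illustrated with a toy simulation. The sketch below is a simplification under stated assumptions: sites are binary rather than drawn from 20 amino acids, each site's fitness contribution depends on its own state and on k randomly chosen other sites, contributions are uniform random numbers filled in lazily, and the adaptive walk is a greedy single-site hill climb. None of the names or parameter values come from the paper, and this is not the authors' evolutionary-dynamics theory.

```python
import random


def make_nk_landscape(n, k, seed=0):
    """Return a fitness function for a binary NK landscape (toy model)."""
    rng = random.Random(seed)
    # For each site, choose k other sites that modulate its contribution.
    neighbors = [rng.sample([j for j in range(n) if j != i], k) for i in range(n)]
    table = {}  # contribution lookup, filled in on first use

    def fitness(genotype):
        total = 0.0
        for i in range(n):
            key = (i, genotype[i]) + tuple(genotype[j] for j in neighbors[i])
            if key not in table:
                table[key] = rng.random()
            total += table[key]
        return total / n

    return fitness


def adaptive_walk(fitness, genotype):
    """Accept single-site flips that raise fitness until none is left."""
    genotype = list(genotype)
    current = fitness(genotype)
    path = [current]
    improved = True
    while improved:
        improved = False
        for i in range(len(genotype)):
            genotype[i] ^= 1  # try flipping site i
            candidate = fitness(genotype)
            if candidate > current:
                current = candidate
                path.append(current)
                improved = True
            else:
                genotype[i] ^= 1  # revert the flip
    return genotype, path
```

    With k = 0 every site contributes independently and walks from any start reach the same peak; larger k makes contributions interdependent, so walks tend to stall on local peaks, which is the ruggedness the abstract describes for the upper part of the landscape.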

  4. Experimental rugged fitness landscape in protein sequence space.

    Science.gov (United States)

    Hayashi, Yuuki; Aita, Takuyo; Toyota, Hitoshi; Husimi, Yuzuru; Urabe, Itaru; Yomo, Tetsuya

    2006-12-20

    The fitness landscape in sequence space determines the process of biomolecular evolution. To plot the fitness landscape of protein function, we carried out in vitro molecular evolution beginning with a defective fd phage carrying a random polypeptide of 139 amino acids in place of the g3p minor coat protein D2 domain, which is essential for phage infection. After 20 cycles of random substitution at sites 12-130 of the initial random polypeptide and selection for infectivity, the selected phage showed a 1.7×10^4-fold increase in infectivity, defined as the number of infected cells per ml of phage suspension. Fitness was defined as the logarithm of infectivity, and we analyzed (1) the dependence of stationary fitness on library size, which increased gradually, and (2) the time course of changes in fitness in transitional phases, based on an original theory regarding the evolutionary dynamics in Kauffman's n-k fitness landscape model. In the landscape model, single mutations at single sites among n sites affect the contribution of k other sites to fitness. Based on the results of these analyses, k was estimated to be 18-24. According to the estimated parameters, the landscape was plotted as a smooth surface up to a relative fitness of 0.4 of the global peak, whereas the landscape had a highly rugged surface with many local peaks above this relative fitness value. Based on the landscapes of these two different surfaces, it appears possible for adaptive walks with only random substitutions to climb with relative ease up to the middle region of the fitness landscape from any primordial or random sequence, whereas an enormous range of sequence diversity is required to climb further up the rugged surface above the middle region.

  5. Experimental rugged fitness landscape in protein sequence space.

    Directory of Open Access Journals (Sweden)

    Yuuki Hayashi

    The fitness landscape in sequence space determines the process of biomolecular evolution. To plot the fitness landscape of protein function, we carried out in vitro molecular evolution beginning with a defective fd phage carrying a random polypeptide of 139 amino acids in place of the g3p minor coat protein D2 domain, which is essential for phage infection. After 20 cycles of random substitution at sites 12-130 of the initial random polypeptide and selection for infectivity, the selected phage showed a 1.7×10^4-fold increase in infectivity, defined as the number of infected cells per ml of phage suspension. Fitness was defined as the logarithm of infectivity, and we analyzed (1) the dependence of stationary fitness on library size, which increased gradually, and (2) the time course of changes in fitness in transitional phases, based on an original theory regarding the evolutionary dynamics in Kauffman's n-k fitness landscape model. In the landscape model, single mutations at single sites among n sites affect the contribution of k other sites to fitness. Based on the results of these analyses, k was estimated to be 18-24. According to the estimated parameters, the landscape was plotted as a smooth surface up to a relative fitness of 0.4 of the global peak, whereas the landscape had a highly rugged surface with many local peaks above this relative fitness value. Based on the landscapes of these two different surfaces, it appears possible for adaptive walks with only random substitutions to climb with relative ease up to the middle region of the fitness landscape from any primordial or random sequence, whereas an enormous range of sequence diversity is required to climb further up the rugged surface above the middle region.

  6. Genetic diversity of the movement and coat protein genes of South American isolates of Prunus necrotic ringspot virus.

    Science.gov (United States)

    Fiore, Nicola; Fajardo, Thor V M; Prodan, Simona; Herranz, María Carmen; Aparicio, Frederic; Montealegre, Jaime; Elena, Santiago F; Pallás, Vicente; Sánchez-Navarro, Jesús

    2008-01-01

    Prunus necrotic ringspot virus (PNRSV) is distributed worldwide, but no molecular data have been previously reported from South American isolates. The nucleotide sequences corresponding to the movement (MP) and coat (CP) proteins of 23 isolates of PNRSV from Chile, Brazil, and Uruguay, and from different Prunus species, have been obtained. Phylogenetic analysis performed with full-length MP and CP sequences from all the PNRSV isolates confirmed the clustering of the isolates into the previously reported PV32-I, PV96-II and PE5-III phylogroups. No association was found between specific sequences and host, geographic origin or symptomatology. Comparative analysis showed that both MP and CP have phylogroup-specific amino acids and all of the motifs previously characterized for both proteins. The study of the distribution of synonymous and nonsynonymous changes along both open reading frames revealed that most amino acid sites are under the effect of negative purifying selection.

  7. Use of designed sequences in protein structure recognition.

    Science.gov (United States)

    Kumar, Gayatri; Mudgal, Richa; Srinivasan, Narayanaswamy; Sandhya, Sankaran

    2018-05-09

    Knowledge of the protein structure is a pre-requisite for improved understanding of molecular function. The gap in the sequence-structure space has increased in the post-genomic era. Grouping related protein sequences into families can aid in narrowing the gap. In the Pfam database, structure description is provided for part or full-length proteins of 7726 families. For the remaining 52% of the families, information on 3-D structure is not yet available. We use the computationally designed sequences that are intermediately related to two protein domain families, which are already known to share the same fold. These strategically designed sequences enable detection of distant relationships and here, we have employed them for the purpose of structure recognition of protein families of yet unknown structure. We first measured the success rate of our approach using a dataset of protein families of known fold and achieved a success rate of 88%. Next, for 1392 families of yet unknown structure, we made structural assignments for part/full length of the proteins. Fold association for 423 domains of unknown function (DUFs) are provided as a step towards functional annotation. The results indicate that knowledge-based filling of gaps in protein sequence space is a lucrative approach for structure recognition. Such sequences assist in traversal through protein sequence space and effectively function as 'linkers', where natural linkers between distant proteins are unavailable. This article was reviewed by Oliviero Carugo, Christine Orengo and Srikrishna Subramanian.

  8. Role of Monomer Sequence, Hydrogen Bonding and Mesoscale Architecture in Marine Antifouling Coatings

    Science.gov (United States)

    Segalman, Rachel

    Polypeptoids are non-natural, sequence specific polymers that offer the opportunity to probe the effect of monomer sequence, chirality, and chain shape on self-assembly and surface properties. Additionally, polypeptoid synthesis is more scaleable than traditional polypeptides suggesting their utility in large area applications. We have designed efficient marine anti-fouling coatings by using triblock copolymer scaffolds to which polypeptoids are tethered in order to tune both the modulus and surface energies with great precision. Surprisingly, when short sequences are tethered to a polymer backbone, polypeptoids consistently outperform analogous polypeptides in antifouling properties. We hypothesize that the hydrogen bonding inherent to the polypeptide backbone drives the observed differences in performance. We also find that the polymer scaffold housing the polypeptoids also plays a crucial role in directing surface presentation and therefore the overall coating properties.

  9. Improving accuracy of protein-protein interaction prediction by considering the converse problem for sequence representation

    Directory of Open Access Journals (Sweden)

    Wang Yong

    2011-10-01

    Background: With the development of genome-sequencing technologies, protein sequences are readily obtained by translating the measured mRNAs. Therefore predicting protein-protein interactions from the sequences is in great demand. The reason lies in the fact that identifying protein-protein interactions is becoming a bottleneck for eventually understanding the functions of proteins, especially for those organisms barely characterized. Although a few methods have been proposed, the converse problem, whether the features used extract sufficient and unbiased information from protein sequences, is almost untouched. Results: In this study, we interrogate this problem theoretically by an optimization scheme. Motivated by the theoretical investigation, we find novel encoding methods for both protein sequences and protein pairs. Our new methods sufficiently exploit the information of protein sequences and reduce artificial bias and computational cost. Thus, they significantly outperform the available methods regarding sensitivity, specificity, precision, and recall with cross-validation evaluation and reach ~80% and ~90% accuracy in Escherichia coli and Saccharomyces cerevisiae respectively. Our findings here hold important implications for other sequence-based prediction tasks because representation of biological sequence is always the first step in computational biology. Conclusions: By considering the converse problem, we propose new representation methods for both protein sequences and protein pairs. The results show that our method significantly improves the accuracy of protein-protein interaction predictions.

  10. Protein-resistant polymer coatings obtained by matrix assisted pulsed laser evaporation

    Energy Technology Data Exchange (ETDEWEB)

    Rusen, L. [National Institute for Lasers, Plasma and Radiation Physics, 409 Atomistilor Street, PO Box MG-16, 077125, Magurele, Bucharest (Romania); Mustaciosu, C. [Horia Hulubei National Institute of Physics and Nuclear Engineering - IFIN HH, Magurele, Bucharest (Romania); Mitu, B.; Filipescu, M.; Dinescu, M. [National Institute for Lasers, Plasma and Radiation Physics, 409 Atomistilor Street, PO Box MG-16, 077125, Magurele, Bucharest (Romania); Dinca, V., E-mail: dinali@nipne.ro [National Institute for Lasers, Plasma and Radiation Physics, 409 Atomistilor Street, PO Box MG-16, 077125, Magurele, Bucharest (Romania)

    2013-08-01

    Adsorption of proteins and polysaccharides is known to facilitate microbial attachment and subsequent formation of biofilm on surfaces that ultimately results in its biofouling. Therefore, protein repellent modified surfaces are necessary to block the irreversible attachment of microorganisms. Within this context, the feasibility of using the Poly(ethylene glycol)-block-poly(ε-caprolactone) methyl ether (PEG-block-PCL Me) copolymer as potential protein-resistant coating was explored in this work. The films were deposited using Matrix Assisted Pulsed Laser Evaporation (MAPLE), a technique that allows good control of composition, thickness and homogeneity. The chemical and morphological characteristics of the films were examined using Fourier Transform Infrared Spectroscopy (FTIR), contact angle measurements and Atomic Force Microscopy (AFM). The FTIR data demonstrates that the functional groups in the MAPLE-deposited films remain intact, especially for fluences below 0.5 J cm⁻². Optical Microscopy and AFM images show that the homogeneity and the roughness of the coatings are related to both laser parameters (fluence, number of pulses) and target composition. Protein adsorption tests were performed on the PEG-block-PCL Me copolymer coated glass and on bare glass surface as a control. The results show that the presence of copolymer as coating significantly reduces the adsorption of proteins.

  11. The complete nucleotide sequence of RNA 3 of a peach isolate of Prunus necrotic ringspot virus.

    Science.gov (United States)

    Hammond, R W; Crosslin, J M

    1995-04-01

    The complete nucleotide sequence of RNA 3 of the PE-5 peach isolate of Prunus necrotic ringspot ilarvirus (PNRSV) was obtained from cloned cDNA. The RNA sequence is 1941 nucleotides and contains two open reading frames (ORFs). ORF 1 consisted of 284 amino acids with a calculated molecular weight of 31,729 Da and ORF 2 contained 224 amino acids with a calculated molecular weight of 25,018 Da. ORF 2 corresponds to the coat protein gene. Expression of ORF 2 engineered into a pTrcHis vector in Escherichia coli results in a fusion polypeptide of approximately 28 kDa which cross-reacts with PNRSV polyclonal antiserum. Analysis of the coat protein amino acid sequence reveals a putative "zinc-finger" domain at the amino-terminal portion of the protein. Two tetranucleotide AUGC motifs occur in the 3'-UTR of the RNA and may function in coat protein binding and genome activation. ORF 1 homologies to other ilarviruses and alfalfa mosaic virus are confined to limited regions of conserved amino acids. The translated amino acid sequence of the coat protein gene shows 92% similarity to one isolate of apple mosaic virus, a closely related member of the ilarvirus group of plant viruses, but only 66% similarity to the amino acid sequence of the coat protein gene of a second isolate. These relationships are also reflected at the nucleotide sequence level. These results in one instance confirm the close similarities observed at the biophysical and serological levels between these two viruses, but on the other hand call into question the nomenclature used to describe these viruses.

  12. Highly specific salt bridges govern bacteriophage P22 icosahedral capsid assembly: identification of the site in coat protein responsible for interaction with scaffolding protein.

    Science.gov (United States)

    Cortines, Juliana R; Motwani, Tina; Vyas, Aashay A; Teschke, Carolyn M

    2014-05-01

    Icosahedral virus assembly requires a series of concerted and highly specific protein-protein interactions to produce a proper capsid. In bacteriophage P22, only coat protein (gp5) and scaffolding protein (gp8) are needed to assemble a procapsid-like particle, both in vivo and in vitro. In scaffolding protein's coat binding domain, residue R293 is required for procapsid assembly, while residue K296 is important but not essential. Here, we investigate the interaction of scaffolding protein with acidic residues in the N-arm of coat protein, since this interaction has been shown to be electrostatic. Through site-directed mutagenesis of genes 5 and 8, we show that changing coat protein N-arm residue 14 from aspartic acid to alanine causes a lethal phenotype. Coat protein residue D14 is shown by cross-linking to interact with scaffolding protein residue R293 and, thus, is intimately involved in proper procapsid assembly. To a lesser extent, coat protein N-arm residue E18 is also implicated in the interaction with scaffolding protein and is involved in capsid size determination, since a cysteine mutation at this site generated petite capsids. The final acidic residue in the N-arm that was tested, E15, is shown to only weakly interact with scaffolding protein's coat binding domain. This work supports growing evidence that surface charge density may be the driving force of virus capsid protein interactions. Bacteriophage P22 infects Salmonella enterica serovar Typhimurium and is a model for icosahedral viral capsid assembly. In this system, coat protein interacts with an internal scaffolding protein, triggering the assembly of an intermediate called a procapsid. Previously, we determined that there is a single amino acid in scaffolding protein required for P22 procapsid assembly, although others modulate affinity. Here, we identify partners in coat protein. We show experimentally that relatively weak interactions between coat and scaffolding proteins are capable of driving

  13. Nucleotide sequence and genetic organization of Hungarian grapevine chrome mosaic nepovirus RNA2.

    Science.gov (United States)

    Brault, V; Hibrand, L; Candresse, T; Le Gall, O; Dunez, J

    1989-10-11

    The complete nucleotide sequence of Hungarian grapevine chrome mosaic nepovirus (GCMV) RNA2 has been determined. The RNA sequence is 4441 nucleotides in length, excluding the poly(A) tail. A polyprotein of 1324 amino acids with a calculated molecular weight of 146 kDa is encoded in a single long open reading frame extending from nucleotides 218 to 4190. This polyprotein is homologous with the protein encoded by the S strain of tomato black ring virus (TBRV) RNA2, the only other nepovirus sequenced so far. Direct sequencing of the viral coat protein and in vitro translation of transcripts derived from cDNA sequences demonstrate that, as for comoviruses, the coat protein is located at the carboxy terminus of the polyprotein. A model for the expression of GCMV RNA2 is presented.

  14. Nonlinear deterministic structures and the randomness of protein sequences

    CERN Document Server

    Huang Yan Zhao

    2003-01-01

    To clarify the randomness of protein sequences, we make a detailed analysis of a set of typical protein sequences representing each structural class, using a nonlinear prediction method. No deterministic structures are found in these protein sequences, which implies that they behave as random sequences. We also give an explanation for the controversial results obtained in previous investigations.

  15. Nucleation phenomena in protein folding: the modulating role of protein sequence

    International Nuclear Information System (INIS)

    Travasso, Rui D M; Faísca, Patricia F N; Gama, Margarida M Telo da

    2007-01-01

    For the vast majority of naturally occurring, small, single-domain proteins, folding is often described as a two-state process that lacks detectable intermediates. This observation has often been rationalized on the basis of a nucleation mechanism for protein folding whose basic premise is the idea that, after completion of a specific set of contacts forming the so-called folding nucleus, the native state is achieved promptly. Here we propose a methodology to identify folding nuclei in small lattice polymers and apply it to the study of protein molecules with a chain length of N = 48. To investigate the extent to which protein topology is a robust determinant of the nucleation mechanism, we compare the nucleation scenario of a native-centric model with that of a sequence-specific model sharing the same native fold. To evaluate the impact of the sequence's finer details in the nucleation mechanism, we consider the folding of two non-homologous sequences. We conclude that, in a sequence-specific model, the folding nucleus is, to some extent, formed by the most stable contacts in the protein and that the less stable linkages in the folding nucleus are solely determined by the fold's topology. We have also found that, independently of the protein sequence, the folding nucleus performs the same 'topological' function. This unifying feature of the nucleation mechanism results from the residues forming the folding nucleus being distributed along the protein chain in a similar and well-defined manner that is determined by the fold's topological features

  16. WildSpan: mining structured motifs from protein sequences

    Directory of Open Access Journals (Sweden)

    Chen Chien-Yu

    2011-03-01

    Background: Automatic extraction of motifs from biological sequences is an important research problem in the study of molecular biology. For proteins, it is desired to discover sequence motifs containing a large number of wildcard symbols, as the residues associated with functional sites are usually largely separated in sequences. Discovering such patterns is time-consuming because abundant combinations exist when long gaps (a gap consists of one or more successive wildcards) are considered. Mining algorithms often employ constraints to narrow down the search space in order to increase efficiency. However, improper constraint models might degrade the sensitivity and specificity of the motifs discovered by computational methods. We previously proposed a new constraint model to handle large wildcard regions for discovering functional motifs of proteins. The patterns that satisfy the proposed constraint model are called W-patterns. A W-pattern is a structured motif that groups motif symbols into pattern blocks interleaved with large irregular gaps. Considering large gaps reflects the fact that functional residues are not always from a single region of protein sequences, and restricting motif symbols into clusters corresponds to the observation that short motifs are frequently present within protein families. To efficiently discover W-patterns for large-scale sequence annotation and function prediction, this paper first formally introduces the problem to solve and proposes an algorithm named WildSpan (sequential pattern mining across large wildcard regions) that incorporates several pruning strategies to largely reduce the mining cost. Results: WildSpan is shown to efficiently find W-patterns containing conserved residues that are far separated in sequences. We conducted experiments with two mining strategies, protein-based and family-based mining, to evaluate the usefulness of W-patterns and performance of WildSpan. The protein-based mining mode
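
    The W-pattern structure described in this abstract (short pattern blocks interleaved with large irregular gaps) can be made concrete with a small matching example. This is a sketch under stated assumptions, not the WildSpan mining algorithm: the block notation (lowercase "x" as a single wildcard), the (min, max) gap bounds, and the function name w_pattern_to_regex are all hypothetical choices made here for illustration.

```python
import re

def w_pattern_to_regex(blocks, gaps):
    """Compile a block-and-gap motif into a regular expression.

    blocks: list of strings; a lowercase 'x' stands for any single residue.
    gaps:   list of (min_len, max_len) tuples, one between each pair of
            consecutive blocks, giving the allowed gap length.
    """
    if len(gaps) != len(blocks) - 1:
        raise ValueError("need exactly one gap between consecutive blocks")
    parts = []
    for idx, block in enumerate(blocks):
        # Wildcards inside a block become '.', other symbols match literally.
        parts.append("".join("." if ch == "x" else re.escape(ch) for ch in block))
        if idx < len(gaps):
            lo, hi = gaps[idx]
            parts.append(".{%d,%d}" % (lo, hi))  # large irregular gap
    return re.compile("".join(parts))

if __name__ == "__main__":
    # Two hypothetical blocks separated by a gap of 5 to 30 residues.
    pattern = w_pattern_to_regex(["CxxC", "HxH"], [(5, 30)])
    match = pattern.search("MACAGCKKLMNPQRSTVWHAHG")
    print(match.span() if match else "no match")
```

    Matching a known pattern is cheap; the cost discussed in the abstract arises in discovering such patterns, where the number of candidate block and gap combinations grows quickly.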

  17. Polydopamine-coated open tubular column for the separation of proteins by capillary electrochromatography.

    Science.gov (United States)

    Xiao, Xing; Wang, Wentao; Chen, Jia; Jia, Li

    2015-08-01

    The separation and determination of proteins in food is an important aspect in food industry. Inspired by the self-polymerization of dopamine under alkaline conditions and the natural adhesive properties of polydopamine, in this paper, a simple and economical method was developed for the preparation of polydopamine-coated open tubular column, in which ammonium persulfate was used as the source of oxygen to induce and facilitate the polymerization of dopamine to form polydopamine. In comparison with a naked fused-silica capillary, the direction and magnitude of the electro-osmotic flow of the as-prepared polydopamine-coated open tubular column could be manipulated by varying the pH values of background solutions due to the existence of amine and phenolic hydroxyl groups on polydopamine coating. The surface morphology of the polydopamine-coated open tubular column was studied by scanning electron microscopy, and the thickness of polydopamine coating was 106 nm. The performance of the polydopamine-coated open tubular column was validated by analysis of proteins. The relative standard deviations of migration times of proteins representing run-to-run, day-to-day, and column-to-column were less than 3.5%. In addition, the feasibility of the polydopamine-coated open tubular column for real samples was verified by the separation of proteins in chicken egg white and pure milk. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  18. The movement protein and coat protein of alfalfa mosaic virus accumulate in structurally modified plasmodesmata

    NARCIS (Netherlands)

    van der Wel, N. N.; Goldbach, R. W.; van Lent, J. W.

    1998-01-01

    In systemically infected tissues of Nicotiana benthamiana, alfalfa mosaic virus (AMV) coat protein (CP) and movement protein (MP) are detected in plasmodesmata in a layer of three to four cells at the progressing front of infection. Besides the presence of these viral proteins, the plasmodesmata are

  19. HomPPI: a class of sequence homology based protein-protein interface prediction methods

    Directory of Open Access Journals (Sweden)

    Dobbs Drena

    2011-06-01

    Background: Although homology-based methods are among the most widely used methods for predicting the structure and function of proteins, the question as to whether interface sequence conservation can be effectively exploited in predicting protein-protein interfaces has been a subject of debate. Results: We studied more than 300,000 pair-wise alignments of protein sequences from structurally characterized protein complexes, including both obligate and transient complexes. We identified sequence similarity criteria required for accurate homology-based inference of interface residues in a query protein sequence. Based on these analyses, we developed HomPPI, a class of sequence homology-based methods for predicting protein-protein interface residues. We present two variants of HomPPI: (i) NPS-HomPPI (Non partner-specific HomPPI), which can be used to predict interface residues of a query protein in the absence of knowledge of the interaction partner; and (ii) PS-HomPPI (Partner-specific HomPPI), which can be used to predict the interface residues of a query protein with a specific target protein. Our experiments on a benchmark dataset of obligate homodimeric complexes show that NPS-HomPPI can reliably predict protein-protein interface residues in a given protein, with an average correlation coefficient (CC) of 0.76, sensitivity of 0.83, and specificity of 0.78, when sequence homologs of the query protein can be reliably identified. NPS-HomPPI also reliably predicts the interface residues of intrinsically disordered proteins. Our experiments suggest that NPS-HomPPI is competitive with several state-of-the-art interface prediction servers including those that exploit the structure of the query proteins. The partner-specific classifier, PS-HomPPI can, on a large dataset of transient complexes, predict the interface residues of a query protein with a specific target, with a CC of 0.65, sensitivity of 0.69, and specificity of 0.70, when homologs of
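
    The performance figures quoted in this abstract (CC, sensitivity, specificity) are per-residue classification metrics. The sketch below shows how such values are commonly computed from predicted and true interface labels. It assumes that CC denotes the Matthews correlation coefficient and uses the textbook definitions of sensitivity and specificity; the abstract does not state which definitions the authors used, so these are assumptions, and the function name interface_metrics is hypothetical.

```python
import math

def interface_metrics(predicted, actual):
    """Return (sensitivity, specificity, mcc) for per-residue binary labels.

    predicted, actual: equal-length sequences of 0/1 (or False/True),
    where 1 marks a residue as part of the interface.
    """
    if len(predicted) != len(actual):
        raise ValueError("label lists must have the same length")
    tp = sum(1 for p, a in zip(predicted, actual) if p and a)
    tn = sum(1 for p, a in zip(predicted, actual) if not p and not a)
    fp = sum(1 for p, a in zip(predicted, actual) if p and not a)
    fn = sum(1 for p, a in zip(predicted, actual) if not p and a)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return sensitivity, specificity, mcc

if __name__ == "__main__":
    # Toy labels for a six-residue protein (illustrative data only).
    sens, spec, mcc = interface_metrics([1, 1, 0, 0, 1, 0], [1, 0, 0, 0, 1, 1])
    print(f"sensitivity={sens:.2f} specificity={spec:.2f} MCC={mcc:.2f}")
```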

  20. Protein 3D structure computed from evolutionary sequence variation.

    Directory of Open Access Journals (Sweden)

    Debora S Marks

    The evolutionary trajectory of a protein through sequence space is constrained by its function. Collections of sequence homologs record the outcomes of millions of evolutionary experiments in which the protein evolves according to these constraints. Deciphering the evolutionary record held in these sequences and exploiting it for predictive and engineering purposes presents a formidable challenge. The potential benefit of solving this challenge is amplified by the advent of inexpensive high-throughput genomic sequencing. In this paper we ask whether we can infer evolutionary constraints from a set of sequence homologs of a protein. The challenge is to distinguish true co-evolution couplings from the noisy set of observed correlations. We address this challenge using a maximum entropy model of the protein sequence, constrained by the statistics of the multiple sequence alignment, to infer residue pair couplings. Surprisingly, we find that the strength of these inferred couplings is an excellent predictor of residue-residue proximity in folded structures. Indeed, the top-scoring residue couplings are sufficiently accurate and well-distributed to define the 3D protein fold with remarkable accuracy. We quantify this observation by computing, from sequence alone, all-atom 3D structures of fifteen test proteins from different fold classes, ranging in size from 50 to 260 residues, including a G-protein coupled receptor. These blinded inferences are de novo, i.e., they do not use homology modeling or sequence-similar fragments from known structures. The co-evolution signals provide sufficient information to determine accurate 3D protein structure to 2.7-4.8 Å Cα-RMSD error relative to the observed structure, over at least two-thirds of the protein (method called EVfold, details at http://EVfold.org). This discovery provides insight into essential interactions constraining protein evolution and will facilitate a comprehensive survey of the universe of

  1. Backbone dynamics of a model membrane protein: measurement of individual amide hydrogen-exchange rates in detergent-solubilized M13 coat protein using 13C NMR hydrogen/deuterium isotope shifts

    International Nuclear Information System (INIS)

    Henry, G.D.; Weiner, J.H.; Sykes, B.D.

    1987-01-01

    Hydrogen-exchange rates have been measured for individual assigned amide protons in M13 coat protein, a 50-residue integral membrane protein, using a 13C nuclear magnetic resonance (NMR) equilibrium isotope shift technique. The locations of the more rapidly exchanging amides have been determined. In D2O solutions, a peptide carbonyl resonance undergoes a small upfield isotope shift (0.08-0.09 ppm) from its position in H2O solutions; in 1:1 H2O/D2O mixtures, the carbonyl line shape is determined by the exchange rate at the adjacent nitrogen atom. M13 coat protein was labeled biosynthetically with 13C at the peptide carbonyls of alanine, glycine, phenylalanine, proline, and lysine, and the exchange rates of 12 assigned amide protons in the hydrophilic regions were measured as a function of pH by using the isotope shift method. This equilibrium technique is sensitive to the more rapidly exchanging protons which are difficult to measure by classical exchange-out experiments. In proteins, structural factors, notably H bonding, can decrease the exchange rate of an amide proton by many orders of magnitude from that observed in the freely exposed amides of model peptides such as poly(DL-alanine). With corrections for sequence-related inductive effects, the retardation of amide exchange in sodium dodecyl sulfate solubilized coat protein has been calculated with respect to poly(DL-alanine). The most rapidly exchanging protons, which are retarded very little or not at all, are shown to occur at the N- and C-termini of the molecule. A model of the detergent-solubilized coat protein is constructed from these H-exchange data which is consistent with circular dichroism and other NMR results

  2. Protein Function Prediction Based on Sequence and Structure Information

    KAUST Repository

    Smaili, Fatima Z.

    2016-05-25

    The number of available protein sequences in public databases is increasing exponentially. However, a significant fraction of these sequences lack functional annotation which is essential to our understanding of how biological systems and processes operate. In this master thesis project, we worked on inferring protein functions based on the primary protein sequence. In the approach we follow, 3D models are first constructed using I-TASSER. Functions are then deduced by structurally matching these predicted models, using global and local similarities, through three independent enzyme commission (EC) and gene ontology (GO) function libraries. The method was tested on 250 “hard” proteins, which lack homologous templates in both structure and function libraries. The results show that this method outperforms the conventional prediction methods based on sequence similarity or threading. Additionally, our method could be improved even further by incorporating protein-protein interaction information. Overall, the method we use provides an efficient approach for automated functional annotation of non-homologous proteins, starting from their sequence.

  3. Coating extracellular matrix proteins on a (3-aminopropyl)triethoxysilane-treated glass substrate for improved cell culture.

    Science.gov (United States)

    Masuda, Hiro-taka; Ishihara, Seiichiro; Harada, Ichiro; Mizutani, Takeomi; Ishikawa, Masayori; Kawabata, Kazushige; Haga, Hisashi

    2014-01-01

    We demonstrate that a (3-aminopropyl)triethoxysilane-treated glass surface is superior to an untreated glass surface for coating with extracellular matrix (ECM) proteins when used as a cell culture substrate to observe cell physiology and behavior. We found that MDCK cells cultured on untreated glass coated with ECM removed the coated ECM protein and secreted different ECM proteins. In contrast, the cells did not remove the coated ECM protein when seeded on (3-aminopropyl)triethoxysilane-treated (i.e., silanized) glass coated with ECM. Furthermore, the morphology and motility of cells grown on silanized glass differed from those grown on non-treated glass, even when both types of glass were initially coated with laminin. We also found that cells on silanized glass coated with laminin had higher motility than those on silanized glass coated with fibronectin. Based on our results, we suggest that silanized glass is a more suitable cell culture substrate than conventional non-treated glass when coated by ECM for observations of ECM effects on cell physiology.

  4. Optically and biologically active mussel protein-coated double-walled carbon nanotubes.

    Science.gov (United States)

    Jung, Yong Chae; Muramatsu, Hiroyuki; Fujisawa, Kazunori; Kim, Jin Hee; Hayashi, Takuya; Kim, Yoong Ahm; Endo, Morinobu; Terrones, Mauricio; Dresselhaus, Mildred S

    2011-12-02

    A method of dispersing strongly bundled double-walled carbon nanotubes (DWNTs) via a homogeneous coating of mussel protein in an aqueous solution is presented. Optical activity, mechanical strength, as well as electrical conductivity coming from the nanotubes and the versatile biological activity from the mussel protein make mussel-coated DWNTs promising as a multifunctional scaffold and for anti-fouling materials. Copyright © 2011 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  5. An Intramolecular Chaperone Inserted in Bacteriophage P22 Coat Protein Mediates Its Chaperonin-independent Folding

    Science.gov (United States)

    Suhanovsky, Margaret M.; Teschke, Carolyn M.

    2013-01-01

    The bacteriophage P22 coat protein has the common HK97-like fold but with a genetically inserted domain (I-domain). The role of the I-domain, positioned at the outermost surface of the capsid, is unknown. We hypothesize that the I-domain may act as an intramolecular chaperone because the coat protein folds independently, and many folding mutants are localized to the I-domain. The function of the I-domain was investigated by generating the coat protein core without its I-domain and the isolated I-domain. The core coat protein shows a pronounced folding defect. The isolated I-domain folds autonomously and has a high thermodynamic stability and fast folding kinetics in the presence of a peptidyl prolyl isomerase. Thus, the I-domain provides thermodynamic stability to the full-length coat protein so that it can fold reasonably efficiently while still allowing the HK97-like core to retain the flexibility required for conformational switching during procapsid assembly and maturation. PMID:24126914

  6. GuiTope: an application for mapping random-sequence peptides to protein sequences.

    Science.gov (United States)

    Halperin, Rebecca F; Stafford, Phillip; Emery, Jack S; Navalkar, Krupa Arun; Johnston, Stephen Albert

    2012-01-03

    Random-sequence peptide libraries are a commonly used tool to identify novel ligands for binding antibodies, other proteins, and small molecules. It is often of interest to compare the selected peptide sequences to the natural protein binding partners to infer the exact binding site or the importance of particular residues. The ability to search a set of sequences for similarity to a set of peptides may sometimes enable the prediction of an antibody epitope or a novel binding partner. We have developed a software application designed specifically for this task. GuiTope provides a graphical user interface for aligning peptide sequences to protein sequences. All alignment parameters are accessible to the user including the ability to specify the amino acid frequency in the peptide library; these frequencies often differ significantly from those assumed by popular alignment programs. It also includes a novel feature to align di-peptide inversions, which we have found improves the accuracy of antibody epitope prediction from peptide microarray data and shows utility in analyzing phage display datasets. Finally, GuiTope can randomly select peptides from a given library to estimate a null distribution of scores and calculate statistical significance. GuiTope provides a convenient method for comparing selected peptide sequences to protein sequences, including flexible alignment parameters, novel alignment features, ability to search a database, and statistical significance of results. The software is available as an executable (for PC) at http://www.immunosignature.com/software and ongoing updates and source code will be available at sourceforge.net.
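
    The significance estimate described in this abstract (randomly selecting peptides from the library to build a null distribution of scores) can be sketched as an empirical p-value calculation. The scoring function below is a deliberately simple placeholder (best ungapped identity count) and is not GuiTope's alignment method; the function names, the default number of draws, and the toy sequences are assumptions made here for illustration.

```python
import random

def best_ungapped_score(peptide, protein):
    """Toy score: most identical residues over all ungapped offsets."""
    best = 0
    for start in range(len(protein) - len(peptide) + 1):
        window = protein[start:start + len(peptide)]
        best = max(best, sum(a == b for a, b in zip(peptide, window)))
    return best

def empirical_p_value(selected, library, protein, n_draws=1000, seed=0):
    """Estimate how often random library draws score at least as well.

    selected: peptides picked by the experiment (e.g. array or phage hits).
    library:  the full peptide library they were selected from.
    """
    rng = random.Random(seed)
    observed = sum(best_ungapped_score(p, protein) for p in selected)
    hits = 0
    for _ in range(n_draws):
        draw = rng.sample(library, len(selected))  # same size as the real set
        if sum(best_ungapped_score(p, protein) for p in draw) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_draws + 1)

if __name__ == "__main__":
    protein = "MKTAYIAKQR"
    library = ["AYIAK", "GGGGG", "WWWWW", "PPPPP", "CCCCC", "HHHHH"]
    print(empirical_p_value(["AYIAK"], library, protein, n_draws=200))
```

    Drawing from the actual library, rather than from a generic background model, reflects the abstract's point that amino acid frequencies in peptide libraries often differ from those assumed by general-purpose alignment programs.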

  7. GuiTope: an application for mapping random-sequence peptides to protein sequences

    Directory of Open Access Journals (Sweden)

    Halperin Rebecca F

    2012-01-01

    Full Text Available Abstract Background Random-sequence peptide libraries are a commonly used tool to identify novel ligands for binding antibodies, other proteins, and small molecules. It is often of interest to compare the selected peptide sequences to the natural protein binding partners to infer the exact binding site or the importance of particular residues. The ability to search a set of sequences for similarity to a set of peptides may sometimes enable the prediction of an antibody epitope or a novel binding partner. We have developed a software application designed specifically for this task. Results GuiTope provides a graphical user interface for aligning peptide sequences to protein sequences. All alignment parameters are accessible to the user including the ability to specify the amino acid frequency in the peptide library; these frequencies often differ significantly from those assumed by popular alignment programs. It also includes a novel feature to align di-peptide inversions, which we have found improves the accuracy of antibody epitope prediction from peptide microarray data and shows utility in analyzing phage display datasets. Finally, GuiTope can randomly select peptides from a given library to estimate a null distribution of scores and calculate statistical significance. Conclusions GuiTope provides a convenient method for comparing selected peptide sequences to protein sequences, including flexible alignment parameters, novel alignment features, ability to search a database, and statistical significance of results. The software is available as an executable (for PC) at http://www.immunosignature.com/software and ongoing updates and source code will be available at sourceforge.net.

  8. How Does the VSG Coat of Bloodstream Form African Trypanosomes Interact with External Proteins?

    Directory of Open Access Journals (Sweden)

    Angela Schwede

    2015-12-01

    Full Text Available Variations on the statement "the variant surface glycoprotein (VSG) coat that covers the external face of the mammalian bloodstream form of Trypanosoma brucei acts a physical barrier" appear regularly in research articles and reviews. The concept of the impenetrable VSG coat is an attractive one, as it provides a clear model for understanding how a trypanosome population persists; each successive VSG protects the plasma membrane and is immunologically distinct from previous VSGs. What is the evidence that the VSG coat is an impenetrable barrier, and how do antibodies and other extracellular proteins interact with it? In this review, the nature of the extracellular surface of the bloodstream form trypanosome is described, and past experiments that investigated binding of antibodies and lectins to trypanosomes are analysed using knowledge of VSG sequence and structure that was unavailable when the experiments were performed. Epitopes for some VSG monoclonal antibodies are mapped as far as possible from previous experimental data, onto models of VSG structures. The binding of lectins to some, but not to other, VSGs is revisited with more recent knowledge of the location and nature of N-linked oligosaccharides. The conclusions are: (i) Much of the variation observed in earlier experiments can be explained by the identity of the individual VSGs. (ii) Much of an individual VSG is accessible to antibodies, and the barrier that prevents access to the cell surface is probably at the base of the VSG N-terminal domain, approximately 5 nm from the plasma membrane. This second conclusion highlights a gap in our understanding of how the VSG coat works, as several plasma membrane proteins with large extracellular domains are very unlikely to be hidden from host antibodies by VSG.

  9. Dynamics of domain coverage of the protein sequence universe

    Science.gov (United States)

    2012-01-01

    Background The currently known protein sequence space consists of millions of sequences in public databases and is rapidly expanding. Assigning sequences to families leads to a better understanding of protein function and the nature of the protein universe. However, a large portion of the current protein space remains unassigned and is referred to as its “dark matter”. Results Here we suggest that true size of “dark matter” is much larger than stated by current definitions. We propose an approach to reducing the size of “dark matter” by identifying and subtracting regions in protein sequences that are not likely to contain any domain. Conclusions Recent improvements in computational domain modeling result in a decrease, albeit slowly, in the relative size of “dark matter”; however, its absolute size increases substantially with the growth of sequence data. PMID:23157439

  10. Dynamics of domain coverage of the protein sequence universe

    Directory of Open Access Journals (Sweden)

    Rekapalli Bhanu

    2012-11-01

    Full Text Available Abstract Background The currently known protein sequence space consists of millions of sequences in public databases and is rapidly expanding. Assigning sequences to families leads to a better understanding of protein function and the nature of the protein universe. However, a large portion of the current protein space remains unassigned and is referred to as its “dark matter”. Results Here we suggest that true size of “dark matter” is much larger than stated by current definitions. We propose an approach to reducing the size of “dark matter” by identifying and subtracting regions in protein sequences that are not likely to contain any domain. Conclusions Recent improvements in computational domain modeling result in a decrease, albeit slowly, in the relative size of “dark matter”; however, its absolute size increases substantially with the growth of sequence data.

  11. MIPS: a database for genomes and protein sequences.

    Science.gov (United States)

    Mewes, H W; Frishman, D; Güldener, U; Mannhaupt, G; Mayer, K; Mokrejs, M; Morgenstern, B; Münsterkötter, M; Rudd, S; Weil, B

    2002-01-01

    The Munich Information Center for Protein Sequences (MIPS-GSF, Neuherberg, Germany) continues to provide genome-related information in a systematic way. MIPS supports both national and European sequencing and functional analysis projects, develops and maintains automatically generated and manually annotated genome-specific databases, develops systematic classification schemes for the functional annotation of protein sequences, and provides tools for the comprehensive analysis of protein sequences. This report updates the information on the yeast genome (CYGD), the Neurospora crassa genome (MNCDB), the databases for the comprehensive set of genomes (PEDANT genomes), the database of annotated human EST clusters (HIB), the database of complete cDNAs from the DHGP (German Human Genome Project), as well as the project specific databases for the GABI (Genome Analysis in Plants) and HNB (Helmholtz-Netzwerk Bioinformatik) networks. The Arabidopsis thaliana database (MATDB), the database of mitochondrial proteins (MITOP) and our contribution to the PIR International Protein Sequence Database have been described elsewhere [Schoof et al. (2002) Nucleic Acids Res., 30, 91-93; Scharfe et al. (2000) Nucleic Acids Res., 28, 155-158; Barker et al. (2001) Nucleic Acids Res., 29, 29-32]. All databases described, the protein analysis tools provided and the detailed descriptions of our projects can be accessed through the MIPS World Wide Web server (http://mips.gsf.de).

  12. Prediction of Protein Hotspots from Whole Protein Sequences by a Random Projection Ensemble System

    Directory of Open Access Journals (Sweden)

    Jinjian Jiang

    2017-07-01

    Full Text Available Hotspot residues are important in determining protein-protein interactions, and they always perform specific functions in biological processes. Hotspot residues are commonly determined by alanine scanning mutagenesis experiments, which are always costly and time consuming. To address this issue, computational methods have been developed. Most of them are structure based, i.e., they use the information of solved protein structures. However, the number of solved protein structures is far smaller than the number of sequences. Moreover, almost all of the predictors identify hotspots from the interfaces of protein complexes, seldom from whole protein sequences. Therefore, a method for determining hotspots from whole protein sequences by sequence information alone is urgently needed. To address the issue of hotspot prediction from whole protein sequences, we propose an ensemble system with random projections using statistical physicochemical properties of amino acids. First, an encoding scheme involving sequence profiles of residues and physicochemical properties from the AAindex1 dataset is developed. Then, the random projection technique is adopted to project the encoded instances into a reduced space. Next, several better random projections are obtained by training an IBk classifier on the training dataset and are then applied to the test dataset. The ensemble of random projection classifiers is thus obtained. Experimental results show that although the performance of our method is not yet good enough for real applications of hotspots, it is very promising for the determination of hotspot residues from whole sequences.
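
    The general idea of a random projection ensemble can be sketched as follows. This is not the authors' pipeline: it uses a plain 1-nearest-neighbour rule in place of the IBk classifier, skips the step that selects the better projections, and all function names, dimensions and labels are hypothetical.

```python
import random
from collections import Counter

def random_projection(dim_in, dim_out, rng):
    """Gaussian random matrix scaled by 1/sqrt(dim_out)."""
    scale = dim_out ** -0.5
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(dim_in)]
            for _ in range(dim_out)]

def project(matrix, x):
    """Map a feature vector into the reduced space."""
    return [sum(w * v for w, v in zip(row, x)) for row in matrix]

def nearest_label(train_x, train_y, x):
    """1-nearest-neighbour prediction using squared Euclidean distance."""
    dists = [sum((a - b) ** 2 for a, b in zip(t, x)) for t in train_x]
    return train_y[dists.index(min(dists))]

def ensemble_predict(train_x, train_y, x, n_members=11, dim_out=3, seed=0):
    """Majority vote over classifiers, each trained in its own random projection."""
    rng = random.Random(seed)
    votes = []
    for _ in range(n_members):
        m = random_projection(len(x), dim_out, rng)
        projected_train = [project(m, t) for t in train_x]
        votes.append(nearest_label(projected_train, train_y, project(m, x)))
    return Counter(votes).most_common(1)[0][0]
```

    Each ensemble member sees the data through a different low-dimensional view, so errors caused by one unlucky projection can be outvoted by the others.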

  13. The interaction of M13 coat protein with lipid bilayers : a spectroscopic study

    NARCIS (Netherlands)

    Sanders, J.C.

    1992-01-01

    In this thesis a small part of the reproductive cycle of the M13 bacteriophage is studied in more detail, namely the interaction of the major coat protein (MW 5240) with lipid bilayers. During the infection process the major coat protein of M13 bacteriophage is stored in the cytoplasm

  14. Development of bioactive coatings based on γ-irradiated proteins to preserve strawberries

    International Nuclear Information System (INIS)

    Vu, K.D.; Hollingsworth, R.G.; Salmieri, S.; Takala, P.N.; Lacroix, M.

    2012-01-01

    Gamma irradiation was applied for creating cross-linked proteins to enhance the physicochemical properties of edible films made of calcium caseinate, whey protein isolate and glycerol. The characteristics of γ-irradiated cross-linked proteins were analyzed by Fourier Transform Infrared spectroscopy. Second-derivative spectra exhibited changes in band intensities that were correlated to an increase of β-sheet structure and a decrease of α-helix and unordered fractions of γ-irradiated cross-linked proteins as compared to the control without irradiation. Furthermore, adding methylcellulose to the irradiated protein matrix was found to have potential for enhancing the puncture strength, with no detrimental effect on the water vapor permeability of protein-based films. Finally, these film formulations were used as bioactive edible coatings containing natural antimicrobial agents (limonene and peppermint) to preserve the shelf life of fresh strawberries during storage. The bioactive coatings containing peppermint were found to be more efficient as preserving coatings than the formulations containing limonene. The irradiated proteins/methylcellulose/peppermint formulation had only 40% decay at day 8, compared with 65% for the control. - Highlights: ► Crosslinked proteins and antimicrobial agents were able to preserve strawberries. ► Crosslinked protein structure was more ordered. ► Films based on crosslinked proteins and methylcellulose enhanced puncture strength.

  15. Sensing of heavy metal ions by intrinsic TMV coat protein fluorescence

    Science.gov (United States)

    Bayram, Serene S.; Green, Philippe; Blum, Amy Szuchmacher

    2018-04-01

    We propose the use of a cysteine mutant of TMV coat protein as a signal transducer for the selective sensing and quantification of the heavy metal ions Cd²⁺, Pb²⁺, Zn²⁺ and Ni²⁺ based on intrinsic tryptophan quenching. TMV coat protein is inexpensive and can be mass-produced, since it is expressed in and extracted from E. coli. It also displays several different functional groups, enabling a wide repertoire of bioconjugation chemistries; thus it can be easily integrated into functional devices. In addition, TMV-ion interactions have been widely reported and utilized for metallization to generate novel organic-inorganic hybrid composite materials. Building on these previous observations, we herein determine, for the first time, the TMV-ion binding constants assuming the static fluorescence quenching model. We also show that by comparing TMV-ion interactions between native and denatured coat protein, we can distinguish between chemically similar heavy metal ions such as cadmium and zinc ions.
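
    The abstract does not state the equation used. A common form of the static quenching model is the Stern-Volmer relation F0/F = 1 + K[Q], where F0 and F are the fluorescence intensities without and with quencher at concentration [Q], and K is the association constant. Assuming that form, a minimal sketch of how K could be estimated from a titration (function name and data layout are hypothetical):

```python
def stern_volmer_constant(concentrations, intensities, f0):
    """Least-squares slope of (F0/F - 1) versus [Q], with the intercept fixed at 0.

    concentrations: quencher (metal ion) concentrations, e.g. in mol/L
    intensities:    fluorescence intensity F measured at each concentration
    f0:             fluorescence intensity without quencher
    """
    ys = [f0 / f - 1.0 for f in intensities]
    numerator = sum(q * y for q, y in zip(concentrations, ys))
    denominator = sum(q * q for q in concentrations)
    return numerator / denominator
```

    The returned constant has units of inverse concentration, so concentrations in mol/L give K in L/mol.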

  16. 40 CFR 174.512 - Coat Protein of Potato Virus Y; exemption from the requirement of a tolerance.

    Science.gov (United States)

    2010-07-01

    ... 40 Protection of Environment 23 2010-07-01 2010-07-01 false Coat Protein of Potato Virus Y...-INCORPORATED PROTECTANTS Tolerances and Tolerance Exemptions § 174.512 Coat Protein of Potato Virus Y; exemption from the requirement of a tolerance. Residues of Coat Protein of Potato Virus Y are exempt from...

  17. Gene Unprediction with Spurio: A tool to identify spurious protein sequences.

    Science.gov (United States)

    Höps, Wolfram; Jeffryes, Matt; Bateman, Alex

    2018-01-01

    We now have access to the sequences of tens of millions of proteins. These protein sequences are essential for modern molecular biology and computational biology. The vast majority of protein sequences are derived from gene prediction tools and have no experimental supporting evidence for their translation. Despite the increasing accuracy of gene prediction tools there likely exists a large number of spurious protein predictions in the sequence databases. We have developed the Spurio tool to help identify spurious protein predictions in prokaryotes. Spurio searches the query protein sequence against a prokaryotic nucleotide database using tblastn and identifies homologous sequences. The tblastn matches are used to score the query sequence's likelihood of being a spurious protein prediction using a Gaussian process model. The most informative feature is the appearance of stop codons within the presumed translation of homologous DNA sequences. Benchmarking shows that the Spurio tool is able to distinguish spurious from true proteins. However, transposon proteins are prone to be predicted as spurious because of the frequency of degraded homologs found in the DNA sequence databases. Our initial experiments suggest that less than 1% of the proteins in the UniProtKB sequence database are likely to be spurious and that Spurio is able to identify over 60 times more spurious proteins than the AntiFam resource. The Spurio software and source code is available under an MIT license at the following URL: https://bitbucket.org/bateman-group/spurio.
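
    The stop-codon feature mentioned in the abstract can be illustrated with a small sketch. It is not taken from Spurio: the function name is hypothetical, and it only counts in-frame stop codons of the standard code (TAA, TAG, TGA) in a DNA string, without any tblastn search or Gaussian process scoring.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def internal_stop_count(dna, frame=0):
    """Count stop codons in the chosen reading frame, ignoring a terminal stop."""
    dna = dna.upper()
    # Split into complete codons starting at the frame offset.
    codons = [dna[i:i + 3] for i in range(frame, len(dna) - 2, 3)]
    # A stop codon at the very end is expected for a complete gene.
    if codons and codons[-1] in STOP_CODONS:
        codons = codons[:-1]
    return sum(codon in STOP_CODONS for codon in codons)
```

    A homologous region whose presumed translation contains several internal stops is unlikely to encode a functional protein, which is the intuition behind using this count as a feature.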

  18. Structure and Sequence Search on Aptamer-Protein Docking

    Science.gov (United States)

    Xiao, Jiajie; Bonin, Keith; Guthold, Martin; Salsbury, Freddie

    2015-03-01

    Interactions between proteins and deoxyribonucleic acid (DNA) play a significant role in the living systems, especially through gene regulation. However, short nucleic acids sequences (aptamers) with specific binding affinity to specific proteins exhibit clinical potential as therapeutics. Our capillary and gel electrophoresis selection experiments show that specific sequences of aptamers can be selected that bind specific proteins. Computationally, given the experimentally-determined structure and sequence of a thrombin-binding aptamer, we can successfully dock the aptamer onto thrombin in agreement with experimental structures of the complex. In order to further study the conformational flexibility of this thrombin-binding aptamer and to potentially develop a predictive computational model of aptamer-binding, we use GPU-enabled molecular dynamics simulations to both examine the conformational flexibility of the aptamer in the absence of binding to thrombin, and to determine our ability to fold an aptamer. This study should help further de-novo predictions of aptamer sequences by enabling the study of structural and sequence-dependent effects on aptamer-protein docking specificity.

  19. Isolation of nuclear proteins from flax (Linum usitatissimum L.) seed coats for gene expression regulation studies

    Directory of Open Access Journals (Sweden)

    Renouard Sullivan

    2012-01-01

    Full Text Available Abstract Background While seed biology is well characterized and numerous studies have focused on this subject over the past years, the regulation of seed coat development and metabolism is for the most part still not elucidated. It is well known that the seed coat has an essential role in seed development and its features are associated with important agronomical traits. It also constitutes a rich source of valuable compounds such as pharmaceuticals. Most of the cell genetic material is contained in the nucleus; therefore nuclear proteins constitute a major actor for gene expression regulation. Isolation of nuclear proteins responsible for specific seed coat expression is an important prerequisite for understanding seed coat metabolism and development. The extraction of nuclear proteins may be problematic due to the presence of specific components that can interfere with the extraction process. The seed coat is a rich source of mucilage and phenolics, which are good examples of these hindering compounds. Findings In the present study, we propose an optimized nuclear protein extraction protocol able to provide nuclear proteins from flax seed coat without contaminants and with sufficient yield and quality for their use in transcriptional gene expression regulation by gel shift experiments. Conclusions Routinely, around 250 μg of nuclear proteins per gram of fresh weight were extracted from immature flax seed coats. The isolation protocol described hereafter may serve as an effective tool for gene expression regulation and seed coat-focused proteomics studies.

  20. AlignMe—a membrane protein sequence alignment web server

    Science.gov (United States)

    Stamm, Marcus; Staritzbichler, René; Khafizov, Kamil; Forrest, Lucy R.

    2014-01-01

    We present a web server for pair-wise alignment of membrane protein sequences, using the program AlignMe. The server makes available two operational modes of AlignMe: (i) sequence to sequence alignment, taking two sequences in fasta format as input, combining information about each sequence from multiple sources and producing a pair-wise alignment (PW mode); and (ii) alignment of two multiple sequence alignments to create family-averaged hydropathy profile alignments (HP mode). For the PW sequence alignment mode, four different optimized parameter sets are provided, each suited to pairs of sequences with a specific similarity level. These settings utilize different types of inputs: (position-specific) substitution matrices, secondary structure predictions and transmembrane propensities from transmembrane predictions or hydrophobicity scales. In the second (HP) mode, each input multiple sequence alignment is converted into a hydrophobicity profile averaged over the provided set of sequence homologs; the two profiles are then aligned. The HP mode enables qualitative comparison of transmembrane topologies (and therefore potentially of 3D folds) of two membrane proteins, which can be useful if the proteins have low sequence similarity. In summary, the AlignMe web server provides user-friendly access to a set of tools for analysis and comparison of membrane protein sequences. Access is available at http://www.bioinfo.mpg.de/AlignMe PMID:24753425
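
    The family-averaged hydropathy profile used in the HP mode can be sketched as a per-column average over a multiple sequence alignment. This is not AlignMe's implementation: the choice of the Kyte-Doolittle scale, the handling of gaps and the function name are assumptions made for illustration.

```python
# Kyte-Doolittle hydropathy values (assumed scale for this sketch).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def column_hydropathy(alignment):
    """Average hydropathy per alignment column, skipping gaps and unknown symbols.

    alignment: list of equal-length aligned sequences.
    Returns one value per column, or None for columns with no scorable residue.
    """
    profile = []
    for column in zip(*alignment):
        values = [KYTE_DOOLITTLE[aa] for aa in column if aa in KYTE_DOOLITTLE]
        profile.append(sum(values) / len(values) if values else None)
    return profile
```

    Two such profiles, one per input alignment, would then be aligned against each other; that alignment step is not shown here.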

  1. 40 CFR 174.531 - Coat protein of plum pox virus; exemption from the requirement of a tolerance.

    Science.gov (United States)

    2010-07-01

    ... 40 Protection of Environment 23 2010-07-01 2010-07-01 false Coat protein of plum pox virus...-INCORPORATED PROTECTANTS Tolerances and Tolerance Exemptions § 174.531 Coat protein of plum pox virus; exemption from the requirement of a tolerance. Residues of the coat protein of plum pox virus in or on the...

  2. The relationship of protein conservation and sequence length

    Directory of Open Access Journals (Sweden)

    Panchenko Anna R

    2002-11-01

    Full Text Available Abstract Background In general, the length of a protein sequence is determined by its function and the wide variance in the lengths of an organism's proteins reflects the diversity of specific functional roles for these proteins. However, additional evolutionary forces that affect the length of a protein may be revealed by studying the length distributions of proteins evolving under weaker functional constraints. Results We performed sequence comparisons to distinguish highly conserved and poorly conserved proteins from the bacterium Escherichia coli, the archaeon Archaeoglobus fulgidus, and the eukaryotes Saccharomyces cerevisiae, Drosophila melanogaster, and Homo sapiens. For all organisms studied, the conserved and nonconserved proteins have strikingly different length distributions. The conserved proteins are, on average, longer than the poorly conserved ones, and the length distributions for the poorly conserved proteins have a relatively narrow peak, in contrast to the conserved proteins whose lengths spread over a wider range of values. For the two prokaryotes studied, the poorly conserved proteins approximate the minimal length distribution expected for a diverse range of structural folds. Conclusions There is a relationship between protein conservation and sequence length. For all the organisms studied, there seems to be a significant evolutionary trend favoring shorter proteins in the absence of other, more specific functional constraints.

  3. Bacterial surface layer proteins as a novel capillary coating material for capillary electrophoretic separations

    Energy Technology Data Exchange (ETDEWEB)

    Moreno-Gordaliza, Estefanía, E-mail: emorenog@ucm.es [Division of Analytical Biosciences, Leiden Academic Centre for Drug Research, Universiteit Leiden, Einsteinweg 55, 2300, RA, Leiden (Netherlands); Department of Analytical Chemistry, Faculty of Chemistry, Universidad Complutense de Madrid, Avda. Complutense s/n, 28040, Madrid (Spain); Stigter, Edwin C.A. [Division of Analytical Biosciences, Leiden Academic Centre for Drug Research, Universiteit Leiden, Einsteinweg 55, 2300, RA, Leiden (Netherlands); Department of Molecular Cancer Research, Universitair Medisch Centrum Utrecht, Wilhelmina Kinder Ziekenhuis, Lundlaan 6, 3584, EA Utrecht (Netherlands); Lindenburg, Petrus W.; Hankemeier, Thomas [Division of Analytical Biosciences, Leiden Academic Centre for Drug Research, Universiteit Leiden, Einsteinweg 55, 2300, RA, Leiden (Netherlands)

    2016-06-07

    A novel concept for stable coating in capillary electrophoresis, based on recrystallization of surface layer proteins on hydrophobized fused silica capillaries, was demonstrated. Surface layer protein A (SlpA) from Lactobacillus acidophilus bacteria was extracted, purified and used for coating pre-silanized glass substrates presenting different surface wettabilities (either hydrophobic or hydrophilic). Contact angle determination on SlpA-coated hydrophobic silica slides showed that the surfaces turned to hydrophilic after coating (53 ± 5°), due to a protein monolayer formation by protein-surface hydrophobic interactions. Visualization by atomic force microscopy demonstrated the presence of a SlpA layer on methylated silica slides displaying a surface roughness of 0.44 ± 0.02 nm. Additionally, a protein layer was visualized by fluorescence microscopy in methylated silica capillaries coated with SlpA and fluorescein isothiocyanate-labeled. The SlpA-coating showed an outstanding stability, even after treatment with 20 mM NaOH (pH 12.3). The electroosmotic flow in coated capillaries showed a partial suppression at pH 7.50 (3.8 ± 0.5 × 10⁻⁹ m² V⁻¹ s⁻¹) when compared with unmodified fused silica (5.9 ± 0.1 × 10⁻⁸ m² V⁻¹ s⁻¹). To demonstrate the potential of this novel coating, the SlpA-coated capillaries were applied for the first time for electrophoretic separation, and proved to be very suitable for the isotachophoretic separation of lipoproteins in human serum. The separations showed a high degree of repeatability (absolute migration times with 1.1–1.8% coefficient-of-variation (CV) within a day) and 2–3% CV inter-capillary reproducibility. The capillaries were stable for more than 100 runs at pH 9.40, and showed to be an exceptional alternative for challenging electrophoretic separations at long-term use. - Highlights: • New coating using recrystallized surface-layer proteins on

  4. MIPS: a database for protein sequences and complete genomes.

    Science.gov (United States)

    Mewes, H W; Hani, J; Pfeiffer, F; Frishman, D

    1998-01-01

    The MIPS group [Munich Information Center for Protein Sequences of the German National Center for Environment and Health (GSF)] at the Max-Planck-Institute for Biochemistry, Martinsried near Munich, Germany, is involved in a number of data collection activities, including a comprehensive database of the yeast genome, a database reflecting the progress in sequencing the Arabidopsis thaliana genome, the systematic analysis of other small genomes and the collection of protein sequence data within the framework of the PIR-International Protein Sequence Database (described elsewhere in this volume). Through its WWW server (http://www.mips.biochem.mpg.de ) MIPS provides access to a variety of generic databases, including a database of protein families as well as automatically generated data by the systematic application of sequence analysis algorithms. The yeast genome sequence and its related information was also compiled on CD-ROM to provide dynamic interactive access to the 16 chromosomes of the first eukaryotic genome unraveled. PMID:9399795

  5. Can Natural Proteins Designed with ‘Inverted’ Peptide Sequences Adopt Native-Like Protein Folds?

    Science.gov (United States)

    Sridhar, Settu; Guruprasad, Kunchur

    2014-01-01

    We have carried out a systematic computational analysis on a representative dataset of proteins of known three-dimensional structure, in order to evaluate whether it would be possible to ‘swap’ certain short peptide sequences in naturally occurring proteins with their corresponding ‘inverted’ peptides and generate ‘artificial’ proteins that are predicted to retain native-like protein fold. The analysis of 3,967 representative proteins from the Protein Data Bank revealed 102,677 unique identical inverted peptide sequence pairs that vary in sequence length between 5–12 and 18 amino acid residues. Our analysis illustrates with examples that such ‘artificial’ proteins may be generated by identifying peptides with ‘similar structural environment’ and by using comparative protein modeling and validation studies. Our analysis suggests that natural proteins may be tolerant to accommodating such peptides. PMID:25210740
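
    One simple way to enumerate inverted peptide pairs in a set of sequences is sketched here. It is an illustration only, not the authors' method: the function name and the fixed peptide length k are assumptions, and the structural-environment comparison and modeling steps are not covered.

```python
def inverted_peptide_pairs(sequences, k=5):
    """Return the set of (peptide, reversed_peptide) pairs of length k
    for which both members occur somewhere in the dataset."""
    kmers = set()
    for seq in sequences:
        for i in range(len(seq) - k + 1):
            kmers.add(seq[i:i + k])
    pairs = set()
    for kmer in kmers:
        rev = kmer[::-1]
        # Skip palindromic peptides and report each pair only once.
        if rev in kmers and kmer < rev:
            pairs.add((kmer, rev))
    return pairs
```

    For example, with k=5 the sequences "MACDEFK" and "GFEDCAW" share the pair ACDEF / FEDCA.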

  6. ZP Domain Proteins in the Abalone Egg Coat Include a Paralog of VERL under Positive Selection That Binds Lysin and 18-kDa Sperm Proteins

    Science.gov (United States)

    Aagaard, Jan E.; Vacquier, Victor D.; MacCoss, Michael J.; Swanson, Willie J.

    2010-01-01

    Identifying fertilization molecules is key to our understanding of reproductive biology, yet only a few examples of interacting sperm and egg proteins are known. One of the best characterized comes from the invertebrate archeogastropod abalone (Haliotis spp.), where sperm lysin mediates passage through the protective egg vitelline envelope (VE) by binding to the VE protein vitelline envelope receptor for lysin (VERL). Rapid adaptive divergence of abalone lysin and VERL is an example of positive selection on interacting fertilization proteins contributing to reproductive isolation. Previously, we characterized a subset of the abalone VE proteins that share a structural feature, the zona pellucida (ZP) domain, which is common to VERL and the egg envelopes of vertebrates. Here, we use additional expressed sequence tag sequencing and shotgun proteomics to characterize this family of proteins in the abalone egg VE. We expand 3-fold the number of known ZP domain proteins present within the VE (now 30 in total) and identify a paralog of VERL (vitelline envelope zona pellucida domain protein [VEZP] 14) that contains a putative lysin-binding motif. We find that, like VERL, the divergence of VEZP14 among abalone species is driven by positive selection on the lysin-binding motif alone and that these paralogous egg VE proteins bind a similar set of sperm proteins including a rapidly evolving 18-kDa paralog of lysin, which may mediate sperm–egg fusion. This work identifies an egg coat paralog of VERL under positive selection and the candidate sperm proteins with which it may interact during abalone fertilization. PMID:19767347

  7. The SWISS-PROT protein sequence data bank: current status.

    OpenAIRE

    Bairoch, A; Boeckmann, B

    1994-01-01

    SWISS-PROT is an annotated protein sequence database established in 1986 and maintained collaboratively, since 1988, by the Department of Medical Biochemistry of the University of Geneva and the EMBL Data Library. The SWISS-PROT protein sequence data bank consists of sequence entries. Sequence entries are composed of different line types, each with its own format. For standardization purposes the format of SWISS-PROT follows as closely as possible that of the EMBL Nucleotide Sequence Databa...
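
    To illustrate what an entry built from typed lines looks like in practice, here is a minimal sketch that groups the lines of one entry by line type. It assumes the flat-file layout in which each line starts with a two-character line code followed by three spaces, and an entry ends with "//"; the function name and any sample entry passed to it are hypothetical.

```python
from collections import defaultdict

def group_by_line_code(entry_text):
    """Group the lines of one flat-file entry by their two-character line code.

    Assumes the code sits in columns 1-2 and the data starts in column 6.
    Sequence data lines begin with spaces, so they are grouped under "  ".
    """
    fields = defaultdict(list)
    for line in entry_text.splitlines():
        if line.startswith("//"):
            break  # end-of-entry marker
        code, content = line[:2], line[5:]
        fields[code].append(content.rstrip())
    return dict(fields)
```

    A real parser would go on to interpret each line type according to its own format; this sketch stops at the grouping step.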

  8. A simple coated-tube assay for alpha-foeto protein for clinical use

    International Nuclear Information System (INIS)

    Dakubu, S.; Ahene, I.S.; Foli, A.K.

    1977-01-01

    A standard method for coating plastic tubes with antiserum has been applied to coat tubes with rabbit antiserum to human alpha-foeto protein. The coated plastic tubes have been used to set up a radioimmunoassay system which is sensitive and convenient for use on the occasional clinical sample. For a successful coated-tube assay, it was found necessary to modify the final incubation mixture from what was suitable in a standard double antibody assay system. (orig.) [de

  9. Protein-adsorption and Ca-phosphate formation on chitosan-bioactive glass composite coatings

    Science.gov (United States)

    Wagener, V.; Boccaccini, A. R.; Virtanen, S.

    2017-09-01

    In recent years, chitosan-bioactive glass (BG) composites have been developed and investigated as bioactive coatings for orthopedic applications. The increase of bioactivity occurs due to the stimulation of calcium-phosphate/hydroxyapatite formation on the surface while the coating is degrading. In the present work, protein adsorption and its influence on calcium-phosphate precipitation was studied for the first time on such composite coatings. The experiments involved coating of 316L stainless steel substrates with chitosan (Ch) and chitosan-bioactive glass (Ch-BG) and immersion of the coated samples in two different bovine serum albumin (BSA) containing solutions, namely DI H₂O (with pH adjusted to about 7.2 with diluted NaOH) and simulated body fluid (SBF). In order to investigate the influence of protein adsorption on calcium-phosphate precipitation, samples were also immersed in DI H₂O and in SBF without BSA. Samples were analyzed by scanning electron microscopy (SEM), X-ray photoelectron spectroscopy (XPS) and time-of-flight secondary ion mass spectrometry (ToF-SIMS). Surface analysis revealed that adsorption of BSA takes place on all studied samples and that protein adsorption is influenced by the presence of Ca²⁺ and PO₄³⁻ ions. Bioactivity in the form of hydroxyapatite pre-stage formation is significantly increased on Ch-BG composite coating as compared with bare stainless steel surface. However, calcium-phosphate precipitation in SBF is reduced by the presence of BSA.

  10. Facile Photoimmobilization of Proteins onto Low-Binding PEG-Coated Polymer Surfaces

    DEFF Research Database (Denmark)

    Larsen, Esben Kjær Unmack; Mikkelsen, Morten Bo Lindholm; Larsen, Niels Bent

    2014-01-01

    was verified for both enzymes and antibodies, and their presence on the surface was confirmed by X-ray photoelectron spectroscopy (XPS) and confocal fluorescence microscopy. Conjugation of capture antibody onto the PEG coating was employed for a simplified ELISA protocol without the need for blocking uncoated...... surface areas, showing ng/mL sensitivity to a cytokine antigen target. Moreover, spatially patterned attachment of fluorescently labeled protein onto the low-binding PEG-coated surface was achieved with a projection lithography system that enabled the creation of micrometer-sized protein features....

  11. Partial sequence determination of metabolically labeled radioactive proteins and peptides

    International Nuclear Information System (INIS)

    Anderson, C.W.

    1982-01-01

    The author has used the sequence analysis of radioactive proteins and peptides to approach several problems during the past few years. They, in collaboration with others, have mapped precisely several adenovirus proteins with respect to the nucleotide sequence of the adenovirus genome; identified hitherto missed proteins encoded by bacteriophage MS2 and by simian virus 40; analyzed the aminoterminal maturation of several virus proteins; determined the cleavage sites for processing of the poliovirus polyprotein; and analyzed the mechanism of frameshifting by excess normal tRNAs during cell-free protein synthesis. This chapter is designed to aid those without prior experience at protein sequence determinations. It is based primarily on the experience gained in the studies cited above, which made use of the Beckman 890 series automated protein sequencers

  12. Correlation between protein sequence similarity and x-ray diffraction quality in the protein data bank.

    Science.gov (United States)

    Lu, Hui-Meng; Yin, Da-Chuan; Ye, Ya-Jing; Luo, Hui-Min; Geng, Li-Qiang; Li, Hai-Sheng; Guo, Wei-Hong; Shang, Peng

    2009-01-01

    As the most widely utilized technique to determine the 3-dimensional structure of protein molecules, X-ray crystallography can provide structure of the highest resolution among the developed techniques. The resolution obtained via X-ray crystallography is known to be influenced by many factors, such as the crystal quality, diffraction techniques, and X-ray sources, etc. In this paper, the authors found that the protein sequence could also be one of the factors. We extracted information of the resolution and the sequence of proteins from the Protein Data Bank (PDB), classified the proteins into different clusters according to the sequence similarity, and statistically analyzed the relationship between the sequence similarity and the best resolution obtained. The results showed that there was a pronounced correlation between the sequence similarity and the obtained resolution. These results indicate that protein structure itself is one variable that may affect resolution when X-ray crystallography is used.

  13. Taxonomic colouring of phylogenetic trees of protein sequences

    Directory of Open Access Journals (Sweden)

    Andrade-Navarro Miguel A

    2006-02-01

    Background: Phylogenetic analyses of protein families are used to define the evolutionary relationships between homologous proteins. The interpretation of protein-sequence phylogenetic trees requires the examination of the taxonomic properties of the species associated with those sequences. However, there is no online tool to facilitate this interpretation, for example, by automatically attaching taxonomic information to the nodes of a tree, or by interactively colouring the branches of a tree according to any combination of taxonomic divisions. This is especially problematic if the tree contains on the order of hundreds of sequences, which, given the accelerated increase in the size of the protein sequence databases, is a situation that is becoming common. Results: We have developed PhyloView, a web based tool for colouring phylogenetic trees according to arbitrary taxonomic properties of the species represented in a protein sequence phylogenetic tree. Provided that the tree contains SwissProt, SpTrembl, or GenBank protein identifiers, the tool retrieves the taxonomic information from the corresponding database. A colour picker displays a summary of the findings and allows the user to associate colours to the leaves of the tree according to any number of taxonomic partitions. Then, the colours are propagated to the branches of the tree. Conclusion: PhyloView can be used at http://www.ogic.ca/projects/phyloview/. A tutorial, the software with documentation, and GPL licensed source code, can be accessed at the same web address.
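
    The propagation of leaf colours to branches described in this record could work roughly as follows. This is a minimal sketch in Python, not PhyloView's actual code: the nested-tuple tree representation, the function name `propagate_colours`, and the rule itself (a branch takes a colour only when everything below it shares that colour, otherwise it stays uncoloured) are assumptions made for illustration.

```python
from typing import Dict, Optional, Tuple, Union

# Hypothetical tree representation: a leaf is a string identifier,
# an internal node is a tuple of child subtrees.
Tree = Union[str, Tuple["Tree", ...]]


def propagate_colours(tree: Tree, leaf_colour: Dict[str, str],
                      out: Dict[Tree, Optional[str]]) -> Optional[str]:
    """Record a colour for every node of `tree` in `out`; return the root colour.

    Assumed rule: an internal node takes a colour only if all of its
    children carry that same colour; otherwise it is left uncoloured (None).
    """
    if isinstance(tree, str):
        colour = leaf_colour.get(tree)
    else:
        child_colours = {propagate_colours(c, leaf_colour, out) for c in tree}
        colour = child_colours.pop() if len(child_colours) == 1 else None
    out[tree] = colour
    return colour


# Made-up identifiers and taxonomic colours, purely illustrative.
example_tree: Tree = (("P1", "P2"), ("P3", "P4"))
example_colours = {"P1": "green", "P2": "green", "P3": "green", "P4": "blue"}
node_colours: Dict[Tree, Optional[str]] = {}
root_colour = propagate_colours(example_tree, example_colours, node_colours)
```

    In this toy tree the branch above P1 and P2 is coloured green, while the branch above P3 and P4 and the root stay uncoloured because their leaves disagree.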

  14. Role of monomer sequence and backbone chemistry in polypeptoid copolymers for marine antifouling coatings

    Science.gov (United States)

    Patterson, Anastasia; Wenning, Brandon; Rizis, Georgios; Calabrese, David; Finlay, John; Franco, Sofia; Clare, Anthony; Kramer, Edward; Ober, Christopher; Segalman, Rachel

    The design rules elucidated in this work suggest that antifouling coatings bearing pendant peptoid side chains perform better overall in marine fouling tests than those with peptide side chains, with extremely low attachment of N. incerta and high removal of U. linza. This difference in performance is likely due to the lack of a hydrogen bond donor in the peptoid backbone. Furthermore, we show that the bulk polymer material of these hierarchical coatings (based on PEO or PDMS) plays a key role in determining both surface presentation and fouling release performance. We demonstrate these trends utilizing a modular coating based on a triblock copolymer consisting of polystyrene and a vinyl-containing midblock, to which sequence-defined pendant oligomers (peptides or peptoids with sequences of oligo-PEO and fluoroalkyl groups) are attached via thiol-ene "click" chemistry. Surface presentation was analyzed with X-ray photoelectron spectroscopy and captive bubble water contact angle, and antifouling performance was evaluated with attachment and removal bioassays of the marine macroalga U. linza and diatom N. incerta. NSF GRFP and ONR PECASE.

  15. Semi-Supervised Learning for Classification of Protein Sequence Data

    Directory of Open Access Journals (Sweden)

    Brian R. King

    2008-01-01

    Protein sequence data continue to become available at an exponential rate. Annotation of functional and structural attributes of these data lags far behind, with only a small fraction of the data understood and labeled by experimental methods. Classification methods that are based on semi-supervised learning can increase the overall accuracy of classifying partly labeled data in many domains, but very few methods exist that have shown their effect on protein sequence classification. We show how proven methods from text classification can be applied to protein sequence data, as we consider both existing and novel extensions to the basic methods, and demonstrate restrictions and differences that must be considered. We demonstrate comparative results against the transductive support vector machine, and show superior results on the most difficult classification problems. Our results show that large repositories of unlabeled protein sequence data can indeed be used to improve predictive performance, particularly in situations where there are fewer labeled protein sequences available, and/or the data are highly unbalanced in nature.

  16. Molecular interactions of mussel protective coating protein, mcfp-1, from Mytilus californianus.

    Science.gov (United States)

    Lu, Qingye; Hwang, Dong Soo; Liu, Yang; Zeng, Hongbo

    2012-02-01

    Protective coating of the byssus of mussels (Mytilus sp.) has been suggested as a new paradigm of medical coating because it combines high extensibility and hardness without their mutual detriment. The only known biomacromolecule in the extensible and tough coating on the byssus is mussel foot protein-1 (mfp-1), which is rich in positively charged residues (~20 mol%) and lacks negatively charged residues. Here, adhesion and molecular interaction mechanisms of Mytilus californianus foot protein-1 (mcfp-1) from California blue mussel were investigated using a surface forces apparatus (SFA) in buffer solutions of different ionic concentrations (0.2-0.7 M) and pHs (3.0-5.5). Strong and reversible cohesion between opposed positively charged mcfp-1 films was measured in 0.1 M sodium acetate buffer with 0.1 M KNO3. Cohesion of mcfp-1 was gradually reduced with increasing ionic strength, but was not changed by pH variations. Oxidation of 3,4-dihydroxyphenylalanine (DOPA) residues of mcfp-1, a key residue for adhesive and coating proteins of mussel, did not change the cohesion strength of mcfp-1 films, but the addition of chemicals with aromatic groups (i.e., aspirin and 4-methylcatechol) increased the cohesion. These results suggest that the cohesion of mcfp-1 films is mainly mediated by cation-π interactions between the positively charged residues and benzene rings of DOPA and other aromatic amino acids (~20 mol% of total amino acids of mcfp-1), and π-π interactions between the phenyl groups in mcfp-1. The adhesion mechanism obtained for the mcfp-1 proteins provides important insight into the design and development of functional biomaterials and coatings mimicking the extensible and robust mussel cuticle coating. Copyright © 2011 Elsevier Ltd. All rights reserved.

  17. Anisotropic biodegradable lipid coated particles for spatially dynamic protein presentation.

    Science.gov (United States)

    Meyer, Randall A; Mathew, Mohit P; Ben-Akiva, Elana; Sunshine, Joel C; Shmueli, Ron B; Ren, Qiuyin; Yarema, Kevin J; Green, Jordan J

    2018-05-01

    There has been growing interest in the use of particles coated with lipids for applications ranging from drug delivery, gene delivery, and diagnostic imaging to immunoengineering. To date, almost all particles with lipid coatings have been spherical despite emerging evidence that non-spherical shapes can provide important advantages including reduced non-specific elimination and increased target-specific binding. We combine control of core particle geometry with control of particle surface functionality by developing anisotropic, biodegradable ellipsoidal particles with lipid coatings. We demonstrate that these lipid coated ellipsoidal particles maintain advantageous properties of lipid polymer hybrid particles, such as the ability for modular protein conjugation to the particle surface using versatile bioorthogonal ligation reactions. In addition, they exhibit biomimetic membrane fluidity and demonstrate lateral diffusive properties characteristic of natural membrane proteins. These ellipsoidal particles simultaneously provide benefits of non-spherical particles in terms of stability and resistance to non-specific phagocytosis by macrophages as well as enhanced targeted binding. These biomaterials provide a novel and flexible platform for numerous biomedical applications. The research reported here documents the ability of non-spherical polymeric particles to be coated with lipids to form anisotropic biomimetic particles. In addition, we demonstrate that these lipid-coated biodegradable polymeric particles can be conjugated to a wide variety of biological molecules in a "click-like" fashion. This is of interest due to the multiple types of cellular mimicry enabled by this biomaterial based technology. These features include mimicry of the highly anisotropic shape exhibited by cells, surface presentation of membrane bound protein mimetics, and lateral diffusivity of membrane bound substrates comparable to that of a plasma membrane. 
This platform is demonstrated to

  18. Effect of manufacturing process sequence on the corrosion resistance characteristics of coated metallic bipolar plates

    Science.gov (United States)

    Dur, Ender; Cora, Ömer Necati; Koç, Muammer

    2014-01-01

    Metallic bipolar plate (BPP) with high corrosion and low contact resistance, durability, strength, low cost, volume, and weight requirements is one of the critical parts of the PEMFC. This study is dedicated to understand the effect of the process sequence (manufacturing then coating vs. coating then manufacturing) on the corrosion resistance of coated metallic bipolar plates. To this goal, three different PVD coatings (titanium nitride (TiN), chromium nitride (CrN), zirconium nitride (ZrN)), with three thicknesses, (0.1, 0.5, 1 μm) were applied on BPPs made of 316L stainless steel alloy before and after two types of manufacturing (i.e., stamping or hydroforming). Corrosion test results indicated that ZrN coating exhibited the best corrosion protection while the performance of TiN coating was the lowest among the tested coatings and thicknesses. For most of the cases tested, in which coating was applied before manufacturing, occurrence of corrosion was found to be more profound than the case where coating was applied after manufacturing. Increasing the coating thickness was found to improve the corrosion resistance. It was also revealed that hydroformed BPPs performed slightly better than stamped BPPs in terms of the corrosion behavior.

  19. Sequence motifs in MADS transcription factors responsible for specificity and diversification of protein-protein interaction.

    Directory of Open Access Journals (Sweden)

    Aalt D J van Dijk

    Protein sequences encompass tertiary structures and contain information about specific molecular interactions, which in turn determine biological functions of proteins. Knowledge about how protein sequences define interaction specificity is largely missing, in particular for paralogous protein families with high sequence similarity, such as the plant MADS domain transcription factor family. In comparison to the situation in mammalian species, this important family of transcription regulators has expanded enormously in plant species and contains over 100 members in the model plant species Arabidopsis thaliana. Here, we provide insight into the mechanisms that determine protein-protein interaction specificity for the Arabidopsis MADS domain transcription factor family, using an integrated computational and experimental approach. Plant MADS proteins have highly similar amino acid sequences, but their dimerization patterns vary substantially. Our computational analysis uncovered small sequence regions that explain observed differences in dimerization patterns with reasonable accuracy. Furthermore, we show the usefulness of the method for prediction of MADS domain transcription factor interaction networks in other plant species. Introduction of mutations in the predicted interaction motifs demonstrated that single amino acid mutations can have a large effect and lead to loss or gain of specific interactions. In addition, various bioinformatics analyses shed light on the way evolution has shaped MADS domain transcription factor interaction specificity. Identified protein-protein interaction motifs appeared to be strongly conserved among orthologs, indicating their evolutionary importance. We also provide evidence that mutations in these motifs can be a source for sub- or neo-functionalization. The analyses presented here take us a step forward in understanding protein-protein interactions and the interplay between protein sequences and

  20. How Many Protein Sequences Fold to a Given Structure? A Coevolutionary Analysis.

    Science.gov (United States)

    Tian, Pengfei; Best, Robert B

    2017-10-17

    Quantifying the relationship between protein sequence and structure is key to understanding the protein universe. A fundamental measure of this relationship is the total number of amino acid sequences that can fold to a target protein structure, known as the "sequence capacity," which has been suggested as a proxy for how designable a given protein fold is. Although sequence capacity has been extensively studied using lattice models and theory, numerical estimates for real protein structures are currently lacking. In this work, we have quantitatively estimated the sequence capacity of 10 proteins with a variety of different structures using a statistical model based on residue-residue co-evolution to capture the variation of sequences from the same protein family. Remarkably, we find that even for the smallest protein folds, such as the WW domain, the number of foldable sequences is extremely large, exceeding the Avogadro constant. In agreement with earlier theoretical work, the calculated sequence capacity is positively correlated with the size of the protein, or better, the density of contacts. This allows the absolute sequence capacity of a given protein to be approximately predicted from its structure. On the other hand, the relative sequence capacity, i.e., normalized by the total number of possible sequences, is an extremely tiny number and is strongly anti-correlated with the protein length. Thus, although there may be more foldable sequences for larger proteins, it will be much harder to find them. Lastly, we have correlated the evolutionary age of proteins in the CATH database with their sequence capacity as predicted by our model. The results suggest a trade-off between the opposing requirements of high designability and the likelihood of a novel fold emerging by chance. Published by Elsevier Inc.
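
    The distinction between absolute and relative sequence capacity can be illustrated with a small calculation. The sketch below is not the authors' statistical model; it only compares the lower bound quoted in the abstract (more foldable sequences than the Avogadro constant) with the total number of possible sequences, 20^L. The length of 35 residues for a small domain is an assumed round figure that does not come from the abstract, and the function names are hypothetical.

```python
import math

AVOGADRO = 6.02214076e23  # exact SI value, per mole


def log10_total_sequences(length: int, alphabet: int = 20) -> float:
    """log10 of the number of possible sequences of the given length."""
    return length * math.log10(alphabet)


def log10_relative_capacity(log10_capacity: float, length: int) -> float:
    """log10 of (foldable sequences / all possible sequences)."""
    return log10_capacity - log10_total_sequences(length)


# Assumed length for a small domain; the abstract gives no figure.
assumed_length = 35
log10_lower_bound = math.log10(AVOGADRO)              # about 23.8
log10_total = log10_total_sequences(assumed_length)   # about 45.5
log10_fraction = log10_relative_capacity(log10_lower_bound, assumed_length)
```

    Under these assumed numbers, a capacity equal to the Avogadro constant would still correspond to only about one sequence in 10^22 of all possible sequences, which shows how a very large absolute capacity can coexist with a tiny relative one.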

  1. Predicting the tolerated sequences for proteins and protein interfaces using RosettaBackrub flexible backbone design.

    Directory of Open Access Journals (Sweden)

    Colin A Smith

    Predicting the set of sequences that are tolerated by a protein or protein interface, while maintaining a desired function, is useful for characterizing protein interaction specificity and for computationally designing sequence libraries to engineer proteins with new functions. Here we provide a general method, a detailed set of protocols, and several benchmarks and analyses for estimating tolerated sequences using flexible backbone protein design implemented in the Rosetta molecular modeling software suite. The input to the method is at least one experimentally determined three-dimensional protein structure or high-quality model. The starting structure(s) are expanded or refined into a conformational ensemble using Monte Carlo simulations consisting of backrub backbone and side chain moves in Rosetta. The method then uses a combination of simulated annealing and genetic algorithm optimization methods to enrich for low-energy sequences for the individual members of the ensemble. To emphasize certain functional requirements (e.g. forming a binding interface), interactions between and within parts of the structure (e.g. domains) can be reweighted in the scoring function. Results from each backbone structure are merged together to create a single estimate for the tolerated sequence space. We provide an extensive description of the protocol and its parameters, all source code, example analysis scripts and three tests applying this method to finding sequences predicted to stabilize proteins or protein interfaces. The generality of this method makes many other applications possible, for example stabilizing interactions with small molecules, DNA, or RNA. Through the use of within-domain reweighting and/or multistate design, it may also be possible to use this method to find sequences that stabilize particular protein conformations or binding interactions over others.

  2. Comprehensive protein profiling by multiplexed capillary zone electrophoresis using cross-linked polyacrylamide coated capillaries.

    Science.gov (United States)

    Liu, Shaorong; Gao, Lin; Pu, Qiaosheng; Lu, Joann J; Wang, Xingjia

    2006-02-01

    We have recently developed a new process to create cross-linked polyacrylamide (CPA) coatings on capillary walls to suppress protein-wall interactions. Here, we demonstrate CPA-coated capillaries for high-efficiency (>2 x 10^6 plates per meter) protein separations by capillary zone electrophoresis (CZE). Because CPA virtually eliminates electroosmotic flow, positive and negative proteins cannot be analyzed in a single run. A "one-sample-two-separation" approach is developed to achieve a comprehensive protein analysis. High throughput is achieved through a multiplexed CZE system.

  3. CMsearch: simultaneous exploration of protein sequence space and structure space improves not only protein homology detection but also protein structure prediction

    KAUST Repository

    Cui, Xuefeng

    2016-06-15

    Motivation: Protein homology detection, a fundamental problem in computational biology, is an indispensable step toward predicting protein structures and understanding protein functions. Despite the advances in recent decades on sequence alignment, threading and alignment-free methods, protein homology detection remains a challenging open problem. Recently, network methods that try to find transitive paths in the protein structure space demonstrate the importance of incorporating network information of the structure space. Yet, current methods merge the sequence space and the structure space into a single space, and thus introduce inconsistency in combining different sources of information. Method: We present a novel network-based protein homology detection method, CMsearch, based on cross-modal learning. Instead of exploring a single network built from the mixture of sequence and structure space information, CMsearch builds two separate networks to represent the sequence space and the structure space. It then learns sequence–structure correlation by simultaneously taking sequence information, structure information, sequence space information and structure space information into consideration. Results: We tested CMsearch on two challenging tasks, protein homology detection and protein structure prediction, by querying all 8332 PDB40 proteins. Our results demonstrate that CMsearch is insensitive to the similarity metrics used to define the sequence and the structure spaces. By using HMM–HMM alignment as the sequence similarity metric, CMsearch clearly outperforms state-of-the-art homology detection methods and the CASP-winning template-based protein structure prediction methods.

  4. EST2Prot: Mapping EST sequences to proteins

    Directory of Open Access Journals (Sweden)

    Lin David M

    2006-03-01

    Background: EST libraries are used in various biological studies, from microarray experiments to proteomic and genetic screens. These libraries usually contain many uncharacterized ESTs that are typically ignored since they cannot be mapped to known genes. Consequently, new discoveries are possibly overlooked. Results: We describe a system (EST2Prot) that uses multiple elements to map EST sequences to their corresponding protein products. EST2Prot uses UniGene clusters, substring analysis, information about protein coding regions in existing DNA sequences and protein database searches to detect protein products related to a query EST sequence. Gene Ontology terms, Swiss-Prot keywords, and protein similarity data are used to map the ESTs to functional descriptors. Conclusion: EST2Prot extends and significantly enriches the popular UniGene mapping by utilizing multiple relations between known biological entities. It produces a mapping between ESTs and proteins in real-time through a simple web-interface. The system is part of the Biozon database and is accessible at http://biozon.org/tools/est/.

  5. Adaptive compressive learning for prediction of protein-protein interactions from primary sequence.

    Science.gov (United States)

    Zhang, Ya-Nan; Pan, Xiao-Yong; Huang, Yan; Shen, Hong-Bin

    2011-08-21

    Protein-protein interactions (PPIs) play an important role in biological processes. Although much effort has been devoted to the identification of novel PPIs by integrating experimental biological knowledge, many difficulties remain because protein structural and functional information is insufficient. Methods that predict PPIs from amino acid sequences alone are therefore highly desirable. However, sequence-based predictors often struggle with high dimensionality, which causes over-fitting and high computational complexity, as well as with redundancy in the sequential feature vectors. In this paper, a novel computational approach based on compressed sensing theory is proposed to predict yeast Saccharomyces cerevisiae PPIs from primary sequence and has achieved promising results. The key advantage of the proposed compressed sensing algorithm is that it can compress the original high-dimensional protein sequential feature vector into a much lower-dimensional but more condensed space, taking the sparsity property of the original signal into account. What makes compressed sensing especially attractive in protein sequence analysis is that the compressed signal can be reconstructed from far fewer measurements than is usually considered necessary in traditional Nyquist sampling theory. Experimental results demonstrate that the proposed compressed sensing method is powerful for analyzing noisy biological data and reducing redundancy in feature vectors. The proposed method represents a new strategy for dealing with high-dimensional discrete protein models and has great potential to be extended to many other complicated biological systems. Copyright © 2011 Elsevier Ltd. All rights reserved.

  6. Design of Protein Multi-specificity Using an Independent Sequence Search Reduces the Barrier to Low Energy Sequences.

    Directory of Open Access Journals (Sweden)

    Alexander M Sevy

    2015-07-01

    Computational protein design has found great success in engineering proteins for thermodynamic stability, binding specificity, or enzymatic activity in a 'single state' design (SSD) paradigm. Multi-specificity design (MSD), on the other hand, involves considering the stability of multiple protein states simultaneously. We have developed a novel MSD algorithm, which we refer to as REstrained CONvergence in multi-specificity design (RECON). The algorithm allows each state to adopt its own sequence throughout the design process rather than enforcing a single sequence on all states. Convergence to a single sequence is encouraged through an incrementally increasing convergence restraint for corresponding positions. Compared to MSD algorithms that enforce (constrain) an identical sequence on all states, the energy landscape is simplified, which accelerates the search drastically. As a result, RECON can readily be used in simulations with a flexible protein backbone. We have benchmarked RECON on two design tasks. First, we designed antibodies derived from a common germline gene against their diverse targets to assess recovery of the germline, polyspecific sequence. Second, we designed "promiscuous", polyspecific proteins against all binding partners and measured recovery of the native sequence. We show that RECON is able to efficiently recover native-like, biologically relevant sequences in this diverse set of protein complexes.

  7. Membrane-bound conformation of M13 major coat protein : a structure validation through FRET-derived constraints

    NARCIS (Netherlands)

    Vos, W.L.; Koehorst, R.B.M.; Spruijt, R.B.; Hemminga, M.A.

    2005-01-01

    M13 major coat protein, a 50-amino-acid-long protein, was incorporated into DOPC/DOPG (80/20 molar ratio) unilamellar vesicles. Over 60% of all amino acid residues was replaced with cysteine residues, and the single cysteine mutants were labeled with the fluorescent label I-AEDANS. The coat protein

  8. Nonlinear analysis of sequence symmetry of beta-trefoil family proteins

    Energy Technology Data Exchange (ETDEWEB)

    Li Mingfeng; Huang Yanzhao; Xu Ruizhen; Xiao Yi [Biomolecular Physics and Modeling Group, Department of Physics, Huazhong University of Science and Technology, Wuhan 430074, Hubei (China)]. E-mail: yxiao@mail.hust.edu.cn

    2005-07-01

    The tertiary structures of proteins of beta-trefoil family have three-fold quasi-symmetry while their amino acid sequences appear almost at random. In the present paper we show that these amino acid sequences have hidden symmetries in fact and furthermore the degrees of these hidden symmetries are the same as those of their tertiary structures. We shall present a modified recurrence plot to reveal hidden symmetries in protein sequences. Our results can explain the contradiction in sequence-structure relations of proteins of beta-trefoil family.
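
    A plain recurrence plot of a sequence marks every pair of positions that carry the same residue, so internal repeats show up as lines parallel to the main diagonal. The sketch below shows only this basic construction. The abstract does not describe the authors' modification, so none is attempted here, and the function names and the toy sequence are invented for illustration.

```python
from typing import List


def recurrence_matrix(seq: str) -> List[List[int]]:
    """R[i][j] = 1 where positions i and j hold the same residue, else 0."""
    n = len(seq)
    return [[1 if seq[i] == seq[j] else 0 for j in range(n)] for i in range(n)]


def diagonal_score(matrix: List[List[int]], offset: int) -> float:
    """Fraction of matches on the diagonal shifted by `offset` positions."""
    n = len(matrix)
    cells = [matrix[i][i + offset] for i in range(n - offset)]
    return sum(cells) / len(cells) if cells else 0.0


# Toy sequence built from three copies of a made-up 4-residue unit.
toy = "ACDE" * 3
R = recurrence_matrix(toy)
scores = {k: diagonal_score(R, k) for k in range(1, len(toy))}
```

    For this exact threefold toy repeat, the diagonals at offsets 4 and 8 are fully populated while all other offsets are empty. Real beta-trefoil sequences are far noisier, which is presumably why a modified plot was needed to expose the symmetry.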

  9. Contrasting evolutionary patterns of spore coat proteins in two Bacillus species groups are linked to a difference in cellular structure

    Science.gov (United States)

    2013-01-01

    Background: The Bacillus subtilis-group and the Bacillus cereus-group are two well-studied groups of species in the genus Bacillus. Bacteria in this genus can produce a highly resistant cell type, the spore, which is encased in a complex protective protein shell called the coat. Spores in the B. cereus-group contain an additional outer layer, the exosporium, which encircles the coat. The coat in B. subtilis spores possesses inner and outer layers. The aim of this study is to investigate whether differences in the spore structures influenced the divergence of the coat protein genes during the evolution of these two Bacillus species groups. Results: We designed and implemented a computational framework to compare the evolutionary histories of coat proteins. We curated a list of B. subtilis coat proteins and identified their orthologs in 11 Bacillus species based on phylogenetic congruence. Phylogenetic profiles of these coat proteins show that they can be divided into conserved and labile ones. Coat proteins comprising the B. subtilis inner coat are significantly more conserved than those comprising the outer coat. We then performed genome-wide comparisons of the nonsynonymous/synonymous substitution rate ratio, dN/dS, and found contrasting patterns: Coat proteins have significantly higher dN/dS in the B. subtilis-group genomes, but not in the B. cereus-group genomes. We further corroborated this contrast by examining changes of dN/dS within gene trees, and found that some coat protein gene trees have significantly different dN/dS between the B. subtilis-clade and the B. cereus-clade. Conclusions: Coat proteins in the B. subtilis- and B. cereus-group species are under contrasting selective pressures. We speculate that the absence of the exosporium in the B. subtilis spore coat effectively lifted a structural constraint that has led to relaxed negative selection pressure on the outer coat. PMID:24283940

  10. Correlated mutations in protein sequences: Phylogenetic and structural effects

    Energy Technology Data Exchange (ETDEWEB)

    Lapedes, A.S. [Los Alamos National Lab., NM (United States). Theoretical Div.]|[Santa Fe Inst., NM (United States); Giraud, B.G. [C.E.N. Saclay, Gif/Yvette (France). Service Physique Theorique; Liu, L.C. [Los Alamos National Lab., NM (United States). Theoretical Div.; Stormo, G.D. [Univ. of Colorado, Boulder, CO (United States). Dept. of Molecular, Cellular and Developmental Biology

    1998-12-01

    Covariation analysis of sets of aligned sequences for RNA molecules is relatively successful in elucidating RNA secondary structure, as well as some aspects of tertiary structure. Covariation analysis of sets of aligned sequences for protein molecules is successful in certain instances in elucidating certain structural and functional links, but in general, pairs of sites displaying highly covarying mutations in protein sequences do not necessarily correspond to sites that are spatially close in the protein structure. In this paper the authors identify two reasons why naive use of covariation analysis for protein sequences fails to reliably indicate sequence positions that are spatially proximate. The first reason involves the bias introduced in calculation of covariation measures due to the fact that biological sequences are generally related by a non-trivial phylogenetic tree. The authors present a null-model approach to solve this problem. The second reason involves linked chains of covariation which can result in pairs of sites displaying significant covariation even though they are not spatially proximate. They present a maximum entropy solution to this classic problem of causation versus correlation. The methodologies are validated in simulation.
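
    As a rough illustration of the "naive" covariation measure this abstract criticizes, the sketch below scores a pair of alignment columns by mutual information. The choice of mutual information, the function name and the toy alignment are assumptions made for illustration; the authors' null-model and maximum entropy corrections are not reproduced here.

```python
from collections import Counter
from math import log2


def column_mutual_information(alignment, i, j):
    """Mutual information (in bits) between alignment columns i and j.

    `alignment` is a list of equal-length aligned sequences. This is the
    uncorrected score: it ignores phylogenetic relatedness and chains of
    indirect covariation.
    """
    n = len(alignment)
    col_i = [seq[i] for seq in alignment]
    col_j = [seq[j] for seq in alignment]
    count_i = Counter(col_i)
    count_j = Counter(col_j)
    count_ij = Counter(zip(col_i, col_j))

    mi = 0.0
    for (a, b), count in count_ij.items():
        p_ab = count / n
        # p_ab / (p_a * p_b), written with raw counts
        ratio = p_ab * n * n / (count_i[a] * count_j[b])
        mi += p_ab * log2(ratio)
    return mi


# Hypothetical toy alignment: columns 0 and 1 covary perfectly,
# column 2 is invariant, column 3 varies independently of column 0.
toy_alignment = ["AKLE", "AKLD", "GRLE", "GRLD"]

covarying = column_mutual_information(toy_alignment, 0, 1)
independent = column_mutual_information(toy_alignment, 0, 3)
```

    A high score for a column pair says only that the two sites vary together in the sample; as the abstract notes, it does not by itself show that the sites are spatially close.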

  11. Association of sequences in the coat protein/readthrough domain of potato mop-top virus with transmission by Spongospora subterranea.

    Science.gov (United States)

    Reavy, B; Arif, M; Cowan, G H; Torrance, L

    1998-10-01

    A monofungal culture of Spongospora subterranea was unable to acquire and transmit the T isolate of potato mop-top pomovirus (PMTV-T), which has been maintained by manual transmission in the laboratory for 30 years. A recently obtained field isolate (PMTV-S) was efficiently acquired and transmitted by the same fungus culture. Sequence analysis of the readthrough (RT) protein-coding region of PMTV-S showed the presence of an additional 543 nt in the 3' half of the coding region relative to that of PMTV-T. These additional nucleotides preserved the reading frame of the RT protein and inserted 181 amino acids into the RT protein. This was confirmed by a comparison by immunoblotting of the sizes of the RT protein of PMTV-T and other recent isolates of PMTV.
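
    The reading-frame statement in this abstract can be checked with simple arithmetic: an insertion keeps the frame only if its length is a multiple of three. The helper below is a hypothetical illustration, not part of the study.

```python
def insertion_effect(inserted_nt):
    """Return (whole codons added, whether the reading frame is preserved)."""
    codons, remainder = divmod(inserted_nt, 3)
    return codons, remainder == 0


# The 543 additional nucleotides reported for PMTV-S
codons_added, frame_preserved = insertion_effect(543)
```

    For 543 nt this gives 181 whole codons with no remainder, which matches the 181 inserted amino acids reported above.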

  12. The Influence of Sporulation Conditions on the Spore Coat Protein Composition of Bacillus subtilis Spores.

    Science.gov (United States)

    Abhyankar, Wishwas R; Kamphorst, Kiki; Swarge, Bhagyashree N; van Veen, Henk; van der Wel, Nicole N; Brul, Stanley; de Koster, Chris G; de Koning, Leo J

    2016-01-01

    Spores are of high interest to the food and health sectors because of their extreme resistance to harsh conditions, especially against heat. Earlier research has shown that spores prepared on solid agar plates have a higher heat resistance than those prepared under a liquid medium condition. It has also been shown that the more mature a spore is, the higher is its heat resistance most likely mediated, at least in part, by the progressive cross-linking of coat proteins. The current study for the first time assesses, at the proteomic level, the effect of two commonly used sporulation conditions on spore protein presence. 14N spores prepared on solid Schaeffer's-glucose (SG) agar plates and 15N metabolically labeled spores prepared in shake flasks containing 3-(N-morpholino) propane sulfonic acid (MOPS) buffered defined liquid medium differ in their coat protein composition as revealed by LC-FT-MS/MS analyses. The former condition mimics the industrial settings while the latter conditions mimic the routine laboratory environment wherein spores are developed. As seen previously in many studies, the spores prepared on the solid agar plates show a higher thermal resistance than the spores prepared under liquid culture conditions. The 14N:15N isotopic ratio of the 1:1 mixture of the spore suspensions exposes that most of the identified inner coat and crust proteins are significantly more abundant while most of the outer coat proteins are significantly less abundant for the spores prepared on solid SG agar plates relative to the spores prepared in the liquid MOPS buffered defined medium. Sporulation condition-specific differences and variation in isotopic ratios between the tryptic peptides of expected cross-linked proteins suggest that the coat protein cross-linking may also be condition specific. Since the core dipicolinic acid content is found to be similar in both the spore populations, it appears that the difference in wet heat resistance is connected to the

  13. The Influence of Sporulation Conditions on the Spore Coat Protein Composition of Bacillus subtilis Spores

    Science.gov (United States)

    Abhyankar, Wishwas R.; Kamphorst, Kiki; Swarge, Bhagyashree N.; van Veen, Henk; van der Wel, Nicole N.; Brul, Stanley; de Koster, Chris G.; de Koning, Leo J.

    2016-01-01

    Spores are of high interest to the food and health sectors because of their extreme resistance to harsh conditions, especially against heat. Earlier research has shown that spores prepared on solid agar plates have a higher heat resistance than those prepared under a liquid medium condition. It has also been shown that the more mature a spore is, the higher is its heat resistance most likely mediated, at least in part, by the progressive cross-linking of coat proteins. The current study for the first time assesses, at the proteomic level, the effect of two commonly used sporulation conditions on spore protein presence. 14N spores prepared on solid Schaeffer’s-glucose (SG) agar plates and 15N metabolically labeled spores prepared in shake flasks containing 3-(N-morpholino) propane sulfonic acid (MOPS) buffered defined liquid medium differ in their coat protein composition as revealed by LC-FT-MS/MS analyses. The former condition mimics the industrial settings while the latter conditions mimic the routine laboratory environment wherein spores are developed. As seen previously in many studies, the spores prepared on the solid agar plates show a higher thermal resistance than the spores prepared under liquid culture conditions. The 14N:15N isotopic ratio of the 1:1 mixture of the spore suspensions exposes that most of the identified inner coat and crust proteins are significantly more abundant while most of the outer coat proteins are significantly less abundant for the spores prepared on solid SG agar plates relative to the spores prepared in the liquid MOPS buffered defined medium. Sporulation condition-specific differences and variation in isotopic ratios between the tryptic peptides of expected cross-linked proteins suggest that the coat protein cross-linking may also be condition specific. Since the core dipicolinic acid content is found to be similar in both the spore populations, it appears that the difference in wet heat resistance is connected to the

  14. The influence of sporulation conditions on the spore coat protein composition of Bacillus subtilis spores.

    Directory of Open Access Journals (Sweden)

    Wishwas R. Abhyankar

    2016-10-01

    Spores are of high interest to the food and health sectors because of their extreme resistance to harsh conditions, especially against heat. Earlier research has shown that spores prepared on solid agar plates have a higher heat resistance than those prepared under a liquid medium condition. It has also been shown that the more mature a spore is, the higher is its heat resistance most likely mediated, at least in part, by the progressive cross-linking of coat proteins. The current study for the first time assesses, at the proteomic level, the effect of two commonly used sporulation conditions on spore protein presence. 14N spores prepared on solid SG agar plates and 15N metabolically labelled spores prepared in shake flasks containing MOPS buffered defined liquid medium differ in their coat protein composition as revealed by LC-FT-MS/MS analyses. The former condition mimics the industrial settings while the latter conditions mimic the routine laboratory environment wherein spores are developed. As seen previously in many studies, the spores prepared on the solid agar plates show a higher thermal resistance than the spores prepared under liquid culture conditions. The 14N:15N isotopic ratio of the 1:1 mixture of the spore suspensions exposes that most of the identified inner coat and crust proteins are significantly more abundant while most of the outer coat proteins are significantly less abundant for the spores prepared on solid SG agar plates relative to the spores prepared in the liquid MOPS buffered defined medium. Sporulation condition-specific differences and variation in isotopic ratios between the tryptic peptides of expected cross-linked proteins suggest that the coat protein cross-linking may also be condition specific. Since the core dipicolinic acid content is found to be similar in both the spore populations, it appears that the difference in wet heat resistance is connected to the differences in the coat protein composition and

  15. Deep sequencing methods for protein engineering and design.

    Science.gov (United States)

    Wrenbeck, Emily E; Faber, Matthew S; Whitehead, Timothy A

    2017-08-01

    The advent of next-generation sequencing (NGS) has revolutionized protein science, and the development of complementary methods enabling NGS-driven protein engineering have followed. In general, these experiments address the functional consequences of thousands of protein variants in a massively parallel manner using genotype-phenotype linked high-throughput functional screens followed by DNA counting via deep sequencing. We highlight the use of information rich datasets to engineer protein molecular recognition. Examples include the creation of multiple dual-affinity Fabs targeting structurally dissimilar epitopes and engineering of a broad germline-targeted anti-HIV-1 immunogen. Additionally, we highlight the generation of enzyme fitness landscapes for conducting fundamental studies of protein behavior and evolution. We conclude with discussion of technological advances. Copyright © 2016 Elsevier Ltd. All rights reserved.

  16. Biophysical and structural considerations for protein sequence evolution

    Directory of Open Access Journals (Sweden)

    Grahnen Johan A

    2011-12-01

    Background Protein sequence evolution is constrained by the biophysics of folding and function, causing interdependence between interacting sites in the sequence. However, current site-independent models of sequence evolution do not take this into account. Recent attempts to integrate the influence of structure and biophysics into phylogenetic models via statistical/informational approaches have not resulted in expected improvements in model performance. This suggests that further innovations are needed for progress in this field. Results Here we develop a coarse-grained physics-based model of protein folding and binding function, and compare it to a popular informational model. We find that both models violate the assumption of the native sequence being close to a thermodynamic optimum, causing directional selection away from the native state. Sampling and simulation show that the physics-based model is more specific for fold-defining interactions that vary less among residue type. The informational model diffuses further in sequence space with fewer barriers and tends to provide less support for an invariant sites model, although amino acid substitutions are generally conservative. Both approaches produce sequences with natural features like dN/dS Conclusions Simple coarse-grained models of protein folding can describe some natural features of evolving proteins but are currently not accurate enough to use in evolutionary inference. This is partly due to improper packing of the hydrophobic core. We suggest possible improvements on the representation of structure, folding energy, and binding function, as regards both native and non-native conformations, and describe a large number of possible applications for such a model.

  17. Differential trypanosome surface coat regulation by a CCCH protein that co-associates with procyclin mRNA cis-elements.

    Directory of Open Access Journals (Sweden)

    Pegine Walrad

    2009-02-01

    The genome of Trypanosoma brucei is unusual in being regulated almost entirely at the post-transcriptional level. In terms of regulation, the best-studied genes are procyclins, which encode a family of major surface GPI-anchored glycoproteins (EP1, EP2, EP3, GPEET) that show differential expression in the parasite's tsetse-fly vector. Although procyclin mRNA cis-regulatory sequences have provided the paradigm for post-transcriptional control in kinetoplastid parasites, trans-acting regulators of procyclin mRNAs are unidentified, despite intensive effort over 15 years. Here we identify the developmental regulator, TbZFP3, a CCCH-class predicted RNA binding protein, as an isoform-specific regulator of Procyclin surface coat expression in trypanosomes. We demonstrate (i) that endogenous TbZFP3 shows sequence-specific co-precipitation of EP1 and GPEET, but not EP2 and EP3, procyclin mRNA isoforms, (ii) that ectopic overexpression of TbZFP3 does not perturb the mRNA abundance of procyclin transcripts, but rather that (iii) their protein expression is regulated in an isoform-specific manner, as evidenced by mass spectrometric analysis of the Procyclin expression signature in the transgenic cell lines. The TbZFP3 mRNA-protein complex (TbZFP3mRNP) is identified as a trans-regulator of differential surface protein expression in trypanosomes. Moreover, its sequence-specific interactions with procyclin mRNAs are compatible with long-established predictions for Procyclin regulation. Combined with the known association of TbZFP3 with the translational apparatus, this study provides a long-sought missing link between surface protein cis-regulatory signals and the gene expression machinery in trypanosomes.

  18. Photoreactive elastin-like proteins for use as versatile bioactive materials and surface coatings.

    Science.gov (United States)

    Raphel, Jordan; Parisi-Amon, Andreina; Heilshorn, Sarah

    2012-10-07

    Photocrosslinkable, protein-engineered biomaterials combine a rapid, controllable, cytocompatible crosslinking method with a modular design strategy to create a new family of bioactive materials. These materials have a wide range of biomedical applications, including the development of bioactive implant coatings, drug delivery vehicles, and tissue engineering scaffolds. We present the successful functionalization of a bioactive elastin-like protein with photoreactive diazirine moieties. Scalable synthesis is achieved using a standard recombinant protein expression host followed by site-specific modification of lysine residues with a heterobifunctional N-hydroxysuccinimide ester-diazirine crosslinker. The resulting biomaterial is demonstrated to be processable by spin coating, drop casting, soft lithographic patterning, and mold casting to fabricate a variety of two- and three-dimensional photocrosslinked biomaterials with length scales spanning the nanometer to millimeter range. Protein thin films proved to be highly stable over a three-week period. Cell-adhesive functional domains incorporated into the engineered protein materials were shown to remain active post-photo-processing. Human adipose-derived stem cells achieved faster rates of cell adhesion and larger spread areas on thin films of the engineered protein compared to control substrates. The ease and scalability of material production, processing versatility, and modular bioactive functionality make this recombinantly engineered protein an ideal candidate for the development of novel biomaterial coatings, films, and scaffolds.

  19. Albizia lebbeck Seed Coat Proteins Bind to Chitin and Act as a Defense against Cowpea Weevil Callosobruchus maculatus.

    Science.gov (United States)

    Silva, Nadia C M; De Sá, Leonardo F R; Oliveira, Eduardo A G; Costa, Monique N; Ferreira, Andre T S; Perales, Jonas; Fernandes, Kátia V S; Xavier-Filho, Jose; Oliveira, Antonia E A

    2016-05-11

    The seed coat is an external tissue that participates in defense against insects. In some nonhost seeds, including Albizia lebbeck, the insect Callosobruchus maculatus dies during seed coat penetration. We investigated the toxicity of A. lebbeck seed coat proteins to C. maculatus. A chitin-binding protein fraction was isolated from seed coat, and mass spectrometry showed similarity to a C1 cysteine protease. By ELM program an N-glycosylation interaction motif was identified in this protein, and by molecular docking the potential to interact with N-acetylglucosamine (NAG) was shown. The chitin-binding protein fraction was toxic to C. maculatus and was present in larval midgut and feces but not able to hydrolyze larval gut proteins. It did not interfere, though, with the intestinal cell permeability. These results indicate that the toxicity mechanism of this seed coat fraction may be related to its binding to chitin, present in the larvae gut, disturbing nutrient absorption.

  20. MIPS: a database for protein sequences, homology data and yeast genome information.

    Science.gov (United States)

    Mewes, H W; Albermann, K; Heumann, K; Liebl, S; Pfeiffer, F

    1997-01-01

    The MIPS group (Martinsried Institute for Protein Sequences) at the Max-Planck-Institute for Biochemistry, Martinsried near Munich, Germany, collects, processes and distributes protein sequence data within the framework of the tripartite association of the PIR-International Protein Sequence Database. MIPS contributes nearly 50% of the data input to the PIR-International Protein Sequence Database. The database is distributed on CD-ROM together with PATCHX, an exhaustive supplement of unique, unverified protein sequences from external sources compiled by MIPS. Through its WWW server (http://www.mips.biochem.mpg.de/) MIPS permits internet access to sequence databases, homology data and to yeast genome information. (i) Sequence similarity results from the FASTA program are stored in the FASTA database for all proteins from PIR-International and PATCHX. The database is dynamically maintained and permits instant access to FASTA results. (ii) Starting with FASTA database queries, proteins have been classified into families and superfamilies (PROT-FAM). (iii) The HPT (hashed position tree) data structure developed at MIPS is a new approach for rapid sequence and pattern searching. (iv) MIPS provides access to the sequence and annotation of the complete yeast genome, the functional classification of yeast genes (FunCat) and its graphical display, the 'Genome Browser'. A CD-ROM based on the JAVA programming language providing dynamic interactive access to the yeast genome and the related protein sequences has been compiled and is available on request. PMID:9016498

  1. The 96th Amino Acid of the Coat Protein of Cucumber Green Mottle Mosaic Virus Affects Virus Infectivity

    Directory of Open Access Journals (Sweden)

    Zhenwei Zhang

    2017-12-01

    Cucumber green mottle mosaic virus (CGMMV) is one of the most devastating viruses infecting members of the family Cucurbitaceae. The assembly initiation site of CGMMV is located in the coding region of the coat protein, which is not only involved in virion assembly but is also a key factor determining the long-distance movement of the virus. To understand the effect of the assembly initiation site and the adjacent region on CGMMV infectivity, we created a GTT deletion mutation in the GAGGTTG assembly initiation site of the infectious clone of CGMMV, which we termed V97 (deletion mutation at residue 97 of coat protein), followed by the construction of the V94A and T104A mutants. We observed that these three mutations caused mosaic after Agrobacterium-mediated transformation in Nicotiana benthamiana, albeit with a significant delay compared to the wild type clone. The mutants also had a common spontaneous E96K mutation in the coat protein. These results indicated that the initial assembly site and the sequence of the adjacent region affected the infectivity of the virus and that E96 might play an essential role in this process. We constructed two single point mutants (E96A and E96K) and three double mutants (V94A-E96K, V97-E96K and T104A-E96K) to further understand the role of E96 in CGMMV pathogenesis. After inoculation in N. benthamiana, E96A showed delayed systemic symptoms, but the E96K and three double mutants exhibited typical symptoms of mosaic at seven days post-infection. Then, sap from CGMMV-infected N. benthamiana leaves was mechanically inoculated on watermelon plants. We confirmed that E96 affected CGMMV infection using double antibody sandwich-enzyme-linked immunosorbent assay (DAS-ELISA), reverse transcription-polymerase chain reaction (RT-PCR), and sequencing, which further confirmed the successful infection of the related mutants, and that E96K can compensate for the effect of the V94, V97, and T104 mutations on virus infectivity. In

  2. Dual Function of Novel Pollen Coat (Surface) Proteins: IgE-binding Capacity and Proteolytic Activity Disrupting the Airway Epithelial Barrier

    Science.gov (United States)

    Bashir, Mohamed Elfatih H.; Ward, Jason M.; Cummings, Matthew; Karrar, Eltayeb E.; Root, Michael; Mohamed, Abu Bekr A.; Naclerio, Robert M.; Preuss, Daphne

    2013-01-01

    Background The pollen coat is the first structure of the pollen to encounter the mucosal immune system upon inhalation. Prior characterizations of pollen allergens have focused on water-soluble, cytoplasmic proteins, but have overlooked much of the extracellular pollen coat. Due to washing with organic solvents when prepared, these pollen coat proteins are typically absent from commercial standardized allergenic extracts (i.e., “de-fatted”), and, as a result, their involvement in allergy has not been explored. Methodology/Principal Findings Using a unique approach to search for pollen allergenic proteins residing in the pollen coat, we employed transmission electron microscopy (TEM) to assess the impact of organic solvents on the structural integrity of the pollen coat. TEM results indicated that de-fatting of Cynodon dactylon (Bermuda grass) pollen (BGP) by use of organic solvents altered the structural integrity of the pollen coat. The novel IgE-binding proteins of the BGP coat include a cysteine protease (CP) and endoxylanase (EXY). The full-length cDNA that encodes the novel IgE-reactive CP was cloned from floral RNA. The EXY and CP were purified to homogeneity and tested for IgE reactivity. The CP from the BGP coat increased the permeability of human airway epithelial cells, caused a clear concentration-dependent detachment of cells, and damaged their barrier integrity. Conclusions/Significance Using an immunoproteomics approach, novel allergenic proteins of the BGP coat were identified. These proteins represent a class of novel dual-function proteins residing on the coat of the pollen grain that have IgE-binding capacity and proteolytic activity, which disrupts the integrity of the airway epithelial barrier. The identification of pollen coat allergens might explain the IgE-negative response to available skin-prick-testing proteins in patients who have positive symptoms. Further study of the role of these pollen coat proteins in allergic responses is

  3. Dual function of novel pollen coat (surface) proteins: IgE-binding capacity and proteolytic activity disrupting the airway epithelial barrier.

    Directory of Open Access Journals (Sweden)

    Mohamed Elfatih H Bashir

    BACKGROUND: The pollen coat is the first structure of the pollen to encounter the mucosal immune system upon inhalation. Prior characterizations of pollen allergens have focused on water-soluble, cytoplasmic proteins, but have overlooked much of the extracellular pollen coat. Due to washing with organic solvents when prepared, these pollen coat proteins are typically absent from commercial standardized allergenic extracts (i.e., "de-fatted"), and, as a result, their involvement in allergy has not been explored. METHODOLOGY/PRINCIPAL FINDINGS: Using a unique approach to search for pollen allergenic proteins residing in the pollen coat, we employed transmission electron microscopy (TEM) to assess the impact of organic solvents on the structural integrity of the pollen coat. TEM results indicated that de-fatting of Cynodon dactylon (Bermuda grass) pollen (BGP) by use of organic solvents altered the structural integrity of the pollen coat. The novel IgE-binding proteins of the BGP coat include a cysteine protease (CP) and endoxylanase (EXY). The full-length cDNA that encodes the novel IgE-reactive CP was cloned from floral RNA. The EXY and CP were purified to homogeneity and tested for IgE reactivity. The CP from the BGP coat increased the permeability of human airway epithelial cells, caused a clear concentration-dependent detachment of cells, and damaged their barrier integrity. CONCLUSIONS/SIGNIFICANCE: Using an immunoproteomics approach, novel allergenic proteins of the BGP coat were identified. These proteins represent a class of novel dual-function proteins residing on the coat of the pollen grain that have IgE-binding capacity and proteolytic activity, which disrupts the integrity of the airway epithelial barrier. The identification of pollen coat allergens might explain the IgE-negative response to available skin-prick-testing proteins in patients who have positive symptoms. Further study of the role of these pollen coat proteins in allergic

  4. Formatt: Correcting protein multiple structural alignments by incorporating sequence alignment

    Directory of Open Access Journals (Sweden)

    Daniels Noah M

    2012-10-01

    Background The quality of multiple protein structure alignments is usually computed and assessed based on geometric functions of the coordinates of the backbone atoms from the protein chains. These purely geometric methods do not directly utilize protein sequence similarity, and in fact, determining the proper way to incorporate sequence similarity measures into the construction and assessment of protein multiple structure alignments has proved surprisingly difficult. Results We present Formatt, a multiple structure alignment program based on the purely geometric Matt multiple structure alignment program, that also takes into account sequence similarity when constructing alignments. We show that Formatt outperforms Matt and other popular structure alignment programs on the popular HOMSTRAD benchmark. For the SABMark twilight zone benchmark set that captures more remote homology, Formatt and Matt outperform other programs; depending on choice of embedded sequence aligner, Formatt produces either better sequence and structural alignments with a smaller core size than Matt, or similarly sized alignments with better sequence similarity, for a small cost in average RMSD. Conclusions Considering sequence information as well as purely geometric information seems to improve quality of multiple structure alignments, though defining what constitutes the best alignment when sequence and structural measures would suggest different alignments remains a difficult open question.

  5. Single-molecule protein sequencing through fingerprinting: computational assessment

    Science.gov (United States)

    Yao, Yao; Docter, Margreet; van Ginkel, Jetty; de Ridder, Dick; Joo, Chirlmin

    2015-10-01

    Proteins are vital in all biological systems as they constitute the main structural and functional components of cells. Recent advances in mass spectrometry have brought the promise of complete proteomics by helping draft the human proteome. Yet, this commonly used protein sequencing technique has fundamental limitations in sensitivity. Here we propose a method for single-molecule (SM) protein sequencing. A major challenge lies in the fact that proteins are composed of 20 different amino acids, which demands 20 molecular reporters. We computationally demonstrate that it suffices to measure only two types of amino acids to identify proteins and suggest an experimental scheme using SM fluorescence. When achieved, this highly sensitive approach will result in a paradigm shift in proteomics, with major impact in the biological and medical sciences.
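
    The idea of identifying proteins from only two residue types can be sketched as follows. This is a simplified illustration and not the authors' computational assessment: the labeled residues (C and K), the order-only fingerprint and the toy sequences are all assumptions, since the abstract does not specify them.

```python
from collections import defaultdict


def fingerprint(sequence, labeled=("C", "K")):
    """Reduce a protein sequence to the ordered string of labeled residues."""
    return "".join(residue for residue in sequence if residue in labeled)


def ambiguous_fingerprints(proteins, labeled=("C", "K")):
    """Group protein names by fingerprint and keep only shared fingerprints."""
    groups = defaultdict(list)
    for name, sequence in proteins.items():
        groups[fingerprint(sequence, labeled)].append(name)
    return {fp: names for fp, names in groups.items() if len(names) > 1}


# Hypothetical toy database; P1 and P3 reduce to the same C/K pattern.
toy_proteins = {
    "P1": "MKCAKLC",
    "P2": "MCKKAC",
    "P3": "AKGCLKAC",
}

collisions = ambiguous_fingerprints(toy_proteins)
```

    On a real proteome the question is how many such collisions remain; a fingerprint that also records spacing between labeled residues would separate more sequences than the order-only version used here.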

  6. Single-molecule protein sequencing through fingerprinting: computational assessment

    International Nuclear Information System (INIS)

    Yao, Yao; Docter, Margreet; Van Ginkel, Jetty; Joo, Chirlmin; De Ridder, Dick

    2015-01-01

    Proteins are vital in all biological systems as they constitute the main structural and functional components of cells. Recent advances in mass spectrometry have brought the promise of complete proteomics by helping draft the human proteome. Yet, this commonly used protein sequencing technique has fundamental limitations in sensitivity. Here we propose a method for single-molecule (SM) protein sequencing. A major challenge lies in the fact that proteins are composed of 20 different amino acids, which demands 20 molecular reporters. We computationally demonstrate that it suffices to measure only two types of amino acids to identify proteins and suggest an experimental scheme using SM fluorescence. When achieved, this highly sensitive approach will result in a paradigm shift in proteomics, with major impact in the biological and medical sciences. (paper)

  7. Nucleotide sequence of tomato ringspot virus RNA-2.

    Science.gov (United States)

    Rott, M E; Tremaine, J H; Rochon, D M

    1991-07-01

    The sequence of tomato ringspot virus (TomRSV) RNA-2 has been determined. It is 7273 nucleotides in length excluding the 3' poly(A) tail and contains a single long open reading frame (ORF) of 5646 nucleotides in the positive sense beginning at position 78 and terminating at position 5723. A second in-frame AUG at position 441 is in a more favourable context for initiation of translation and may act as a site for initiation of translation. The TomRSV RNA-2 3' noncoding region is 1550 nucleotides in length. The coat protein is located in the C-terminal region of the large polypeptide and shows significant but limited amino acid sequence similarity to the putative coat proteins of the nepoviruses tomato black ring (TBRV), Hungarian grapevine chrome mosaic (GCMV) and grapevine fanleaf (GFLV). Comparisons of the coding and non-coding regions of TomRSV RNA-2 and the RNA components of TBRV, GCMV, GFLV and the comovirus cowpea mosaic virus revealed significant similarity for over 300 amino acids between the coding region immediately to the N-terminal side of the putative coat proteins of TomRSV and GFLV; very little similarity could be detected among the non-coding regions of TomRSV and any of these viruses.
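
    The coordinates quoted in this abstract are mutually consistent, which a short calculation confirms. The helper names below are hypothetical, positions are taken as 1-based and inclusive, and whether the quoted ORF length includes the stop codon is not stated in the abstract.

```python
def orf_summary(genome_length, orf_start, orf_end):
    """Summarize an ORF given 1-based, inclusive coordinates."""
    orf_length = orf_end - orf_start + 1
    return {
        "orf_nt": orf_length,
        "codons": orf_length // 3,
        "whole_codons": orf_length % 3 == 0,
        "five_prime_noncoding_nt": orf_start - 1,
        "three_prime_noncoding_nt": genome_length - orf_end,
    }


def in_frame(position, orf_start):
    """True if `position` starts a codon in the frame set by `orf_start`."""
    return (position - orf_start) % 3 == 0


# Values reported for TomRSV RNA-2
tomrsv = orf_summary(genome_length=7273, orf_start=78, orf_end=5723)
second_aug_in_frame = in_frame(441, 78)
```

    The result reproduces the 5646 nt ORF and the 1550 nt 3' noncoding region, and confirms that the AUG at position 441 lies in the same frame as the one at position 78.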

  8. The Conserved Spore Coat Protein SpoVM Is Largely Dispensable in Clostridium difficile Spore Formation.

    Science.gov (United States)

    Ribis, John W; Ravichandran, Priyanka; Putnam, Emily E; Pishdadian, Keyan; Shen, Aimee

    2017-01-01

    The spore-forming bacterial pathogen Clostridium difficile is a leading cause of health care-associated infections in the United States. In order for this obligate anaerobe to transmit infection, it must form metabolically dormant spores prior to exiting the host. A key step during this process is the assembly of a protective, multilayered proteinaceous coat around the spore. Coat assembly depends on coat morphogenetic proteins recruiting distinct subsets of coat proteins to the developing spore. While 10 coat morphogenetic proteins have been identified in Bacillus subtilis, only two of these morphogenetic proteins have homologs in the Clostridia: SpoIVA and SpoVM. C. difficile SpoIVA is critical for proper coat assembly and functional spore formation, but the requirement for SpoVM during this process was unknown. Here, we show that SpoVM is largely dispensable for C. difficile spore formation, in contrast with B. subtilis. Loss of C. difficile SpoVM resulted in modest decreases (~3-fold) in heat- and chloroform-resistant spore formation, while morphological defects such as coat detachment from the forespore and abnormal cortex thickness were observed in ~30% of spoVM mutant cells. Biochemical analyses revealed that C. difficile SpoIVA and SpoVM directly interact, similarly to their B. subtilis counterparts. However, in contrast with B. subtilis, C. difficile SpoVM was not essential for SpoIVA to encase the forespore. Since C. difficile coat morphogenesis requires SpoIVA-interacting protein L (SipL), which is conserved exclusively in the Clostridia, but not the more broadly conserved SpoVM, our results reveal another key difference between C. difficile and B. subtilis spore assembly pathways. IMPORTANCE The spore-forming obligate anaerobe Clostridium difficile is the leading cause of antibiotic-associated diarrheal disease in the United States. When C. difficile spores are ingested by susceptible individuals, they germinate within the gut and

  9. The regulated synthesis of a Bacillus anthracis spore coat protein that affects spore surface properties.

    Science.gov (United States)

    Aronson, A; Goodman, B; Smith, Z

    2014-05-01

    Examine the regulation of a spore coat protein and the effects on spore properties. A c. 23 kDa band in coat/exosporial extracts of Bacillus anthracis Sterne spores varied in amount depending upon the conditions of sporulation. It was identified by MALDI as a likely orthologue of ExsB of Bacillus cereus. Little if any was present in an exosporial preparation, and its location in the inner coat/cortex region was established by spore fractionation and immunogold labelling of electron micrograph sections. Because of its predominant location in the inner coat, it has been renamed Cotγ. It was relatively deficient in spores produced at 37°C and when acidic fermentation products were produced, a difference attributable to transcriptional regulation. The deficiency or absence of Cotγ resulted in a less robust exosporium positioned more closely to the coat. These spores were less hydrophobic and germinated somewhat more rapidly. Hydrophobicity and appearance were rescued in the deletion strain by introduction of the cotγ gene. The deficiency or lack of a protein largely found in the inner coat altered spore hydrophobicity and surface appearance. The regulated synthesis of Cotγ may be a paradigm for other spore coat proteins with unknown functions that modulate spore properties in response to environmental conditions. © 2014 The Society for Applied Microbiology.

  10. Prediction of Protein Structural Classes for Low-Similarity Sequences Based on Consensus Sequence and Segmented PSSM

    Directory of Open Access Journals (Sweden)

    Yunyun Liang

    2015-01-01

    Prediction of protein structural classes for low-similarity sequences is useful for understanding fold patterns, regulation, functions, and interactions of proteins. It is well known that feature extraction is significant to prediction of protein structural class and that it mainly uses protein primary sequence, predicted secondary structure sequence, and position-specific scoring matrix (PSSM). Currently, prediction solely based on the PSSM has played a key role in improving the prediction accuracy. In this paper, we propose a novel method called CSP-SegPseP-SegACP by fusing consensus sequence (CS), segmented PsePSSM, and segmented autocovariance transformation (ACT) based on PSSM. Three widely used low-similarity datasets (1189, 25PDB, and 640) are adopted in this paper. Then a 700-dimensional (700D) feature vector is constructed and the dimension is decreased to 224D by using principal component analysis (PCA). To verify the performance of our method, rigorous jackknife cross-validation tests are performed on the 1189, 25PDB, and 640 datasets. Comparison of our results with the existing PSSM-based methods demonstrates that our method achieves favorable and competitive performance. This will offer an important complement to other PSSM-based methods for prediction of protein structural classes for low-similarity sequences.

  11. Scoring protein relationships in functional interaction networks predicted from sequence data.

    Directory of Open Access Journals (Sweden)

    Gaston K Mazandu

    The abundance of diverse biological data from various sources constitutes a rich source of knowledge, which has the power to advance our understanding of organisms. This requires computational methods in order to integrate and exploit these data effectively and elucidate local and genome wide functional connections between protein pairs, thus enabling functional inferences for uncharacterized proteins. These biological data are primarily in the form of sequences, which determine functions, although functional properties of a protein can often be predicted from just the domains it contains. Thus, protein sequences and domains can be used to predict protein pair-wise functional relationships, and thus contribute to the function prediction process of uncharacterized proteins in order to ensure that knowledge is gained from sequencing efforts. In this work, we introduce information-theoretic based approaches to score protein-protein functional interaction pairs predicted from protein sequence similarity and conserved protein signature matches. The proposed schemes are effective for data-driven scoring of connections between protein pairs. We applied these schemes to the Mycobacterium tuberculosis proteome to produce a homology-based functional network of the organism with a high confidence and coverage. We use the network for predicting functions of uncharacterised proteins. AVAILABILITY: Protein pair-wise functional relationship scores for Mycobacterium tuberculosis strain CDC1551 sequence data and python scripts to compute these scores are available at http://web.cbio.uct.ac.za/~gmazandu/scoringschemes.

  12. Three-dimensional reconstructions of the bacteriophage CUS-3 virion reveal a conserved coat protein I-domain but a distinct tailspike receptor-binding domain

    International Nuclear Information System (INIS)

    Parent, Kristin N.; Tang, Jinghua; Cardone, Giovanni; Gilcrease, Eddie B.; Janssen, Mandy E.; Olson, Norman H.; Casjens, Sherwood R.; Baker, Timothy S.

    2014-01-01

    CUS-3 is a short-tailed, dsDNA bacteriophage that infects serotype K1 Escherichia coli. We report icosahedrally averaged and asymmetric, three-dimensional, cryo-electron microscopic reconstructions of the CUS-3 virion. Its coat protein structure adopts the “HK97-fold” shared by other tailed phages and is quite similar to that in phages P22 and Sf6 despite only weak amino acid sequence similarity. In addition, these coat proteins share a unique extra external domain (“I-domain”), suggesting that the group of P22-like phages has evolved over a very long time period without acquiring a new coat protein gene from another phage group. On the other hand, the morphology of the CUS-3 tailspike differs significantly from that of P22 or Sf6, but is similar to the tailspike of phage K1F, a member of the extremely distantly related T7 group of phages. We conclude that CUS-3 obtained its tailspike gene from a distantly related phage quite recently. - Highlights: • Asymmetric and symmetric three-dimensional reconstructions of phage CUS-3 are presented. • CUS-3 major capsid protein has a conserved I-domain, which is found in all three categories of “P22-like phage”. • CUS-3 has very different tailspike receptor binding domain from those of P22 and Sf6. • The CUS-3 tailspike likely was acquired by horizontal gene transfer

  13. Three-dimensional reconstructions of the bacteriophage CUS-3 virion reveal a conserved coat protein I-domain but a distinct tailspike receptor-binding domain

    Energy Technology Data Exchange (ETDEWEB)

    Parent, Kristin N., E-mail: kparent@msu.edu [Department of Chemistry and Biochemistry, University of California, San Diego, La Jolla, CA 92093-0378 (United States); Tang, Jinghua; Cardone, Giovanni [Department of Chemistry and Biochemistry, University of California, San Diego, La Jolla, CA 92093-0378 (United States); Gilcrease, Eddie B. [University of Utah School of Medicine, Division of Microbiology and Immunology, Department of Pathology, Salt Lake City, UT 84112 (United States); Janssen, Mandy E.; Olson, Norman H. [Department of Chemistry and Biochemistry, University of California, San Diego, La Jolla, CA 92093-0378 (United States); Casjens, Sherwood R., E-mail: sherwood.casjens@path.utah.edu [University of Utah School of Medicine, Division of Microbiology and Immunology, Department of Pathology, Salt Lake City, UT 84112 (United States); Baker, Timothy S., E-mail: tsb@ucsd.edu [Department of Chemistry and Biochemistry, University of California, San Diego, La Jolla, CA 92093-0378 (United States); University of California, San Diego, Division of Biological Sciences, La Jolla, CA, 92093 (United States)

    2014-09-15

    CUS-3 is a short-tailed, dsDNA bacteriophage that infects serotype K1 Escherichia coli. We report icosahedrally averaged and asymmetric, three-dimensional, cryo-electron microscopic reconstructions of the CUS-3 virion. Its coat protein structure adopts the “HK97-fold” shared by other tailed phages and is quite similar to that in phages P22 and Sf6 despite only weak amino acid sequence similarity. In addition, these coat proteins share a unique extra external domain (“I-domain”), suggesting that the group of P22-like phages has evolved over a very long time period without acquiring a new coat protein gene from another phage group. On the other hand, the morphology of the CUS-3 tailspike differs significantly from that of P22 or Sf6, but is similar to the tailspike of phage K1F, a member of the extremely distantly related T7 group of phages. We conclude that CUS-3 obtained its tailspike gene from a distantly related phage quite recently. - Highlights: • Asymmetric and symmetric three-dimensional reconstructions of phage CUS-3 are presented. • CUS-3 major capsid protein has a conserved I-domain, which is found in all three categories of “P22-like phage”. • CUS-3 has very different tailspike receptor binding domain from those of P22 and Sf6. • The CUS-3 tailspike likely was acquired by horizontal gene transfer.

  14. Inverse statistical physics of protein sequences: a key issues review.

    Science.gov (United States)

    Cocco, Simona; Feinauer, Christoph; Figliuzzi, Matteo; Monasson, Rémi; Weigt, Martin

    2018-03-01

    In the course of evolution, proteins undergo important changes in their amino acid sequences, while their three-dimensional folded structure and their biological function remain remarkably conserved. Thanks to modern sequencing techniques, sequence data accumulate at unprecedented pace. This provides large sets of so-called homologous, i.e. evolutionarily related protein sequences, to which methods of inverse statistical physics can be applied. Using sequence data as the basis for the inference of Boltzmann distributions from samples of microscopic configurations or observables, it is possible to extract information about evolutionary constraints and thus protein function and structure. Here we give an overview over some biologically important questions, and how statistical-mechanics inspired modeling approaches can help to answer them. Finally, we discuss some open questions, which we expect to be addressed over the next years.

  15. Ultra-fast evaluation of protein energies directly from sequence.

    Directory of Open Access Journals (Sweden)

    Gevorg Grigoryan

    2006-06-01

    The structure, function, stability, and many other properties of a protein in a fixed environment are fully specified by its sequence, but in a manner that is difficult to discern. We present a general approach for rapidly mapping sequences directly to their energies on a pre-specified rigid backbone, an important sub-problem in computational protein design and in some methods for protein structure prediction. The cluster expansion (CE) method that we employ can, in principle, be extended to model any computable or measurable protein property directly as a function of sequence. Here we show how CE can be applied to the problem of computational protein design, and use it to derive excellent approximations of physical potentials. The approach provides several attractive advantages. First, following a one-time derivation of a CE expansion, the amount of time necessary to evaluate the energy of a sequence adopting a specified backbone conformation is reduced by a factor of 10^7 compared to standard full-atom methods for the same task. Second, the agreement between two full-atom methods that we tested and their CE sequence-based expressions is very high (root mean square deviation 1.1-4.7 kcal/mol, R² = 0.7-1.0). Third, the functional form of the CE energy expression is such that individual terms of the expansion have clear physical interpretations. We derived expressions for the energies of three classic protein design targets (a coiled coil, a zinc finger, and a WW domain) as functions of sequence, and examined the most significant terms. Single-residue and residue-pair interactions are sufficient to accurately capture the energetics of the dimeric coiled coil, whereas higher-order contributions are important for the two more globular folds. For the task of designing novel zinc-finger sequences, a CE-derived energy function provides significantly better solutions than a standard design protocol, in comparable computation time. Given these advantages

  16. ProteinSplit: splitting of multi-domain proteins using prediction of ordered and disordered regions in protein sequences for virtual structural genomics

    International Nuclear Information System (INIS)

    Wyrwicz, Lucjan S; Koczyk, Grzegorz; Rychlewski, Leszek; Plewczynski, Dariusz

    2007-01-01

    The annotation of protein folds within newly sequenced genomes is the main target for semi-automated protein structure prediction (virtual structural genomics). A large number of automated methods have been developed recently with very good results in the case of single-domain proteins. Unfortunately, most of these automated methods often fail to properly predict the distant homology between a given multi-domain protein query and structural templates. Therefore a multi-domain protein should be split into domains in order to overcome this limitation. ProteinSplit is designed to identify protein domain boundaries using a novel algorithm that predicts disordered regions in protein sequences. The software utilizes various sequence characteristics to assess the local propensity of a protein to be disordered or ordered in terms of local structure stability. These disordered parts of a protein are likely to create interdomain spacers. Because of its speed and portability, the method was successfully applied to several genome-wide fold annotation experiments. The user can run an automated analysis of sets of proteins or perform semi-automated multiple user projects (saving the results on the server). Additionally the sequences of predicted domains can be sent to the Bioinfo.PL Protein Structure Prediction Meta-Server for further protein three-dimensional structure and function prediction. The program is freely accessible as a web service at http://lucjan.bioinfo.pl/proteinsplit together with detailed benchmark results on the critical assessment of a fully automated structure prediction (CAFASP) set of sequences. The source code of the local version of protein domain boundary prediction is available upon request from the authors

  17. Rapid identification of sequences for orphan enzymes to power accurate protein annotation.

    Directory of Open Access Journals (Sweden)

    Kevin R Ramkissoon

    The power of genome sequencing depends on the ability to understand what those genes and their protein products actually do. The automated methods used to assign functions to putative proteins in newly sequenced organisms are limited by the size of our library of proteins with both known function and sequence. Unfortunately this library grows slowly, lagging well behind the rapid increase in novel protein sequences produced by modern genome sequencing methods. One potential source for rapidly expanding this functional library is the "back catalog" of enzymology: "orphan enzymes," those enzymes that have been characterized and yet lack any associated sequence. There are hundreds of orphan enzymes in the Enzyme Commission (EC) database alone. In this study, we demonstrate how this orphan enzyme "back catalog" is a fertile source for rapidly advancing the state of protein annotation. Starting from three orphan enzyme samples, we applied mass-spectrometry based analysis and computational methods (including sequence similarity networks, sequence and structural alignments, and operon context analysis) to rapidly identify the specific sequence for each orphan while avoiding the most time- and labor-intensive aspects of typical sequence identifications. We then used these three new sequences to more accurately predict the catalytic function of 385 previously uncharacterized or misannotated proteins. We expect that this kind of rapid sequence identification could be efficiently applied on a larger scale to make enzymology's "back catalog" another powerful tool to drive accurate genome annotation.

  18. Rapid Identification of Sequences for Orphan Enzymes to Power Accurate Protein Annotation

    Science.gov (United States)

    Ojha, Sunil; Watson, Douglas S.; Bomar, Martha G.; Galande, Amit K.; Shearer, Alexander G.

    2013-01-01

    The power of genome sequencing depends on the ability to understand what those genes and their protein products actually do. The automated methods used to assign functions to putative proteins in newly sequenced organisms are limited by the size of our library of proteins with both known function and sequence. Unfortunately this library grows slowly, lagging well behind the rapid increase in novel protein sequences produced by modern genome sequencing methods. One potential source for rapidly expanding this functional library is the “back catalog” of enzymology – “orphan enzymes,” those enzymes that have been characterized and yet lack any associated sequence. There are hundreds of orphan enzymes in the Enzyme Commission (EC) database alone. In this study, we demonstrate how this orphan enzyme “back catalog” is a fertile source for rapidly advancing the state of protein annotation. Starting from three orphan enzyme samples, we applied mass-spectrometry based analysis and computational methods (including sequence similarity networks, sequence and structural alignments, and operon context analysis) to rapidly identify the specific sequence for each orphan while avoiding the most time- and labor-intensive aspects of typical sequence identifications. We then used these three new sequences to more accurately predict the catalytic function of 385 previously uncharacterized or misannotated proteins. We expect that this kind of rapid sequence identification could be efficiently applied on a larger scale to make enzymology’s “back catalog” another powerful tool to drive accurate genome annotation. PMID:24386392

  19. Complete cDNA sequence coding for human docking protein

    Energy Technology Data Exchange (ETDEWEB)

    Hortsch, M; Labeit, S; Meyer, D I

    1988-01-11

    Docking protein (DP, or SRP receptor) is a rough endoplasmic reticulum (ER)-associated protein essential for the targeting and translocation of nascent polypeptides across this membrane. It specifically interacts with a cytoplasmic ribonucleoprotein complex, the signal recognition particle (SRP). The nucleotide sequence of cDNA encoding the entire human DP and its deduced amino acid sequence are given.

  20. Designing sequence to control protein function in an EF-hand protein.

    Science.gov (United States)

    Bunick, Christopher G; Nelson, Melanie R; Mangahas, Sheryll; Hunter, Michael J; Sheehan, Jonathan H; Mizoue, Laura S; Bunick, Gerard J; Chazin, Walter J

    2004-05-19

    The extent of conformational change that calcium binding induces in EF-hand proteins is a key biochemical property specifying Ca2+ sensor versus signal modulator function. To understand how differences in amino acid sequence lead to differences in the response to Ca2+ binding, comparative analyses of sequence and structures, combined with model building, were used to develop hypotheses about which amino acid residues control Ca2+-induced conformational changes. These results were used to generate a first design of calbindomodulin (CBM-1), a calbindin D9k re-engineered with 15 mutations to respond to Ca2+ binding with a conformational change similar to that of calmodulin. The gene for CBM-1 was synthesized, and the protein was expressed and purified. Remarkably, this protein did not exhibit any non-native-like molten globule properties despite the large number of mutations and the nonconservative nature of some of them. Ca2+-induced changes in CD intensity and in the binding of the hydrophobic probe, ANS, implied that CBM-1 does undergo Ca2+ sensorlike conformational changes. The X-ray crystal structure of Ca2+-CBM-1 determined at 1.44 Å resolution reveals the anticipated increase in hydrophobic surface area relative to the wild-type protein. A nascent calmodulin-like hydrophobic docking surface was also found, though it is occluded by the inter-EF-hand loop. The results from this first calbindomodulin design are discussed in terms of progress toward understanding the relationships between amino acid sequence, protein structure, and protein function for EF-hand CaBPs, as well as the additional mutations for the next CBM design.

  1. Unraveling the Role of the C-terminal Helix Turn Helix of the Coat-binding Domain of Bacteriophage P22 Scaffolding Protein

    Science.gov (United States)

    Padilla-Meier, G. Pauline; Gilcrease, Eddie B.; Weigele, Peter R.; Cortines, Juliana R.; Siegel, Molly; Leavitt, Justin C.; Teschke, Carolyn M.; Casjens, Sherwood R.

    2012-01-01

    Many viruses encode scaffolding and coat proteins that co-assemble to form procapsids, which are transient precursor structures leading to progeny virions. In bacteriophage P22, the association of scaffolding and coat proteins is mediated mainly by ionic interactions. The coat protein-binding domain of scaffolding protein is a helix turn helix structure near the C terminus with a high number of charged surface residues. Residues Arg-293 and Lys-296 are particularly important for coat protein binding. The two helices contact each other through hydrophobic side chains. In this study, substitution of the residues of the interface between the helices, and the residues in the β-turn, by aspartic acid was used to examine the importance of the conformation of the domain in coat binding. These replacements strongly affected the ability of the scaffolding protein to interact with coat protein. The severity of the defect in the association of scaffolding protein to coat protein was dependent on location, with substitutions at residues in the turn and helix 2 causing the most significant effects. Substituting aspartic acid for hydrophobic interface residues dramatically perturbs the stability of the structure, but similar substitutions in the turn had much less effect on the integrity of this domain, as determined by circular dichroism. We propose that the binding of scaffolding protein to coat protein is dependent on the angle of the β-turn and the orientation of the charged surface on helix 2. Surprisingly, formation of the highly complex procapsid structure depends on a relatively simple interaction. PMID:22879595

  2. Prediction of protein-protein interaction sites in sequences and 3D structures by random forests.

    Directory of Open Access Journals (Sweden)

    Mile Sikić

    2009-01-01

    Identifying interaction sites in proteins provides important clues to the function of a protein and is becoming increasingly relevant in topics such as systems biology and drug discovery. Although there are numerous papers on the prediction of interaction sites using information derived from structure, there are only a few case reports on the prediction of interaction residues based solely on protein sequence. Here, a sliding window approach is combined with the Random Forests method to predict protein interaction sites using (i) a combination of sequence- and structure-derived parameters and (ii) sequence information alone. For sequence-based prediction we achieved a precision of 84% with a 26% recall and an F-measure of 40%. When combined with structural information, the prediction performance increases to a precision of 76% and a recall of 38% with an F-measure of 51%. We also present an attempt to rationalize the sliding window size and demonstrate that a nine-residue window is the most suitable for predictor construction. Finally, we demonstrate the applicability of our prediction methods by modeling the Ras-Raf complex using predicted interaction sites as target binding interfaces. Our results suggest that it is possible to predict protein interaction sites with quite a high accuracy using only sequence information.
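
    As a rough illustration of two ideas in this abstract, the sketch below computes the F-measure as the harmonic mean of precision and recall, and cuts a sequence into nine-residue windows centred on each residue. The function names, the "-" padding of the sequence ends, and the example string are assumptions made here for illustration; the abstract does not describe how the authors built their windows or features, and no Random Forest is trained.

```python
def f_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (the F1 score)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def sliding_windows(sequence: str, size: int = 9, pad: str = "-"):
    """Yield one window per residue, centred on that residue.

    Padding the ends with `pad` is an assumption of this sketch, so that
    terminal residues also receive a full-length window.
    """
    if size % 2 == 0:
        raise ValueError("window size must be odd")
    half = size // 2
    padded = pad * half + sequence + pad * half
    for i in range(len(sequence)):
        yield padded[i:i + size]


# Reported precision/recall pairs from the abstract.
print(round(f_measure(0.84, 0.26), 2))  # sequence only
print(round(f_measure(0.76, 0.38), 2))  # sequence plus structure

# Arbitrary illustrative sequence, not taken from the paper.
for window in sliding_windows("MTEYKLVVVG"):
    print(window)
```

    With the reported values, the harmonic mean comes out close to the quoted 40% and 51%, which suggests the F-measure in the abstract is the standard F1 score.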

  3. Albumen foam stability and s-ovalbumin contents in eggs coated with whey protein concentrate

    Directory of Open Access Journals (Sweden)

    ACC Alleoni

    2004-06-01

    Food products such as breads, cakes, crackers, meringues, ice creams and several bakery items depend on air incorporation to maintain their texture and structure during or after processing. Proteins are utilized in the food industry since they improve texture attributes through their ability to encapsulate and retain air. The objectives of this work were to quantify s-ovalbumin contents in albumen and to determine alterations in egg white foam stability in fresh eggs, and in eggs coated or not coated with a whey protein concentrate (WPC)-based film, stored at 25°C for 28 days. The volume of drained liquid was higher in non-coated eggs than in coated eggs stored at 25°C at all storage periods. The difference on the third day of storage was on the order of 59% between coated and non-coated eggs, while on the twenty-eighth day it was 202%. During the storage period, an increase in pH and drainage volume was observed for non-coated eggs. After three days, the non-coated eggs showed an s-ovalbumin content 33% higher than coated eggs; this difference rose to 205% at 28 days of storage. There was a positive correlation between s-ovalbumin content and the volume of drained liquid for coated and non-coated eggs; in other words, when the s-ovalbumin content increased, there was an increase in the volume of drained liquid and a decrease in foam stability. WPC coating maintains egg quality, since it is an effective barrier against the loss of CO2, avoiding changes in the pH of egg white.

  4. Comparative analysis of the prion protein gene sequences in African lion.

    Science.gov (United States)

    Wu, Chang-De; Pang, Wan-Yong; Zhao, De-Ming

    2006-10-01

    The prion protein gene of the African lion (Panthera leo) was cloned for the first time and screened for polymorphisms. The results suggest that the prion protein gene of eight African lions is highly homogeneous. The amino acid sequences of the prion protein (PrP) of all samples tested were identical. Four single nucleotide polymorphisms (C42T, C81A, C420T, T600C) in the prion protein gene (Prnp) of African lion were found, but no amino acid substitutions. Sequence analysis showed that the highest homology was to Felis catus (GenBank accession AF003087; 96.7%) and to sheep (GenBank accession M31313.1; 96.2%). With respect to all the mammalian prion protein sequences compared, the African lion prion protein sequence has three amino acid substitutions. The homology might in turn affect the potential intermolecular interactions critical for cross species transmission of prion disease.

  5. Repeat Sequence Proteins as Matrices for Nanocomposites

    Energy Technology Data Exchange (ETDEWEB)

    Drummy, L.; Koerner, H; Phillips, D; McAuliffe, J; Kumar, M; Farmer, B; Vaia, R; Naik, R

    2009-01-01

    Recombinant protein-inorganic nanocomposites comprised of exfoliated Na+ montmorillonite (MMT) in a recombinant protein matrix based on silk-like and elastin-like amino acid motifs (silk elastin-like protein (SELP)) were formed via a solution blending process. Charged residues along the protein backbone are shown to dominate long-range interactions, whereas the SELP repeat sequence leads to local protein/MMT compatibility. Up to a 50% increase in room temperature modulus and a comparable decrease in high temperature coefficient of thermal expansion occur for cast films containing 2-10 wt.% MMT.

  6. Protein adsorption and cell adhesion on nanoscale bioactive coatings formed from poly(ethylene glycol) and albumin microgels

    Science.gov (United States)

    Scott, Evan A.; Nichols, Michael D.; Cordova, Lee H.; George, Brandon J.; Jun, Young-Shin; Elbert, Donald L.

    2008-01-01

    Late-term thrombosis on drug-eluting stents is an emerging problem that might be addressed using extremely thin, biologically-active hydrogel coatings. We report a dip-coating strategy to covalently link poly(ethylene glycol) (PEG) to substrates, producing coatings with crosslinked microgels and deviation from Flory-Stockmayer theory. Before macrogelation, the reacting solutions were diluted and incubated with nucleophile-functionalized surfaces. Using optical waveguide lightmode spectroscopy (OWLS) and quartz crystal microbalance with dissipation (QCM-D), we identified a highly hydrated, protein-resistant layer with a thickness of approximately 75 nm. Atomic force microscopy in buffered water revealed the presence of coalesced spheres of various sizes but with diameters less than about 100 nm. Microgel-coated glass or poly(ethylene terephthalate) exhibited reduced protein adsorption and cell adhesion. Cellular interactions with the surface could be controlled by using different proteins to cap unreacted vinylsulfone groups within the coating. PMID:18771802

  7. Analysis of long-range correlation in sequences data of proteins

    OpenAIRE

    ADRIANA ISVORAN; LAURA UNIPAN; DANA CRACIUN; VASILE MORARIU

    2007-01-01

    The results presented here suggest the existence of correlations in the sequence data of proteins. 32 proteins, both globular and fibrous, both monomeric and polymeric, were analyzed. The primary structures of these proteins were treated as time series. Three spatial series of data for each sequence of a protein were generated from numerical correspondences between each amino acid and a physical property associated with it, i.e., its electric charge, its polar character and its dipole moment....
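
    The mapping described in this abstract, from an amino acid sequence to a numerical series that is then treated like a time series, can be sketched as follows. The charge table is a simplified assumption (side-chain charges near neutral pH, histidine taken as neutral), and the normalized autocorrelation is used here only as a simple stand-in for a correlation measure; the abstract does not say which numerical scale or which correlation analysis the authors applied.

```python
# Simplified side-chain charges near neutral pH. This table is an assumption
# for illustration; every residue not listed is treated as uncharged.
CHARGE = {"D": -1.0, "E": -1.0, "K": 1.0, "R": 1.0}


def charge_series(sequence: str) -> list[float]:
    """Map each residue to its (assumed) charge, giving a spatial series."""
    return [CHARGE.get(residue, 0.0) for residue in sequence.upper()]


def autocorrelation(series: list[float], lag: int) -> float:
    """Normalized autocorrelation of the series at the given lag."""
    n = len(series)
    if not 0 <= lag < n:
        raise ValueError("lag must satisfy 0 <= lag < len(series)")
    mean = sum(series) / n
    variance = sum((x - mean) ** 2 for x in series)
    if variance == 0:
        return 0.0  # a constant series carries no correlation signal
    covariance = sum(
        (series[i] - mean) * (series[i + lag] - mean) for i in range(n - lag)
    )
    return covariance / variance


# Hypothetical toy sequence with strictly alternating charges.
series = charge_series("KDKDKDKD")
print(series)
print(autocorrelation(series, 1))
print(autocorrelation(series, 2))
```

    The same pattern would apply to the other two properties mentioned (polar character and dipole moment) by swapping in a different lookup table.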

  8. The coat protein of prunus necrotic ringspot virus specifically binds to and regulates the conformation of its genomic RNA.

    Science.gov (United States)

    Aparicio, Frederic; Vilar, Marçal; Perez-Payá, Enrique; Pallás, Vicente

    2003-08-15

    Binding of coat protein (CP) to the 3' nontranslated region (3'-NTR) of viral RNAs is a crucial requirement to establish the infection of Alfamo- and Ilarviruses. In vitro binding properties of the Prunus necrotic ringspot ilarvirus (PNRSV) CP to the 3'-NTR of its genomic RNA using purified E. coli-expressed CP and different synthetic peptides corresponding to a 26-residue sequence near the N-terminus were investigated by electrophoretic mobility shift assays. PNRSV CP bound to, at least, three different sites existing on the 3'-NTR. Moreover, the N-terminal region between amino acid residues 25 to 50 of the protein could function as an independent RNA-binding domain. Single exchange of some arginine residues by alanine eliminated the RNA-interaction capacity of the synthetic peptides, consistent with a crucial role for Arg residues common to many RNA-binding proteins possessing Arg-rich domains. Circular dichroism spectroscopy revealed that the RNA conformation is altered when amino-terminal CP peptides bind to the viral RNA. Finally, mutational analysis of the 3'-NTR suggested the presence of a pseudoknotted structure at this region on the PNRSV RNA that, when stabilized by the presence of Mg2+, lost its capability to bind the coat protein. The existence of two mutually exclusive conformations for the 3'-NTR of PNRSV strongly suggests a similar regulatory mechanism at the 3'-NTR level in Alfamo- and Ilarvirus genera.

  9. The coat protein of prunus necrotic ringspot virus specifically binds to and regulates the conformation of its genomic RNA

    International Nuclear Information System (INIS)

    Aparicio, Frederic; Vilar, Marcal; Perez-Paya, Enrique; Pallas, Vicente

    2003-01-01

    Binding of coat protein (CP) to the 3' nontranslated region (3'-NTR) of viral RNAs is a crucial requirement to establish the infection of Alfamo- and Ilarviruses. In vitro binding properties of the Prunus necrotic ringspot ilarvirus (PNRSV) CP to the 3'-NTR of its genomic RNA using purified E. coli-expressed CP and different synthetic peptides corresponding to a 26-residue sequence near the N-terminus were investigated by electrophoretic mobility shift assays. PNRSV CP bound to at least three different sites on the 3'-NTR. Moreover, the N-terminal region between amino acid residues 25 to 50 of the protein could function as an independent RNA-binding domain. Single exchange of some arginine residues by alanine eliminated the RNA-interaction capacity of the synthetic peptides, consistent with a crucial role for Arg residues common to many RNA-binding proteins possessing Arg-rich domains. Circular dichroism spectroscopy revealed that the RNA conformation is altered when amino-terminal CP peptides bind to the viral RNA. Finally, mutational analysis of the 3'-NTR suggested the presence of a pseudoknotted structure at this region on the PNRSV RNA that, when stabilized by the presence of Mg2+, lost its capability to bind the coat protein. The existence of two mutually exclusive conformations for the 3'-NTR of PNRSV strongly suggests a similar regulatory mechanism at the 3'-NTR level in Alfamo- and Ilarvirus genera.

  10. Sequence analysis reveals how G protein-coupled receptors transduce the signal to the G protein.

    NARCIS (Netherlands)

    Oliveira, L.; Paiva, P.B.; Paiva, A.C.; Vriend, G.

    2003-01-01

    Sequence entropy-variability plots, based on alignments of very large numbers of sequences, can indicate the location in proteins of the main active site and modulator sites. In the previous article in this issue, we applied this observation to a series of well-studied proteins and concluded that it...

  11. Transgenic Sugarcane Resistant to Sorghum mosaic virus Based on Coat Protein Gene Silencing by RNA Interference

    Directory of Open Access Journals (Sweden)

    Jinlong Guo

    2015-01-01

    As one of the critical diseases of sugarcane, sugarcane mosaic disease can lead to a serious decline in stalk yield and sucrose content. It is mainly caused by Potyvirus sugarcane mosaic virus (SCMV) and/or Sorghum mosaic virus (SrMV), with additional differences in viral strains. RNA interference (RNAi) is a novel strategy for producing virus-resistant plants. In this study, based on multiple sequence alignment of genomic sequences of different strains and isolates of SrMV, the conserved region of the coat protein (CP) gene was selected as the target and a 423 bp interference sequence was obtained through PCR amplification. The RNAi vector pGII00-HACP, with an expression cassette containing both the hairpin interference sequence and the cp4-epsps herbicide-tolerance gene, was transferred to sugarcane cultivar ROC22 via Agrobacterium-mediated transformation. After herbicide screening, PCR molecular identification, and artificial inoculation challenge, anti-SrMV positive transgenic lines were successfully obtained. The SrMV resistance rate of the transgenic lines carrying the interference sequence was 87.5% based on SrMV challenge by artificial inoculation. The genetically modified SrMV-resistant lines of cultivar ROC22 provide resistant germplasm for breeding lines and can also serve as resistant lines with the same genetic background for the study of resistance mechanisms.

  12. Natural supramolecular building blocks: from virus coat proteins to viral nanoparticles.

    Science.gov (United States)

    Liu, Zhi; Qiao, Jing; Niu, Zhongwei; Wang, Qian

    2012-09-21

    Viruses belong to a fascinating class of natural supramolecular structures, composed of multiple copies of coat proteins (CPs) that assemble into different shapes with a variety of sizes from tens to hundreds of nanometres. Because of their advantages including simple/economic production, well-defined structural features, unique shapes and sizes, genetic programmability and robust chemistries, recently viruses and virus-like nanoparticles (VLPs) have been used widely in biomedical applications and materials synthesis. In this critical review, we highlight recent advances in the use of virus coat proteins (VCPs) and viral nanoparticles (VNPs) as building blocks in self-assembly studies and materials development. We first discuss the self-assembly of VCPs into VLPs, which can efficiently incorporate a variety of different materials as cores inside the viral protein shells. Then, the self-assembly of VNPs at surfaces or interfaces is summarized. Finally, we discuss the co-assembly of VNPs with different functional materials (178 references).

  13. Intravascular local gene transfer mediated by protein-coated metallic stent.

    Science.gov (United States)

    Yuan, J; Gao, R; Shi, R; Song, L; Tang, J; Li, Y; Tang, C; Meng, L; Yuan, W; Chen, Z

    2001-10-01

    To assess the feasibility, efficiency and selectivity of adenovirus-mediated gene transfer to the local arterial wall by protein-coated metallic stent. A replication-defective recombinant adenovirus carrying the Lac Z reporter gene for nuclear-specific beta-galactosidase (Ad-beta gal) was used in this study. The coating for the metallic stent was made by immersing it in a gelatin solution containing crosslinker. The coated stents were mounted on a 4.0 or 3.0 mm percutaneous transluminal coronary angioplasty (PTCA) balloon and submersed into a high-titer Ad-beta gal viral stock (2 x 10(10) pfu/ml) for 3 min, and then implanted into the carotid arteries in 4 mini-swine and into the left anterior descending branch of the coronary artery in 2 mini-swine via 8F large lumen guiding catheters. The animals were sacrificed 7 (n = 4), 14 (n = 1) and 21 (n = 1) days after implantation, respectively. The beta-galactosidase expression was assessed by X-gal staining. The results showed that expression of the transgene was detected in all animals. In 1 carotid artery with an intact intima, the beta-gal expression was limited to endothelial cells. In vessels with denuded endothelium, gene expression was found in the sub-intima, media and adventitia. The transfection efficiency of medial smooth muscle cells was 38.6%. In 2 animals sacrificed 7 days after transfection, a microscopic examination of X-gal-stained samples did not show evidence of transfection in remote organs and arterial segments adjacent to the treated arterial site. Adenovirus-mediated arterial gene transfer to endothelial cells, smooth muscle cells and adventitia by protein-coated metallic stent is feasible. The transfection efficiency is higher. The coated stent may act as a good carrier of adenovirus-mediated gene transfer and have a potential to prevent restenosis following PTCA.

  14. Thickness and morphology of polyelectrolyte coatings on silica surfaces before and after protein exposure studied by atomic force microscopy

    Energy Technology Data Exchange (ETDEWEB)

    Haselberg, Rob, E-mail: r.haselberg@vu.nl [Biomolecular Analysis, Utrecht University, Universiteitsweg 99, 3584 CG Utrecht (Netherlands); AIMMS Division of BioMolecular Analysis, VU University Amsterdam, de Boelelaan 1083, 1081 HV Amsterdam (Netherlands); Flesch, Frits M. [Biomolecular Analysis, Utrecht University, Universiteitsweg 99, 3584 CG Utrecht (Netherlands); Boerke, Arjan [Department of Biochemistry and Cell Biology, Utrecht University, Yalelaan 2, 3508 TD Utrecht (Netherlands); Somsen, Govert W. [Biomolecular Analysis, Utrecht University, Universiteitsweg 99, 3584 CG Utrecht (Netherlands); AIMMS Division of BioMolecular Analysis, VU University Amsterdam, de Boelelaan 1083, 1081 HV Amsterdam (Netherlands)

    2013-05-24

    Highlights: •Atomic force microscopy is used to characterize polyelectrolyte coatings. •Coating procedure leads to nm-thick layers on a silica surface. •Polyelectrolyte coatings effectively prevent protein adsorption. •AFM provides the high resolution to investigate these thin films. •AFM results support earlier findings obtained with capillary electrophoresis. -- Abstract: Analyte–wall interaction is a significant problem in capillary electrophoresis (CE) as it may compromise separation efficiencies and migration time repeatability. In CE, self-assembled polyelectrolyte multilayer films of Polybrene (PB) and dextran sulfate (DS) or poly(vinylsulfonic acid) (PVS) have been used to coat the capillary inner wall and thereby prevent analyte adsorption. In this study, atomic force microscopy (AFM) was employed to investigate the layer thickness and surface morphology of monolayer (PB), bilayer (PB-DS and PB-PVS), and trilayer (PB-DS-PB and PB-PVS-PB) coatings on glass surfaces. AFM nanoshaving experiments providing height distributions demonstrated that the coating procedures led to average layer thicknesses between 1 nm (PB) and 5 nm (PB-DS-PB), suggesting the individual polyelectrolytes adhere flat on the silica surface. Investigation of the surface morphology of the different coatings by AFM revealed that the PB coating does not completely cover the silica surface, whereas full coverage was observed for the trilayer coatings. The DS-containing coatings appeared on average 1 nm thicker than the corresponding PVS-containing coatings, which could be attributed to the molecular structure of the anionic polymers applied. Upon exposure to the basic protein cytochrome c, AFM measurements showed an increase of the layer thickness for bare (3.1 nm) and PB-DS-coated (4.6 nm) silica, indicating substantial protein adsorption. In contrast, a very small or no increase of the layer thickness was observed for the PB and PB-DS-PB coatings.

  15. Peptide Pattern Recognition for high-throughput protein sequence analysis and clustering

    DEFF Research Database (Denmark)

    Busk, Peter Kamp

    2017-01-01

    Large collections of protein sequences with divergent sequences are tedious to analyze for understanding their phylogenetic or structure-function relation. Peptide Pattern Recognition is an algorithm that was developed to facilitate this task, but the previous version only allows a limited number of sequences as input. I implemented Peptide Pattern Recognition as multithreaded software designed to handle large numbers of sequences and perform analysis in a reasonable time frame. Benchmarking showed that the new implementation of Peptide Pattern Recognition is twenty times faster than the previous implementation on a small protein collection with 673 MAP kinase sequences. In addition, the new implementation could analyze a large protein collection with 48,570 Glycosyl Transferase family 20 sequences without reaching its upper limit on a desktop computer. Peptide Pattern Recognition...

  16. Genetic diversity and molecular evolution of Ornithogalum mosaic virus based on the coat protein gene sequence

    Directory of Open Access Journals (Sweden)

    Fangluan Gao

    2018-03-01

    Ornithogalum mosaic virus (OrMV) has a wide host range and affects the production of a variety of ornamentals. In this study, the coat protein (CP) gene of OrMV was used to investigate the molecular mechanisms underlying the evolution of this virus. The 36 OrMV isolates fell into two groups which have significant subpopulation differentiation with an FST value of 0.470. One isolate was identified as a recombinant and the other 35 recombination-free isolates could be divided into two major clades under different evolutionary constraints with dN/dS values of 0.055 and 0.028, respectively, indicating a role of purifying selection in the differentiation of OrMV. In addition, the results from analysis of molecular variance (AMOVA) indicated that the effect of host species on the genetic divergence of OrMV is greater than that of geography. Furthermore, OrMV isolates from the genera Ornithogalum, Lachenalia and Diuri tended to group together, indicating that OrMV diversification was maintained, in part, by host-driven adaptation.

  17. Coat protein-mediated resistance against an Indian isolate of the ...

    Indian Academy of Sciences (India)

    Coat protein (CP)-mediated resistance against an Indian isolate of the Cucumber mosaic virus (CMV) subgroup IB was demonstrated in transgenic lines of Nicotiana benthamiana through Agrobacterium tumefaciens-mediated transformation. Out of the fourteen independently transformed lines developed, two lines were ...

  18. Cloning and nucleotide sequencing of the coat protein gene and 3'UTR (untranslated region) of peanut stripe virus

    Directory of Open Access Journals (Sweden)

    Hasriadi Mat Akin

    2011-10-01

    Cloning and sequencing of coat protein gene and 3'UTR (untranslated region) of peanut stripe virus. The cDNA of the 3' terminal of peanut stripe virus (PStV) genomic RNA was cloned and sequenced. The cDNA was ligated into the plasmid vector pGEM-T Easy and transformed into competent cells of Escherichia coli. The 3' terminal of PStV genomic RNA contained 1195 nucleotides (nts). The region included the nucleotide sequences of NIb (nuclear inclusion body) (129 nts), the CP gene (coat protein gene) (861 nts), and the 3'UTR (untranslated region) (205 nts). The nucleotide sequence of the CP gene contained one long uninterrupted open reading frame (ORF) without a start codon, which ended with a UAG stop codon. The 287 amino acid residues of the PStV coat protein were predicted from the CP gene. The amino acid sequence was analyzed for the presence of a consensus polyprotein cleavage site for maturation of the potyvirus polyprotein. A putative cleavage site was found at position 43 (Q/S) following the valine (V) residue at the -4 position. This isolate of PStV can be expected to be aphid transmissible because the coat protein contained a DAG triplet at position 53-55.
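
    The segment lengths quoted in this abstract can be cross-checked with a few lines of arithmetic. The sketch below only uses the numbers reported above; the dictionary keys and variable names are illustrative labels, not identifiers from the paper.

```python
# Segment lengths in nucleotides, as reported in the abstract.
segments = {"NIb": 129, "CP gene": 861, "3'UTR": 205}

# The three segments should add up to the reported 3'-terminal length.
total_nt = sum(segments.values())

# Number of complete codons in the CP gene region, and any leftover bases.
cp_codons, leftover = divmod(segments["CP gene"], 3)

print(total_nt)             # 1195
print(cp_codons, leftover)  # 287 0
```

    The total matches the reported 1195 nucleotides, and 861 nucleotides correspond to the 287 predicted coat protein residues.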

  19. Hydroxyapatite coating on the titanium substrate modulated by a recombinant collagen-like protein

    International Nuclear Information System (INIS)

    Pan Mingli; Kong Xiangdong; Cai Yurong; Yao Juming

    2011-01-01

    Research highlights: → Hydroxyapatite was deposited on alkali-heat treated Ti substrate by immersing in 1.5 x SBF solution containing the recombinant collagen-like protein. → The recombinant collagen-like protein accelerated the preferential nucleation and growth of hydroxyapatite along c axis on the Ti substrate. → Hydroxyapatite-collagen composite on the Ti substrate promoted the attachment, subsequently proliferation and differentiation of MG-63 cells. - Abstract: Plenty of techniques have been developed to modify the surface character of titanium (Ti) and its alloys in order to realize their biological bond to natural bone. In this work, a biomimetic process was employed to form a hydroxyapatite (HAp) coating on the alkali-heat treated Ti substrate in 1.5 times simulated body fluid (1.5 x SBF) with the addition of a recombinant collagen-like protein. The coating was characterized using SEM-EDX, FESEM, and XRD. Results showed that the recombinant collagen-like protein could accelerate the preferential nucleation and directional growth along c axis of HAp on the pretreated Ti substrates. The investigation of in vitro cell cultivation showed that the existence of recombinant collagen-like protein in coating could improve the initial cell adhesion, proliferation and differentiation of MG-63 cells, which implied the materials possessed excellent biocompatibility and had a wide potential in biomedical application.

  20. Hydroxyapatite coating on the titanium substrate modulated by a recombinant collagen-like protein

    Energy Technology Data Exchange (ETDEWEB)

    Pan Mingli [Key Laboratory of Advanced Textile Materials and Manufacturing Technology of Ministry of Education, College of Materials and Textile, Zhejiang Sci-Tech University, Hangzhou 310018 (China); Kong Xiangdong [College of Life Sciences, Zhejiang Sci-Tech University, Hangzhou 310018 (China); Cai Yurong [Key Laboratory of Advanced Textile Materials and Manufacturing Technology of Ministry of Education, College of Materials and Textile, Zhejiang Sci-Tech University, Hangzhou 310018 (China); Yao Juming, E-mail: yaoj@zstu.edu.cn [Key Laboratory of Advanced Textile Materials and Manufacturing Technology of Ministry of Education, College of Materials and Textile, Zhejiang Sci-Tech University, Hangzhou 310018 (China)

    2011-04-15

    Research highlights: → Hydroxyapatite was deposited on alkali-heat treated Ti substrate by immersing in 1.5 x SBF solution containing the recombinant collagen-like protein. → The recombinant collagen-like protein accelerated the preferential nucleation and growth of hydroxyapatite along c axis on the Ti substrate. → Hydroxyapatite-collagen composite on the Ti substrate promoted the attachment, subsequently proliferation and differentiation of MG-63 cells. - Abstract: Plenty of techniques have been developed to modify the surface character of titanium (Ti) and its alloys in order to realize their biological bond to natural bone. In this work, a biomimetic process was employed to form a hydroxyapatite (HAp) coating on the alkali-heat treated Ti substrate in 1.5 times simulated body fluid (1.5 x SBF) with the addition of a recombinant collagen-like protein. The coating was characterized using SEM-EDX, FESEM, and XRD. Results showed that the recombinant collagen-like protein could accelerate the preferential nucleation and directional growth along c axis of HAp on the pretreated Ti substrates. The investigation of in vitro cell cultivation showed that the existence of recombinant collagen-like protein in coating could improve the initial cell adhesion, proliferation and differentiation of MG-63 cells, which implied the materials possessed excellent biocompatibility and had a wide potential in biomedical application.

  1. Elman RNN based classification of proteins sequences on account of their mutual information.

    Science.gov (United States)

    Mishra, Pooja; Nath Pandey, Paras

    2012-10-21

    In the present work we estimate residue correlation within protein sequences by using the mutual information (MI) of adjacent residues, based on structural and solvent accessibility properties of amino acids. The long-range correlation between nonadjacent residues is improved by constructing a mutual information vector (MIV) for a single protein sequence; in this way, each protein sequence is associated with its corresponding MIVs. These MIVs are given to an Elman RNN to classify the protein sequences. The modeling power of the MIV was shown to be significantly better, giving a new approach towards alignment-free classification of protein sequences. We also conclude that MIVs based on sequence structural and solvent accessibility properties are better predictors. Copyright © 2012 Elsevier Ltd. All rights reserved.
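
    As a rough illustration of the adjacent-residue mutual information mentioned in this abstract, the following sketch computes MI (in bits) between the symbol at position i and the symbol at position i+1. It is not the authors' implementation: the two-class hydrophobic/polar alphabet (`HYDROPHOBIC`, `to_classes`) is a hypothetical stand-in for the structural and solvent accessibility classes used in the paper, and the example sequence is made up.

```python
from collections import Counter
from math import log2

# Hypothetical two-class alphabet, used here only for illustration.
HYDROPHOBIC = set("AVLIMFWC")


def to_classes(sequence):
    """Map each residue to 'H' (hydrophobic) or 'P' (other)."""
    return "".join("H" if aa in HYDROPHOBIC else "P" for aa in sequence)


def adjacent_mutual_information(symbols):
    """Mutual information (bits) between symbols at positions i and i+1."""
    pairs = list(zip(symbols, symbols[1:]))
    if not pairs:
        return 0.0
    n = len(pairs)
    pair_counts = Counter(pairs)
    first_counts = Counter(a for a, _ in pairs)
    second_counts = Counter(b for _, b in pairs)
    mi = 0.0
    for (a, b), count in pair_counts.items():
        p_ab = count / n
        p_a = first_counts[a] / n
        p_b = second_counts[b] / n
        mi += p_ab * log2(p_ab / (p_a * p_b))
    return mi


example = "MKVLAAGIVGS"  # made-up sequence, not from the paper
print(to_classes(example))
print(adjacent_mutual_information(to_classes(example)))
```

    A mutual information vector could then be assembled by repeating the calculation over several property alphabets or residue separations; how the paper builds its MIV in detail is not stated in the abstract.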

  2. Predicting membrane protein types by fusing composite protein sequence features into pseudo amino acid composition.

    Science.gov (United States)

    Hayat, Maqsood; Khan, Asifullah

    2011-02-21

    Membrane proteins are a vital type of protein that serve as channels, receptors, and energy transducers in a cell. Prediction of membrane protein types is an important research area in bioinformatics. Knowledge of membrane protein types provides valuable information for predicting novel examples of the membrane protein types. However, classification of membrane protein types can be both time consuming and susceptible to errors due to the inherent similarity of membrane protein types. In this paper, a neural network based membrane protein type prediction system is proposed. Composite protein sequence representation (CPSR) is used to extract the features of a protein sequence, which includes seven feature sets: amino acid composition, sequence length, 2-gram exchange group frequency, hydrophobic group, electronic group, sum of hydrophobicity, and R-group. Principal component analysis is then employed to reduce the dimensionality of the feature vector. The probabilistic neural network (PNN), generalized regression neural network, and support vector machine (SVM) are used as classifiers. A high success rate of 86.01% is obtained using SVM for the jackknife test. In the independent dataset test, PNN yields the highest accuracy of 95.73%. These classifiers exhibit improved performance using other performance measures such as sensitivity, specificity, Matthews correlation coefficient, and F-measure. The experimental results show that the prediction performance of the proposed scheme for classifying membrane protein types is the best reported so far. This performance improvement may largely be credited to the learning capabilities of neural networks and the composite feature extraction strategy, which exploits seven different properties of protein sequences. The proposed Mem-Predictor can be accessed at http://111.68.99.218/Mem-Predictor. Copyright © 2010 Elsevier Ltd. All rights reserved.
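
    The performance measures named in this abstract (sensitivity, specificity, Matthews correlation coefficient, F-measure) follow from the four cells of a binary confusion matrix. The sketch below uses the standard textbook definitions; the function name and the example counts are hypothetical and are not results from the paper.

```python
from math import sqrt


def binary_metrics(tp, fp, tn, fn):
    """Standard binary classification measures from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # true positive rate (recall)
    specificity = tn / (tn + fp)   # true negative rate
    precision = tp / (tp + fp)
    f_measure = 2 * precision * sensitivity / (precision + sensitivity)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally reported as 0 when the denominator vanishes.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f_measure": f_measure,
        "mcc": mcc,
    }


# Made-up counts for one membrane protein type versus all others.
print(binary_metrics(tp=40, fp=10, tn=45, fn=5))
```

    The sketch assumes each class has at least one positive and one negative prediction; otherwise the sensitivity, specificity or precision ratios are undefined.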

  3. The HMMER Web Server for Protein Sequence Similarity Search.

    Science.gov (United States)

    Prakash, Ananth; Jeffryes, Matt; Bateman, Alex; Finn, Robert D

    2017-12-08

    Protein sequence similarity search is one of the most commonly used bioinformatics methods for identifying evolutionarily related proteins. In general, sequences that are evolutionarily related share some degree of similarity, and sequence-search algorithms use this principle to identify homologs. The requirement for a fast and sensitive sequence search method led to the development of the HMMER software, which in the latest version (v3.1) uses a combination of sophisticated acceleration heuristics and mathematical and computational optimizations to enable the use of profile hidden Markov models (HMMs) for sequence analysis. The HMMER Web server provides a common platform by linking the HMMER algorithms to databases, thereby enabling the search for homologs, as well as providing sequence and functional annotation by linking external databases. This unit describes three basic protocols and two alternate protocols that explain how to use the HMMER Web server using various input formats and user defined parameters. © 2017 by John Wiley & Sons, Inc.

  4. Quantiprot - a Python package for quantitative analysis of protein sequences.

    Science.gov (United States)

    Konopka, Bogumił M; Marciniak, Marta; Dyrka, Witold

    2017-07-17

    The field of protein sequence analysis is dominated by tools rooted in substitution matrices and alignments. A complementary approach is provided by methods of quantitative characterization. A major advantage of the approach is that quantitative properties define a multidimensional solution space, where sequences can be related to each other and differences can be meaningfully interpreted. Quantiprot is a software package in Python, which provides a simple and consistent interface to multiple methods for quantitative characterization of protein sequences. The package can be used to calculate dozens of characteristics directly from sequences or using physico-chemical properties of amino acids. Besides basic measures, Quantiprot performs quantitative analysis of recurrence and determinism in the sequence, calculates the distribution of n-grams and computes the Zipf's law coefficient. We propose three main fields of application of the Quantiprot package. First, quantitative characteristics can be used in alignment-free similarity searches, and in clustering of large and/or divergent sequence sets. Second, a feature space defined by quantitative properties can be used in comparative studies of protein families and organisms. Third, the feature space can be used for evaluating generative models, where a large number of sequences generated by the model can be compared to actually observed sequences.
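
    To make the n-gram distribution and Zipf's law coefficient mentioned above concrete, here is a minimal plain-Python sketch. It deliberately does not use the Quantiprot API, whose function names are not given in the abstract; `ngram_counts` and `zipf_slope` are hypothetical helpers, and the Zipf coefficient is assumed here to be the least-squares slope of log(frequency) against log(rank).

```python
from collections import Counter
from math import log


def ngram_counts(sequence, n):
    """Count overlapping n-grams in a sequence."""
    return Counter(sequence[i:i + n] for i in range(len(sequence) - n + 1))


def zipf_slope(counts):
    """Least-squares slope of log(frequency) versus log(rank)."""
    freqs = sorted(counts.values(), reverse=True)
    xs = [log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [log(freq) for freq in freqs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("need at least two distinct n-grams")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


example = "MKVLAAGIVGSALAAKLLAAG"  # made-up sequence
bigrams = ngram_counts(example, 2)
print(bigrams.most_common(3))
print(zipf_slope(bigrams))
```

    A slope near -1 corresponds to the classical Zipf distribution; a short sequence such as the one above is only useful for checking that the code runs.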

  5. Prevalence of Tobacco mosaic virus in Iran and Evolutionary Analyses of the Coat Protein Gene

    Directory of Open Access Journals (Sweden)

    Athar Alishiri

    2013-09-01

    The incidence and distribution of Tobacco mosaic virus (TMV) and related tobamoviruses were determined using an enzyme-linked immunosorbent assay on 1,926 symptomatic horticultural crop samples and 107 asymptomatic weed samples collected from 78 highly infected fields in the major horticultural crop-producing areas in 17 provinces throughout Iran. The results were confirmed by host range studies and reverse transcription-polymerase chain reaction. The overall incidence of infection by these viruses in symptomatic plants was 11.3%. The coat protein (CP) gene sequences of a number of isolates were determined and showed high identity (up to 100%) among the Iranian isolates. Phylogenetic analysis of all known TMV CP genes showed three clades on the basis of nucleotide sequences, with all Iranian isolates distinctly clustered in clade II. Analysis using the complete CP amino acid sequence showed one clade with two subgroups, IA and IB, with Iranian isolates in both subgroups. The nucleotide diversity within each subgroup was very low, but higher between the two clades. No correlation was found between genetic distance and geographical origin or host species of isolation. Statistical analyses suggested negative selection and demonstrated the occurrence of gene flow from the isolates in other clades to the Iranian population.

  6. Nanocomposited coatings produced by laser-assisted process to prevent silicone hydrogels from protein fouling and bacterial contamination

    International Nuclear Information System (INIS)

    Huang, Guobang; Chen, Yi; Zhang, Jin

    2016-01-01

    Graphical abstract: Nanocomposited-coating was deposited on silicone hydrogel by using the matrix-assisted pulsed laser evaporation (MAPLE) process. The ZnO–PEG nanocomposited coating reduces over 50% protein absorption on silicone hydrogel, and can inhibit the bacterial growth efficiently. - Highlights: • We developed a nanocomposited coating to prevent silicone hydrogel from biofouling. • Matrix-assisted pulsed laser evaporation can deposit inorganic–organic nanomaterials. • The designed nanocomposited coating reduces protein absorption by over 50%. • The designed nanocomposited coating shows significant antimicrobial efficiency. - Abstract: Zinc oxide (ZnO) nanoparticles incorporating with polyethylene glycol (PEG) were deposited together on the surface of silicone hydrogel through matrix-assisted pulsed laser evaporation (MAPLE). In this process, frozen nanocomposites (ZnO–PEG) in isopropanol were irradiated under a pulsed Nd:YAG laser at 532 nm for 1 h. Our results indicate that the MAPLE process is able to maintain the chemical backbone of polymer and prevent the nanocomposite coating from contamination. The ZnO–PEG nanocomposited coating reduces over 50% protein absorption on silicone hydrogel. The cytotoxicity study shows that the ZnO–PEG nanocomposites deposited on silicone hydrogels do not impose the toxic effect on mouse NIH/3T3 cells. In addition, MAPLE-deposited ZnO–PEG nanocomposites can inhibit the bacterial growth significantly.

  7. Complete nucleotide sequence of a novel Hibiscus-infecting Cilevirus from Florida and its relationship with closely associated Cileviruses

    Science.gov (United States)

    The complete nucleotide sequence of a recently discovered Florida (FL) isolate of Hibiscus infecting Cilevirus (HiCV) was determined by Sanger sequencing. The movement- and coat-protein gene sequences of the HiCV-FL isolate are more divergent than other genes of the previously sequenced HiCV-HA (Ha...

  8. Osteocalcin protein sequences of Neanderthals and modern primates.

    Science.gov (United States)

    Nielsen-Marsh, Christina M; Richards, Michael P; Hauschka, Peter V; Thomas-Oates, Jane E; Trinkaus, Erik; Pettitt, Paul B; Karavanic, Ivor; Poinar, Hendrik; Collins, Matthew J

    2005-03-22

    We report here protein sequences of fossil hominids, from two Neanderthals dating to approximately 75,000 years old from Shanidar Cave in Iraq. These sequences, the oldest reported fossil primate protein sequences, are of bone osteocalcin, which was extracted and sequenced by using MALDI-TOF/TOF mass spectrometry. Through a combination of direct sequencing and peptide mass mapping, we determined that Neanderthals have an osteocalcin amino acid sequence that is identical to that of modern humans. We also report complete osteocalcin sequences for chimpanzee (Pan troglodytes) and gorilla (Gorilla gorilla gorilla) and a partial sequence for orangutan (Pongo pygmaeus), all of which are previously unreported. We found that the osteocalcin sequences of Neanderthals, modern human, chimpanzee, and orangutan are unusual among mammals in that the ninth amino acid is proline (Pro-9), whereas most species have hydroxyproline (Hyp-9). Posttranslational hydroxylation of Pro-9 in osteocalcin by prolyl-4-hydroxylase requires adequate concentrations of vitamin C (l-ascorbic acid), molecular O(2), Fe(2+), and 2-oxoglutarate, and also depends on enzyme recognition of the target proline substrate consensus sequence Leu-Gly-Ala-Pro-9-Ala-Pro-Tyr occurring in most mammals. In five species with Pro-9-Val-10, hydroxylation is blocked, whereas in gorilla there is a mixture of Pro-9 and Hyp-9. We suggest that the absence of hydroxylation of Pro-9 in Pan, Pongo, and Homo may reflect response to a selective pressure related to a decline in vitamin C in the diet during omnivorous dietary adaptation, either independently or through the common ancestor of these species.

  9. Sequence protein identification by randomized sequence database and transcriptome mass spectrometry (SPIDER-TMS): from manual to automatic application of a 'de novo sequencing' approach.

    Science.gov (United States)

    Pascale, Raffaella; Grossi, Gerarda; Cruciani, Gabriele; Mecca, Giansalvatore; Santoro, Donatello; Sarli Calace, Renzo; Falabella, Patrizia; Bianco, Giuliana

    Sequence protein identification by a randomized sequence database and transcriptome mass spectrometry software package has been developed at the University of Basilicata in Potenza (Italy) and designed to facilitate the determination of the amino acid sequence of a peptide as well as an unequivocal identification of proteins in a high-throughput manner with enormous advantages of time, economical resource and expertise. The software package is a valid tool for the automation of a de novo sequencing approach, overcoming the main limits and a versatile platform useful in the proteomic field for an unequivocal identification of proteins, starting from tandem mass spectrometry data. The strength of this software is that it is a user-friendly and non-statistical approach, so protein identification can be considered unambiguous.

  10. On the relationship between residue structural environment and sequence conservation in proteins.

    Science.gov (United States)

    Liu, Jen-Wei; Lin, Jau-Ji; Cheng, Chih-Wen; Lin, Yu-Feng; Hwang, Jenn-Kang; Huang, Tsun-Tsao

    2017-09-01

    Residues that are crucial to protein function or structure are usually evolutionarily conserved. To identify the important residues in a protein, sequence conservation is estimated, and current methods rely upon the unbiased collection of homologous sequences. Surprisingly, our previous studies have shown that the sequence conservation is closely correlated with the weighted contact number (WCN), a measure of packing density for a residue's structural environment, calculated only from the Cα positions of a protein structure. Moreover, studies have shown that sequence conservation is correlated with environment-related structural properties calculated from different protein substructures, such as a protein's all atoms, backbone atoms, side-chain atoms, or side-chain centroid. To determine whether the Cα atomic positions are adequate to show the relationship between residue environment and sequence conservation, here we compared Cα atoms with other substructures in their contributions to the sequence conservation. Our results show that Cα positions are substantially equivalent to the other substructures in calculations of various measures of residue environment. As a result, the overlapping contributions between Cα atoms and the other substructures are high, yielding similar structure-conservation relationships. Taking the WCN as an example, the average overlapping contribution to sequence conservation is 87% between Cα and all-atom substructures. These results indicate that the Cα atoms of a protein structure alone could reflect sequence conservation at the residue level. © 2017 Wiley Periodicals, Inc.
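
    The abstract does not spell out how the weighted contact number is computed. The sketch below assumes the inverse-square-distance form commonly used in the literature, in which the WCN of residue i is the sum over all other residues j of 1 / r_ij^2, with r_ij the distance between Cα atoms. The function name and the toy coordinates are hypothetical.

```python
def weighted_contact_number(coords):
    """WCN per residue, assuming WCN_i = sum over j != i of 1 / r_ij**2.

    coords: list of (x, y, z) positions, one per residue (e.g. C-alpha atoms).
    Coincident points would cause a division by zero.
    """
    wcn = []
    for i, (xi, yi, zi) in enumerate(coords):
        total = 0.0
        for j, (xj, yj, zj) in enumerate(coords):
            if i != j:
                dist_sq = (xi - xj) ** 2 + (yi - yj) ** 2 + (zi - zj) ** 2
                total += 1.0 / dist_sq
        wcn.append(total)
    return wcn


# Toy example: three points on a line, one unit apart (not real atom positions).
toy = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
print(weighted_contact_number(toy))  # [1.25, 2.0, 1.25]
```

    The central point has the highest value, which matches the reading of WCN as a packing-density measure: residues surrounded by many close neighbours score higher than residues at the periphery.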

  11. SAAS: Short Amino Acid Sequence - A Promising Protein Secondary Structure Prediction Method of Single Sequence

    Directory of Open Access Journals (Sweden)

    Zhou Yuan Wu

    2013-07-01

    In statistical methods of predicting protein secondary structure, many researchers focus on single amino acid frequencies in α-helices, β-sheets, and so on, or on the influence of neighboring amino acids on whether an amino acid forms a secondary structure. This paper instead treats a short sequence of amino acids (3, 4, 5 or 6 residues) as a single unit and compiles statistics on the probability that each short sequence forms a given secondary structure. Also, many researchers select sequences of low homology as the statistical database, whereas this paper uses the whole PDB database. We propose a strategy to predict protein secondary structure using a simple statistical method. Numerical computation shows that treating short amino acid sequences as units makes the tendency of a short sequence to form secondary structure easy to see, that a large statistical database (the whole PDB, without filtering for homology) works well, and that the Q3 accuracy of the proposed simple statistical method is ca. 74%, whereas the accuracy of other statistical methods is less than 70%.
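
    The short-sequence statistics described above might be sketched as follows. This is not the SAAS implementation: the choice to credit each k-residue window to the structure label of its central residue, the three-letter H/E/C alphabet, and all names are assumptions made for illustration.

```python
from collections import Counter, defaultdict


def kmer_structure_counts(records, k=3):
    """Count, for every k-residue window, the structure label of its central residue.

    `records` is an iterable of (sequence, structure) string pairs of equal length,
    with structure letters such as H (helix), E (strand) and C (coil).
    """
    counts = defaultdict(Counter)
    for seq, ss in records:
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]][ss[i + k // 2]] += 1
    return counts


def predict_structure(seq, counts, k=3, default="C"):
    """Label each residue with the most frequent label seen for its window."""
    predicted = [default] * len(seq)
    for i in range(len(seq) - k + 1):
        seen = counts.get(seq[i:i + k])
        if seen:  # windows never observed keep the default label
            predicted[i + k // 2] = seen.most_common(1)[0][0]
    return "".join(predicted)


# Hypothetical one-record training set, for illustration only.
table = kmer_structure_counts([("AAAAGG", "HHHHCC")], k=3)
print(predict_structure("AAAG", table, k=3))
```

    With a real database the table would be built once from all annotated chains and then reused for every query sequence.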

  12. The coat protein complex II, COPII, protein Sec13 directly interacts with presenilin-1

    International Nuclear Information System (INIS)

    Nielsen, Anders Lade

    2009-01-01

    Mutations in the human gene encoding presenilin-1, PS1, account for most cases of early-onset familial Alzheimer's disease. PS1 has nine transmembrane domains and a large loop orientated towards the cytoplasm. PS1 locates to cellular compartments as endoplasmic reticulum (ER), Golgi apparatus, vesicular structures, and plasma membrane, and is an integral member of γ-secretase, a protein protease complex with specificity for intra-membranous cleavage of substrates such as β-amyloid precursor protein. Here, an interaction between PS1 and the Sec13 protein is described. Sec13 takes part in coat protein complex II, COPII, vesicular trafficking, nuclear pore function, and ER directed protein sequestering and degradation control. The interaction maps to the N-terminal part of the large hydrophilic PS1 loop and the first of the six WD40-repeats present in Sec13. The identified Sec13 interaction to PS1 is a new candidate interaction for linking PS1 to secretory and protein degrading vesicular circuits.

  13. The coat protein complex II, COPII, protein Sec13 directly interacts with presenilin-1

    Energy Technology Data Exchange (ETDEWEB)

    Nielsen, Anders Lade, E-mail: aln@humgen.au.dk [Department of Human Genetics, The Bartholin Building, University of Aarhus, DK-8000 Aarhus C (Denmark)]

    2009-10-23

    Mutations in the human gene encoding presenilin-1, PS1, account for most cases of early-onset familial Alzheimer's disease. PS1 has nine transmembrane domains and a large loop orientated towards the cytoplasm. PS1 locates to cellular compartments as endoplasmic reticulum (ER), Golgi apparatus, vesicular structures, and plasma membrane, and is an integral member of γ-secretase, a protein protease complex with specificity for intra-membranous cleavage of substrates such as β-amyloid precursor protein. Here, an interaction between PS1 and the Sec13 protein is described. Sec13 takes part in coat protein complex II, COPII, vesicular trafficking, nuclear pore function, and ER directed protein sequestering and degradation control. The interaction maps to the N-terminal part of the large hydrophilic PS1 loop and the first of the six WD40-repeats present in Sec13. The identified Sec13 interaction to PS1 is a new candidate interaction for linking PS1 to secretory and protein degrading vesicular circuits.

  14. Computational identification of MoRFs in protein sequences.

    Science.gov (United States)

    Malhis, Nawar; Gsponer, Jörg

    2015-06-01

    Intrinsically disordered regions of proteins play an essential role in the regulation of various biological processes. Key to their regulatory function is the binding of molecular recognition features (MoRFs) to globular protein domains in a process known as a disorder-to-order transition. Predicting the location of MoRFs in protein sequences with high accuracy remains an important computational challenge. In this study, we introduce MoRFCHiBi, a new computational approach for fast and accurate prediction of MoRFs in protein sequences. MoRFCHiBi combines the outcomes of two support vector machine (SVM) models that take advantage of two different kernels with high noise tolerance. The first, SVMS, is designed to extract maximal information from the general contrast in amino acid compositions between MoRFs, their surrounding regions (Flanks), and the remainders of the sequences. The second, SVMT, is used to identify similarities between regions in a query sequence and MoRFs of the training set. We evaluated the performance of our predictor by comparing its results with those of two currently available MoRF predictors, MoRFpred and ANCHOR. Using three test sets that have previously been collected and used to evaluate MoRFpred and ANCHOR, we demonstrate that MoRFCHiBi outperforms the other predictors with respect to different evaluation metrics. In addition, MoRFCHiBi is downloadable and fast, which makes it useful as a component in other computational prediction tools. http://www.chibi.ubc.ca/morf/. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

  15. Formation of the outer layer of the Dictyostelium spore coat depends on the inner-layer protein SP85/PsB.

    Science.gov (United States)

    Metcalf, Talibah; Kelley, Karen; Erdos, Gregory W; Kaplan, Lee; West, Christopher M

    2003-02-01

    The Dictyostelium spore is surrounded by a 220 microm thick trilaminar coat that consists of inner and outer electron-dense layers surrounding a central region of cellulose microfibrils. In previous studies, a mutant strain (TL56) lacking three proteins associated with the outer layer exhibited increased permeability to macromolecular tracers, suggesting that this layer contributes to the coat permeability barrier. Electron microscopy now shows that the outer layer is incomplete in the coats of this mutant and consists of a residual regular array of punctate electron densities. The outer layer is also incomplete in a mutant lacking a cellulose-binding protein associated with the inner layer, and these coats are deficient in an outer-layer protein and another coat protein. To examine the mechanism by which this inner-layer protein, SP85, contributes to outer-layer formation, various domain fragments were overexpressed in forming spores. Most of these exert dominant negative effects similar to the deletion of outer-layer proteins, but one construct, consisting of a fusion of the N-terminal and Cys-rich C1 domain, induces a dense mat of novel filaments at the surface of the outer layer. Biochemical studies show that the C1 domain binds cellulose, and a combination of site-directed mutations that inhibits its cellulose-binding activity suppresses outer-layer filament induction. The results suggest that, in addition to a previously described early role in regulating cellulose synthesis, SP85 subsequently contributes a cross-bridging function between cellulose and other coat proteins to organize previously unrecognized structural elements in the outer layer of the coat.

  16. MannDB – A microbial database of automated protein sequence analyses and evidence integration for protein characterization

    Directory of Open Access Journals (Sweden)

    Kuczmarski Thomas A

    2006-10-01

    Background: MannDB was created to meet a need for rapid, comprehensive automated protein sequence analyses to support selection of proteins suitable as targets for driving the development of reagents for pathogen or protein toxin detection. Because a large number of open-source tools were needed, it was necessary to produce a software system to scale the computations for whole-proteome analysis. Thus, we built a fully automated system for executing software tools and for storage, integration, and display of automated protein sequence analysis and annotation data. Description: MannDB is a relational database that organizes data resulting from fully automated, high-throughput protein-sequence analyses using open-source tools. Types of analyses provided include predictions of cleavage, chemical properties, classification, features, functional assignment, post-translational modifications, motifs, antigenicity, and secondary structure. Proteomes (lists of hypothetical and known proteins) are downloaded and parsed from Genbank and then inserted into MannDB, and annotations from SwissProt are downloaded when identifiers are found in the Genbank entry or when identical sequences are identified. Currently 36 open-source tools are run against MannDB protein sequences either on local systems or by means of batch submission to external servers. In addition, BLAST against protein entries in MvirDB, our database of microbial virulence factors, is performed. A web client browser enables viewing of computational results and downloaded annotations, and a query tool enables structured and free-text search capabilities. When available, links to external databases, including MvirDB, are provided. MannDB contains whole-proteome analyses for at least one representative organism from each category of biological threat organism listed by APHIS, CDC, HHS, NIAID, USDA, USFDA, and WHO. Conclusion: MannDB comprises a large number of genomes and comprehensive protein

  17. Nonlinear analysis of sequence repeats of multi-domain proteins

    Energy Technology Data Exchange (ETDEWEB)

    Huang Yanzhao [Biomolecular Physics and Modeling Group, Department of Physics, Huazhong University of Science and Technology, Wuhan 430074, Hubei (China); Li Mingfeng [Biomolecular Physics and Modeling Group, Department of Physics, Huazhong University of Science and Technology, Wuhan 430074, Hubei (China); Xiao Yi [Biomolecular Physics and Modeling Group, Department of Physics, Huazhong University of Science and Technology, Wuhan 430074, Hubei (China)]. E-mail: lmf_bill@sina.com

    2007-11-15

    Many multi-domain proteins have repetitive three-dimensional structures but nearly random amino acid sequences. In the present paper, using a modified recurrence plot that we proposed previously, we show that these amino acid sequences in fact contain hidden repetitions. These results indicate that the repetitive domain structures are encoded by the repetitive sequences. They also provide a method to detect repetitive domain structures directly from amino acid sequences.

  18. The SWISS-PROT protein sequence data bank

    OpenAIRE

    Bairoch, Amos; Boeckmann, Brigitte

    1992-01-01

    SWISS-PROT is an annotated protein sequence database established in 1986 and maintained collaboratively, since 1988, by the Department of Medical Biochemistry of the University of Geneva and the EMBL Data Library

  19. Using sequence similarity networks for visualization of relationships across diverse protein superfamilies.

    Directory of Open Access Journals (Sweden)

    Holly J Atkinson

    The dramatic increase in heterogeneous types of biological data--in particular, the abundance of new protein sequences--requires fast and user-friendly methods for organizing this information in a way that enables functional inference. The most widely used strategy to link sequence or structure to function, homology-based function prediction, relies on the fundamental assumption that sequence or structural similarity implies functional similarity. New tools that extend this approach are still urgently needed to associate sequence data with biological information in ways that accommodate the real complexity of the problem, while being accessible to experimental as well as computational biologists. To address this, we have examined the application of sequence similarity networks for visualizing functional trends across protein superfamilies from the context of sequence similarity. Using three large groups of homologous proteins of varying types of structural and functional diversity--GPCRs and kinases from humans, and the crotonase superfamily of enzymes--we show that overlaying networks with orthogonal information is a powerful approach for observing functional themes and revealing outliers. In comparison to other primary methods, networks provide both a good representation of group-wise sequence similarity relationships and a strong visual and quantitative correlation with phylogenetic trees, while enabling analysis and visualization of much larger sets of sequences than trees or multiple sequence alignments can easily accommodate. We also define important limitations and caveats in the application of these networks. As a broadly accessible and effective tool for the exploration of protein superfamilies, sequence similarity networks show great potential for generating testable hypotheses about protein structure-function relationships.

  20. Using sequence similarity networks for visualization of relationships across diverse protein superfamilies.

    Science.gov (United States)

    Atkinson, Holly J; Morris, John H; Ferrin, Thomas E; Babbitt, Patricia C

    2009-01-01

    The dramatic increase in heterogeneous types of biological data--in particular, the abundance of new protein sequences--requires fast and user-friendly methods for organizing this information in a way that enables functional inference. The most widely used strategy to link sequence or structure to function, homology-based function prediction, relies on the fundamental assumption that sequence or structural similarity implies functional similarity. New tools that extend this approach are still urgently needed to associate sequence data with biological information in ways that accommodate the real complexity of the problem, while being accessible to experimental as well as computational biologists. To address this, we have examined the application of sequence similarity networks for visualizing functional trends across protein superfamilies from the context of sequence similarity. Using three large groups of homologous proteins of varying types of structural and functional diversity--GPCRs and kinases from humans, and the crotonase superfamily of enzymes--we show that overlaying networks with orthogonal information is a powerful approach for observing functional themes and revealing outliers. In comparison to other primary methods, networks provide both a good representation of group-wise sequence similarity relationships and a strong visual and quantitative correlation with phylogenetic trees, while enabling analysis and visualization of much larger sets of sequences than trees or multiple sequence alignments can easily accommodate. We also define important limitations and caveats in the application of these networks. As a broadly accessible and effective tool for the exploration of protein superfamilies, sequence similarity networks show great potential for generating testable hypotheses about protein structure-function relationships.
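
    A sequence similarity network of the kind described here can be sketched in a few lines: sequences are nodes, and an edge joins two sequences whose pairwise similarity reaches a threshold. The sketch below is only an illustration. It substitutes Python's `difflib.SequenceMatcher` ratio for the alignment scores such networks would normally use, and the threshold, function names, and toy sequences are assumptions.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similarity_network(seqs, threshold=0.6):
    """Build an adjacency map: an edge joins two names whose similarity >= threshold.

    `seqs` maps a name to a sequence string. The similarity score is a stand-in
    (difflib ratio), not an alignment score.
    """
    edges = {name: set() for name in seqs}
    for a, b in combinations(seqs, 2):
        score = SequenceMatcher(None, seqs[a], seqs[b], autojunk=False).ratio()
        if score >= threshold:
            edges[a].add(b)
            edges[b].add(a)
    return edges


def clusters(edges):
    """Return the connected components of the network as sorted lists of names."""
    seen, groups = set(), []
    for start in edges:
        if start in seen:
            continue
        group, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(edges[node] - group)
        seen |= group
        groups.append(sorted(group))
    return groups


# Hypothetical toy sequences: "a" and "b" differ by one residue, "c" is unrelated.
toy = {"a": "MKTAYIAKQR", "b": "MKTAYIAKQK", "c": "GGGGSGGGGS"}
print(clusters(similarity_network(toy)))
```

    Raising the threshold splits the network into smaller, tighter clusters; lowering it merges them, which is the main knob when exploring a superfamily this way.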

  1. Poly(norepinephrine)-coated open tubular column for the separation of proteins and recombination human erythropoietin by capillary electrochromatography.

    Science.gov (United States)

    Xiao, Xue; Zhang, Yamin; Wu, Jia; Jia, Li

    2017-12-01

    Recombinant human erythropoietin is an important therapeutic protein with high economic interest due to the benefits provided by its clinical use for the treatment of anemias associated with chronic renal failure and chemotherapy. In this work, a poly(norepinephrine)-coated open tubular column was successfully prepared based on the self-polymerization of norepinephrine under mild alkaline condition, the favorable film forming and easy adhesive properties of poly(norepinephrine). The poly(norepinephrine) coating was characterized by scanning electron microscopy and measurement of the electro-osmotic flow. The thickness of the coating was about 431 nm. The electrochromatographic performance of the poly(norepinephrine)-coated open tubular column was evaluated by separation of proteins. Some basic and acidic proteins including two variants of bovine serum albumin and two variants of β-lactoglobulin achieved separation in the poly(norepinephrine)-coated open tubular column. More importantly, the column demonstrated separation ability for the glycoforms of recombinant human erythropoietin. In addition, the column demonstrated good repeatability with the run-to-run, day-to-day, and column-to-column relative standard deviations of migration times of proteins less than 3.40%. © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  2. Regulation of expression of a select group of Bacillus anthracis spore coat proteins.

    Science.gov (United States)

    Aronson, Arthur

    2018-04-01

    The spore coat of Bacilli is a relatively complex structure comprising about 70 species of proteins in 2 or 3 layers. While some are involved in assembly or protection, the regulation of many is not well defined, so lacZ transcriptional fusions were constructed to six Bacillus anthracis spore coat genes in order to gain insight into their possible functions. The genes were selected on the basis of the location of the encoded proteins within the coat and their distribution among spore-forming species. Conditions tested were temperature and media, either solid or liquid. The most extensive differences were for the relatively well-expressed fusions to the cotH and cotM genes, whose expression was greatest at 30°C on plates of a nutrient-rich medium. The cotJ operon was moderately expressed under all conditions, although somewhat higher on enriched plates at 30°C. Expression of cotS was low under all conditions except for a substantial increase in biofilm medium. Expression of cotα and cotF was essentially invariant, with somewhat greater expression in the more enriched medium. The capacity of a subset of coat genes to respond to various conditions reflects a flexibility in spore coat structure that may be necessary for adaptation to environmental challenges. This could account, at least in part, for the complexity of this structure.

  3. Binding constants of Southern rice black-streaked dwarf virus Coat Protein with ferulic acid derivatives

    Directory of Open Access Journals (Sweden)

    Longlu Ran

    2018-04-01

    The data presented in this article are binding constants between ferulic acid derivatives and the Coat Protein (P10), measured by fluorescence titration; the article accompanies the research article entitled “Interaction Research on an Antiviral Molecule that Targets the Coat Protein of Southern rice black-streaked dwarf virus” (Ran et al., 2017 [1]). The data include fluorescence quenching spectra, Stern–Volmer quenching constants, and binding parameters. This article provides a more comprehensive interpretation and analysis of the data.
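
    For readers unfamiliar with how a Stern–Volmer quenching constant is obtained from a fluorescence titration, the sketch below fits the standard relation F0/F = 1 + Ksv·[Q] by least squares with the intercept fixed at 1. The function name and the titration values are hypothetical and are not taken from the data set described above.

```python
def stern_volmer_constant(f0, intensities, quencher_conc):
    """Estimate Ksv from F0/F = 1 + Ksv * [Q], with the intercept fixed at 1.

    f0            -- fluorescence intensity without quencher
    intensities   -- intensities F measured at each quencher concentration
    quencher_conc -- matching quencher concentrations [Q] (e.g. in mol/L)
    """
    num = 0.0
    den = 0.0
    for f, q in zip(intensities, quencher_conc):
        num += q * (f0 / f - 1.0)  # x * (y - 1) for a through-origin fit
        den += q * q
    if den == 0.0:
        raise ValueError("at least one non-zero quencher concentration is required")
    return num / den


# Hypothetical titration generated from Ksv = 1e4 L/mol, for illustration only.
conc = [1e-5, 2e-5, 3e-5]
signal = [100.0 / (1.0 + 1e4 * q) for q in conc]
print(stern_volmer_constant(100.0, signal, conc))
```

    Ksv comes out in the reciprocal of whatever concentration unit is supplied. Deriving binding constants and the number of binding sites needs a separate model that is not shown here.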

  4. PredPPCrys: accurate prediction of sequence cloning, protein production, purification and crystallization propensity from protein sequences using multi-step heterogeneous feature fusion and selection.

    Directory of Open Access Journals (Sweden)

    Huilin Wang

    X-ray crystallography is the primary approach to solve the three-dimensional structure of a protein. However, a major bottleneck of this method is the failure of multi-step experimental procedures to yield diffraction-quality crystals, including sequence cloning, protein material production, purification, crystallization and ultimately, structural determination. Accordingly, prediction of the propensity of a protein to successfully undergo these experimental procedures based on the protein sequence may help narrow down laborious experimental efforts and facilitate target selection. A number of bioinformatics methods based on protein sequence information have been developed for this purpose. However, our knowledge on the important determinants of propensity for a protein sequence to produce high diffraction-quality crystals remains largely incomplete. In practice, most of the existing methods display poorer performance when evaluated on larger and updated datasets. To address this problem, we constructed an up-to-date dataset as the benchmark, and subsequently developed a new approach termed 'PredPPCrys' using the support vector machine (SVM). Using a comprehensive set of multifaceted sequence-derived features in combination with a novel multi-step feature selection strategy, we identified and characterized the relative importance and contribution of each feature type to the prediction performance of five individual experimental steps required for successful crystallization. The resulting optimal candidate features were used as inputs to build the first-level SVM predictor (PredPPCrys I). Next, prediction outputs of PredPPCrys I were used as the input to build second-level SVM classifiers (PredPPCrys II), which led to significantly enhanced prediction performance. Benchmarking experiments indicated that our PredPPCrys method outperforms most existing procedures on both up-to-date and previous datasets. In addition, the predicted crystallization

  5. Microwave-assisted acid and base hydrolysis of intact proteins containing disulfide bonds for protein sequence analysis by mass spectrometry.

    Science.gov (United States)

    Reiz, Bela; Li, Liang

    2010-09-01

    Controlled hydrolysis of proteins to generate peptide ladders combined with mass spectrometric analysis of the resultant peptides can be used for protein sequencing. In this paper, two methods of improving the microwave-assisted protein hydrolysis process are described to enable rapid sequencing of proteins containing disulfide bonds and increase sequence coverage, respectively. It was demonstrated that proteins containing disulfide bonds could be sequenced by MS analysis by first performing hydrolysis for less than 2 min, followed by 1 h of reduction to release the peptides originally linked by disulfide bonds. It was shown that a strong base could be used as a catalyst for microwave-assisted protein hydrolysis, producing complementary sequence information to that generated by microwave-assisted acid hydrolysis. However, using either acid or base hydrolysis, amide bond breakages in small regions of the polypeptide chains of the model proteins (e.g., cytochrome c and lysozyme) were not detected. Dynamic light scattering measurement of the proteins solubilized in an acid or base indicated that protein-protein interaction or aggregation was not the cause of the failure to hydrolyze certain amide bonds. It was speculated that there were some unknown local structures that might play a role in preventing an acid or base from reacting with the peptide bonds therein. 2010 American Society for Mass Spectrometry. Published by Elsevier Inc. All rights reserved.
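
    The ladder-sequencing idea in this abstract rests on a simple observation: neighbouring members of a peptide ladder differ by one residue, so each mass difference points to an amino acid. The sketch below illustrates that step only. The function name, tolerance, and example masses are assumptions; the table lists standard monoisotopic residue masses.

```python
# Monoisotopic residue masses in Da; I and L are isobaric and cannot be told apart.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "I": 113.08406, "L": 113.08406,
    "N": 114.04293, "D": 115.02694, "Q": 128.05858, "K": 128.09496,
    "E": 129.04259, "M": 131.04049, "H": 137.05891, "F": 147.06841,
    "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def read_ladder(masses, tol=0.01):
    """Assign a residue to each gap between consecutive ladder masses.

    Returns one entry per gap, from lightest to heaviest peptide: the matching
    residue letter(s) joined by "/", or "?" if no residue fits within `tol` Da.
    """
    ordered = sorted(masses)
    residues = []
    for lighter, heavier in zip(ordered, ordered[1:]):
        delta = heavier - lighter
        matches = [aa for aa, m in RESIDUE_MASS.items() if abs(m - delta) <= tol]
        residues.append("/".join(matches) if matches else "?")
    return residues


# Hypothetical ladder: a 500.0 Da fragment extended by G, then A, then L (or I).
ladder = [500.0, 557.02146, 628.05857, 741.14263]
print(read_ladder(ladder))
```

    Whether the residues read out in N-to-C or C-to-N order depends on which terminus the ladder grows from, which this sketch does not determine. A "?" marks a gap that corresponds to no single residue, for example where a ladder member is missing.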

  6. Sequence-based prediction of protein-binding sites in DNA: comparative study of two SVM models.

    Science.gov (United States)

    Park, Byungkyu; Im, Jinyong; Tuvshinjargal, Narankhuu; Lee, Wook; Han, Kyungsook

    2014-11-01

    As many structures of protein-DNA complexes have been known in the past years, several computational methods have been developed to predict DNA-binding sites in proteins. However, its inverse problem (i.e., predicting protein-binding sites in DNA) has received much less attention. One of the reasons is that the differences between the interaction propensities of nucleotides are much smaller than those between amino acids. Another reason is that DNA exhibits less diverse sequence patterns than protein. Therefore, predicting protein-binding DNA nucleotides is much harder than predicting DNA-binding amino acids. We computed the interaction propensity (IP) of nucleotide triplets with amino acids using an extensive dataset of protein-DNA complexes, and developed two support vector machine (SVM) models that predict protein-binding nucleotides from sequence data alone. One SVM model predicts protein-binding nucleotides using DNA sequence data alone, and the other SVM model predicts protein-binding nucleotides using both DNA and protein sequences. In a 10-fold cross-validation with 1519 DNA sequences, the SVM model that uses DNA sequence data only predicted protein-binding nucleotides with an accuracy of 67.0%, an F-measure of 67.1%, and a Matthews correlation coefficient (MCC) of 0.340. With an independent dataset of 181 DNAs that were not used in training, it achieved an accuracy of 66.2%, an F-measure 66.3% and a MCC of 0.324. Another SVM model that uses both DNA and protein sequences achieved an accuracy of 69.6%, an F-measure of 69.6%, and a MCC of 0.383 in a 10-fold cross-validation with 1519 DNA sequences and 859 protein sequences. With an independent dataset of 181 DNAs and 143 proteins, it showed an accuracy of 67.3%, an F-measure of 66.5% and a MCC of 0.329. Both in cross-validation and independent testing, the second SVM model that used both DNA and protein sequence data showed better performance than the first model that used DNA sequence data. 
To the best of
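
    The three scores quoted in this abstract (accuracy, F-measure, and Matthews correlation coefficient) all follow from the four cells of a binary confusion matrix. The sketch below shows the standard formulas; the function name and the counts in the example are hypothetical and unrelated to the study's data.

```python
import math


def binary_metrics(tp, tn, fp, fn):
    """Return (accuracy, F-measure, MCC) from confusion-matrix counts.

    Assumes both classes are present and predicted at least once; otherwise a
    denominator below is zero and the corresponding score is undefined.
    """
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_measure = 2 * precision * recall / (precision + recall)
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return accuracy, f_measure, mcc


# Hypothetical counts, for illustration only.
acc, f1, mcc = binary_metrics(tp=40, tn=30, fp=20, fn=10)
print(f"accuracy={acc:.3f}  F-measure={f1:.3f}  MCC={mcc:.3f}")
```

    Accuracy and F-measure can look respectable on imbalanced data, whereas MCC stays near 0 for a predictor that is no better than chance, which is one reason studies often report all three.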

  7. Analysis of long-range correlation in sequences data of proteins

    Directory of Open Access Journals (Sweden)

    ADRIANA ISVORAN

    2007-04-01

    The results presented here suggest the existence of correlations in the sequence data of proteins. 32 proteins, both globular and fibrous, both monomeric and polymeric, were analyzed. The primary structures of these proteins were treated as time series. Three spatial series of data for each sequence of a protein were generated from numerical correspondences between each amino acid and a physical property associated with it, i.e., its electric charge, its polar character and its dipole moment. For each series, the spectral coefficient, the scaling exponent and the Hurst coefficient were determined. The values obtained for these coefficients revealed non-randomness in the series of data.
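
    To make the "sequence as a series" idea concrete, the sketch below maps residues to a charge value and estimates a Hurst exponent by classical rescaled-range (R/S) analysis. The abstract does not say which estimator or charge assignment the authors used, so the simplified charge table (histidine treated as neutral), the window scheme, and all names are assumptions.

```python
import math

# Simplified side-chain charges near neutral pH; every other residue counts as 0.
CHARGE = {"D": -1, "E": -1, "K": 1, "R": 1}


def charge_series(sequence):
    """Turn an amino acid string into a numeric series of residue charges."""
    return [CHARGE.get(aa, 0) for aa in sequence]


def rescaled_range(window):
    """Return R/S for one window, or None if the window has no variation."""
    n = len(window)
    mean = sum(window) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in window) / n)
    if std == 0:
        return None
    running = low = high = 0.0
    for x in window:
        running += x - mean  # cumulative deviation from the window mean
        low = min(low, running)
        high = max(high, running)
    return (high - low) / std


def hurst_exponent(series, window_sizes):
    """Slope of log(mean R/S) against log(window size) over the given sizes."""
    points = []
    for w in window_sizes:
        values = []
        for start in range(0, len(series) - w + 1, w):
            rs = rescaled_range(series[start:start + w])
            if rs is not None:
                values.append(rs)
        if values:
            points.append((math.log(w), math.log(sum(values) / len(values))))
    if len(points) < 2:
        raise ValueError("need at least two usable window sizes")
    mean_x = sum(x for x, _ in points) / len(points)
    mean_y = sum(y for _, y in points) / len(points)
    slope_num = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope_den = sum((x - mean_x) ** 2 for x, _ in points)
    return slope_num / slope_den


# Hypothetical perfectly alternating charge pattern, for illustration only.
print(hurst_exponent(charge_series("KD" * 8), [4, 8]))
```

    Values near 0.5 are what an uncorrelated series would give, while values above or below suggest persistent or anti-persistent correlations. Short protein sequences allow only a few window sizes, so estimates of this kind are noisy.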

  8. Automatic discovery of cross-family sequence features associated with protein function

    Directory of Open Access Journals (Sweden)

    Krings Andrea

    2006-01-01

    Background: Methods for predicting protein function directly from amino acid sequences are useful tools in the study of uncharacterised protein families and in comparative genomics. Until now, this problem has been approached using machine learning techniques that attempt to predict membership, or otherwise, to predefined functional categories or subcellular locations. A potential drawback of this approach is that the human-designated functional classes may not accurately reflect the underlying biology, and consequently important sequence-to-function relationships may be missed. Results: We show that a self-supervised data mining approach is able to find relationships between sequence features and functional annotations. No preconceived ideas about functional categories are required, and the training data is simply a set of protein sequences and their UniProt/Swiss-Prot annotations. The main technical aspect of the approach is the co-evolution of amino acid-based regular expressions and keyword-based logical expressions with genetic programming. Our experiments on a strictly non-redundant set of eukaryotic proteins reveal that the strongest and most easily detected sequence-to-function relationships are concerned with targeting to various cellular compartments, which is an area already well studied both experimentally and computationally. Of more interest are a number of broad functional roles which can also be correlated with sequence features. These include inhibition, biosynthesis, transcription and defence against bacteria. Despite substantial overlaps between these functions and their corresponding cellular compartments, we find clear differences in the sequence motifs used to predict some of these functions. For example, the presence of polyglutamine repeats appears to be linked more strongly to the "transcription" function than to the general "nuclear" function/location. Conclusion: We have developed a novel and useful approach for

  9. Protein sequences from mastodon and Tyrannosaurus rex revealed by mass spectrometry.

    Science.gov (United States)

    Asara, John M; Schweitzer, Mary H; Freimark, Lisa M; Phillips, Matthew; Cantley, Lewis C

    2007-04-13

    Fossilized bones from extinct taxa harbor the potential for obtaining protein or DNA sequences that could reveal evolutionary links to extant species. We used mass spectrometry to obtain protein sequences from bones of a 160,000- to 600,000-year-old extinct mastodon (Mammut americanum) and a 68-million-year-old dinosaur (Tyrannosaurus rex). The presence of T. rex sequences indicates that their peptide bonds were remarkably stable. Mass spectrometry can thus be used to determine unique sequences from ancient organisms from peptide fragmentation patterns, a valuable tool to study the evolution and adaptation of ancient taxa from which genomic sequences are unlikely to be obtained.

  10. Formulating food protein-stabilized indomethacin nanosuspensions into pellets by fluid-bed coating technology: physical characterization, redispersibility, and dissolution.

    Science.gov (United States)

    He, Wei; Lu, Yi; Qi, Jianping; Chen, Lingyun; Yin, Lifang; Wu, Wei

    2013-01-01

    Drug nanosuspensions are very promising for enhancing the dissolution and bioavailability of drugs that are poorly soluble in water. However, the poor stability of nanosuspensions, reflected in particle growth, aggregation/agglomeration, and change in crystallinity state greatly limits their applications. Solidification of nanosuspensions is an ideal strategy for addressing this problem. Hence, the present work aimed to convert drug nanosuspensions into pellets using fluid-bed coating technology. Indomethacin nanosuspensions were prepared by the precipitation-ultrasonication method using food proteins (soybean protein isolate, whey protein isolate, β-lactoglobulin) as stabilizers. Dried nanosuspensions were prepared by coating the nanosuspensions onto pellets. The redispersibility, drug dissolution, solid-state forms, and morphology of the dried nanosuspensions were evaluated. The mean particle size for the nanosuspensions stabilized using soybean protein isolate, whey protein isolate, and β-lactoglobulin was 588 nm, 320 nm, and 243 nm, respectively. The nanosuspensions could be successfully layered onto pellets with high coating efficiency. Both the dried nanosuspensions and nanosuspensions in their original amorphous state and not influenced by the fluid-bed coating drying process could be redispersed in water, maintaining their original particle size and size distribution. Both the dried nanosuspensions and the original drug nanosuspensions showed similar dissolution profiles, which were both much faster than that of the raw crystals. Fluid-bed coating technology has potential for use in the solidification of drug nanosuspensions.

  11. Structural insights and ab initio sequencing within the DING proteins family

    International Nuclear Information System (INIS)

    Elias, Mikael; Liebschner, Dorothee; Gotthard, Guillaume; Chabriere, Eric

    2011-01-01

    DING proteins constitute a recently discovered protein family that is ubiquitous in eukaryotes. The structural insights and the physiological involvements of these intriguing proteins are hereby deciphered. DING proteins constitute an intriguing family of phosphate-binding proteins that was identified in a wide range of organisms, from prokaryotes and archaea to eukaryotes. Despite their seemingly ubiquitous occurrence in eukaryotes, their encoding genes are missing from sequenced genomes. Such a lack has considerably hampered functional studies. In humans, these proteins have been related to several diseases, like atherosclerosis, kidney stones, inflammation processes and HIV inhibition. The human phosphate binding protein is a human representative of the DING family that was serendipitously discovered from human plasma. An original approach was developed to determine ab initio the complete and exact sequence of this 38 kDa protein by utilizing mass spectrometry and X-ray data in tandem. Taking advantage of this first complete eukaryotic DING sequence, an immunohistochemistry study was undertaken to check the presence of DING proteins in various mouse tissues, revealing that these proteins are widely expressed. Finally, the structure of a bacterial representative from Pseudomonas fluorescens was solved at sub-angstrom resolution, allowing the molecular mechanism of the phosphate binding in these high-affinity proteins to be elucidated.

  12. Structural insights and ab initio sequencing within the DING proteins family

    Energy Technology Data Exchange (ETDEWEB)

    Elias, Mikael, E-mail: mikael.elias@weizmann.ac.il [Weizmann Institute of Science, Rehovot (Israel); Liebschner, Dorothee [CRM2, Nancy Université (France); Gotthard, Guillaume; Chabriere, Eric [AFMB, Université Aix-Marseille II (France)

    2011-01-01

    DING proteins constitute a recently discovered protein family that is ubiquitous in eukaryotes. The structural insights and the physiological involvements of these intriguing proteins are hereby deciphered. DING proteins constitute an intriguing family of phosphate-binding proteins that was identified in a wide range of organisms, from prokaryotes and archaea to eukaryotes. Despite their seemingly ubiquitous occurrence in eukaryotes, their encoding genes are missing from sequenced genomes. Such a lack has considerably hampered functional studies. In humans, these proteins have been related to several diseases, like atherosclerosis, kidney stones, inflammation processes and HIV inhibition. The human phosphate binding protein is a human representative of the DING family that was serendipitously discovered from human plasma. An original approach was developed to determine ab initio the complete and exact sequence of this 38 kDa protein by utilizing mass spectrometry and X-ray data in tandem. Taking advantage of this first complete eukaryotic DING sequence, an immunohistochemistry study was undertaken to check the presence of DING proteins in various mouse tissues, revealing that these proteins are widely expressed. Finally, the structure of a bacterial representative from Pseudomonas fluorescens was solved at sub-angstrom resolution, allowing the molecular mechanism of the phosphate binding in these high-affinity proteins to be elucidated.

  13. Genetic variation of coat protein gene among the isolates of Rice tungro spherical virus from tungro-endemic states of the India.

    Science.gov (United States)

    Mangrauthia, Satendra K; Malathi, P; Agarwal, Surekha; Ramkumar, G; Krishnaveni, D; Neeraja, C N; Madhav, M Sheshu; Ladhalakshmi, D; Balachandran, S M; Viraktamath, B C

    2012-06-01

    Rice tungro disease, one of the major constraints to rice production in South and Southeast Asia, is caused by a combination of two viruses: Rice tungro spherical virus (RTSV) and Rice tungro bacilliform virus (RTBV). The present study was undertaken to determine the genetic variation of the RTSV population present in tungro-endemic states of the Indian subcontinent. Phylogenetic analysis based on coat protein sequences showed distinct divergence of Indian RTSV isolates into two groups: one consisted of isolates from Hyderabad (Andhra Pradesh), Cuttack (Orissa), and Puducherry, and the other of isolates from West Bengal, Coimbatore (Tamil Nadu), and Kanyakumari (Tamil Nadu). The results obtained from the phylogenetic study were further supported by SNP (single nucleotide polymorphism), INDEL (insertion and deletion) and evolutionary distance analyses. In addition, the sequence difference count matrix revealed 2-68 nucleotide differences among all the Indian RTSV isolates included in this study. However, at the protein level these differences were not significant, as revealed by Ka/Ks ratio calculation. Sequence identity at the nucleotide and amino acid levels was 92-100% and 97-100%, respectively, among Indian isolates of RTSV. Understanding the population structure of RTSV from tungro-endemic regions of India would potentially provide insights into the molecular diversification of this virus.
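
    The difference-count matrix and percent-identity figures mentioned in this abstract can be illustrated with a minimal Python sketch. It assumes the sequences are already aligned to equal length and counts a gap character as a mismatch. The isolate names and sequences are made up for illustration and are not data from the study, and the function names are hypothetical.

```python
from itertools import combinations


def count_differences(seq_a: str, seq_b: str) -> int:
    """Count mismatched columns between two aligned, equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for a, b in zip(seq_a, seq_b) if a != b)


def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of alignment columns that are identical."""
    matches = len(seq_a) - count_differences(seq_a, seq_b)
    return 100.0 * matches / len(seq_a)


def difference_matrix(aligned: dict) -> dict:
    """Map each pair of names to (difference count, percent identity)."""
    return {
        (name_1, name_2): (count_differences(s1, s2), percent_identity(s1, s2))
        for (name_1, s1), (name_2, s2) in combinations(aligned.items(), 2)
    }


# Toy aligned nucleotide sequences (hypothetical, for illustration only).
TOY_ALIGNMENT = {
    "isolate_A": "ATGGCTAAGCTT",
    "isolate_B": "ATGGCTAAGCTA",
    "isolate_C": "ATGACTAAGGTA",
}

if __name__ == "__main__":
    for pair, (diffs, identity) in difference_matrix(TOY_ALIGNMENT).items():
        print(pair, diffs, f"{identity:.1f}%")
```

    The same functions work on aligned amino acid strings, which is how a nucleotide-level and a protein-level identity range could be compared side by side.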

  14. Solving Classification Problems for Large Sets of Protein Sequences with the Example of Hox and ParaHox Proteins

    Directory of Open Access Journals (Sweden)

    Stefanie D. Hueber

    2016-02-01

    Phylogenetic methods are key to providing models for how a given protein family evolved. However, these methods run into difficulties when sequence divergence is either too low or too high. Here, we provide a case study of Hox and ParaHox proteins so that additional insights can be gained using a new computational approach to help solve old classification problems. For two (Gsx and Cdx) out of three ParaHox proteins the assignments differ between the currently most established view and four alternative scenarios. We use a non-phylogenetic, pairwise-sequence-similarity-based method to assess which of the previous predictions, if any, are best supported by the sequence-similarity relationships between Hox and ParaHox proteins. The overall sequence-similarities show Gsx to be most similar to Hox2–3, and Cdx to be most similar to Hox4–8. The results indicate that a purely pairwise-sequence-similarity-based approach can provide additional information not only when phylogenetic inference methods have insufficient information to provide reliable classifications (as was shown previously for central Hox proteins), but also when the sequence variation is so high that the resulting phylogenetic reconstructions are likely plagued by long-branch-attraction artifacts.

  15. Sulfonate-terminated carbosilane dendron-coated nanotubes: a greener point of view in protein sample preparation.

    Science.gov (United States)

    González-García, Estefanía; Gutiérrez Ulloa, Carlos E; de la Mata, Francisco Javier; Marina, María Luisa; García, María Concepción

    2017-09-01

    Reduction or removal of solvents and reagents in protein sample preparation is a requirement. Dendrimers can strongly interact with proteins and have great potential as a greener alternative to conventional methods used in protein sample preparation. This work proposes the use of single-walled carbon nanotubes (SWCNTs) functionalized with carbosilane dendrons with sulfonate groups for protein sample preparation and shows the successful application of the proposed methodology to extract proteins from a complex matrix. SEM images of nanotubes and mixtures of nanotubes and proteins were taken. Moreover, intrinsic fluorescence intensity of proteins was monitored to observe the most significant interactions at increasing dendron generations under neutral and basic pHs. Different conditions for the disruption of interactions between proteins and nanotubes after protein extraction and different concentrations of the disrupting reagent and the nanotube were also tried. Compatibility of extraction and disrupting conditions with the enzymatic digestion of proteins for obtaining bioactive peptides was also studied. Finally, sulfonate-terminated carbosilane dendron-coated SWCNTs enabled the extraction of proteins from a complex sample without using non-environmentally friendly solvents that were required so far. Graphical Abstract Green protein extraction from a complex sample employing carbosilane dendron coated nanotubes.

  16. Protein Science by DNA Sequencing: How Advances in Molecular Biology Are Accelerating Biochemistry.

    Science.gov (United States)

    Higgins, Sean A; Savage, David F

    2018-01-09

    A fundamental goal of protein biochemistry is to determine the sequence-function relationship, but the vastness of sequence space makes comprehensive evaluation of this landscape difficult. However, advances in DNA synthesis and sequencing now allow researchers to assess the functional impact of every single mutation in many proteins, but challenges remain in library construction and the development of general assays applicable to a diverse range of protein functions. This Perspective briefly outlines the technical innovations in DNA manipulation that allow massively parallel protein biochemistry and then summarizes the methods currently available for library construction and the functional assays of protein variants. Areas in need of future innovation are highlighted with a particular focus on assay development and the use of computational analysis with machine learning to effectively traverse the sequence-function landscape. Finally, applications in the fundamentals of protein biochemistry, disease prediction, and protein engineering are presented.

  17. An algorithm to find all palindromic sequences in proteins

    Indian Academy of Sciences (India)

    2013-01-20

    Jan 20, 2013 ... 1976; Karrer and Gall 1976; Vogt and Braun 1976) and (iii) in the formation of hairpin loops in the newly transcribed RNA. Palindromic sequences are observed in various classes of proteins like histones (Cheng et al. 1989), prion proteins (Sulkowski 1992; Kazim 1993),. DNA-binding proteins (Suzuki 1992; ...
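
    As a rough illustration of the task named in the title, the Python sketch below lists every palindromic stretch in a protein sequence by expanding around each possible centre. It is not the algorithm from the paper, which the excerpt above does not describe. It assumes a palindrome is simply a run of residues that reads the same in both directions, and the function name, the `min_length` parameter, and the toy sequences are hypothetical.

```python
def find_palindromes(sequence: str, min_length: int = 3) -> list:
    """Return sorted (start, end, text) for each palindromic substring.

    start is 0-based and end is exclusive. Every centre (a residue or the
    gap between two residues) is expanded outward while the flanking
    residues match, so nested palindromes are all reported.
    """
    found = []
    n = len(sequence)
    for center in range(2 * n - 1):
        left = center // 2
        right = left + center % 2  # odd centres sit between two residues
        while left >= 0 and right < n and sequence[left] == sequence[right]:
            if right - left + 1 >= min_length:
                found.append((left, right + 1, sequence[left:right + 1]))
            left -= 1
            right += 1
    return sorted(found)


if __name__ == "__main__":
    # Toy sequences, not taken from any real protein.
    for seq in ("MKAAKLV", "RADAR"):
        print(seq, find_palindromes(seq))
```

    Expanding around centres takes quadratic time in the worst case, which is acceptable for single proteins; a linear-time method such as Manacher's algorithm would find only the longest palindrome per centre and would need extra work to enumerate all of them.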

  18. Dissolution, agglomerate morphology, and stability limits of protein-coated silver nanoparticles.

    Science.gov (United States)

    Martin, Matthew N; Allen, Andrew J; MacCuspie, Robert I; Hackley, Vincent A

    2014-09-30

    Little is understood regarding the impact that molecular coatings have on nanoparticle dissolution kinetics and agglomerate formation in a dilute nanoparticle dispersion. Dissolution and agglomeration processes compete in removing isolated nanoparticles from the dispersion, making quantitative time-dependent measurements of the mechanisms of nanoparticle loss particularly challenging. In this article, we present in situ ultra-small-angle X-ray scattering (USAXS) results, simultaneously quantifying dissolution, agglomeration, and stability limits of silver nanoparticles (AgNPs) coated with bovine serum albumin (BSA) protein. When the BSA corona is disrupted, we find that the loss of silver from the nanoparticle core is well matched by a second-order kinetic rate reaction, arising from the oxidative dissolution of silver. Dissolution and agglomeration are quantified, and morphological transitions throughout the process are qualified. By probing the BSA-AgNP suspension around its stability limits, we provide insight into the destabilization mechanism by which individual particles rapidly dissolve as a whole rather than undergo slow dissolution from the aqueous interface inward, once the BSA layer is breached. Because USAXS rapidly measures over the entire nanometer to micrometer size range during the dissolution process, many insights are also gained into the stabilization of NPs by protein and its ability to protect the labile metal core from the solution environment by prohibiting the diffusion of reactive species. This approach can be extended to a wide variety of coating molecules and reactive metal nanoparticle systems to carefully survey their stability limits, revealing the likely mechanisms of coating breakdown and ensuing reactions.

  19. 3D representations of amino acids—applications to protein sequence comparison and classification

    Directory of Open Access Journals (Sweden)

    Jie Li

    2014-08-01

    The amino acid sequence of a protein is the key to understanding its structure and ultimately its function in the cell. This paper addresses the fundamental issue of encoding amino acids in such a way that the representation of a protein sequence facilitates the decoding of its information content. We show that a feature-based representation in a three-dimensional (3D) space derived from amino acid substitution matrices provides an adequate representation that can be used for direct comparison of protein sequences based on geometry. We measure the performance of such a representation in the context of the protein structural fold prediction problem. We compare the results of classifying different sets of proteins belonging to distinct structural folds against classifications of the same proteins obtained from sequence alone or directly from structural information. We find that sequence alone performs poorly as a structure classifier. We show in contrast that the use of the three-dimensional representation of the sequences significantly improves the classification accuracy. We conclude with a discussion of the current limitations of such a representation and with a description of potential improvements.

  20. Sequence- and interactome-based prediction of viral protein hotspots targeting host proteins: a case study for HIV Nef.

    Directory of Open Access Journals (Sweden)

    Mahdi Sarmady

    Virus proteins alter protein pathways of the host toward the synthesis of viral particles by breaking and making edges via binding to host proteins. In this study, we developed a computational approach to predict viral sequence hotspots for binding to host proteins based on sequences of viral and host proteins and literature-curated virus-host protein interactome data. We use a motif discovery algorithm repeatedly on collections of sequences of viral proteins and immediate binding partners of their host targets and choose only those motifs that are conserved on viral sequences and highly statistically enriched among binding partners of virus protein targeted host proteins. Our results match experimental data on binding sites of Nef to host proteins such as MAPK1, VAV1, LCK, HCK, HLA-A, CD4, FYN, and GNB2L1 with high statistical significance, but the approach is a poor predictor of Nef binding sites on highly flexible, hoop-like regions. Predicted hotspots recapture CD8 cell epitopes of HIV Nef, highlighting their importance in modulating virus-host interactions. Host proteins potentially targeted or outcompeted by Nef appear to crowd the T cell receptor, natural killer cell mediated cytotoxicity, and neurotrophin signaling pathways. Scanning of HIV Nef motifs on multiple alignments of hepatitis C protein NS5A produces results consistent with literature, indicating the potential value of the hotspot discovery in advancing our understanding of virus-host crosstalk.

  1. Investigation of the effects of process sequence on the contact resistance characteristics of coated metallic bipolar plates for polymer electrolyte membrane fuel cells

    Science.gov (United States)

    Turan, Cabir; Cora, Ömer Necati; Koç, Muammer

    2013-12-01

    In this study, results of an investigation on the effects of manufacturing and coating process sequence on the interfacial contact resistance (ICR) of metallic bipolar plates (BPP) for polymer electrolyte membrane fuel cells (PEMFCs) are presented. Firstly, uncoated stainless steel 316L blanks were formed into BPP through hydroforming and stamping processes. Then, these formed BPP samples were coated with three different PVD coatings (CrN, TiN and ZrN) at three different thicknesses (0.1, 0.5 and 1 μm). Secondly, blanks of the same alloy were coated first with the same coatings, thickness and technique; then, they were formed into BPPs of the same shape and dimensions using the manufacturing methods as in the first group. Finally, these two groups of BPP samples were tested for their ICR to reveal the effect of process sequence. ICR tests were also conducted on the BPP plates both before and after exposure to corrosion to disclose the effect of corrosion on ICR. Coated-then-formed BPP samples exhibited similar or even better ICR performance than formed-then-coated BPP samples. Thus, manufacturing of coated blanks can be concluded to be more favorable and worth further investigation in quest of making cost effective BPPs for mass production of PEMFC.

  2. Sequence-based prediction of protein protein interaction using a deep-learning algorithm.

    Science.gov (United States)

    Sun, Tanlin; Zhou, Bo; Lai, Luhua; Pei, Jianfeng

    2017-05-25

    Protein-protein interactions (PPIs) are critical for many biological processes. It is therefore important to develop accurate high-throughput methods for identifying PPI to better understand protein function, disease occurrence, and therapy design. Though various computational methods for predicting PPI have been developed, their robustness for prediction with external datasets is unknown. Deep-learning algorithms have achieved successful results in diverse areas, but their effectiveness for PPI prediction has not been tested. We used a stacked autoencoder, a type of deep-learning algorithm, to study the sequence-based PPI prediction. The best model achieved an average accuracy of 97.19% with 10-fold cross-validation. The prediction accuracies for various external datasets ranged from 87.99% to 99.21%, which are superior to those achieved with previous methods. To our knowledge, this research is the first to apply a deep-learning algorithm to sequence-based PPI prediction, and the results demonstrate its potential in this field.

  3. Prediction of protein hydration sites from sequence by modular neural networks

    DEFF Research Database (Denmark)

    Ehrlich, L.; Reczko, M.; Bohr, Henrik

    1998-01-01

    The hydration properties of a protein are important determinants of its structure and function. Here, modular neural networks are employed to predict ordered hydration sites using protein sequence information. First, secondary structure and solvent accessibility are predicted from sequence with two separate neural networks. These predictions are used as input together with protein sequences for networks predicting hydration of residues, backbone atoms and sidechains. These networks are trained with protein crystal structures. The prediction of hydration is improved by adding information on secondary structure and solvent accessibility and, using actual values of these properties, residue hydration can be predicted to 77% accuracy with a Matthews coefficient of 0.43. However, predicted property data with an accuracy of 60-70% result in less than half the improvement in predictive performance observed...
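
    This abstract reports both an accuracy and a Matthews coefficient for the same predictor. The Python sketch below shows how the two measures are computed from a binary confusion matrix and why they can diverge when one class dominates. The counts are invented for illustration and are not the study's data; the function names are hypothetical.

```python
import math


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def matthews_coefficient(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient, ranging from -1 to +1.

    Returns 0.0 when any marginal total is zero, a common convention
    for the otherwise undefined case.
    """
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator


if __name__ == "__main__":
    # Made-up counts: a balanced case and an imbalanced case.
    balanced = dict(tp=40, tn=40, fp=10, fn=10)
    imbalanced = dict(tp=5, tn=85, fp=5, fn=5)
    for label, counts in (("balanced", balanced), ("imbalanced", imbalanced)):
        print(label, accuracy(**counts), matthews_coefficient(**counts))
```

    In the imbalanced toy case the accuracy is higher than in the balanced one while the Matthews coefficient is lower, which is why the two numbers are usually quoted together.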

  4. Protein Function Prediction Based on Sequence and Structure Information

    KAUST Repository

    Smaili, Fatima Z.

    2016-01-01

    operate. In this master thesis project, we worked on inferring protein functions based on the primary protein sequence. In the approach we follow, 3D models are first constructed using I-TASSER. Functions are then deduced by structurally matching

  5. Humidity control and hydrophilic glue coating applied to mounted protein crystals improves X-ray diffraction experiments

    International Nuclear Information System (INIS)

    Baba, Seiki; Hoshino, Takeshi; Ito, Len; Kumasaka, Takashi

    2013-01-01

    A new crystal-mounting method has been developed that involves a combination of controlled humid air and polymer glue for crystal coating. This method is particularly useful when applied to fragile protein crystals that are known to be sensitive to subtle changes in their physicochemical environment. Protein crystals are fragile, and it is sometimes difficult to find conditions suitable for handling and cryocooling the crystals before conducting X-ray diffraction experiments. To overcome this issue, a protein crystal-mounting method has been developed that involves a water-soluble polymer and controlled humid air that can adjust the moisture content of a mounted crystal. By coating crystals with polymer glue and exposing them to controlled humid air, the crystals were stable at room temperature and were cryocooled under optimized humidity. Moreover, the glue-coated crystals reproducibly showed gradual transformations of their lattice constants in response to a change in humidity; thus, using this method, a series of isomorphous crystals can be prepared. This technique is valuable when working on fragile protein crystals, including membrane proteins, and will also be useful for multi-crystal data collection

  6. Humidity control and hydrophilic glue coating applied to mounted protein crystals improves X-ray diffraction experiments

    Energy Technology Data Exchange (ETDEWEB)

    Baba, Seiki; Hoshino, Takeshi; Ito, Len; Kumasaka, Takashi, E-mail: kumasaka@spring8.or.jp [Japan Synchrotron Radiation Research Institute (JASRI/SPring-8), 1-1-1 Kouto, Sayo, Hyogo 679-5198 (Japan)

    2013-09-01

    A new crystal-mounting method has been developed that involves a combination of controlled humid air and polymer glue for crystal coating. This method is particularly useful when applied to fragile protein crystals that are known to be sensitive to subtle changes in their physicochemical environment. Protein crystals are fragile, and it is sometimes difficult to find conditions suitable for handling and cryocooling the crystals before conducting X-ray diffraction experiments. To overcome this issue, a protein crystal-mounting method has been developed that involves a water-soluble polymer and controlled humid air that can adjust the moisture content of a mounted crystal. By coating crystals with polymer glue and exposing them to controlled humid air, the crystals were stable at room temperature and were cryocooled under optimized humidity. Moreover, the glue-coated crystals reproducibly showed gradual transformations of their lattice constants in response to a change in humidity; thus, using this method, a series of isomorphous crystals can be prepared. This technique is valuable when working on fragile protein crystals, including membrane proteins, and will also be useful for multi-crystal data collection.

  7. Sequencing and Analysis of the Pseudomonas fluorescens GcM5-1A Genome: A Pathogen Living in the Surface Coat of Bursaphelenchus xylophilus.

    Directory of Open Access Journals (Sweden)

    Kai Feng

    It is known that several bacteria are adherent to the surface coat of pine wood nematode (Bursaphelenchus xylophilus), but their function and role in the pathogenesis of pine wilt disease remain debatable. The Pseudomonas fluorescens GcM5-1A is a bacterium isolated from the surface coat of pine wood nematodes. In previous studies, GcM5-1A was linked to the pathogenicity of pine wilt disease. In this study, we report the de novo sequencing of the GcM5-1A genome. A 600-Mb collection of high-quality reads was obtained and assembled into sequence contigs spanning a 6.01-Mb length. Sequence annotation predicted 5,413 open reading frames, of which 2,988 were homologous to genes in the other four sequenced P. fluorescens isolates (SBW25, WH6, Pf0-1 and Pf-5) and 1,137 were unique to GcM5-1A. Phylogenetic studies and genome comparison revealed that GcM5-1A is more closely related to SBW25 and WH6 isolates than to Pf0-1 and Pf-5 isolates. Towards study of pathogenesis, we identified 79 candidate virulence factors in the genome of GcM5-1A, including the Alg, Fl, Waa gene families, and genes encoding the major pathogenic protein fliC. In addition, genes for a complete T3SS system were identified in the genome of GcM5-1A. Such systems have proved to play a critical role in subverting and colonizing the host organisms of many gram-negative pathogenic bacteria. Although the functions of the candidate virulence factors have yet to be deciphered experimentally, the availability of this genome provides a basic platform to obtain informative clues to be addressed in future studies by the pine wilt disease research community.

  8. In Silico Characterization of Pectate Lyase Protein Sequences from Different Source Organisms

    Directory of Open Access Journals (Sweden)

    Amit Kumar Dubey

    2010-01-01

    A total of 121 protein sequences of pectate lyases were subjected to homology search, multiple sequence alignment, phylogenetic tree construction, and motif analysis. The phylogenetic tree constructed revealed different clusters based on different source organisms representing bacterial, fungal, plant, and nematode pectate lyases. The multiple accessions of bacterial, fungal, nematode, and plant pectate lyase protein sequences were placed closely revealing a sequence level similarity. The multiple sequence alignment of these pectate lyase protein sequences from different source organisms showed conserved regions at different stretches with maximum homology from amino acid residues 439–467, 715–816, and 829–910 which could be used for designing degenerate primers or probes specific for pectate lyases. The motif analysis revealed a conserved Pec_Lyase_C domain uniformly observed in all pectate lyases irrespective of variable sources suggesting its possible role in structural and enzymatic functions.

  9. Sequence alignment reveals possible MAPK docking motifs on HIV proteins.

    Directory of Open Access Journals (Sweden)

    Perry Evans

    Over the course of HIV infection, virus replication is facilitated by the phosphorylation of HIV proteins by human ERK1 and ERK2 mitogen-activated protein kinases (MAPKs). MAPKs are known to phosphorylate their substrates by first binding with them at a docking site. Docking site interactions could be viable drug targets because the sequences guiding them are more specific than phosphorylation consensus sites. In this study we use multiple bioinformatics tools to discover candidate MAPK docking site motifs on HIV proteins known to be phosphorylated by MAPKs, and we discuss the possibility of targeting docking sites with drugs. Using sequence alignments of HIV proteins of different subtypes, we show that MAPK docking patterns previously described for human proteins appear on the HIV matrix, Tat, and Vif proteins in a strain-dependent manner, but are absent from HIV Rev and appear on all HIV Nef strains. We revise the regular expressions of previously annotated MAPK docking patterns in order to provide a subtype-independent motif that annotates all HIV proteins. One revision is based on a documented human variant of one of the substrate docking motifs, and the other reduces the number of required basic amino acids in the standard docking motifs from two to one. The proposed patterns are shown to be consistent with in silico docking between ERK1 and the HIV matrix protein. The motif usage on HIV proteins is sufficiently different from human proteins in amino acid sequence similarity to allow for HIV-specific targeting using small-molecule drugs.
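
    The idea of relaxing a docking-motif regular expression from two required basic residues to one can be sketched in Python as follows. The two patterns are simplified assumptions with the general shape of a basic cluster, a short spacer, and a hydrophobic-X-hydrophobic element; they are not the expressions used or revised in the study, and the sequences are invented toy strings rather than HIV or human protein fragments.

```python
import re

# Hypothetical, simplified patterns written for illustration only.
# A lookahead with a capture group lets overlapping matches be reported.
STRICT_MOTIF = re.compile(r"(?=([KR]{2}.{1,6}[LIV].[LIV]))")   # two basic residues
RELAXED_MOTIF = re.compile(r"(?=([KR].{1,6}[LIV].[LIV]))")     # one basic residue


def scan(sequence: str, pattern: re.Pattern) -> list:
    """Return (0-based start, matched text) for every position that matches."""
    return [(m.start(), m.group(1)) for m in pattern.finditer(sequence)]


# Toy sequences (made up): one with a basic pair, one with a single basic residue.
TWO_BASIC = "AAKRAGLTLAA"
ONE_BASIC = "AAKAGLTLAA"

if __name__ == "__main__":
    for name, seq in (("two_basic", TWO_BASIC), ("one_basic", ONE_BASIC)):
        print(name, "strict:", scan(seq, STRICT_MOTIF))
        print(name, "relaxed:", scan(seq, RELAXED_MOTIF))
```

    The relaxed pattern matches everything the strict one does and more, so in practice a looser expression trades specificity for coverage and its hits need independent support, for example from structural docking.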

  10. Role of Charge Regulation and Size Polydispersity in Nanoparticle Encapsulation by Viral Coat Proteins

    NARCIS (Netherlands)

    Kusters, Remy; Lin, Hsiang-Ku; Zandi, Roya; Tsvetkova, Irina; Dragnea, Bogdan; van der Schoot, Paul

    2015-01-01

    Nanoparticles can be encapsulated by virus coat proteins if their surfaces are functionalized to acquire a sufficiently large negative charge. A minimal surface charge is required to overcome (i) repulsive interactions between the positively charged RNA-binding domains on the proteins and (ii) the

  11. Feature Selection and the Class Imbalance Problem in Predicting Protein Function from Sequence

    NARCIS (Netherlands)

    Al-Shahib, A.; Breitling, R.; Gilbert, D.

    2005-01-01

    Abstract: When the standard approach to predict protein function by sequence homology fails, other alternative methods can be used that require only the amino acid sequence for predicting function. One such approach uses machine learning to predict protein function directly from amino acid sequence

  12. Effect of Protein-Based Edible Coating from Red Snapper (Lutjanus sp.) Surimi on Cooked Shrimp

    Science.gov (United States)

    Rostini, I.; Ibrahim, B.; Trilaksani, W.

    2018-02-01

    Surimi can be used as a raw material for a protein-based edible coating that protects the colour of cooked shrimp. The purpose of this study was to determine the level of consumer preference for cooked shrimp coated with a surimi edible coating made from red snapper and to examine the edible coating layer on cooked shrimp microscopically. The surimi edible coating was prepared without and with added sappan wood (Caesalpinia sappan Linn) extract. The coating was applied to cooked shrimp by two methods: (1) boiled then coated and (2) coated then boiled. The edible coating was made from surimi at concentrations of 2, 6, 10 and 14% in distilled water. The analyses were done using a hedonic test and microscopic observation with microscope photographs. The hedonic and colour test results showed that the 14% surimi coating with added sappan wood (Caesalpinia sappan Linn) extract was the most preferred by panellists and gave the highest shrimp colour. The application method that gave the best result was boiling followed by coating.

  13. Coat Protein Mutations That Alter the Flux of Morphogenetic Intermediates through the ϕX174 Early Assembly Pathway.

    Science.gov (United States)

    Blackburn, Brody J; Li, Shuaizhi; Roznowski, Aaron P; Perez, Alexis R; Villarreal, Rodrigo H; Johnson, Curtis J; Hardy, Margaret; Tuckerman, Edward C; Burch, April D; Fane, Bentley A

    2017-12-15

    Two scaffolding proteins orchestrate ϕX174 morphogenesis. The internal scaffolding protein B mediates the formation of pentameric assembly intermediates, whereas the external scaffolding protein D organizes 12 of these intermediates into procapsids. Aromatic amino acid side chains mediate most coat-internal scaffolding protein interactions. One residue in the internal scaffolding protein and three in the coat protein constitute the core of the B protein binding cleft. The three coat gene codons were randomized separately to ascertain the chemical requirements of the encoded amino acids and the morphogenetic consequences of mutation. The resulting mutants exhibited a wide range of recessive phenotypes, which could generally be explained within a structural context. Mutants with phenylalanine, tyrosine, and methionine substitutions were phenotypically indistinguishable from the wild type. However, tryptophan substitutions were detrimental at two sites. Charged residues were poorly tolerated, conferring extreme temperature-sensitive and lethal phenotypes. Eighteen lethal and conditional lethal mutants were genetically and biochemically characterized. The primary defect associated with the missense substitutions ranged from inefficient internal scaffolding protein B binding to faulty procapsid elongation reactions mediated by external scaffolding protein D. Elevating B protein concentrations above wild-type levels via exogenous, cloned-gene expression compensated for inefficient B protein binding, as did suppressing mutations within gene B. Similarly, elevating D protein concentrations above wild-type levels or compensatory mutations within gene D suppressed faulty elongation. Some of the parental mutations were pleiotropic, affecting multiple morphogenetic reactions. This progressively reduced the flux of intermediates through the pathway. Accordingly, multiple mechanisms, which may be unrelated, could restore viability. IMPORTANCE Genetic analyses have been

  14. Protein sequencing via nanopore based devices: a nanofluidics perspective

    Science.gov (United States)

    Chinappi, Mauro; Cecconi, Fabio

    2018-05-01

    Proteins perform a huge number of central functions in living organisms, thus all the new techniques allowing their precise, fast and accurate characterization at the single-molecule level certainly represent a burst in proteomics with important biomedical impact. In this review, we describe the recent progress in the development of nanopore-based devices for protein sequencing. We start with a critical analysis of the main technical requirements for nanopore protein sequencing, summarizing some ideas and methodologies that have recently appeared in the literature. In the last sections, we focus on the physical modelling of the transport phenomena occurring in nanopore-based devices. The multiscale nature of the problem is discussed and, in this respect, some of the main possible computational approaches are illustrated.

  15. Multiple functional roles of the accessory I-domain of bacteriophage P22 coat protein revealed by NMR structure and CryoEM modeling.

    Science.gov (United States)

    Rizzo, Alessandro A; Suhanovsky, Margaret M; Baker, Matthew L; Fraser, LaTasha C R; Jones, Lisa M; Rempel, Don L; Gross, Michael L; Chiu, Wah; Alexandrescu, Andrei T; Teschke, Carolyn M

    2014-06-10

    Some capsid proteins built on the ubiquitous HK97-fold have accessory domains imparting specific functions. Bacteriophage P22 coat protein has a unique insertion domain (I-domain). Two prior I-domain models from subnanometer cryoelectron microscopy (cryoEM) reconstructions differed substantially. Therefore, the I-domain's nuclear magnetic resonance structure was determined and also used to improve cryoEM models of coat protein. The I-domain has an antiparallel six-stranded β-barrel fold, not previously observed in HK97-fold accessory domains. The D-loop, which is dynamic in the isolated I-domain and intact monomeric coat protein, forms stabilizing salt bridges between adjacent capsomers in procapsids. The S-loop is important for capsid size determination, likely through intrasubunit interactions. Ten of 18 coat protein temperature-sensitive-folding substitutions are in the I-domain, indicating its importance in folding and stability. Several are found on a positively charged face of the β-barrel that anchors the I-domain to a negatively charged surface of the coat protein HK97-core. Copyright © 2014 Elsevier Ltd. All rights reserved.

  16. UET: a database of evolutionarily-predicted functional determinants of protein sequences that cluster as functional sites in protein structures.

    Science.gov (United States)

    Lua, Rhonald C; Wilson, Stephen J; Konecki, Daniel M; Wilkins, Angela D; Venner, Eric; Morgan, Daniel H; Lichtarge, Olivier

    2016-01-04

    The structure and function of proteins underlie most aspects of biology and their mutational perturbations often cause disease. To identify the molecular determinants of function as well as targets for drugs, it is central to characterize the important residues and how they cluster to form functional sites. The Evolutionary Trace (ET) achieves this by ranking the functional and structural importance of the protein sequence positions. ET uses evolutionary distances to estimate functional distances and correlates genotype variations with those in the fitness phenotype. Thus, ET ranks are worse for sequence positions that vary among evolutionarily closer homologs but better for positions that vary mostly among distant homologs. This approach identifies functional determinants, predicts function, guides the mutational redesign of functional and allosteric specificity, and interprets the action of coding sequence variations in proteins, people and populations. Now, the UET database offers pre-computed ET analyses for the protein structure databank, and on-the-fly analysis of any protein sequence. A web interface retrieves ET rankings of sequence positions and maps results to a structure to identify functionally important regions. This UET database integrates several ways of viewing the results on the protein sequence or structure and can be found at http://mammoth.bcm.tmc.edu/uet/. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

  17. Expression of tomato yellow leaf curl virus coat protein using baculovirus expression system and evaluation of its utility as a viral antigen.

    Science.gov (United States)

    Elgaied, Lamiaa; Salem, Reda; Elmenofy, Wael

    2017-08-01

    DNA encoding the coat protein (CP) of an Egyptian isolate of tomato yellow leaf curl virus (TYLCV) was inserted into the genome of Autographa californica nucleopolyhedrovirus (AcNPV) under the control of the polyhedrin promoter. The generated recombinant baculovirus construct harboring the coat protein gene was characterized using PCR analysis. The recombinant coat protein expressed in infected insect cells was used as a coating antigen in an indirect enzyme-linked immunosorbent assay (ELISA) and dot blot to test its utility for the detection of antibody generated against TYLCV virus particles. The results of ELISA and dot blot showed that TYLCV antibodies reacted positively, with strong signals, with extracts of cells infected with the recombinant virus used as a coating antigen, as well as with TYLCV-infected tomato and beet plant extracts used as positive samples. Scanning electron microscope examination showed that the expressed TYLCV coat protein self-assembled into virus-like particles (VLPs) similar in size and morphology to TYLCV virus particles. These results indicate that the TYLCV coat protein expressed using the baculovirus vector system is a reliable candidate for generation of anti-CP antibody for inexpensive detection of TYLCV-infected plants using indirect CP-ELISA or dot blot with high specificity.

  18. Small proteins link coat and cortex assembly during sporulation in Bacillus subtilis

    Science.gov (United States)

    Ebmeier, Sarah E.; Tan, Irene S.; Clapham, Katie Rose; Ramamurthi, Kumaran S.

    2015-01-01

    Summary Mature spores of the bacterium Bacillus subtilis are encased by two concentric shells: an inner shell (the ‘cortex’), made of peptidoglycan; and an outer proteinaceous shell (the ‘coat’), whose basement layer is anchored to the surface of the developing spore via a 26-amino-acid-long protein called SpoVM. During sporulation, initiation of cortex assembly depends on the successful initiation of coat assembly, but the mechanisms that co-ordinate the morphogenesis of both structures are largely unknown. Here, we describe a sporulation pathway involving SpoVM and a 37-amino-acid-long protein named ‘CmpA’ that is encoded by a previously un-annotated gene and is expressed under control of two sporulation-specific transcription factors (σE and SpoIIID). CmpA localized to the surface of the developing spore and deletion of cmpA resulted in cells progressing through the sporulation programme more quickly. Overproduction of CmpA did not affect normal growth or cell division, but delayed entry into sporulation and abrogated cortex assembly. In those cells that had successfully initiated coat assembly, CmpA was removed by a posttranslational mechanism, presumably in order to overcome the sporulation inhibition it imposed. We propose a model in which CmpA participates in a developmental checkpoint that ensures the proper orchestration of coat and cortex morphogenesis by repressing cortex assembly until coat assembly successfully initiates. PMID:22463703

  19. Expression and the antigenicity of recombinant coat proteins of tungro viruses expressed in Escherichia coli.

    Science.gov (United States)

    Yee, Siew Fung; Chu, Chia Huay; Poili, Evenni; Sum, Magdline Sia Henry

    2017-02-01

    Rice tungro disease (RTD) is a recurring disease affecting rice farming, especially in South and Southeast Asia. The disease is commonly diagnosed by visual observation of the symptoms on diseased plants in paddy fields and by polymerase chain reaction (PCR). However, visual observation is unreliable and PCR can be costly. High-throughput and relatively cheap detection methods are important in RTD management for screening large numbers of samples. For this reason, detection by serological assays such as immunoblotting assays and enzyme-linked immunosorbent assay is preferred. However, these serological assays are limited by the lack of a continuous supply of antibodies as reagents, owing to the difficulty in preparing sufficient purified virions as antigens. This study aimed to generate and evaluate the reactivity of the recombinant coat proteins of Rice tungro bacilliform virus (RTBV) and Rice tungro spherical virus (RTSV) as alternative antigens to generate antibodies. The genes encoding the coat proteins of both viruses, RTBV (CP) and RTSV (CP1, CP2 and CP3), were cloned and expressed as recombinant fusion proteins in Escherichia coli. All of the recombinant fusion proteins, with the exception of the recombinant fusion protein of the CP2 of RTSV, were reactive against our in-house anti-tungro rabbit serum. In conclusion, our study showed the potential use of the recombinant fusion coat proteins of the tungro viruses as alternative antigens for production of antibodies for diagnostic purposes. Copyright © 2016 Elsevier B.V. All rights reserved.

  20. Surface properties of nanocrystalline TiO2 coatings in relation to the in vitro plasma protein adsorption

    International Nuclear Information System (INIS)

    Lorenzetti, M; Kobe, S; Novak, S; Bernardini, G; Santucci, A; Luxbacher, T

    2015-01-01

    This study reports on the selective adsorption of whole plasma proteins on hydrothermally (HT) grown TiO2-anatase coatings and its dependence on the three main surface properties: surface charge, wettability and roughness. The influence of the photo-activation of TiO2 by UV irradiation was also evaluated. Even though the protein adhesion onto Ti-based substrates was only moderate, better adsorption of any protein (at pH = 7.4) occurred for the most negatively charged and hydrophobic substrate (Ti non-treated) and for the most nanorough and hydrophilic surface (HT Ti3), indicating that the mutual action of the surface characteristics is responsible for the attraction and adhesion of the proteins. The HT coatings showed a higher adsorption of certain proteins (albumin ‘passivation’ layer, apolipoproteins, vitamin D-binding protein, ceruloplasmin, α-2-HS-glycoprotein) and higher ratios of albumin to fibrinogen and albumin to immunoglobulin γ-chains. The UV pre-irradiation affected the surface properties and strongly reduced the adsorption of the proteins. These results provide in-depth knowledge about the characterization of nanocrystalline TiO2 coatings for body implants and provide a basis for future studies on the hemocompatibility and biocompatibility of such surfaces. (paper)

  1. Impact of surface coating and food-mimicking media on nanosilver-protein interaction

    Energy Technology Data Exchange (ETDEWEB)

    Burcza, Anna, E-mail: anna.burcza@mri.bund.de; Gräf, Volker; Walz, Elke; Greiner, Ralf [Max Rubner-Institute, Department of Food Technology and Bioprocess Engineering (Germany)

    2015-11-15

    The application of silver nanoparticles (AgNPs) in food contact materials has recently become a subject of dispute due to the possible migration of silver in nanoform into foods and beverages. Therefore, the analysis of the interaction of AgNPs with food components, especially proteins, is of high importance in order to increase our knowledge of the behavior of nanoparticles in food matrices. AgPURE™ W10 (20 nm), an industrially applied nanomaterial, was compared with AgNPs of similar size frequently investigated for scientific purposes differing in the surface capping agent (spherical AgNP coated with either PVP or citrate). The interactions of the AgNPs with whey proteins (BSA, α-lactalbumin and β-lactoglobulin) at different pH values (4.2, 7 or 7.4) were investigated using surface plasmon resonance, SDS-PAGE, and asymmetric flow field-flow fractionation. The data obtained by the three different methods correlated well. Besides the nature of the protein and the nanoparticle coating, the environment was shown to affect the interaction significantly. The strongest interaction was obtained with BSA and AgNPs in an acidic environment. Neutral and slightly alkaline conditions however, seemed to prevent the AgNP-protein interaction almost completely. Furthermore, the interaction of whey proteins with AgPURE™ W10 was found to be weaker compared to the interaction with the other two AgNPs under all conditions investigated.

  2. Impact of surface coating and food-mimicking media on nanosilver-protein interaction

    International Nuclear Information System (INIS)

    Burcza, Anna; Gräf, Volker; Walz, Elke; Greiner, Ralf

    2015-01-01

    The application of silver nanoparticles (AgNPs) in food contact materials has recently become a subject of dispute due to the possible migration of silver in nanoform into foods and beverages. Therefore, the analysis of the interaction of AgNPs with food components, especially proteins, is of high importance in order to increase our knowledge of the behavior of nanoparticles in food matrices. AgPURE™ W10 (20 nm), an industrially applied nanomaterial, was compared with AgNPs of similar size frequently investigated for scientific purposes differing in the surface capping agent (spherical AgNP coated with either PVP or citrate). The interactions of the AgNPs with whey proteins (BSA, α-lactalbumin and β-lactoglobulin) at different pH values (4.2, 7 or 7.4) were investigated using surface plasmon resonance, SDS-PAGE, and asymmetric flow field-flow fractionation. The data obtained by the three different methods correlated well. Besides the nature of the protein and the nanoparticle coating, the environment was shown to affect the interaction significantly. The strongest interaction was obtained with BSA and AgNPs in an acidic environment. Neutral and slightly alkaline conditions however, seemed to prevent the AgNP-protein interaction almost completely. Furthermore, the interaction of whey proteins with AgPURE™ W10 was found to be weaker compared to the interaction with the other two AgNPs under all conditions investigated

  3. Polyclonal antibodies against the recombinantly expressed coat protein of the Citrus psorosis virus

    Directory of Open Access Journals (Sweden)

    Reda Salem

    2018-05-01

    Psorosis is a damaging disease of citrus that is widespread in many parts of the world. Citrus psorosis virus (CPsV), the type species of the genus Ophiovirus, is the putative causal agent of psorosis. Detection of CPsV by laboratory methods, serology in particular, is a primary requirement for large-scale surveys, but antibody production has been impaired by the difficulty of obtaining sufficient clean antigen for immunization. Specific polyclonal antibodies (PAbs) against the coat protein (CP) were produced using a recombinant DNA approach with expression in E. coli. The full-length CP gene fragment was amplified by RT-PCR using total RNA extracted from CPsV-infected citrus leaves and CP-specific primers. The obtained product (1320 bp) was cloned, sequenced and sub-cloned into the pET-30(+) expression vector. Expression was induced and screened in different bacterial clones by the presence of the expressed protein (48 kDa) and optimized in one clone. Expressed CP was purified using batch chromatography under denaturing conditions. Specificity of the expressed protein was demonstrated by ELISA before it was used as an antigen for raising PAbs in mice. Specificity of the raised PAbs to CPsV was verified by ELISA and western blotting. The raised PAbs were highly effective in ELISA screening compared with the commercial antibodies purchased from Agritest, Valenzano, Italy. The expression of the CPsV CP gene in E. coli, the production of PAbs using the recombinant protein as an antigen, and the demonstration of the suitability of these antibodies for use in immunodiagnostics against the CPsV Egyptian isolate have been accomplished in this work. Keywords: CPsV, CP, PAbs, RT-PCR, ELISA, Western blotting
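
    The 1320 bp product and the 48 kDa expressed protein reported above can be cross-checked with a quick back-of-the-envelope calculation. The sketch below is illustrative only: it assumes the common rule of thumb of about 110 Da per amino acid residue and assumes that the 1320 bp include a stop codon, neither of which is stated in the abstract. The function name and constant are hypothetical, and any tag contributed by the expression vector is ignored.

```python
# Rule-of-thumb average mass of one amino acid residue in a protein.
# This is an assumption for illustration, not a value from the abstract.
AVG_RESIDUE_MASS_DA = 110.0


def estimate_protein_mass_kda(orf_length_bp, includes_stop_codon=True):
    """Estimate protein mass (kDa) from the length of its coding sequence."""
    if orf_length_bp <= 0 or orf_length_bp % 3 != 0:
        raise ValueError("coding sequence length must be a positive multiple of 3")
    codons = orf_length_bp // 3
    # The stop codon does not encode a residue.
    residues = codons - 1 if includes_stop_codon else codons
    return residues * AVG_RESIDUE_MASS_DA / 1000.0


if __name__ == "__main__":
    # 1320 bp -> 440 codons -> 439 residues -> roughly 48 kDa
    print(f"{estimate_protein_mass_kda(1320):.1f} kDa")
```

    Under these assumptions the estimate is close to the reported 48 kDa band; a real molecular weight would be computed from the actual amino acid sequence.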

  4. An isoform of eukaryotic initiation factor 4E from Chrysanthemum morifolium interacts with Chrysanthemum virus B coat protein.

    Directory of Open Access Journals (Sweden)

    Aiping Song

    BACKGROUND: Eukaryotic translation initiation factor 4E (eIF4E) plays an important role in plant virus infection as well as the regulation of gene translation. METHODOLOGY/PRINCIPAL FINDINGS: Here, we describe the isolation of a cDNA encoding CmeIF(iso)4E (GenBank accession no. JQ904592), an isoform of eIF4E from chrysanthemum, using RACE PCR. We used the CmeIF(iso)4E cDNA for expression profiling and to analyze the interaction between CmeIF(iso)4E and the Chrysanthemum virus B coat protein (CVBCP). Multiple sequence alignment and phylogenetic tree analysis showed that the sequence similarity of CmeIF(iso)4E with other reported plant eIF(iso)4E sequences varied between 69.12% and 89.18%, indicating that CmeIF(iso)4E belongs to the eIF(iso)4E subfamily of the eIF4E family. CmeIF(iso)4E was present in all chrysanthemum organs, but was particularly abundant in the roots and flowers. Confocal microscopy showed that a transiently transfected CmeIF(iso)4E-GFP fusion protein was distributed throughout the whole cell in onion epidermis cells. A yeast two-hybrid assay showed that CVBCP interacted with CmeIF(iso)4E but not with CmeIF4E. A BiFC assay further demonstrated the interaction between CmeIF(iso)4E and CVBCP. A luminescence assay showed that CVBCP increased the RLU of Luc-CVB, suggesting that CVBCP might participate in the translation of viral proteins. CONCLUSIONS/SIGNIFICANCE: These results suggest that CmeIF(iso)4E, as the cap-binding subunit of eIF(iso)4F, may be involved in Chrysanthemum virus B infection in chrysanthemum through its interaction with CVBCP.

  5. CMsearch: simultaneous exploration of protein sequence space and structure space improves not only protein homology detection but also protein structure prediction

    KAUST Repository

    Cui, Xuefeng; Lu, Zhiwu; Wang, Sheng; Jing-Yan Wang, Jim; Gao, Xin

    2016-01-01

    Motivation: Protein homology detection, a fundamental problem in computational biology, is an indispensable step toward predicting protein structures and understanding protein functions. Despite the advances in recent decades on sequence alignment

  6. Poly(oligoethylene glycol methacrylate) dip-coating: turning cellulose paper into a protein-repellent platform for biosensors.

    Science.gov (United States)

    Deng, Xudong; Smeets, Niels M B; Sicard, Clémence; Wang, Jingyun; Brennan, John D; Filipe, Carlos D M; Hoare, Todd

    2014-09-17

    The passivation of nonspecific protein adsorption to paper is a major barrier to the use of paper as a platform for microfluidic bioassays. Herein we describe a simple, scalable protocol based on adsorption and cross-linking of poly(oligoethylene glycol methacrylate) (POEGMA) derivatives that reduces nonspecific adsorption of a range of proteins to filter paper by at least 1 order of magnitude without significantly changing the fiber morphology or paper macroporosity. A lateral-flow test strip coated with POEGMA facilitates effective protein transport while also confining the colorimetric reporting signal for easier detection, giving improved performance relative to bovine serum albumin (BSA)-blocked paper. Enzyme-linked immunosorbent assays based on POEGMA-coated paper also achieve lower blank values, higher sensitivities, and lower detection limits relative to ones based on paper blocked with BSA or skim milk. We anticipate that POEGMA-coated paper can function as a platform for the design of portable, disposable, and low-cost paper-based biosensors.

  7. Blocking of bacterial biofilm formation by a fish protein coating

    DEFF Research Database (Denmark)

    Vejborg, Rebecca Munk; Klemm, Per

    2008-01-01

    Bacterial biofilm formation on inert surfaces is a significant health and economic problem in a wide range of environmental, industrial, and medical areas. Bacterial adhesion is generally a prerequisite for this colonization process and, thus, represents an attractive target for the development......, this proteinaceous coating is characterized with regards to its biofilm-reducing properties by using a range of urinary tract infectious isolates with various pathogenic and adhesive properties. The antiadhesive coating significantly reduced or delayed biofilm formation by all these isolates under every condition...... examined. The biofilm-reducing activity did, however, vary depending on the substratum physicochemical characteristics and the environmental conditions studied. These data illustrate the importance of protein conditioning layers with respect to bacterial biofilm formation and suggest that antiadhesive...

  8. Improving pairwise comparison of protein sequences with domain co-occurrence

    Science.gov (United States)

    Gascuel, Olivier

    2018-01-01

    Comparing and aligning protein sequences is an essential task in bioinformatics. More specifically, local alignment tools like BLAST are widely used for identifying conserved protein sub-sequences, which likely correspond to protein domains or functional motifs. However, to limit the number of false positives, these tools are used with stringent sequence-similarity thresholds and hence can miss several hits, especially for species that are phylogenetically distant from reference organisms. A solution to this problem is to integrate additional contextual information into the procedure. Here, we propose to use domain co-occurrence to increase the sensitivity of pairwise sequence comparisons. Domain co-occurrence is a strong feature of proteins, since most protein domains tend to appear with a limited number of other domains on the same protein. We propose a method to take this information into account in a typical BLAST analysis and to construct new domain families on the basis of these results. We used Plasmodium falciparum as a case study to evaluate our method. The experimental findings showed a 14% increase in the number of significant BLAST hits and a 25% increase in the proteome area that can be covered with a domain. Our method identified 2240 new domains for which, in most cases, no model of the Pfam database could be linked. Moreover, our study of the quality of the new domains in terms of alignment and physicochemical properties shows that these properties are close to those of standard Pfam domains. Source code of the proposed approach and supplementary data are available at: https://gite.lirmm.fr/menichelli/pairwise-comparison-with-cooccurrence PMID:29293498

  9. Polyclonal Antibodies to a Recombinant Coat Protein of Potato Virus A

    Czech Academy of Sciences Publication Activity Database

    Čeřovská, Noemi; Moravec, Tomáš; Velemínský, Jiří

    2002-01-01

    Vol. 46 (2002), pp. 147-151. ISSN 0001-723X. R&D Projects: GA ČR GA310/00/0381. Institutional research plan: CEZ:AV0Z5038910. Keywords: Potato virus A; recombinant coat protein; Escherichia coli. Subject RIV: EB - Genetics; Molecular Biology. Impact factor: 0.660, year: 2002

  10. Discriminating Microbial Species Using Protein Sequence Properties and Machine Learning

    NARCIS (Netherlands)

    Shahib, Ali Al-; Gilbert, David; Breitling, Rainer

    2007-01-01

    Much work has been done to identify species-specific proteins in sequenced genomes and hence to determine their function. We assumed that such proteins have specific physico-chemical properties that will discriminate them from proteins in other species. In this paper, we examine the validity of this

  11. Impact of whey protein coating incorporated with Bifidobacterium and Lactobacillus on sliced ham properties.

    Science.gov (United States)

    Odila Pereira, Joana; Soares, José; J P Monteiro, Maria; Gomes, Ana; Pintado, Manuela

    2018-05-01

    Edible coatings/films with functional ingredients may be a solution to consumers' demands for high-quality food products and an extended shelf-life. The aim of this work was to evaluate the antimicrobial efficiency of edible coatings incorporated with probiotics on sliced ham preservation. Coatings were developed based on whey protein isolates with incorporation of Bifidobacterium animalis Bb-12® or Lactobacillus casei-01. The physicochemical analyses showed that the coating decreased water and weight loss in the ham. Furthermore, color analysis showed that coated sliced ham exhibited no color change compared with uncoated slices. The edible coatings incorporating the probiotic strains inhibited detectable growth of Staphylococcus spp., Pseudomonas spp., Enterobacteriaceae and yeasts/molds for at least 45 days of storage at 4°C. The sensory evaluation demonstrated that there was a preference for the sliced coated ham. Probiotic bacteria viable cell numbers were maintained at ca. 10^8 CFU/g throughout storage time, enabling the slice of ham to act as a suitable carrier for the beneficial bacteria. Copyright © 2018 Elsevier Ltd. All rights reserved.

  12. Protein secondary structure prediction for a single-sequence using hidden semi-Markov models

    Directory of Open Access Journals (Sweden)

    Borodovsky Mark

    2006-03-01

    Background: The accuracy of protein secondary structure prediction has been improving steadily towards the 88% estimated theoretical limit. There are two types of prediction algorithms: single-sequence prediction algorithms assume that information about other (homologous) proteins is not available, while algorithms of the second type assume that information about homologous proteins is available, and use it intensively. The single-sequence algorithms could make an important contribution to studies of proteins with no detected homologs; however, the accuracy of protein secondary structure prediction from a single sequence is not as high as when the additional evolutionary information is present. Results: In this paper, we further refine and extend the hidden semi-Markov model (HSMM) initially considered in the BSPSS algorithm. We introduce an improved residue dependency model by considering the patterns of statistically significant amino acid correlation at structural segment borders. We also derive models that specialize on different sections of the dependency structure and incorporate them into the HSMM. In addition, we implement an iterative training method to refine estimates of HSMM parameters. The three-state-per-residue accuracy and other accuracy measures of the new method, IPSSP, are shown to be comparable to or better than those for BSPSS as well as for PSIPRED, tested under the single-sequence condition. Conclusions: We have shown that new dependency models and training methods bring further improvements to single-sequence protein secondary structure prediction. The results are obtained under cross-validation conditions using a dataset with no pair of sequences having significant sequence similarity. As new sequences are added to the database it is possible to augment the dependency structure and obtain even higher accuracy. Current and future advances should contribute to the improvement of function prediction for orphan proteins inscrutable
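
    The three-state-per-residue accuracy mentioned in this abstract is commonly reported as Q3: the fraction of residues whose predicted state (helix, strand or coil) matches the observed state. The sketch below shows that calculation only. The function name, the H/E/C alphabet and the example strings are illustrative assumptions, not code or data from the IPSSP paper.

```python
# Three-state alphabet assumed here: H = helix, E = strand, C = coil.
VALID_STATES = frozenset("HEC")


def q3_accuracy(predicted, observed):
    """Return the fraction of residues whose predicted state matches the observed one."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed strings must have the same length")
    if not observed:
        raise ValueError("sequences must not be empty")
    if not (set(predicted) <= VALID_STATES and set(observed) <= VALID_STATES):
        raise ValueError("states must be one of H, E or C")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)


if __name__ == "__main__":
    observed = "CHHHHEEECC"   # made-up reference assignment
    predicted = "CHHHCEEECC"  # made-up prediction with one wrong residue
    print(q3_accuracy(predicted, observed))  # 9 of 10 residues match
```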

  13. Production of Polyclonal Antibodies to the Recombinant Potato virus M (PVM) Non-structural Triple Gene Block Protein 1 and Coat Protein

    Czech Academy of Sciences Publication Activity Database

    Čeřovská, Noemi; Moravec, Tomáš; Plchová, Helena; Hoffmeisterová, Hana; Dědič, P.

    2012-01-01

    Vol. 160, No. 5 (2012), pp. 251-254. ISSN 0931-1785. R&D Projects: GA MŠk 1M06030. Institutional research plan: CEZ:AV0Z50380511. Keywords: Potato virus M; recombinant protein; coat protein. Subject RIV: EB - Genetics; Molecular Biology. Impact factor: 1.000, year: 2012

  14. Adhesive proteins of stalked and acorn barnacles display homology with low sequence similarities.

    Directory of Open Access Journals (Sweden)

    Jaimie-Leigh Jonker

    Barnacle adhesion underwater is an important phenomenon to understand for the prevention of biofouling and potential biotechnological innovations, yet so far, identifying what makes barnacle glue proteins 'sticky' has proved elusive. Examination of a broad range of species within the barnacles may be instructive to identify conserved adhesive domains. We add to extensive information from the acorn barnacles (order Sessilia) by providing the first protein analysis of a stalked barnacle adhesive, Lepas anatifera (order Lepadiformes). It was possible to separate the L. anatifera adhesive into at least 10 protein bands using SDS-PAGE. Intense bands were present at approximately 30, 70, 90 and 110 kilodaltons (kDa). Mass spectrometry for protein identification was followed by de novo sequencing which detected 52 peptides of 7-16 amino acids in length. None of the peptides matched published or unpublished transcriptome sequences, but some amino acid sequence similarity was apparent between L. anatifera and closely-related Dosima fascicularis. Antibodies against two acorn barnacle proteins (ab-cp-52k and ab-cp-68k) showed cross-reactivity in the adhesive glands of L. anatifera. We also analysed the similarity of adhesive proteins across several barnacle taxa, including Pollicipes pollicipes (a stalked barnacle in the order Scalpelliformes). Sequence alignment of published expressed sequence tags clearly indicated that P. pollicipes possesses homologues for the 19 kDa and 100 kDa proteins in acorn barnacles. Homology aside, sequence similarity in amino acid and gene sequences tended to decline as taxonomic distance increased, with minimum similarities of 18-26%, depending on the gene. The results indicate that some adhesive proteins (e.g. 100 kDa) are more conserved within barnacles than others (20 kDa).

  15. Relationships between residue Voronoi volume and sequence conservation in proteins.

    Science.gov (United States)

    Liu, Jen-Wei; Cheng, Chih-Wen; Lin, Yu-Feng; Chen, Shao-Yu; Hwang, Jenn-Kang; Yen, Shih-Chung

    2018-02-01

    Functional and biophysical constraints can cause different levels of sequence conservation in proteins. Previously, structural properties, e.g., relative solvent accessibility (RSA) and packing density of the weighted contact number (WCN), have been found to be related to protein sequence conservation (CS). The Voronoi volume has recently been recognized as a new structural property of the local protein structural environment reflecting CS. However, for surface residues, it is sensitive to water molecules surrounding the protein structure. Herein, we present a simple structural determinant termed the relative space of Voronoi volume (RSV); it uses the Voronoi volume and the van der Waals volume of particular residues to quantify the local structural environment. RSV (range, 0-1) is defined as (Voronoi volume-van der Waals volume)/Voronoi volume of the target residue. The concept of RSV describes the extent of available space for every protein residue. RSV and Voronoi profiles with and without water molecules (RSVw, RSV, VOw, and VO) were compared for 554 non-homologous proteins. RSV (without water) showed better Pearson's correlations with CS than did RSVw, VO, or VOw values. The mean correlation coefficient between RSV and CS was 0.51, which is comparable to the correlation between RSA and CS (0.49) and that between WCN and CS (0.56). RSV is a robust structural descriptor with and without water molecules and can quantitatively reflect evolutionary information in a single protein structure. Therefore, it may represent a practical structural determinant to study protein sequence, structure, and function relationships. Copyright © 2017 Elsevier B.V. All rights reserved.
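
    The RSV definition given in this abstract, (Voronoi volume - van der Waals volume) / Voronoi volume, can be written out as a short function. This is a minimal sketch of the stated formula only; the function name and the example volumes are hypothetical, and computing the Voronoi and van der Waals volumes from a structure is outside its scope.

```python
def relative_space_of_voronoi_volume(voronoi_volume, vdw_volume):
    """Compute RSV = (Voronoi volume - van der Waals volume) / Voronoi volume.

    Both volumes must be in the same units (for example cubic angstroms).
    """
    if voronoi_volume <= 0:
        raise ValueError("Voronoi volume must be positive")
    # Requiring 0 <= vdw_volume <= voronoi_volume keeps RSV in the stated 0-1 range.
    if not 0 <= vdw_volume <= voronoi_volume:
        raise ValueError("van der Waals volume must lie between 0 and the Voronoi volume")
    return (voronoi_volume - vdw_volume) / voronoi_volume


if __name__ == "__main__":
    # Made-up volumes for one residue: 150 cubic angstroms of Voronoi volume,
    # of which 120 are occupied by the residue's van der Waals volume.
    rsv = relative_space_of_voronoi_volume(150.0, 120.0)
    print(f"RSV = {rsv:.2f}")
```

    A value near 0 means the residue fills almost all of its Voronoi cell, and a value near 1 means most of the cell is free space.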

  16. On-Chip Manipulation of Protein-Coated Magnetic Beads via Domain-Wall Conduits

    DEFF Research Database (Denmark)

    Donolato, Marco; Vavassori, Paolo; Gobbi, Marco

    2010-01-01

    Geometrically constrained magnetic domain walls (DWs) in magnetic nanowires can be manipulated at the nanometer scale. The inhomogeneous magnetic stray field generated by a DW can capture a magnetic nanoparticle in solution. On-chip nanomanipulation of individual magnetic beads coated with proteins...

  17. Efficient Feature Selection and Classification of Protein Sequence Data in Bioinformatics

    Science.gov (United States)

    Faye, Ibrahima; Samir, Brahim Belhaouari; Md Said, Abas

    2014-01-01

    Bioinformatics has been an emerging area of research for the last three decades. The ultimate aims of bioinformatics are to store and manage biological data, and to develop and analyze computational tools that enhance the understanding of these data. The size of the data accumulated under various sequencing projects is increasing exponentially, which presents difficulties for experimental methods. To reduce the gap between newly sequenced proteins and proteins with known functions, many computational techniques involving classification and clustering algorithms have been proposed in the past. The classification of protein sequences into existing superfamilies is helpful in predicting the structure and function of the large number of newly discovered proteins. The existing classification results are unsatisfactory because of the huge number of features obtained through various feature encoding methods. In this work, a statistical metric-based feature selection technique is proposed in order to reduce the size of the extracted feature vector. The proposed method of protein classification shows significant improvement in terms of performance measure metrics: accuracy, sensitivity, specificity, recall, F-measure, and so forth. PMID:25045727
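
    The performance measures listed in this abstract have standard definitions for a binary classifier, which the sketch below computes from confusion-matrix counts. It is a generic illustration, not the authors' evaluation code; the function name and the example counts are invented. Note that for a binary problem sensitivity and recall are the same quantity.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute standard binary-classification metrics from confusion-matrix counts."""
    if min(tp, fp, tn, fn) < 0:
        raise ValueError("counts must be non-negative")
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative example")
    sensitivity = tp / (tp + fn)            # true positive rate, identical to recall
    specificity = tn / (tn + fp)            # true negative rate
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    # F-measure (F1) written in its count form; it is 0 when there are no true positives.
    f_measure = 2 * tp / (2 * tp + fp + fn)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "recall": sensitivity,
        "specificity": specificity,
        "f_measure": f_measure,
    }


if __name__ == "__main__":
    # Invented counts: 50 positives (40 found) and 50 negatives (45 rejected).
    for name, value in classification_metrics(tp=40, fp=5, tn=45, fn=10).items():
        print(f"{name}: {value:.3f}")
```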

  18. The Coat Protein and NIa Protease of Two Potyviridae Family Members Independently Confer Superinfection Exclusion

    Science.gov (United States)

    French, Roy

    2016-01-01

    ABSTRACT Superinfection exclusion (SIE) is an antagonistic virus-virus interaction whereby initial infection by one virus prevents subsequent infection by closely related viruses. Although SIE has been described in diverse viruses infecting plants, humans, and animals, its mechanisms, including involvement of specific viral determinants, are just beginning to be elucidated. In this study, SIE determinants encoded by two economically important wheat viruses, Wheat streak mosaic virus (WSMV; genus Tritimovirus, family Potyviridae) and Triticum mosaic virus (TriMV; genus Poacevirus, family Potyviridae), were identified in gain-of-function experiments that used heterologous viruses to express individual virus-encoded proteins in wheat. Wheat plants infected with TriMV expressing WSMV P1, HC-Pro, P3, 6K1, CI, 6K2, NIa-VPg, or NIb cistrons permitted efficient superinfection by WSMV expressing green fluorescent protein (WSMV-GFP). In contrast, wheat infected with TriMV expressing WSMV NIa-Pro or coat protein (CP) substantially excluded superinfection by WSMV-GFP, suggesting that both of these cistrons are SIE effectors encoded by WSMV. Importantly, SIE is due to functional WSMV NIa-Pro or CP rather than their encoding RNAs, as altering the coded protein products by minimally changing RNA sequences led to abolishment of SIE. Deletion mutagenesis further revealed that elicitation of SIE by NIa-Pro requires the entire protein while CP requires only a 200-amino-acid (aa) middle fragment (aa 101 to 300) of the 349 aa. Strikingly, reciprocal experiments with WSMV-mediated expression of TriMV proteins showed that TriMV CP, and TriMV NIa-Pro to a lesser extent, likewise excluded superinfection by TriMV-GFP. Collectively, these data demonstrate that WSMV- and TriMV-encoded CP and NIa-Pro proteins are effectors of SIE and that these two proteins trigger SIE independently of each other. IMPORTANCE Superinfection exclusion (SIE) is an antagonistic virus-virus interaction that

  19. Expression and purification of coat protein of citrus tristeza virus ...

    African Journals Online (AJOL)

    CTV coat protein gene (CTV-cp) cloned in pQE30 vector and transformed to DH5α containing 666bp long from Thailand MK-50 isolate was amplified with a forward primer CTV-CP1 (5' CAC CGA CGA AAC AAA GAA ATT GAA GAA CA 3') and a reverse primer CTVCP2 (5' TCA ACG TGT GTT AAA TTT CCC AAG C 3') and ...

  20. Investigating Correlation between Protein Sequence Similarity and Semantic Similarity Using Gene Ontology Annotations.

    Science.gov (United States)

    Ikram, Najmul; Qadir, Muhammad Abdul; Afzal, Muhammad Tanvir

    2018-01-01

    Sequence similarity is a commonly used measure to compare proteins. With the increasing use of ontologies, semantic (function) similarity is gaining importance. The correlation between these measures has been applied in the evaluation of new semantic similarity methods, and in protein function prediction. In this research, we investigate the relationship between the two similarity methods. The results suggest the absence of a strong correlation between sequence and semantic similarities. There is a large number of proteins with low sequence similarity and high semantic similarity. We observe that Pearson's correlation coefficient is not sufficient to explain the nature of this relationship. Interestingly, the term semantic similarity values above 0 and below 1 do not seem to play a role in improving the correlation. That is, the correlation coefficient depends only on the number of common GO terms in proteins under comparison, and the semantic similarity measurement method does not influence it. Semantic similarity and sequence similarity behave distinctly. These findings have significant implications for future work on protein comparison, and will help to better understand the semantic similarity between proteins.

  1. Functional improvement of antibody fragments using a novel phage coat protein III fusion system

    DEFF Research Database (Denmark)

    Jensen, Kim Bak; Larsen, Martin; Pedersen, Jesper Søndergaard

    2002-01-01

    Functional expressions of proteins often depend on the presence of host specific factors. Frequently recombinant expression strategies of proteins in foreign hosts, such as bacteria, have been associated with poor yields or significant loss of functionality. Improvements in the performance of het......(s) of the filamentous phage coat protein III. Furthermore, it will be shown that the observed effect is neither due to improved stability nor increased avidity....

  2. Sequence heterogeneity accelerates protein search for targets on DNA

    International Nuclear Information System (INIS)

    Shvets, Alexey A.; Kolomeisky, Anatoly B.

    2015-01-01

    The process of protein search for specific binding sites on DNA is fundamentally important since it marks the beginning of all major biological processes. We present a theoretical investigation that probes the role of DNA sequence symmetry, heterogeneity, and chemical composition in the protein search dynamics. Using a discrete-state stochastic approach with a first-passage events analysis, which takes into account the most relevant physical-chemical processes, a full analytical description of the search dynamics is obtained. It is found that, contrary to existing views, the protein search is generally faster on DNA with more heterogeneous sequences. In addition, the search dynamics might be affected by the chemical composition near the target site. The physical origins of these phenomena are discussed. Our results suggest that biological processes might be effectively regulated by modifying chemical composition, symmetry, and heterogeneity of a genome

  3. Sequence heterogeneity accelerates protein search for targets on DNA

    Energy Technology Data Exchange (ETDEWEB)

    Shvets, Alexey A.; Kolomeisky, Anatoly B., E-mail: tolya@rice.edu [Department of Chemistry and Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005 (United States)

    2015-12-28

    The process of protein search for specific binding sites on DNA is fundamentally important since it marks the beginning of all major biological processes. We present a theoretical investigation that probes the role of DNA sequence symmetry, heterogeneity, and chemical composition in the protein search dynamics. Using a discrete-state stochastic approach with a first-passage events analysis, which takes into account the most relevant physical-chemical processes, a full analytical description of the search dynamics is obtained. It is found that, contrary to existing views, the protein search is generally faster on DNA with more heterogeneous sequences. In addition, the search dynamics might be affected by the chemical composition near the target site. The physical origins of these phenomena are discussed. Our results suggest that biological processes might be effectively regulated by modifying chemical composition, symmetry, and heterogeneity of a genome.

  4. Protein sequences clustering of herpes virus by using Tribe Markov clustering (Tribe-MCL)

    Science.gov (United States)

    Bustamam, A.; Siswantining, T.; Febriyani, N. L.; Novitasari, I. D.; Cahyaningrum, R. D.

    2017-07-01

    Herpes viruses are widespread, and one of their important characteristics is the ability to cause acute and chronic infections, which can lead to severe complications. The herpes virus is composed of DNA containing protein and wrapped by glycoproteins. In this work, the herpes virus family is classified and analyzed by clustering its protein sequences using the Tribe Markov Clustering (Tribe-MCL) algorithm. Tribe-MCL is an efficient clustering method, based on the theory of Markov chains, that classifies protein families from protein sequences using pre-computed sequence similarity information. We implement the Tribe-MCL algorithm in the open-source program R. We select 24 herpes virus protein sequences obtained from the NCBI database. The dataset consists of three types of glycoprotein: B, F, and H. Each type is represented by eight herpes viruses that infect humans. In simulations with inflation factors r = 1.5, 2, and 3, we obtain different numbers of clusters: the greater the inflation factor, the greater the number of clusters. Proteins of the same type are grouped together.

  5. A machine learning approach for the identification of odorant binding proteins from sequence-derived properties

    Directory of Open Access Journals (Sweden)

    Suganthan PN

    2007-09-01

    Background: Odorant binding proteins (OBPs) are believed to shuttle odorants from the environment to the underlying odorant receptors, for which they could potentially serve as odorant presenters. Although several sequence-based search methods have been exploited for protein family prediction, less effort has been devoted to the prediction of OBPs from sequence data, and this area is more challenging due to poor sequence identity between these proteins. Results: In this paper, we propose a new algorithm that uses the Regularized Least Squares Classifier (RLSC) in conjunction with multiple physicochemical properties of amino acids to predict odorant-binding proteins. The algorithm was applied to the dataset derived from the Pfam and GenDiS databases, and we obtained an overall prediction accuracy of 97.7% (94.5% and 98.4% for positive and negative classes, respectively). Conclusion: Our study suggests that RLSC is potentially useful for predicting odorant binding proteins from sequence-derived properties irrespective of sequence similarity. Our method predicts 92.8% of 56 odorant binding proteins non-homologous to any protein in the Swiss-Prot database and 97.1% of the 414 independent dataset proteins, suggesting the usefulness of the RLSC method for facilitating the prediction of odorant binding proteins from sequence information.
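    A regularized least squares classifier of the general kind named in this abstract fits kernel coefficients by solving one linear system. The sketch below is a generic textbook form, not the authors' method: the RBF kernel, the regularization value, the function names, and the two-dimensional toy features (standing in for physicochemical descriptors) are all assumptions.

```python
import numpy as np


def rbf_kernel(a, b, gamma=1.0):
    """Gaussian kernel matrix between the rows of a and the rows of b."""
    sq_dist = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-gamma * sq_dist)


def rlsc_fit(x_train, y_train, lam=0.1, gamma=1.0):
    """Solve (K + lam * n * I) c = y for the coefficient vector c."""
    n = len(y_train)
    k = rbf_kernel(x_train, x_train, gamma)
    return np.linalg.solve(k + lam * n * np.eye(n), y_train)


def rlsc_predict(x_train, coef, x_new, gamma=1.0):
    """Predict labels (+1 or -1) as the sign of the kernel expansion."""
    return np.sign(rbf_kernel(x_new, x_train, gamma) @ coef)


# Hypothetical feature vectors; labels are +1 (binder) and -1 (non-binder).
x_train = np.array([[0.0, 0.0], [0.1, 0.2], [2.0, 2.0], [2.1, 1.9]])
y_train = np.array([-1.0, -1.0, 1.0, 1.0])

coef = rlsc_fit(x_train, y_train)
predictions = rlsc_predict(x_train, coef, np.array([[0.05, 0.1], [2.05, 2.0]]))
```

    Unlike a support vector machine, this formulation needs no iterative optimisation, only a single call to a linear solver.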

  6. Aligning protein sequence and analysing substitution pattern using ...

    Indian Academy of Sciences (India)

    Prakash

    Aligning protein sequences using a score matrix has became a routine but valuable method in modern biological ..... the amino acids according to their substitution behaviour ...... which may cause great change (e.g. prolonging the helix) in.

  7. Chaos game representation of functional protein sequences, and simulation and multifractal analysis of induced measures

    International Nuclear Information System (INIS)

    Zu-Guo, Yu; Qian-Jun, Xiao; Long, Shi; Jun-Wu, Yu; Anh, Vo

    2010-01-01

    Investigating the biological function of proteins is a key aspect of protein studies. Bioinformatic methods become important for studying the biological function of proteins. In this paper, we first give the chaos game representation (CGR) of randomly-linked functional protein sequences, then propose the use of the recurrent iterated function systems (RIFS) in fractal theory to simulate the measure based on their chaos game representations. This method helps to extract some features of functional protein sequences, and furthermore the biological functions of these proteins. Then multifractal analysis of the measures based on the CGRs of randomly-linked functional protein sequences is performed. We find that the CGRs have clear fractal patterns. The numerical results show that the RIFS can simulate the measure based on the CGR very well. The relative standard error and the estimated probability matrix in the RIFS do not depend on the order in which the functional protein sequences are linked. The estimated probability matrices in the RIFS with different biological functions are evidently different. Hence the estimated probability matrices in the RIFS can be used to characterise the difference among linked functional protein sequences with different biological functions. From the values of the D_q curves, one sees that these functional protein sequences are not completely random. The D_q of all linked functional proteins studied are multifractal-like and sufficiently smooth for the C_q (analogous to specific heat) curves to be meaningful. Furthermore, the D_q curves of the measure μ based on their CGRs for different orders of linking the functional protein sequences are almost identical if q ≥ 0. Finally, the C_q curves of all linked functional proteins resemble a classical phase transition at a critical point. (cross-disciplinary physics and related areas of science and technology)

  8. The SBASE protein domain library, release 8.0: a collection of annotated protein sequence segments.

    Science.gov (United States)

    Murvai, J; Vlahovicek, K; Barta, E; Pongor, S

    2001-01-01

    SBASE 8.0 is the eighth release of the SBASE library of protein domain sequences that contains 294 898 annotated structural, functional, ligand-binding and topogenic segments of proteins, cross-referenced to most major sequence databases and sequence pattern collections. The entries are clustered into over 2005 statistically validated domain groups (SBASE-A) and 595 non-validated groups (SBASE-B), provided with several WWW-based search and browsing facilities for online use. A domain-search facility was developed, based on non-parametric pattern recognition methods, including artificial neural networks. SBASE 8.0 is freely available by anonymous 'ftp' file transfer from ftp.icgeb.trieste.it. Automated searching of SBASE can be carried out with the WWW servers http://www.icgeb.trieste.it/sbase/ and http://sbase.abc.hu/sbase/.

  9. The nucleotide sequence of human transition protein 1 cDNA

    Energy Technology Data Exchange (ETDEWEB)

    Luerssen, H; Hoyer-Fender, S; Engel, W [Universitaet Goettingen (West Germany)

    1988-08-11

    The authors have screened a human testis cDNA library with an oligonucleotide of 81 mer prepared according to a part of the published nucleotide sequence of the rat transition protein TP 1. They have isolated a cDNA clone with the length of 441 bp containing the coding region of 162 bp for human transition protein 1. There is about 84% homology in the coding region of the sequence compared to rat. The human cDNA-clone encodes a polypeptide of 54 amino acids of which 7 are different to that of rat.

  10. Humidity control and hydrophilic glue coating applied to mounted protein crystals improves X-ray diffraction experiments

    Science.gov (United States)

    Baba, Seiki; Hoshino, Takeshi; Ito, Len; Kumasaka, Takashi

    2013-01-01

    Protein crystals are fragile, and it is sometimes difficult to find conditions suitable for handling and cryocooling the crystals before conducting X-ray diffraction experiments. To overcome this issue, a protein crystal-mounting method has been developed that involves a water-soluble polymer and controlled humid air that can adjust the moisture content of a mounted crystal. By coating crystals with polymer glue and exposing them to controlled humid air, the crystals were stable at room temperature and were cryocooled under optimized humidity. Moreover, the glue-coated crystals reproducibly showed gradual transformations of their lattice constants in response to a change in humidity; thus, using this method, a series of isomorphous crystals can be prepared. This technique is valuable when working on fragile protein crystals, including membrane proteins, and will also be useful for multi-crystal data collection. PMID:23999307

  11. Rapid evolution of the sequences and gene repertoires of secreted proteins in bacteria.

    Directory of Open Access Journals (Sweden)

    Teresa Nogueira

    Proteins secreted to the extracellular environment or to the periphery of the cell envelope, the secretome, play essential roles in foraging, antagonistic and mutualistic interactions. We hypothesize that arms races, genetic conflicts and varying selective pressures should lead to the rapid change of sequences and gene repertoires of the secretome. The analysis of 42 bacterial pan-genomes shows that secreted, and especially extracellular proteins, are predominantly encoded in the accessory genome, i.e. among genes not ubiquitous within the clade. Genes encoding outer membrane proteins might engage more frequently in intra-chromosomal gene conversion because they are more often in multi-genic families. The gene sequences encoding the secretome evolve faster than the rest of the genome and in particular at non-synonymous positions. Cell wall proteins in Firmicutes evolve particularly fast when compared with outer membrane proteins of Proteobacteria. Virulence factors are over-represented in the secretome, notably in outer membrane proteins, but cell localization explains more of the variance in substitution rates and gene repertoires than sequence homology to known virulence factors. Accordingly, the repertoires and sequences of the genes encoding the secretome change fast in the clades of obligatory and facultative pathogens and also in the clades of mutualists and free-living bacteria. Our study shows that cell localization shapes genome evolution. In agreement with our hypothesis, the repertoires and the sequences of genes encoding secreted proteins evolve fast. The particularly rapid change of extracellular proteins suggests that these public goods are key players in bacterial adaptation.

  12. Cloning and sequence analysis of cDNA coding for rat nucleolar protein C23

    International Nuclear Information System (INIS)

    Ghaffari, S.H.; Olson, M.O.J.

    1986-01-01

    Using synthetic oligonucleotides as primers and probes, the authors have isolated and sequenced cDNA clones encoding protein C23, a putative nucleolus organizer protein. Poly(A+) RNA was isolated from rat Novikoff hepatoma cells and enriched in C23 mRNA by sucrose density gradient ultracentrifugation. Two deoxyoligonucleotides, a 48- and a 27-mer, were synthesized on the basis of amino acid sequence from the C-terminal half of protein C23 and cDNA sequence data from CHO cell protein. The 48-mer was used as a primer for synthesis of cDNA, which was then inserted into plasmid pUC9. Transformed bacterial colonies were screened by hybridization with 32P-labeled 27-mer. Two clones among 5000 gave a strong positive signal. Plasmid DNAs from these clones were purified and characterized by blotting and nucleotide sequence analysis. The length of C23 mRNA was estimated to be 3200 bases in a northern blot analysis. The sequence of a 267 b.p. insert shows high homology with the CHO cDNA, with only 9 nucleotide differences and an identical amino acid sequence. These studies indicate that this region of the protein is highly conserved.

  13. Integrated analysis of RNA-binding protein complexes using in vitro selection and high-throughput sequencing and sequence specificity landscapes (SEQRS).

    Science.gov (United States)

    Lou, Tzu-Fang; Weidmann, Chase A; Killingsworth, Jordan; Tanaka Hall, Traci M; Goldstrohm, Aaron C; Campbell, Zachary T

    2017-04-15

    RNA-binding proteins (RBPs) collaborate to control virtually every aspect of RNA function. Tremendous progress has been made in the area of global assessment of RBP specificity using next-generation sequencing approaches both in vivo and in vitro. Understanding how protein-protein interactions enable precise combinatorial regulation of RNA remains a significant problem. Addressing this challenge requires tools that can quantitatively determine the specificities of both individual proteins and multimeric complexes in an unbiased and comprehensive way. One approach utilizes in vitro selection, high-throughput sequencing, and sequence-specificity landscapes (SEQRS). We outline a SEQRS experiment focused on obtaining the specificity of a multi-protein complex between Drosophila RBPs Pumilio (Pum) and Nanos (Nos). We discuss the necessary controls in this type of experiment and examine how the resulting data can be complemented with structural and cell-based reporter assays. Additionally, SEQRS data can be integrated with functional genomics data to uncover biological function. Finally, we propose extensions of the technique that will enhance our understanding of multi-protein regulatory complexes assembled onto RNA. Copyright © 2016 Elsevier Inc. All rights reserved.

  14. Next-Generation Sequencing for Binary Protein-Protein Interactions

    Directory of Open Access Journals (Sweden)

    Bernhard Suter

    2015-12-01

    The yeast two-hybrid (Y2H) system exploits host cell genetics in order to display binary protein-protein interactions (PPIs) via defined and selectable phenotypes. Numerous improvements have been made to this method, adapting the screening principle for diverse applications, including drug discovery and the scale-up for proteome-wide interaction screens in human and other organisms. Here we discuss a systematic workflow and analysis scheme for screening data generated by Y2H and related assays that includes high-throughput selection procedures, readout of comprehensive results via next-generation sequencing (NGS), and the interpretation of interaction data via quantitative statistics. The novel assays and tools will serve the broader scientific community to harness the power of NGS technology to address PPI networks in health and disease. We discuss examples of how this next-generation platform can be applied to address specific questions in diverse fields of biology and medicine.

  15. Self-assembling triblock proteins for biofunctional surface modification

    Science.gov (United States)

    Fischer, Stephen E.

    Despite the tremendous promise of cell/tissue engineering, significant challenges remain in engineering functional scaffolds to precisely regulate the complex processes of tissue growth and development. As the point of contact between the cells and the scaffold, the scaffold surface plays a major role in mediating cellular behaviors. In this dissertation, the development and utility of self-assembling, artificial protein hydrogels as biofunctional surface modifiers is described. The design of these recombinant proteins is based on a telechelic triblock motif, in which a disordered polyelectrolyte central domain containing embedded bioactive ligands is flanked by two leucine zipper domains. Under moderate conditions of temperature and pH, the leucine zipper end domains form amphiphilic alpha-helices that reversibly associate into homo-trimeric aggregates, driving hydrogel formation. Moreover, the amphiphilic nature of these helical domains enables surface adsorption to a variety of scaffold materials to form biofunctional protein coatings. The nature and stability of these coatings in various solution conditions, and their interaction with mammalian cells is the primary focus of this dissertation. In particular, triblock protein coatings functionalized with cell recognition sequences are shown to produce well-defined surfaces with precise control over ligand density. The impact of this is demonstrated in multiple cell types through ligand density-dependent cell-substrate interactions. To improve the stability of these physically self-assembled coatings, two covalent crosslinking strategies are described---one in which a zero-length chemical crosslinker (EDC) is utilized and a second in which disulfide bonds are engineered into the recombinant proteins. These targeted crosslinking approaches are shown to increase the stability of surface adsorbed protein layers with minimal effect on the presentation of many bioactive ligands. 
Finally, to demonstrate the versatility

  16. Exploring sequence characteristics related to high-level production of secreted proteins in Aspergillus niger.

    Directory of Open Access Journals (Sweden)

    Bastiaan A van den Berg

    Protein sequence features are explored in relation to the production of over-expressed extracellular proteins by fungi. Knowledge on features influencing protein production and secretion could be employed to improve enzyme production levels in industrial bioprocesses via protein engineering. A large set, over 600 homologous and nearly 2,000 heterologous fungal genes, were overexpressed in Aspergillus niger using a standardized expression cassette and scored for high versus no production. Subsequently, sequence-based machine learning techniques were applied for identifying relevant DNA and protein sequence features. The amino-acid composition of the protein sequence was found to be most predictive and interpretation revealed that, for both homologous and heterologous gene expression, the same features are important: tyrosine and asparagine composition was found to have a positive correlation with high-level production, whereas for unsuccessful production, contributions were found for methionine and lysine composition. The predictor is available online at http://bioinformatics.tudelft.nl/hipsec. Subsequent work aims at validating these findings by protein engineering as a method for increasing expression levels per gene copy.
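    The amino-acid composition feature highlighted in this abstract is simple to compute: it is the fraction of each of the 20 standard residues in a sequence. The sketch below shows only that feature extraction step. The function name and the example peptide are hypothetical, and the classifier trained on these features in the study is not reproduced here.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard one-letter codes


def amino_acid_composition(sequence):
    """Return the fraction of each standard amino acid in a protein sequence."""
    counts = Counter(sequence.upper())
    total = sum(counts[aa] for aa in AMINO_ACIDS)  # ignore non-standard symbols
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}


# Hypothetical short peptide, for illustration only.
composition = amino_acid_composition("MKYNNY")
tyrosine_fraction = composition["Y"]    # residue reported as positively correlated
methionine_fraction = composition["M"]  # residue linked to unsuccessful production
```

    The resulting 20-element vector can be passed to any standard classifier as a fixed-length description of a protein, regardless of sequence length.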

  17. Sequence variability is correlated with weak immunogenicity in Streptococcus pyogenes M protein

    Science.gov (United States)

    Lannergård, Jonas; Kristensen, Bodil M; Gustafsson, Mattias C U; Persson, Jenny J; Norrby-Teglund, Anna; Stålhammar-Carlemalm, Margaretha; Lindahl, Gunnar

    2015-01-01

    The M protein of Streptococcus pyogenes, a major bacterial virulence factor, has an amino-terminal hypervariable region (HVR) that is a target for type-specific protective antibodies. Intriguingly, the HVR elicits a weak antibody response, indicating that it escapes host immunity by two mechanisms, sequence variability and weak immunogenicity. However, the properties influencing the immunogenicity of regions in an M protein remain poorly understood. Here, we studied the antibody response to different regions of the classical M1 and M5 proteins, in which not only the HVR but also the adjacent fibrinogen-binding B repeat region exhibits extensive sequence divergence. Analysis of antisera from S. pyogenes-infected patients, infected mice, and immunized mice showed that both the HVR and the B repeat region elicited weak antibody responses, while the conserved carboxy-terminal part was immunodominant. Thus, we identified a correlation between sequence variability and weak immunogenicity for M protein regions. A potential explanation for the weak immunogenicity was provided by the demonstration that protease digestion selectively eliminated the HVR-B part from whole M protein-expressing bacteria. These data support a coherent model, in which the entire variable HVR-B part evades antibody attack, not only by sequence variability but also by weak immunogenicity resulting from protease attack. PMID:26175306

  18. JACOP: A simple and robust method for the automated classification of protein sequences with modular architecture

    Directory of Open Access Journals (Sweden)

    Pagni Marco

    2005-08-01

    Background: Whole-genome sequencing projects are rapidly producing an enormous number of new sequences. Consequently, almost every family of proteins now contains hundreds of members. It has thus become necessary to develop tools that classify protein sequences automatically, quickly and reliably. The difficulty of this task is intimately linked to the mechanism by which protein sequences diverge, i.e. by simultaneous residue substitutions, insertions and/or deletions and whole domain reorganisations (duplications/swapping/fusion). Results: Here we present a novel approach, which is based on random sampling of sub-sequences (probes) out of a set of input sequences. The probes are compared to the input sequences, after a normalisation step; the results are used to partition the input sequences into homogeneous groups of proteins. In addition, this method provides information on diagnostic parts of the proteins. The performance of this method is challenged by two data sets. The first one contains the sequences of prokaryotic lyases that could be arranged as a multiple sequence alignment. The second one contains all proteins from Swiss-Prot Release 36 with at least one Src homology 2 (SH2) domain, a classical example for proteins with modular architecture. Conclusion: The outcome of our method is robust and highly reproducible, as shown using bootstrap and resampling validation procedures. The results are essentially coherent with the biology. This method depends solely on well-established publicly available software and algorithms.

  19. Fabrication and characterization of gold nano-wires templated on virus-like arrays of tobacco mosaic virus coat proteins

    International Nuclear Information System (INIS)

    Wnęk, M; Stockley, P G; Górzny, M Ł; Evans, S D; Ward, M B; Brydson, R; Wälti, C; Davies, A G

    2013-01-01

    The rod-shaped plant virus tobacco mosaic virus (TMV) is widely used as a nano-fabrication template, and chimeric peptide expression on its major coat protein has extended its potential applications. Here we describe a simple bacterial expression system for production and rapid purification of recombinant chimeric TMV coat protein carrying C-terminal peptide tags. These proteins do not bind TMV RNA or form disks at pH 7. However, they retain the ability to self-assemble into virus-like arrays at acidic pH. C-terminal peptide tags in such arrays are exposed on the protein surface, allowing interaction with target species. We have utilized a C-terminal His-tag to create virus coat protein-templated nano-rods able to bind gold nanoparticles uniformly. These can be transformed into gold nano-wires by deposition of additional gold atoms from solution, followed by thermal annealing. The resistivity of a typical annealed wire created by this approach is significantly less than values reported for other nano-wires made using different bio-templates. This expression construct is therefore a useful additional tool for the creation of chimeric TMV-like nano-rods for bio-templating. (paper)

  20. Fabrication and characterization of gold nano-wires templated on virus-like arrays of tobacco mosaic virus coat proteins

    Science.gov (United States)

    Wnęk, M.; Górzny, M. Ł.; Ward, M. B.; Wälti, C.; Davies, A. G.; Brydson, R.; Evans, S. D.; Stockley, P. G.

    2013-01-01

    The rod-shaped plant virus tobacco mosaic virus (TMV) is widely used as a nano-fabrication template, and chimeric peptide expression on its major coat protein has extended its potential applications. Here we describe a simple bacterial expression system for production and rapid purification of recombinant chimeric TMV coat protein carrying C-terminal peptide tags. These proteins do not bind TMV RNA or form disks at pH 7. However, they retain the ability to self-assemble into virus-like arrays at acidic pH. C-terminal peptide tags in such arrays are exposed on the protein surface, allowing interaction with target species. We have utilized a C-terminal His-tag to create virus coat protein-templated nano-rods able to bind gold nanoparticles uniformly. These can be transformed into gold nano-wires by deposition of additional gold atoms from solution, followed by thermal annealing. The resistivity of a typical annealed wire created by this approach is significantly less than values reported for other nano-wires made using different bio-templates. This expression construct is therefore a useful additional tool for the creation of chimeric TMV-like nano-rods for bio-templating.

  1. Mutations that alter a conserved element upstream of the potato virus X triple block and coat protein genes affect subgenomic RNA accumulation.

    Science.gov (United States)

    Kim, K H; Hemenway, C

    1997-05-26

    The putative subgenomic RNA (sgRNA) promoter regions upstream of the potato virus X (PVX) triple block and coat protein (CP) genes contain sequences common to other potexviruses. The importance of these sequences to PVX sgRNA accumulation was determined by inoculation of Nicotiana tabacum NT1 cell suspension protoplasts with transcripts derived from wild-type and modified PVX cDNA clones. Analyses of RNA accumulation by S1 nuclease digestion and primer extension indicated that a conserved octanucleotide sequence element and the spacing between this element and the start-site for sgRNA synthesis are critical for accumulation of the two major sgRNA species. The impact of mutations on CP sgRNA levels was also reflected in the accumulation of CP. In contrast, genomic minus- and plus-strand RNA accumulation were not significantly affected by mutations in these regions. Studies involving inoculation of tobacco plants with the modified transcripts suggested that the conserved octanucleotide element functions in sgRNA accumulation and some other aspect of the infection process.

  2. Sequence charge decoration dictates coil-globule transition in intrinsically disordered proteins.

    Science.gov (United States)

    Firman, Taylor; Ghosh, Kingshuk

    2018-03-28

    We present an analytical theory to compute conformations of heteropolymers (applicable to describing disordered proteins) as a function of temperature and charge sequence. The theory describes the coil-globule transition for a given protein sequence when temperature is varied and has been benchmarked against all-atom Monte Carlo simulations (using CAMPARI) of intrinsically disordered proteins (IDPs). In addition, the model quantitatively shows how subtle alterations of charge placement in the primary sequence, while maintaining the same charge composition, can lead to significant changes in conformation, even as drastic as a coil (swelled above a purely random coil) to globule (collapsed below a random coil) and vice versa. The theory provides insights on how to control (enhance or suppress) these changes by tuning the temperature (or solution condition) and charge decoration. As an application, we predict the distribution of conformations (at room temperature) of all naturally occurring IDPs in the DisProt database and notice significant size variation even among IDPs with a similar composition of positive and negative charges. Based on this, we provide a new diagram-of-states delineating the sequence-conformation relation for proteins in the DisProt database. Next, we study the effect of post-translational modification, e.g., phosphorylation, on IDP conformations. Modifications as little as two-site phosphorylation can significantly alter the size of an IDP with everything else being constant (temperature, salt concentration, etc.). However, not all possible modification sites have the same effect on protein conformations; there are certain "hot spots" that can cause maximal change in conformation. The location of these "hot spots" in the parent sequence can readily be identified by using a sequence charge decoration metric originally introduced by Sawle and Ghosh. The ability of our model to predict conformations (both expanded and collapsed states) of IDPs at a high
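
As a rough illustration of the sequence charge decoration (SCD) metric mentioned in this abstract, the sketch below computes SCD for a one-letter amino-acid sequence. The abstract does not state the formula; the code assumes the commonly cited definition SCD = (1/N) Σ_{m>n} q_m q_n |m − n|^{1/2}, and the charge assignment (K/R = +1, D/E = −1, all other residues 0) is likewise an assumption made for illustration. The function name `scd` is hypothetical.

```python
import math

# Assumed per-residue charges: K/R = +1, D/E = -1, everything else neutral.
CHARGE = {"K": 1, "R": 1, "D": -1, "E": -1}


def scd(sequence: str) -> float:
    """Sequence charge decoration, assuming the definition
    SCD = (1/N) * sum over pairs m > n of q_m * q_n * sqrt(m - n)."""
    charges = [CHARGE.get(residue, 0) for residue in sequence.upper()]
    n_res = len(charges)
    if n_res == 0:
        return 0.0
    total = 0.0
    for m in range(1, n_res):
        if charges[m] == 0:
            continue  # neutral residues contribute nothing
        for n in range(m):
            total += charges[m] * charges[n] * math.sqrt(m - n)
    return total / n_res


if __name__ == "__main__":
    # Same charge composition, different charge placement.
    for seq in ("EKEKEKEK", "EEEEKKKK"):
        print(seq, round(scd(seq), 3))
```

Under this definition, two sequences with identical composition but different charge patterning give different SCD values, which is the kind of sequence dependence the abstract describes.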

  3. Preparation of recombinant coat protein of Prunus necrotic ringspot virus.

    Science.gov (United States)

    Petrzik, K; Mráz, I; Kubelková, D

    2001-02-01

    The coat protein (CP) gene of Prunus necrotic ringspot virus (PNRSV) was cloned into the pET 16b vector and expressed in Escherichia coli. CP-enriched fractions were prepared from whole cell lysate by differential centrifugation. The fraction sedimenting at 20,000 x g for 30 min was used for preparation of a rabbit antiserum to CP. This antiserum had a titer of 1:2048 and reacted in a double-antibody sandwich ELISA (DAS-ELISA).

  4. TMV nanorods with programmed longitudinal domains of differently addressable coat proteins

    Science.gov (United States)

    Geiger, Fania C.; Eber, Fabian J.; Eiben, Sabine; Mueller, Anna; Jeske, Holger; Spatz, Joachim P.; Wege, Christina

    2013-04-01

    The spacing of functional nanoscopic elements may play a fundamental role in nanotechnological and biomedical applications, but is so far rarely achieved on this scale. In this study we show that tobacco mosaic virus (TMV) and the RNA-guided self-assembly process of its coat protein (CP) can be used to establish new nanorod scaffolds that can be loaded not only with homogeneously distributed functionalities, but with distinct molecule species grouped and ordered along the longitudinal axis. The arrangement of the resulting domains and final carrier rod length both were governed by RNA-templated two-step in vitro assembly. Two selectively addressable TMV CP mutants carrying either thiol (TMVCys) or amino (TMVLys) groups on the exposed surface were engineered and shown to retain reactivity towards maleimides or NHS esters, respectively, after acetic acid-based purification and re-assembly to novel carrier rod types. Stepwise combination of CPCys and CPLys with RNA allowed fabrication of TMV-like nanorods with a controlled total length of 300 or 330 nm, respectively, consisting of adjacent longitudinal 100-to-200 nm domains of differently addressable CP species. This technology paves the way towards rod-shaped scaffolds with pre-defined, selectively reactive barcode patterns on the nanometer scale.

  5. Transfer in SDS of biotinylated proteins from acrylamide gels to an avidin-coated membrane filter.

    Science.gov (United States)

    Karlin, Arthur; Wang, Chaojian; Li, Jing; Xu, Qiang

    2004-06-01

    Avidin was covalently linked to aldehyde-derivatized polyethersulfone membrane filters. These filters were used in Western blot analysis of proteins reacted with biotinylation reagents and electrophoresed in sodium dodecyl sulfate (SDS) on polyacrylamide gels. Electrophoretic transfer from the gels to these filters was in 0.1% SDS, in which the covalently bound avidin retained its biotin-binding capacity. We compared Western blots on avidin-coated membrane filters of biotinylated and nonbiotinylated forms of mouse immunoglobulin G (IgG), mouse IgG heavy chain, muscle-type acetylcholine receptor alpha subunit, and fused alpha and beta subunits of receptor. Biotinylated proteins were captured with high specificity compared to their nonbiotinylated counterparts and sensitively detected on the avidin-coated membranes.

  6. Rapidly evolving zona pellucida domain proteins are a major component of the vitelline envelope of abalone eggs

    Science.gov (United States)

    Aagaard, Jan E.; Yi, Xianhua; MacCoss, Michael J.; Swanson, Willie J.

    2006-01-01

    Proteins harboring a zona pellucida (ZP) domain are prominent components of vertebrate egg coats. Although less well characterized, the egg coat of the non-vertebrate marine gastropod abalone (Haliotis spp.) is also known to contain a ZP domain protein, raising the possibility of a common molecular basis of metazoan egg coat structures. Egg coat proteins from vertebrate as well as non-vertebrate taxa have been shown to evolve under positive selection. Studied most extensively in the abalone system, coevolution between adaptively diverging egg coat and sperm proteins may contribute to the rapid development of reproductive isolation. Thus, identifying the pattern of evolution among egg coat proteins is important in understanding the role these genes may play in the speciation process. The purpose of the present study is to characterize the constituent proteins of the egg coat [vitelline envelope (VE)] of abalone eggs and to provide preliminary evidence regarding how selection has acted on VE proteins during abalone evolution. A proteomic approach is used to match tandem mass spectra of peptides from purified VE proteins with abalone ovary EST sequences, identifying 9 of 10 ZP domain proteins as components of the VE. Maximum likelihood models of codon evolution suggest positive selection has acted among a subset of amino acids for 6 of these genes. This work provides further evidence of the prominence of ZP proteins as constituents of the egg coat, as well as the prominent role of positive selection in diversification of these reproductive proteins. PMID:17085584

  7. Glial cell adhesion and protein adsorption on SAM coated semiconductor and glass surfaces of a microfluidic structure

    Science.gov (United States)

    Sasaki, Darryl Y.; Cox, Jimmy D.; Follstaedt, Susan C.; Curry, Mark S.; Skirboll, Steven K.; Gourley, Paul L.

    2001-05-01

    The development of microsystems that merge biological materials with microfabricated structures is highly dependent on the successful interfacial interactions between these innately incompatible materials. Surface passivation of semiconductor and glass surfaces with thin organic films can attenuate the adhesion of proteins and cells that lead to biofilm formation and biofouling of fluidic structures. We have examined the adhesion of glial cells and serum albumin proteins to microfabricated glass and semiconductor surfaces coated with self-assembled monolayers of octadecyltrimethoxysilane and N-(triethoxysilylpropyl)-O-polyethylene oxide urethane, to evaluate the biocompatibility and surface passivation those coatings provide.

  8. Preparation of non-aggregated fluorescent nanodiamonds (FNDs) by non-covalent coating with a block copolymer and proteins for enhancement of intracellular uptake.

    Science.gov (United States)

    Lee, Jong Woo; Lee, Seonju; Jang, Sangmok; Han, Kyu Young; Kim, Younggyu; Hyun, Jaekyung; Kim, Seong Keun; Lee, Yan

    2013-05-01

    Fluorescent nanodiamonds (FNDs) are very promising fluorophores for use in biosystems due to their high biocompatibility and photostability. To overcome their tendency to aggregate in physiological solutions, which severely limits the biological applications of FNDs, we developed a new non-covalent coating method using a block copolymer, PEG-b-P(DMAEMA-co-BMA), or proteins such as BSA and HSA. By simple mixing of the block copolymer with FNDs, the cationic DMAEMA and hydrophobic BMA moieties can strongly interact with the anionic and hydrophobic moieties on the FND surface, while the PEG block can form a shell to prevent the direct contact between FNDs. The polymer-coated FNDs, along with BSA- and HSA-coated FNDs, showed non-aggregation characteristics and maintained their size at the physiological salt concentration. The well-dispersed, polymer- or protein-coated FNDs in physiological solutions showed enhanced intracellular uptake, which was confirmed by CLSM. In addition, the biocompatibility of the coated FNDs was expressly supported by a cytotoxicity assay. Our simple non-covalent coating with the block copolymer, which can be easily modified by various chemical methods, projects a very promising outlook for future biomedical applications, especially in comparison with covalent coating or protein-based coating.

  9. MODexplorer: an integrated tool for exploring protein sequence, structure and function relationships.

    KAUST Repository

    Kosinski, Jan; Barbato, Alessandro; Tramontano, Anna

    2013-01-01

    SUMMARY: MODexplorer is an integrated tool aimed at exploring the sequence, structural and functional diversity in protein families useful in homology modeling and in analyzing protein families in general. It takes as input either the sequence or the structure of a protein and provides alignments with its homologs along with a variety of structural and functional annotations through an interactive interface. The annotations include sequence conservation, similarity scores, ligand-, DNA- and RNA-binding sites, secondary structure, disorder, crystallographic structure resolution and quality scores of models implied by the alignments to the homologs of known structure. MODexplorer can be used to analyze sequence and structural conservation among the structures of similar proteins, to find structures of homologs solved in different conformational state or with different ligands and to transfer functional annotations. Furthermore, if the structure of the query is not known, MODexplorer can be used to select the modeling templates taking all this information into account and to build a comparative model. AVAILABILITY AND IMPLEMENTATION: Freely available on the web at http://modorama.biocomputing.it/modexplorer. Website implemented in HTML and JavaScript with all major browsers supported. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

  10. MODexplorer: an integrated tool for exploring protein sequence, structure and function relationships.

    KAUST Repository

    Kosinski, Jan

    2013-02-08

    SUMMARY: MODexplorer is an integrated tool aimed at exploring the sequence, structural and functional diversity in protein families useful in homology modeling and in analyzing protein families in general. It takes as input either the sequence or the structure of a protein and provides alignments with its homologs along with a variety of structural and functional annotations through an interactive interface. The annotations include sequence conservation, similarity scores, ligand-, DNA- and RNA-binding sites, secondary structure, disorder, crystallographic structure resolution and quality scores of models implied by the alignments to the homologs of known structure. MODexplorer can be used to analyze sequence and structural conservation among the structures of similar proteins, to find structures of homologs solved in different conformational state or with different ligands and to transfer functional annotations. Furthermore, if the structure of the query is not known, MODexplorer can be used to select the modeling templates taking all this information into account and to build a comparative model. AVAILABILITY AND IMPLEMENTATION: Freely available on the web at http://modorama.biocomputing.it/modexplorer. Website implemented in HTML and JavaScript with all major browsers supported. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

  11. Application of native signal sequences for recombinant proteins secretion in Pichia pastoris

    DEFF Research Database (Denmark)

    Borodina, Irina; Do, Duy Duc; Eriksen, Jens C.

    Background Methylotrophic yeast Pichia pastoris is widely used for recombinant protein production, largely due to its ability to secrete correctly folded heterologous proteins to the fermentation medium. Secretion is usually achieved by cloning the recombinant gene after a leader sequence, where...... alpha‐mating factor (MF) prepropeptide from Saccharomyces cerevisiae is most commonly used. Our aim was to test whether signal peptides from P. pastoris native secreted proteins could be used to direct secretion of recombinant proteins. Results Eleven native signal peptides from P. pastoris were tested...... by optimization of expression of three different proteins in P. pastoris. Conclusions Native signal peptides from P. pastoris can be used to direct secretion of recombinant proteins. A novel USER‐based P. pastoris system allows easy cloning of protein‐coding gene with the promoter and leader sequence of choice....

  12. Prediction of glutathionylation sites in proteins using minimal sequence information and their experimental validation.

    Science.gov (United States)

    Pal, Debojyoti; Sharma, Deepak; Kumar, Mukesh; Sandur, Santosh K

    2016-09-01

    S-glutathionylation of proteins plays an important role in various biological processes and is known to be a protective modification during oxidative stress. Since experimental detection of S-glutathionylation is labor-intensive and time-consuming, a bioinformatics-based approach is a viable alternative. Available methods require relatively long sequence information, which may prevent prediction if sequence information is incomplete. Here, we present a model to predict glutathionylation sites from pentapeptide sequences. It is based upon differential association of amino acids with glutathionylated and non-glutathionylated cysteines from a database of experimentally verified sequences. These data were used to calculate position-dependent F-scores, which measure how a particular amino acid at a particular position may affect the likelihood of a glutathionylation event. A glutathionylation score (G-score), indicating the propensity of a sequence to undergo glutathionylation, was calculated using position-dependent F-scores for each amino acid. Cut-off values were used for prediction. Our model returned an accuracy of 58% with a Matthews correlation coefficient (MCC) value of 0.165. On an independent dataset, our model outperformed the currently available model, in spite of needing much less sequence information. Pentapeptide motifs having high abundance among glutathionylated proteins were identified. A list of potential glutathionylation hotspot sequences was obtained by assigning G-scores, and subsequent Protein-BLAST analysis revealed a total of 254 putative glutathionable proteins, a number of which were already known to be glutathionylated. Our model predicted glutathionylation sites in 93.93% of experimentally verified glutathionylated proteins. The outcome of this study may assist in discovering novel glutathionylation sites and finding candidate proteins for glutathionylation.
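
The abstract reports both an accuracy and a Matthews correlation coefficient (MCC) for a binary site classifier. As a minimal sketch of how the two metrics are derived from a confusion matrix, the code below uses the standard MCC formula; the function names and the counts in the usage example are hypothetical and are not taken from the study.

```python
import math


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all predictions that are correct."""
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient:
    (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)).
    Returns 0.0 when the denominator is zero (a common convention)."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


if __name__ == "__main__":
    # Hypothetical counts, chosen only to show the two metrics side by side.
    counts = dict(tp=30, tn=28, fp=22, fn=20)
    print("accuracy:", round(accuracy(**counts), 3))
    print("MCC:", round(mcc(**counts), 3))
```

Because MCC accounts for all four cells of the confusion matrix, a classifier can show moderate accuracy while its MCC stays close to zero, which is why the two figures are usually read together.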

  13. RSARF: Prediction of residue solvent accessibility from protein sequence using random forest method

    KAUST Repository

    Ganesan, Pugalenthi; Kandaswamy, Krishna Kumar Umar; Chou, Kuochen; Vivekanandan, Saravanan; Kolatkar, Prasanna R.

    2012-01-01

    Prediction of protein structure from its amino acid sequence is still a challenging problem. A complete physicochemical understanding of protein folding is essential for accurate structure prediction. Knowledge of residue solvent accessibility gives useful insights into protein structure prediction and function prediction. In this work, we propose a random forest method, RSARF, to predict residue accessible surface area from protein sequence information. The training and testing were performed using 120 proteins containing 22006 residues. For each residue, the buried or exposed state was computed using five thresholds (0%, 5%, 10%, 25%, and 50%). The prediction accuracies for the 0%, 5%, 10%, 25%, and 50% thresholds are 72.9%, 78.25%, 78.12%, 77.57% and 72.07%, respectively. Further, comparison of RSARF with other methods using a benchmark dataset containing 20 proteins shows that our approach is useful for prediction of residue solvent accessibility from protein sequence without using structural information. The RSARF program, datasets and supplementary data are available at http://caps.ncbs.res.in/download/pugal/RSARF/.
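
The two-state labelling described in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the input is taken to be relative solvent accessibility in percent, a residue is called exposed when its value is strictly greater than the threshold (the abstract does not say how ties are handled), and the function name and example values are hypothetical.

```python
# Thresholds (percent relative accessibility) listed in the abstract.
THRESHOLDS = (0.0, 5.0, 10.0, 25.0, 50.0)


def classify_states(rsa_percent, threshold):
    """Label each residue 'exposed' if its relative solvent accessibility
    exceeds the threshold, else 'buried' (assumed convention)."""
    return ["exposed" if rsa > threshold else "buried" for rsa in rsa_percent]


if __name__ == "__main__":
    # Hypothetical per-residue accessibility values, in percent.
    example = [0.0, 3.0, 12.5, 30.0, 80.0]
    for threshold in THRESHOLDS:
        print(threshold, classify_states(example, threshold))
```

Each threshold yields a separate binary labelling of the same residues, so a classifier would be trained and scored once per threshold, consistent with the five accuracies reported above.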

  14. Alternative splicing affects the targeting sequence of peroxisome proteins in Arabidopsis.

    Science.gov (United States)

    An, Chuanjing; Gao, Yuefang; Li, Jinyu; Liu, Xiaomin; Gao, Fuli; Gao, Hongbo

    2017-07-01

    A systematic analysis of the Arabidopsis genome in combination with localization experiments indicates that alternative splicing affects the peroxisomal targeting sequence of at least 71 genes in Arabidopsis. Peroxisomes are ubiquitous eukaryotic cellular organelles that play a key role in diverse metabolic functions. All peroxisome proteins are encoded by nuclear genes and target to peroxisomes mainly through two types of targeting signals: peroxisomal targeting signal type 1 (PTS1) and PTS2. Alternative splicing (AS) is a process occurring in all eukaryotes by which a single pre-mRNA can generate multiple mRNA variants, often encoding proteins with functional differences. However, the effects of AS on the PTS1 or PTS2 and the targeting of the protein were rarely studied, especially in plants. Here, we systematically analyzed the genome of Arabidopsis, and found that the C-terminal targeting sequence PTS1 of 66 genes and the N-terminal targeting sequence PTS2 of 5 genes are affected by AS. Experimental determination of the targeting of selected protein isoforms further demonstrated that AS at both the 5' and 3' region of a gene can affect the inclusion of PTS2 and PTS1, respectively. This work underscores the importance of AS on the global regulation of peroxisome protein targeting.

  15. Efficacy of various protein-based coating on enhancing the shelf life of fresh eggs during storage.

    Science.gov (United States)

    Caner, Cengiz; Yüceer, Muhammed

    2015-07-01

    The effectiveness of various coatings (whey protein isolate [WPI], whey protein concentrate [WPC], zein, and shellac) on functional properties, interior quality, and eggshell breaking strength of fresh eggs was evaluated during storage at 24 °C for 6 weeks. Coatings and storage time had significant effects on Haugh unit, yolk index, albumen pH, dry matter (DMA), relative whipping capacity (RWC), and albumen viscosity. Uncoated eggs had higher albumen pH (9.56) and weight loss, and lower albumen viscosity (5.73), Haugh unit (HU), and yolk index (YI) during storage. Among the coated eggs, the shellac and zein coated eggs had the highest value of albumen viscosity (27.26 to 26.90), HU (74.10 to 73.61), and YI (44.84 to 44.63) after storage. Shellac (1.44%) was more effective in preventing weight loss than WPC (4.59%), WPI (4.60%), and zein (2.13%) coatings. Uncoated eggs had the highest weight loss (6.71%). All coatings increased shell strength (5.18 to 5.73 for top and 3.58 to 4.71 for bottom) significantly relative to uncoated eggs (4.70 for top and 3.15 for bottom). The functional properties such as albumen DMA (14.50 to 16.66 and 18.97 for uncoated) and albumen RWC (841 to 891 and 475 for uncoated) of fresh eggs can be preserved during storage when they are coated. The shellac and zein coatings were more effective for maintaining the internal quality of fresh eggs during storage. Fourier transform near-infrared (FT-NIR) reflection spectra in the 800 to 2500 nm range were used to quantify the contents of the fresh eggs at the end of storage. Eggs coated with shellac or zein displayed a higher absorbance at 970 and 1,197 nm respectively (OH vibration of water) compared with those coated with WPI or WPC and the uncoated group at the end of storage. The coatings improved functional properties and also shell strength and could be a viable alternative technology for maintaining the internal quality of eggs during long-term storage. This study highlights the promising use of

  16. Protein sequence annotation in the genome era: the annotation concept of SWISS-PROT+TREMBL.

    Science.gov (United States)

    Apweiler, R; Gateau, A; Contrino, S; Martin, M J; Junker, V; O'Donovan, C; Lang, F; Mitaritonna, N; Kappus, S; Bairoch, A

    1997-01-01

    SWISS-PROT is a curated protein sequence database which strives to provide a high level of annotation, a minimal level of redundancy and a high level of integration with other databases. Ongoing genome sequencing projects have dramatically increased the number of protein sequences to be incorporated into SWISS-PROT. Since we do not want to dilute the quality standards of SWISS-PROT by incorporating sequences without proper sequence analysis and annotation, we cannot speed up the incorporation of new incoming data indefinitely. However, as we also want to make the sequences available as fast as possible, we introduced TREMBL (TRanslation of EMBL nucleotide sequence database), a supplement to SWISS-PROT. TREMBL consists of computer-annotated entries in SWISS-PROT format derived from the translation of all coding sequences (CDS) in the EMBL nucleotide sequence database, except for CDS already included in SWISS-PROT. While TREMBL is already of immense value, its computer-generated annotation does not match the quality of SWISS-PROT's. The main difference is in the protein functional information attached to sequences. With this in mind, we are dedicating substantial effort to develop and apply computer methods to enhance the functional information attached to TREMBL entries.

  17. A method for partitioning the information contained in a protein sequence between its structure and function.

    Science.gov (United States)

    Possenti, Andrea; Vendruscolo, Michele; Camilloni, Carlo; Tiana, Guido

    2018-05-23

    Proteins employ the information stored in the genetic code and translated into their sequences to carry out well-defined functions in the cellular environment. The possibility of encoding such functions is controlled by the balance between the amount of information supplied by the sequence and that left after the protein has folded into its structure. We study the amount of information necessary to specify the protein structure, providing an estimate that takes into account the thermodynamic properties of protein folding. We thus show that the information remaining in the protein sequence after encoding for its structure (the 'information gap') is very close to what is needed to encode for its function and interactions. Then, by predicting the information gap directly from the protein sequence, we show that it may be possible to use these insights from information theory to discriminate between ordered and disordered proteins, to identify unknown functions, and to optimize artificially designed protein sequences. This article is protected by copyright. All rights reserved. © 2018 Wiley Periodicals, Inc.

  18. Sequence variability is correlated with weak immunogenicity in Streptococcus pyogenes M protein.

    Science.gov (United States)

    Lannergård, Jonas; Kristensen, Bodil M; Gustafsson, Mattias C U; Persson, Jenny J; Norrby-Teglund, Anna; Stålhammar-Carlemalm, Margaretha; Lindahl, Gunnar

    2015-10-01

    The M protein of Streptococcus pyogenes, a major bacterial virulence factor, has an amino-terminal hypervariable region (HVR) that is a target for type-specific protective antibodies. Intriguingly, the HVR elicits a weak antibody response, indicating that it escapes host immunity by two mechanisms, sequence variability and weak immunogenicity. However, the properties influencing the immunogenicity of regions in an M protein remain poorly understood. Here, we studied the antibody response to different regions of the classical M1 and M5 proteins, in which not only the HVR but also the adjacent fibrinogen-binding B repeat region exhibits extensive sequence divergence. Analysis of antisera from S. pyogenes-infected patients, infected mice, and immunized mice showed that both the HVR and the B repeat region elicited weak antibody responses, while the conserved carboxy-terminal part was immunodominant. Thus, we identified a correlation between sequence variability and weak immunogenicity for M protein regions. A potential explanation for the weak immunogenicity was provided by the demonstration that protease digestion selectively eliminated the HVR-B part from whole M protein-expressing bacteria. These data support a coherent model, in which the entire variable HVR-B part evades antibody attack, not only by sequence variability but also by weak immunogenicity resulting from protease attack. © 2015 The Authors. MicrobiologyOpen published by John Wiley & Sons Ltd.

  19. Seeing the trees through the forest : sequence-based homo- and heteromeric protein-protein interaction sites prediction using random forest

    NARCIS (Netherlands)

    Hou, Qingzhen; De Geest, Paul F.G.; Vranken, Wim F.; Heringa, Jaap; Feenstra, K. Anton

    2017-01-01

    Motivation: Genome sequencing is producing an ever-increasing amount of associated protein sequences. Few of these sequences have experimentally validated annotations, however, and computational predictions are becoming increasingly successful in producing such annotations. One key challenge remains

  20. Combining protein sequence, structure, and dynamics: A novel approach for functional evolution analysis of PAS domain superfamily.

    Science.gov (United States)

    Dong, Zheng; Zhou, Hongyu; Tao, Peng

    2018-02-01

    PAS domains are widespread in archaea, bacteria, and eukaryota, and play important roles in various functions. In this study, we aim to explore the functional evolutionary relationships among proteins in the PAS domain superfamily in view of the sequence-structure-dynamics-function relationship. We collected protein sequences and crystal structure data from the RCSB Protein Data Bank for members of the PAS domain superfamily belonging to three biological functions (nucleotide binding, photoreceptor activity, and transferase activity). Protein sequences were aligned and then used to select sequence-conserved residues and build a phylogenetic tree. Three-dimensional structure alignment was also applied to obtain structure-conserved residues. The protein dynamics were analyzed using an elastic network model (ENM) and validated by molecular dynamics (MD) simulation. The results showed that proteins with the same function could be grouped by sequence similarity, and proteins in different functional groups displayed statistically significant differences in their vibrational patterns. Interestingly, in all three functional groups, conserved amino acid residues identified by sequence and structure conservation analysis generally have a lower fluctuation than other residues. In addition, the fluctuation of conserved residues in each biological function group was strongly correlated with the corresponding biological function. This research suggested a direct connection in which the protein sequences were related to various functions through structural dynamics. This is a new attempt to delineate functional evolution of proteins using the integrated information of sequence, structure, and dynamics. © 2017 The Protein Society.

  1. Milk protein-gum tragacanth mixed gels: effect of heat-treatment sequence.

    Science.gov (United States)

    Hatami, Masoud; Nejatian, Mohammad; Mohammadifar, Mohammad Amin; Pourmand, Hanieh

    2014-01-30

    The aim of this study was to investigate the role of the heat-treatment sequence of biopolymer mixtures as a formulation parameter on the acid-induced gelation of tri-polymeric systems composed of sodium caseinate (Na-caseinate), whey protein concentrate (WPC), and gum tragacanth (GT). This was studied by applying four sequences of heat treatment: (A) co-heating all three biopolymers; (B) heating the milk-protein dispersion and the GT dispersion separately; (C) heating the dispersion containing Na-caseinate and GT together and heating whey protein alone; and (D) co-heating whey protein with GT and heating Na-caseinate alone. According to small-deformation rheological measurements, the strength of the mixed-gel network decreased in the order: C>B>D>A samples. SEM micrographs show that the network of sample C is much more homogenous, coarse and dense than sample A, while the networks of samples B and D are of intermediate density. The heat-treatment sequence of the biopolymer mixtures as a formulation parameter thus offers an opportunity to control the microstructure and rheological properties of mixed gels. Copyright © 2013 Elsevier Ltd. All rights reserved.

  2. Structure and barrier properties of human embryonic stem cell-derived retinal pigment epithelial cells are affected by extracellular matrix protein coating.

    Science.gov (United States)

    Sorkio, Anni; Hongisto, Heidi; Kaarniranta, Kai; Uusitalo, Hannu; Juuti-Uusitalo, Kati; Skottman, Heli

    2014-02-01

    Extracellular matrix (ECM) interactions play a vital role in cell morphology, migration, proliferation, and differentiation of cells. We investigated the role of ECM proteins on the structure and function of human embryonic stem cell-derived retinal pigment epithelial (hESC-RPE) cells during their differentiation and maturation from hESCs into RPE cells in adherent differentiation cultures on several human ECM proteins found in native human Bruch's membrane, namely, collagen I, collagen IV, laminin, fibronectin, and vitronectin, as well as on commercial substrates of xeno-free CELLstart™ and Matrigel™. Cell pigmentation, expression of RPE-specific proteins, fine structure, as well as the production of basal lamina by hESC-RPE on different protein coatings were evaluated after 140 days of differentiation. The integrity of hESC-RPE epithelium and barrier properties on different coatings were investigated by measuring transepithelial resistance. All coatings supported the differentiation of hESC-RPE cells as demonstrated by early onset of cell pigmentation and further maturation to RPE monolayers after enrichment. Mature RPE phenotype was verified by RPE-specific gene and protein expression, correct epithelial polarization, and phagocytic activity. Significant differences were found in the degree of RPE cell pigmentation and tightness of epithelial barrier between different coatings. Further, the thickness of self-assembled basal lamina and secretion of the key ECM proteins found in the basement membrane of the native RPE varied between hESC-RPE cultured on compared protein coatings. In conclusion, this study shows that the cell culture substrate has a major effect on the structure and basal lamina production during the differentiation and maturation of hESC-RPE potentially influencing the success of cell integrations and survival after cell transplantation.

  3. Complete genome sequence of a proposed new tymovirus, tomato blistering mosaic virus.

    Science.gov (United States)

    Nicolini, Cícero; Inoue-Nagata, Alice Kazuko; Nagata, Tatsuya

    2015-02-01

    In a previous work, a distinct tymovirus infecting tomato plants in Brazil was reported and tentatively named tomato blistering mosaic virus (ToBMV). In this study, the complete genome sequence of ToBMV was determined and shown to have a size of 6277 nucleotides and three ORFs: ORF 1 encodes the replication-complex polyprotein, ORF 2 the movement protein, and ORF 3 the coat protein. The cleavage sites of the replication-complex polyprotein (GS/LP and VAG/QSP) of ToBMV were predicted by alignment analysis of amino acid sequences of other tymoviruses. In the phylogenetic tree, ToBMV clustered with the tymoviruses that infect solanaceous hosts.

  4. Identification of a seed coat-specific promoter fragment from the Arabidopsis MUCILAGE-MODIFIED4 gene.

    Science.gov (United States)

    Dean, Gillian H; Jin, Zhaoqing; Shi, Lin; Esfandiari, Elahe; McGee, Robert; Nabata, Kylie; Lee, Tiffany; Kunst, Ljerka; Western, Tamara L; Haughn, George W

    2017-09-01

    The Arabidopsis seed coat-specific promoter fragment described is an important tool for basic and applied research in Brassicaceae species. During differentiation, the epidermal cells of the Arabidopsis seed coat produce and secrete large quantities of mucilage. On hydration of mature seeds, this mucilage becomes easily accessible as it is extruded to form a tightly attached halo at the seed surface. Mucilage is composed mainly of pectin, and also contains the key cell wall components cellulose, hemicellulose, and proteins, making it a valuable model for studying numerous aspects of cell wall biology. Seed coat-specific promoters are an important tool that can be used to assess the effects of expressing biosynthetic enzymes and diverse cell wall-modifying proteins on mucilage structure and function. Additionally, they can be used for production of easily accessible recombinant proteins of commercial interest. The MUCILAGE-MODIFIED4 (MUM4) gene is expressed in a wide variety of plant tissues and is strongly up-regulated in the seed coat during mucilage synthesis, implying the presence of a seed coat-specific region in its promoter. Promoter deletion analysis facilitated isolation of a 308 base pair sequence (MUM4 0.3Pro) that directs reporter gene expression in the seed coat cells of both Arabidopsis and Camelina sativa, and is regulated by the same transcription factor cascade as endogenous MUM4. Therefore, MUM4 0.3Pro is a promoter fragment that serves as a new tool for seed coat biology research.

  5. A scalable double-barcode sequencing platform for characterization of dynamic protein-protein interactions.

    Science.gov (United States)

    Schlecht, Ulrich; Liu, Zhimin; Blundell, Jamie R; St Onge, Robert P; Levy, Sasha F

    2017-05-25

    Several large-scale efforts have systematically catalogued protein-protein interactions (PPIs) of a cell in a single environment. However, little is known about how the protein interactome changes across environmental perturbations. Current technologies, which assay one PPI at a time, are too low throughput to make it practical to study protein interactome dynamics. Here, we develop a highly parallel protein-protein interaction sequencing (PPiSeq) platform that uses a novel double barcoding system in conjunction with the dihydrofolate reductase protein-fragment complementation assay in Saccharomyces cerevisiae. PPiSeq detects PPIs at a rate that is on par with current assays and, in contrast with current methods, quantitatively scores PPIs with enough accuracy and sensitivity to detect changes across environments. Both PPI scoring and the bulk of strain construction can be performed with cell pools, making the assay scalable and easily reproduced across environments. PPiSeq is therefore a powerful new tool for large-scale investigations of dynamic PPIs.

  6. OPAL: prediction of MoRF regions in intrinsically disordered protein sequences.

    Science.gov (United States)

    Sharma, Ronesh; Raicar, Gaurav; Tsunoda, Tatsuhiko; Patil, Ashwini; Sharma, Alok

    2018-06-01

    Intrinsically disordered proteins lack stable 3-dimensional structure and play a crucial role in performing various biological functions. Key to their biological function are the molecular recognition features (MoRFs) located within long disordered regions. Computationally identifying these MoRFs from disordered protein sequences is a challenging task. In this study, we present a new MoRF predictor, OPAL, to identify MoRFs in disordered protein sequences. OPAL utilizes two independent sources of information computed using different component predictors. The scores are processed and combined using a common averaging method. The first score is computed using a component MoRF predictor which utilizes composition and sequence similarity of MoRF and non-MoRF regions to detect MoRFs. The second score is calculated using half-sphere exposure (HSE), solvent accessible surface area (ASA) and backbone angle information of the disordered protein sequence, using information from the amino acid properties of flanks surrounding the MoRFs to distinguish MoRF and non-MoRF residues. OPAL is evaluated using test sets that were previously used to evaluate MoRF predictors, MoRFpred, MoRFchibi and MoRFchibi-web. The results demonstrate that OPAL outperforms all the available MoRF predictors and is the most accurate predictor available for MoRF prediction. It is available at http://www.alok-ai-lab.com/tools/opal/. Contact: ashwini@hgc.jp or alok.sharma@griffith.edu.au. Supplementary data are available at Bioinformatics online.
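The score-combination step mentioned in this abstract can be illustrated with a minimal sketch. It assumes two hypothetical per-residue score tracks already scaled to the range 0 to 1 and combines them with a plain arithmetic mean. The function names, the 0.5 threshold and the example values are illustrative assumptions and are not taken from OPAL itself.

```python
from typing import List


def combine_scores(score_a: List[float], score_b: List[float]) -> List[float]:
    """Average two per-residue score tracks of equal length."""
    if len(score_a) != len(score_b):
        raise ValueError("score tracks must cover the same residues")
    return [(a + b) / 2.0 for a, b in zip(score_a, score_b)]


def call_residues(scores: List[float], threshold: float = 0.5) -> List[bool]:
    """Flag residues whose combined score reaches the (assumed) threshold."""
    return [s >= threshold for s in scores]


# Hypothetical scores for a five-residue stretch from two component predictors.
component_1 = [0.2, 0.8, 0.9, 0.4, 0.1]
component_2 = [0.4, 0.6, 0.7, 0.8, 0.1]
combined = combine_scores(component_1, component_2)
flags = call_residues(combined)
```

Averaging lets a residue be called when one component is confident and the other is only moderately so, as with the fourth residue in the toy data.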

  7. Chemistry and stability of thiol based polyethylene glycol surface coatings on colloidal gold and their relationship to protein adsorption and clearance in vivo

    Science.gov (United States)

    Carpinone, Paul

    Nanomaterials have presented a wide range of novel biomedical applications, with particular emphasis placed on advances in imaging and treatment delivery. Of the many particulate nanomaterials researched for biomedical applications, gold is one of the most widely used. Colloidal gold has been of great interest due to its chemical inertness and its ability to perform multiple functions, such as drug delivery, localized heating of tissues (hyperthermia), and imaging (as a contrast agent). It is also readily functionalized through the use of thiols, which spontaneously form sulfur to gold bonds with the surface. Polyethylene glycol (PEG) is the most widely used coating material for these particles as it provides both steric stability to the suspension and protein resistance. These properties extend the circulation time of the particles in blood, and consequently the efficacy of the treatment. Despite widespread use of PEG coated gold particles, the coating chemistry and stability of these particles are largely unknown. The goal of this work was to identify the mechanisms leading to degradation and stability of thiol based polyethylene glycol coatings on gold particles and to relate this behavior to protein adsorption and clearance in vivo. The results indicate that the protective PEG coating is susceptible to sources of oxidation (including dissolved oxygen) and competing adsorbates, among other factors. The quality of commercially available thiolated PEG reagents was also found to play a key role in the quality and protein resistance of the final PEG coating. Analysis of the stability of these coatings indicated that they rapidly degrade under physiological conditions, leading to the onset of protein adsorption when exposed to plasma or blood. Paralleling the protein adsorption behavior and onset of coating degradation observed in vitro, blood clearance of parenterally administered PEG coated particles in mice began after approximately 2h of circulation time. 

  8. EST-PAC a web package for EST annotation and protein sequence prediction

    Directory of Open Access Journals (Sweden)

    Strahm Yvan

    2006-10-01

    With the decreasing cost of DNA sequencing technology and the vast diversity of biological resources, researchers increasingly face the basic challenge of annotating a larger number of expressed sequence tags (ESTs) from a variety of species. This typically consists of a series of repetitive tasks, which should be automated and easy to use. The results of these annotation tasks need to be stored and organized in a consistent way. All these operations should be self-installing, platform independent, easy to customize and amenable to using distributed bioinformatics resources available on the Internet. In order to address these issues, we present EST-PAC, a web-oriented multi-platform software package for expressed sequence tag (EST) annotation. EST-PAC provides a solution for the administration of EST and protein sequence annotations accessible through a web interface. Three aspects of EST annotation are automated: (1) searching local or remote biological databases for sequence similarities using Blast services, (2) predicting protein coding sequence from EST data, and (3) annotating predicted protein sequences with functional domain predictions. In practice, EST-PAC integrates the BLASTALL suite, EST-Scan2 and HMMER in a relational database system accessible through a simple web interface. EST-PAC also takes advantage of the relational database to allow consistent storage, powerful queries of results, and management of the annotation process. The system allows users to customize annotation strategies and provides an open-source data-management environment for research and education in bioinformatics.

  9. Analysis of the epitope structure of Plum pox virus coat protein.

    Science.gov (United States)

    Candresse, Thierry; Saenz, Pilar; García, Juan Antonio; Boscia, Donato; Navratil, Milan; Gorris, Maria Teresa; Cambra, Mariano

    2011-05-01

    Typing of the particular Plum pox virus (PPV) strain responsible in an outbreak has important practical implications and is frequently performed using strain-specific monoclonal antibodies (MAbs). Analysis in Western blots of the reactivity of 24 MAbs to a 112-amino-acid N-terminal fragment of the PPV coat protein (CP) expressed in Escherichia coli showed that 21 of the 24 MAbs recognized linear or denaturation-insensitive epitopes. A series of eight C-truncated CP fragments allowed the mapping of the epitopes recognized by the MAbs. In all, 14 of them reacted to the N-terminal hypervariable region, defining a minimum of six epitopes, while 7 reacted to the beginning of the core region, defining a minimum of three epitopes. Sequence comparisons allowed the more precise positioning of regions recognized by several MAbs, including those recognized by the 5B-IVIA universal MAb (amino acids 94 to 100) and by the 4DG5 and 4DG11 D serogroup-specific MAbs (amino acids 43 to 64). A similar approach coupled with infectious cDNA clone mutagenesis showed that a V74T mutation in the N-terminus of the CP abolished the binding of the M serogroup-specific AL MAb. Taken together, these results provide a detailed positioning of the epitopes recognized by the most widely used PPV detection and typing MAbs.

  10. DeepGO: predicting protein functions from sequence and interactions using a deep ontology-aware classifier

    KAUST Repository

    Kulmanov, Maxat

    2017-09-27

    Motivation: A large number of protein sequences are becoming available through the application of novel high-throughput sequencing technologies. Experimental functional characterization of these proteins is time-consuming and expensive, and is often only done rigorously for a few selected model organisms. Computational function prediction approaches have been suggested to fill this gap. The functions of proteins are classified using the Gene Ontology (GO), which contains over 40 000 classes. Additionally, proteins have multiple functions, making function prediction a large-scale, multi-class, multi-label problem. Results: We have developed a novel method to predict protein function from sequence. We use deep learning to learn features from protein sequences as well as a cross-species protein–protein interaction network. Our approach specifically outputs information in the structure of the GO and utilizes the dependencies between GO classes as background information to construct a deep learning model. We evaluate our method using the standards established by the Computational Assessment of Function Annotation (CAFA) and demonstrate a significant improvement over baseline methods such as BLAST, in particular for predicting cellular locations.
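One simple way to keep multi-label predictions consistent with an ontology's class dependencies is to give every class at least the score of its highest-scoring descendant. The sketch below shows that post-processing idea on a toy hierarchy. It is not DeepGO's model, which builds the hierarchy into the network itself, and the class names and scores are hypothetical placeholders rather than real GO terms.

```python
from typing import Dict, List


def propagate_scores(scores: Dict[str, float],
                     parents: Dict[str, List[str]]) -> Dict[str, float]:
    """Raise each ancestor's score to at least that of its descendants."""
    result = dict(scores)

    def lift(term: str, value: float) -> None:
        # Walk upward and overwrite any ancestor score that is too low.
        for parent in parents.get(term, []):
            if result.get(parent, 0.0) < value:
                result[parent] = value
                lift(parent, value)

    for term, value in scores.items():
        lift(term, value)
    return result


# Toy ontology: child -> list of parents (hypothetical class names).
toy_parents = {
    "dna_binding": ["binding"],
    "binding": ["function"],
    "kinase": ["function"],
}
raw = {"dna_binding": 0.9, "binding": 0.4, "kinase": 0.2, "function": 0.5}
consistent = propagate_scores(raw, toy_parents)
```

After propagation a protein predicted to have a specific function is also predicted, with at least the same score, to have every more general function above it.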

  11. Gene identification and protein classification in microbial metagenomic sequence data via incremental clustering

    Directory of Open Access Journals (Sweden)

    Li Weizhong

    2008-04-01

    Background: The identification and study of proteins from metagenomic datasets can shed light on the roles and interactions of the source organisms in their communities. However, metagenomic datasets are characterized by the presence of organisms with varying GC composition, codon usage biases etc., and consequently gene identification is challenging. The vast amount of sequence data also requires faster protein family classification tools. Results: We present a computational improvement to a sequence clustering approach that we developed previously to identify and classify protein coding genes in large microbial metagenomic datasets. The clustering approach can be used to identify protein coding genes in prokaryotes, viruses, and intron-less eukaryotes. The computational improvement is based on an incremental clustering method that does not require the expensive all-against-all compute that was required by the original approach, while still preserving the remote homology detection capabilities. We present evaluations of the clustering approach in protein-coding gene identification and classification, and also present the results of updating the protein clusters from our previous work with recent genomic and metagenomic sequences. The clustering results are available via CAMERA (http://camera.calit2.net). Conclusion: The clustering paradigm is shown to be a very useful tool in the analysis of microbial metagenomic data. The incremental clustering method is shown to be much faster than the original approach in identifying genes, grouping sequences into existing protein families, and also identifying novel families that have multiple members in a metagenomic dataset. These clusters provide a basis for further studies of protein families.
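The general idea of incremental clustering, as opposed to an all-against-all comparison, can be sketched as follows. Each new sequence is compared only against one representative per existing cluster and either joins the best match or founds a new cluster. This is a minimal illustration under stated assumptions: the k-mer Jaccard similarity is a crude stand-in for real homology detection, and the function names, threshold and peptide strings are hypothetical rather than the authors' implementation.

```python
from typing import Dict, List, Set


def kmers(seq: str, k: int = 3) -> Set[str]:
    """Return the set of overlapping k-mers in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def similarity(a: str, b: str, k: int = 3) -> float:
    """Jaccard similarity of k-mer sets (a crude stand-in for alignment)."""
    ka, kb = kmers(a, k), kmers(b, k)
    if not ka or not kb:
        return 0.0
    return len(ka & kb) / len(ka | kb)


def add_sequences(clusters: Dict[str, List[str]], new_seqs: List[str],
                  threshold: float = 0.5) -> Dict[str, List[str]]:
    """Assign each new sequence to an existing cluster or start a new one.

    Keys of `clusters` are representative sequences; a new sequence is
    compared only against representatives, never against every member.
    """
    for seq in new_seqs:
        best = max(clusters, key=lambda rep: similarity(seq, rep), default=None)
        if best is not None and similarity(seq, best) >= threshold:
            clusters[best].append(seq)
        else:
            clusters[seq] = [seq]  # the sequence founds a new cluster
    return clusters


# Toy peptide strings; real input would be predicted ORF translations.
clusters = add_sequences({}, ["MKTAYIAKQR", "MKTAYIAKQK"])
# A later batch is folded in without re-clustering the earlier sequences.
clusters = add_sequences(clusters, ["GGSGGSGGSG", "MKTAYIAKQL"])
```

The cost of adding a batch grows with the number of clusters rather than with the total number of sequences already processed, which is the property that makes updates cheap.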

  12. Amino acid sequences of ribosomal proteins S11 from Bacillus stearothermophilus and S19 from Halobacterium marismortui. Comparison of the ribosomal protein S11 family.

    Science.gov (United States)

    Kimura, M; Kimura, J; Hatakeyama, T

    1988-11-21

    The complete amino acid sequences of ribosomal proteins S11 from the Gram-positive eubacterium Bacillus stearothermophilus and of S19 from the archaebacterium Halobacterium marismortui have been determined. A search for homologous sequences of these proteins revealed that they belong to the ribosomal protein S11 family. Homologous proteins have previously been sequenced from Escherichia coli as well as from chloroplast, yeast and mammalian ribosomes. A pairwise comparison of the amino acid sequences showed that Bacillus protein S11 shares 68% identical residues with S11 from Escherichia coli and a slightly lower homology (52%) with the homologous chloroplast protein. The halophilic protein S19 is more related to the eukaryotic (45-49%) than to the eubacterial counterparts (35%).
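Percent-identity figures like those quoted in this abstract come from counting matching residues in a pairwise alignment. The sketch below shows one common convention, identity over the columns in which neither sequence has a gap. The record does not state which denominator the authors used, so that choice, the function name and the toy alignment are assumptions.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identical residues over columns where neither sequence is gapped."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue  # skip columns containing a gap
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Hypothetical eight-column alignment: 5 identical residues in 7 ungapped columns.
pid = percent_identity("MAKQ-TRV", "MARQGTKV")
```

Other conventions divide by the full alignment length or by the length of the shorter sequence, so reported values are only comparable when the denominator is the same.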

  13. Efficient use of unlabeled data for protein sequence classification: a comparative study.

    Science.gov (United States)

    Kuksa, Pavel; Huang, Pai-Hsi; Pavlovic, Vladimir

    2009-04-29

    Recent studies in computational primary protein sequence analysis have leveraged the power of unlabeled data. For example, predictive models based on string kernels trained on sequences known to belong to particular folds or superfamilies, the so-called labeled data set, can attain significantly improved accuracy if this data is supplemented with protein sequences that lack any class tags-the unlabeled data. In this study, we present a principled and biologically motivated computational framework that more effectively exploits the unlabeled data by only using the sequence regions that are more likely to be biologically relevant for better prediction accuracy. As overly-represented sequences in large uncurated databases may bias the estimation of computational models that rely on unlabeled data, we also propose a method to remove this bias and improve performance of the resulting classifiers. Combined with state-of-the-art string kernels, our proposed computational framework achieves very accurate semi-supervised protein remote fold and homology detection on three large unlabeled databases. It outperforms current state-of-the-art methods and exhibits significant reduction in running time. The unlabeled sequences used under the semi-supervised setting resemble the unpolished gemstones; when used as-is, they may carry unnecessary features and hence compromise the classification accuracy but once cut and polished, they improve the accuracy of the classifiers considerably.

  14. TRDistiller: a rapid filter for enrichment of sequence datasets with proteins containing tandem repeats.

    Science.gov (United States)

    Richard, François D; Kajava, Andrey V

    2014-06-01

    The dramatic growth of sequencing data evokes an urgent need to improve bioinformatics tools for large-scale proteome analysis. Over the last two decades, the foremost efforts of computer scientists were devoted to proteins with aperiodic sequences having globular 3D structures. However, a large portion of proteins contain periodic sequences representing arrays of repeats that are directly adjacent to each other (so called tandem repeats or TRs). These proteins frequently fold into elongated fibrous structures carrying different fundamental functions. Algorithms specific to the analysis of these regions are urgently required since the conventional approaches developed for globular domains have had limited success when applied to the TR regions. The protein TRs are frequently not perfect, containing a number of mutations, and some of them cannot be easily identified. To detect such "hidden" repeats several algorithms have been developed. However, the most sensitive among them are time-consuming and, therefore, inappropriate for large scale proteome analysis. To speed up the TR detection we developed a rapid filter that is based on the comparison of composition and order of short strings in the adjacent sequence motifs. Tests show that our filter discards up to 22.5% of proteins which are known to be without TRs while keeping almost all (99.2%) TR-containing sequences. Thus, we are able to decrease the size of the initial sequence dataset enriching it with TR-containing proteins which allows a faster subsequent TR detection by other methods. The program is available upon request. Copyright © 2014 Elsevier Inc. All rights reserved.
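A pre-filter of the kind described here can be pictured as a cheap score that measures how often short strings recur close to themselves, with low-scoring sequences discarded before a slower repeat detector runs. The sketch below is not the TRDistiller algorithm, whose details are not given in this record. The scoring rule, the word length, the window and the 0.3 cut-off are all hypothetical choices made for illustration.

```python
from typing import List


def repeat_signal(seq: str, k: int = 2, window: int = 10) -> float:
    """Fraction of k-mers that occur again within `window` residues downstream."""
    n = len(seq) - k + 1
    if n <= 0:
        return 0.0
    hits = 0
    for i in range(n):
        word = seq[i:i + k]
        # Look for the same short string starting 1..window positions later.
        nearby = seq[i + 1:i + k + window]
        if word in nearby:
            hits += 1
    return hits / n


def prefilter(seqs: List[str], cutoff: float = 0.3) -> List[str]:
    """Keep only sequences whose repeat signal reaches the (assumed) cut-off."""
    return [s for s in seqs if repeat_signal(s) >= cutoff]


# Toy input: one periodic peptide and one with no recurring dipeptides.
candidates = ["PGAPGAPGAPGA", "MKTAYIAKQRLEDV"]
kept = prefilter(candidates)
```

Because the score needs only substring lookups, it runs far faster than alignment-based repeat detection, at the price of occasionally passing sequences that a full detector would later reject.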

  15. Sifting through genomes with iterative-sequence clustering produces a large, phylogenetically diverse protein-family resource.

    Science.gov (United States)

    Sharpton, Thomas J; Jospin, Guillaume; Wu, Dongying; Langille, Morgan G I; Pollard, Katherine S; Eisen, Jonathan A

    2012-10-13

    New computational resources are needed to manage the increasing volume of biological data from genome sequencing projects. One fundamental challenge is the ability to maintain a complete and current catalog of protein diversity. We developed a new approach for the identification of protein families that focuses on the rapid discovery of homologous protein sequences. We implemented fully automated and high-throughput procedures to de novo cluster proteins into families based upon global alignment similarity. Our approach employs an iterative clustering strategy in which homologs of known families are sifted out of the search for new families. The resulting reduction in computational complexity enables us to rapidly identify novel protein families found in new genomes and to perform efficient, automated updates that keep pace with genome sequencing. We refer to protein families identified through this approach as "Sifting Families," or SFams. Our analysis of ~10.5 million protein sequences from 2,928 genomes identified 436,360 SFams, many of which are not represented in other protein family databases. We validated the quality of SFam clustering through statistical as well as network topology-based analyses. We describe the rapid identification of SFams and demonstrate how they can be used to annotate genomes and metagenomes. The SFam database catalogs protein-family quality metrics, multiple sequence alignments, hidden Markov models, and phylogenetic trees. Our source code and database are publicly available and will be subject to frequent updates (http://edhar.genomecenter.ucdavis.edu/sifting_families/).

  16. Extreme sequence divergence but conserved ligand-binding specificity in Streptococcus pyogenes M protein.

    Directory of Open Access Journals (Sweden)

    2006-05-01

    Many pathogenic microorganisms evade host immunity through extensive sequence variability in a protein region targeted by protective antibodies. In spite of the sequence variability, a variable region commonly retains an important ligand-binding function, reflected in the presence of a highly conserved sequence motif. Here, we analyze the limits of sequence divergence in a ligand-binding region by characterizing the hypervariable region (HVR) of Streptococcus pyogenes M protein. Our studies were focused on HVRs that bind the human complement regulator C4b-binding protein (C4BP), a ligand that confers phagocytosis resistance. A previous comparison of C4BP-binding HVRs identified residue identities that could be part of a binding motif, but the extended analysis reported here shows that no residue identities remain when additional C4BP-binding HVRs are included. Characterization of the HVR in the M22 protein indicated that two relatively conserved Leu residues are essential for C4BP binding, but these residues are probably core residues in a coiled-coil, implying that they do not directly contribute to binding. In contrast, substitution of either of two relatively conserved Glu residues, predicted to be solvent-exposed, had no effect on C4BP binding, although each of these changes had a major effect on the antigenic properties of the HVR. Together, these findings show that HVRs of M proteins have an extraordinary capacity for sequence divergence and antigenic variability while retaining a specific ligand-binding function.

  17. DeepGO: predicting protein functions from sequence and interactions using a deep ontology-aware classifier

    KAUST Repository

    Kulmanov, Maxat; Khan, Mohammed Asif; Hoehndorf, Robert

    2017-01-01

    A large number of protein sequences are becoming available through the application of novel high-throughput sequencing technologies. Experimental functional characterization of these proteins is time-consuming and expensive, and is often

  18. PCP-B class pollen coat proteins are key regulators of the hydration checkpoint in Arabidopsis thaliana pollen-stigma interactions.

    Science.gov (United States)

    Wang, Ludi; Clarke, Lisa A; Eason, Russell J; Parker, Christopher C; Qi, Baoxiu; Scott, Rod J; Doughty, James

    2017-01-01

    The establishment of pollen-pistil compatibility is strictly regulated by factors derived from both male and female reproductive structures. Highly diverse small cysteine-rich proteins (CRPs) have been found to play multiple roles in plant reproduction, including the earliest stages of the pollen-stigma interaction. Secreted CRPs found in the pollen coat of members of the Brassicaceae, the pollen coat proteins (PCPs), are emerging as important signalling molecules that regulate the pollen-stigma interaction. Using a combination of protein characterization, expression and phylogenetic analyses we identified a novel class of Arabidopsis thaliana pollen-borne CRPs, the PCP-Bs (for pollen coat protein B-class) that are related to embryo surrounding factor (ESF1) developmental regulators. Single and multiple PCP-B mutant lines were utilized in bioassays to assess effects on pollen hydration, adhesion and pollen tube growth. Our results revealed that pollen hydration is severely impaired when multiple PCP-Bs are lost from the pollen coat. The hydration defect also resulted in reduced pollen adhesion and delayed pollen tube growth in all mutants studied. These results demonstrate that AtPCP-Bs are key regulators of the hydration 'checkpoint' in establishment of pollen-stigma compatibility. In addition, we propose that interspecies diversity of PCP-Bs may contribute to reproductive barriers in the Brassicaceae. © 2016 The Authors. New Phytologist © 2016 New Phytologist Trust.

  19. RTA, a candidate G protein-coupled receptor: Cloning, sequencing, and tissue distribution

    International Nuclear Information System (INIS)

    Ross, P.C.; Figler, R.A.; Corjay, M.H.; Barber, C.M.; Adam, N.; Harcus, D.R.; Lynch, K.R.

    1990-01-01

    Genomic and cDNA clones, encoding a protein that is a member of the guanine nucleotide-binding regulatory protein (G protein)-coupled receptor superfamily, were isolated by screening rat genomic and thoracic aorta cDNA libraries with an oligonucleotide encoding a highly conserved region of the M1 muscarinic acetylcholine receptor. Sequence analyses of these clones showed that they encode a 343-amino acid protein (named RTA). The RTA gene is single copy, as demonstrated by restriction mapping and Southern blotting of genomic clones and rat genomic DNA. RTA RNA sequences are relatively abundant throughout the gut, vas deferens, uterus, and aorta but are only barely detectable (on Northern blots) in liver, kidney, lung, and salivary gland. In the rat brain, RTA sequences are markedly abundant in the cerebellum. RTA is most closely related to the mas oncogene (34% identity), which has been suggested to be a forebrain angiotensin receptor. They conclude that RTA is not an angiotensin receptor; to date, they have been unable to identify its ligand.

  20. Denatured protein-coated docetaxel nanoparticles: Alterable drug state and cytosolic delivery.

    Science.gov (United States)

    Zhang, Li; Xiao, Qingqing; Wang, Yiran; Zhang, Chenshuang; He, Wei; Yin, Lifang

    2017-05-15

    Many lead compounds have a low solubility in water, which substantially hinders their clinical application. Nanosuspensions have been considered a promising strategy for the delivery of water-insoluble drugs. Here, denatured soy protein isolate (SPI)-coated docetaxel nanosuspensions (DTX-NS) were developed using an anti-solvent precipitation-ultrasonication method to improve the water-solubility of DTX, thus improving its intracellular delivery. DTX-NS, with a diameter of 150-250nm and drug-loading up to 18.18%, were successfully prepared by coating drug particles with SPI. Interestingly, the drug state of DTX-NS was alterable. Amorphous drug nanoparticles were obtained at low drug-loading, whereas at a high drug-loading, the DTX-NS drug was mainly present in the crystalline state. Moreover, DTX-NS could be internalized at high levels by cancer cells and enter the cytosol by lysosomal escape, enhancing cell cytotoxicity and apoptosis compared with free DTX. Taken together, denatured SPI has a strong stabilization effect on nanosuspensions, and the drug state in SPI-coated nanosuspensions is alterable by changing the drug-loading. Moreover, DTX-NS could achieve cytosolic delivery, generating enhanced cell cytotoxicity against cancer cells. Copyright © 2017 Elsevier B.V. All rights reserved.

  1. Sequence-specific capture of protein-DNA complexes for mass spectrometric protein identification.

    Directory of Open Access Journals (Sweden)

    Cheng-Hsien Wu

    The regulation of gene transcription is fundamental to the existence of complex multicellular organisms such as humans. Although it is widely recognized that much of gene regulation is controlled by gene-specific protein-DNA interactions, there presently exists little in the way of tools to identify proteins that interact with the genome at locations of interest. We have developed a novel strategy to address this problem, which we refer to as GENECAPP, for Global ExoNuclease-based Enrichment of Chromatin-Associated Proteins for Proteomics. In this approach, formaldehyde cross-linking is employed to covalently link DNA to its associated proteins; subsequent fragmentation of the DNA, followed by exonuclease digestion, produces a single-stranded region of the DNA that enables sequence-specific hybridization capture of the protein-DNA complex on a solid support. Mass spectrometric (MS) analysis of the captured proteins is then used for their identification and/or quantification. We show here the development and optimization of GENECAPP for an in vitro model system, comprised of the murine insulin-like growth factor-binding protein 1 (IGFBP1) promoter region and FoxO1, a member of the forkhead rhabdomyosarcoma (FoxO) subfamily of transcription factors, which binds specifically to the IGFBP1 promoter. This novel strategy provides a powerful tool for studies of protein-DNA and protein-protein interactions.

  2. Automated sequence-specific protein NMR assignment using the memetic algorithm MATCH

    International Nuclear Information System (INIS)

    Volk, Jochen; Herrmann, Torsten; Wuethrich, Kurt

    2008-01-01

    MATCH (Memetic Algorithm and Combinatorial Optimization Heuristics) is a new memetic algorithm for automated sequence-specific polypeptide backbone NMR assignment of proteins. MATCH employs local optimization for tracing partial sequence-specific assignments within a global, population-based search environment, where the simultaneous application of local and global optimization heuristics guarantees high efficiency and robustness. MATCH thus makes combined use of the two predominant concepts in use for automated NMR assignment of proteins. Dynamic transition and inherent mutation are new techniques that enable automatic adaptation to variable quality of the experimental input data. The concept of dynamic transition is incorporated in all major building blocks of the algorithm, where it enables switching between local and global optimization heuristics at any time during the assignment process. Inherent mutation restricts the intrinsically required randomness of the evolutionary algorithm to those regions of the conformation space that are compatible with the experimental input data. Using intact and artificially deteriorated APSY-NMR input data of proteins, MATCH performed sequence-specific resonance assignment with high efficiency and robustness.

  3. Cell wall proteins in seedling cotyledons of Prosopis chilensis.

    Science.gov (United States)

    Rodríguez, J G; Cardemil, L

    1994-01-01

    Four cell wall proteins of cotyledons of Prosopis chilensis seedlings were characterized by PAGE and Western analyses using a polyclonal antibody, generated against soybean seed coat extensin. These proteins had M(r)s of 180,000, 126,000, 107,000 and 63,000, as determined by SDS-PAGE. The proteins exhibited a fluorescent positive reaction with dansylhydrazine suggesting that they are glycoproteins; they did not show peroxidase activity. The cell wall proteins were also characterized by their amino acid composition and by their amino-terminal sequence. These analyses revealed that there are two groups of related cell wall proteins in the cotyledons. The first group comprises the proteins of M(r)s 180,000, 126,000, 107,000 which are rich in glutamic acid/glutamine and aspartic acid/asparagine and they have almost identical NH2-terminal sequences. The second group comprises the M(r) 63,000 protein which is rich in proline, glycine, valine and tyrosine, with an NH2-terminal sequence which was very similar to that of soybean proline-rich proteins.

  4. Sorting of a HaloTag protein that has only a signal peptide sequence into exocrine secretory granules without protein aggregation.

    Science.gov (United States)

    Fujita-Yoshigaki, Junko; Matsuki-Fukushima, Miwako; Yokoyama, Megumi; Katsumata-Kato, Osamu

    2013-11-15

    The mechanism involved in the sorting and accumulation of secretory cargo proteins, such as amylase, into secretory granules of exocrine cells remains to be solved. To clarify that sorting mechanism, we expressed a reporter protein HaloTag fused with partial sequences of salivary amylase protein in primary cultured parotid acinar cells. We found that a HaloTag protein fused with only the signal peptide sequence (Met(1)-Ala(25)) of amylase, termed SS25H, colocalized well with endogenous amylase, which was confirmed by immunofluorescence microscopy. Percoll-density gradient centrifugation of secretory granule fractions shows that the distributions of amylase and SS25H were similar. These results suggest that SS25H is transported to secretory granules and is not discriminated from endogenous amylase by the machinery that functions to remove proteins other than granule cargo from immature granules. Another reporter protein, DsRed2, that has the same signal peptide sequence also colocalized with amylase, suggesting that the sorting to secretory granules is not dependent on a characteristic of the HaloTag protein. Whereas Blue Native PAGE demonstrates that endogenous amylase forms a high-molecular-weight complex, SS25H does not participate in the complex and does not form self-aggregates. Nevertheless, SS25H was released from cells by the addition of a β-adrenergic agonist, isoproterenol, which also induces amylase secretion. These results indicate that addition of the signal peptide sequence, which is necessary for the translocation in the endoplasmic reticulum, is sufficient for the transportation and storage of cargo proteins in secretory granules of exocrine cells.

  5. Apoptosis inhibitor of macrophage (AIM) diminishes lipid droplet-coating proteins leading to lipolysis in adipocytes

    International Nuclear Information System (INIS)

    Iwamura, Yoshihiro; Mori, Mayumi; Nakashima, Katsuhiko; Mikami, Toshiyuki; Murayama, Katsuhisa; Arai, Satoko; Miyazaki, Toru

    2012-01-01

    Highlights: ► AIM induces lipolysis in a distinct manner from that of hormone-dependent lipolysis. ► AIM ablates activity of peroxisome proliferator-activated receptor in adipocytes. ► AIM reduces mRNA levels of lipid-droplet coating proteins leading to lipolysis. -- Abstract: Under fasting conditions, triacylglycerol in adipose tissue undergoes lipolysis to supply fatty acids as energy substrates. Such lipolysis is regulated by hormones, which activate lipases via stimulation of specific signalling cascades. We previously showed that macrophage-derived soluble protein, AIM induces obesity-associated lipolysis, triggering chronic inflammation in fat tissue which causes insulin resistance. However, the mechanism of how AIM mediates lipolysis remains unknown. Here we show that AIM induces lipolysis in a manner distinct from that of hormone-dependent lipolysis, without activation or augmentation of lipases. In vivo and in vitro, AIM did not enhance phosphorylation of hormone-sensitive lipase (HSL) in adipocytes, a hallmark of hormone-dependent lipolysis activation. Similarly, adipose tissue from obese AIM-deficient and wild-type mice showed comparable HSL phosphorylation. Consistent with the suppressive effect of AIM on fatty acid synthase activity, the amount of saturated and unsaturated fatty acids was reduced in adipocytes treated with AIM. This response ablated transcriptional activity of peroxisome proliferator-activated receptor (PPARγ), leading to diminished gene expression of lipid-droplet coating proteins including fat-specific protein 27 (FSP27) and Perilipin, which are indispensable for triacylglycerol storage in adipocytes. Accordingly, the lipolytic effect of AIM was overcome by a PPARγ-agonist or forced expression of FSP27, while it was synergized by a PPARγ-antagonist. Overall, distinct modes of lipolysis appear to take place in different physiological situations; one is a supportive response against nutritional deprivation achieved by

  6. The N-terminal sequence of ribosomal protein L10 from the archaebacterium Halobacterium marismortui and its relationship to eubacterial protein L6 and other ribosomal proteins.

    Science.gov (United States)

    Dijk, J; van den Broek, R; Nasiulas, G; Beck, A; Reinhardt, R; Wittmann-Liebold, B

    1987-08-01

    The amino-terminal sequence of ribosomal protein L10 from Halobacterium marismortui has been determined up to residue 54, using both a liquid- and a gas-phase sequenator. The two sequences are in good agreement. The protein is clearly homologous to protein HcuL10 from the related strain Halobacterium cutirubrum. Furthermore, a weaker but distinct homology to ribosomal protein L6 from Escherichia coli and Bacillus stearothermophilus can be detected. In addition to 7 identical amino acids in the first 36 residues in all four sequences a number of conservative replacements occurs, of mainly hydrophobic amino acids. In this common region the pattern of conserved amino acids suggests the presence of a beta-alpha fold as it occurs in ribosomal proteins L12 and L30. Furthermore, several potential cases of homology to other ribosomal components of the three ur-kingdoms have been found.

  7. Sequence of a cDNA encoding turtle high mobility group 1 protein.

    Science.gov (United States)

    Zheng, Jifang; Hu, Bi; Wu, Duansheng

    2005-07-01

    To obtain sequence information about the turtle HMG1 gene, a cDNA encoding the HMG1 protein of the Chinese soft-shell turtle (Pelodiscus sinensis) was amplified by RT-PCR from kidney total RNA, and was cloned, sequenced and analyzed. The results revealed that the open reading frame (ORF) of turtle HMG1 cDNA is 606 bp long. The ORF encodes 202 amino acid residues, from which two DNA-binding domains and one polyacidic region are derived. The DNA-binding domains share higher amino acid identity with the homologous sequences of chicken (96.5%) and mammals (74%) than with the homologous sequence of rainbow trout (67%). The polyacidic region shows 84.6% amino acid homology with the equivalent region of chicken HMG1 cDNA. Turtle HMG1 protein contains 3 Cys residues located at completely conserved positions. Conservation in sequence and structure suggests that the functions of turtle HMG1 cDNA may be highly conserved during evolution. To our knowledge, this is the first report of an HMG1 cDNA sequence in any reptile.

  8. Effects of protein-coated nanofibers on conformation of gingival fibroblast spheroids: potential utility for connective tissues regeneration.

    Science.gov (United States)

    Kaufman, Gili; Whitescarver, Ryan; Nunes, Laiz; Palmer, Xavier-Lewis; Skrtic, Drago; Tutak, Wojtek

    2017-10-09

    Deep wounds in the gingiva caused by trauma or surgery require a rapid and robust healing of connective tissues. We propose utilizing gas-brushed nanofibers coated with collagen and fibrin for that purpose. Our hypotheses are that protein-coated nanofibers will: (i) attract and mobilize cells in various spatial orientations, and (ii) regulate the expression levels of specific extracellular matrix (ECM)-associated proteins, determining the initial conformational nature of dense and soft connective tissues. Gingival fibroblast monolayers and 3D spheroids were cultured on ECM substrate and covered with gas-blown poly-(DL-lactide-co-glycolide) (PLGA) nanofibers (uncoated/coated with collagen and fibrin). Cell attraction and rearrangement were followed by F-actin staining and confocal microscopy. Thicknesses of the cell layers, developed within the nanofibers, were quantified by ImageJ software. The expression of collagen1α1 chain (Col1α1), fibronectin, and metalloproteinase 2 (MMP2) encoding genes was determined by quantitative reverse transcription analysis. Collagen- and fibrin-coated nanofibers induced cell migration toward fibers and supported cellular growth within the scaffolds. Both proteins affected the spatial rearrangement of fibroblasts by favoring packed cell clusters or intermittent cell spreading. These cell arrangements resembled the structural characteristics of dense and soft connective tissues, respectively. Within 3 days of incubation, fibroblast spheroids interacted with the fibers and grew robustly by increasing their thickness compared to monolayers. While the ECM key components, such as fibronectin and MMP2 encoding genes, were expressed in both protein groups, Col1α1 was predominantly expressed in bundled fibroblasts grown on collagen fibers. This enhanced expression of collagen1 is typical for dense connective tissue. Based on results of this study, our gas-blown, collagen- and fibrin-coated PLGA nanofibers are viable candidates for

  9. Effects of protein-coated nanofibers on conformation of gingival fibroblast spheroids: potential utility for connective tissue regeneration.

    Science.gov (United States)

    Kaufman, Gili; Whitescarver, Ryan A; Nunes, Laiz; Palmer, Xavier-Lewis; Skrtic, Drago; Tutak, Wojtek

    2018-01-24

    Deep wounds in the gingiva caused by trauma or surgery require a rapid and robust healing of connective tissues. We propose utilizing gas-brushed nanofibers coated with collagen and fibrin for that purpose. Our hypotheses are that protein-coated nanofibers will: (i) attract and mobilize cells in various spatial orientations, and (ii) regulate the expression levels of specific extracellular matrix (ECM)-associated proteins, determining the initial conformational nature of dense and soft connective tissues. Gingival fibroblast monolayers and 3D spheroids were cultured on ECM substrate and covered with gas-blown poly-(DL-lactide-co-glycolide) (PLGA) nanofibers (uncoated/coated with collagen and fibrin). Cell attraction and rearrangement were followed by F-actin staining and confocal microscopy. Thicknesses of the cell layers, developed within the nanofibers, were quantified by ImageJ software. The expression of collagen1α1 chain (Col1α1), fibronectin, and metalloproteinase 2 (MMP2) encoding genes was determined by quantitative reverse transcription analysis. Collagen- and fibrin-coated nanofibers induced cell migration toward fibers and supported cellular growth within the scaffolds. Both proteins affected the spatial rearrangement of fibroblasts by favoring packed cell clusters or intermittent cell spreading. These cell arrangements resembled the structural characteristics of dense and soft connective tissues, respectively. Within three days of incubation, fibroblast spheroids interacted with the fibers and grew robustly by increasing their thickness compared to monolayers. While the ECM key components, such as fibronectin and MMP2 encoding genes, were expressed in both protein groups, Col1α1 was predominantly expressed in bundled fibroblasts grown on collagen fibers. This enhanced expression of collagen1 is typical for dense connective tissue. Based on results of this study, our gas-blown, collagen- and fibrin-coated PLGA nanofibers are viable candidates for

  10. Sifting through genomes with iterative-sequence clustering produces a large, phylogenetically diverse protein-family resource

    Directory of Open Access Journals (Sweden)

    Sharpton Thomas J

    2012-10-01

    Background: New computational resources are needed to manage the increasing volume of biological data from genome sequencing projects. One fundamental challenge is the ability to maintain a complete and current catalog of protein diversity. We developed a new approach for the identification of protein families that focuses on the rapid discovery of homologous protein sequences. Results: We implemented fully automated and high-throughput procedures to de novo cluster proteins into families based upon global alignment similarity. Our approach employs an iterative clustering strategy in which homologs of known families are sifted out of the search for new families. The resulting reduction in computational complexity enables us to rapidly identify novel protein families found in new genomes and to perform efficient, automated updates that keep pace with genome sequencing. We refer to protein families identified through this approach as “Sifting Families,” or SFams. Our analysis of ~10.5 million protein sequences from 2,928 genomes identified 436,360 SFams, many of which are not represented in other protein family databases. We validated the quality of SFam clustering through statistical as well as network topology–based analyses. Conclusions: We describe the rapid identification of SFams and demonstrate how they can be used to annotate genomes and metagenomes. The SFam database catalogs protein-family quality metrics, multiple sequence alignments, hidden Markov models, and phylogenetic trees. Our source code and database are publicly available and will be subject to frequent updates (http://edhar.genomecenter.ucdavis.edu/sifting_families/).
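The sifting idea in this record (first assign new sequences to known families, then cluster only the leftovers) can be illustrated with a toy sketch. It is not the SFam pipeline: the similarity score is a stand-in (`difflib.SequenceMatcher` ratio instead of a real global alignment), the 0.8 threshold is hypothetical, and each family is represented by its first member only.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Stand-in similarity in [0, 1]; an assumption, not the SFam alignment score."""
    return SequenceMatcher(None, a, b).ratio()


def sift_and_cluster(known, new_seqs, threshold=0.8):
    """Toy iterative sifting.

    known: dict mapping family name -> list of sequences (first = representative).
    Returns (updated known families, novel families found among the leftovers).
    """
    known = {name: list(members) for name, members in known.items()}
    leftovers = []

    # Step 1: sift homologs of known families out of the search.
    for seq in new_seqs:
        for members in known.values():
            if similarity(seq, members[0]) >= threshold:
                members.append(seq)
                break
        else:
            leftovers.append(seq)

    # Step 2: greedy de novo clustering restricted to the leftovers.
    novel = {}
    for seq in leftovers:
        for members in novel.values():
            if similarity(seq, members[0]) >= threshold:
                members.append(seq)
                break
        else:
            novel[f"novel_{len(novel) + 1}"] = [seq]

    return known, novel


if __name__ == "__main__":
    # Hypothetical toy sequences, not real proteins.
    known = {"famA": ["MKTAYIAKQR"]}
    new = ["MKTAYIAKQK", "GGGSSGGGSS", "GGGSSGGGSA", "WWWWPPPPHH"]
    updated, novel = sift_and_cluster(known, new)
    print(updated)
    print(novel)
```

Because step 2 only ever compares leftovers with each other, the number of pairwise comparisons shrinks as the catalog of known families grows, which is the complexity reduction the abstract refers to.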

  11. Effect of surfactant-coated iron oxide nanoparticles on the effluent water quality from a simulated sequencing batch reactor treating domestic wastewater

    International Nuclear Information System (INIS)

    Hwang, Sangchul; Martinez, Diana; Perez, Priscilla; Rinaldi, Carlos

    2011-01-01

    This study was conducted to evaluate the effect of commercially available engineered iron oxide nanoparticles coated with a surfactant (ENP Fe-surf) on effluent water quality from a lab-scale sequencing batch reactor as a model secondary biological wastewater treatment. Results showed that ∼8.7% of ENP Fe-surf applied were present in the effluent stream. The stable presence of ENP Fe-surf was confirmed by analyzing the mean particle diameter and iron concentration in the effluent. Consequently, aqueous ENP Fe-surf deteriorated the effluent water quality at a statistically significant level (p < 0.05). ENP Fe-surf would be introduced into environmental receptors through the treated effluent and could potentially impact them. - Highlights: → Surfactant-coated engineered iron oxide nanoparticles (ENP Fe-surf) were assessed. → Effluent quality was analyzed from a sequencing batch reactor with ENP Fe-surf. → ∼8.7% of ENP Fe-surf applied was present in the effluent. → ENP Fe-surf significantly (p < 0.05) deteriorated the effluent quality. → ENP Fe-surf will be introduced into environmental receptors. - Stable presence of surfactant-coated engineered iron oxide nanoparticles deteriorated the effluent water quality at a statistically significant level (p < 0.05).

  12. Screening of transgenic proteins expressed in transgenic food crops for the presence of short amino acid sequences identical to potential, IgE – binding linear epitopes of allergens

    Directory of Open Access Journals (Sweden)

    Peijnenburg Ad ACM

    2002-12-01

    Background: Transgenic proteins expressed by genetically modified food crops are evaluated for their potential allergenic properties prior to marketing, among others by identification of short identical amino acid sequences that occur both in the transgenic protein and allergenic proteins. A strategy is proposed, in which the positive outcomes of the sequence comparison with a minimal length of six amino acids are further screened for the presence of potential linear IgE-epitopes. This double track approach involves the use of literature data on IgE-epitopes and an antigenicity prediction algorithm. Results: Thirty-three transgenic proteins have been screened for identities of at least six contiguous amino acids shared with allergenic proteins. Twenty-two transgenic proteins showed positive results of six- or seven-contiguous amino acids length. Only a limited number of identical stretches shared by transgenic proteins (papaya ringspot virus coat protein, acetolactate synthase GH50, and glyphosate oxidoreductase) and allergenic proteins could be identified as (part of) potential linear epitopes. Conclusion: Many transgenic proteins have identical stretches of six or seven amino acids in common with allergenic proteins. Most identical stretches are likely to be false positives. As shown in this study, identical stretches can be further screened for relevance by comparison with linear IgE-binding epitopes described in literature. In the absence of literature data on epitopes, antigenicity prediction by computer aids to select potential antibody binding sites that will need verification of IgE binding by sera binding tests. Finally, the positive outcomes of this approach warrant further clinical testing for potential allergenicity.
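The first screening step in this record, finding stretches of at least six contiguous identical amino acids shared by two proteins, can be sketched as a k-mer lookup. The function name and the two example sequences are hypothetical and chosen only for illustration; the epitope and antigenicity follow-up steps are not covered.

```python
def shared_stretches(query: str, allergen: str, k: int = 6):
    """Return (stretch, query_pos, allergen_pos) for every identical k-mer.

    Positions are 0-based. k = 6 follows the minimal length named in the abstract.
    """
    # Index every k-mer of the allergen by its start positions.
    index = {}
    for j in range(len(allergen) - k + 1):
        index.setdefault(allergen[j:j + k], []).append(j)

    # Slide over the query and look each k-mer up in the index.
    hits = []
    for i in range(len(query) - k + 1):
        kmer = query[i:i + k]
        for j in index.get(kmer, []):
            hits.append((kmer, i, j))
    return hits


if __name__ == "__main__":
    # Hypothetical sequences, not real transgenic or allergenic proteins.
    transgenic = "MAAKLVDEFGHIK"
    allergen = "TTKLVDEFGPP"
    for stretch, qpos, apos in shared_stretches(transgenic, allergen):
        print(stretch, qpos, apos)
```

Overlapping hits such as the two reported here correspond to a single shared stretch of seven residues, which is why the abstract speaks of six- or seven-residue identities.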

  13. Synthetic biology for the directed evolution of protein biocatalysts: navigating sequence space intelligently

    Science.gov (United States)

    Currin, Andrew; Swainston, Neil; Day, Philip J.

    2015-01-01

    The amino acid sequence of a protein affects both its structure and its function. Thus, the ability to modify the sequence, and hence the structure and activity, of individual proteins in a systematic way, opens up many opportunities, both scientifically and (as we focus on here) for exploitation in biocatalysis. Modern methods of synthetic biology, whereby increasingly large sequences of DNA can be synthesised de novo, allow an unprecedented ability to engineer proteins with novel functions. However, the number of possible proteins is far too large to test individually, so we need means for navigating the ‘search space’ of possible protein sequences efficiently and reliably in order to find desirable activities and other properties. Enzymologists distinguish binding (Kd) and catalytic (kcat) steps. In a similar way, judicious strategies have blended design (for binding, specificity and active site modelling) with the more empirical methods of classical directed evolution (DE) for improving kcat (where natural evolution rarely seeks the highest values), especially with regard to residues distant from the active site and where the functional linkages underpinning enzyme dynamics are both unknown and hard to predict. Epistasis (where the ‘best’ amino acid at one site depends on that or those at others) is a notable feature of directed evolution. The aim of this review is to highlight some of the approaches that are being developed to allow us to use directed evolution to improve enzyme properties, often dramatically. We note that directed evolution differs in a number of ways from natural evolution, including in particular the available mechanisms and the likely selection pressures. Thus, we stress the opportunities afforded by techniques that enable one to map sequence to (structure and) activity in silico, as an effective means of modelling and exploring protein landscapes. Because known landscapes may be assessed and reasoned about as a whole
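The claim that the number of possible proteins is far too large to test individually follows from simple counting: with 20 amino acids there are 20^N sequences of length N. A minimal sketch of that arithmetic follows; the function names are illustrative and the example length of 100 residues is an assumption, not a figure from the review.

```python
from math import comb


def sequence_space_size(length: int, alphabet: int = 20) -> int:
    """Number of distinct sequences of a given length: alphabet ** length."""
    return alphabet ** length


def n_point_mutants(length: int, n: int, alphabet: int = 20) -> int:
    """Number of variants differing from one parent at exactly n positions."""
    return comb(length, n) * (alphabet - 1) ** n


if __name__ == "__main__":
    length = 100  # assumed example length
    print(f"all sequences:  {sequence_space_size(length):.3e}")
    print(f"single mutants: {n_point_mutants(length, 1)}")
    print(f"double mutants: {n_point_mutants(length, 2)}")
```

Even the double-mutant neighbourhood of a single parent already runs into the millions, which is why strategies that exploit or model epistasis matter for directed evolution.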

  14. Improvement of food packaging related properties in whey protein isolate‑based nanocomposite films and coatings by addition of montmorillonite nanoplatelets

    Science.gov (United States)

    Schmid, Markus; Merzbacher, Sarah; Brzoska, Nicola; Müller, Kerstin; Jesdinszki, Marius

    2017-11-01

    In the present study the effects of the addition of montmorillonite (MMT) nanoplatelets on whey protein isolate (WPI)-based nanocomposite films and coatings were investigated. The main objective was the development of WPI-based MMT-nanocomposites with enhanced barrier and mechanical properties. WPI-based nanocomposite cast-films and coatings were prepared by dispersing 0 % (reference sample), 3 %, 6 %, 9 % (w/w protein) MMT, or, depending on the protein concentration, also 12 % and 15 % (w/w protein) MMT into native WPI-based dispersions, followed by subsequent denaturation during the drying and curing process. The natural MMT nanofillers could be randomly dispersed into film-forming WPI-based nanodispersions, displaying good compatibility with the hydrophilic biopolymer matrix. As a result, by addition of 15 % (w/w protein) MMT into 10 % (w/w dispersion) WPI-based cast-films or coatings, the oxygen permeability (OP) was reduced by 91 % for glycerol-plasticized and 84 % for sorbitol-plasticized coatings, and the water vapor transmission rate (WVTR) was reduced by 58 % for sorbitol-plasticized cast-films. Due to the addition of MMT nanofillers the Young’s modulus and tensile strength improved by 315 % and 129 %, respectively, whereas elongation at break declined by 77 % for glycerol-plasticized cast-films. In addition, comparison of plasticizer type revealed that sorbitol-plasticized cast-films were generally stiffer and stronger, but less flexible compared with glycerol-plasticized cast-films. Viscosity measurements demonstrated good processability and suitability for up-scaled industrial processes of native WPI-based nanocomposite dispersions, even at high nanofiller-loadings. These results suggest that the addition of natural MMT nanofillers into native WPI-based matrices to form nanocomposite films and coatings holds great potential to replace well-established, fossil-based packaging materials for at least certain applications such as oxygen barriers as part of

  15. DNA-binding proteins from marine bacteria expand the known sequence diversity of TALE-like repeats.

    Science.gov (United States)

    de Lange, Orlando; Wolf, Christina; Thiel, Philipp; Krüger, Jens; Kleusch, Christian; Kohlbacher, Oliver; Lahaye, Thomas

    2015-11-16

    Transcription Activator-Like Effectors (TALEs) of Xanthomonas bacteria are programmable DNA binding proteins with unprecedented target specificity. Comparative studies into TALE repeat structure and function are hindered by the limited sequence variation among TALE repeats. More sequence-diverse TALE-like proteins are known from Ralstonia solanacearum (RipTALs) and Burkholderia rhizoxinica (Bats), but RipTAL and Bat repeats are conserved with those of TALEs around the DNA-binding residue. We study two novel marine-organism TALE-like proteins (MOrTL1 and MOrTL2), the first to date of non-terrestrial origin. We have assessed their DNA-binding properties and modelled repeat structures. We found that repeats from these proteins mediate sequence specific DNA binding conforming to the TALE code, despite low sequence similarity to TALE repeats, and with novel residues around the BSR. However, MOrTL1 repeats show greater sequence discriminating power than MOrTL2 repeats. Sequence alignments show that there are only three residues conserved between repeats of all TALE-like proteins including the two new additions. This conserved motif could prove useful as an identifier for future TALE-likes. Additionally, comparing MOrTL repeats with those of other TALE-likes suggests a common evolutionary origin for the TALEs, RipTALs and Bats. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

  16. CISAPS: Complex Informational Spectrum for the Analysis of Protein Sequences

    Directory of Open Access Journals (Sweden)

    Charalambos Chrysostomou

    2015-01-01

    Complex informational spectrum analysis for protein sequences (CISAPS) and its web-based server are developed and presented. Recent studies show that using only the absolute spectrum in informational spectrum analysis of protein sequences is insufficient. Therefore, CISAPS is developed to consider and provide results in three forms: the absolute, real, and imaginary spectra. Biologically relevant features, as shown in the case study of influenza A subtypes presented here, can also appear individually in either the real or the imaginary spectrum. The results show that protein classes can present similarities or differences according to the features extracted from the CISAPS web server. These associations are likely related to the protein feature that the specific amino acid index represents. In addition, various technical issues that may affect the analysis, such as zero-padding and windowing, are also addressed. CISAPS performs the analysis using an expanded list of 611 unique amino acid indices, each representing a different property. This web-based server enables researchers with little knowledge of signal processing methods to apply complex informational spectrum analysis in their work.
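The three spectra named in this record can be illustrated with a plain discrete Fourier transform of a numerically encoded sequence. This is a sketch under stated assumptions, not the CISAPS implementation: the Kyte-Doolittle hydropathy scale stands in for one of the 611 amino acid indices, the mean is removed before transforming, the absolute spectrum is taken as the magnitude of each DFT coefficient, and no window is applied.

```python
import cmath
import math

# Kyte-Doolittle hydropathy values, used here only as an example amino acid index.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def informational_spectrum(seq, index=KYTE_DOOLITTLE, pad_to=None):
    """Return (absolute, real, imaginary) spectra for frequencies 0..n//2.

    pad_to: optional total length after zero-padding (must be >= len(seq)).
    """
    signal = [index[aa] for aa in seq]
    mean = sum(signal) / len(signal)
    signal = [x - mean for x in signal]  # assumption: remove the DC offset

    n = len(signal) if pad_to is None else pad_to
    if n < len(signal):
        raise ValueError("pad_to must not be shorter than the sequence")
    signal = signal + [0.0] * (n - len(signal))  # zero-padding

    # Direct O(n^2) DFT; adequate for a short illustrative sequence.
    coeffs = [
        sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(signal))
        for k in range(n // 2 + 1)
    ]
    return (
        [abs(c) for c in coeffs],
        [c.real for c in coeffs],
        [c.imag for c in coeffs],
    )


if __name__ == "__main__":
    absolute, real, imag = informational_spectrum("MKTAYIAKQR", pad_to=16)
    for k, (a, r, i) in enumerate(zip(absolute, real, imag)):
        print(f"k={k:2d}  |X|={a:7.3f}  Re={r:7.3f}  Im={i:7.3f}")
```

The magnitude discards the sign and phase that the real and imaginary parts retain, which is one way to read the abstract's point that the absolute spectrum alone can be insufficient.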

  17. Antifouling coatings: recent developments in the design of surfaces that prevent fouling by proteins, bacteria, and marine organisms

    Energy Technology Data Exchange (ETDEWEB)

    Banerjee, Indrani; Pangule, Ravindra C.; Kane, Ravi S. [Howard P. Isermann Department of Chemical and Biological Engineering, Rensselaer Polytechnic Institute, 110 8th Street, Ricketts Building, Troy, NY 12180 (United States)

    2011-02-08

    The major strategies for designing surfaces that prevent fouling due to proteins, bacteria, and marine organisms are reviewed. Biofouling is of great concern in numerous applications ranging from biosensors to biomedical implants and devices, and from food packaging to industrial and marine equipment. The two major approaches to combat surface fouling are based on either preventing biofoulants from attaching or degrading them. One of the key strategies for imparting adhesion resistance involves the functionalization of surfaces with poly(ethylene glycol) (PEG) or oligo(ethylene glycol). Several alternatives to PEG-based coatings have also been designed over the past decade. While protein-resistant coatings may also resist bacterial attachment and subsequent biofilm formation, in order to overcome the fouling-mediated risk of bacterial infection it is highly desirable to design coatings that are bactericidal. Traditional techniques involve the design of coatings that release biocidal agents, including antibiotics, quaternary ammonium salts (QAS), and silver, into the surrounding aqueous environment. However, the emergence of antibiotic- and silver-resistant pathogenic strains has necessitated the development of alternative strategies. Therefore, other techniques based on the use of polycations, enzymes, nanomaterials, and photoactive agents are being investigated. With regard to marine antifouling coatings, restrictions on the use of biocide-releasing coatings have made the generation of nontoxic antifouling surfaces more important. While considerable progress has been made in the design of antifouling coatings, ongoing research in this area should result in the development of even better antifouling materials in the future. (Copyright © 2011 WILEY-VCH Verlag GmbH and Co. KGaA, Weinheim)

  18. Identification of a novel Plasmopara halstedii elicitor protein combining de novo peptide sequencing algorithms and RACE-PCR

    Directory of Open Access Journals (Sweden)

    Madlung Johannes

    2010-05-01

    Background: Often high-quality MS/MS spectra of tryptic peptides do not match any database entry because genomes are only partially sequenced; protein identification therefore requires de novo peptide sequencing. To achieve protein identification of the economically important but still unsequenced plant pathogenic oomycete Plasmopara halstedii, we first evaluated the performance of three different de novo peptide sequencing algorithms applied to protein digests of standard proteins using a quadrupole TOF (QStar Pulsar i). Results: The performance order of the algorithms was PEAKS online > PepNovo > CompNovo. In summary, PEAKS online correctly predicted 45% of measured peptides for a protein test data set. All three de novo peptide sequencing algorithms were used to identify MS/MS spectra of tryptic peptides of an unknown 57 kDa protein of P. halstedii. We found ten de novo sequenced peptides that showed homology to a Phytophthora infestans protein, a closely related organism of P. halstedii. Employing a second complementary approach, verification of peptide prediction and protein identification was performed by creation of degenerate primers for RACE-PCR and led to an ORF of 1,589 bp for a hypothetical phosphoenolpyruvate carboxykinase. Conclusions: Our study demonstrated that identification of proteins within minute amounts of sample material improved significantly by combining sensitive LC-MS methods with different de novo peptide sequencing algorithms. In addition, this is the first study that verified protein prediction from MS data by also employing a second complementary approach, in which RACE-PCR led to identification of a novel elicitor protein in P. halstedii.

  19. Prediction of flexible/rigid regions from protein sequences using k-spaced amino acid pairs

    Directory of Open Access Journals (Sweden)

    Ruan Jishou

    2007-04-01

    Background: Traditionally, it is believed that the native structure of a protein corresponds to a global minimum of its free energy. However, with the growing number of known tertiary (3D) protein structures, researchers have discovered that some proteins can alter their structures in response to a change in their surroundings or with the help of other proteins or ligands. Such structural shifts play a crucial role with respect to the protein function. To this end, we propose a machine learning method for the prediction of the flexible/rigid regions of proteins (referred to as FlexRP); the method is based on a novel sequence representation and feature selection. Knowledge of the flexible/rigid regions may provide insights into the protein folding process and the 3D structure prediction. Results: The flexible/rigid regions were defined based on a dataset, which includes protein sequences that have multiple experimental structures, and which was previously used to study the structural conservation of proteins. Sequences drawn from this dataset were represented based on feature sets that were proposed in prior research, such as PSI-BLAST profiles, composition vector and binary sequence encoding, and a newly proposed representation based on frequencies of k-spaced amino acid pairs. These representations were processed by feature selection to reduce the dimensionality. Several machine learning methods for the prediction of flexible/rigid regions and two recently proposed methods for the prediction of conformational changes and unstructured regions were compared with the proposed method. The FlexRP method, which applies Logistic Regression and collocation-based representation with 95 features, obtained 79.5% accuracy. The two runner-up methods, which apply the same sequence representation and Support Vector Machines (SVM) and Naïve Bayes classifiers, obtained 79.2% and 78.4% accuracy, respectively. The remaining considered methods are
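A representation based on frequencies of k-spaced amino acid pairs can be sketched as follows. The definition used here is an assumption for illustration: a k-spaced pair is two residues separated by exactly k other residues, and each frequency is the pair count divided by the number of such pairs in the sequence. The function names, the choice of `max_k`, and the absence of feature selection are likewise illustrative, not details of FlexRP.

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def k_spaced_pair_frequencies(seq: str, k: int) -> dict:
    """Frequency of every ordered residue pair (a, b) with k residues between them."""
    n_pairs = len(seq) - k - 1
    if n_pairs <= 0:
        raise ValueError("sequence too short for this spacing")
    counts = Counter(seq[i] + seq[i + k + 1] for i in range(n_pairs))
    # Report all 400 possible pairs so every sequence yields the same feature layout.
    return {a + b: counts[a + b] / n_pairs for a, b in product(AMINO_ACIDS, repeat=2)}


def feature_vector(seq: str, max_k: int = 2) -> list:
    """Concatenate the 400 pair frequencies for k = 0 .. max_k."""
    vector = []
    for k in range(max_k + 1):
        freqs = k_spaced_pair_frequencies(seq, k)
        vector.extend(freqs[pair] for pair in sorted(freqs))
    return vector


if __name__ == "__main__":
    freqs = k_spaced_pair_frequencies("ACAC", 0)
    print({pair: f for pair, f in freqs.items() if f > 0})
    print(len(feature_vector("ACDEFGHIKLMNPQRSTVWY", max_k=2)))
```

A raw vector of this kind has 400 entries per spacing, so some form of feature selection, as the abstract describes, is needed to arrive at a compact set such as the 95 features reported.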

  20. Rapid detection and purification of sequence specific DNA binding proteins using magnetic separation

    Directory of Open Access Journals (Sweden)

    TIJANA SAVIC

    2006-02-01

    In this paper, a method for the rapid identification and purification of sequence-specific DNA binding proteins based on magnetic separation is presented. This method was applied to confirm the binding of the human recombinant USF1 protein to its putative binding site (E-box) within the human SOX3 promoter. It has been shown that biotinylated DNA attached to streptavidin magnetic particles specifically binds the USF1 protein in the presence of competitor DNA. It has also been demonstrated that the protein could be successfully eluted from the beads, in high yield and with restored DNA binding activity. The advantage of these procedures is that they could be applied for the identification and purification of any high-affinity sequence-specific DNA binding protein with only minor modifications.

  1. Fast computational methods for predicting protein structure from primary amino acid sequence

    Science.gov (United States)

    Agarwal, Pratul Kumar [Knoxville, TN]

    2011-07-19

    The present invention provides a method utilizing primary amino acid sequence of a protein, energy minimization, molecular dynamics and protein vibrational modes to predict three-dimensional structure of a protein. The present invention also determines possible intermediates in the protein folding pathway. The present invention has important applications to the design of novel drugs as well as protein engineering. The present invention predicts the three-dimensional structure of a protein independent of size of the protein, overcoming a significant limitation in the prior art.

  2. Analysis of correlations between sites in models of protein sequences

    International Nuclear Information System (INIS)

    Giraud, B.G.; Lapedes, A.; Liu, L.C.

    1998-01-01

    A criterion based on conditional probabilities, related to the concept of algorithmic distance, is used to detect correlated mutations at noncontiguous sites on sequences. We apply this criterion to the problem of analyzing correlations between sites in protein sequences; however, the analysis applies generally to networks of interacting sites with discrete states at each site. Elementary models, where explicit results can be derived easily, are introduced. The number of states per site considered ranges from 2, illustrating the relation to familiar classical spin systems, to 20 states, suitable for representing amino acids. Numerical simulations show that the criterion remains valid even when the genetic history of the data samples (e.g., protein sequences), as represented by a phylogenetic tree, introduces nonindependence between samples. Statistical fluctuations due to finite sampling are also investigated and do not invalidate the criterion. A subsidiary result is found: The more homogeneous a population, the more easily its average properties can drift from the properties of its ancestor. © 1998 The American Physical Society

  3. Sequence walkers: a graphical method to display how binding proteins interact with DNA or RNA sequences

    Science.gov (United States)

    A graphical method is presented for displaying how binding proteins and other macromolecules interact with individual bases of nucleotide sequences. Characters representing the sequence are either oriented normally and placed above a line indicating favorable contact, or upside-down and placed below the line indicating unfavorable contact. The positive or negative height of each letter shows the contribution of that base to the average sequence conservation of the binding site, as represented by a sequence logo.

  4. DeepGO: predicting protein functions from sequence and interactions using a deep ontology-aware classifier.

    Science.gov (United States)

    Kulmanov, Maxat; Khan, Mohammed Asif; Hoehndorf, Robert; Wren, Jonathan

    2018-02-15

    A large number of protein sequences are becoming available through the application of novel high-throughput sequencing technologies. Experimental functional characterization of these proteins is time-consuming and expensive, and is often only done rigorously for a few selected model organisms. Computational function prediction approaches have been suggested to fill this gap. The functions of proteins are classified using the Gene Ontology (GO), which contains over 40 000 classes. Additionally, proteins have multiple functions, making function prediction a large-scale, multi-class, multi-label problem. We have developed a novel method to predict protein function from sequence. We use deep learning to learn features from protein sequences as well as a cross-species protein-protein interaction network. Our approach specifically outputs information in the structure of the GO and utilizes the dependencies between GO classes as background information to construct a deep learning model. We evaluate our method using the standards established by the Computational Assessment of Function Annotation (CAFA) and demonstrate a significant improvement over baseline methods such as BLAST, in particular for predicting cellular locations. Web server: http://deepgo.bio2vec.net. Source code: https://github.com/bio-ontology-research-group/deepgo. Contact: robert.hoehndorf@kaust.edu.sa. Supplementary data are available at Bioinformatics online. © The Author(s) 2017. Published by Oxford University Press.

  5. Determining and comparing protein function in Bacterial genome sequences

    DEFF Research Database (Denmark)

    Vesth, Tammi Camilla

    of this class have very little homology to other known genomes, making functional annotation based on sequence similarity very difficult. Inspired in part by this analysis, an approach for comparative functional annotation was created based on public sequenced genomes, CMGfunc. Functionally related groups......In November 2013, there were around 21,000 different prokaryotic genomes sequenced and publicly available, and the number is growing daily with another 20,000 or more genomes expected to be sequenced and deposited by the end of 2014. An important part of the analysis of this data is the functional...... annotation of genes – the descriptions assigned to genes that describe the likely function of the encoded proteins. This process is limited by several factors, including the definition of a function which can be more or less specific as well as how many genes can actually be assigned a function based...

  6. FASTERp: A Feature Array Search Tool for Estimating Resemblance of Protein Sequences

    Energy Technology Data Exchange (ETDEWEB)

    Macklin, Derek; Egan, Rob; Wang, Zhong

    2014-03-14

    Metagenome sequencing efforts have provided a large pool of billions of genes for identifying enzymes with desirable biochemical traits. However, homology search with billions of genes in a rapidly growing database has become increasingly computationally impractical. Here we present our pilot efforts to develop a novel alignment-free algorithm for homology search. Specifically, we represent individual proteins as feature vectors that denote the presence or absence of short kmers in the protein sequence. Similarity between feature vectors is then computed using the Tanimoto score, a distance metric that can be rapidly computed on bit string representations of feature vectors. Preliminary results indicate good correlation with optimal alignment algorithms (Spearman r of 0.87, ~1,000,000 proteins from Pfam), as well as with heuristic algorithms such as BLAST (Spearman r of 0.86, ~1,000,000 proteins). Furthermore, a prototype of FASTERp implemented in Python runs approximately four times faster than BLAST on a small scale dataset (~1000 proteins). We are optimizing and scaling to improve FASTERp to enable rapid homology searches against billion-protein databases, thereby enabling more comprehensive gene annotation efforts.
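    The presence/absence kmer vectors and the Tanimoto score described above can be sketched as follows. This is a minimal illustration, not the FASTERp implementation: the kmer length, the bit-vector width, and the use of CRC32 to map kmers to bit positions are all assumptions made for the example.

```python
import zlib


def kmer_bits(sequence, k=3, n_bits=1 << 16):
    """Encode presence/absence of k-mers as a bit string held in an int.

    Hypothetical encoding: each k-mer is hashed to one of n_bits positions,
    so distinct k-mers can occasionally collide on the same bit.
    """
    bits = 0
    for i in range(len(sequence) - k + 1):
        kmer = sequence[i:i + k]
        bits |= 1 << (zlib.crc32(kmer.encode("ascii")) % n_bits)
    return bits


def tanimoto(a, b):
    """Tanimoto score |A and B| / |A or B| for two bit strings."""
    union = bin(a | b).count("1")
    if union == 0:
        return 0.0  # convention chosen here for two empty vectors
    return bin(a & b).count("1") / union


if __name__ == "__main__":
    x = kmer_bits("MKVLAAGIVGLLLAQ")
    y = kmer_bits("MKVLAAGIVALLLAQ")
    print(round(tanimoto(x, y), 3))
```

    Because the score reduces to bitwise AND, OR and population counts, it needs no alignment step, which is the property the abstract relies on for speed.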

  7. muBLASTP: database-indexed protein sequence search on multicore CPUs.

    Science.gov (United States)

    Zhang, Jing; Misra, Sanchit; Wang, Hao; Feng, Wu-Chun

    2016-11-04

    The Basic Local Alignment Search Tool (BLAST) is a fundamental program in the life sciences that searches databases for sequences that are most similar to a query sequence. Currently, the BLAST algorithm utilizes a query-indexed approach. Although many approaches suggest that sequence search with a database index can achieve much higher throughput (e.g., BLAT, SSAHA, and CAFE), they cannot deliver the same level of sensitivity as the query-indexed BLAST, i.e., NCBI BLAST, or they can only support nucleotide sequence search, e.g., MegaBLAST. Because query indexing and database indexing pose different challenges and have different characteristics, the existing techniques for query-indexed search cannot be used in database-indexed search. muBLASTP, a novel database-indexed BLAST for protein sequence search, delivers hits identical to those returned by NCBI BLAST. On Intel Haswell multicore CPUs, for a single query, the single-threaded muBLASTP achieves up to a 4.41-fold speedup for alignment stages, and up to a 1.75-fold end-to-end speedup over single-threaded NCBI BLAST. For a batch of queries, the multithreaded muBLASTP achieves up to a 5.7-fold speedup for alignment stages, and up to a 4.56-fold end-to-end speedup over multithreaded NCBI BLAST. With a newly designed index structure for the protein database and associated optimizations in the BLASTP algorithm, we refactored the BLASTP algorithm for modern multicore processors, achieving much higher throughput with an acceptable memory footprint for the database index.
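    The idea of indexing the database rather than the query can be sketched in a few lines. This is a deliberately simplified illustration and not muBLASTP's index structure: it uses exact word matches only, whereas BLASTP also seeds on similar "neighbouring" words, and the function names and word length are assumptions.

```python
from collections import defaultdict


def build_database_index(database, k=3):
    """Map every k-letter word to the (sequence id, offset) pairs where it occurs.

    The index is built once and can then be reused for any number of queries.
    """
    index = defaultdict(list)
    for seq_id, seq in database.items():
        for pos in range(len(seq) - k + 1):
            index[seq[pos:pos + k]].append((seq_id, pos))
    return index


def find_seed_hits(query, index, k=3):
    """Return (query offset, sequence id, database offset) for exact word hits."""
    hits = []
    for qpos in range(len(query) - k + 1):
        for seq_id, dpos in index.get(query[qpos:qpos + k], ()):
            hits.append((qpos, seq_id, dpos))
    return hits


if __name__ == "__main__":
    db = {"s1": "MKVLA", "s2": "AKVLG"}
    idx = build_database_index(db)
    print(find_seed_hits("KVLA", idx))
```

    In a query-indexed search the lookup table is rebuilt for each query and the whole database is scanned against it; here the table is built over the database once, at the cost of holding the index in memory, which matches the memory-footprint trade-off the abstract mentions.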

  8. Isolation and N-terminal sequencing of a novel cadmium-binding protein from Boletus edulis

    Science.gov (United States)

    Collin-Hansen, C.; Andersen, R. A.; Steinnes, E.

    2003-05-01

    A Cd-binding protein was isolated from the popular edible mushroom Boletus edulis, which is a hyperaccumulator of both Cd and Hg. Wild-growing samples of B. edulis were collected from soils rich in Cd. Cd radiotracer was added to the crude protein preparation obtained from ethanol precipitation of heat-treated cytosol. Proteins were then further separated in two consecutive steps: gel filtration and anion exchange chromatography. In both steps the Cd radiotracer profile showed only one distinct peak, which corresponded well with the profiles of endogenous Cd obtained by atomic absorption spectrophotometry (AAS). Concentrations of the essential elements Cu and Zn were low in the protein fractions high in Cd. N-terminal sequencing performed on the Cd-binding protein fractions revealed a protein with a novel amino acid sequence, which contained aromatic amino acids as well as proline. Both the N-terminal sequencing and spectrofluorimetric analysis with EDTA and ABD-F (4-aminosulfonyl-7-fluoro-2,1,3-benzoxadiazole) failed to detect cysteine in the Cd-binding fractions. From these findings we conclude that the novel protein does not belong to the metallothionein family. The results suggest a role for the protein in Cd transport and storage, and they are of importance in view of toxicology and food chemistry, but also for environmental protection.

  9. Sequence charge decoration dictates coil-globule transition in intrinsically disordered proteins

    Science.gov (United States)

    Firman, Taylor; Ghosh, Kingshuk

    2018-03-01

    We present an analytical theory to compute conformations of heteropolymers—applicable to describe disordered proteins—as a function of temperature and charge sequence. The theory describes coil-globule transition for a given protein sequence when temperature is varied and has been benchmarked against the all-atom Monte Carlo simulation (using CAMPARI) of intrinsically disordered proteins (IDPs). In addition, the model quantitatively shows how subtle alterations of charge placement in the primary sequence—while maintaining the same charge composition—can lead to significant changes in conformation, even as drastic as a coil (swelled above a purely random coil) to globule (collapsed below a random coil) and vice versa. The theory provides insights on how to control (enhance or suppress) these changes by tuning the temperature (or solution condition) and charge decoration. As an application, we predict the distribution of conformations (at room temperature) of all naturally occurring IDPs in the DisProt database and notice significant size variation even among IDPs with a similar composition of positive and negative charges. Based on this, we provide a new diagram-of-states delineating the sequence-conformation relation for proteins in the DisProt database. Next, we study the effect of post-translational modification, e.g., phosphorylation, on IDP conformations. Modifications as little as two-site phosphorylation can significantly alter the size of an IDP with everything else being constant (temperature, salt concentration, etc.). However, not all possible modification sites have the same effect on protein conformations; there are certain "hot spots" that can cause maximal change in conformation. The location of these "hot spots" in the parent sequence can readily be identified by using a sequence charge decoration metric originally introduced by Sawle and Ghosh. The ability of our model to predict conformations (both expanded and collapsed states) of IDPs at
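    The effect of charge placement at fixed composition can be illustrated with the sequence charge decoration (SCD) metric. The sketch below assumes the commonly cited form of the Sawle and Ghosh definition, SCD = (1/N) Σ_{m>n} q_m q_n |m − n|^{1/2}, and a simplified charge assignment (Lys/Arg = +1, Asp/Glu = −1, everything else neutral, termini and histidine ignored); both should be checked against the original paper before use.

```python
import math

# Simplified residue charges; an assumption made for this sketch.
CHARGE = {"K": 1, "R": 1, "D": -1, "E": -1}


def sequence_charge_decoration(sequence):
    """SCD = (1/N) * sum over residue pairs m > n of q_m * q_n * sqrt(m - n)."""
    charges = [CHARGE.get(res, 0) for res in sequence]
    n_res = len(charges)
    total = 0.0
    for m in range(1, n_res):
        if charges[m] == 0:
            continue  # neutral residues contribute nothing
        for n in range(m):
            total += charges[m] * charges[n] * math.sqrt(m - n)
    return total / n_res


if __name__ == "__main__":
    # Same composition (two E, two K), different charge patterning.
    print(sequence_charge_decoration("EKEK"))  # well-mixed charges
    print(sequence_charge_decoration("EEKK"))  # segregated charges
```

    Under these assumptions the segregated sequence gives a more negative SCD than the alternating one even though both contain the same charges, which is the kind of composition-independent patterning effect the abstract describes.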

  10. All-atom normal-mode analysis reveals an RNA-induced allostery in a bacteriophage coat protein.

    Science.gov (United States)

    Dykeman, Eric C; Twarock, Reidun

    2010-03-01

    Assembly of the T=3 bacteriophage MS2 is initiated by the binding of a 19 nucleotide RNA stem loop from within the phage genome to a symmetric coat protein dimer. This binding event effects a folding of the FG loop in one of the protein subunits of the dimer and results in the formation of an asymmetric dimer. Since both the symmetric and asymmetric forms of the dimer are needed for the assembly of the protein container, this allosteric switch plays an important role in the life cycle of the phage. We provide here details of an all-atom normal-mode analysis of this allosteric effect. The results suggest that asymmetric contacts between the A-duplex RNA phosphodiester backbone of the stem loop and the EF loop in one coat protein subunit result in an increased dynamic behavior of its FG loop. The four lowest-frequency modes, which encompass motions predominantly on the FG loops, account for over 90% of the increased dynamic behavior due to a localization of the vibrational pattern on a single FG loop. Finally, we show that an analysis of the allosteric effect using an elastic network model fails to predict this localization effect, highlighting the importance of using an all-atom full force field method for this problem.

  11. Exploring Sequence Characteristics Related to High-Level Production of Secreted Proteins in Aspergillus niger

    NARCIS (Netherlands)

    Van den Berg, B.A.; Reinders, M.J.T.; Hulsman, M.; Wu, L.; Pel, H.J.; Roubos, J.A.; De Ridder, D.

    2012-01-01

    Protein sequence features are explored in relation to the production of over-expressed extracellular proteins by fungi. Knowledge on features influencing protein production and secretion could be employed to improve enzyme production levels in industrial bioprocesses via protein engineering. A large

  12. Decreased Bacterial Attachment and Protein Adsorption to Coatings Produced by Low Energy Plasma Polymerization

    DEFF Research Database (Denmark)

    Andersen, T.E.; Kingshott, Peter; Benter, M.

    figure) and E. coli grown on uncoated silicone compared to PP-PVP coated silicone (right figure). Results from the flow chamber analysis show PP-PVP to be very good at preventing E. coli colonization during prolonged growth in the flow chamber. At this point other surfaces and bacteria remain to be tested...... adsorption and bacteria attachment/colonization. This is emphasized by the fact that long dwelling urinary catheters, which are a typical silicone medical device, cause a 5% per day incidence of urinary tract infection [1,2]. A demand therefore exists for surface modifications providing the silicone material......-coated crystals were then treated with one of the plasma polymerized coatings. Adsorption of fibrinogen, human serum albumin or immunoglobulin G was measured using a QCM-D instrument [5] (model E4, Q-Sense AB, Västra Frölunda, Sweden) using a solution of 50llg/1 protein in PBS buffer. Results and Discussion: Our...

  13. Revised Mimivirus major capsid protein sequence reveals intron-containing gene structure and extra domain

    Directory of Open Access Journals (Sweden)

    Suzan-Monti Marie

    2009-05-01

    Background: Acanthamoeba polyphaga Mimivirus (APM) is the largest known dsDNA virus. The viral particle has a nearly icosahedral structure with an internal capsid shell surrounded by a dense layer of fibrils. A Capsid protein sequence, D13L, was deduced from the APM L425 coding gene and was shown to be the most abundant protein found within the viral particle. However, this protein remained poorly characterised until now. A revised protein sequence deposited in a database suggested an additional N-terminal stretch of 142 amino acids missing from the original deduced sequence. This result led us to investigate the L425 gene structure and the biochemical properties of the complete APM major Capsid protein. Results: This study describes the full length 3430 bp Capsid coding gene and characterises the corresponding 593 amino acid long Capsid protein 1. The recombinant full length protein allowed the production of a specific monoclonal antibody able to detect the Capsid protein 1 within the viral particle. This protein appeared to be post-translationally modified by glycosylation and phosphorylation. We proposed a secondary structure prediction of APM Capsid protein 1 compared to the Capsid protein structure of Paramecium Bursaria Chlorella Virus 1, another member of the Nucleo-Cytoplasmic Large DNA virus family. Conclusion: The characterisation of the full length L425 Capsid coding gene of Acanthamoeba polyphaga Mimivirus provides new insights into the structure of the main Capsid protein. The production of a full length recombinant protein will be useful for further structural studies.

  14. Interactions of rat repetitive sequence MspI8 with nuclear matrix proteins during spermatogenesis

    International Nuclear Information System (INIS)

    Rogolinski, J.; Widlak, P.; Rzeszowska-Wolny, J.

    1996-01-01

    Using Southwestern blot analysis we have studied the interactions between the rat repetitive sequence MspI8 and the nuclear matrix proteins of rat testis cells. From 2 weeks of age to adulthood, the animals showed differences in the type of testis nuclear matrix proteins recognizing the MspI8 sequence. The same sets of nuclear matrix proteins were detected in fractions enriched in spermatocytes and spermatids, obtained after fractionation of cells of adult animals by the velocity sedimentation technique. (author). 21 refs, 5 figs

  15. Protein model discrimination using mutational sensitivity derived from deep sequencing.

    Science.gov (United States)

    Adkar, Bharat V; Tripathi, Arti; Sahoo, Anusmita; Bajaj, Kanika; Goswami, Devrishi; Chakrabarti, Purbani; Swarnkar, Mohit K; Gokhale, Rajesh S; Varadarajan, Raghavan

    2012-02-08

    A major bottleneck in protein structure prediction is the selection of correct models from a pool of decoys. Relative activities of ∼1,200 individual single-site mutants in a saturation library of the bacterial toxin CcdB were estimated by determining their relative populations using deep sequencing. This phenotypic information was used to define an empirical score for each residue (RankScore), which correlated with the residue depth, and identify active-site residues. Using these correlations, ∼98% of correct models of CcdB (RMSD ≤ 4Å) were identified from a large set of decoys. The model-discrimination methodology was further validated on eleven different monomeric proteins using simulated RankScore values. The methodology is also a rapid, accurate way to obtain relative activities of each mutant in a large pool and derive sequence-structure-function relationships without protein isolation or characterization. It can be applied to any system in which mutational effects can be monitored by a phenotypic readout. Copyright © 2012 Elsevier Ltd. All rights reserved.

  16. Mapping a nucleolar targeting sequence of an RNA binding nucleolar protein, Nop25

    International Nuclear Information System (INIS)

    Fujiwara, Takashi; Suzuki, Shunji; Kanno, Motoko; Sugiyama, Hironobu; Takahashi, Hisaaki; Tanaka, Junya

    2006-01-01

    Nop25 is a putative RNA binding nucleolar protein associated with rRNA transcription. The present study was undertaken to determine the mechanism of Nop25 localization in the nucleolus. Deletion experiments of Nop25 amino acid sequence showed Nop25 to contain a nuclear targeting sequence in the N-terminal and a nucleolar targeting sequence in the C-terminal. By expressing derivative peptides from the C-terminal as GFP-fusion proteins in the cells, a lysine and arginine residue-enriched peptide (KRKHPRRAQDSTKKPPSATRTSKTQRRRR) allowed a GFP-fusion protein to be transported and fully retained in the nucleolus. When the peptide was fused with cMyc epitope and expressed in the cells, a cMyc epitope was then detected in the nucleolus. Nop25 did not localize in the nucleolus by deletion of the peptide from Nop25. Furthermore, deletion of a subdomain (KRKHPRRAQ) in the peptide or amino acid substitution of lysine and arginine residues in the subdomain resulted in the loss of Nop25 nucleolar localization. These results suggest that the lysine and arginine residue-enriched peptide is the most prominent nucleolar targeting sequence of Nop25 and that the long stretch of basic residues might play an important role in the nucleolar localization of Nop25. Although Nop25 contained putative SUMOylation, phosphorylation and glycosylation sites, the amino acid substitution in these sites had no effect on the nucleolar localization, thus suggesting that these post-translational modifications did not contribute to the localization of Nop25 in the nucleolus. The treatment of the cells, which expressed a GFP-fusion protein with a nucleolar targeting sequence of Nop25, with RNase A resulted in a complete dislocation of the protein from the nucleolus. These data suggested that the nucleolar targeting sequence might therefore play an important role in the binding of Nop25 to RNA molecules and that the RNA binding of Nop25 might be essential for the nucleolar localization of Nop25
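    The description of the nucleolar targeting sequence as lysine- and arginine-enriched can be checked directly from the two peptides quoted in the abstract. The sketch below only counts residues; the helper names are hypothetical and the calculation says nothing about localization itself.

```python
BASIC_RESIDUES = set("KR")  # lysine and arginine

# Peptides as quoted in the abstract above.
NUCLEOLAR_TARGETING_PEPTIDE = "KRKHPRRAQDSTKKPPSATRTSKTQRRRR"
SUBDOMAIN = "KRKHPRRAQ"


def basic_residue_count(peptide):
    """Number of lysine or arginine residues in the peptide."""
    return sum(res in BASIC_RESIDUES for res in peptide)


def longest_basic_run(peptide):
    """Length of the longest uninterrupted stretch of K/R residues."""
    best = current = 0
    for res in peptide:
        current = current + 1 if res in BASIC_RESIDUES else 0
        best = max(best, current)
    return best


if __name__ == "__main__":
    for name, pep in [("full peptide", NUCLEOLAR_TARGETING_PEPTIDE),
                      ("subdomain", SUBDOMAIN)]:
        n_basic = basic_residue_count(pep)
        print(name, len(pep), n_basic, round(n_basic / len(pep), 2),
              longest_basic_run(pep))
```

    By this count, 13 of the 29 residues in the full peptide and 5 of the 9 residues in the subdomain are lysine or arginine.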

  17. Prediction of Carbohydrate-Binding Proteins from Sequences Using Support Vector Machines

    Directory of Open Access Journals (Sweden)

    Seizi Someya

    2010-01-01

    Carbohydrate-binding proteins are proteins that can interact with sugar chains but do not modify them. They are involved in many physiological functions, and we have developed a method for predicting them from their amino acid sequences. Our method is based on support vector machines (SVMs). We first clarified the definition of carbohydrate-binding proteins and then constructed positive and negative datasets with which the SVMs were trained. In a leave-one-out test on these datasets, our method achieved an area under the receiver operating characteristic (ROC) curve of 0.92. We also examined two amino acid grouping methods that enable effective learning of sequence patterns and evaluated the performance of these methods. When we applied our method in combination with the homology-based prediction method to the annotated human genome database, H-invDB, we found that the true positive rate of prediction was improved.
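    The area under the ROC curve reported above can be computed from held-out classifier scores without plotting the curve. The sketch below uses the rank-based (Mann-Whitney) formulation; it assumes that each score was produced for a sample left out of training, as in a leave-one-out test, and the function name and toy data are illustrative rather than taken from the paper.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive sample scores
    higher than a randomly chosen negative one (ties count as one half).
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    # Toy held-out scores: 1 = carbohydrate-binding, 0 = not.
    labels = [1, 1, 0, 0]
    scores = [0.9, 0.4, 0.5, 0.1]
    print(roc_auc(labels, scores))
```

    The pairwise loop is quadratic in the number of samples, which is acceptable for a small illustration; sorting-based implementations are preferable for large datasets.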

  18. Apoptosis inhibitor of macrophage (AIM) diminishes lipid droplet-coating proteins leading to lipolysis in adipocytes

    Energy Technology Data Exchange (ETDEWEB)

    Iwamura, Yoshihiro; Mori, Mayumi; Nakashima, Katsuhiko [Laboratory of Molecular Biomedicine for Pathogenesis, Center for Disease Biology and Integrative Medicine, Faculty of Medicine, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-0033 (Japan); Mikami, Toshiyuki; Murayama, Katsuhisa [Genomic Science Laboratories, Dainippon Sumitomo Pharma Co. Ltd., 3-1-98 Kasugadenaka, Konohana-ku, Osaka 554-0022 (Japan); Arai, Satoko [Laboratory of Molecular Biomedicine for Pathogenesis, Center for Disease Biology and Integrative Medicine, Faculty of Medicine, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-0033 (Japan); Miyazaki, Toru, E-mail: tm@m.u-tokyo.ac.jp [Laboratory of Molecular Biomedicine for Pathogenesis, Center for Disease Biology and Integrative Medicine, Faculty of Medicine, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-0033 (Japan)

    2012-06-08

    Highlights: ► AIM induces lipolysis in a distinct manner from that of hormone-dependent lipolysis. ► AIM ablates activity of peroxisome proliferator-activated receptor in adipocytes. ► AIM reduces mRNA levels of lipid-droplet coating proteins leading to lipolysis. Abstract: Under fasting conditions, triacylglycerol in adipose tissue undergoes lipolysis to supply fatty acids as energy substrates. Such lipolysis is regulated by hormones, which activate lipases via stimulation of specific signalling cascades. We previously showed that the macrophage-derived soluble protein AIM induces obesity-associated lipolysis, triggering chronic inflammation in fat tissue which causes insulin resistance. However, the mechanism of how AIM mediates lipolysis remains unknown. Here we show that AIM induces lipolysis in a manner distinct from that of hormone-dependent lipolysis, without activation or augmentation of lipases. In vivo and in vitro, AIM did not enhance phosphorylation of hormone-sensitive lipase (HSL) in adipocytes, a hallmark of hormone-dependent lipolysis activation. Similarly, adipose tissue from obese AIM-deficient and wild-type mice showed comparable HSL phosphorylation. Consistent with the suppressive effect of AIM on fatty acid synthase activity, the amount of saturated and unsaturated fatty acids was reduced in adipocytes treated with AIM. This response ablated transcriptional activity of peroxisome proliferator-activated receptor (PPARγ), leading to diminished gene expression of lipid-droplet coating proteins including fat-specific protein 27 (FSP27) and Perilipin, which are indispensable for triacylglycerol storage in adipocytes. Accordingly, the lipolytic effect of AIM was overcome by a PPARγ-agonist or forced expression of FSP27, while it was synergized by a PPARγ-antagonist. Overall, distinct modes of lipolysis appear to take place in different physiological

  19. Formation of a Multiple Protein Complex on the Adenovirus Packaging Sequence by the IVa2 Protein

    OpenAIRE

    Tyler, Ryan E.; Ewing, Sean G.; Imperiale, Michael J.

    2007-01-01

    During adenovirus virion assembly, the packaging sequence mediates the encapsidation of the viral genome. This sequence is composed of seven functional units, termed A repeats. Recent evidence suggests that the adenovirus IVa2 protein binds the packaging sequence and is involved in packaging of the genome. Study of the IVa2-packaging sequence interaction has been hindered by difficulty in purifying the protein produced in virus-infected cells or by recombinant techniques. We report the first ...

  20. Flagellin based biomimetic coatings: From cell-repellent surfaces to highly adhesive coatings.

    Science.gov (United States)

    Kovacs, Boglarka; Patko, Daniel; Szekacs, Inna; Orgovan, Norbert; Kurunczi, Sandor; Sulyok, Attila; Khanh, Nguyen Quoc; Toth, Balazs; Vonderviszt, Ferenc; Horvath, Robert

    2016-09-15

    Biomimetic coatings with cell-adhesion-regulating functionalities are intensively researched today. For example, cell-based biosensing for drug development, biomedical implants, and tissue engineering require that the surface adhesion of living cells is well controlled. Recently, we have shown that the bacterial flagellar protein, flagellin, adsorbs through its terminal segments to hydrophobic surfaces, forming an oriented monolayer and exposing its variable D3 domain to the solution. Here, we hypothesized that this nanostructured layer is highly cell-repellent since it mimics the surface of the flagellar filaments. Moreover, we proposed flagellin as a carrier molecule to display the cell-adhesive RGD (Arg-Gly-Asp) peptide sequence and induce cell adhesion on the coated surface. The D3 domain of flagellin was replaced with one or more RGD motifs linked by various oligopeptides modulating flexibility and accessibility of the inserted segment. The obtained flagellin variants were applied to create surface coatings inducing cell adhesion and spreading to different levels, while wild-type flagellin was shown to form a surface layer with strong anti-adhesive properties. As reference surfaces synthetic polymers were applied which have anti-adhesive (PLL-g-PEG poly(l-lysine)-graft-poly(ethylene glycol)) or adhesion inducing properties (RGD-functionalized PLL-g-PEG). Quantitative adhesion data was obtained by employing optical biochips and microscopy. Cell-adhesion-regulating coatings can be simply formed on hydrophobic surfaces by using the developed flagellin-based constructs. The developed novel RGD-displaying flagellin variants can be easily obtained by bacterial production and can serve as alternatives to create cell-adhesion-regulating biomimetic coatings. In the present work, we show for the first time that. Copyright © 2016 Acta Materialia Inc. Published by Elsevier Ltd. All rights reserved.

  1. Dynamics of Spore Coat Morphogenesis in Bacillus subtilis

    Science.gov (United States)

    McKenney, Peter T.; Eichenberger, Patrick

    2011-01-01

    SUMMARY Spores of Bacillus subtilis are encased in a protective coat made up of at least 70 proteins. The structure of the spore coat has been examined using a variety of genetic, imaging and biochemical techniques, however, the majority of these studies have focused on mature spores. In this study we use a library of 41 spore coat proteins fused to the Green Fluorescent Protein (GFP) to examine spore coat morphogenesis over the time-course of sporulation. We found considerable diversity in the localization dynamics of coat proteins and were able to establish 6 classes based on localization kinetics. Localization dynamics correlate well with the known transcriptional regulators of coat gene expression. Previously, we described the existence of multiple layers in the mature spore coat. Here, we find that the spore coat initially assembles a scaffold that is organized into multiple layers on one pole of the spore. The coat then encases the spore in multiple coordinated waves. Encasement is driven, at least partially, by transcription of coat genes and deletion of sporulation transcription factors arrests encasement. We also identify the trans-compartment SpoIIIAH-SpoIIQ channel as necessary for encasement. This is the first demonstration of a forespore contribution to spore coat morphogenesis. PMID:22171814

  2. Production of Polyclonal Antibodies to a Recombinant Coat Protein of Potato mop-top virus

    Czech Academy of Sciences Publication Activity Database

    Čeřovská, Noemi; Moravec, Tomáš; Rosecká, Pavla; Dědič, P.; Filigarová, Marie

    2003-01-01

    Vol. 151, No. 4 (2003), pp. 195-200 ISSN 0931-1785 R&D Projects: GA ČR GA522/01/1121 Institutional research plan: CEZ:AV0Z5038910 Keywords: potato mop-top virus * recombinant coat protein * Escherichia coli Subject RIV: EB - Genetics; Molecular Biology Impact factor: 0.557, year: 2003

  3. Discovering approximate-associated sequence patterns for protein-DNA interactions

    KAUST Repository

    Chan, Tak Ming

    2010-12-30

    Motivation: The bindings between transcription factors (TFs) and transcription factor binding sites (TFBSs) are fundamental protein-DNA interactions in transcriptional regulation. Extensive efforts have been made to better understand the protein-DNA interactions. Recent mining on exact TF-TFBS-associated sequence patterns (rules) has shown great potentials and achieved very promising results. However, exact rules cannot handle variations in real data, resulting in limited informative rules. In this article, we generalize the exact rules to approximate ones for both TFs and TFBSs, which are essential for biological variations. Results: A progressive approach is proposed to address the approximation to alleviate the computational requirements. Firstly, similar TFBSs are grouped from the available TF-TFBS data (TRANSFAC database). Secondly, approximate and highly conserved binding cores are discovered from TF sequences corresponding to each TFBS group. A customized algorithm is developed for the specific objective. We discover the approximate TF-TFBS rules by associating the grouped TFBS consensuses and TF cores. The rules discovered are evaluated by matching (verifying with) the actual protein-DNA binding pairs from Protein Data Bank (PDB) 3D structures. The approximate results exhibit many more verified rules and up to 300% better verification ratios than the exact ones. The customized algorithm achieves over 73% better verification ratios than traditional methods. Approximate rules (64-79%) are shown statistically significant. Detailed variation analysis and conservation verification on NCBI records demonstrate that the approximate rules reveal both the flexible and specific protein-DNA interactions accurately. The approximate TF-TFBS rules discovered show great generalized capability of exploring more informative binding rules. © The Author 2010. Published by Oxford University Press. All rights reserved.

  4. Large-scale analysis of intrinsic disorder flavors and associated functions in the protein sequence universe.

    Science.gov (United States)

    Necci, Marco; Piovesan, Damiano; Tosatto, Silvio C E

    2016-12-01

    Intrinsic disorder (ID) in proteins has been extensively described for the last decade; a large-scale classification of ID in proteins is mostly missing. Here, we provide an extensive analysis of ID in the protein universe on the UniProt database derived from sequence-based predictions in MobiDB. Almost half the sequences contain an ID region of at least five residues. About 9% of proteins have a long ID region of over 20 residues which are more abundant in Eukaryotic organisms and most frequently cover less than 20% of the sequence. A small subset of about 67,000 (out of over 80 million) proteins is fully disordered and mostly found in Viruses. Most proteins have only one ID, with short ID evenly distributed along the sequence and long ID overrepresented in the center. The charged residue composition of Das and Pappu was used to classify ID proteins by structural propensities and corresponding functional enrichment. Swollen Coils seem to be used mainly as structural components and in biosynthesis in both Prokaryotes and Eukaryotes. In Bacteria, they are confined in the nucleoid and in Viruses provide DNA binding function. Coils & Hairpins seem to be specialized in ribosome binding and methylation activities. Globules & Tadpoles bind antigens in Eukaryotes but are involved in killing other organisms and cytolysis in Bacteria. The Undefined class is used by Bacteria to bind toxic substances and mediate transport and movement between and within organisms in Viruses. Fully disordered proteins behave similarly, but are enriched for glycine residues and extracellular structures. © 2016 The Protein Society.

  5. Nucleotide sequence of a chickpea chlorotic stunt virus relative that infects pea and faba bean in China.

    Science.gov (United States)

    Zhou, Cui-Ji; Xiang, Hai-Ying; Zhuo, Tao; Li, Da-Wei; Yu, Jia-Lin; Han, Cheng-Gui

    2012-07-01

    We determined the genome sequence of a new polerovirus that infects field pea and faba bean in China. Its entire nucleotide sequence (6021 nt) was most closely related (83.3% identity) to that of an Ethiopian isolate of chickpea chlorotic stunt virus (CpCSV-Eth). With the exception of the coat protein (encoded by ORF3), amino acid sequence identities of all gene products of this virus to those of CpCSV-Eth and other poleroviruses were Polerovirus, and the name pea mild chlorosis virus is proposed.

  6. Representation of protein-sequence information by amino acid subalphabets

    DEFF Research Database (Denmark)

    Andersen, C.A.F.; Brunak, Søren

    2004-01-01

    -sequence information, using machine learning strategies, where the primary goal is the discovery of novel powerful representations for use in AI techniques. In the case of proteins and the 20 different amino acids they typically contain, it is also a secondary goal to discover how the current selection of amino acids...

  7. RStrucFam: a web server to associate structure and cognate RNA for RNA-binding proteins from sequence information.

    Science.gov (United States)

    Ghosh, Pritha; Mathew, Oommen K; Sowdhamini, Ramanathan

    2016-10-07

    RNA-binding proteins (RBPs) interact with their cognate RNA(s) to form large biomolecular assemblies. They are versatile in their functionality and are involved in a myriad of processes inside the cell. RBPs with similar structural features and common biological functions are grouped together into families and superfamilies. It will be useful to obtain an early understanding and association of RNA-binding property of sequences of gene products. Here, we report a web server, RStrucFam, to predict the structure, type of cognate RNA(s) and function(s) of proteins, where possible, from mere sequence information. The web server employs Hidden Markov Model scan (hmmscan) to enable association to a back-end database of structural and sequence families. The database (HMMRBP) comprises 437 HMMs of RBP families of known structure that have been generated using structure-based sequence alignments and 746 sequence-centric RBP family HMMs. The input protein sequence is associated with structural or sequence domain families, if structure or sequence signatures exist. In case of association of the protein with a family of known structures, output features such as a multiple structure-based sequence alignment (MSSA) of the query with all other members of that family are provided. Further, cognate RNA partner(s) for that protein, Gene Ontology (GO) annotations, if any, and a homology model of the protein can be obtained. The users can also browse through the database for details pertaining to each family, protein or RNA and their related information based on keyword search or RNA motif search. RStrucFam is a web server that exploits structurally conserved features of RBPs, derived from known family members and imprinted in mathematical profiles, to predict putative RBPs from sequence information. Proteins that fail to associate with such structure-centric families are further queried against the sequence-centric RBP family HMMs in the HMMRBP database. Further, all other essential

  8. Domain fusion analysis by applying relational algebra to protein sequence and domain databases.

    Science.gov (United States)

    Truong, Kevin; Ikura, Mitsuhiko

    2003-05-06

    Domain fusion analysis is a useful method to predict functionally linked proteins that may be involved in direct protein-protein interactions or in the same metabolic or signaling pathway. As separate domain databases like BLOCKS, PROSITE, Pfam, SMART, PRINTS-S, ProDom, TIGRFAMs, and amalgamated domain databases like InterPro continue to grow in size and quality, a computational method to perform domain fusion analysis that leverages these efforts will become increasingly powerful. This paper proposes a computational method employing relational algebra to find domain fusions in protein sequence databases. The feasibility of this method was illustrated on the SWISS-PROT+TrEMBL sequence database using domain predictions from the Pfam HMM (hidden Markov model) database. We identified 235 and 189 putative functionally linked protein partners in H. sapiens and S. cerevisiae, respectively. From scientific literature, we were able to confirm many of these functional linkages, while the remainder offer testable experimental hypotheses. Results can be viewed at http://calcium.uhnres.utoronto.ca/pi. As the analysis can be computed quickly on any relational database that supports standard SQL (structured query language), it can be dynamically updated along with the sequence and domain databases, thereby improving the quality of predictions over time.
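
    As a rough illustration of how a domain fusion search can be phrased as relational joins, the sketch below uses Python's built-in sqlite3 module. The table name `hits`, its columns and the function `find_domain_fusions` are hypothetical and are not taken from the paper; the query only captures the general idea that two separate proteins in one organism are candidate partners when their domains co-occur in a single composite protein.

```python
import sqlite3

def find_domain_fusions(rows, organism):
    """Return (partner_a, partner_b, composite) triples for one organism.

    rows: iterable of (protein, organism, domain) domain assignments.
    The schema and query are illustrative assumptions, not the paper's own.
    """
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE hits (protein TEXT, organism TEXT, domain TEXT)")
    con.executemany("INSERT INTO hits VALUES (?, ?, ?)", rows)
    query = """
        SELECT DISTINCT a.protein AS partner_a,
                        b.protein AS partner_b,
                        f1.protein AS composite
        FROM hits AS f1
        -- f1/f2: two different domains found in the same composite protein
        JOIN hits AS f2 ON f1.protein = f2.protein AND f1.domain < f2.domain
        -- a and b: proteins carrying one of those domains each
        JOIN hits AS a ON a.domain = f1.domain
        JOIN hits AS b ON b.domain = f2.domain
        WHERE a.organism = ? AND b.organism = ?
          AND a.protein <> b.protein
          AND a.protein <> f1.protein
          AND b.protein <> f1.protein
          -- neither partner may already contain the other partner's domain
          AND NOT EXISTS (SELECT 1 FROM hits AS x
                          WHERE x.protein = a.protein AND x.domain = f2.domain)
          AND NOT EXISTS (SELECT 1 FROM hits AS y
                          WHERE y.protein = b.protein AND y.domain = f1.domain)
        ORDER BY partner_a, partner_b, composite
    """
    result = con.execute(query, (organism, organism)).fetchall()
    con.close()
    return result

# Toy data: P1 and P2 are separate human proteins; F1 fuses both domains in yeast.
example_rows = [
    ("P1", "human", "kinase"),
    ("P2", "human", "sh2"),
    ("P3", "human", "pdz"),
    ("F1", "yeast", "kinase"),
    ("F1", "yeast", "sh2"),
]
print(find_domain_fusions(example_rows, "human"))
```

    Because the whole search is a single SQL statement, it can be re-run whenever the underlying domain assignments are refreshed, which is the property the abstract highlights.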

  9. Adaptive GDDA-BLAST: fast and efficient algorithm for protein sequence embedding.

    Directory of Open Access Journals (Sweden)

    Yoojin Hong

    2010-10-01

    A major computational challenge in the genomic era is annotating structure/function to the vast quantities of sequence information that is now available. This problem is illustrated by the fact that most proteins lack comprehensive annotations, even when experimental evidence exists. We previously theorized that embedded-alignment profiles (simply "alignment profiles" hereafter) provide a quantitative method that is capable of relating the structural and functional properties of proteins, as well as their evolutionary relationships. A key feature of alignment profiles lies in the interoperability of data format (e.g., alignment information, physio-chemical information, genomic information, etc.). Indeed, we have demonstrated that the Position Specific Scoring Matrices (PSSMs) are an informative M-dimension that is scored by quantitatively measuring the embedded or unmodified sequence alignments. Moreover, the information obtained from these alignments is informative, and remains so even in the "twilight zone" of sequence similarity (<25% identity). Although our previous embedding strategy was powerful, it suffered from contaminating alignments (embedded AND unmodified) and high computational costs. Herein, we describe the logic and algorithmic process for a heuristic embedding strategy named "Adaptive GDDA-BLAST." Adaptive GDDA-BLAST is, on average, up to 19 times faster than, but has similar sensitivity to our previous method. Further, data are provided to demonstrate the benefits of embedded-alignment measurements in terms of detecting structural homology in highly divergent protein sequences and isolating secondary structural elements of transmembrane and ankyrin-repeat domains. Together, these advances allow further exploration of the embedded alignment data space within sufficiently large data sets to eventually induce relevant statistical inferences. We show that sequence embedding could serve as one of the vehicles for measurement of low

  10. Generalizing and learning protein-DNA binding sequence representations by an evolutionary algorithm

    KAUST Repository

    Wong, Ka Chun

    2011-02-05

    Protein-DNA bindings are essential activities. Understanding them forms the basis for further deciphering of biological and genetic systems. In particular, the protein-DNA bindings between transcription factors (TFs) and transcription factor binding sites (TFBSs) play a central role in gene transcription. Comprehensive TF-TFBS binding sequence pairs have been found in a recent study. However, they are in one-to-one mappings which cannot fully reflect the many-to-many mappings within the bindings. An evolutionary algorithm is proposed to learn generalized representations (many-to-many mappings) from the TF-TFBS binding sequence pairs (one-to-one mappings). The generalized pairs are shown to be more meaningful than the original TF-TFBS binding sequence pairs. Some representative examples have been analyzed in this study. In particular, it shows that the TF-TFBS binding sequence pairs are not presumably in one-to-one mappings. They can also exhibit many-to-many mappings. The proposed method can help us extract such many-to-many information from the one-to-one TF-TFBS binding sequence pairs found in the previous study, providing further knowledge in understanding the bindings between TFs and TFBSs. © 2011 Springer-Verlag.

  11. Generalizing and learning protein-DNA binding sequence representations by an evolutionary algorithm

    KAUST Repository

    Wong, Ka Chun; Peng, Chengbin; Wong, Manhon; Leung, Kwongsak

    2011-01-01

    Protein-DNA bindings are essential activities. Understanding them forms the basis for further deciphering of biological and genetic systems. In particular, the protein-DNA bindings between transcription factors (TFs) and transcription factor binding sites (TFBSs) play a central role in gene transcription. Comprehensive TF-TFBS binding sequence pairs have been found in a recent study. However, they are in one-to-one mappings which cannot fully reflect the many-to-many mappings within the bindings. An evolutionary algorithm is proposed to learn generalized representations (many-to-many mappings) from the TF-TFBS binding sequence pairs (one-to-one mappings). The generalized pairs are shown to be more meaningful than the original TF-TFBS binding sequence pairs. Some representative examples have been analyzed in this study. In particular, it shows that the TF-TFBS binding sequence pairs are not presumably in one-to-one mappings. They can also exhibit many-to-many mappings. The proposed method can help us extract such many-to-many information from the one-to-one TF-TFBS binding sequence pairs found in the previous study, providing further knowledge in understanding the bindings between TFs and TFBSs. © 2011 Springer-Verlag.

  12. SPiCE : A web-based tool for sequence-based protein classification and exploration

    NARCIS (Netherlands)

    Van den Berg, B.A.; Reinders, M.J.; Roubos, J.A.; De Ridder, D.

    2014-01-01

    Background Amino acid sequences and features extracted from such sequences have been used to predict many protein properties, such as subcellular localization or solubility, using classifier algorithms. Although software tools are available for both feature extraction and classifier construction,

  13. The influence of the N- and C- terminal modifications of Potato virus X coat protein on virus properties

    Czech Academy of Sciences Publication Activity Database

    Hoffmeisterová, Hana; Moravec, Tomáš; Plchová, Helena; Folwarczna, Jitka; Čeřovská, Noemi

    2012-01-01

    Vol. 56, No. 4 (2012), pp. 775-779 ISSN 0006-3134 R&D Projects: GA ČR GA521/09/1525 Institutional research plan: CEZ:AV0Z50380511 Keywords: chimeric coat protein * expression of recombinant protein * Nicotiana benthamiana Subject RIV: EI - Biotechnology; Bionics Impact factor: 1.692, year: 2012

  14. Rapid detection, classification and accurate alignment of up to a million or more related protein sequences.

    Science.gov (United States)

    Neuwald, Andrew F

    2009-08-01

    The patterns of sequence similarity and divergence present within functionally diverse, evolutionarily related proteins contain implicit information about corresponding biochemical similarities and differences. A first step toward accessing such information is to statistically analyze these patterns, which, in turn, requires that one first identify and accurately align a very large set of protein sequences. Ideally, the set should include many distantly related, functionally divergent subgroups. Because it is extremely difficult, if not impossible for fully automated methods to align such sequences correctly, researchers often resort to manual curation based on detailed structural and biochemical information. However, multiply-aligning vast numbers of sequences in this way is clearly impractical. This problem is addressed using Multiply-Aligned Profiles for Global Alignment of Protein Sequences (MAPGAPS). The MAPGAPS program uses a set of multiply-aligned profiles both as a query to detect and classify related sequences and as a template to multiply-align the sequences. It relies on Karlin-Altschul statistics for sensitivity and on PSI-BLAST (and other) heuristics for speed. Using as input a carefully curated multiple-profile alignment for P-loop GTPases, MAPGAPS correctly aligned weakly conserved sequence motifs within 33 distantly related GTPases of known structure. By comparison, the sequence- and structurally based alignment methods hmmalign and PROMALS3D misaligned at least 11 and 23 of these regions, respectively. When applied to a dataset of 65 million protein sequences, MAPGAPS identified, classified and aligned (with comparable accuracy) nearly half a million putative P-loop GTPase sequences. A C++ implementation of MAPGAPS is available at http://mapgaps.igs.umaryland.edu. Supplementary data are available at Bioinformatics online.

  15. The YPLGVG sequence of the Nipah virus matrix protein is required for budding

    Directory of Open Access Journals (Sweden)

    Yan Lianying

    2008-11-01

    Background: Nipah virus (NiV) is a recently emerged paramyxovirus capable of causing fatal disease in a broad range of mammalian hosts, including humans. Together with Hendra virus (HeV), they comprise the genus Henipavirus in the family Paramyxoviridae. Recombinant expression systems have played a crucial role in studying the cell biology of these Biosafety Level-4 restricted viruses. Henipavirus assembly and budding occurs at the plasma membrane, although the details of this process remain poorly understood. Multivesicular body (MVB) proteins have been found to play a role in the budding of several enveloped viruses, including some paramyxoviruses, and the recruitment of MVB proteins by viral proteins possessing late budding domains (L-domains) has become an important concept in the viral budding process. Previously we developed a system for producing NiV virus-like particles (VLPs) and demonstrated that the matrix (M) protein possessed an intrinsic budding ability and played a major role in assembly. Here, we have used this system to further explore the budding process by analyzing elements within the M protein that are critical for particle release. Results: Using rationally targeted site-directed mutagenesis we show that a NiV M sequence YPLGVG is required for M budding and that mutation or deletion of the sequence abrogates budding ability. Replacement of the native and overlapping Ebola VP40 L-domains with the NiV sequence failed to rescue VP40 budding; however, it did induce the cellular morphology of extensive filamentous projection consistent with wild-type VP40-expressing cells. Cells expressing wild-type NiV M also displayed this morphology, which was dependent on the YPLGVG sequence, and deletion of the sequence also resulted in nuclear localization of M. Dominant-negative VPS4 proteins had no effect on NiV M budding, suggesting that unlike other viruses such as Ebola, NiV M accomplishes budding independent of MVB cellular proteins.

  16. ORFer--retrieval of protein sequences and open reading frames from GenBank and storage into relational databases or text files.

    Science.gov (United States)

    Büssow, Konrad; Hoffmann, Steve; Sievert, Volker

    2002-12-19

    Functional genomics involves the parallel experimentation with large sets of proteins. This requires management of large sets of open reading frames as a prerequisite of the cloning and recombinant expression of these proteins. A Java program was developed for retrieval of protein and nucleic acid sequences and annotations from NCBI GenBank, using the XML sequence format. Annotations retrieved by ORFer include sequence name, organism and also the completeness of the sequence. The program has a graphical user interface, although it can be used in a non-interactive mode. For protein sequences, the program also extracts the open reading frame sequence, if available, and checks its correct translation. ORFer accepts user input in the form of single or lists of GenBank GI identifiers or accession numbers. It can be used to extract complete sets of open reading frames and protein sequences from any kind of GenBank sequence entry, including complete genomes or chromosomes. Sequences are either stored with their features in a relational database or can be exported as text files in FASTA or tab-delimited format. The ORFer program is freely available at http://www.proteinstrukturfabrik.de/orfer. The ORFer program allows for fast retrieval of DNA sequences, protein sequences and their open reading frames and sequence annotations from GenBank. Furthermore, storage of sequences and features in a relational database is supported. Such a database can supplement a laboratory information system (LIMS) with appropriate sequence information.
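
    The translation check mentioned in the abstract can be pictured with a small Python sketch. ORFer itself is a Java program and its internals are not described here, so the functions `translate` and `orf_matches_protein` below are hypothetical helpers; they assume the standard genetic code and an ORF that starts at its first base.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO)}

def translate(orf):
    """Translate a DNA open reading frame up to (not including) the first stop codon."""
    orf = orf.upper().replace("U", "T")
    if len(orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of three")
    protein = []
    for i in range(0, len(orf), 3):
        aa = CODON_TABLE.get(orf[i:i + 3], "X")  # "X" marks ambiguous codons
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)

def orf_matches_protein(orf, protein):
    """Check that the ORF encodes exactly the annotated protein sequence."""
    return translate(orf) == protein.upper()

print(translate("ATGGCCAAATAA"))
print(orf_matches_protein("ATGGCCAAATAA", "MAK"))
```

    A real check would also need to handle alternative genetic codes and partial sequences, which this sketch leaves out.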

  17. Dependence of M13 Major Coat Protein Oligomerization and Lateral Segregation of Bilayer Composition

    NARCIS (Netherlands)

    Fernandes, F.; Loura, L.M.S.; Prieto, M.; Koehorst, R.B.M.; Spruijt, R.B.; Hemminga, M.A.

    2003-01-01

    M13 major coat protein was derivatized with BODIPY (n-(4,4-difluoro-5,7-dimethyl-4-bora-3a,4a-diaza-s-indacene-3-yl)methyl iodoacetamide), and its aggregation was studied in 1,2-dioleoyl-sn-glycero-3-phosphocholine (DOPC) and DOPC/1,2-dioleoyl-sn-glycero-3-[phospho-rac-(1-glycerol)] (DOPG) or

  18. Neural network and SVM classifiers accurately predict lipid binding proteins, irrespective of sequence homology.

    Science.gov (United States)

    Bakhtiarizadeh, Mohammad Reza; Moradi-Shahrbabak, Mohammad; Ebrahimi, Mansour; Ebrahimie, Esmaeil

    2014-09-07

    Due to the central roles of lipid binding proteins (LBPs) in many biological processes, sequence based identification of LBPs is of great interest. The major challenge is that LBPs are diverse in sequence, structure, and function which results in low accuracy of sequence homology based methods. Therefore, there is a need for developing alternative functional prediction methods irrespective of sequence similarity. To identify LBPs from non-LBPs, the performances of support vector machine (SVM) and neural network were compared in this study. Comprehensive protein features and various techniques were employed to create datasets. Five-fold cross-validation (CV) and independent evaluation (IE) tests were used to assess the validity of the two methods. The results indicated that SVM outperforms neural network. SVM achieved 89.28% (CV) and 89.55% (IE) overall accuracy in identification of LBPs from non-LBPs and 92.06% (CV) and 92.90% (IE) (on average) for classification of different LBP classes. Increasing the number and the range of extracted protein features as well as optimization of the SVM parameters significantly increased the efficiency of LBP class prediction in comparison to the only previous report in this field. Altogether, the results showed that the SVM algorithm can be run on broad, computationally calculated protein features and offers a promising tool in detection of LBP classes. The proposed approach has the potential to integrate and improve the common sequence alignment based methods. Copyright © 2014 Elsevier Ltd. All rights reserved.

  19. An improved classification of G-protein-coupled receptors using sequence-derived features

    Directory of Open Access Journals (Sweden)

    Peng Zhen-Ling

    2010-08-01

    Background: G-protein-coupled receptors (GPCRs) play a key role in diverse physiological processes and are the targets of almost two-thirds of the marketed drugs. The 3D structures of GPCRs are largely unavailable; however, a large number of GPCR primary sequences are known. To facilitate the identification and characterization of novel receptors, it is therefore very valuable to develop a computational method to accurately predict GPCRs from the protein primary sequences. Results: We propose a new method called PCA-GPCR, to predict GPCRs using a comprehensive set of 1497 sequence-derived features. The principal component analysis is first employed to reduce the dimension of the feature space to 32. Then, the resulting 32-dimensional feature vectors are fed into a simple yet powerful classification algorithm, called intimate sorting, to predict GPCRs at five levels. The prediction at the first level determines whether a protein is a GPCR or a non-GPCR. If it is predicted to be a GPCR, then it will be further predicted into certain family, subfamily, sub-subfamily and subtype by the classifiers at the second, third, fourth, and fifth levels, respectively. To train the classifiers applied at five levels, a non-redundant dataset is carefully constructed, which contains 3178, 1589, 4772, 4924, and 2741 protein sequences at the respective levels. Jackknife tests on this training dataset show that the overall accuracies of PCA-GPCR at five levels (from the first to the fifth) can achieve up to 99.5%, 88.8%, 80.47%, 80.3%, and 92.34%, respectively. We further perform predictions on a dataset of 1238 GPCRs at the second level, and on another two datasets of 167 and 566 GPCRs respectively at the fourth level. The overall prediction accuracies of our method are consistently higher than those of the existing methods to be compared. Conclusions: The comprehensive set of 1497 features is believed to be capable of capturing information about amino acid

  20. Epitope Sequences in Dengue Virus NS1 Protein Identified by Monoclonal Antibodies

    Directory of Open Access Journals (Sweden)

    Leticia Barboza Rocha

    2017-10-01

    Dengue nonstructural protein 1 (NS1) is a multi-functional glycoprotein with essential functions both in viral replication and modulation of host innate immune responses. NS1 has been established as a good surrogate marker for infection. In the present study, we generated four anti-NS1 monoclonal antibodies against recombinant NS1 protein from dengue virus serotype 2 (DENV2), which were used to map three NS1 epitopes. The sequence 193AVHADMGYWIESALNDT209 was recognized by monoclonal antibodies 2H5 and 4H1BC, which also cross-reacted with Zika virus (ZIKV) protein. On the other hand, the sequence 25VHTWTEQYKFQPES38 was recognized by mAb 4F6, which did not cross-react with ZIKV. Lastly, a previously unidentified DENV2 NS1-specific epitope, represented by the sequence 127ELHNQTFLIDGPETAEC143, is described in the present study after reaction with mAb 4H2, which also did not cross-react with ZIKV. The selection and characterization of the epitope specificity of anti-NS1 mAbs may contribute to the development of diagnostic tools able to differentiate DENV and ZIKV infections.

  1. Monte Carlo simulation of a statistical mechanical model of multiple protein sequence alignment.

    Science.gov (United States)

    Kinjo, Akira R

    2017-01-01

    A grand canonical Monte Carlo (MC) algorithm is presented for studying the lattice gas model (LGM) of multiple protein sequence alignment, which coherently combines long-range interactions and variable-length insertions. MC simulations are used for both parameter optimization of the model and production runs to explore the sequence subspace around a given protein family. In this Note, I describe the details of the MC algorithm as well as some preliminary results of MC simulations with various temperatures and chemical potentials, and compare them with the mean-field approximation. The existence of a two-state transition in the sequence space is suggested for the SH3 domain family, and inappropriateness of the mean-field approximation for the LGM is demonstrated.

  2. Contextual Role of a Salt Bridge in the Phage P22 Coat Protein I-Domain

    Science.gov (United States)

    Harprecht, Christina; Okifo, Oghenefejiro; Robbins, Kevin J.; Motwani, Tina; Alexandrescu, Andrei T.; Teschke, Carolyn M.

    2016-01-01

    The I-domain is a genetic insertion in the phage P22 coat protein that chaperones its folding and stability. Of 11 acidic residues in the I-domain, seven participate in stabilizing electrostatic interactions with basic residues across elements of secondary structure, fastening the β-barrel fold. A hydrogen-bonded salt bridge between Asp-302 and His-305 is particularly interesting as Asp-302 is the site of a temperature-sensitive-folding mutation. The pKa of His-305 is raised to 9.0, indicating the salt bridge stabilizes the I-domain by ∼4 kcal/mol. Consistently, urea denaturation experiments indicate the stability of the WT I-domain decreases by 4 kcal/mol between neutral and basic pH. The mutants D302A and H305A remove the pH dependence of stability. The D302A substitution destabilizes the I-domain by 4 kcal/mol, whereas H305A had smaller effects, on the order of 1–2 kcal/mol. The destabilizing effects of D302A are perpetuated in the full-length coat protein as shown by a higher sensitivity to protease digestion, decreased procapsid assembly rates, and impaired phage production in vivo. By contrast, the mutants have only minor effects on capsid expansion or stability in vitro. The effects of the Asp-302–His-305 salt bridge are thus complex and context-dependent. Substitutions that abolish the salt bridge destabilize coat protein monomers and impair capsid self-assembly, but once capsids are formed the effects of the substitutions are overcome by new quaternary interactions between subunits. PMID:27006399
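
    The link between a pKa shift and a stabilization energy follows from the standard thermodynamic relation ΔG = RT ln(10) ΔpKa. The short Python sketch below works through that arithmetic. The reference pKa of 6.0 for an unperturbed histidine and the temperature of 298.15 K are assumptions made only for illustration; the abstract does not state which values the authors used.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)

def delta_g_from_pka_shift(pka_observed, pka_reference, temperature_k=298.15):
    """Free energy (kcal/mol) associated with a pKa shift: RT * ln(10) * delta_pKa."""
    return math.log(10) * R_KCAL * temperature_k * (pka_observed - pka_reference)

# Assumed reference pKa of 6.0 (not given in the abstract); observed pKa of 9.0.
print(round(delta_g_from_pka_shift(9.0, 6.0), 2))
```

    Under these assumptions a shift of three pKa units corresponds to roughly 4 kcal/mol, in line with the figure quoted in the abstract.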

  3. A sequence-based dynamic ensemble learning system for protein ligand-binding site prediction

    KAUST Repository

    Chen, Peng

    2015-12-03

    Background: Proteins have the fundamental ability to selectively bind to other molecules and perform specific functions through such interactions, such as protein-ligand binding. Accurate prediction of protein residues that physically bind to ligands is important for drug design and protein docking studies. Most of the successful protein-ligand binding predictions were based on known structures. However, structural information is not largely available in practice due to the huge gap between the number of known protein sequences and that of experimentally solved structures

  4. A sequence-based dynamic ensemble learning system for protein ligand-binding site prediction

    KAUST Repository

    Chen, Peng; Hu, ShanShan; Zhang, Jun; Gao, Xin; Li, Jinyan; Xia, Junfeng; Wang, Bing

    2015-01-01

    Background: Proteins have the fundamental ability to selectively bind to other molecules and perform specific functions through such interactions, such as protein-ligand binding. Accurate prediction of protein residues that physically bind to ligands is important for drug design and protein docking studies. Most of the successful protein-ligand binding predictions were based on known structures. However, structural information is not largely available in practice due to the huge gap between the number of known protein sequences and that of experimentally solved structures

  5. Protein backbone angle restraints from searching a database for chemical shift and sequence homology

    Energy Technology Data Exchange (ETDEWEB)

    Cornilescu, Gabriel; Delaglio, Frank; Bax, Ad [National Institutes of Health, Laboratory of Chemical Physics, National Institute of Diabetes and Digestive and Kidney Diseases (United States)

    1999-03-15

    Chemical shifts of backbone atoms in proteins are exquisitely sensitive to local conformation, and homologous proteins show quite similar patterns of secondary chemical shifts. The inverse of this relation is used to search a database for triplets of adjacent residues with secondary chemical shifts and sequence similarity which provide the best match to the query triplet of interest. The database contains 13Cα, 13Cβ, 13C', 1Hα and 15N chemical shifts for 20 proteins for which a high resolution X-ray structure is available. The computer program TALOS was developed to search this database for strings of residues with chemical shift and residue type homology. The relative importance of the weighting factors attached to the secondary chemical shifts of the five types of resonances relative to that of sequence similarity was optimized empirically. TALOS yields the 10 triplets which have the closest similarity in secondary chemical shift and amino acid sequence to those of the query sequence. If the central residues in these 10 triplets exhibit similar φ and ψ backbone angles, their averages can reliably be used as angular restraints for the protein whose structure is being studied. Tests carried out for proteins of known structure indicate that the root-mean-square difference (rmsd) between the output of TALOS and the X-ray derived backbone angles is about 15°. Approximately 3% of the predictions made by TALOS are found to be in error.
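
    The database search described above can be sketched in a few lines of Python. This is not the TALOS implementation: the `Triplet` record, the squared-difference score and the single `seq_weight` parameter are simplifying assumptions, whereas the abstract states that the actual weighting factors were optimized empirically. The sketch also omits the check that the best matches agree with one another before their angles are averaged.

```python
import math
from dataclasses import dataclass

@dataclass
class Triplet:
    residues: str      # three one-letter residue codes
    shifts: list       # secondary chemical shifts for the triplet, flattened
    phi: float = 0.0   # backbone phi of the central residue, degrees
    psi: float = 0.0   # backbone psi of the central residue, degrees

def score(query, entry, seq_weight=1.0):
    """Lower is better: shift mismatch plus a penalty per differing residue (assumed form)."""
    shift_term = sum((q - e) ** 2 for q, e in zip(query.shifts, entry.shifts))
    seq_term = sum(a != b for a, b in zip(query.residues, entry.residues))
    return shift_term + seq_weight * seq_term

def circular_mean(angles_deg):
    """Average angles on the circle so that values near +/-180 do not cancel."""
    s = sum(math.sin(math.radians(a)) for a in angles_deg)
    c = sum(math.cos(math.radians(a)) for a in angles_deg)
    return math.degrees(math.atan2(s, c))

def predict_angles(query, database, n_best=10):
    """Average phi/psi over the n_best database triplets closest to the query."""
    best = sorted(database, key=lambda e: score(query, e))[:n_best]
    return (circular_mean([e.phi for e in best]),
            circular_mean([e.psi for e in best]))

# Toy database: two helix-like entries close to the query and one distant entry.
database = [
    Triplet("AEL", [2.1, 1.9, 2.0], phi=-62.0, psi=-42.0),
    Triplet("AEL", [2.0, 2.0, 2.1], phi=-58.0, psi=-38.0),
    Triplet("GVT", [-1.5, -1.8, -1.2], phi=-120.0, psi=130.0),
]
query = Triplet("AEL", [2.0, 2.0, 2.0])
print(predict_angles(query, database, n_best=2))
```

    Using a circular mean rather than an arithmetic mean is a design choice of this sketch, made so that angles on either side of the ±180° boundary average sensibly.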

  6. The epsins define a family of proteins that interact with components of the clathrin coat and contain a new protein module

    DEFF Research Database (Denmark)

    Rosenthal, J A; Chen, H; Slepnev, V I

    1999-01-01

    Epsin (epsin 1) is an interacting partner for the EH domain-containing region of Eps15 and has been implicated in conjunction with Eps15 in clathrin-mediated endocytosis. We report here the characterization of a similar protein (epsin 2), which we have cloned from human and rat brain libraries. E...... fluorescent protein-epsin 2 mislocalizes components of the clathrin coat and inhibits clathrin-mediated endocytosis. The epsins define a new protein family implicated in membrane dynamics at the cell surface.

  7. Combining sequence-based prediction methods and circular dichroism and infrared spectroscopic data to improve protein secondary structure determinations

    Directory of Open Access Journals (Sweden)

    Lees Jonathan G

    2008-01-01

    Background: A number of sequence-based methods exist for protein secondary structure prediction. Protein secondary structures can also be determined experimentally from circular dichroism and infrared spectroscopic data using empirical analysis methods. It has been proposed that comparable accuracy can be obtained from sequence-based predictions as from these biophysical measurements. Here we have compared the secondary structure determination accuracies of sequence prediction methods with the empirically determined values from the spectroscopic data on datasets of proteins for which both crystal structures and spectroscopic data are available. Results: In this study we show that the sequence prediction methods have accuracies nearly comparable to those of spectroscopic methods. However, we also demonstrate that combining the spectroscopic and sequence techniques produces significant overall improvements in secondary structure determinations. In addition, combining the extra information content available from synchrotron radiation circular dichroism data with sequence methods also shows improvements. Conclusion: Combining sequence prediction with experimentally determined spectroscopic methods for protein secondary structure content significantly enhances the accuracy of the overall results obtained.

  8. DIALIGN: multiple DNA and protein sequence alignment at BiBiServ.

    OpenAIRE

    Morgenstern, Burkhard

    2004-01-01

    DIALIGN is a widely used software tool for multiple DNA and protein sequence alignment. The program combines local and global alignment features and can therefore be applied to sequence data that cannot be correctly aligned by more traditional approaches. DIALIGN is available online through Bielefeld Bioinformatics Server (BiBiServ). The downloadable version of the program offers several new program features. To compare the output of different alignment programs, we developed the program AltA...

  9. Oleosome-Associated Protein of the Oleaginous Diatom Fistulifera solaris Contains an Endoplasmic Reticulum-Targeting Signal Sequence

    Directory of Open Access Journals (Sweden)

    Yoshiaki Maeda

    2014-06-01

    Microalgae tend to accumulate lipids as an energy storage material in a specific organelle, the oleosome. Current studies have demonstrated that lipids derived from microalgal oleosomes are a promising source of biofuels, while the oleosome formation mechanism has not been fully elucidated. Oleosome-associated proteins have been identified from several microalgae to elucidate the fundamental mechanisms of oleosome formation, although understanding of their functions is still in its infancy. Recently, we discovered a diatom-oleosome-associated-protein 1 (DOAP1) from the oleaginous diatom, Fistulifera solaris JPCC DA0580. The DOAP1 sequence implied that this protein might be transported into the endoplasmic reticulum (ER) due to the signal sequence. To confirm this, we fused the signal sequence to green fluorescent protein. The fusion protein was distributed around the chloroplast in a meshwork-like membrane structure, indicating ER localization. This result suggests that DOAP1 could first localize at the ER, then move to the oleosomes. This study also demonstrated that the DOAP1 signal sequence allowed recombinant proteins to be specifically expressed in the ER of the oleaginous diatom. It would be a useful technique for engineering the lipid synthesis pathways existing in the ER, and ultimately for controlling biofuel quality.

  10. Sequence preservation of osteocalcin protein and mitochondrial DNA in bison bones older than 55 ka

    Science.gov (United States)

    Nielsen-Marsh, Christina M.; Ostrom, Peggy H.; Gandhi, Hasand; Shapiro, Beth; Cooper, Alan; Hauschka, Peter V.; Collins, Matthew J.

    2002-12-01

    We report the first complete sequences of the protein osteocalcin from small amounts (20 mg) of two bison bones (Bison priscus) dated to older than 55.6 ka and older than 58.9 ka. Osteocalcin was purified using new gravity columns (never exposed to protein) followed by microbore reversed-phase high-performance liquid chromatography. Sequencing of osteocalcin employed two methods of matrix-assisted laser desorption ionization mass spectrometry (MALDI-MS): peptide mass mapping (PMM) and post-source decay (PSD). The PMM shows that ancient and modern bison osteocalcin have the same mass to charge (m/z) distribution, indicating an identical protein sequence and absence of diagenetic products. This was confirmed by PSD of the m/z 2066 tryptic peptide (residues 1-19); the mass spectra from ancient and modern peptides were identical. The 129 mass unit difference in the molecular ion between cow (Bos taurus) and bison is caused by a single amino-acid substitution between the taxa (Trp in cow is replaced by Gly in bison at residue 5). Bison mitochondrial control region DNA sequences were obtained from the older than 55.6 ka fossil. These results suggest that DNA and protein sequences can be used to directly investigate molecular phylogenies over a considerable time period, the absolute limit of which is yet to be determined.
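
    The 129 mass unit difference quoted in this abstract can be checked with a short Python sketch. The monoisotopic residue masses below are standard reference values rather than figures from the abstract, and the function name `substitution_mass_shift` is a hypothetical helper introduced only for this illustration.

```python
# Monoisotopic masses (Da) of amino acid residues as they occur inside a
# peptide chain (free amino acid minus one water). Standard reference
# values, not taken from the abstract.
RESIDUE_MASS = {
    "G": 57.02146,    # glycine
    "W": 186.07931,   # tryptophan
}


def substitution_mass_shift(old: str, new: str) -> float:
    """Mass change of a peptide when residue `old` is replaced by `new`."""
    return RESIDUE_MASS[new] - RESIDUE_MASS[old]


# Cow osteocalcin has Trp at residue 5; bison has Gly at the same position.
shift = substitution_mass_shift("W", "G")
print(f"Trp -> Gly mass shift: {shift:.2f} Da")  # Trp -> Gly mass shift: -129.06 Da
```

    The magnitude of the shift, about 129 Da, agrees with the difference between the cow and bison molecular ions reported above.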

  11. Sequence analysis corresponding to the PPE and PE proteins in ...

    Indian Academy of Sciences (India)

    Unknown

    AB repeats; Mycobacterium tuberculosis genome; PE-PPE domain; PPE, PE proteins; sequence analysis; surface antigens. J. Biosci. | Vol. ... bacterium tuberculosis genomes resulted in the identification of a previously uncharacterized 225 amino acid- ...... Vega Lopez F, Brooks L A, Dockrell H M, De Smet K A,. Thompson ...

  12. Protein sequence analysis by incorporating modified chaos game and physicochemical properties into Chou's general pseudo amino acid composition.

    Science.gov (United States)

    Xu, Chunrui; Sun, Dandan; Liu, Shenghui; Zhang, Yusen

    2016-10-07

    In this contribution we introduce a novel graphical method to compare protein sequences. By mapping a protein sequence into 3D space based on codons and physicochemical properties of the 20 amino acids, we are able to get a unique P-vector from the 3D curve. This approach is consistent with the wobble theory of amino acids. We compute the distance between sequences by their P-vectors to measure similarities/dissimilarities among protein sequences. Finally, we use our method to analyze four datasets and get better results compared with previous approaches.

  13. PFP: Automated prediction of gene ontology functional annotations with confidence scores using protein sequence data.

    Science.gov (United States)

    Hawkins, Troy; Chitale, Meghana; Luban, Stanislav; Kihara, Daisuke

    2009-02-15

    Protein function prediction is a central problem in bioinformatics, increasing in importance recently due to the rapid accumulation of biological data awaiting interpretation. Sequence data represents the bulk of this new stock and is the obvious target for consideration as input, as newly sequenced organisms often lack any other type of biological characterization. We have previously introduced PFP (Protein Function Prediction) as our sequence-based predictor of Gene Ontology (GO) functional terms. PFP interprets the results of a PSI-BLAST search by extracting and scoring individual functional attributes, searching a wide range of E-value sequence matches, and utilizing conventional data mining techniques to fill in missing information. We have shown it to be effective in predicting both specific and low-resolution functional attributes when sufficient data is unavailable. Here we describe (1) significant improvements to the PFP infrastructure, including the addition of prediction significance and confidence scores, (2) a thorough benchmark of performance and comparisons to other related prediction methods, and (3) applications of PFP predictions to genome-scale data. We applied PFP predictions to uncharacterized protein sequences from 15 organisms. Among these sequences, 60-90% could be annotated with a GO molecular function term at high confidence (≥80%). We also applied our predictions to the protein-protein interaction network of the Malaria plasmodium (Plasmodium falciparum). High confidence GO biological process predictions (≥90%) from PFP increased the number of fully enriched interactions in this dataset from 23% of interactions to 94%. Our benchmark comparison shows significant performance improvement of PFP relative to GOtcha, InterProScan, and PSI-BLAST predictions. This is consistent with the performance of PFP as the overall best predictor in both the AFP-SIG '05 and CASP7 function (FN) assessments. PFP is available as a web service at http

  14. Complete nucleotide sequence of Alfalfa mosaic virus isolated from alfalfa (Medicago sativa L.) in Argentina.

    Science.gov (United States)

    Trucco, Verónica; de Breuil, Soledad; Bejerman, Nicolás; Lenardon, Sergio; Giolitti, Fabián

    2014-06-01

    The complete nucleotide sequence of an Alfalfa mosaic virus (AMV) isolate infecting alfalfa (Medicago sativa L.) in Argentina, AMV-Arg, was determined. The virus genome has the typical organization described for AMV, and comprises 3,643, 2,593, and 2,038 nucleotides for RNA1, 2 and 3, respectively. The whole genome sequence and each encoding region were compared with those of four other isolates that have been completely sequenced, from China, Italy, Spain and the USA. The nucleotide identity percentages ranged from 95.9 to 99.1% for the three RNAs and from 93.7 to 99% for the protein 1 (P1), protein 2 (P2), movement protein and coat protein (CP) encoding regions, whereas the amino acid identity percentages of these proteins ranged from 93.4 to 99.5%, the lowest value corresponding to P2. CP sequences of AMV-Arg were compared with those of 25 other available isolates, and a phylogenetic analysis based on the CP gene was carried out. The highest percentage of nucleotide sequence identity of the CP gene was 98.3% with a Chinese isolate, and 98.6% at the amino acid level with four isolates, two from Italy, one from Brazil and the remaining one from China. The phylogenetic analysis showed that AMV-Arg is closely related to subgroup I of AMV isolates. To our knowledge, this is the first report of a complete nucleotide sequence of AMV from South America and the first worldwide report of a complete nucleotide sequence of AMV isolated from alfalfa as natural host.
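
    Identity percentages such as those reported in this abstract are computed from pairwise alignments. The abstract does not say which software or gap convention the authors used, so the Python sketch below is only one common convention, assumed here for illustration: columns containing a gap in either sequence are skipped, and the function name `percent_identity` is hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Assumption: columns with a gap ('-') in either sequence are excluded
    from both the numerator and the denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue            # skip gapped columns
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy example with made-up sequences: 3 of 4 aligned positions agree.
print(percent_identity("ACGT", "ACGA"))  # 75.0
```

    Other conventions, such as dividing by the full alignment length or by the shorter sequence length, give slightly different percentages, which matters when comparing values across studies.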

  15. Variability of the protein sequences of lcrV between epidemic and atypical rhamnose-positive strains of Yersinia pestis.

    Science.gov (United States)

    Anisimov, Andrey P; Panfertsev, Evgeniy A; Svetoch, Tat'yana E; Dentovskaya, Svetlana V

    2007-01-01

    Sequencing of lcrV genes and comparison of the deduced amino acid sequences from ten Y. pestis strains belonging mostly to the group of atypical rhamnose-positive isolates (non-pestis subspecies or pestoides group) showed that the LcrV proteins analyzed could be classified into five sequence types. This classification was based on major amino acid polymorphisms among LcrV proteins in the four "hot points" of the protein sequences. Some additional minor polymorphisms were found throughout these sequence types. The "hot points" corresponded to amino acids 18 (Lys --> Asn), 72 (Lys --> Arg), 273 (Cys --> Ser), and 324-326 (Ser-Gly-Lys --> Arg) in the LcrV sequence of the reference Y. pestis strain CO92. One possible explanation for polymorphism in amino acid sequences of LcrV among different strains is that strain-specific variation resulted from adaptation of the plague pathogen to different rodent and lagomorph hosts.

  16. Visualization of protein sequence features using JavaScript and SVG with pViz.js.

    Science.gov (United States)

    Mukhyala, Kiran; Masselot, Alexandre

    2014-12-01

    pViz.js is a visualization library for displaying protein sequence features in a Web browser. By simply providing a sequence and the locations of its features, this lightweight, yet versatile, JavaScript library renders an interactive view of the protein features. Interactive exploration of protein sequence features over the Web is a common need in Bioinformatics. Although many Web sites have developed viewers to display these features, their implementations are usually focused on data from a specific source or use case. Some of these viewers can be adapted to fit other use cases but are not designed to be reusable. pViz makes it easy to display features as boxes aligned to a protein sequence with zooming functionality but also includes predefined renderings for secondary structure and post-translational modifications. The library is designed to further customize this view. We demonstrate such applications of pViz using two examples: a proteomic data visualization tool with an embedded viewer for displaying features on protein structure, and a tool to visualize the results of the variant_effect_predictor tool from Ensembl. pViz.js is a JavaScript library, available on github at https://github.com/Genentech/pviz. This site includes examples and functional applications, installation instructions and usage documentation. A Readme file, which explains how to use pViz with examples, is available as Supplementary Material A.

  17. Implication of the cause of differences in 3D structures of proteins with high sequence identity based on analyses of amino acid sequences and 3D structures.

    Science.gov (United States)

    Matsuoka, Masanari; Sugita, Masatake; Kikuchi, Takeshi

    2014-09-18

    Proteins that share a high sequence homology while exhibiting drastically different 3D structures are investigated in this study. Recently, artificial proteins related to the sequences of the GA and IgG binding GB domains of human serum albumin have been designed. These artificial proteins, referred to as GA and GB, share 98% amino acid sequence identity but exhibit different 3D structures, namely, a 3α bundle versus a 4β + α structure. Discriminating between their 3D structures based on their amino acid sequences is a very difficult problem. In the present work, in addition to using bioinformatics techniques, an analysis based on inter-residue average distance statistics is used to address this problem. It was hard to distinguish which structure a given sequence would take only with the results of ordinary analyses like BLAST and conservation analyses. However, in addition to these analyses, with the analysis based on the inter-residue average distance statistics and our sequence tendency analysis, we could infer which part would play an important role in its structural formation. The results suggest possible determinants of the different 3D structures for sequences with high sequence identity. The possibility of discriminating between the 3D structures based on the given sequences is also discussed.

  18. Hydrogen exchange kinetics in a membrane protein determined by 15N NMR spectroscopy: Use of the INEPT [insensitive nucleus enhancement by polarization transfer] experiment to follow individual amides in detergent-solubilized M13 coat protein

    International Nuclear Information System (INIS)

    Henry, G.D.; Sykes, B.D.

    1990-01-01

    The coat protein of the filamentous coliphage M13 is a 50-residue polypeptide which spans the inner membrane of the Escherichia coli host upon infection. Amide hydrogen exchange kinetics have been used to probe the structure and dynamics of M13 coat protein which has been solubilized in sodium dodecyl sulfate (SDS) micelles. In a previous 1H nuclear magnetic resonance (NMR) study, multiple exponential analysis of the unresolved amide proton envelope revealed the existence of two slow kinetic sets containing a total of about 30 protons. The slower set (15-20 amides) originates from the hydrophobic membrane-spanning region and exchanges at least 10^5-fold slower than the unstructured, non-H-bonded model polypeptide poly(DL-alanine). Herein the authors use 15N NMR spectroscopy of biosynthetically labeled coat protein to follow individual, assigned, slowly exchanging amides in or near the hydrophobic segment. The INEPT (insensitive nucleus enhancement by polarization transfer) experiments can be used to transfer magnetization to the 15N nucleus from a coupled proton; when 15N-labeled protonated protein is dissolved in 2H2O, the INEPT signal disappears with time as the amide protons are replaced by solvent deuterons. Amide hydrogen exchange is catalyzed by both H+ and OH- ions. The time-dependent exchange-out experiment is suitable for slow exchange rates (kex). The INEPT experiment was also adapted to measure some of the more rapidly exchanging amides in the coat protein using either saturation transfer from water or exchange effects on the polarization transfer step itself. The results of all of these experiments are consistent with previous models of the coat protein in which a stable segment extends from the hydrophobic membrane-spanning region through to the C-terminus, whereas the N-terminal region is undergoing more extensive dynamic fluctuations.

  19. Rapid capillary coating by epoxy-poly-(dimethylacrylamide): Performance in capillary zone electrophoresis of protein and polystyrene carboxylate

    Czech Academy of Sciences Publication Activity Database

    Chiari, M.; Cretich, M.; Šťastná, Miroslava; Radko, S. P.; Chrambach, A.

    2001-01-01

    Vol. 22, No. 4 (2001), pp. 656-659. ISSN 0173-0835. Institutional research plan: CEZ:AV0Z4031919. Keywords: capillary coating; capillary zone electrophoresis; proteins. Subject RIV: CB - Analytical Chemistry, Separation. Impact factor: 4.282 (year: 2001).

  20. A two-step recognition of signal sequences determines the translocation efficiency of proteins.

    Science.gov (United States)

    Belin, D; Bost, S; Vassalli, J D; Strub, K

    1996-02-01

    The cytosolic and secreted, N-glycosylated, forms of plasminogen activator inhibitor-2 (PAI-2) are generated by facultative translocation. To study the molecular events that result in the bi-topological distribution of proteins, we determined in vitro the capacities of several signal sequences to bind the signal recognition particle (SRP) during targeting, and to promote vectorial transport of murine PAI-2 (mPAI-2). Interestingly, the six signal sequences we compared (mPAI-2 and three mutated derivatives thereof, ovalbumin and preprolactin) were found to have differential activities in the two events. For example, the mPAI-2 signal sequence first binds SRP with moderate efficiency and then promotes the vectorial transport of only a fraction of the SRP-bound nascent chains. Our results provide evidence that the translocation efficiency of proteins can be controlled by the recognition of their signal sequences at two steps: during SRP-mediated targeting and during formation of a committed translocation complex. This second recognition may occur at several time points during the insertion/translocation step. In conclusion, signal sequences have a more complex structure than previously anticipated, allowing for multiple and independent interactions with the translocation machinery.

  1. Nucleotide sequence of a cDNA for branched chain acyltransferase with analysis of the deduced protein structure

    International Nuclear Information System (INIS)

    Hummel, K.B.; Litwer, S.; Bradford, A.P.; Aitken, A.; Danner, D.J.; Yeaman, S.J.

    1988-01-01

    Nucleotide sequence was determined for a 1.6-kilobase human cDNA putative for the branched chain acyltransferase protein of the branched chain α-ketoacid dehydrogenase complex. Translation of the sequence reveals an open reading frame encoding a 315-amino acid protein of molecular weight 35,759 followed by 560 bases of 3'-untranslated sequence. Three repeats of the polyadenylation signal hexamer ATTAAA are present prior to the polyadenylate tail. Within the open reading frame is a 10-amino acid fragment which matches exactly the amino acid sequence around the lipoate-lysine residue in bovine kidney branched chain acyltransferase, thus confirming the identity of the cDNA. Analysis of the deduced protein structure for the human branched chain acyltransferase revealed an organization into domains similar to that reported for the acyltransferase proteins of the pyruvate and α-ketoglutarate dehydrogenase complexes. This similarity in organization suggests that a more detailed analysis of the proteins will be required to explain the individual substrate and multienzyme complex specificity shown by these acyltransferases

  2. Changing folding and binding stability in a viral coat protein: a comparison between substitutions accessible through mutation and those fixed by natural selection.

    Science.gov (United States)

    Miller, Craig R; Lee, Kuo Hao; Wichman, Holly A; Ytreberg, F Marty

    2014-01-01

    Previous studies have shown that most random amino acid substitutions destabilize protein folding (i.e. increase the folding free energy). No analogous studies have been carried out for protein-protein binding. Here we use a structure-based model of the major coat protein in a simple virus, bacteriophage φX174, to estimate the free energy of folding of a single coat protein and binding of five coat proteins within a pentameric unit. We confirm and extend previous work in finding that most accessible substitutions destabilize both protein folding and protein-protein binding. We compare the pool of accessible substitutions with those observed among the φX174-like wild phage and in experimental evolution with φX174. We find that observed substitutions have smaller effects on stability than expected by chance. An analysis of adaptations at high temperatures suggests that selection favors either substitutions with no effect on stability or those that simultaneously stabilize protein folding and slightly destabilize protein binding. We speculate that these mutations might involve adjusting the rate of capsid assembly. At normal laboratory temperature there is little evidence of directional selection. Finally, we show that cumulative changes in stability are highly variable; sometimes they are well beyond the bounds of single substitution changes and sometimes they are not. The variation leads us to conclude that phenotype selection acts on more than just stability. Instances of larger cumulative stability change (never via a single substitution despite their availability) lead us to conclude that selection views stability at a local, not a global, level.

  3. Sputter deposited bioceramic coatings: surface characterisation and initial protein adsorption studies using surface-MALDI-MS

    DEFF Research Database (Denmark)

    Boyd, A. R.; Burke, G. A.; Duffy, H.

    2011-01-01

    Protein adsorption onto calcium phosphate (Ca–P) bioceramics utilised in hard tissue implant applications has been highlighted as one of the key events that influences the subsequent biological response, in vivo. This work reports on the use of surface-matrix assisted laser desorption ionisation...... to a combination of growth factors and lipoproteins present in serum. From the data obtained here it is evident that surface-MALDI-MS has significant utility as a tool for studying the dynamic nature of protein adsorption onto the surfaces of bioceramic coatings, which most likely plays a significant role...

  4. Coating Nanoparticles with Plant-Produced Transferrin-Hydrophobin Fusion Protein Enhances Their Uptake in Cancer Cells

    DEFF Research Database (Denmark)

    Reuter, Lauri J.; Shahbazi, Mohammad-Ali; Makila, Ermei M.

    2017-01-01

    can be expressed in Nicotiana benthamiana plants as a fusion with Trichoderma reesei hydrophobins HFBI, HFBII, or HFBIV. Transferrin-HFBIV was further expressed in tobacco BY-2 suspension cells. Both partners of the fusion protein retained their functionality; the hydrophobin moiety enabled migration...... to a surfactant phase in an aqueous two-phase system, and the transferrin moiety was able to reversibly bind iron. Coating porous silicon nanoparticles with the fusion protein resulted in uptake of the nanoparticles in human cancer cells. This study provides a proof-of-concept for the functionalization...

  5. Diverse supramolecular structures formed by self-assembling proteins of the Bacillus subtilis spore coat

    Science.gov (United States)

    Jiang, Shuo; Wan, Qiang; Krajcikova, Daniela; Tang, Jilin; Tzokov, Svetomir B.; Barak, Imrich

    2015-01-01

    Summary: Bacterial spores (endospores), such as those of the pathogens Clostridium difficile and Bacillus anthracis, are uniquely stable cell forms, highly resistant to harsh environmental insults. Bacillus subtilis is the best studied spore-former and we have used it to address the question of how the spore coat is assembled from multiple components to form a robust, protective superstructure. B. subtilis coat proteins (CotY, CotE, CotV and CotW) expressed in Escherichia coli can arrange intracellularly into highly stable macro-structures through processes of self-assembly. Using electron microscopy, we demonstrate the capacity of these proteins to generate ordered one-dimensional fibres, two-dimensional sheets and three-dimensional stacks. In one case (CotY), the high degree of order favours strong, cooperative intracellular disulfide cross-linking. Assemblies of this kind could form exquisitely adapted building blocks for higher-order assembly across all spore-formers. These physically robust arrayed units could also have novel applications in nano-biotechnology processes. PMID:25872412

  6. Protein sequences bound to mineral surfaces persist into deep time

    DEFF Research Database (Denmark)

    Demarchi, Beatrice; Hall, Shaun; Roncal-Herrero, Teresa

    2016-01-01

    of Laetoli (3.8 Ma) and Olduvai Gorge (1.3 Ma) in Tanzania. By tracking protein diagenesis back in time we find consistent patterns of preservation, demonstrating authenticity of the surviving sequences. Molecular dynamics simulations of struthiocalcin-1 and -2, the dominant proteins within the eggshell......, reveal that distinct domains bind to the mineral surface. It is the domain with the strongest calculated binding energy to the calcite surface that is selectively preserved. Thermal age calculations demonstrate that the Laetoli and Olduvai peptides are 50 times older than any previously authenticated...

  7. Programming molecular self-assembly of intrinsically disordered proteins containing sequences of low complexity

    Science.gov (United States)

    Simon, Joseph R.; Carroll, Nick J.; Rubinstein, Michael; Chilkoti, Ashutosh; López, Gabriel P.

    2017-06-01

    Dynamic protein-rich intracellular structures that contain phase-separated intrinsically disordered proteins (IDPs) composed of sequences of low complexity (SLC) have been shown to serve a variety of important cellular functions, which include signalling, compartmentalization and stabilization. However, our understanding of these structures and our ability to synthesize models of them have been limited. We present design rules for IDPs possessing SLCs that phase separate into diverse assemblies within droplet microenvironments. Using theoretical analyses, we interpret the phase behaviour of archetypal IDP sequences and demonstrate the rational design of a vast library of multicomponent protein-rich structures that ranges from uniform nano-, meso- and microscale puncta (distinct protein droplets) to multilayered orthogonally phase-separated granular structures. The ability to predict and program IDP-rich assemblies in this fashion offers new insights into (1) genetic-to-molecular-to-macroscale relationships that encode hierarchical IDP assemblies, (2) design rules of such assemblies in cell biology and (3) molecular-level engineering of self-assembled recombinant IDP-rich materials.

  8. Genome-wide profiling of DNA-binding proteins using barcode-based multiplex Solexa sequencing.

    Science.gov (United States)

    Raghav, Sunil Kumar; Deplancke, Bart

    2012-01-01

    Chromatin immunoprecipitation (ChIP) is a commonly used technique to detect the in vivo binding of proteins to DNA. ChIP is now routinely paired to microarray analysis (ChIP-chip) or next-generation sequencing (ChIP-Seq) to profile the DNA occupancy of proteins of interest on a genome-wide level. Because ChIP-chip introduces several biases, most notably due to the use of a fixed number of probes, ChIP-Seq has quickly become the method of choice as, depending on the sequencing depth, it is more sensitive, quantitative, and provides a greater binding site location resolution. With the ever increasing number of reads that can be generated per sequencing run, it has now become possible to analyze several samples simultaneously while maintaining sufficient sequence coverage, thus significantly reducing the cost per ChIP-Seq experiment. In this chapter, we provide a step-by-step guide on how to perform multiplexed ChIP-Seq analyses. As a proof-of-concept, we focus on the genome-wide profiling of RNA Polymerase II as measuring its DNA occupancy at different stages of any biological process can provide insights into the gene regulatory mechanisms involved. However, the protocol can also be used to perform multiplexed ChIP-Seq analyses of other DNA-binding proteins such as chromatin modifiers and transcription factors.

  9. The Ising model for prediction of disordered residues from protein sequence alone

    International Nuclear Information System (INIS)

    Lobanov, Michail Yu; Galzitskaya, Oxana V

    2011-01-01

    Intrinsically disordered regions serve as molecular recognition elements, which play an important role in the control of many cellular processes and signaling pathways. It is useful to be able to predict positions of disordered residues and disordered regions in protein chains using protein sequence alone. A new method (IsUnstruct) based on the Ising model for prediction of disordered residues from protein sequence alone has been developed. According to this model, each residue can be in one of two states: ordered or disordered. The model is an approximation of the Ising model in which the interaction term between neighbors has been replaced by a penalty for changing between states (the energy of border). IsUnstruct has been compared with other available methods and found to perform well. The method correctly finds 77% of disordered residues as well as 87% of ordered residues in the CASP8 database, and 72% of disordered residues as well as 85% of ordered residues in the DisProt database.
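
    The two-state model with a border penalty described in this abstract can be illustrated with a small dynamic-programming sketch in Python. This is not the IsUnstruct algorithm, which may well compute state probabilities rather than a single labeling; the sketch only finds the minimum-energy assignment. The per-residue energies, the function name `min_energy_states` and the penalty value are hypothetical inputs chosen for illustration.

```python
from typing import List


def min_energy_states(disorder_energy: List[float], border_penalty: float) -> str:
    """Label each residue 'O' (ordered) or 'D' (disordered).

    Assumed energy model: an ordered residue costs 0, a disordered residue
    costs disorder_energy[i], and every change of state between neighbouring
    residues costs border_penalty. Returns the minimum-energy labeling.
    """
    if not disorder_energy:
        return ""
    # cost[s] = best energy of the prefix ending in state s (0 = O, 1 = D)
    cost = [0.0, disorder_energy[0]]
    back = []  # back[i][s] = best previous state given state s at residue i+1
    for e in disorder_energy[1:]:
        new_cost, step = [], []
        for s, own in ((0, 0.0), (1, e)):
            stay = cost[s]
            switch = cost[1 - s] + border_penalty
            if stay <= switch:
                new_cost.append(stay + own)
                step.append(s)
            else:
                new_cost.append(switch + own)
                step.append(1 - s)
        cost = new_cost
        back.append(step)
    # Trace back from the cheaper final state
    state = 0 if cost[0] <= cost[1] else 1
    path = [state]
    for step in reversed(back):
        state = step[state]
        path.append(state)
    path.reverse()
    return "".join("D" if s else "O" for s in path)


# A run of residues that favour disorder outweighs two border penalties,
# while a single weakly disorder-prone residue does not.
print(min_energy_states([1, 1, -2, -2, -2, 1, 1], border_penalty=1.0))  # OODDDOO
print(min_energy_states([1, -0.5, 1], border_penalty=1.0))              # OOO
```

    The border penalty is what suppresses isolated single-residue state flips, so predicted disordered regions come out as contiguous segments.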

  10. Molecular Simulations of Sequence-Specific Association of Transmembrane Proteins in Lipid Bilayers

    Science.gov (United States)

    Doxastakis, Manolis; Prakash, Anupam; Janosi, Lorant

    2011-03-01

    Association of membrane proteins is central in material and information flow across the cellular membranes. Amino-acid sequence and the membrane environment are two critical factors controlling association; however, quantitative knowledge of such contributions is limited. In this work, we study the dimerization of helices in lipid bilayers using extensive parallel Monte Carlo simulations with recently developed algorithms. The dimerization of Glycophorin A is examined employing a coarse-grain model that retains a level of amino-acid specificity, in three different phospholipid bilayers. Association is driven by a balance of protein-protein and lipid-induced interactions with the latter playing a major role at short separations. Following a different approach, the effect of amino-acid sequence is studied using the four transmembrane domains of the epidermal growth factor receptor family in identical lipid environments. Detailed characterization of dimer formation and estimates of the free energy of association reveal that these helices present significant affinity to self-associate with certain dimers forming non-specific interfaces.

  11. Silica-coated Gd(DOTA)-loaded protein nanoparticles enable magnetic resonance imaging of macrophages

    Science.gov (United States)

    Bruckman, Michael A.; Randolph, Lauren N.; Gulati, Neetu M.; Stewart, Phoebe L.; Steinmetz, Nicole F.

    2015-01-01

    The molecular imaging of in vivo targets allows non-invasive disease diagnosis. Nanoparticles offer a promising platform for molecular imaging because they can deliver large payloads of imaging reagents to the site of disease. Magnetic resonance imaging (MRI) is often preferred for clinical diagnosis because it uses non-ionizing radiation and offers both high spatial resolution and excellent penetration. We have explored the use of plant viruses as the basis for MRI contrast reagents, specifically Tobacco mosaic virus (TMV), which can assemble to form either stiff rods or spheres. We loaded TMV particles with paramagnetic Gd ions, increasing the ionic relaxivity compared to free Gd ions. The loaded TMV particles were then coated with silica, maintaining high relaxivities. Interestingly, we found that when Gd(DOTA) was loaded into the interior channel of TMV and the exterior was coated with silica, the T1 relaxivities increased three-fold from 10.9 mM−1 s−1 to 29.7 mM−1 s−1 at 60 MHz compared to uncoated Gd-loaded TMV. To test the performance of the contrast agents in a biological setting, we focused on interactions with macrophages because the active or passive targeting of immune cells is a popular strategy to investigate the cellular components involved in disease progression associated with inflammation. In vitro assays and phantom MRI experiments indicate efficient targeting and imaging of macrophages; enhanced contrast-to-noise ratio was observed by shape-engineering (SNP > TMV) and silica-coating (Si-TMV/SNP > TMV/SNP). Because plant viruses are in the food chain, antibodies may be prevalent in the population. Therefore we investigated whether the silica-coating could prevent antibody recognition; indeed our data indicate that mineralization can be used as a stealth coating option to reduce clearance. Therefore, we conclude that the silica-coated protein-based contrast agent may provide an interesting candidate material for further investigation.

  12. Identification of similar regions of protein structures using integrated sequence and structure analysis tools

    Directory of Open Access Journals (Sweden)

    Heiland Randy

    2006-03-01

    Background: Understanding protein function from its structure is a challenging problem. Sequence based approaches for finding homology have broad use for annotation of both structure and function. 3D structural information of protein domains and their interactions provides a view of structure-function relationships that is complementary to sequence information. We have developed a web site http://www.sblest.org/ and an API of web services that enables users to submit protein structures and identify statistically significant neighbors and the underlying structural environments that make that match using a suite of sequence and structure analysis tools. To do this, we have integrated S-BLEST, PSI-BLAST and HMMer based superfamily predictions to give a unique integrated view to prediction of SCOP superfamilies, EC number, and GO term, as well as identification of the protein structural environments that are associated with that prediction. Additionally, we have extended UCSF Chimera and PyMOL to support our web services, so that users can characterize their own proteins of interest. Results: Users are able to submit their own queries or use a structure already in the PDB. Currently the databases that a user can query include the popular structural datasets ASTRAL 40 v1.69, ASTRAL 95 v1.69, CLUSTER50, CLUSTER70 and CLUSTER90 and PDBSELECT25. The results can be downloaded directly from the site and include function prediction, analysis of the most conserved environments and automated annotation of query proteins. These results reflect the hits found with PSI-BLAST, HMMer and S-BLEST. We have evaluated how well annotation transfer can be performed on SCOP IDs, Gene Ontology (GO) IDs and EC Numbers. The method is very efficient and totally automated, generally taking around fifteen minutes for a 400 residue protein. Conclusion: With structural genomics initiatives determining structures with little, if any, functional characterization

  13. Genome-Wide Prediction and Analysis of 3D-Domain Swapped Proteins in the Human Genome from Sequence Information.

    Science.gov (United States)

    Upadhyay, Atul Kumar; Sowdhamini, Ramanathan

    2016-01-01

    3D-domain swapping is one of the mechanisms of protein oligomerization and the proteins exhibiting this phenomenon have many biological functions. These proteins, which undergo domain swapping, have attracted much attention owing to their involvement in human diseases, such as conformational diseases, amyloidosis, serpinopathies, proteinopathies etc. Early recognition of proteins in the whole human genome that retain a tendency to domain swap will enable many aspects of disease control management. Predictive models were developed by using machine learning approaches with an average accuracy of 78% (85.6% sensitivity, 87.5% specificity and an MCC value of 0.72) to predict putative domain swapping in protein sequences. These models were applied to many complete genomes with special emphasis on the human genome. Nearly 44% of the protein sequences in the human genome were predicted positive for domain swapping. Enrichment analysis was performed on the positively predicted sequences from the human genome for their domain distribution, disease association and functional importance based on Gene Ontology (GO). Enrichment analysis was also performed to infer a better understanding of the functional importance of these sequences. Finally, we developed hinge region prediction, in the given putative domain swapped sequence, by using important physicochemical properties of amino acids.

  14. Random amino acid mutations and protein misfolding lead to Shannon limit in sequence-structure communication.

    Directory of Open Access Journals (Sweden)

    Andreas Martin Lisewski

    2008-09-01

    The transmission of genomic information from coding sequence to protein structure during protein synthesis is subject to stochastic errors. To analyze transmission limits in the presence of spurious errors, Shannon's noisy channel theorem is applied to a communication channel between amino acid sequences and their structures established from a large-scale statistical analysis of protein atomic coordinates. While Shannon's theorem confirms that in close to native conformations information is transmitted with limited error probability, additional random errors in sequence (amino acid substitutions) and in structure (structural defects) trigger a decrease in communication capacity toward a Shannon limit at 0.010 bits per amino acid symbol at which communication breaks down. In several controls, simulated error rates above a critical threshold and models of unfolded structures always produce capacities below this limiting value. Thus an essential biological system can be realistically modeled as a digital communication channel that is (a) sensitive to random errors and (b) restricted by a Shannon error limit. This forms a novel basis for predictions consistent with observed rates of defective ribosomal products during protein synthesis, and with the estimated excess of mutual information in protein contact potentials.
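    How random errors erode channel capacity can be seen in the simplest textbook case, the binary symmetric channel, whose capacity is C = 1 - H(p) bits per symbol for error probability p. The sketch below is only a toy illustration of that relationship; the channel analysed in this record works on amino acid symbols and structural data and is not reproduced here.

```python
from math import log2

def binary_entropy(p):
    """Shannon entropy H(p) in bits of a Bernoulli(p) variable."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * log2(p) - (1 - p) * log2(1 - p)

def bsc_capacity(error_prob):
    """Capacity in bits per symbol of a binary symmetric channel."""
    if not 0.0 <= error_prob <= 1.0:
        raise ValueError("error_prob must lie in [0, 1]")
    return 1.0 - binary_entropy(error_prob)

# Capacity falls as the error rate approaches 0.5
capacities = [bsc_capacity(p) for p in (0.0, 0.1, 0.3, 0.5)]
```

    At p = 0.5 the output is independent of the input and the capacity is zero, the toy-model analogue of communication breaking down.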

  15. Origin and spread of photosynthesis based upon conserved sequence features in key bacteriochlorophyll biosynthesis proteins.

    Science.gov (United States)

    Gupta, Radhey S

    2012-11-01

    The origin of photosynthesis and how this capability has spread to other bacterial phyla remain important unresolved questions. I describe here a number of conserved signature indels (CSIs) in key proteins involved in bacteriochlorophyll (Bchl) biosynthesis that provide important insights in these regards. The proteins BchL and BchX, which are essential for Bchl biosynthesis, are derived by gene duplication in a common ancestor of all phototrophs. More ancient gene duplication gave rise to the BchX-BchL proteins and the NifH protein of the nitrogenase complex. The sequence alignment of NifH-BchX-BchL proteins contains two CSIs that are uniquely shared by all NifH and BchX homologs, but not by any BchL homologs. These CSIs and phylogenetic analysis of NifH-BchX-BchL protein sequences strongly suggest that the BchX homologs are ancestral to BchL and that the Bchl-based anoxygenic photosynthesis originated prior to the chlorophyll (Chl)-based photosynthesis in cyanobacteria. Another CSI in the BchX-BchL sequence alignment that is uniquely shared by all BchX homologs and the BchL sequences from Heliobacteriaceae, but absent in all other BchL homologs, suggests that the BchL homologs from Heliobacteriaceae are primitive in comparison to all other photosynthetic lineages. Several other identified CSIs in the BchN homologs are commonly shared by all proteobacterial homologs and a clade consisting of the marine unicellular Cyanobacteria (Clade C). These CSIs in conjunction with the results of phylogenetic analyses and pair-wise sequence similarity on the BchL, BchN, and BchB proteins, where the homologs from Clade C Cyanobacteria and Proteobacteria exhibited close relationship, provide strong evidence that these two groups have incurred lateral gene transfers. Additionally, phylogenetic analyses and several CSIs in the BchL-N-B proteins that are uniquely shared by all Chlorobi and Chloroflexi homologs provide evidence that the genes for these proteins have also been

  16. PANTHER version 6: protein sequence and function evolution data with expanded representation of biological pathways

    OpenAIRE

    Mi, Huaiyu; Guo, Nan; Kejariwal, Anish; Thomas, Paul D.

    2006-01-01

    PANTHER is a freely available, comprehensive software system for relating protein sequence evolution to the evolution of specific protein functions and biological roles. Since 2005, there have been three main improvements to PANTHER. First, the sequences used to create evolutionary trees are carefully selected to provide coverage of phylogenetic as well as functional information. Second, PANTHER is now a member of the InterPro Consortium, and the PANTHER hidden Markov models (HMMs) are distri...

  17. Using the Relevance Vector Machine Model Combined with Local Phase Quantization to Predict Protein-Protein Interactions from Protein Sequences

    Directory of Open Access Journals (Sweden)

    Ji-Yong An

    2016-01-01

    We propose a novel computational method known as RVM-LPQ that combines the Relevance Vector Machine (RVM) model and Local Phase Quantization (LPQ) to predict PPIs from protein sequences. The main improvements are the results of representing protein sequences using the LPQ feature representation on a Position Specific Scoring Matrix (PSSM), reducing the influence of noise using a Principal Component Analysis (PCA), and using a Relevance Vector Machine (RVM) based classifier. We perform 5-fold cross-validation experiments on Yeast and Human datasets, and we achieve very high accuracies of 92.65% and 97.62%, respectively, which is significantly better than previous work. To further evaluate the proposed method, we compare it with the state-of-the-art support vector machine (SVM) classifier on the Yeast dataset. The experimental results demonstrate that our RVM-LPQ method is clearly better than the SVM-based method. The promising experimental results show the efficiency and simplicity of the proposed method, which can be an automatic decision support tool for future proteomics research.
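    The 5-fold cross-validation mentioned in this abstract partitions the samples into five folds and uses each fold once as the test set. The following is a minimal sketch of the index bookkeeping only; the function name is hypothetical, no shuffling or stratification is applied, and the authors' actual splitting procedure is not described in this record.

```python
def k_fold_indices(n_samples, k):
    """Return k (train_indices, test_indices) pairs covering range(n_samples)."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must satisfy 2 <= k <= n_samples")
    # Sample i goes to fold i % k, so fold sizes differ by at most one
    folds = [list(range(start, n_samples, k)) for start in range(k)]
    splits = []
    for held_out, test in enumerate(folds):
        train = sorted(
            idx
            for fold_id, fold in enumerate(folds)
            if fold_id != held_out
            for idx in fold
        )
        splits.append((train, test))
    return splits

splits = k_fold_indices(n_samples=10, k=5)
```

    The reported accuracy of such an experiment is typically the mean over the k held-out folds.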

  18. Complete sequence of Fig fleck-associated virus, a novel member of the family Tymoviridae.

    Science.gov (United States)

    Elbeaino, Toufic; Digiaro, Michele; Martelli, Giovanni P

    2011-11-01

    The complete nucleotide sequence and the genome organization were determined of a novel virus, tentatively named Fig fleck-associated virus (FFkaV). The viral genome is a positive-sense, single-stranded RNA 7046 nucleotides in size excluding the 3'-terminal poly(A) tract, and comprising two open reading frames. ORF1 encodes a polypeptide of 2161 amino acids (p240), which contains the signatures of replication-associated proteins and the coat protein cistron (p24) at its 3' end. ORF2 codes for a 461 amino acid protein (p50) identified as a putative movement protein (MP). In phylogenetic trees constructed with sequences of the putative polymerase and CP proteins FFkaV consistently groups with members of the genus Maculavirus, family Tymoviridae. However, the genome organization diverges from that of the two completely sequenced maculaviruses, Grapevine fleck virus (GFkV) and Bombyx mori Macula-like virus (BmMLV), as it exhibits a structure resembling that of Maize rayado fino virus (MRFV), the type species of the genus Marafivirus and of Olive latent virus 3 (OLV-3), an unclassified virus in the family Tymoviridae. FFkaV was found in field-grown figs from six Mediterranean countries with an incidence ranging from 15% to 25%. Copyright © 2011 Elsevier B.V. All rights reserved.

  19. SigniSite: Identification of residue-level genotype-phenotype correlations in protein multiple sequence alignments

    DEFF Research Database (Denmark)

    Jessen, Leon Ivar; Hoof, Ilka; Lund, Ole

    2013-01-01

    Site does not require any pre-definition of subgroups or binary classification. Input is a set of protein sequences where each sequence has an associated real number, quantifying a given phenotype. SigniSite will then identify which amino acid residues are significantly associated with the data set......) using a set of human immunodeficiency virus protease-inhibitor genotype–phenotype data and corresponding resistance mutation scores from the Stanford University HIV Drug Resistance Database, and a data set of protein families with experimentally annotated SDPs. For both data sets, SigniSite was found...

  20. Bone Morphogenetic Protein Coating on Titanium Implant Surface: a Systematic Review

    Directory of Open Access Journals (Sweden)

    Haim Haimov

    2017-06-01

    Objectives: The purpose of the study is to systematically review the osseointegration process improvement by bone morphogenetic protein coating on titanium implant surface. Material and Methods: An electronic literature search was conducted through the MEDLINE (PubMed) and EMBASE databases. The search was restricted to articles published during the last 10 years, from October 2006 to September 2016, and articles were limited to English language. Results: A total of 41 articles were reviewed, and 8 of the most relevant articles that met the criteria were selected. Articles were analysed regarding concentration of bone morphogenetic protein (BMP), delivery systems, adverse reactions and the influence of the BMP on the bone and peri-implant surface in vivo. Finally, the present data included 340 implants and 236 models. Conclusions: Most of the examined studies clearly show that bone morphogenetic protein increases bone regeneration. Further studies should be done in order to induce and sustain bone formation activity. Osteogenic agent should be gradually liberated and not rapidly released, with priority to three-dimension reservoir (incorporated titanium implant surface) in order to avoid the following severe side effects: inflammation, bleeding, haematoma, oedema, erythema, and graft failure.

  1. Coat protein deletion mutants elicit more severe symptoms than wild-type virus in multiple cereal hosts

    Science.gov (United States)

    The coat protein (CP) of Wheat streak mosaic virus (WSMV; genus Tritimovirus, family Potyviridae) tolerates deletion of amino acids 36 to 84 for efficient systemic infection of wheat. This study demonstrates that deletion of CP amino acids 58 to 84, but not 36 to 57, from WSMV genome induced severe ...

  2. Evolutionary conservation of nuclear and nucleolar targeting sequences in yeast ribosomal protein S6A

    International Nuclear Information System (INIS)

    Lipsius, Edgar; Walter, Korden; Leicher, Torsten; Phlippen, Wolfgang; Bisotti, Marc-Angelo; Kruppa, Joachim

    2005-01-01

    Over 1 billion years ago, the animal kingdom diverged from the fungi. Nevertheless, a high sequence homology of 62% exists between human ribosomal protein S6 and S6A of Saccharomyces cerevisiae. To investigate whether this similarity in primary structure is mirrored in corresponding functional protein domains, the nuclear and nucleolar targeting signals were delineated in yeast S6A and compared to the known human S6 signals. The complete sequence of S6A and cDNA fragments was fused to the 5'-end of the LacZ gene, the constructs were transiently expressed in COS cells, and the subcellular localization of the fusion proteins was detected by indirect immunofluorescence. One bipartite and two monopartite nuclear localization signals as well as two nucleolar binding domains were identified in yeast S6A, which are located at homologous regions in human S6 protein. Remarkably, the number, nature, and position of these targeting signals have been conserved, albeit their amino acid sequences have presumably undergone a process of co-evolution with their corresponding rRNAs

  3. Convergent Evolution of Slick Coat in Cattle through Truncation Mutations in the Prolactin Receptor

    Directory of Open Access Journals (Sweden)

    Laercio R. Porto-Neto

    2018-02-01

    Evolutionary adaptations are occasionally convergent solutions to the same problem. A mutation contributing to a heat tolerance adaptation in Senepol cattle, a New World breed of mostly European descent, results in the distinct phenotype known as slick, where an animal has shorter hair and lower follicle density across its coat than wild type animals. The causal variant, located in the 11th exon of prolactin receptor, produces a frameshift that results in a truncated protein. However, this mutation does not explain all cases of slick coats found in criollo breeds. Here, we obtained genome sequences from slick cattle of a geographically distinct criollo breed, namely Limonero, whose ancestors were originally brought to the Americas by the Spanish. These data were used to identify new causal alleles in the 11th exon of the prolactin receptor, two of which also encode shortened proteins that remove a highly conserved tyrosine residue. These new mutations explained almost 90% of investigated cases of animals that had slick coats, but which also did not carry the Senepol slick allele. These results demonstrate convergent evolution at the molecular level in a trait important to the adaptation of an animal to its environment.

  4. The promotion of osseointegration of titanium surfaces by coating with silk protein sericin.

    Science.gov (United States)

    Nayak, Sunita; Dey, Tuli; Naskar, Deboki; Kundu, Subhas C

    2013-04-01

    A promising strategy to influence the osseointegration process around orthopaedic titanium implants is the immobilization of bioactive molecules. This promotes appropriate interaction between the surface and the tissue by directing cell adhesion, proliferation, differentiation and active matrix remodelling. In this study, we aimed to investigate the functionalization of metallic implant titanium with silk protein sericin. Non-mulberry Antheraea mylitta sericin was immobilized on the titanium surface using glutaraldehyde as crosslinker. To analyse combinatorial effects, the sericin immobilized titanium was further conjugated with integrin binding peptide sequence Arg-Gly-Asp (RGD) using ethyl (dimethylaminopropyl) carbodiimide and N-hydroxysulfosuccinimide as coupling agents. The surface of sericin immobilized titanium was characterized biophysically. Osteoblast-like cells were cultured on sericin and sericin/RGD functionalized titanium and found to be more viable than those on pristine titanium. Enhanced adhesion, proliferation, and differentiation of osteoblast cells were observed. RT-PCR analysis showed that mRNA expressions of bone sialoprotein, osteocalcin and alkaline phosphatase were upregulated in osteoblast cells cultured on sericin and sericin/RGD immobilized titanium substrates. Additionally, no significant production of the pro-inflammatory cytokines TNF-α and IL-1β or of nitric oxide was recorded when macrophage cells and osteoblast-macrophage co-culture cells were grown on sericin immobilized titanium. The findings demonstrate that the sericin immobilized titanium surfaces are potentially useful bioactive coated materials for titanium-based medical implants. Copyright © 2013 Elsevier Ltd. All rights reserved.

  5. Effect of the sequence data deluge on the performance of methods for detecting protein functional residues.

    Science.gov (United States)

    Garrido-Martín, Diego; Pazos, Florencio

    2018-02-27

    The exponential accumulation of new sequences in public databases is expected to improve the performance of all the approaches for predicting protein structural and functional features. Nevertheless, this was never assessed or quantified for some widely used methodologies, such as those aimed at detecting functional sites and functional subfamilies in protein multiple sequence alignments. Using raw protein sequences as only input, these approaches can detect fully conserved positions, as well as those with a family-dependent conservation pattern. Both types of residues are routinely used as predictors of functional sites and, consequently, understanding how the sequence content of the databases affects them is relevant and timely. In this work we evaluate how the growth and change with time in the content of sequence databases affect five sequence-based approaches for detecting functional sites and subfamilies. We do that by recreating historical versions of the multiple sequence alignments that would have been obtained in the past based on the database contents at different time points, covering a period of 20 years. Applying the methods to these historical alignments allows quantifying the temporal variation in their performance. Our results show that the number of families to which these methods can be applied sharply increases with time, while their ability to detect potentially functional residues remains almost constant. These results are informative for the methods' developers and final users, and may have implications in the design of new sequencing initiatives.
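    The "fully conserved positions" mentioned in this abstract are the simplest of the signals these methods extract from a multiple sequence alignment. A minimal sketch follows, assuming the alignment is given as equal-length strings with '-' as the gap character; the five approaches evaluated in the study are more elaborate and are not reproduced here.

```python
def conserved_columns(alignment, gap="-"):
    """Return 0-based indices of columns where every sequence has the same residue."""
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    conserved = []
    for col in range(length):
        residues = {seq[col] for seq in alignment}
        # A column counts as conserved only if it holds one residue type and no gaps
        if len(residues) == 1 and gap not in residues:
            conserved.append(col)
    return conserved

# Hypothetical toy alignment, for illustration only
columns = conserved_columns(["MKTAY", "MKSAY", "MRTAY"])
```

    Because the result depends on which sequences are in the alignment, adding more divergent homologs can only shrink the set of fully conserved columns, which is one reason database growth matters for such predictors.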

  6. Structural and sequence analysis of imelysin-like proteins implicated in bacterial iron uptake.

    Directory of Open Access Journals (Sweden)

    Qingping Xu

    Imelysin-like proteins define a superfamily of bacterial proteins that are likely involved in iron uptake. Members of this superfamily were previously thought to be peptidases and were included in the MEROPS family M75. We determined the first crystal structures of two remotely related, imelysin-like proteins. The Psychrobacter arcticus structure was determined at 2.15 Å resolution and contains the canonical imelysin fold, while higher resolution structures from the gut bacterium Bacteroides ovatus, in two crystal forms (at 1.25 Å and 1.44 Å resolution), have a circularly permuted topology. Both structures are highly similar to each other despite low sequence similarity and circular permutation. The all-helical structure can be divided into two similar four-helix bundle domains. The overall structure and the GxHxxE motif region differ from known HxxE metallopeptidases, suggesting that imelysin-like proteins are not peptidases. A putative functional site is located at the domain interface. We have now organized the known homologous proteins into a superfamily, which can be separated into four families. These families share a similar functional site, but each has family-specific structural and sequence features. These results indicate that imelysin-like proteins have evolved from a common ancestor, and likely have a conserved function.

  7. Properties of Sequence Conservation in Upstream Regulatory and Protein Coding Sequences among Paralogs in Arabidopsis thaliana

    Science.gov (United States)

    Richardson, Dale N.; Wiehe, Thomas

    Whole genome duplication (WGD) has catalyzed the formation of new species, genes with novel functions, altered expression patterns, complexified signaling pathways and has provided organisms a level of genetic robustness. We studied the long-term evolution and interrelationships of 5’ upstream regulatory sequences (URSs), protein coding sequences (CDSs) and expression correlations (EC) of duplicated gene pairs in Arabidopsis. Three distinct methods revealed significant evolutionary conservation between paralogous URSs and were highly correlated with microarray-based expression correlation of the respective gene pairs. Positional information on exact matches between sequences unveiled the contribution of micro-chromosomal rearrangements on expression divergence. A three-way rank analysis of URS similarity, CDS divergence and EC uncovered specific gene functional biases. Transcription factor activity was associated with gene pairs exhibiting conserved URSs and divergent CDSs, whereas a broad array of metabolic enzymes was found to be associated with gene pairs showing diverged URSs but conserved CDSs.

  8. Albumin-coated SPIONs: an experimental and theoretical evaluation of protein conformation, binding affinity and competition with serum proteins

    Science.gov (United States)

    Yu, Siming; Perálvarez-Marín, Alex; Minelli, Caterina; Faraudo, Jordi; Roig, Anna; Laromaine, Anna

    2016-07-01

    The variety of nanoparticles (NPs) used in biological applications is increasing and the study of their interaction with biological media is becoming more important. Proteins are commonly the first biomolecules that NPs encounter when they interact with biological systems either in vitro or in vivo. Among NPs, super-paramagnetic iron oxide nanoparticles (SPIONs) show great promise for medicine. In this work, we study in detail the formation, composition, and structure of a monolayer of bovine serum albumin (BSA) on SPIONs. We determine, both by molecular simulations and experimentally, that ten molecules of BSA form a monolayer around the outside of the SPIONs and their binding strength to the SPIONs is about 3.5 × 10⁻⁴ M, ten times higher than the adsorption of fetal bovine serum (FBS) on the same SPIONs. We elucidate a strong electrostatic interaction between BSA and the SPIONs, although the secondary structure of the protein is not affected. We present data that supports the strong binding of the BSA monolayer on SPIONs and the properties of the BSA layer as a protein-resistant coating. We believe that a complete understanding of the behavior and morphology of BSA-SPIONs and how the protein interacts with SPIONs is crucial for improving NP surface design and expanding the potential applications of SPIONs in nanomedicine.

  9. Full-Length Venom Protein cDNA Sequences from Venom-Derived mRNA: Exploring Compositional Variation and Adaptive Multigene Evolution.

    Science.gov (United States)

    Modahl, Cassandra M; Mackessy, Stephen P

    2016-06-01

    Envenomation of humans by snakes is a complex and continuously evolving medical emergency, and treatment is made that much more difficult by the diverse biochemical composition of many venoms. Venomous snakes and their venoms also provide models for the study of molecular evolutionary processes leading to adaptation and genotype-phenotype relationships. To compare venom complexity and protein sequences, venom gland transcriptomes are assembled, which usually requires the sacrifice of snakes for tissue. However, toxin transcripts are also present in venoms, offering the possibility of obtaining cDNA sequences directly from venom. This study provides evidence that unknown full-length venom protein transcripts can be obtained from the venoms of multiple species from all major venomous snake families. These unknown venom protein cDNAs are obtained by the use of primers designed from conserved signal peptide sequences within each venom protein superfamily. This technique was used to assemble a partial venom gland transcriptome for the Middle American Rattlesnake (Crotalus simus tzabcan) by amplifying sequences for phospholipases A2, serine proteases, C-lectins, and metalloproteinases from within venom. Phospholipase A2 sequences were also recovered from the venoms of several rattlesnakes and an elapid snake (Pseudechis porphyriacus), and three-finger toxin sequences were recovered from multiple rear-fanged snake species, demonstrating that the three major clades of advanced snakes (Elapidae, Viperidae, Colubridae) have stable mRNA present in their venoms. These cDNA sequences from venom were then used to explore potential activities derived from protein sequence similarities and evolutionary histories within these large multigene superfamilies. Venom-derived sequences can also be used to aid in characterizing venoms that lack proteomic profiles and identify sequence characteristics indicating specific envenomation profiles. This approach, requiring only venom, provides

  10. Protein domain analysis of genomic sequence data reveals regulation of LRR related domains in plant transpiration in Ficus.

    Science.gov (United States)

    Lang, Tiange; Yin, Kangquan; Liu, Jinyu; Cao, Kunfang; Cannon, Charles H; Du, Fang K

    2014-01-01

    Predicting protein domains is essential for understanding a protein's function at the molecular level. However, up till now, there has been no direct and straightforward method for predicting protein domains in species without a reference genome sequence. In this study, we developed a functionality with a set of programs that can predict protein domains directly from genomic sequence data without a reference genome. Using whole genome sequence data, the programming functionality mainly comprised DNA assembly in combination with next-generation sequencing (NGS) assembly methods and traditional methods, peptide prediction and protein domain prediction. The proposed new functionality avoids problems associated with de novo assembly due to micro reads and small single repeats. Furthermore, we applied our functionality for the prediction of leucine rich repeat (LRR) domains in four species of Ficus with no reference genome, based on NGS genomic data. We found that the LRRNT_2 and LRR_8 domains are related to plant transpiration efficiency, as indicated by the stomata index, in the four species of Ficus. The programming functionality established in this study provides new insights for protein domain prediction, which is particularly timely in the current age of NGS data expansion.

  11. Molecular adaptation within the coat protein-encoding gene of Tunisian almond isolates of Prunus necrotic ringspot virus.

    Science.gov (United States)

    Boulila, Moncef; Ben Tiba, Sawssen; Jilani, Saoussen

    2013-04-01

    The sequence alignments of five Tunisian isolates of Prunus necrotic ringspot virus (PNRSV) were searched for evidence of recombination and diversifying selection. Since failing to account for recombination can elevate the false positive error rate in positive selection inference, a genetic algorithm (GARD) was used first and led to the detection of potential recombination events in the coat protein-encoding gene of that virus. The Recco algorithm confirmed these results and additionally identified the potential recombinants. For neutrality testing and evaluation of nucleotide polymorphism in the PNRSV CP gene, Tajima's D and Fu and Li's D and F statistical tests were used. For selection inference, eight algorithms (SLAC, FEL, IFEL, REL, FUBAR, MEME, PARRIS, and GA branch) incorporated in the HyPhy package were used to assess the selection pressure exerted on the expression of the PNRSV capsid. Inferred phylogenies pointed out, in addition to the three classical groups (PE-5, PV-32, and PV-96), the delineation of a fourth cluster having the new proposed designation SW6, and a fifth clade comprising four Tunisian PNRSV isolates which underwent recombination and selective pressure and to which the name Tunisian outgroup was allocated.
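
    The neutrality test named in this abstract can be illustrated with a minimal sketch of Tajima's D. It assumes gap-free aligned sequences of equal length and uses the standard published constants; the function name and the toy alignment in the usage note are hypothetical and are not the PNRSV data or the software used by the authors.

```python
import math
from itertools import combinations


def tajimas_d(seqs):
    """Tajima's D for aligned, gap-free sequences of equal length (illustrative)."""
    n = len(seqs)
    length = len(seqs[0])
    # This sketch requires at least 4 sequences to avoid degenerate variance terms.
    if n < 4 or any(len(s) != length for s in seqs):
        raise ValueError("need at least 4 aligned sequences of equal length")
    # S: number of segregating (polymorphic) alignment columns
    s = sum(len(set(col)) > 1 for col in zip(*seqs))
    if s == 0:
        raise ValueError("no segregating sites; D is undefined")
    # pi: mean number of pairwise differences
    pairs = list(combinations(seqs, 2))
    pi = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)
    # Constants that depend only on the sample size n
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    # D compares pi with Watterson's estimate S / a1
    return (pi - s / a1) / math.sqrt(e1 * s + e2 * s * (s - 1))
```

    For a toy alignment such as ["AAAA", "AAAT", "AATA", "ATAA"], every polymorphism is a singleton, so D comes out negative, the pattern usually read as an excess of rare variants.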

  12. On the Power and Limits of Sequence Similarity Based Clustering of Proteins Into Families

    DEFF Research Database (Denmark)

    Wiwie, Christian; Röttger, Richard

    2017-01-01

    Over the last decades, we have observed an ongoing tremendous growth of available sequencing data fueled by the advancements in wet-lab technology. The sequencing information is only the beginning of the actual understanding of how organisms survive and prosper. It is, for instance, equally...... important to also unravel the proteomic repertoire of an organism. A classical computational approach for detecting protein families is a sequence-based similarity calculation coupled with a subsequent cluster analysis. In this work we have intensively analyzed various clustering tools on a large scale. We...... used the data to investigate the behavior of the tools' parameters underlining the diversity of the protein families. Furthermore, we trained regression models for predicting the expected performance of a clustering tool for an unknown data set and aimed to also suggest optimal parameters...

  13. Cloning, sequencing, and expression of dnaK-operon proteins from the thermophilic bacterium Thermus thermophilus.

    Science.gov (United States)

    Osipiuk, J; Joachimiak, A

    1997-09-12

    We propose that the dnaK operon of Thermus thermophilus HB8 is composed of three functionally linked genes: dnaK, grpE, and dnaJ. The dnaK and dnaJ gene products are most closely related to their cyanobacterial homologs. The DnaK protein sequence places T. thermophilus in the plastid Hsp70 subfamily. In contrast, the grpE translated sequence is most similar to GrpE from Clostridium acetobutylicum, a Gram-positive anaerobic bacterium. A single promoter region, with homology to the Escherichia coli consensus promoter sequences recognized by the sigma70 and sigma32 transcription factors, precedes the postulated operon. This promoter is heat-shock inducible. The dnaK mRNA level increased more than 30 times upon 10 min of heat shock (from 70 degrees C to 85 degrees C). A strong transcription terminating sequence was found between the dnaK and grpE genes. The individual genes were cloned into pET expression vectors and the thermophilic proteins were overproduced at high levels in E. coli and purified to homogeneity. The recombinant T. thermophilus DnaK protein was shown to have a weak ATP-hydrolytic activity, with an optimum at 90 degrees C. The ATPase was stimulated by the presence of GrpE and DnaJ. Another open reading frame, coding for ClpB heat-shock protein, was found downstream of the dnaK operon.

  14. An expressed sequence tag (EST) data mining strategy succeeding in the discovery of new G-protein coupled receptors.

    Science.gov (United States)

    Wittenberger, T; Schaller, H C; Hellebrand, S

    2001-03-30

    We have developed a comprehensive expressed sequence tag database search method and used it for the identification of new members of the G-protein coupled receptor superfamily. Our approach proved to be especially useful for the detection of expressed sequence tag sequences that do not encode conserved parts of a protein, making it an ideal tool for the identification of members of divergent protein families or of protein parts without conserved domain structures in the expressed sequence tag database. At least 14 of the expressed sequence tags found with this strategy are promising candidates for new putative G-protein coupled receptors. Here, we describe the sequence and expression analysis of five new members of this receptor superfamily, namely GPR84, GPR86, GPR87, GPR90 and GPR91. We also studied the genomic structure and chromosomal localization of the respective genes applying in silico methods. A cluster of six closely related G-protein coupled receptors was found on the human chromosome 3q24-3q25. It consists of four orphan receptors (GPR86, GPR87, GPR91, and H963), the purinergic receptor P2Y1, and the uridine 5'-diphosphoglucose receptor KIAA0001. It seems likely that these receptors evolved from a common ancestor and therefore might have related ligands. In conclusion, we describe a data mining procedure that proved to be useful for the identification and first characterization of new genes and is well applicable for other gene families. Copyright 2001 Academic Press.

  15. Highly Stable Trypsin-Aggregate Coatings on Polymer Nanofibers for Repeated Protein Digestion

    Energy Technology Data Exchange (ETDEWEB)

    Kim, Byoung Chan; Lopez-Ferrer, Daniel; Lee, Sang-mok; Ahn, Hye-kyung; Nair, Sujith; Kim, Seong H.; Kim, Beom S.; Petritis, Konstantinos; Camp, David G.; Grate, Jay W.; Smith, Richard D.; Koo, Yoon-mo; Gu, Man Bock; Kim, Jungbae

    2009-04-01

    A stable and robust trypsin-based biocatalytic system was developed and demonstrated for proteomic applications. The system utilizes polymer nanofibers coated with trypsin aggregates for immobilized protease digestions. After covalently attaching an initial layer of trypsin to the polymer nanofibers, highly concentrated trypsin molecules are crosslinked to the layered trypsin by way of a glutaraldehyde treatment. This new process produced a 300-fold increase in trypsin activity compared with a conventional method for covalent trypsin immobilization and proved to be robust in that it still maintained a high level of activity after a year of repeated recycling. This highly stable form of immobilized trypsin was also resistant to autolysis, enabling repeated digestions of bovine serum albumin over 40 days and successful peptide identification by LC-MS/MS. Finally, the immobilized trypsin was resistant to proteolysis when exposed to other enzymes (i.e. chymotrypsin), which makes it suitable for use in “real-world” proteomic applications. Overall, the biocatalytic nanofibers with enzyme aggregate coatings proved to be an effective approach for repeated and automated protein digestion in proteomic analyses.

  16. Structure of clathrin-coated vesicles from small-angle scattering experiments

    DEFF Research Database (Denmark)

    Pedersen, J.S.

    1993-01-01

    Previously published small-angle neutron and X-ray scattering data from coated vesicles, reassembled coats, and stripped vesicles have been analyzed in terms of one common model. The neutron data sets include contrast variation measurements at three different D2O solvent concentrations. The model...... used for interpreting the data has spherical symmetry and explicitly takes into account polydispersity, which is described by a Gaussian distribution. A constant thickness of the clathrin coats is assumed. The fitting of the model shows that the coated vesicles consist of a low-density outer protein....... Thus, the membrane and the high-density protein shell overlap in space, which shows that the lipid membrane contains protein. The molecular mass of the average particle is 27 x 10(6) Da. The coated vesicles consist, on average, of approximately 85% protein and 15% lipids. About 40% of the protein mass...

  17. Combined DECS Analysis and Next-Generation Sequencing Enable Efficient Detection of Novel Plant RNA Viruses

    Directory of Open Access Journals (Sweden)

    Hironobu Yanagisawa

    2016-03-01

    The presence of high molecular weight double-stranded RNA (dsRNA) within plant cells is an indicator of infection with RNA viruses as these possess genomic or replicative dsRNA. DECS (dsRNA isolation, exhaustive amplification, cloning, and sequencing) analysis has been shown to be capable of detecting unknown viruses. We postulated that a combination of DECS analysis and next-generation sequencing (NGS) would improve detection efficiency and usability of the technique. Here, we describe a model case in which we efficiently detected the presumed genome sequence of Blueberry shoestring virus (BSSV), a member of the genus Sobemovirus, which has not so far been reported. dsRNAs were isolated from BSSV-infected blueberry plants using the dsRNA-binding protein, reverse-transcribed, amplified, and sequenced using NGS. A contig of 4,020 nucleotides (nt) that shared similarities with sequences from other Sobemovirus species was obtained as a candidate of the BSSV genomic sequence. Reverse transcription (RT)-PCR primer sets based on sequences from this contig enabled the detection of BSSV in all BSSV-infected plants tested but not in healthy controls. A recombinant protein encoded by the putative coat protein gene was bound by the BSSV-antibody, indicating that the candidate sequence was that of BSSV itself. Our results suggest that a combination of DECS analysis and NGS, designated here as “DECS-C,” is a powerful method for detecting novel plant viruses.

  18. Sequence and conformational preferences at termini of α-helices in membrane proteins: role of the helix environment.

    Science.gov (United States)

    Shelar, Ashish; Bansal, Manju

    2014-12-01

    α-Helices are amongst the most common secondary structural elements seen in membrane proteins and are packed in the form of helix bundles. These α-helices encounter varying external environments (hydrophobic, hydrophilic) that may influence the sequence preferences at their N and C-termini. The role of the external environment in stabilization of the helix termini in membrane proteins is still unknown. Here we analyze α-helices in a high-resolution dataset of integral α-helical membrane proteins and establish that their sequence and conformational preferences differ from those in globular proteins. We specifically examine these preferences at the N and C-termini in helices initiating/terminating inside the membrane core as well as in linkers connecting these transmembrane helices. We find that the sequence preferences and structural motifs at capping (Ncap and Ccap) and near-helical (N' and C') positions are influenced by a combination of features including the membrane environment and the innate helix initiation and termination property of residues forming structural motifs. We also find that a large number of helix termini which do not form any particular capping motif are stabilized by formation of hydrogen bonds and hydrophobic interactions contributed from the neighboring helices in the membrane protein. We further validate the sequence preferences obtained from our analysis with data from an ultradeep sequencing study that identifies evolutionarily conserved amino acids in the rat neurotensin receptor. The results from our analysis provide insights for the secondary structure prediction, modeling and design of membrane proteins. © 2014 Wiley Periodicals, Inc.

  19. Analysis of agouti signaling protein (ASIP) gene polymorphisms and association with coat color in Tibetan sheep (Ovis aries).

    Science.gov (United States)

    Han, J L; Yang, M; Yue, Y J; Guo, T T; Liu, J B; Niu, C E; Yang, B H

    2015-02-06

    Tibetan sheep, an indigenous breed, have a wide variety of phenotypes and a colorful coat, which make this breed an interesting model for evaluating the effects of coat-color gene mutations on this phenotypic trait. The agouti signaling protein (ASIP) gene is a positional candidate gene, as was inferred based on previous study. In our research, ASIP gene copy numbers in genomic DNA were detected using a novel approach, and the exon 2 g.100-104 mutation and copy number variation (CNV) of ASIP were associated with coat color in 256 sheep collected from eight populations with different coat colors by high-resolution melting curve assay. We found that the relative copy numbers of ASIP ranged from one to eight in Tibetan sheep. All of the g.100-104 genotypes in the populations were in Hardy-Weinberg equilibrium, and there was no relationship between the g.100-104 genotype and coat color (P > 0.05). The single ASIP CNV allele was found to be almost entirely associated with solid-black coat color; however, not all solid-black sheep displayed the putative single ASIP CNV genotype. From our study, we speculate that the ASIP CNV is under great selective pressure and the single ASIP CNV allows selection for black coat color in Tibetan sheep, but this does not explain all black phenotypes in Tibetan sheep.
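
    The Hardy-Weinberg check mentioned in this abstract can be sketched as a chi-square goodness-of-fit test on genotype counts. This is a minimal illustration that assumes a biallelic locus; the function name and the counts in the usage note are hypothetical and are not taken from the study.

```python
def hwe_chi_square(n_aa, n_ab, n_bb):
    """Return (allele frequency p, chi-square statistic) for a biallelic locus."""
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes supplied")
    # Frequency of allele A estimated from the observed genotype counts
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1 - p
    # Genotype counts expected under Hardy-Weinberg proportions
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    return p, chi2
```

    With one degree of freedom, a statistic below 3.841 corresponds to P > 0.05, that is, no detectable departure from equilibrium. For example, hypothetical counts of 25, 50 and 25 give p = 0.5 and a statistic of 0.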

  20. The Number, Organization, and Size of Polymorphic Membrane Protein Coding Sequences as well as the Most Conserved Pmp Protein Differ within and across Chlamydia Species.

    Science.gov (United States)

    Van Lent, Sarah; Creasy, Heather Huot; Myers, Garry S A; Vanrompay, Daisy

    2016-01-01

    Variation is a central trait of the polymorphic membrane protein (Pmp) family. The number of pmp coding sequences differs between Chlamydia species, but it is unknown whether the number of pmp coding sequences is constant within a Chlamydia species. The level of conservation of the Pmp proteins has previously only been determined for Chlamydia trachomatis. As different Pmp proteins might be indispensable for the pathogenesis of different Chlamydia species, this study investigated the conservation of Pmp proteins both within and across C. trachomatis, C. pneumoniae, C. abortus, and C. psittaci. The pmp coding sequences were annotated in 16 C. trachomatis, 6 C. pneumoniae, 2 C. abortus, and 16 C. psittaci genomes. The number and organization of polymorphic membrane coding sequences differed within and across the analyzed Chlamydia species. The length of coding sequences of pmpA, pmpB, and pmpH was conserved among all analyzed genomes, while the length of pmpE/F and pmpG, and remarkably also of the subtype pmpD, differed among the analyzed genomes. PmpD, PmpA, PmpH, and PmpA were the most conserved Pmp in C. trachomatis, C. pneumoniae, C. abortus, and C. psittaci, respectively. PmpB was the most conserved Pmp across the 4 analyzed Chlamydia species. © 2016 S. Karger AG, Basel.

  1. Eps15R is a tyrosine kinase substrate with characteristics of a docking protein possibly involved in coated pits-mediated internalization

    DEFF Research Database (Denmark)

    Coda, L; Salcini, A E; Confalonieri, S

    1998-01-01

    in NIH-3T3 cells overexpressing the receptor, even at low levels of receptor occupancy, thus behaving as physiological substrates. A role for eps15R in clathrin-mediated endocytosis is suggested by its localization in plasma membrane-coated pits and in vivo association to the coated pits' adapter protein...... AP-2. Finally, we demonstrate that a sizable fraction of eps15R exists in the cell as a complex with eps15 and that its EH domains exhibit binding specificities that are partially distinct from those of eps15. We propose that eps15 and eps15R are multifunctional binding proteins that serve...

  2. Robust Trypsin Coating on Electrospun Polymer Nanofibers in Rigorous Conditions and Its Uses for Protein Digestion

    Energy Technology Data Exchange (ETDEWEB)

    Ahn, Hye-Kyung; Kim, Byoung Chan; Jun, Seung-Hyun; Chang, Mun Seock; Lopez-Ferrer, Daniel; Smith, Richard D.; Gu, Man Bock; Lee, Sang-Won; Kim, Beom S.; Kim, Jungbae

    2010-12-15

    An efficient protein digestion in proteomic analysis requires the stabilization of proteases such as trypsin. In the present work, trypsin was stabilized in the form of enzyme coating on electrospun polymer nanofibers (EC-TR), which crosslinks additional trypsin molecules onto covalently-attached trypsin (CA-TR). EC-TR showed better stability than CA-TR in rigorous conditions, such as at high temperatures of 40 °C and 50 °C, in the presence of organic co-solvents, and at various pH values. For example, the half-lives of CA-TR and EC-TR were 0.24 and 163.20 hours at 40 °C, respectively. The improved stability of EC-TR can be explained by covalent linkages on the surface of trypsin molecules, which effectively inhibit the denaturation, autolysis, and leaching of trypsin. The protein digestion was performed at 40 °C by using both CA-TR and EC-TR in digesting a model protein, enolase. EC-TR showed better performance and stability than CA-TR by maintaining good performance of enolase digestion under recycled uses for a period of one week. In the same condition, CA-TR showed poor performance from the beginning, and could not be used for digestion at all after a few uses. The enzyme coating approach is anticipated to be successfully employed not only for protein digestion in proteomic analysis, but also for various other fields where the poor enzyme stability presently hampers the practical applications of enzymes.
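
    To give a feel for what the two reported half-lives imply, the sketch below converts a half-life into residual activity over time. It assumes simple first-order inactivation, which the abstract does not state, and the function name is hypothetical.

```python
import math


def residual_activity(t_hours, half_life_hours):
    """Fraction of initial activity left after t_hours, assuming first-order decay."""
    # Rate constant from the half-life: k = ln(2) / t_half
    k = math.log(2) / half_life_hours
    return math.exp(-k * t_hours)
```

    Under that assumption, the reported half-lives of 0.24 h (CA-TR) and 163.20 h (EC-TR) differ by a factor of 680. After one day at 40 °C, residual_activity(24, 163.20) is still above 0.9, while residual_activity(24, 0.24) is effectively zero.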

  3. Studies on soft centered coated snacks.

    Science.gov (United States)

    Pavithra, A S; Chetana, Ramakrishna; Babylatha, R; Archana, S N; Bhat, K K

    2013-04-01

    Roasted groundnut seeds, amaranth and date pulp formed the center filling, which was coated with sugar, breadings, desiccated coconut and roasted Bengal gram flour (BGF) to give four coated snacks. Physicochemical characteristics, microbiological profile, sorption behaviour and sensory quality of the four coated snacks were determined. The center filling to coating ratio of the products was in the range of 3:2-7:1; the product with the BGF coating had the thinnest coating. The center filling had a soft texture and a moisture content of 10.2-16.2%; the coatings had lower moisture content (4.4-8.6%), except for the Bengal gram coating, which had 11.1% moisture. The sugar-coated snack had the lowest fat (11.6%) and protein (7.2%) contents. The desiccated coconut-coated snack had the highest fat content (25.4%) and the Bengal gram flour-coated snack had the highest protein content (15.4%). Sorption studies showed that the coated snacks had critical moisture contents of 11.2-13.5%. The products were moisture sensitive and hence require packaging in films with a higher moisture barrier property. In freshly prepared snacks, coliforms, yeast and mold were absent. The mesophilic aerobe count did not change significantly during 90 days of storage at 27 °C and 37 °C. Sensory analysis showed that the products had a unique texture due to the combined effect of a fairly hard coating and a soft center. Flavour and overall quality of all the products were rated as very good.

  4. Wheat streak mosaic virus coat protein is a determinant for vector transmission by the wheat curl mite

    Science.gov (United States)

    Wheat streak mosaic virus (WSMV; genus Tritimovirus; family Potyviridae), is transmitted by the wheat curl mite (Aceria tosichella Keifer). The requirement of coat protein (CP) for WSMV transmission by the wheat curl mite was examined using a series of viable deletion and point mutations. Mite trans...

  5. Evaluation of an edible coating based on whey protein and beeswax on the physical and chemical quality of gooseberry (Physalis peruviana L.)

    Directory of Open Access Journals (Sweden)

    Oswaldo Osorio Mora

    2016-10-01

    An edible coating based on whey protein concentrate and beeswax was developed and optimized. A 3² experimental design was used and evaluated by response surface methodology. The optimal formulation, with 15% beeswax and 10% whey protein, reduced fruit weight loss by 35.49% relative to the control treatments. The optimal treatment was characterized and its effect on the physicochemical properties of gooseberry (Physalis peruviana L.) was evaluated under two storage conditions: ambient (17 ± 2 °C, RH 69%) and cooling (4 ± 2 °C, RH 66%). On day 15, the percentage of weight loss was lower under both ambient and cooling storage (by 36.20% and 41.50%, respectively). In the control treatments, acidity decreased by 3.88% (ambient) and 4.92% (cooling) relative to the coated treatments. pH did not change significantly with any treatment. The coating prevented a reduction in soluble solids of 3.76% under ambient conditions and 2.27% under cooling conditions. The maturity index did not differ significantly between control and coated treatments under either condition. Firmness showed no significant changes except in the uncoated ambient treatment, which lost 12.04% firmness compared with the coated ambient treatment. The respiration rate indicated a climacteric peak on day 8 under ambient storage and on day 10 under cooling. For some properties, the uncoated cooling treatment and the coated ambient treatment did not differ significantly, so applying the coating may be an alternative to cooling.

  6. ProtDCal: A program to compute general-purpose-numerical descriptors for sequences and 3D-structures of proteins.

    Science.gov (United States)

    Ruiz-Blanco, Yasser B; Paz, Waldo; Green, James; Marrero-Ponce, Yovani

    2015-05-16

    The exponential growth of protein structural and sequence databases is enabling multifaceted approaches to understanding the long-sought sequence-structure-function relationship. Advances in computation now make it possible to apply well-established data mining and pattern recognition techniques to these data to learn models that effectively relate structure and function. However, extracting meaningful numerical descriptors of protein sequence and structure is a key issue that requires an efficient and widely available solution. We here introduce ProtDCal, a new computational software suite capable of generating tens of thousands of features considering both sequence-based and 3D-structural descriptors. We demonstrate, by means of principal component analysis and Shannon entropy tests, how ProtDCal's sequence-based descriptors provide new and more relevant information not encoded by currently available servers for sequence-based protein feature generation. The wide diversity of the 3D-structure-based features generated by ProtDCal is shown to provide additional complementary information and effectively completes its general protein encoding capability. As demonstration of the utility of ProtDCal's features, prediction models of N-linked glycosylation sites are trained and evaluated. Classification performance compares favourably with that of contemporary predictors of N-linked glycosylation sites, in spite of not using domain-specific features as input information. ProtDCal provides a friendly and cross-platform graphical user interface, developed in the Java programming language and is freely available at: http://bioinf.sce.carleton.ca/ProtDCal/ . ProtDCal introduces local and group-based encoding which enhances the diversity of the information captured by the computed features. Furthermore, we have shown that adding structure-based descriptors contributes non-redundant additional information to the features-based characterization of polypeptide systems. This

  7. SEC16 in COPII coat dynamics at ER exit sites

    NARCIS (Netherlands)

    Sprangers, Joep; Rabouille, Catherine

    Protein export from the endoplasmic reticulum (ER), the first step in protein transport through the secretory pathway, is mediated by coatomer protein II (COPII)-coated vesicles at ER exit sites. COPII coat assembly on the ER is well understood and the conserved large hydrophilic protein Sec16

  8. Role of Pea Enation Mosaic Virus Coat Protein in the Host Plant and Aphid Vector

    Directory of Open Access Journals (Sweden)

    Juliette Doumayrou

    2016-11-01

    Understanding the molecular mechanisms involved in plant virus–vector interactions is essential for the development of effective control measures for aphid-vectored epidemic plant diseases. The coat proteins (CP) are the main component of the viral capsids, and they are implicated in practically every stage of the viral infection cycle. Pea enation mosaic virus 1 (PEMV1, Enamovirus, Luteoviridae) and Pea enation mosaic virus 2 (PEMV2, Umbravirus, Tombusviridae) are two RNA viruses in an obligate symbiosis causing the pea enation mosaic disease. Sixteen mutant viruses were generated with mutations in different domains of the CP to evaluate the role of specific amino acids in viral replication, virion assembly, long-distance movement in Pisum sativum, and aphid transmission. Twelve mutant viruses were unable to assemble but were able to replicate in inoculated leaves, move long-distance, and express the CP in newly infected leaves. Four mutant viruses produced virions, but three were not transmissible by the pea aphid, Acyrthosiphon pisum. Three-dimensional modeling of the PEMV CP, combined with biological assays for virion assembly and aphid transmission, allowed for a model of the assembly of PEMV coat protein subunits.

  9. Role of Pea Enation Mosaic Virus Coat Protein in the Host Plant and Aphid Vector.

    Science.gov (United States)

    Doumayrou, Juliette; Sheber, Melissa; Bonning, Bryony C; Miller, W Allen

    2016-11-18

    Understanding the molecular mechanisms involved in plant virus-vector interactions is essential for the development of effective control measures for aphid-vectored epidemic plant diseases. The coat proteins (CP) are the main component of the viral capsids, and they are implicated in practically every stage of the viral infection cycle. Pea enation mosaic virus 1 (PEMV1, Enamovirus, Luteoviridae) and Pea enation mosaic virus 2 (PEMV2, Umbravirus, Tombusviridae) are two RNA viruses in an obligate symbiosis causing the pea enation mosaic disease. Sixteen mutant viruses were generated with mutations in different domains of the CP to evaluate the role of specific amino acids in viral replication, virion assembly, long-distance movement in Pisum sativum, and aphid transmission. Twelve mutant viruses were unable to assemble but were able to replicate in inoculated leaves, move long-distance, and express the CP in newly infected leaves. Four mutant viruses produced virions, but three were not transmissible by the pea aphid, Acyrthosiphon pisum. Three-dimensional modeling of the PEMV CP, combined with biological assays for virion assembly and aphid transmission, allowed for a model of the assembly of PEMV coat protein subunits.

  10. Modeling compositional dynamics based on GC and purine contents of protein-coding sequences

    KAUST Repository

    Zhang, Zhang

    2010-11-08

    Background: Understanding the compositional dynamics of genomes and their coding sequences is of great significance in gaining clues into molecular evolution and a large number of publicly available genome sequences have allowed us to quantitatively predict deviations of empirical data from their theoretical counterparts. However, the quantification of theoretical compositional variations for a wide diversity of genomes remains a major challenge. Results: To model the compositional dynamics of protein-coding sequences, we propose two simple models that take into account both mutation and selection effects, which act differently at the three codon positions, and use both GC and purine contents as compositional parameters. The two models concern the theoretical composition of nucleotides, codons, and amino acids, with no prerequisite of homologous sequences or their alignments. We evaluated the two models by quantifying theoretical compositions of a large collection of protein-coding sequences (including 46 of Archaea, 686 of Bacteria, and 826 of Eukarya), yielding consistent theoretical compositions across all the collected sequences. Conclusions: We show that the compositions of nucleotides, codons, and amino acids are largely determined by both GC and purine contents and suggest that deviations of the observed from the expected compositions may reflect compositional signatures that arise from a complex interplay between mutation and selection via DNA replication and repair mechanisms. Reviewers: This article was reviewed by Zhaolei Zhang (nominated by Mark Gerstein), Guruprasad Ananda (nominated by Kateryna Makova), and Daniel Haft. 2010 Zhang and Yu; licensee BioMed Central Ltd.

  11. Modeling compositional dynamics based on GC and purine contents of protein-coding sequences

    KAUST Repository

    Zhang, Zhang; Yu, Jun

    2010-01-01

    Background: Understanding the compositional dynamics of genomes and their coding sequences is of great significance in gaining clues into molecular evolution and a large number of publicly available genome sequences have allowed us to quantitatively predict deviations of empirical data from their theoretical counterparts. However, the quantification of theoretical compositional variations for a wide diversity of genomes remains a major challenge. Results: To model the compositional dynamics of protein-coding sequences, we propose two simple models that take into account both mutation and selection effects, which act differently at the three codon positions, and use both GC and purine contents as compositional parameters. The two models concern the theoretical composition of nucleotides, codons, and amino acids, with no prerequisite of homologous sequences or their alignments. We evaluated the two models by quantifying theoretical compositions of a large collection of protein-coding sequences (including 46 of Archaea, 686 of Bacteria, and 826 of Eukarya), yielding consistent theoretical compositions across all the collected sequences. Conclusions: We show that the compositions of nucleotides, codons, and amino acids are largely determined by both GC and purine contents and suggest that deviations of the observed from the expected compositions may reflect compositional signatures that arise from a complex interplay between mutation and selection via DNA replication and repair mechanisms. Reviewers: This article was reviewed by Zhaolei Zhang (nominated by Mark Gerstein), Guruprasad Ananda (nominated by Kateryna Makova), and Daniel Haft. 2010 Zhang and Yu; licensee BioMed Central Ltd.
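
    The two compositional parameters named in this abstract, GC content and purine content at each of the three codon positions, can be measured from a coding sequence as sketched below. This only computes the observed values; it does not implement the authors' two models, and the function name is hypothetical.

```python
def position_composition(cds):
    """Return [(gc, purine), ...] for codon positions 1, 2 and 3 of a coding sequence."""
    cds = cds.upper()
    # Ignore any trailing bases that do not form a complete codon
    usable = len(cds) - len(cds) % 3
    if usable == 0:
        raise ValueError("sequence is shorter than one codon")
    result = []
    for pos in range(3):
        bases = cds[pos:usable:3]  # every third base, starting at this codon position
        gc = sum(b in "GC" for b in bases) / len(bases)      # G + C fraction
        purine = sum(b in "AG" for b in bases) / len(bases)  # A + G fraction
        result.append((gc, purine))
    return result
```

    For the two-codon toy sequence "ATGGCC", the first codon position holds A and G, giving a GC content of 0.5 and a purine content of 1.0.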

  12. An alignment-free method to find similarity among protein sequences via the general form of Chou's pseudo amino acid composition.

    Science.gov (United States)

    Gupta, M K; Niyogi, R; Misra, M

    2013-01-01

    In this paper, we propose a method to create the 60-dimensional feature vector for protein sequences via the general form of pseudo amino acid composition. The construction of the feature vector is based on the contents of amino acids, total distance of each amino acid from the first amino acid in the protein sequence and the distribution of 20 amino acids. The obtained cosine distance metric (also called the similarity matrix) is used to construct the phylogenetic tree by the neighbour joining method. In order to show the applicability of our approach, we tested it on three proteins: 1) ND5 protein sequences from nine species, 2) ND6 protein sequences from eight species, and 3) 50 coronavirus spike proteins. The results are in agreement with known history and the output from the multiple sequence alignment program ClustalW, which is widely used. We have also compared our phylogenetic results with six other recently proposed alignment-free methods. These comparisons show that our proposed method gives a more consistent biological relationship than the others. In addition, the time complexity is linear and space required is less as compared with other alignment-free methods that use graphical representation. It should be noted that the multiple sequence alignment method has exponential time complexity.
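
    The kind of 60-dimensional vector described in this abstract can be sketched as three numbers per amino acid, compared by cosine distance. The exact definitions below (count, sum of 0-based positions measured from the first residue, and variance of positions) are assumptions made for illustration and may differ from the formulas in the paper; all names are hypothetical.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def feature_vector(seq):
    """60 values: for each of 20 amino acids, (count, total distance, spread)."""
    seq = seq.upper()
    vec = []
    for aa in AMINO_ACIDS:
        # 0-based positions, i.e. distances from the first residue
        positions = [i for i, r in enumerate(seq) if r == aa]
        n = len(positions)
        total = sum(positions)
        mean = total / n if n else 0.0
        # Assumed "distribution" term: variance of the positions
        spread = sum((p - mean) ** 2 for p in positions) / n if n else 0.0
        vec.extend([n, total, spread])
    return vec


def cosine_distance(u, v):
    """1 minus the cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_u * norm_v)
```

    Building the vector takes one pass over the sequence per amino acid, so the cost grows linearly with sequence length, in line with the complexity claim in the abstract. A matrix of pairwise cosine distances could then be passed to any neighbour-joining implementation.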

  13. An intuitive graphical webserver for multiple-choice protein sequence search.

    Science.gov (United States)

    Banky, Daniel; Szalkai, Balazs; Grolmusz, Vince

    2014-04-10

    Every day tens of thousands of sequence searches and sequence alignment queries are submitted to webservers. The capitalized word "BLAST" has become a verb, describing the act of performing sequence search and alignment. However, if one needs to search for sequences that contain, for example, two hydrophobic and three polar residues at five given positions, the query formation on the most frequently used webservers will be difficult. Some servers support the formation of queries with regular expressions, but most users are unfamiliar with their syntax. Here we present an intuitive, easily applicable webserver, the Protein Sequence Analysis server, that allows the formation of multiple choice queries by simply drawing the residues to their positions; if more than one residue is drawn to the same position, then they will be nicely stacked on the user interface, indicating the multiple choice at the given position. This computer-game-like interface is natural and intuitive, and the coloring of the residues makes it possible to form queries requiring not just certain amino acids in the given positions, but also small nonpolar, negatively charged, hydrophobic, positively charged, or polar ones. The webserver is available at http://psa.pitgroup.org. Copyright © 2014 Elsevier B.V. All rights reserved.
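
    For readers who do want the regular-expression route mentioned above, a multiple-choice query can be sketched as follows. This is an illustration only and has no connection to the webserver's own code; the helper names and the residue groupings are hypothetical choices made for the example.

```python
import re

# Illustrative residue groupings (assumed, not taken from the webserver).
RESIDUE_CLASSES = {
    "negative": "DE",
    "positive": "KRH",
    "polar": "STNQ",
    "hydrophobic": "AVILMFWC",
}


def build_pattern(length, choices):
    """Build a regex for a window of `length` residues.

    `choices` maps a 0-based position to a string of allowed residues;
    positions that are not listed accept any residue.
    """
    parts = []
    for pos in range(length):
        allowed = choices.get(pos)
        parts.append("[" + allowed + "]" if allowed else ".")
    return re.compile("".join(parts))


def find_matches(pattern, sequence):
    """Return 0-based start offsets of all matches, overlapping ones included."""
    lookahead = "(?=" + pattern.pattern + ")"
    return [m.start() for m in re.finditer(lookahead, sequence)]


# Query: negative residue, any residue, then lysine or arginine.
query = build_pattern(3, {0: RESIDUE_CLASSES["negative"], 2: "KR"})
hits = find_matches(query, "ADAKEEGR")
```

    The lookahead wrapper is used so that overlapping windows are all reported rather than only the non-overlapping ones.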

  14. MIToS.jl: mutual information tools for protein sequence analysis in the Julia language

    DEFF Research Database (Denmark)

    Zea, Diego J.; Anfossi, Diego; Nielsen, Morten

    2017-01-01

    Motivation: MIToS is an environment for mutual information analysis and a framework for protein multiple sequence alignments (MSAs) and protein structures (PDB) management in the Julia language. It integrates sequence and structural information through SIFTS, making Pfam MSAs analysis straightforward....... MIToS streamlines the implementation of any measure calculated from residue contingency tables and its optimization and testing in terms of protein contact prediction. As an example, we implemented and tested a BLOSUM62-based pseudo-count strategy in mutual information analysis. Availability...... and Implementation: The software is totally implemented in Julia and supported for Linux, OS X and Windows. It is freely available on GitHub under MIT license: http://mitos.leloir.org.ar. Contacts: diegozea@gmail.com or cmb@leloir.org.ar Supplementary information: Supplementary data are available at Bioinformatics...

  15. 40 CFR 174.516 - Coat protein of cucumber mosaic virus; exemption from the requirement of a tolerance.

    Science.gov (United States)

    2010-07-01

    ... 40 Protection of Environment 23 2010-07-01 2010-07-01 false Coat protein of cucumber mosaic virus; exemption from the requirement of a tolerance. 174.516 Section 174.516 Protection of Environment ENVIRONMENTAL PROTECTION AGENCY (CONTINUED) PESTICIDE PROGRAMS PROCEDURES AND REQUIREMENTS FOR PLANT-INCORPORATED PROTECTANTS Tolerances and Tolerance...

  16. 40 CFR 174.515 - Coat Protein of Papaya Ringspot Virus; exemption from the requirement of a tolerance.

    Science.gov (United States)

    2010-07-01

    ... 40 Protection of Environment 23 2010-07-01 2010-07-01 false Coat Protein of Papaya Ringspot Virus; exemption from the requirement of a tolerance. 174.515 Section 174.515 Protection of Environment ENVIRONMENTAL PROTECTION AGENCY (CONTINUED) PESTICIDE PROGRAMS PROCEDURES AND REQUIREMENTS FOR PLANT-INCORPORATED PROTECTANTS Tolerances and Tolerance...

  17. Nucleotide sequence of Phaseolus vulgaris L. alcohol dehydrogenase encoding cDNA and three-dimensional structure prediction of the deduced protein.

    Science.gov (United States)

    Amelia, Kassim; Khor, Chin Yin; Shah, Farida Habib; Bhore, Subhash J

    2015-01-01

    Common beans (Phaseolus vulgaris L.) are widely consumed as a source of proteins and natural products. However, their yield needs to be increased. In line with the agenda of Phaseomics (an international consortium), work on generating expressed sequence tags (ESTs) from bean pods was initiated. Altogether, 5972 ESTs have been isolated. Alcohol dehydrogenase (AD) encoding gene cDNA was a noticeable transcript among the generated ESTs. AD is an important enzyme; therefore, this study was undertaken to understand more about it. The objective of this study was to elucidate the P. vulgaris L. AD (PvAD) gene cDNA sequence and to predict the three-dimensional (3D) structure of the deduced protein. Positive and negative strands of the PvAD cDNA clone were sequenced using M13 forward and M13 reverse primers to elucidate the nucleotide sequence. The deduced PvAD cDNA and protein sequences were analyzed for their basic features using online bioinformatics tools. Sequence comparison was carried out using the bl2seq program, and the tree-view program was used to construct a phylogenetic tree. The secondary structures and 3D structure of the PvAD protein were predicted using the PHYRE automatic fold recognition server. Analysis of the sequencing results showed that the PvAD cDNA is 1294 bp in length. Its open reading frame encodes a protein that contains 371 amino acids. Deduced protein sequence analysis showed the presence of putative substrate binding, catalytic Zn binding, and NAD binding sites. Results indicate that the predicted 3D structure of the PvAD protein is analogous to the experimentally determined crystal structure of s-nitrosoglutathione reductase from an Arabidopsis species. The 1294 bp long PvAD cDNA encodes a 371 amino acid long protein that contains conserved domains required for the biological functions of AD. The predicted 3D structure of the deduced PvAD protein reflects the analogy with the crystal structure of Arabidopsis thaliana s-nitrosoglutathione reductase. Further study is required

  18. Middle Pleistocene protein sequences from the rhinoceros genus Stephanorhinus and the phylogeny of extant and extinct Middle/Late Pleistocene Rhinocerotidae.

    Science.gov (United States)

    Welker, Frido; Smith, Geoff M; Hutson, Jarod M; Kindler, Lutz; Garcia-Moreno, Alejandro; Villaluenga, Aritza; Turner, Elaine; Gaudzinski-Windheuser, Sabine

    2017-01-01

    Ancient protein sequences are increasingly used to elucidate the phylogenetic relationships between extinct and extant mammalian taxa. Here, we apply these recent developments to Middle Pleistocene bone specimens of the rhinoceros genus Stephanorhinus. No biomolecular sequence data is currently available for this genus, leaving phylogenetic hypotheses on its evolutionary relationships to extant and extinct rhinoceroses untested. Furthermore, recent phylogenies based on Rhinocerotidae (partial or complete) mitochondrial DNA sequences differ in the placement of the Sumatran rhinoceros (Dicerorhinus sumatrensis). Therefore, studies utilising ancient protein sequences from Middle Pleistocene contexts have the potential to provide further insights into the phylogenetic relationships between extant and extinct species, including Stephanorhinus and Dicerorhinus. ZooMS screening (zooarchaeology by mass spectrometry) was performed on several Late and Middle Pleistocene specimens from the genus Stephanorhinus, subsequently followed by liquid chromatography-tandem mass spectrometry (LC-MS/MS) to obtain ancient protein sequences from a Middle Pleistocene Stephanorhinus specimen. We performed parallel analysis on a Late Pleistocene woolly rhinoceros specimen and extant species of rhinoceroses, resulting in the availability of protein sequence data for five extant species and two extinct genera. Phylogenetic analysis additionally included all extant Perissodactyla genera (Equus, Tapirus), and was conducted using Bayesian (MrBayes) and maximum-likelihood (RAxML) methods. Various ancient proteins were identified in both the Middle and Late Pleistocene rhinoceros samples. Protein degradation and proteome complexity are consistent with an endogenous origin of the identified proteins. Phylogenetic analysis of informative proteins resolved the Perissodactyla phylogeny in agreement with previous studies in regards to the placement of the families Equidae, Tapiridae, and

  19. Oral treponeme major surface protein: Sequence diversity and distributions within periodontal niches.

    Science.gov (United States)

    You, M; Chan, Y; Lacap-Bugler, D C; Huo, Y-B; Gao, W; Leung, W K; Watt, R M

    2017-12-01

    Treponema denticola and other species (phylotypes) of oral spirochetes are widely considered to play important etiological roles in periodontitis and other oral infections. The major surface protein (Msp) of T. denticola is directly implicated in several pathological mechanisms. Here, we have analyzed msp sequence diversity across 68 strains of oral phylogroup 1 and 2 treponemes; including reference strains of T. denticola, Treponema putidum, Treponema medium, 'Treponema vincentii', and 'Treponema sinensis'. All encoded Msp proteins contained highly conserved, taxon-specific signal peptides, and shared a predicted 'three-domain' structure. A clone-based strategy employing 'msp-specific' polymerase chain reaction primers was used to analyze msp gene sequence diversity present in subgingival plaque samples collected from a group of individuals with chronic periodontitis (n=10), vs periodontitis-free controls (n=10). We obtained 626 clinical msp gene sequences, which were assigned to 21 distinct 'clinical msp genotypes' (95% sequence identity cut-off). The most frequently detected clinical msp genotype corresponded to T. denticola ATCC 35405 T , but this was not correlated to disease status. UniFrac and libshuff analysis revealed that individuals with periodontitis and periodontitis-free controls harbored significantly different communities of treponeme clinical msp genotypes (Pdiversity than periodontitis-free controls (Mann-Whitney U-test, Pdiversity of Treponema clinical msp genotypes within their subgingival niches. © 2017 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

  20. Tailoring odorant-binding protein coatings characteristics for surface acoustic wave biosensor development

    Energy Technology Data Exchange (ETDEWEB)

    Di Pietrantonio, F., E-mail: fabio.dp@idasc.cnr.it [Institute of Acoustics and Sensors “O. M. Corbino”, National Research Council of Italy, Via del Fosso del Cavaliere 100, 00133 Rome (Italy); Benetti, M. [Institute of Acoustics and Sensors “O. M. Corbino”, National Research Council of Italy, Via del Fosso del Cavaliere 100, 00133 Rome (Italy); Dinca, V. [National Institute for Lasers, Plasma and Radiation Physics, 409 Atomistilor Street, PO Box MG-16, 077125 Magurele (Romania); Cannatà, D. [Institute of Acoustics and Sensors “O. M. Corbino”, National Research Council of Italy, Via del Fosso del Cavaliere 100, 00133 Rome (Italy); Verona, E. [Institute for Photonics and Nanotechnologies, National Research Council of Italy, Via del Cineto Romano 42, 00156 Rome (Italy); D’Auria, S. [Institute of Protein Biochemistry, National Research Council of Italy, Via Pietro Castellino 111, 80131 Naples (Italy); Dinescu, M. [National Institute for Lasers, Plasma and Radiation Physics, 409 Atomistilor Street, PO Box MG-16, 077125 Magurele (Romania)

    2014-05-01

    In this study, wild type bovine odorant-binding proteins (wtbOBPs) were deposited by matrix-assisted pulsed laser evaporation (MAPLE) and utilized as active material on surface acoustic wave (SAW) biosensors. Fourier transform infrared spectroscopy (FTIR) and atomic force microscopy (AFM) were used to determine the chemical and morphological characteristics of the protein thin films. The FTIR data demonstrate that the functional groups of wtbOBPs do not suffer significant changes in the MAPLE-deposited films when compared to the reference one. The topographical studies show that the homogeneity, density and roughness of the coatings are related mainly to the laser parameters (fluence and number of pulses). SAW biosensor responses to different concentrations of R-(–)-1-octen-3-ol (octenol) and R-(–)-carvone (carvone) were evaluated. The obtained sensitivities, achieved through the optimization of deposition parameters, demonstrated that MAPLE is a promising deposition technique for SAW biosensor implementation.

  1. A Molecular Staple: D-Loops in the I Domain of Bacteriophage P22 Coat Protein Make Important Intercapsomer Contacts Required for Procapsid Assembly

    Science.gov (United States)

    D'Lima, Nadia G.

    2015-01-01

    ABSTRACT Bacteriophage P22, a double-stranded DNA (dsDNA) virus, has a nonconserved 124-amino-acid accessory domain inserted into its coat protein, which has the canonical HK97 protein fold. This I domain is involved in virus capsid size determination and stability, as well as protein folding. The nuclear magnetic resonance (NMR) solution structure of the I domain revealed the presence of a D-loop, which was hypothesized to make important intersubunit contacts between coat proteins in adjacent capsomers. Here we show that amino acid substitutions of residues near the tip of the D-loop result in aberrant assembly products, including tubes and broken particles, highlighting the significance of the D-loops in proper procapsid assembly. Using disulfide cross-linking, we showed that the tips of the D-loops are positioned directly across from each other both in the procapsid and the mature virion, suggesting their importance in both states. Our results indicate that D-loop interactions act as “molecular staples” at the icosahedral 2-fold symmetry axis and significantly contribute to stabilizing the P22 capsid for DNA packaging. IMPORTANCE Many dsDNA viruses have morphogenic pathways utilizing an intermediate capsid, known as a procapsid. These procapsids are assembled from a coat protein having the HK97 fold in a reaction driven by scaffolding proteins or delta domains. Maturation of the capsid occurs during DNA packaging. Bacteriophage HK97 uniquely stabilizes its capsid during maturation by intercapsomer cross-linking, but most virus capsids are stabilized by alternate means. Here we show that the I domain that is inserted into the coat protein of bacteriophage P22 is important in the process of proper procapsid assembly. Specifically, the I domain allows for stabilizing interactions across the capsid 2-fold axis of symmetry via a D-loop. When amino acid residues at the tip of the D-loop are mutated, aberrant assembly products, including tubes, are formed instead

  2. Amino acid sequences of predicted proteins and their annotation for 95 organism species. - Gclust Server | LSDB Archive [Life Science Database Archive metadata]

    Lifescience Database Archive (English)

    Gclust Server. Data name: Amino acid sequences of predicted proteins and their annotation for 95 organism species. DOI: 10.18908/lsdba.nbdc00464-001. Description of data contents: Amino acid sequences of predicted proteins...

  3. Properties of Whey-Protein-Coated Films and Laminates as Novel Recyclable Food Packaging Materials with Excellent Barrier Properties

    Directory of Open Access Journals (Sweden)

    Markus Schmid

    2012-01-01

    For food packaging applications, high oxygen and water vapour barriers are prerequisites for preserving the quality of the products throughout their whole lifecycle. Currently available polymers and/or biopolymer films are mostly used in combination with barrier materials derived from oil-based plastics or aluminium to enhance their low barrier properties. In order to replace these non-renewable materials, current research efforts are focused on the development of sustainable coatings, while maintaining the functional properties of the resulting packaging materials. This article provides an introduction to food packaging requirements, highlights prior art on the use of whey-based coatings for their barrier properties, and describes the key properties of an innovative packaging multilayer material that includes a whey-based layer. The developed whey protein formulations had excellent barrier properties almost comparable to the ethylene vinyl alcohol copolymers (EVOH) barrier layer conventionally used in food packaging composites, with an oxygen barrier (OTR) of <2 [cm³(STP)/(m² d bar)] when normalized to a thickness of 100 μm. Further requirements of the barrier layer are good adhesion to the substrate and sufficient flexibility to withstand mechanical load while preventing delamination and/or brittle fracture. Whey-protein-based coatings have successfully met these functional and mechanical requirements.

  4. Evolutionary rates at codon sites may be used to align sequences and infer protein domain function

    Directory of Open Access Journals (Sweden)

    Hazelhurst Scott

    2010-03-01

    Background: Sequence alignments form part of many investigations in molecular biology, including the determination of phylogenetic relationships, the prediction of protein structure and function, and the measurement of evolutionary rates. However, to obtain meaningful results, a significant degree of sequence similarity is required to ensure that the alignments are accurate and the inferences correct. Limitations arise when sequence similarity is low, which is particularly problematic when working with fast-evolving genes, evolutionarily distant taxa, genomes with nucleotide biases, and cases of convergent evolution. Results: A novel approach was conceptualized to address the "low sequence similarity" alignment problem. We developed an alignment algorithm termed FIRE (Functional Inference using the Rates of Evolution), which aligns sequences using the evolutionary rate at codon sites, as measured by the dN/dS ratio, rather than nucleotide or amino acid residues. FIRE was used to test the hypotheses that evolutionary rates can be used to align sequences and that the alignments may be used to infer protein domain function. Using a range of test data, we found that aligning domains based on evolutionary rates was possible even when sequence similarity was very low (for example, antibody variable regions). Furthermore, the alignment has the potential to infer protein domain function, indicating that domains with similar functions are subject to similar evolutionary constraints. These data suggest that an evolutionary rate-based approach to sequence analysis (particularly when combined with structural data) may be used to study cases of convergent evolution or when sequences have very low similarity. However, when aligning homologous gene sets with sequence similarity, FIRE did not perform as well as the best traditional alignment algorithms, indicating that the conventional approach of aligning residues as opposed to evolutionary rates remains the

  5. Comparative In silico Study of Sex-Determining Region Y (SRY) Protein Sequences Involved in Sex-Determining.

    Science.gov (United States)

    Vakili Azghandi, Masoume; Nasiri, Mohammadreza; Shamsa, Ali; Jalali, Mohsen; Shariati, Mohammad Mahdi

    2016-04-01

    The SRY gene (SRY) provides instructions for making a transcription factor called the sex-determining region Y protein. The sex-determining region Y protein causes a fetus to develop as a male. In this study, SRY sequences of 15 species (human, chimpanzee, dog, pig, rat, cattle, buffalo, goat, sheep, horse, zebra, frog, urial, dolphin and killer whale) were used to determine bioinformatic differences. Nucleotide sequences of SRY were retrieved from the NCBI databank. Bioinformatic analysis of SRY was done with CLC Main Workbench version 5.5, ClustalW (http:/www.ebi.ac.uk/clustalw/) and MEGA6 software. The multiple sequence alignment results indicated that SRY protein sequences from Orcinus orca (killer whale) and Tursiops aduncus (dolphin) have the lowest genetic distance (0.33) among these 15 species and are 99.67% identical at the amino acid level. Homo sapiens and Pan troglodytes (chimpanzee) have the next lowest genetic distance of 1.35 and are 98.65% identical at the amino acid level. These findings indicate that the SRY proteins are conserved in the 15 species, and their evolutionary relationships are similar.

  6. Dominant Red Coat Color in Holstein Cattle Is Associated with a Missense Mutation in the Coatomer Protein Complex, Subunit Alpha (COPA) Gene.

    Directory of Open Access Journals (Sweden)

    Ben Dorshorst

    Coat color in Holstein dairy cattle is primarily controlled by the melanocortin 1 receptor (MC1R) gene, a central determinant of black (eumelanin) vs. red/brown (pheomelanin) synthesis across animal species. The major MC1R alleles in Holsteins are Dominant Black (MC1RD) and Recessive Red (MC1Re). A novel form of dominant red coat color was first observed in an animal born in 1980. The mutation underlying this phenotype was named Dominant Red and is epistatic to the constitutively activated MC1RD. Here we show that a missense mutation in the coatomer protein complex, subunit alpha (COPA), a gene with previously no known role in pigmentation synthesis, is completely associated with Dominant Red in Holstein dairy cattle. The mutation results in an arginine to cysteine substitution at an amino acid residue completely conserved across eukaryotes. Despite this high level of conservation we show that both heterozygotes and homozygotes are healthy and viable. Analysis of hair pigment composition shows that the Dominant Red phenotype is similar to the MC1R Recessive Red phenotype, although less effective at reducing eumelanin synthesis. RNA-seq data similarly show that Dominant Red animals achieve predominantly pheomelanin synthesis by downregulating genes normally required for eumelanin synthesis. COPA is a component of the coat protein I seven subunit complex that is involved with retrograde and cis-Golgi intracellular coated vesicle transport of both protein and RNA cargo. This suggests that Dominant Red may be caused by aberrant MC1R protein or mRNA trafficking within the highly compartmentalized melanocyte, mimicking the effect of the Recessive Red loss of function MC1R allele.

  7. Sequence of a cloned cDNA encoding human ribosomal protein S11

    Energy Technology Data Exchange (ETDEWEB)

    Lott, J B; Mackie, G A

    1988-02-11

    The authors have isolated a cloned cDNA that encodes human ribosomal protein (rp) S11 by screening a human fibroblast cDNA library with a labelled 204 bp DNA fragment encompassing residues 212-416 of pRS11, a rat rp S11 cDNA clone. The human rp S11 cloned cDNA consists of 15 residues of the 5' leader, the entire coding sequence and all 51 residues of the 3' untranslated region. The predicted amino acid sequence of 158 residues is identical to rat rp S11. The nucleotide sequence in the coding region differs, however, from that in rat in the first position in two codons and in the third position in 44 codons.

  8. An Alignment-Free Algorithm in Comparing the Similarity of Protein Sequences Based on Pseudo-Markov Transition Probabilities among Amino Acids.

    Science.gov (United States)

    Li, Yushuang; Song, Tian; Yang, Jiasheng; Zhang, Yi; Yang, Jialiang

    2016-01-01

    In this paper, we have proposed a novel alignment-free method for comparing the similarity of protein sequences. We first encode a protein sequence into a 440-dimensional feature vector consisting of a 400-dimensional Pseudo-Markov transition probability vector among the 20 amino acids, a 20-dimensional content ratio vector, and a 20-dimensional position ratio vector of the amino acids in the sequence. By evaluating the Euclidean distances among the representing vectors, we compare the similarity of protein sequences. We then apply this method to the ND5 dataset consisting of the ND5 protein sequences of 9 species, and to the F10 and G11 datasets representing two of the xylanase-containing glycoside hydrolase families, i.e., families 10 and 11. As a result, our method achieves a correlation coefficient of 0.962 with the canonical protein sequence aligner ClustalW in the ND5 dataset, much higher than those of five other popular alignment-free methods. In addition, we successfully separate the xylanase sequences in the F10 family and the G11 family and illustrate that the F10 family is more heat stable than the G11 family, consistent with a few previous studies. Moreover, we prove mathematically an identity equation involving the Pseudo-Markov transition probability vector and the amino acid content ratio vector.
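
    The 400 + 20 + 20 layout described above can be sketched in a few lines. This is a hedged illustration rather than the published method: ordinary first-order transition frequencies stand in for the paper's "Pseudo-Markov" probabilities, and the position ratio is assumed here to be the mean 1-based position of a residue divided by the sequence length. Function names are hypothetical.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def encode(seq):
    """Encode a non-empty protein sequence as a 440-dimensional vector.

    Layout: 400 transition frequencies, 20 content ratios, 20 position
    ratios. Non-standard letters raise KeyError.
    """
    n = len(seq)

    # 20 x 20 counts of adjacent residue pairs (a followed by b).
    pair_counts = [[0] * 20 for _ in range(20)]
    for a, b in zip(seq, seq[1:]):
        pair_counts[INDEX[a]][INDEX[b]] += 1

    transition = []
    for row in pair_counts:
        total = sum(row)
        # Row-normalise; residues that never start a pair give zeros.
        transition.extend(c / total if total else 0.0 for c in row)

    content = [seq.count(aa) / n for aa in AMINO_ACIDS]

    position = []
    for aa in AMINO_ACIDS:
        idx = [i + 1 for i, ch in enumerate(seq) if ch == aa]
        # Assumed definition: mean 1-based position scaled by length.
        position.append(sum(idx) / (len(idx) * n) if idx else 0.0)

    return transition + content + position


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))
```

    Sequences are then compared by the Euclidean distance between their encoded vectors, with no alignment step.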

  9. Amyloid fibril formation from sequences of a natural beta-structured fibrous protein, the adenovirus fiber.

    Science.gov (United States)

    Papanikolopoulou, Katerina; Schoehn, Guy; Forge, Vincent; Forsyth, V Trevor; Riekel, Christian; Hernandez, Jean-François; Ruigrok, Rob W H; Mitraki, Anna

    2005-01-28

    Amyloid fibrils are fibrous beta-structures that derive from abnormal folding and assembly of peptides and proteins. Despite a wealth of structural studies on amyloids, the nature of the amyloid structure remains elusive; possible connections to natural, beta-structured fibrous motifs have been suggested. In this work we focus on understanding amyloid structure and formation from sequences of a natural, beta-structured fibrous protein. We show that short peptides (25 to 6 amino acids) corresponding to repetitive sequences from the adenovirus fiber shaft have an intrinsic capacity to form amyloid fibrils as judged by electron microscopy, Congo Red binding, infrared spectroscopy, and x-ray fiber diffraction. In the presence of the globular C-terminal domain of the protein that acts as a trimerization motif, the shaft sequences adopt a triple-stranded, beta-fibrous motif. We discuss the possible structure and arrangement of these sequences within the amyloid fibril, as compared with the one adopted within the native structure. A 6-amino acid peptide, corresponding to the last beta-strand of the shaft, was found to be sufficient to form amyloid fibrils. Structural analysis of these amyloid fibrils suggests that perpendicular stacking of beta-strand repeat units is an underlying common feature of amyloid formation.

  10. Statistical distributions of optimal global alignment scores of random protein sequences

    Directory of Open Access Journals (Sweden)

    Tang Jiaowei

    2005-10-01

    Background: The inference of homology from statistically significant sequence similarity is a central issue in sequence alignments. So far the statistical distribution function underlying the optimal global alignments has not been completely determined. Results: In this study, random and real but unrelated sequences prepared in six different ways were selected as reference datasets to obtain their respective statistical distributions of global alignment scores. All alignments were carried out with the Needleman-Wunsch algorithm and optimal scores were fitted to the Gumbel, normal and gamma distributions respectively. The three-parameter gamma distribution performs the best as the theoretical distribution function of global alignment scores, as it agrees perfectly well with the distribution of alignment scores. The normal distribution also agrees well with the score distribution frequencies when the shape parameter of the gamma distribution is sufficiently large, for this is the scenario when the normal distribution can be viewed as an approximation of the gamma distribution. Conclusion: We have shown that the optimal global alignment scores of random protein sequences fit the three-parameter gamma distribution function. This would be useful for the inference of homology between sequences whose relationship is unknown, through the evaluation of gamma distribution significance between sequences.
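
    The raw material for such a study, optimal global alignment scores of random sequence pairs, can be generated with a short Needleman-Wunsch sketch. The scoring scheme below (match +1, mismatch -1, linear gap -2) and the uniform residue frequencies are assumptions chosen for brevity; the abstract does not state the substitution matrix or gap penalties that were used.

```python
import random

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"


def nw_score(a, b, match=1, mismatch=-1, gap=-2):
    """Optimal global alignment score (Needleman-Wunsch, linear gap cost).

    Only the previous row of the dynamic-programming table is kept,
    so memory use is proportional to len(b).
    """
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def random_scores(n_pairs, length, seed=0):
    """Scores for n_pairs of uniformly random sequences of a given length."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_pairs):
        a = "".join(rng.choice(ALPHABET) for _ in range(length))
        b = "".join(rng.choice(ALPHABET) for _ in range(length))
        scores.append(nw_score(a, b))
    return scores
```

    The resulting list of scores is what would be fitted to candidate distributions (Gumbel, normal, gamma) with a statistics package.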

  11. Strategies in protein sequencing and characterization: Multi-enzyme digestion coupled with alternate CID/ETD tandem mass spectrometry

    Energy Technology Data Exchange (ETDEWEB)

    Nardiello, Donatella; Palermo, Carmen, E-mail: carmen.palermo@unifg.it; Natale, Anna; Quinto, Maurizio; Centonze, Diego

    2015-01-07

    Highlights: • Multi-enzyme digestion for protein sequencing and characterization by CID/ETD. • Simultaneous use of trypsin/chymotrypsin for the maximization of sequence. • Identification of PTMs, sequence variants and species-specific residues. • Increase of accuracy in sequence assignments by orthogonal fragmentation techniques. - Abstract: A strategy based on a simultaneous multi-enzyme digestion coupled with electron transfer dissociation (ETD) and collision-induced dissociation (CID) was developed for protein sequencing and characterization, as a valid alternative platform in ion-trap based proteomics. The effect of different proteolytic procedures using chymotrypsin, trypsin, a combination of both, and Lys-C was carefully evaluated in terms of number of identified peptides, protein coverage, and score distribution. A systematic comparison between CID and ETD is shown for the analysis of peptides originating from the in-solution digestion of standard caseins. The best results were achieved with a trypsin/chymotrypsin mix combined with CID and ETD operating in alternating mode. A post-database search validation of the MS/MS dataset was performed; the matched peptides were then cross-checked by the evaluation of ion scores, rank, number of experimental product ions, and their relative abundances in the MS/MS spectrum. By integrated CID/ETD experiments, high-quality spectra have been obtained, thus allowing a confirmation of spectral information and an increase of accuracy in peptide sequence assignments. Overlapping peptides, produced throughout the proteins, reduce the ambiguity in mapping modifications between natural variants and animal species, and allow the characterization of post-translational modifications. The advantages of using the enzymatic mix trypsin/chymotrypsin were confirmed by the nanoLC and CID/ETD tandem mass spectrometry of goat milk proteins, previously separated by two-dimensional gel electrophoresis.
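
    Why a trypsin/chymotrypsin mix yields shorter peptides than either enzyme alone can be seen with a simplified in silico digestion. The sketch below assumes complete cleavage and the commonly used textbook rules (trypsin after K or R, chymotrypsin after F, W or Y, neither before P); real digests also show missed cleavages and lower-specificity sites, which are ignored here. The function and constant names are hypothetical.

```python
TRYPSIN = "KR"        # cleaves C-terminal to Lys/Arg
CHYMOTRYPSIN = "FWY"  # cleaves C-terminal to Phe/Trp/Tyr (high-specificity sites)


def digest(sequence, residues, blocked_by="P"):
    """Split a sequence after every residue in `residues`.

    Cleavage is skipped when the following residue is in `blocked_by`
    (proline by default). Assumes complete digestion.
    """
    peptides = []
    start = 0
    for i, aa in enumerate(sequence):
        is_last = i == len(sequence) - 1
        if aa in residues and not is_last and sequence[i + 1] not in blocked_by:
            peptides.append(sequence[start:i + 1])
            start = i + 1
    peptides.append(sequence[start:])  # C-terminal peptide
    return peptides


protein = "AKRPGFW"  # toy sequence, not a real casein fragment
tryptic = digest(protein, TRYPSIN)
chymotryptic = digest(protein, CHYMOTRYPSIN)
combined = digest(protein, TRYPSIN + CHYMOTRYPSIN)
```

    Running the two enzymes separately instead gives two different sets of peptides with offset boundaries, which is one way the overlapping peptides mentioned in the abstract can arise.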

  12. Synthesis of protein-coated biocompatible methotrexate-loaded PLA-PEG-PLA nanoparticles for breast cancer treatment

    Directory of Open Access Journals (Sweden)

    Salam Massadeh

    2016-06-01

    Background: PLA-PEG-PLA triblock polymer nanoparticles are promising tools for targeted drug delivery. The main aim in designing polymeric nanoparticles for drug delivery is achieving a controlled and targeted release of a specific drug at the therapeutically optimal rate and choosing a suitable preparation method to encapsulate the drug efficiently, which depends mainly on the nature of the drug (hydrophilic or hydrophobic). In this study, methotrexate (MTX)-loaded nanoparticles were prepared by the double emulsion method. Method: Biodegradable polymer polyethylene glycol-polylactide acid tri-block was used with poly(vinyl alcohol) as emulsifier. The resulting methotrexate polymer nanoparticles were coated with bovine serum albumin in order to improve their biocompatibility. This study focused on particle size distribution, zeta potential, encapsulation efficiency, loading capacity, and in vitro drug release at various concentrations of PVA (0.5%, 1%, 2%, and 3%). Results: Reduced particle size of methotrexate-loaded nanoparticles was obtained using lower PVA concentrations. Enhanced encapsulation efficiency and loading capacity was obtained using 1% PVA. FT-IR characterization was conducted for the void polymer nanoparticles and for drug-loaded nanoparticles with methotrexate, and the protein-coated nanoparticles in solid state showed the structure of the plain PEG-PLA and the drug-loaded nanoparticles with methotrexate. The methotrexate-loaded PLA-PEG-PLA nanoparticles have been studied in vitro; the drug release, drug loading, and yield are reported. Conclusion: The drug release profile was monitored over a period of 168 hours, and was free of burst effect before the protein coating. The results obtained from this work are promising; this work can be taken further to develop MTX based therapies.

  13. Synthesis of protein-coated biocompatible methotrexate-loaded PLA-PEG-PLA nanoparticles for breast cancer treatment

    Science.gov (United States)

    Massadeh, Salam; Alaamery, Manal; Al-Qatanani, Shatha; Alarifi, Saqer; Bawazeer, Shahad; Alyafee, Yusra

    2016-01-01

    Background: PLA-PEG-PLA triblock polymer nanoparticles are promising tools for targeted drug delivery. The main aim in designing polymeric nanoparticles for drug delivery is achieving a controlled and targeted release of a specific drug at the therapeutically optimal rate and choosing a suitable preparation method to encapsulate the drug efficiently, which depends mainly on the nature of the drug (hydrophilic or hydrophobic). In this study, methotrexate (MTX)-loaded nanoparticles were prepared by the double emulsion method. Method: Biodegradable polymer polyethylene glycol-polylactide acid tri-block was used with poly(vinyl alcohol) as emulsifier. The resulting methotrexate polymer nanoparticles were coated with bovine serum albumin in order to improve their biocompatibility. This study focused on particle size distribution, zeta potential, encapsulation efficiency, loading capacity, and in vitro drug release at various concentrations of PVA (0.5%, 1%, 2%, and 3%). Results: Reduced particle size of methotrexate-loaded nanoparticles was obtained using lower PVA concentrations. Enhanced encapsulation efficiency and loading capacity were obtained using 1% PVA. FT-IR characterization was conducted for the void polymer nanoparticles and for drug-loaded nanoparticles with methotrexate, and the protein-coated nanoparticles in solid state showed the structure of the plain PEG-PLA and the drug-loaded nanoparticles with methotrexate. The methotrexate-loaded PLA-PEG-PLA nanoparticles have been studied in vitro; the drug release, drug loading, and yield are reported. Conclusion: The drug release profile was monitored over a period of 168 hours, and was free of burst effect before the protein coating. The results obtained from this work are promising; this work can be taken further to develop MTX based therapies.
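
    The abstract reports encapsulation efficiency and loading capacity without stating how they were computed. The sketch below uses the definitions commonly applied to drug-loaded nanoparticles (encapsulated drug relative to drug added, and encapsulated drug relative to total particle mass); these formulas, the function names, and the example masses are assumptions for illustration, not values or methods taken from the paper.

```python
def encapsulation_efficiency(drug_encapsulated_mg: float, drug_added_mg: float) -> float:
    """Percent of the drug initially added that ended up inside the particles.

    Assumed conventional definition; the paper's exact formula is not given
    in the abstract.
    """
    if drug_added_mg <= 0:
        raise ValueError("drug_added_mg must be positive")
    return 100.0 * drug_encapsulated_mg / drug_added_mg


def loading_capacity(drug_encapsulated_mg: float, nanoparticle_mass_mg: float) -> float:
    """Percent of the total nanoparticle mass that is drug (assumed definition)."""
    if nanoparticle_mass_mg <= 0:
        raise ValueError("nanoparticle_mass_mg must be positive")
    return 100.0 * drug_encapsulated_mg / nanoparticle_mass_mg


# Hypothetical batch: 5 mg of drug added, 4 mg recovered inside 50 mg of particles.
ee = encapsulation_efficiency(4.0, 5.0)
lc = loading_capacity(4.0, 50.0)
print(f"EE = {ee:.1f}%, LC = {lc:.1f}%")
```

    Comparing these two numbers across the four PVA concentrations is the kind of tabulation the Results sentence summarises.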

  14. Fast and simple protein-alignment-guided assembly of orthologous gene families from microbiome sequencing reads.

    Science.gov (United States)

    Huson, Daniel H; Tappu, Rewati; Bazinet, Adam L; Xie, Chao; Cummings, Michael P; Nieselt, Kay; Williams, Rohan

    2017-01-25

    Microbiome sequencing projects typically collect tens of millions of short reads per sample. Depending on the goals of the project, the short reads can either be subjected to direct sequence analysis or be assembled into longer contigs. The assembly of whole genomes from metagenomic sequencing reads is a very difficult problem. However, for some questions, only specific genes of interest need to be assembled. This is then a gene-centric assembly where the goal is to assemble reads into contigs for a family of orthologous genes. We present a new method for performing gene-centric assembly, called protein-alignment-guided assembly, and provide an implementation in our metagenome analysis tool MEGAN. Genes are assembled on the fly, based on the alignment of all reads against a protein reference database such as NCBI-nr. Specifically, the user selects a gene family based on a classification such as KEGG and all reads binned to that gene family are assembled. Using published synthetic community metagenome sequencing reads and a set of 41 gene families, we show that the performance of this approach compares favorably with that of full-featured assemblers and that of a recently published HMM-based gene-centric assembler, both in terms of the number of reference genes detected and of the percentage of reference sequence covered. Protein-alignment-guided assembly of orthologous gene families complements whole-metagenome assembly in a new and very useful way.

  15. Simple sequence proteins in prokaryotic proteomes

    Directory of Open Access Journals (Sweden)

    Ramachandran Srinivasan

    2006-06-01

    Full Text Available Abstract. Background: The structural and functional features associated with Simple Sequence Proteins (SSPs) are non-globularity, disease states, signaling and post-translational modification. SSPs are also an important source of genetic and possibly phenotypic variation. Analysis of 249 prokaryotic proteomes offers a new opportunity to examine the genomic properties of SSPs. Results: SSPs are a minority but they grow with proteome size. This relationship is exhibited across species varying in genomic GC, mutational bias, life style, and pathogenicity. Their proportion in each proteome is strongly influenced by genomic base compositional bias. In most species simple duplications are favoured, but in a few cases such as Mycobacteria, large families of duplications occur. Amino acid preference in SSPs exhibits a trend towards low cost of biosynthesis. In SSPs and in non-SSPs, Alanine, Glycine, Leucine, and Valine are abundant in species widely varying in genomic GC whereas Isoleucine and Lysine are rich only in organisms with low genomic GC. Arginine is abundant in SSPs of two species and in the non-SSPs of Xanthomonas oryzae. Asparagine is abundant only in SSPs of low GC species. Aspartic acid is abundant only in the non-SSPs of Halobacterium sp NRC1. The abundance of Serine in SSPs of 62 species extends over a broader range compared to that of non-SSPs. Threonine (T) is abundant only in SSPs of a couple of species. SSPs exhibit preferential association with Cell surface, Cell membrane and Transport functions and a negative association with Metabolism. Mesophiles and Thermophiles display similar ranges in the content of SSPs. Conclusion: Although SSPs are a minority, the genomic forces of base compositional bias and duplications influence their growth and pattern in each species. The preferences and abundance of amino acids are governed by low biosynthetic cost, evolutionary age and base composition of codons. Abundance of charged amino acids Arginine

  16. In situ detection of a heat-shock regulatory element binding protein using a soluble short synthetic enhancer sequence

    Energy Technology Data Exchange (ETDEWEB)

    Harel-Bellan, A; Brini, A T; Farrar, W L [National Cancer Institute, Frederick, MD (USA)]; Ferris, D K [Program Resources, Inc., Frederick, MD (USA)]; Robin, P [Institut Gustave Roussy, Villejuif (France)]

    1989-06-12

    In various studies, enhancer binding proteins have been successfully absorbed out by competing sequences inserted into plasmids, resulting in the inhibition of the plasmid expression. Theoretically, such a result could be achieved using synthetic enhancer sequences not inserted into plasmids. In this study, a double stranded DNA sequence corresponding to the human heat shock regulatory element was chemically synthesized. By in vitro retardation assays, the synthetic sequence was shown to bind specifically a protein in extracts from the human T cell line Jurkat. When the synthetic enhancer was electroporated into Jurkat cells, it was not only shown to remain undegraded in the cells for up to 2 days, but was also shown to bind a protein intracellularly. The binding was specific and was modulated upon heat shock. Furthermore, the binding protein was shown to be of the expected molecular weight by UV crosslinking. However, when the synthetic enhancer element was co-electroporated with an HSP 70-CAT reporter construct, the expression of the reporter plasmid was consistently enhanced in the presence of the exogenous synthetic enhancer.

  17. A protein-tyrosine phosphatase with sequence similarity to the SH2 domain of the protein-tyrosine kinases.

    Science.gov (United States)

    Shen, S H; Bastien, L; Posner, B I; Chrétien, P

    1991-08-22

    The phosphorylation of proteins at tyrosine residues is critical in cellular signal transduction, neoplastic transformation and control of the mitotic cycle. These mechanisms are regulated by the activities of both protein-tyrosine kinases (PTKs) and protein-tyrosine phosphatases (PTPases). As in the PTKs, there are two classes of PTPases: membrane associated, receptor-like enzymes and soluble proteins. Here we report the isolation of a complementary DNA clone encoding a new form of soluble PTPase, PTP1C. The enzyme possesses a large noncatalytic region at the N terminus which unexpectedly contains two adjacent copies of the Src homology region 2 (the SH2 domain) found in various nonreceptor PTKs and other cytoplasmic signalling proteins. As with other SH2 sequences, the SH2 domains of PTP1C formed high-affinity complexes with the activated epidermal growth factor receptor and other phosphotyrosine-containing proteins. These results suggest that the SH2 regions in PTP1C may interact with other cellular components to modulate its own phosphatase activity against interacting substrates. PTPase activity may thus directly link growth factor receptors and other signalling proteins through protein-tyrosine phosphorylation.

  18. Electrophoretic mobility shift assay reveals a novel recognition sequence for Setaria italica NAC protein.

    Science.gov (United States)

    Puranik, Swati; Kumar, Karunesh; Srivastava, Prem S; Prasad, Manoj

    2011-10-01

    The NAC (NAM/ATAF1,2/CUC2) proteins are among the largest family of plant transcription factors. Its members have been associated with diverse plant processes and intricately regulate the expression of several genes. In spite of this immense progress, knowledge of their DNA-binding properties is still limited. In our recent publication,1 we reported isolation of a membrane-associated NAC domain protein from Setaria italica (SiNAC). Transactivation analysis revealed that it was a functionally active transcription factor as it could stimulate expression of reporter genes in vivo. Truncations of the transmembrane region of the protein lead to its nuclear localization. Here we describe expression and purification of SiNAC DNA-binding domain. We further report identification of a novel DNA-binding site, [C/G][A/T][T/A][G/C]TC[C/G][A/T][C/G][G/C] for SiNAC by electrophoretic mobility shift assay. The SiNAC-GST protein could bind to the NAC recognition sequence in vitro as well as to sequences where some bases had been reshuffled. The results presented here contribute to our understanding of the DNA-binding specificity of SiNAC protein.
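
    The degenerate binding site quoted in the abstract maps directly onto a regular expression, which gives a quick way to scan a sequence for candidate sites. The following is a minimal sketch under stated assumptions: the function names and the test sequence are hypothetical, only the forward strand is searched, and a pattern match says nothing about actual binding in vitro or in vivo.

```python
import re

# Site as written in the abstract; each bracket lists the bases allowed at one position.
SITE = "[C/G][A/T][T/A][G/C]TC[C/G][A/T][C/G][G/C]"


def site_to_regex(site: str) -> str:
    """Convert the slash notation '[C/G]' into the regex character class '[CG]'."""
    return site.replace("/", "")


def find_sites(dna: str, site: str = SITE) -> list[tuple[int, str]]:
    """Return (0-based start, matched 10-mer) for every hit on the given strand.

    A lookahead is used so that overlapping candidate sites are all reported.
    """
    pattern = re.compile(f"(?=({site_to_regex(site)}))")
    return [(m.start(), m.group(1)) for m in pattern.finditer(dna.upper())]


# Hypothetical test sequence, not taken from the paper.
print(find_sites("aaCATGTCCACGtt"))
```

    To cover both strands, the same function can be run on the reverse complement of the input.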

  19. Camps 2.0: exploring the sequence and structure space of prokaryotic, eukaryotic, and viral membrane proteins.

    Science.gov (United States)

    Neumann, Sindy; Hartmann, Holger; Martin-Galiano, Antonio J; Fuchs, Angelika; Frishman, Dmitrij

    2012-03-01

    Structural bioinformatics of membrane proteins is still in its infancy, and the picture of their fold space is only beginning to emerge. Because only a handful of three-dimensional structures are available, sequence comparison and structure prediction remain the main tools for investigating sequence-structure relationships in membrane protein families. Here we present a comprehensive analysis of the structural families corresponding to α-helical membrane proteins with at least three transmembrane helices. The new version of our CAMPS database (CAMPS 2.0) covers nearly 1300 eukaryotic, prokaryotic, and viral genomes. Using an advanced classification procedure, which is based on high-order hidden Markov models and considers both sequence similarity as well as the number of transmembrane helices and loop lengths, we identified 1353 structurally homogeneous clusters roughly corresponding to membrane protein folds. Only 53 clusters are associated with experimentally determined three-dimensional structures, and for these clusters CAMPS is in reasonable agreement with structure-based classification approaches such as SCOP and CATH. We therefore estimate that ∼1300 structures would need to be determined to provide a sufficient structural coverage of polytopic membrane proteins. CAMPS 2.0 is available at http://webclu.bio.wzw.tum.de/CAMPS2.0/. Copyright © 2011 Wiley Periodicals, Inc.

  20. GenProBiS: web server for mapping of sequence variants to protein binding sites.

    Science.gov (United States)

    Konc, Janez; Skrlj, Blaz; Erzen, Nika; Kunej, Tanja; Janezic, Dusanka

    2017-07-03

    Discovery of potentially deleterious sequence variants is important and has wide implications for research and generation of new hypotheses in human and veterinary medicine, and drug discovery. The GenProBiS web server maps sequence variants to protein structures from the Protein Data Bank (PDB), and further to protein-protein, protein-nucleic acid, protein-compound, and protein-metal ion binding sites. The concept of a protein-compound binding site is understood in the broadest sense, which includes glycosylation and other post-translational modification sites. Binding sites were defined by local structural comparisons of whole protein structures using the Protein Binding Sites (ProBiS) algorithm and transposition of ligands from the similar binding sites found to the query protein using the ProBiS-ligands approach with new improvements introduced in GenProBiS. Binding site surfaces were generated as three-dimensional grids encompassing the space occupied by predicted ligands. The server allows intuitive visual exploration of comprehensively mapped variants, such as human somatic mis-sense mutations related to cancer and non-synonymous single nucleotide polymorphisms from 21 species, within the predicted binding sites regions for about 80 000 PDB protein structures using fast WebGL graphics. The GenProBiS web server is open and free to all users at http://genprobis.insilab.org. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  1. Simplifying complex sequence information: a PCP-consensus protein binds antibodies against all four Dengue serotypes.

    Science.gov (United States)

    Bowen, David M; Lewis, Jessica A; Lu, Wenzhe; Schein, Catherine H

    2012-09-14

    Designing proteins that reflect the natural variability of a pathogen is essential for developing novel vaccines and drugs. Flaviviruses, including Dengue (DENV) and West Nile (WNV), evolve rapidly and can "escape" neutralizing monoclonal antibodies by mutation. Designing antigens that represent many distinct strains is important for DENV, where infection with a strain from one of the four serotypes may lead to severe hemorrhagic disease on subsequent infection with a strain from another serotype. Here, a DENV physicochemical property (PCP)-consensus sequence was derived from 671 unique sequences from the Flavitrack database. PCP-consensus proteins for domain 3 of the envelope protein (EdomIII) were expressed from synthetic genes in Escherichia coli. The ability of the purified consensus proteins to bind polyclonal antibodies generated in response to infection with strains from each of the four DENV serotypes was determined. The initial consensus protein bound antibodies from DENV-1-3 in ELISA and Western blot assays. This sequence was altered in 3 steps to incorporate regions of maximum variability, identified as significant changes in the PCPs, characteristic of DENV-4 strains. The final protein was recognized by antibodies against all four serotypes. Two amino acids essential for efficient binding to all DENV antibodies are part of a discontinuous epitope previously defined for a neutralizing monoclonal antibody. The PCP-consensus method can significantly reduce the number of experiments required to define a multivalent antigen, which is particularly important when dealing with pathogens that must be tested at higher biosafety levels. Copyright © 2012 Elsevier Ltd. All rights reserved.

  2. A branch-heterogeneous model of protein evolution for efficient inference of ancestral sequences.

    Science.gov (United States)

    Groussin, M; Boussau, B; Gouy, M

    2013-07-01

    Most models of nucleotide or amino acid substitution used in phylogenetic studies assume that the evolutionary process has been homogeneous across lineages and that composition of nucleotides or amino acids has remained the same throughout the tree. These oversimplified assumptions are refuted by the observation that compositional variability characterizes extant biological sequences. Branch-heterogeneous models of protein evolution that account for compositional variability have been developed, but are not yet in common use because of the large number of parameters required, leading to high computational costs and potential overparameterization. Here, we present a new branch-nonhomogeneous and nonstationary model of protein evolution that captures more accurately the high complexity of sequence evolution. This model, henceforth called Correspondence and likelihood analysis (COaLA), makes use of a correspondence analysis to reduce the number of parameters to be optimized through maximum likelihood, focusing on most of the compositional variation observed in the data. The model was thoroughly tested on both simulated and biological data sets to show its high performance in terms of data fitting and CPU time. COaLA efficiently estimates ancestral amino acid frequencies and sequences, making it relevant for studies aiming at reconstructing and resurrecting ancestral amino acid sequences. Finally, we applied COaLA on a concatenate of universal amino acid sequences to confirm previous results obtained with a nonhomogeneous Bayesian model regarding the early pattern of adaptation to optimal growth temperature, supporting the mesophilic nature of the Last Universal Common Ancestor.

  3. The structure of the COPII transport-vesicle coat assembled on membranes.

    Science.gov (United States)

    Zanetti, Giulia; Prinz, Simone; Daum, Sebastian; Meister, Annette; Schekman, Randy; Bacia, Kirsten; Briggs, John A G

    2013-09-17

    Coat protein complex II (COPII) mediates formation of the membrane vesicles that export newly synthesised proteins from the endoplasmic reticulum. The inner COPII proteins bind to cargo and membrane, linking them to the outer COPII components that form a cage around the vesicle. Regulated flexibility in coat architecture is essential for transport of a variety of differently sized cargoes, but structural data on the assembled coat has not been available. We have used cryo-electron tomography and subtomogram averaging to determine the structure of the complete, membrane-assembled COPII coat. We describe a novel arrangement of the outer coat and find that the inner coat can assemble into regular lattices. The data reveal how coat subunits interact with one another and with the membrane, suggesting how coordinated assembly of inner and outer coats can mediate and regulate packaging of vesicles ranging from small spheres to large tubular carriers. DOI:http://dx.doi.org/10.7554/eLife.00951.001.

  4. Complete nucleotide sequence of the RNA-2 of grapevine deformation and Grapevine Anatolian ringspot viruses.

    Science.gov (United States)

    Ghanem-Sabanadzovic, Nina Abou; Sabanadzovic, Sead; Digiaro, Michele; Martelli, Giovanni P

    2005-05-01

    The nucleotide sequence of RNA-2 of Grapevine Anatolian ringspot virus (GARSV) and Grapevine deformation virus (GDefV), two recently described nepoviruses, has been determined. These RNAs are 3753 nt (GDefV) and 4607 nt (GARSV) in size and contain a single open reading frame encoding a polyprotein of 122 kDa (GDefV) and 150 kDa (GARSV). Full-length nucleotide sequence comparison disclosed 71-73% homology between GDefV RNA-2 and that of Grapevine fanleaf virus (GFLV) and Arabis mosaic virus (ArMV), and 62-64% homology between GARSV RNA-2 and that of Grapevine chrome mosaic virus (GCMV) and Tomato black ring virus (TBRV). As previously observed in other nepoviruses, the 5' non-coding regions of both RNAs are capable of forming stem-loop structures. Phylogenetic analysis of the three proteins encoded by RNA-2 (i.e. protein 2A, movement protein and coat protein) confirmed that GDefV and GARSV are distinct viruses which can be assigned as definitive species in subgroup A and subgroup B of the genus Nepovirus, respectively.

  5. The 42-kDa coat protein of Andean potato mottle virus acts as a transcriptional activator in yeast

    Directory of Open Access Journals (Sweden)

    Vidal M.S.

    2002-01-01

    Full Text Available Interactions of viral proteins play an important role in the virus life cycle, especially in capsid assembly. Andean potato mottle comovirus (APMoV) is a plant RNA virus with a virion formed by two coat proteins (CP42 and CP22). Both APMoV coat protein open reading frames were cloned into pGBT9 and pGAD10, two-hybrid system vectors. HF7c yeast cells transformed with the p9CP42 construct grew on yeast dropout selection media lacking tryptophan and histidine. Clones also exhibited β-galactosidase activity in both qualitative and quantitative assays. These results suggest that CP42 protein contains an amino acid motif able to activate transcription of His3 and lacZ reporter genes in Saccharomyces cerevisiae. Several deletions of the CP42 gene were cloned into the pGBT9 vector to locate the region involved in this activation. CP42 constructs lacking 12 residues from the C-terminal region and another one with 267 residues deleted from the N-terminus are still able to activate transcription of reporter genes. However, transcription activation was not observed with construct p9CP42deltaC57, which does not contain the last 57 amino acid residues. These results demonstrate that a transcription activation domain is present at the C-terminus of CP42 between residues 267 and 374.

  6. Rapid and Efficient Protein Digestion using Trypsin Coated Magnetic Nanoparticles under Pressure Cycles

    Energy Technology Data Exchange (ETDEWEB)

    Lee, Byoungsoo; Lopez-Ferrer, Daniel; Kim, Byoung Chan; Na, Hyon Bin; Park, Yong Il; Weitz, Karl K.; Warner, Marvin G.; Hyeon, Taeghwan; Lee, Sang-Won; Smith, Richard D.; Kim, Jungbae

    2011-01-01

    Trypsin-coated magnetic nanoparticles (EC-TR/NPs), prepared via a simple crosslinking of the enzyme to magnetic nanoparticles, were highly stable and could be easily captured using a magnet after the digestion was complete. EC-TR/NPs showed a negligible loss of trypsin activity after multiple uses and continuous shaking, while a control sample of covalently-attached trypsin on NPs resulted in a rapid inactivation under the same conditions due to the denaturation and autolysis of trypsin. Digestions were carried out on a single model protein, a five protein mixture, and a whole mouse brain proteome, and also compared for digestion at atmospheric pressure and 37 °C for 12 h, and in combination with pressure cycling technology (PCT) at room temperature for 1 min. In all cases, the EC-TR/NPs performed as well as or better than free trypsin in terms of the number of peptide/protein identifications and reproducibility across technical replicates. However, the concomitant use of EC-TR/NPs and PCT resulted in very fast (~1 min) and more reproducible digestions.

  7. UFO: a web server for ultra-fast functional profiling of whole genome protein sequences.

    Science.gov (United States)

    Meinicke, Peter

    2009-09-02

    Functional profiling is a key technique to characterize and compare the functional potential of entire genomes. The estimation of profiles according to an assignment of sequences to functional categories is a computationally expensive task because it requires the comparison of all protein sequences from a genome with a usually large database of annotated sequences or sequence families. Based on machine learning techniques for Pfam domain detection, the UFO web server for ultra-fast functional profiling allows researchers to process large protein sequence collections instantaneously. Besides the frequencies of Pfam and GO categories, the user also obtains the sequence specific assignments to Pfam domain families. In addition, a comparison with existing genomes provides dissimilarity scores with respect to 821 reference proteomes. Considering the underlying UFO domain detection, the results on 206 test genomes indicate a high sensitivity of the approach. In comparison with current state-of-the-art HMMs, the runtime measurements show a considerable speed up in the range of four orders of magnitude. For an average size prokaryotic genome, the computation of a functional profile together with its comparison typically requires about 10 seconds of processing time. For the first time the UFO web server makes it possible to get a quick overview on the functional inventory of newly sequenced organisms. The genome scale comparison with a large number of precomputed profiles allows a first guess about functionally related organisms. The service is freely available and does not require user registration or specification of a valid email address.
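
    The idea of a functional profile and its comparison against reference proteomes can be sketched in a few lines. The abstract does not say how UFO computes its dissimilarity scores, so the measure used here (half the L1 distance between normalised frequency vectors) is an assumption chosen for simplicity, and the family labels and reference names are hypothetical placeholders rather than real Pfam accessions.

```python
from collections import Counter


def functional_profile(domain_hits: list[str], categories: list[str]) -> list[float]:
    """Relative frequency of each category among the domain assignments."""
    counts = Counter(domain_hits)
    total = sum(counts[c] for c in categories)
    if total == 0:
        return [0.0] * len(categories)
    return [counts[c] / total for c in categories]


def dissimilarity(p: list[float], q: list[float]) -> float:
    """Half the L1 distance: 0.0 for identical profiles, 1.0 for disjoint ones.

    Assumed measure for illustration; not necessarily the one used by the server.
    """
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def rank_references(query: list[float], references: dict[str, list[float]]) -> list[tuple[float, str]]:
    """Sort precomputed reference profiles from most to least similar."""
    return sorted((dissimilarity(query, prof), name) for name, prof in references.items())


# Hypothetical categories and per-protein domain assignments.
categories = ["famA", "famB", "famC"]
query = functional_profile(["famA", "famA", "famB", "famB"], categories)
references = {"ref1": [0.5, 0.5, 0.0], "ref2": [0.0, 0.0, 1.0]}
print(rank_references(query, references))
```

    Because the reference profiles are precomputed, the comparison step costs only one pass over the category vector per reference.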

  8. UFO: a web server for ultra-fast functional profiling of whole genome protein sequences

    Directory of Open Access Journals (Sweden)

    Meinicke Peter

    2009-09-01

    Full Text Available Abstract. Background: Functional profiling is a key technique to characterize and compare the functional potential of entire genomes. The estimation of profiles according to an assignment of sequences to functional categories is a computationally expensive task because it requires the comparison of all protein sequences from a genome with a usually large database of annotated sequences or sequence families. Description: Based on machine learning techniques for Pfam domain detection, the UFO web server for ultra-fast functional profiling allows researchers to process large protein sequence collections instantaneously. Besides the frequencies of Pfam and GO categories, the user also obtains the sequence specific assignments to Pfam domain families. In addition, a comparison with existing genomes provides dissimilarity scores with respect to 821 reference proteomes. Considering the underlying UFO domain detection, the results on 206 test genomes indicate a high sensitivity of the approach. In comparison with current state-of-the-art HMMs, the runtime measurements show a considerable speed up in the range of four orders of magnitude. For an average size prokaryotic genome, the computation of a functional profile together with its comparison typically requires about 10 seconds of processing time. Conclusion: For the first time the UFO web server makes it possible to get a quick overview on the functional inventory of newly sequenced organisms. The genome scale comparison with a large number of precomputed profiles allows a first guess about functionally related organisms. The service is freely available and does not require user registration or specification of a valid email address.

  9. Cluster based on sequence comparison of homologous proteins of 95 organism species - Gclust Server | LSDB Archive [Life Science Database Archive metadata]

    Lifescience Database Archive (English)

    Full Text Available Gclust Server. Data name: Cluster based on sequence comparison of homologous proteins of 95 organism species.

  10. Analysis of potato virus Y coat protein epitopes recognized by three commercial monoclonal antibodies.

    Science.gov (United States)

    Tian, Yan-Ping; Hepojoki, Jussi; Ranki, Harri; Lankinen, Hilkka; Valkonen, Jari P T

    2014-01-01

    Potato virus Y (PVY, genus Potyvirus) causes substantial economic losses in solanaceous plants. Routine screening for PVY is an essential part of seed potato certification, and serological assays are often used. The commercial, commonly used monoclonal antibodies, MAb1128, MAb1129, and MAb1130, recognize the viral coat protein (CP) of PVY and distinguish PVYN strains from PVYO and PVYC strains, or detect all PVY strains, respectively. However, the minimal epitopes recognized by these antibodies have not been identified. SPOT peptide array was used to map the epitopes in CP recognized by MAb1128, MAb1129, and MAb1130. Then alanine replacement as well as N- and C-terminal deletion analysis of the identified peptide epitopes was done to determine critical amino acids for antibody recognition and the respective minimal epitopes. The epitopes of all antibodies were located within the 30 N-terminal-most residues. The minimal epitope of MAb1128 was 25NLNKEK30. Replacement of 25N or 27N with alanine weakened the recognition by MAb1128, and replacement of 26L, 29E, or 30K nearly precluded recognition. The minimal epitope for MAb1129 was 16RPEQGSIQSNP26 and the most critical residues for recognition were 22I and 23Q. The epitope of MAb1130 was defined by residues 5IDAGGS10. Mutation of residue 6D abrogated and mutation of 9G strongly reduced recognition of the peptide by MAb1130. Amino acid sequence alignment demonstrated that these epitopes are relatively conserved among PVY strains. Finally, recombinant CPs were produced to demonstrate that mutations in the variable positions of the epitope regions can affect detection with the MAbs. The epitope data acquired can be compared with data on PVY CP-encoding sequences produced by laboratories worldwide and utilized to monitor how widely the new variants of PVY can be detected with current seed potato certification schemes or during the inspection of imported seed potatoes as conducted with these MAbs.

  11. Accessible surface area of proteins from purely sequence information and the importance of global features

    Science.gov (United States)

    Faraggi, Eshel; Zhou, Yaoqi; Kloczkowski, Andrzej

    2014-03-01

    We present a new approach for predicting the accessible surface area of proteins. The novelty of this approach lies in not using residue mutation profiles generated by multiple sequence alignments as descriptive inputs. Rather, sequential window information and the global monomer and dimer compositions of the chain are used. We find that much of the lost accuracy due to the elimination of evolutionary information is recouped by the use of global features. Furthermore, this new predictor produces similar results for proteins with or without sequence homologs deposited in the Protein Data Bank, and hence shows generalizability. Finally, these predictions are obtained in a small fraction (1/1000) of the time required to run mutation profile based prediction. All these factors indicate the possible usability of this work in de-novo protein structure prediction and in de-novo protein design using iterative searches. Funded in part by the financial support of the National Institutes of Health through Grants R01GM072014 and R01GM073095, and the National Science Foundation through Grant NSF MCB 1071785.
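
    The inputs named in the abstract (a local sequence window plus global monomer and dimer compositions of the chain) are simple to compute. The sketch below shows one plausible way to build them; the window half-width, the padding symbol, and all function names are assumptions, and the predictor itself (the model trained on these features) is not reproduced here.

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def monomer_composition(seq: str) -> list[float]:
    """Fraction of each of the 20 standard amino acids in the whole chain."""
    if not seq:
        raise ValueError("sequence must not be empty")
    counts = Counter(seq)
    return [counts[a] / len(seq) for a in AMINO_ACIDS]


def dimer_composition(seq: str) -> list[float]:
    """Fraction of each of the 400 adjacent residue pairs in the whole chain."""
    pairs = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    total = max(len(seq) - 1, 1)
    return [pairs[a + b] / total for a, b in product(AMINO_ACIDS, repeat=2)]


def window(seq: str, i: int, half_width: int = 2, pad: str = "X") -> str:
    """Residues around position i (0-based), padded at the chain termini."""
    padded = pad * half_width + seq + pad * half_width
    return padded[i:i + 2 * half_width + 1]


def residue_features(seq: str, i: int, half_width: int = 2):
    """Local window for residue i together with the two global composition vectors."""
    return window(seq, i, half_width), monomer_composition(seq), dimer_composition(seq)


# Hypothetical toy chain.
local, mono, di = residue_features("ACAC", 0)
print(local, len(mono), len(di))
```

    The two composition vectors are the same for every residue of a chain, so they need to be computed only once per protein, which is consistent with the speed advantage the abstract claims over alignment-based profiles.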

  12. Complete amino acid sequences of the ribosomal proteins L25, L29 and L31 from the archaebacterium Halobacterium marismortui.

    Science.gov (United States)

    Hatakeyama, T; Kimura, M

    1988-03-15

    Ribosomal proteins were extracted from 50S ribosomal subunits of the archaebacterium Halobacterium marismortui by decreasing the concentration of Mg2+ and K+, and the proteins were separated and purified by ion-exchange column chromatography on DEAE-cellulose. Ten proteins were purified to homogeneity and three of these proteins were subjected to sequence analysis. The complete amino acid sequences of the ribosomal proteins L25, L29 and L31 were established by analyses of the peptides obtained by enzymatic digestion with trypsin, Staphylococcus aureus protease, chymotrypsin and lysylendopeptidase. Proteins L25, L29 and L31 consist of 84, 115 and 95 amino acid residues with the molecular masses of 9472 Da, 12293 Da and 10418 Da respectively. A comparison of their sequences with those of other large-ribosomal-subunit proteins from other organisms revealed that protein L25 from H. marismortui is homologous to protein L23 from Escherichia coli (34.6%), Bacillus stearothermophilus (41.8%), and tobacco chloroplasts (16.3%) as well as to protein L25 from yeast (38.0%). Proteins L29 and L31 do not appear to be homologous to any other ribosomal proteins whose structures are so far known.
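
    The residue counts and molecular masses quoted above can be cross-checked with a one-line calculation: dividing mass by length should give a mean residue mass near the roughly 110 Da often used as a rule of thumb for proteins. The dictionary and function names below are illustrative; the numbers are those given in the abstract.

```python
# (number of residues, molecular mass in Da) as stated in the abstract.
PROTEINS = {"L25": (84, 9472), "L29": (115, 12293), "L31": (95, 10418)}


def mean_residue_mass(n_residues: int, mass_da: float) -> float:
    """Average mass per residue in Da."""
    return mass_da / n_residues


for name, (n_residues, mass_da) in PROTEINS.items():
    print(f"{name}: {mean_residue_mass(n_residues, mass_da):.1f} Da per residue")
```

    All three values fall within a few daltons of the rule-of-thumb figure, so the reported lengths and masses are mutually consistent.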

  13. Other notable protein blotting methods: a brief review.

    Science.gov (United States)

    Kurien, Biji T; Scofield, R Hal

    2015-01-01

    Proteins have been transferred from the gel to the membrane by a variety of methods. These include vacuum blotting, centrifuge blotting, electroblotting of proteins to Teflon tape and membranes for N- and C-terminal sequence analysis, multiple tissue blotting, a two-step transfer of low- and high-molecular-weight proteins, acid electroblotting onto activated glass, membrane-array method for the detection of human intestinal bacteria in fecal samples, protein microarray using a new black cellulose nitrate support, electrotransfer using square wave alternating voltage for enhanced protein recovery, polyethylene glycol-mediated significant enhancement of the immunoblotting transfer, parallel protein chemical processing before and during western blot and the molecular scanner concept, electronic western blot of matrix-assisted laser desorption/ionization mass spectrometric-identified polypeptides from parallel processed gel-separated proteins, semidry electroblotting of peptides and proteins from acid-urea polyacrylamide gels, transfer of silver-stained proteins from polyacrylamide gels to polyvinylidene difluoride (PVDF) membranes, and the display of K+ channel proteins on a solid nitrocellulose support for assaying toxin binding. The quantification of proteins bound to PVDF membranes by elution of CBB, clarification of immunoblots on PVDF for transmission densitometry, gold coating of nonconductive membranes before matrix-assisted laser desorption/ionization tandem mass spectrometric analysis to prevent charging effect for analysis of peptides from PVDF membranes, and a simple method for coating native polysaccharides onto nitrocellulose are some of the methods involving either the manipulation of membranes with transferred proteins or just a passive transfer of antigens to membranes. All these methods are briefly reviewed in this chapter.

  14. New sensitive and specific assay for human immunodeficiency virus antibodies using labeled recombinant fusion protein and time-resolved fluoroimmunoassay.

    OpenAIRE

    Siitari, H; Turunen, P; Schrimsher, J; Nunn, M

    1990-01-01

    A new, rapid method for the detection of human immunodeficiency virus type 1 (HIV-1) antibody by time-resolved fluoroimmunoassay (TR-FIA) was developed. In this assay format, microtitration strips were coated with a recombinant fusion protein, and the same protein was labeled with europium and added into the wells simultaneously with the test specimens. The recombinant fusion protein contained the HIV-1 p24 gag protein sequence that carried an insertion, near the carboxyl terminus, of a 23-am...

  15. Length and sequence dependence in the association of Huntingtin protein with lipid membranes

    Science.gov (United States)

    Jawahery, Sudi; Nagarajan, Anu; Matysiak, Silvina

    2013-03-01

    There is a fundamental gap in our understanding of how aggregates of mutant Huntingtin protein (htt) with overextended polyglutamine (polyQ) sequences gain the toxic properties that cause Huntington's disease (HD). Experimental studies have shown that the most important step associated with toxicity is the binding of mutant htt aggregates to lipid membranes. Studies have also shown that flanking amino acid sequences around the polyQ sequence directly affect interactions with the lipid bilayer, and that polyQ sequences of greater than 35 glutamine repeats in htt are a characteristic of HD. The key steps that determine how flanking sequences and polyQ length affect the structure of lipid bilayers remain unknown. In this study, we use atomistic molecular dynamics simulations to study the interactions between lipid membranes of varying compositions and polyQ peptides of varying lengths and flanking sequences. We find that overextended polyQ interactions do cause deformation in model membranes, and that the flanking sequences do play a role in intensifying this deformation by altering the shape of the affected regions.
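    The two sequence properties the abstract above varies, polyQ length and flanking residues, can be read off a peptide with a short helper. This is a sketch only: the function names, the flank width of 5 and the toy peptide are assumptions for illustration, and the threshold of 35 is the repeat count cited in the abstract.

```python
import re

HD_THRESHOLD = 35  # repeat count cited in the abstract above


def longest_polyq(seq: str) -> tuple[int, int]:
    """Return (start, length) of the longest uninterrupted run of Q; (-1, 0) if none."""
    best_start, best_len = -1, 0
    for m in re.finditer(r"Q+", seq):
        length = m.end() - m.start()
        if length > best_len:
            best_start, best_len = m.start(), length
    return best_start, best_len


def flanks(seq: str, width: int = 5) -> tuple[str, str]:
    """Residues immediately before and after the longest polyQ run."""
    start, length = longest_polyq(seq)
    if length == 0:
        return "", ""
    end = start + length
    return seq[max(0, start - width):start], seq[end:end + width]


# Illustrative peptide, not taken from the study.
toy = "MATLEKLMKAFESLKSF" + "Q" * 40 + "PPPPPPPPPPP"
start, n = longest_polyq(toy)
print(n, n > HD_THRESHOLD)   # 40 True
print(flanks(toy))           # ('SLKSF', 'PPPPP')
```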

  16. Comparative In silico Study of Sex-Determining Region Y (SRY) Protein Sequences Involved in Sex-Determining

    Directory of Open Access Journals (Sweden)

    Masoume Vakili Azghandi

    2016-05-01

    Background: The SRY gene (SRY) provides instructions for making a transcription factor called the sex-determining region Y protein. The sex-determining region Y protein causes a fetus to develop as a male. In this study, SRY sequences from 15 species (human, chimpanzee, dog, pig, rat, cattle, buffalo, goat, sheep, horse, zebra, frog, urial, dolphin and killer whale) were used to determine bioinformatic differences. Methods: Nucleotide sequences of SRY were retrieved from the NCBI databank. Bioinformatic analysis of SRY was done with CLC Main Workbench version 5.5, ClustalW (http:/www.ebi.ac.uk/clustalw/) and MEGA6 software. Results: The multiple sequence alignment results indicated that the SRY protein sequences from Orcinus orca (killer whale) and Tursiops aduncus (dolphin) have the smallest genetic distance among these 15 species, 0.33, and are 99.67% identical at the amino acid level. Homo sapiens and Pan troglodytes (chimpanzee) have the next lowest genetic distance, 1.35, and are 98.65% identical at the amino acid level. Conclusion: These findings indicate that the SRY proteins are conserved in the 15 species, and their evolutionary relationships are similar.
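    The paired figures in the abstract above (distance 0.33 with 99.67% identity, 1.35 with 98.65%) are consistent with a distance expressed as 100 minus percent identity. A minimal sketch under that assumed reading follows; it is not the calculation performed by the software named in the abstract, and the aligned strings are made up.

```python
def percent_identity(a: str, b: str, gap: str = "-") -> float:
    """Percent of aligned, ungapped columns where both sequences agree."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    cols = [(x, y) for x, y in zip(a, b) if x != gap and y != gap]
    if not cols:
        return 0.0
    matches = sum(x == y for x, y in cols)
    return 100.0 * matches / len(cols)


def percent_distance(a: str, b: str) -> float:
    """Distance as the complement of percent identity (assumed convention)."""
    return 100.0 - percent_identity(a, b)


# Made-up aligned fragments, for illustration only.
s1 = "MQSYASAMLS"
s2 = "MQSYASAMLT"
print(percent_identity(s1, s2))   # 90.0
print(percent_distance(s1, s2))   # 10.0
```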

  17. C-terminal sequences of hsp70 and hsp90 as non-specific anchors for tetratricopeptide repeat (TPR) proteins.

    Science.gov (United States)

    Ramsey, Andrew J; Russell, Lance C; Chinkers, Michael

    2009-10-12

    Steroid-hormone-receptor maturation is a multi-step process that involves several TPR (tetratricopeptide repeat) proteins that bind to the maturation complex via the C-termini of hsp70 (heat-shock protein 70) and hsp90 (heat-shock protein 90). We produced a random T7 peptide library to investigate the roles played by the C-termini of the two heat-shock proteins in the TPR-hsp interactions. Surprisingly, phages with the MEEVD sequence, found at the C-terminus of hsp90, were not recovered from our biopanning experiments. However, two groups of phages were isolated that bound relatively tightly to HsPP5 (Homo sapiens protein phosphatase 5) TPR. Multiple copies of phages with a C-terminal sequence of LFG were isolated. These phages bound specifically to the TPR domain of HsPP5, although mutation studies produced no evidence that they bound to the domain's hsp90-binding groove. However, the most abundant family obtained in the initial screen had an aspartate residue at the C-terminus. Two members of this family with a C-terminal sequence of VD appeared to bind with approximately the same affinity as the hsp90 C-12 control. A second generation pseudo-random phage library produced a large number of phages with an LD C-terminus. These sequences acted as hsp70 analogues and had relatively low affinities for hsp90-specific TPR domains. Unfortunately, we failed to identify residues near hsp90's C-terminus that impart binding specificity to individual hsp90-TPR interactions. The results suggest that the C-terminal sequences of hsp70 and hsp90 act primarily as non-specific anchors for TPR proteins.

  18. A two-step recognition of signal sequences determines the translocation efficiency of proteins.

    OpenAIRE

    Belin, D; Bost, S; Vassalli, J D; Strub, K

    1996-01-01

    The cytosolic and secreted, N-glycosylated, forms of plasminogen activator inhibitor-2 (PAI-2) are generated by facultative translocation. To study the molecular events that result in the bi-topological distribution of proteins, we determined in vitro the capacities of several signal sequences to bind the signal recognition particle (SRP) during targeting, and to promote vectorial transport of murine PAI-2 (mPAI-2). Interestingly, the six signal sequences we compared (mPAI-2 and three mutated...

  19. A novel approach to sequence validating protein expression clones with automated decision making

    Directory of Open Access Journals (Sweden)

    Mohr Stephanie E

    2007-06-01

    Background: Whereas the molecular assembly of protein expression clones is readily automated and routinely accomplished in high throughput, sequence verification of these clones is still largely performed manually, an arduous and time-consuming process. The ultimate goal of validation is to determine if a given plasmid clone matches its reference sequence sufficiently to be "acceptable" for use in protein expression experiments. Given the accelerating increase in availability of tens of thousands of unverified clones, there is a strong demand for rapid, efficient and accurate software that automates clone validation. Results: We have developed an Automated Clone Evaluation (ACE) system, the first comprehensive, multi-platform, web-based plasmid sequence verification software package. ACE automates the clone verification process by defining each clone sequence as a list of multidimensional discrepancy objects, each describing a difference between the clone and its expected sequence including the resulting polypeptide consequences. To evaluate clones automatically, this list can be compared against user acceptance criteria that specify the allowable number of discrepancies of each type. This strategy allows users to re-evaluate the same set of clones against different acceptance criteria as needed for use in other experiments. ACE manages the entire sequence validation process including contig management, identifying and annotating discrepancies, determining if discrepancies correspond to polymorphisms and clone finishing. Designed to manage thousands of clones simultaneously, ACE maintains a relational database to store information about clones at various completion stages, project processing parameters and acceptance criteria. In a direct comparison, the automated analysis by ACE took less time and was more accurate than a manual analysis of a 93 gene clone set. Conclusion: ACE was designed to facilitate high throughput clone sequence
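    The evaluation strategy described in this record, a list of discrepancy objects checked against per-type limits, can be sketched as below. The class, field names and type labels are hypothetical and do not reflect ACE's actual data model; treating unlisted types as disallowed is also an assumption of the example.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Discrepancy:
    position: int  # position in the reference sequence
    kind: str      # hypothetical labels, e.g. "silent", "missense", "frameshift"


def evaluate(discrepancies: list[Discrepancy], criteria: dict[str, int]) -> bool:
    """Accept a clone if no discrepancy type exceeds its allowed count.

    Types missing from `criteria` are treated as not allowed (limit 0).
    """
    counts = Counter(d.kind for d in discrepancies)
    return all(n <= criteria.get(kind, 0) for kind, n in counts.items())


clone = [Discrepancy(120, "silent"), Discrepancy(301, "missense")]
strict = {"silent": 2}
lenient = {"silent": 2, "missense": 1}
print(evaluate(clone, strict))    # False
print(evaluate(clone, lenient))   # True
```

    Because the discrepancy list is stored separately from the criteria, the same clone can be re-evaluated under different criteria without re-analysing its sequence, which is the reuse the abstract describes.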

  20. Complete nucleotide sequence and genome organization of Olive latent virus 3, a new putative member of the family Tymoviridae.

    Science.gov (United States)

    Alabdullah, Abdulkader; Minafra, Angelantonio; Elbeaino, Toufic; Saponari, Maria; Savino, Vito; Martelli, Giovanni P

    2010-09-01

    The complete nucleotide sequence and genome organization were determined for a putative new member of the family Tymoviridae, tentatively named Olive latent virus 3 (OLV-3), recovered in southern Italy from a symptomless olive tree. The sequenced ssRNA genome comprises 7148 nucleotides excluding the poly(A) tail and contains four open reading frames (ORFs). ORF1 encodes a polyprotein of 221.6kDa in size, containing the conserved signatures of the methyltransferase (MTR), papain-like protease (PRO), helicase (HEL) and RNA-dependent RNA polymerase (RdRp) domains of the replication-associated proteins of positive-strand RNA viruses. ORF2 completely overlaps ORF1 and encodes a putative protein of 43.33kDa showing limited sequence similarity with the putative movement protein of Maize rayado fino virus (MRFV). ORF3 codes for a protein with predicted molecular mass of 28.46kDa, identified as the coat protein (CP), whereas ORF4 overlaps ORF3 and encodes a putative protein of 16kDa with sequence similarity to the p16 and p31 proteins of Citrus sudden death-associated virus (CSDaV) and Grapevine fleck virus (GFkV), respectively. Within the family Tymoviridae, the OLV-3 genome has the closest identity level (49-52%) with members of the genus Marafivirus, from which, however, it differs because of the diverse genome organization and the presence of a single type of CP subunit. Copyright (c) 2010 Elsevier B.V. All rights reserved.

  1. Hidden Markov model-derived structural alphabet for proteins: the learning of protein local shapes captures sequence specificity.

    Science.gov (United States)

    Camproux, A C; Tufféry, P

    2005-08-05

    Understanding and predicting protein structures depend on the complexity and the accuracy of the models used to represent them. We have recently set up a Hidden Markov Model to optimally compress protein three-dimensional conformations into a one-dimensional series of letters of a structural alphabet. Such a model learns simultaneously the shape of representative structural letters describing the local conformation and the logic of their connections, i.e. the transition matrix between the letters. Here, we move one step further and report some evidence that such a model of protein local architecture also captures some accurate amino acid features. All the letters have specific and distinct amino acid distributions. Moreover, we show that words of amino acids can have significant propensities for some letters. Perspectives point towards the prediction of the series of letters describing the structure of a protein from its amino acid sequence.

  2. The two capsid proteins of maize rayado fino virus contain common peptide sequences.

    Science.gov (United States)

    Falk, B W; Tsai, J H

    1986-01-01

    Virions of maize rayado fino virus (MRFV) were purified and two major capsid proteins (ca. Mr 29,000 and 22,000) were resolved by SDS-PAGE. When the two major capsid proteins were isolated from gels and compared by one-dimensional peptide mapping after digestion with Staphylococcus aureus V-8 protease, indistinguishable peptide maps were obtained, suggesting that these two proteins contain common peptide sequences. Some preparations also showed minor protein components that were intermediate between the Mr 22,000 and Mr 29,000 capsid proteins. One of the minor proteins, ca. Mr 27,000, gave a peptide map indistinguishable from the major capsid proteins. In vitro ageing of partially purified preparations or virion treatment with proteolytic enzymes failed to show conversion of the Mr 29,000 protein to a Mr 22,000. Protease inhibitors added to the buffers used for virion purification did not affect the apparent 1:3 ratio of 29,000 to 22,000 proteins in the purified preparations.

  3. Prediction of host - pathogen protein interactions between Mycobacterium tuberculosis and Homo sapiens using sequence motifs.

    Science.gov (United States)

    Huo, Tong; Liu, Wei; Guo, Yu; Yang, Cheng; Lin, Jianping; Rao, Zihe

    2015-03-26

    Emergence of multiple drug resistant strains of M. tuberculosis (MDR-TB) threatens to derail global efforts aimed at reining in the pathogen. Co-infections of M. tuberculosis with HIV are difficult to treat. To counter these new challenges, it is essential to study the interactions between M. tuberculosis and the host to learn how these bacteria cause disease. We report a systematic flow to predict the host pathogen interactions (HPIs) between M. tuberculosis and Homo sapiens based on sequence motifs. First, protein sequences were used as initial input for identifying the HPIs by 'interolog' method. HPIs were further filtered by prediction of domain-domain interactions (DDIs). Functional annotations of protein and publicly available experimental results were applied to filter the remaining HPIs. Using such a strategy, 118 pairs of HPIs were identified, which involve 43 proteins from M. tuberculosis and 48 proteins from Homo sapiens. A biological interaction network between M. tuberculosis and Homo sapiens was then constructed using the predicted inter- and intra-species interactions based on the 118 pairs of HPIs. Finally, a web accessible database named PATH (Protein interactions of M. tuberculosis and Human) was constructed to store these predicted interactions and proteins. This interaction network will facilitate the research on host-pathogen protein-protein interactions, and may throw light on how M. tuberculosis interacts with its host.
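    The first two stages of the workflow in this record (interolog transfer, then a domain-domain interaction filter) can be sketched as follows. This is an illustrative outline rather than the authors' pipeline: the function names, identifiers and lookup tables are all hypothetical, and homolog detection itself is assumed to have been done beforehand.

```python
def infer_interologs(template_ppis, pathogen_homologs, host_homologs):
    """Transfer template interactions onto pathogen-host pairs via homologs.

    template_ppis:      iterable of (protein_a, protein_b) known interactions
    pathogen_homologs:  template protein -> list of pathogen proteins
    host_homologs:      template protein -> list of host proteins
    """
    predicted = set()
    for a, b in template_ppis:
        for x, y in ((a, b), (b, a)):  # template interactions are undirected
            for p in pathogen_homologs.get(x, ()):
                for h in host_homologs.get(y, ()):
                    predicted.add((p, h))
    return predicted


def filter_by_ddi(pairs, domains, known_ddis):
    """Keep pairs in which some pathogen domain and host domain are known to interact."""
    keep = set()
    for p, h in pairs:
        if any(frozenset((dp, dh)) in known_ddis
               for dp in domains.get(p, ())
               for dh in domains.get(h, ())):
            keep.add((p, h))
    return keep


# Toy data with made-up identifiers.
templates = [("T1", "T2"), ("T3", "T4")]
pathogen = {"T1": ["Mtb_A"], "T3": ["Mtb_B"]}
host = {"T2": ["Hs_X"], "T4": ["Hs_Y"]}
domains = {"Mtb_A": ["D1"], "Hs_X": ["D2"], "Mtb_B": ["D3"], "Hs_Y": ["D4"]}
ddis = {frozenset(("D1", "D2"))}

candidates = infer_interologs(templates, pathogen, host)
print(sorted(candidates))                               # [('Mtb_A', 'Hs_X'), ('Mtb_B', 'Hs_Y')]
print(sorted(filter_by_ddi(candidates, domains, ddis)))  # [('Mtb_A', 'Hs_X')]
```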

  4. Sequencing Larger Intact Proteins (30-70 kDa) with Activated Ion Electron Transfer Dissociation

    Science.gov (United States)

    Riley, Nicholas M.; Westphall, Michael S.; Coon, Joshua J.

    2018-01-01

    The analysis of intact proteins via mass spectrometry can offer several benefits to proteome characterization, although the majority of top-down experiments focus on proteoforms in a relatively low mass range (AI-ETD) to proteins in the 30-70 kDa range. AI-ETD leverages infrared photo-activation concurrent to ETD reactions to improve sequence-informative product ion generation. This method generates more product ions and greater sequence coverage than conventional ETD, higher-energy collisional dissociation (HCD), and ETD combined with supplemental HCD activation (EThcD). Importantly, AI-ETD provides the most thorough protein characterization for every precursor ion charge state investigated in this study, making it suitable as a universal fragmentation method in top-down experiments. Additionally, we highlight several acquisition strategies that can benefit characterization of larger proteins with AI-ETD, including combination of spectra from multiple ETD reaction times for a given precursor ion, multiple spectral acquisitions of the same precursor ion, and combination of spectra from two different dissociation methods (e.g., AI-ETD and HCD). In all, AI-ETD shows great promise as a method for dissociating larger intact protein ions as top-down proteomics continues to advance into larger mass ranges.

  5. Structural protein descriptors in 1-dimension and their sequence-based predictions.

    Science.gov (United States)

    Kurgan, Lukasz; Disfani, Fatemeh Miri

    2011-09-01

    The last few decades have seen increasing interest in the development and application of 1-dimensional (1D) descriptors of protein structure. These descriptors project 3D structural features onto 1D strings of residue-wise structural assignments. They cover a wide range of structural aspects including conformation of the backbone, burying depth/solvent exposure and flexibility of residues, and inter-chain residue-residue contacts. We perform a first-of-its-kind comprehensive comparative review of the existing 1D structural descriptors. We define, review and categorize ten structural descriptors and we also describe, summarize and contrast over eighty computational models that are used to predict these descriptors from the protein sequences. We show that the majority of the recent sequence-based predictors utilize machine learning models, with the most popular being neural networks, support vector machines, hidden Markov models, and support vector and linear regressions. These methods provide high-throughput predictions and most of them are accessible to a non-expert user via web servers and/or stand-alone software packages. We empirically evaluate several recent sequence-based predictors of secondary structure, disorder, and solvent accessibility descriptors using a benchmark set based on CASP8 targets. Our analysis shows that the secondary structure can be predicted with over 80% accuracy and segment overlap (SOV), disorder with over 0.9 AUC, 0.6 Matthews Correlation Coefficient (MCC), and 75% SOV, and relative solvent accessibility with PCC of 0.7 and MCC of 0.6 (0.86 when homology is used). We demonstrate that the secondary structure predicted from sequence without the use of homology modeling is as good as the structure extracted from the 3D folds predicted by top-performing template-based methods.
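    One of the scores quoted in this record, the Matthews Correlation Coefficient, is computed from the four cells of a binary confusion matrix. A minimal sketch using the standard formula follows; the labels are made up and are not data from the review, and returning 0.0 for a zero denominator is a common convention adopted here as an assumption.

```python
import math


def mcc(y_true: list[int], y_pred: list[int]) -> float:
    """Matthews Correlation Coefficient for binary labels (1 = positive, 0 = negative)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Made-up per-residue labels, e.g. 1 = disordered, 0 = ordered.
truth = [1, 1, 1, 0, 0, 0, 0, 1]
pred = [1, 1, 0, 0, 0, 1, 0, 1]
print(mcc(truth, pred))   # 0.5
```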

  6. Shelf-life of fresh blueberries coated with quinoa protein/chitosan/sunflower oil edible film.

    Science.gov (United States)

    Abugoch, Lilian; Tapia, Cristián; Plasencia, Dora; Pastor, Ana; Castro-Mandujano, Olivio; López, Luis; Escalona, Victor H

    2016-01-30

    The aim of this study was to evaluate quinoa protein (Q), chitosan (CH) and sunflower oil (SO) as edible film material as well as the influence of this coating in extending the shelf-life of fresh blueberries stored at 4 °C and 75% relative humidity. These conditions were used to simulate the storage conditions in supermarkets and represent adverse conditions for testing the effects of the coating. The mechanical, barrier, and structural properties of the film were measured. The effectiveness of the coating in fresh blueberries (CB) was evaluated by changes in weight loss, firmness, color, molds and yeast count, pH, titratable acidity, and soluble solids content. The tensile strength and elongation at break of the edible film were 0.45 ± 0.29 MPa and 117.2% ± 7%, respectively. The water vapor permeability was 3.3 × 10⁻¹² ± 4.0 × 10⁻¹³ g s⁻¹ m⁻¹ Pa⁻¹. In all of the color parameters CB presented significant differences. CB had slightly delayed fruit ripening as evidenced by higher titratable acidity (0.3-0.5 g citric acid 100 g⁻¹) and lower pH (3.4-3.6) than control during storage; however, it showed reduced firmness (up to 38%). The use of Q/CH/SO as a coating in fresh blueberries was able to control the growth of molds and yeasts during 32 days of storage, whereas the control showed an increase in molds and yeasts of between 1.8 and 3.1 log cycles (between 20 and 35 days). © 2015 Society of Chemical Industry.

  7. Using structural knowledge in the protein data bank to inform the search for potential host-microbe protein interactions in sequence space: application to Mycobacterium tuberculosis.

    Science.gov (United States)

    Mahajan, Gaurang; Mande, Shekhar C

    2017-04-04

    A comprehensive map of the human-M. tuberculosis (MTB) protein interactome would help fill the gaps in our understanding of the disease, and computational prediction can aid and complement experimental studies towards this end. Several sequence-based in silico approaches tap the existing data on experimentally validated protein-protein interactions (PPIs); these PPIs serve as templates from which novel interactions between pathogen and host are inferred. Such comparative approaches typically make use of local sequence alignment, which, in the absence of structural details about the interfaces mediating the template interactions, could lead to incorrect inferences, particularly when multi-domain proteins are involved. We propose leveraging the domain-domain interaction (DDI) information in PDB complexes to score and prioritize candidate PPIs between host and pathogen proteomes based on targeted sequence-level comparisons. Our method picks out a small set of human-MTB protein pairs as candidates for physical interactions, and the use of functional meta-data suggests that some of them could contribute to the in vivo molecular cross-talk between pathogen and host that regulates the course of the infection. Further, we present numerical data for Pfam domain families that highlights interaction specificity on the domain level. Not every instance of a pair of domains, for which interaction evidence has been found in a few instances (i.e. structures), is likely to functionally interact. Our sorting approach scores candidates according to how "distant" they are in sequence space from known examples of DDIs (templates). Thus, it provides a natural way to deal with the heterogeneity in domain-level interactions. Our method represents a more informed application of local alignment to the sequence-based search for potential human-microbial interactions that uses available PPI data as a prior. Our approach is somewhat limited in its sensitivity by the restricted size and

  8. Recyclability of PET/WPI/PE Multilayer Films by Removal of Whey Protein Isolate-Based Coatings with Enzymatic Detergents

    Directory of Open Access Journals (Sweden)

    Patrizia Cinelli

    2016-06-01

    Multilayer plastic films provide a range of properties, which cannot be obtained from monolayer films but, at present, their recyclability is an open issue and should be improved. Research to date has shown the possibility of using whey protein as a layer material with the property of acting as an excellent barrier against oxygen and moisture, replacing petrochemical non-recyclable materials. The innovative approach of the present research was to achieve the recyclability of the substrate films by separating them, with a simple process compatible with industrial procedures, in order to promote recycling processes leading to obtain high value products that will beneficially impact the packaging and food industries. Hence, polyethyleneterephthalate (PET)/polyethylene (PE) multi-layer film was prepared based on PET coated with a whey protein layer, and then the previous structure was laminated with PE. Whey proteins, constituting the coating, can be degraded by enzymes so that the coating films can be washed off from the plastic substrate layer. Enzyme types, dosage, time, and temperature optima, which are compatible with procedures adopted in industrial waste recycling, were determined for a highly-efficient process. The washing of samples based on PET/whey and PET/whey/PE were efficient when performed with enzymatic detergent containing protease enzymes, as an alternative to conventional detergents used in recycling facilities. Different types of enzymatic detergents tested presented positive results in removing the protein layer from the PET substrate and from the PET/whey/PE multilayer films at room temperature. These results attested to the possibility of organizing the pre-treatment of the whey-based multilayer film by washing with different available commercial enzymatic detergents in order to separate PET and PE, thus allowing a better recycling of the two different polymers. 
Mechanical properties of the plastic substrate, such as stress at

  9. Recyclability of PET/WPI/PE Multilayer Films by Removal of Whey Protein Isolate-Based Coatings with Enzymatic Detergents

    Science.gov (United States)

    Cinelli, Patrizia; Schmid, Markus; Bugnicourt, Elodie; Coltelli, Maria Beatrice; Lazzeri, Andrea

    2016-01-01

    Multilayer plastic films provide a range of properties, which cannot be obtained from monolayer films but, at present, their recyclability is an open issue and should be improved. Research to date has shown the possibility of using whey protein as a layer material with the property of acting as an excellent barrier against oxygen and moisture, replacing petrochemical non-recyclable materials. The innovative approach of the present research was to achieve the recyclability of the substrate films by separating them, with a simple process compatible with industrial procedures, in order to promote recycling processes leading to obtain high value products that will beneficially impact the packaging and food industries. Hence, polyethyleneterephthalate (PET)/polyethylene (PE) multi-layer film was prepared based on PET coated with a whey protein layer, and then the previous structure was laminated with PE. Whey proteins, constituting the coating, can be degraded by enzymes so that the coating films can be washed off from the plastic substrate layer. Enzyme types, dosage, time, and temperature optima, which are compatible with procedures adopted in industrial waste recycling, were determined for a highly-efficient process. The washing of samples based on PET/whey and PET/whey/PE were efficient when performed with enzymatic detergent containing protease enzymes, as an alternative to conventional detergents used in recycling facilities. Different types of enzymatic detergents tested presented positive results in removing the protein layer from the PET substrate and from the PET/whey/PE multilayer films at room temperature. These results attested to the possibility of organizing the pre-treatment of the whey-based multilayer film by washing with different available commercial enzymatic detergents in order to separate PET and PE, thus allowing a better recycling of the two different polymers. 
Mechanical properties of the plastic substrate, such as stress at yield, stress and

  10. Insights into Protein Sequence and Structure-Derived Features Mediating 3D Domain Swapping Mechanism using Support Vector Machine Based Approach

    Directory of Open Access Journals (Sweden)

    Khader Shameer

    2010-06-01

    3-dimensional domain swapping is a mechanism where two or more protein molecules form higher order oligomers by exchanging identical or similar subunits. Recently, this phenomenon has received much attention in the context of prions and neuro-degenerative diseases, due to its role in the functional regulation, formation of higher oligomers, protein misfolding, aggregation etc. While 3-dimensional domain swap mechanism can be detected from three-dimensional structures, it remains a formidable challenge to derive common sequence or structural patterns from proteins involved in swapping. We have developed a SVM-based classifier to predict domain swapping events using a set of features derived from sequence and structural data. The SVM classifier was trained on features derived from 150 proteins reported to be involved in 3D domain swapping and 150 proteins not known to be involved in swapped conformation or related to proteins involved in swapping phenomenon. The testing was performed using 63 proteins from the positive dataset and 63 proteins from the negative dataset. We obtained 76.33% accuracy from training and 73.81% accuracy from testing. Due to high diversity in the sequence, structure and functions of proteins involved in domain swapping, availability of such an algorithm to predict swapping events from sequence and structure-derived features will be an initial step towards identification of more putative proteins that may be involved in swapping or proteins involved in deposition disease. Further, the top features emerging in our feature selection method may be analysed further to understand their roles in the mechanism of domain swapping.
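    The accuracies reported in this record can be related back to the dataset sizes with simple arithmetic. The sketch below assumes that accuracy means correct predictions divided by total proteins and that each figure was rounded to two decimals; the abstract states neither explicitly, and the split between true positives and true negatives cannot be recovered from it.

```python
def consistent_counts(n_total: int, reported_pct: float, digits: int = 2) -> list[int]:
    """All counts of correct predictions that round to the reported accuracy."""
    return [c for c in range(n_total + 1)
            if round(100.0 * c / n_total, digits) == reported_pct]


print(consistent_counts(63 + 63, 73.81))     # [93]
print(consistent_counts(150 + 150, 76.33))   # [229]
```

    Under these assumptions, 73.81% on the 126 test proteins corresponds to 93 correct predictions, and 76.33% on the 300 training proteins to 229.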

  11. RCK: accurate and efficient inference of sequence- and structure-based protein-RNA binding models from RNAcompete data.

    Science.gov (United States)

    Orenstein, Yaron; Wang, Yuhao; Berger, Bonnie

    2016-06-15

    Protein-RNA interactions, which play vital roles in many processes, are mediated through both RNA sequence and structure. CLIP-based methods, which measure protein-RNA binding in vivo, suffer from experimental noise and systematic biases, whereas in vitro experiments capture a clearer signal of protein RNA-binding. Among them, RNAcompete provides binding affinities of a specific protein to more than 240 000 unstructured RNA probes in one experiment. The computational challenge is to infer RNA structure- and sequence-based binding models from these data. The state-of-the-art in sequence models, Deepbind, does not model structural preferences. RNAcontext models both sequence and structure preferences, but is outperformed by GraphProt. Unfortunately, GraphProt cannot detect structural preferences from RNAcompete data due to the unstructured nature of the data, as noted by its developers, nor can it be tractably run on the full RNACompete dataset. We develop RCK, an efficient, scalable algorithm that infers both sequence and structure preferences based on a new k-mer based model. Remarkably, even though RNAcompete data is designed to be unstructured, RCK can still learn structural preferences from it. RCK significantly outperforms both RNAcontext and Deepbind in in vitro binding prediction for 244 RNAcompete experiments. Moreover, RCK is also faster and uses less memory, which enables scalability. While currently on par with existing methods in in vivo binding prediction on a small scale test, we demonstrate that RCK will increasingly benefit from experimentally measured RNA structure profiles as compared to computationally predicted ones. By running RCK on the entire RNAcompete dataset, we generate and provide as a resource a set of protein-RNA structure-based models on an unprecedented scale. Software and models are freely available at http://rck.csail.mit.edu/. Contact: bab@mit.edu. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by
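    The k-mer representation that this record's model builds on can be illustrated with a toy additive score over an RNA probe. This is not the RCK model, which the abstract says also incorporates structure preferences; the function names, the weights and the probe are made up for the example.

```python
from collections import Counter


def kmer_counts(seq: str, k: int) -> Counter:
    """Count every overlapping k-mer in an RNA sequence (T is read as U)."""
    seq = seq.upper().replace("T", "U")
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def additive_score(seq: str, weights: dict[str, float], k: int) -> float:
    """Sum of per-k-mer weights over the probe; unlisted k-mers contribute 0."""
    counts = kmer_counts(seq, k)
    return sum(weights.get(kmer, 0.0) * n for kmer, n in counts.items())


probe = "GCAUGCAUGG"                         # made-up probe
toy_weights = {"GCAU": 1.5, "UGCA": 0.5}     # hypothetical binding preferences
counts = kmer_counts(probe, 4)
print(counts["GCAU"], sum(counts.values()))  # 2 7
print(additive_score(probe, toy_weights, 4)) # 3.5
```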

  12. Effects of orthopedic implants with a polycaprolactone polymer coating containing bone morphogenetic protein-2 on osseointegration in bones of sheep.

    Science.gov (United States)

    Niehaus, Andrew J; Anderson, David E; Samii, Valerie F; Weisbrode, Steven E; Johnson, Jed K; Noon, Mike S; Tomasko, David L; Lannutti, John J

    2009-11-01

    To determine elution characteristics of bone morphogenetic protein (BMP)-2 from a polycaprolactone coating applied to orthopedic implants and determine effects of this coating on osseointegration. 6 sheep. An in vitro study was conducted to determine BMP-2 elution from polycaprolactone-coated implants. An in vivo study was conducted to determine the effects on osseointegration when the polycaprolactone with BMP-2 coating was applied to bone screws. Osseointegration was assessed via radiography, measurement of peak removal torque and bone mineral density, and histomorphometric analysis. Physiologic response was assessed by measuring serum bone-specific alkaline phosphatase activity and uptake of bone markers. Mean ± SD elution on day 1 of the in vitro study was 263 ± 152 pg/d, which then maintained a plateau at 59.8 ± 29.1 pg/d. Mean peak removal torque for screws coated with polycaprolactone and BMP-2 (0.91 ± 0.65 dN·m) and screws coated with polycaprolactone alone (0.97 ± 1.30 dN·m) did not differ significantly from that for the control screws (2.34 ± 1.62 dN·m). Mean bone mineral densities were 0.535 ± 0.060 g/cm², 0.596 ± 0.093 g/cm², and 0.524 ± 0.142 g/cm² for the polycaprolactone-BMP-2-coated, polycaprolactone-coated, and control screws, respectively, and did not differ significantly among groups. Histologically, bone was in closer apposition to the implant with the control screws than with either of the coated screws. BMP-2 within the polycaprolactone coating did not stimulate osteogenesis. The polycaprolactone coating appeared to cause a barrier effect that prevented formation of new bone. A longer period or use of another carrier polymer may result in increased osseointegration.

  13. Creation and structure determination of an artificial protein with three complete sequence repeats

    Energy Technology Data Exchange (ETDEWEB)

    Adachi, Motoyasu, E-mail: adachi.motoyasu@jaea.go.jp; Shimizu, Rumi; Kuroki, Ryota [Japan Atomic Energy Agency, Shirakatashirane 2-4, Nakagun Tokaimura, Ibaraki 319-1195 (Japan); Blaber, Michael [Japan Atomic Energy Agency, Shirakatashirane 2-4, Nakagun Tokaimura, Ibaraki 319-1195 (Japan); Florida State University, Tallahassee, FL 32306-4300 (United States)

    2013-11-01

    An artificial protein with three complete sequence repeats was created and the structure was determined by X-ray crystallography. The structure showed threefold symmetry even though the protein has an amino and a carboxy terminus. The artificial protein with threefold symmetry may be useful as a scaffold to capture small materials with C3 symmetry. Symfoil-4P is a de novo protein exhibiting the threefold symmetrical β-trefoil fold designed based on the human acidic fibroblast growth factor. The first three asparagine–glycine sequences of Symfoil-4P were replaced with glutamine–glycine (Symfoil-QG) or serine–glycine (Symfoil-SG) sequences to protect against deamidation, and His-Symfoil-II was prepared by introducing a protease digestion site into Symfoil-QG so that Symfoil-II has three complete repeats after removal of the N-terminal histidine tag. The Symfoil-QG, Symfoil-SG and His-Symfoil-II proteins were expressed in Escherichia coli as soluble protein and purified by nickel affinity chromatography. Symfoil-II was further purified by anion-exchange chromatography after removing the His tag by proteolysis. Both Symfoil-QG and Symfoil-II were crystallized in 0.1 M Tris-HCl buffer (pH 7.0) containing 1.8 M ammonium sulfate as precipitant at 293 K; several crystal forms were observed for Symfoil-QG and Symfoil-II. The maximum diffraction resolutions of the Symfoil-QG and Symfoil-II crystals were 1.5 and 1.1 Å, respectively. Symfoil-II without the histidine tag diffracted better than Symfoil-QG with the N-terminal histidine tag. Although the crystal packing of Symfoil-II is slightly different from that of Symfoil-QG and other crystals of Symfoil derivatives having the N-terminal histidine tag, the refined crystal structure of Symfoil-II showed pseudo-threefold symmetry as expected from other Symfoils. Since the removal of the unstructured N-terminal histidine tag did not affect the threefold structure of Symfoil, the improvement of diffraction quality of Symfoil-II may be caused by molecular characteristics of

  14. Large scale identification and categorization of protein sequences using structured logistic regression

    DEFF Research Database (Denmark)

    Pedersen, Bjørn Panella; Ifrim, Georgiana; Liboriussen, Poul

    2014-01-01

    Background: Structured Logistic Regression (SLR) is a newly developed machine learning tool first proposed in the context of text categorization. Current availability of extensive protein sequence databases calls for an automated method to reliably classify sequences and SLR seems well...... problem. Results: Using SLR, we have built classifiers to identify and automatically categorize P-type ATPases into one of 11 pre-defined classes. The SLR-classifiers are compared to a Hidden Markov Model approach and shown to be highly accurate and scalable. Representing the bulk of currently known...... for further biochemical characterization and structural analysis....

  15. Architecture and assembly of the Bacillus subtilis spore coat.

    Science.gov (United States)

    Plomp, Marco; Carroll, Alicia Monroe; Setlow, Peter; Malkin, Alexander J

    2014-01-01

    Bacillus spores are encased in a multilayer, proteinaceous self-assembled coat structure that assists in protecting the bacterial genome from stresses and consists of at least 70 proteins. The elucidation of Bacillus spore coat assembly, architecture, and function is critical to determining mechanisms of spore pathogenesis, environmental resistance, immune response, and physicochemical properties. Recently, genetic, biochemical and microscopy methods have provided new insight into spore coat architecture, assembly, structure and function. However, detailed spore coat architecture and assembly, comprehensive understanding of the proteomic composition of coat layers, and specific roles of coat proteins in coat assembly and their precise localization within the coat remain in question. In this study, atomic force microscopy was used to probe the coat structure of Bacillus subtilis wild type and cotA, cotB, safA, cotH, cotO, cotE, gerE, and cotE gerE spores. This approach provided high-resolution visualization of the various spore coat structures, new insight into the function of specific coat proteins, and enabled the development of a detailed model of spore coat architecture. This model is consistent with a recently reported four-layer coat assembly and further adds several coat layers not reported previously. The coat is organized starting from the outside into an outermost amorphous (crust) layer, a rodlet layer, a honeycomb layer, a fibrous layer, a layer of "nanodot" particles, a multilayer assembly, and finally the undercoat/basement layer. We propose that the assembly of the previously unreported fibrous layer, which we link to the darkly stained outer coat seen by electron microscopy, and the nanodot layer are cotH- and cotE- dependent and cotE-specific respectively. We further propose that the inner coat multilayer structure is crystalline with its apparent two-dimensional (2D) nuclei being the first example of a non-mineral 2D nucleation crystallization

  16. Architecture and Assembly of the Bacillus subtilis Spore Coat

    Science.gov (United States)

    Plomp, Marco; Carroll, Alicia Monroe; Setlow, Peter; Malkin, Alexander J.

    2014-01-01

    Bacillus spores are encased in a multilayer, proteinaceous self-assembled coat structure that assists in protecting the bacterial genome from stresses and consists of at least 70 proteins. The elucidation of Bacillus spore coat assembly, architecture, and function is critical to determining mechanisms of spore pathogenesis, environmental resistance, immune response, and physicochemical properties. Recently, genetic, biochemical and microscopy methods have provided new insight into spore coat architecture, assembly, structure and function. However, detailed spore coat architecture and assembly, comprehensive understanding of the proteomic composition of coat layers, and specific roles of coat proteins in coat assembly and their precise localization within the coat remain in question. In this study, atomic force microscopy was used to probe the coat structure of Bacillus subtilis wild type and cotA, cotB, safA, cotH, cotO, cotE, gerE, and cotE gerE spores. This approach provided high-resolution visualization of the various spore coat structures, new insight into the function of specific coat proteins, and enabled the development of a detailed model of spore coat architecture. This model is consistent with a recently reported four-layer coat assembly and further adds several coat layers not reported previously. The coat is organized starting from the outside into an outermost amorphous (crust) layer, a rodlet layer, a honeycomb layer, a fibrous layer, a layer of “nanodot” particles, a multilayer assembly, and finally the undercoat/basement layer. We propose that the assembly of the previously unreported fibrous layer, which we link to the darkly stained outer coat seen by electron microscopy, and the nanodot layer are cotH- and cotE- dependent and cotE-specific respectively. We further propose that the inner coat multilayer structure is crystalline with its apparent two-dimensional (2D) nuclei being the first example of a non-mineral 2D nucleation crystallization

  17. Protein-Protein Interactions Prediction Using a Novel Local Conjoint Triad Descriptor of Amino Acid Sequences

    Directory of Open Access Journals (Sweden)

    Jun Wang

    2017-11-01

    Protein-protein interactions (PPIs) play crucial roles in almost all cellular processes. Although a large number of PPIs have been verified by high-throughput techniques in the past decades, the set of currently known PPI pairs is still far from complete. Furthermore, wet-lab techniques for detecting PPIs are time-consuming and expensive. Hence, it is urgent and essential to develop automatic computational methods to efficiently and accurately predict PPIs. In this paper, a sequence-based approach called DNN-LCTD is developed by combining deep neural networks (DNNs) and a novel local conjoint triad description (LCTD) feature representation. LCTD incorporates the advantages of local description and conjoint triad; thus, it is capable of accounting for the interactions between residues in both continuous and discontinuous regions of amino acid sequences. DNNs can not only learn suitable features from the data by themselves, but also learn and discover hierarchical representations of data. When applied to the PPI data of Saccharomyces cerevisiae, DNN-LCTD achieves superior performance, with an accuracy of 93.12%, precision of 93.75%, sensitivity of 93.83%, and area under the receiver operating characteristic curve (AUC) of 97.92%, and it needs only 718 s. These results indicate that DNN-LCTD is very promising for predicting PPIs. DNN-LCTD can be a useful supplementary tool for future proteomics studies.
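
    The abstract above combines a conjoint triad encoding with a local (region-wise) description but does not spell out either. The sketch below shows one plausible reading, under stated assumptions: amino acids are mapped to the seven-class grouping commonly used for conjoint triads, every run of three consecutive residues is counted as one of 7 x 7 x 7 = 343 class triads, and the sequence is split into equal contiguous regions whose triad counts are concatenated. The region scheme, the function names and the default `n_regions=3` are hypothetical and are not taken from the paper.

```python
# Seven-class amino acid grouping commonly used for conjoint triads
# (assumed here; the abstract does not list the classes).
CLASSES = ["AGV", "ILFP", "YMTS", "HNQW", "RK", "DE", "C"]
AA_TO_CLASS = {aa: i for i, group in enumerate(CLASSES) for aa in group}


def conjoint_triad(seq: str) -> list:
    """Count class triads over all windows of three consecutive residues."""
    counts = [0] * 343
    s = seq.upper()
    for i in range(len(s) - 2):
        window = s[i:i + 3]
        # Skip windows containing residues outside the 20 standard amino acids.
        if all(aa in AA_TO_CLASS for aa in window):
            a, b, c = (AA_TO_CLASS[aa] for aa in window)
            counts[a * 49 + b * 7 + c] += 1
    return counts


def local_triads(seq: str, n_regions: int = 3) -> list:
    """Concatenate triad counts of equal contiguous regions (hypothetical scheme)."""
    size = -(-len(seq) // n_regions)  # ceiling division
    features = []
    for r in range(n_regions):
        features.extend(conjoint_triad(seq[r * size:(r + 1) * size]))
    return features


print(sum(conjoint_triad("AKDCAGV")))   # number of valid triads in a toy peptide
print(len(local_triads("AKDCAGVILFP")))
```

    Feature vectors for the two proteins of a candidate pair would then typically be concatenated before being passed to a classifier.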

  18. Accelerated differentiation of osteoblast cells on polycaprolactone scaffolds driven by a combined effect of protein coating and plasma modification

    Energy Technology Data Exchange (ETDEWEB)

    Yildirim, Eda D; Gueceri, Selcuk; Sun, Wei [Department of Mechanical Engineering and Mechanics, Drexel University, 3141 Chestnut Street, Philadelphia, PA 19104 (United States); Besunder, Robyn; Allen, Fred [Drexel University, School of Biomedical Engineering Science and Health System, 3141 Chestnut Street, Philadelphia, PA 19104 (United States); Pappas, Daphne, E-mail: edy22@drexel.ed [Army Research Laboratory, Aberdeen Proving Ground, MD 21005 (United States)

    2010-03-15

    A combined effect of protein coating and plasma modification on the quality of the osteoblast-scaffold interaction was investigated. Three-dimensional polycaprolactone (PCL) scaffolds were manufactured by the precision extrusion deposition (PED) system. The structural, physical, chemical and biological cues were introduced to the surface through providing 3D structure, coating with adhesive protein fibronectin and modifying the surface with oxygen-based plasma. The changes in the surface properties of PCL after those modifications were examined by contact angle goniometry, surface energy calculation, surface chemistry analysis (XPS) and surface topography measurements (AFM). The effects of modification techniques on osteoblast short-term and long-term functions were examined by cell adhesion, proliferation assays and differentiation markers, namely alkaline phosphatase activity (ALP) and osteocalcin secretion. The results suggested that the physical and chemical cues introduced by plasma modification might be sufficient for improved cell adhesion, but for accelerated osteoblast differentiation the synergetic effects of structural, physical, chemical and biological cues should be introduced to the PCL surface.

  19. Sequence analysis of the L protein of the Ebola 2014 outbreak: Insight into conserved regions and mutations.

    Science.gov (United States)

    Ayub, Gohar; Waheed, Yasir

    2016-06-01

    The 2014 Ebola outbreak was one of the largest that have occurred; it started in Guinea and spread to Nigeria, Liberia and Sierra Leone. Phylogenetic analysis of the current virus species indicated that this outbreak is the result of a divergent lineage of the Zaire ebolavirus. The L protein of Ebola virus (EBOV) is the catalytic subunit of the RNA‑dependent RNA polymerase complex, which, with VP35, is key for the replication and transcription of viral RNA. Earlier sequence analysis demonstrated that the L protein of all non‑segmented negative‑sense (NNS) RNA viruses consists of six domains containing conserved functional motifs. The aim of the present study was to analyze the presence of these motifs in 2014 EBOV isolates, highlight their function and how they may contribute to the overall pathogenicity of the isolates. For this purpose, 81 2014 EBOV L protein sequences were aligned with 475 other NNS RNA viruses, including Paramyxoviridae and Rhabdoviridae viruses. Phylogenetic analysis of all EBOV outbreak L protein sequences was also performed. Analysis of the amino acid substitutions in the 2014 EBOV outbreak was conducted using sequence analysis. The alignment demonstrated the presence of previously conserved motifs in the 2014 EBOV isolates and novel residues. Notably, all the mutations identified in the 2014 EBOV isolates were tolerant, they were pathogenic with certain examples occurring within previously determined functional conserved motifs, possibly altering viral pathogenicity, replication and virulence. The phylogenetic analysis demonstrated that all sequences with the exception of the 2014 EBOV sequences were clustered together. The 2014 EBOV outbreak has acquired a great number of mutations, which may explain the reasons behind this unprecedented outbreak. Certain residues critical to the function of the polymerase remain conserved and may be targets for the development of antiviral therapeutic agents.
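
    The abstract above relies on finding conserved positions in a multiple sequence alignment. A minimal sketch of that step follows, assuming the alignment is already available as equal-length strings with "-" for gaps. The function name `conserved_columns`, the threshold parameter and the toy sequences are hypothetical and do not reproduce the authors' pipeline or data.

```python
from collections import Counter


def conserved_columns(alignment: list, threshold: float = 1.0) -> list:
    """Return 0-based column indices whose most common residue (not a gap)
    occurs in at least `threshold` fraction of the aligned sequences."""
    length = len(alignment[0])
    if any(len(s) != length for s in alignment):
        raise ValueError("all aligned sequences must have the same length")
    columns = []
    for i in range(length):
        column = [s[i] for s in alignment]
        residue, n = Counter(column).most_common(1)[0]
        if residue != "-" and n / len(column) >= threshold:
            columns.append(i)
    return columns


# Toy alignment, for illustration only.
toy = ["GDNQ", "GDNE", "GDNQ"]
print(conserved_columns(toy))        # strictly conserved columns
print(conserved_columns(toy, 0.6))   # columns conserved in at least 60% of rows
```

    Lowering the threshold trades strict identity for tolerance of occasional substitutions, which is one simple way to flag positions where a few isolates differ from an otherwise conserved motif.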

  20. Hydrophobic cluster analysis of G protein-coupled receptors: a powerful tool to derive structural and functional information from 2D-representation of protein sequences

    NARCIS (Netherlands)

    Lentes, K.U.; Mathieu, E.; Bischoff, Rainer; Rasmussen, U.B.; Pavirani, A.

    1993-01-01

    Current methods for comparative analyses of protein sequences are 1D-alignments of amino acid sequences based on the maximization of amino acid identity (homology) and the prediction of secondary structure elements. This method has a major drawback once the amino acid identity drops below 20-25%,