WorldWideScience

Sample records for synthases genomic structures

  1. Deciphering the genomic structure, function and evolution of carotenogenesis related phytoene synthases in grasses

    Directory of Open Access Journals (Sweden)

    Dibari Bianca

    2012-06-01

    Background: Carotenoids are isoprenoid pigments, essential for photosynthesis and photoprotection in plants. The enzyme phytoene synthase (PSY) plays an essential role in mediating condensation of two geranylgeranyl diphosphate molecules, the first committed step in carotenogenesis. PSY are nuclear enzymes encoded by a small gene family consisting of three paralogous genes (PSY1-3) that have been widely characterized in rice, maize and sorghum. Results: In wheat, for which yellow pigment content is extremely important for flour colour, only PSY1 has been extensively studied because of its association with QTLs reported for yellow pigment, whereas PSY2 has been partially characterized. Here, we report the isolation of bread wheat PSY3 genes from a Renan BAC library using Brachypodium as a model genome for the Triticeae to develop Conserved Orthologous Set markers prior to gene cloning and sequencing. Wheat PSY3 homoeologous genes were sequenced and annotated, unravelling their novel structure associated with intron-loss events and consequent exonic fusions. A wheat PSY3 promoter region was also investigated for the presence of cis-acting elements involved in the response to abscisic acid (ABA), since carotenoids also play an important role as precursors of signalling molecules devoted to plant development and biotic/abiotic stress responses. Expression of wheat PSYs in leaves and roots was investigated during ABA treatment to confirm the up-regulation of PSY3 during abiotic stress. Conclusions: We investigated the structural and functional determinisms of PSY genes in wheat. More generally, among eudicots and monocots, the PSY gene family was found to be associated with differences in gene copy numbers, allowing us to propose an evolutionary model for the entire PSY gene family in Grasses.

  2. STRUCTURAL ENZYMOLOGY OF POLYKETIDE SYNTHASES

    OpenAIRE

    Tsai, Shiou-Chuan (Sheryl); Ames, Brian Douglas

    2009-01-01

    This chapter describes structural and associated enzymological studies of polyketide synthases, including isolated single domains and multidomain fragments. The sequence–structure–function relationship of polyketide biosynthesis, compared with homologous fatty acid synthesis, is discussed in detail. Structural enzymology sheds light on sequence and structural motifs that are important for the precise timing, substrate recognition, enzyme catalysis, and protein–protein interactions leading to ...

  3. Crystal structure of riboflavin synthase

    Energy Technology Data Exchange (ETDEWEB)

    Liao, D.-I.; Wawrzak, Z.; Calabrese, J.C.; Viitanen, P.V.; Jordan, D.B. (DuPont); (NWU)

    2010-03-05

    Riboflavin synthase catalyzes the dismutation of two molecules of 6,7-dimethyl-8-(1'-D-ribityl)-lumazine to yield riboflavin and 4-ribitylamino-5-amino-2,6-dihydroxypyrimidine. The homotrimer of 23 kDa subunits has no cofactor requirements for catalysis. The enzyme is nonexistent in humans and is an attractive target for antimicrobial agents of organisms whose pathogenicity depends on their ability to biosynthesize riboflavin. The first three-dimensional structure of the enzyme was determined at 2.0 Å resolution using the multiwavelength anomalous diffraction (MAD) method on the Escherichia coli protein containing selenomethionine residues. The homotrimer consists of an asymmetric assembly of monomers, each of which comprises two similar β barrels and a C-terminal α helix. The similar β barrels within the monomer confirm a prediction of pseudo two-fold symmetry that is inferred from the sequence similarity between the two halves of the protein. The β barrels closely resemble folds found in phthalate dioxygenase reductase and other flavoproteins. The three active sites of the trimer are proposed to lie between pairs of monomers in which residues conserved among species reside, including two Asp-His-Ser triads and dyads of Cys-Ser and His-Thr. The proposed active sites are located where FMN (an analog of riboflavin) is modeled from an overlay of the β barrels of phthalate dioxygenase reductase and riboflavin synthase. In the trimer, one active site is formed, and the other two active sites are wide open and exposed to solvent. The nature of the trimer configuration suggests that only one active site can be formed and be catalytically competent at a time.

  4. A functional isopenicillin N synthase in an animal genome

    NARCIS (Netherlands)

    Roelofs, D.; Timmermans, M.J.T.N.; Hensbergen, P.J.; van Leeuwen, H.; Koopman, J.; Faddeeva-Vakhrusheva, A.; Suring, W.J.; de Boer, T.E.; Mariën, A.G.H.; Boer, R.; Bovenberg, R.; van Straalen, N.M.

    Horizontal transfer of genes is widespread among prokaryotes, but is less common between microorganisms and animals. Here, we present evidence for the presence of a gene encoding functional isopenicillin N synthase, an enzyme in the β-lactam antibiotics biosynthesis pathway, in the genome of the

  5. Functional isopenicillin N synthase in an animal genome

    NARCIS (Netherlands)

    Roelofs, D.; Timmermans, M.J.T.N.; Hensbergen, P.; van Leeuwen, H.; Koopman, J.; Faddeeva, A.; Suring, W.; de Boer, T.E.; Mariën, J.; Boer, R.; Bovenberg, R.; van Straalen, N.M.

    2013-01-01

    Horizontal transfer of genes is widespread among prokaryotes, but is less common between microorganisms and animals. Here, we present evidence for the presence of a gene encoding functional isopenicillin N synthase, an enzyme in the β-lactam antibiotics biosynthesis pathway, in the genome of the

  6. Structural Basis of Catalysis in the Bacterial Monoterpene Synthases Linalool Synthase and 1,8-Cineole Synthase

    OpenAIRE

    Karuppiah, Vijaykumar; Ranaghan, Kara E.; Leferink, Nicole G. H.; Johannissen, Linus O.; Shanmugam, Muralidharan; Ní Cheallaigh, Aisling; Bennett, Nathan J.; Kearsey, Lewis J.; Takano, Eriko; Gardiner, John M.; van der Kamp, Marc W.; Hay, Sam; Mulholland, Adrian J.; Leys, David; Scrutton, Nigel S.

    2017-01-01

    Terpenoids form the largest and stereochemically most diverse class of natural products, and there is considerable interest in producing these by biocatalysis with whole cells or purified enzymes, and by metabolic engineering. The monoterpenes are an important class of terpenes and are industrially important as flavors and fragrances. We report here structures for the recently discovered Streptomyces clavuligerus monoterpene synthases linalool synthase (bLinS) and 1,8-cineole synthase (bCinS)...

  7. Uncovering the structures of modular polyketide synthases.

    Science.gov (United States)

    Weissman, Kira J

    2015-03-01

    The modular polyketide synthases (PKSs) are multienzyme proteins responsible for the assembly of diverse secondary metabolites of high economic and therapeutic importance. These molecular 'assembly lines' consist of repeated functional units called 'modules' organized into gigantic polypeptides. For several decades, concerted efforts have been made to understand in detail the structure and function of PKSs in order to facilitate genetic engineering of the systems towards the production of polyketide analogues for evaluation as drug leads. Despite this intense activity, it has not yet been possible to solve the crystal structure of a single module, let alone a multimodular subunit. Nonetheless, on the basis of analysis of the structures of modular fragments and the study of the related multienzyme of animal fatty acid synthase (FAS), several models of modular PKS architecture have been proposed. This year, however, the situation has changed - three modular structures have been characterized, not by X-ray crystallography, but by the complementary methods of single-particle cryo-electron microscopy and small-angle X-ray scattering. This review aims to compare the cryo-EM structures and SAXS-derived structural models, and to interpret them in the context of previously obtained data and existing architectural proposals. The consequences for genetic engineering of the systems will also be discussed, as well as unresolved questions and future directions.

  8. Structural Basis of Catalysis in the Bacterial Monoterpene Synthases Linalool Synthase and 1,8-Cineole Synthase

    Science.gov (United States)

    2017-01-01

    Terpenoids form the largest and stereochemically most diverse class of natural products, and there is considerable interest in producing these by biocatalysis with whole cells or purified enzymes, and by metabolic engineering. The monoterpenes are an important class of terpenes and are industrially important as flavors and fragrances. We report here structures for the recently discovered Streptomyces clavuligerus monoterpene synthases linalool synthase (bLinS) and 1,8-cineole synthase (bCinS), and we show that these are active biocatalysts for monoterpene production using biocatalysis and metabolic engineering platforms. In metabolically engineered monoterpene-producing E. coli strains, use of bLinS leads to 300-fold higher linalool production compared with the corresponding plant monoterpene synthase. With bCinS, 1,8-cineole is produced with 96% purity compared to 67% from plant species. Structures of bLinS and bCinS, and their complexes with fluorinated substrate analogues, show that these bacterial monoterpene synthases are similar to previously characterized sesquiterpene synthases. Molecular dynamics simulations suggest that these monoterpene synthases do not undergo large-scale conformational changes during the reaction cycle, making them attractive targets for structure-based protein engineering to expand the catalytic scope of these enzymes toward alternative monoterpene scaffolds. Comparison of the bLinS and bCinS structures indicates how their active sites steer reactive carbocation intermediates to the desired acyclic linalool (bLinS) or bicyclic 1,8-cineole (bCinS) products. The work reported here provides the analysis of structures for this important class of monoterpene synthase. This should now guide exploitation of the bacterial enzymes as gateway biocatalysts for the production of other monoterpenes and monoterpenoids. PMID:28966840

  9. Genomic insights into the distribution, genetic diversity and evolution of polyketide synthases and nonribosomal peptide synthetases.

    Science.gov (United States)

    Wang, Hao; Sivonen, Kaarina; Fewer, David P

    2015-12-01

    Polyketides and nonribosomal peptides are important secondary metabolites that exhibit enormous structural diversity, have many pharmaceutical applications, and include a number of clinically important drugs. These complex metabolites are most commonly synthesized on enzymatic assembly lines of polyketide synthases and nonribosomal peptide synthetases. Genome-mining studies making use of the recent explosion in the number of genome sequences have demonstrated unexpected enzymatic diversity and greatly expanded the known distribution of these enzyme systems across the three domains of life. The wealth of data now available suggests that genome-mining efforts will uncover new natural products, novel biosynthetic mechanisms, and shed light on the origin and evolution of these important enzymes. Copyright © 2015 Elsevier Ltd. All rights reserved.

  10. Genome and transcriptome-wide analyses of cellulose synthase gene superfamily in soybean.

    Science.gov (United States)

    Nawaz, Muhammad Amjad; Rehman, Hafiz Mamoon; Baloch, Faheem Shehzad; Ijaz, Babar; Ali, Muhammad Amjad; Khan, Iqrar Ahmad; Lee, Jeong Dong; Chung, Gyuhwa; Yang, Seung Hwan

    2017-08-01

    The plant cellulose synthase gene superfamily belongs to the category of type-2 glycosyltransferases, and is involved in cellulose and hemicellulose biosynthesis. These enzymes are vital for maintaining cell-wall structural integrity throughout plant life. Here, we identified 78 putative cellulose synthases (CS) in the soybean genome. Phylogenetic analysis against 40 reference Arabidopsis CS genes clustered soybean CSs into seven major groups (CESA, CSL A, B, C, D, E and G), located on 19 chromosomes (except chromosome 18). Soybean CS expansion occurred in 66 duplication events. Additionally, we identified 95 simple sequence repeat markers related to 44 CSs. We next performed digital expression analysis using publicly available datasets to understand potential CS functions in soybean. We found that CSs were highly expressed during soybean seed development, a pattern confirmed with an Affymetrix soybean IVT array and validated with RNA-seq profiles. Within CS groups, CESAs had higher relative expression than CSLs. Soybean CS models were designed based on maximum average RPKM values. Gene co-expression networks were developed to explore which CSs could work together in soybean. Finally, RT-PCR analysis confirmed the expression of 15 selected CSs during all four seed developmental stages. Copyright © 2017 Elsevier GmbH. All rights reserved.

  11. Cryo-EM structure of the yeast ATP synthase.

    Science.gov (United States)

    Lau, Wilson C Y; Baker, Lindsay A; Rubinstein, John L

    2008-10-24

    We have used electron cryomicroscopy of single particles to determine the structure of the ATP synthase from Saccharomyces cerevisiae. The resulting map at 24 Å resolution can accommodate atomic models of the F(1)-c(10) subcomplex, the peripheral stalk subcomplex, and the N-terminal domain of the oligomycin sensitivity conferral protein. The map is similar to an earlier electron cryomicroscopy structure of bovine mitochondrial ATP synthase but with important differences. It resolves the internal structure of the membrane region of the complex, especially the membrane embedded subunits b, c, and a. Comparison of the yeast ATP synthase map, which lacks density from the dimer-specific subunits e and g, with a map of the bovine enzyme that included e and g indicates where these subunits are located in the intact complex. This new map has allowed construction of a model of subunit arrangement in the F(O) motor of ATP synthase that dictates how dimerization of the complex via subunits e and g might occur.

  12. The Structural Basis of Erwinia rhapontici Isomaltulose Synthase

    Science.gov (United States)

    Xu, Zheng; Li, Sha; Li, Jie; Li, Yan; Feng, Xiaohai; Wang, Renxiao; Xu, Hong; Zhou, Jiahai

    2013-01-01

    Sucrose isomerase NX-5 from Erwinia rhapontici efficiently catalyzes the isomerization of sucrose to isomaltulose (main product) and trehalulose (by-product). To investigate the molecular mechanism controlling sucrose isomer formation, we determined the crystal structures of native NX-5 and its mutant complexes E295Q/sucrose and D241A/glucose at 1.70 Å, 1.70 Å and 2.00 Å, respectively. The overall structure and active site architecture of NX-5 resemble those of other reported sucrose isomerases. Strikingly, the substrate binding mode of NX-5 is also similar to that of trehalulose synthase from Pseudomonas mesoacidophila MX-45 (MutB). Detailed structural analysis revealed the catalytic RXDRX motif and the adjacent 10-residue loop of NX-5 and isomaltulose synthase PalI from Klebsiella sp. LX3 adopt a distinct orientation from those of trehalulose synthases. Mutations of the loop region of NX-5 resulted in significant changes of the product ratio between isomaltulose and trehalulose. The molecular dynamics simulation data supported the product specificity of NX-5 towards isomaltulose and the role of the loop330-339 in NX-5 catalysis. This work should prove useful for the engineering of sucrose isomerase for industrial carbohydrate biotransformations. PMID:24069347

  13. Structural Analysis of Thymidylate Synthase from Kaposi's Sarcoma-Associated Herpesvirus with the Anticancer Drug Raltitrexed.

    Directory of Open Access Journals (Sweden)

    Yong Mi Choi

    Kaposi's sarcoma-associated herpesvirus (KSHV) is a highly infectious human herpesvirus that causes Kaposi's sarcoma. KSHV encodes functional thymidylate synthase, which is a target for anticancer drugs such as raltitrexed or 5-fluorouracil. Thymidylate synthase catalyzes the conversion of 2'-deoxyuridine-5'-monophosphate (dUMP) to thymidine-5'-monophosphate (dTMP) using 5,10-methylenetetrahydrofolate (mTHF) as a co-substrate. The crystal structures of thymidylate synthase from KSHV (apo), complexes with dUMP (binary), and complexes with both dUMP and raltitrexed (ternary) were determined at 1.7 Å, 2.0 Å, and 2.4 Å, respectively. While the ternary complex structures of human thymidylate synthase and E. coli thymidylate synthase had a closed conformation, the ternary complex structure of KSHV thymidylate synthase was observed in an open conformation, similar to that of rat thymidylate synthase. The complex structures of KSHV thymidylate synthase did not have a covalent bond between the sulfhydryl group of Cys219 and C6 atom of dUMP, unlike the human thymidylate synthase. The catalytic Cys residue demonstrated a dual conformation in the apo structure, and its sulfhydryl group was oriented toward the C6 atom of dUMP with no covalent bond upon ligand binding in the complex structures. These structural data provide the potential use of antifolates such as raltitrexed as a viral induced anticancer drug and structural basis to design drugs for targeting the thymidylate synthase of KSHV.

  14. Crystal Structures of Two Isozymes of Citrate Synthase from Sulfolobus tokodaii Strain 7

    Directory of Open Access Journals (Sweden)

    Midori Murakami

    2016-01-01

    Thermoacidophilic archaeon Sulfolobus tokodaii strain 7 has two citrate synthase genes (ST1805-CS and ST0587-CS) in the genome with 45% sequence identity. Because they exhibit similar optimal temperatures of catalytic activity and thermal inactivation profiles, we performed structural comparisons between these isozymes to elucidate adaptation mechanisms to high temperatures in thermophilic CSs. The crystal structures of ST1805-CS and ST0587-CS were determined at 2.0 Å and 2.7 Å resolutions, respectively. Structural comparison reveals that both of them are dimeric enzymes composed of two identical subunits, and these dimeric structures are quite similar to those of citrate synthases from archaea and eubacteria. ST0587-CS has, however, 55 ion pairs within the whole dimer structure, whereas ST1805-CS has only 36. Although the number and distributions of ion pairs are distinct from each other, intersubunit ion pairs between two domains of each isozyme are identical, especially in the interterminal region. Because the location and number of ion pairs are in a trend with other CSs from thermophilic microorganisms, the factors responsible for thermal adaptation of ST-CS isozymes are characterized by ion pairs in the interterminal region.

  15. The multifunctional 6-methylsalicylic acid synthase gene of Penicillium patulum. Its gene structure relative to that of other polyketide synthases.

    Science.gov (United States)

    Beck, J; Ripka, S; Siegner, A; Schiltz, E; Schweizer, E

    1990-09-11

    6-Methylsalicylic acid synthase (MSAS) from Penicillium patulum is a homomultimer of a single, multifunctional protein subunit. The enzyme is induced, at the transcriptional level, during the end of the logarithmic growth phase. After approximately 150-fold purification, a homogeneous enzyme preparation was obtained exhibiting, upon SDS gel electrophoresis, a subunit molecular mass of 188 kDa. By immunological screening of a genomic P. patulum DNA expression library, the MSAS gene together with its flanking sequences was isolated; 7131 base pairs of the cloned genomic DNA were sequenced. Within this sequence the MSAS gene was identified as a 5322-bp-long open reading frame coding for a protein of 1774 amino acids and 190,731 Da molecular mass. Transcriptional initiation and termination sites were determined both by primer extension studies and from cDNA sequences specially prepared for the 5' and 3' portions of the gene. The same cDNA sequences revealed the presence of a 69-bp intron within the N-terminal part of the MSAS gene. The intron contains the canonical GT and AG dinucleotides at its 5'- and 3'-splice junctions. An internal TACTGAC sequence, resembling the TACTAAC consensus element of Saccharomyces cerevisiae introns is suggested to represent the branch point of the lariat splicing intermediate. When compared to other known polyketide synthases, distinct amino acid sequence similarities of limited lengths were observed with some, though not all, of them. A comparatively low degree of similarity was detected to the yeast and Penicillium FAS or to the plant chalcone and resveratrol synthases. In contrast, a significantly higher sequence similarity was found between MSAS and the rat fatty acid synthase, especially at their transacylase, 2-oxoacyl reductase, 2-oxoacyl synthase and acyl carrier protein domains. Besides several dissimilar, interspersed regions probably coding for MSAS- and FAS-specific functions, the sequential order of the similar domains was
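The gene figures quoted in this record (a 5322-bp open reading frame, 1774 amino acids, 190,731 Da) can be cross-checked with simple arithmetic. The sketch below is only an illustration: the function name `orf_summary` and the `includes_stop` flag are hypothetical, and the abstract does not say whether the 5322 bp count includes the stop codon. The numbers agree when it is assumed to be excluded.

```python
def orf_summary(orf_bp, protein_da, includes_stop=False):
    """Return (residue count, mean residue mass in Da) for an ORF.

    orf_bp        -- ORF length in base pairs (must be a multiple of 3)
    protein_da    -- molecular mass of the encoded protein in daltons
    includes_stop -- assumption: whether orf_bp counts the stop codon
    """
    if orf_bp % 3 != 0:
        raise ValueError("ORF length is not a whole number of codons")
    codons = orf_bp // 3
    # A stop codon occupies three bases but encodes no residue.
    residues = codons - 1 if includes_stop else codons
    return residues, protein_da / residues


# Values taken from the MSAS abstract above.
residues, mean_mass = orf_summary(5322, 190731)
print(residues)             # number of codons in a 5322-bp frame
print(round(mean_mass, 1))  # average mass per residue, in Da
```

Under that assumption the codon count matches the stated 1774 residues, and the mean residue mass comes out a little under 110 Da, which is in the range usually expected for proteins.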

  16. Structure of the dimeric form of CTP synthase from Sulfolobus solfataricus

    DEFF Research Database (Denmark)

    Lauritsen, Iben; Willemoës, Martin; Jensen, Kaj Frank

    2011-01-01

    CTP synthase catalyzes the last committed step in de novo pyrimidine-nucleotide biosynthesis. Active CTP synthase is a tetrameric enzyme composed of a dimer of dimers. The tetramer is favoured in the presence of the substrate nucleotides ATP and UTP; when saturated with nucleotide, the tetramer c....... solfataricus CTP synthase according to a structural alignment with the E. coli enzyme all have large thermal parameters in the dimeric form. Furthermore, they are seen to undergo substantial movement upon tetramerization....

  17. Expression, crystallization and structure elucidation of γ-terpinene synthase from Thymus vulgaris.

    Science.gov (United States)

    Rudolph, Kristin; Parthier, Christoph; Egerer-Sieber, Claudia; Geiger, Daniel; Muller, Yves A; Kreis, Wolfgang; Müller-Uri, Frieder

    2016-01-01

    The biosynthesis of γ-terpinene, a precursor of the phenolic isomers thymol and carvacrol found in the essential oil from Thymus sp., is attributed to the activity of γ-terpinene synthase (TPS). Purified γ-terpinene synthase from T. vulgaris (TvTPS), the Thymus species that is the most widely spread and of the greatest economical importance, is able to catalyze the enzymatic conversion of geranyl diphosphate (GPP) to γ-terpinene. The crystal structure of recombinantly expressed and purified TvTPS is reported at 1.65 Å resolution, confirming the dimeric structure of the enzyme. The putative active site of TvTPS is deduced from its pronounced structural similarity to enzymes from other species of the Lamiaceae family involved in terpenoid biosynthesis: to (+)-bornyl diphosphate synthase and 1,8-cineole synthase from Salvia sp. and to (4S)-limonene synthase from Mentha spicata.

  18. Structural basis for the recruitment of glycogen synthase by glycogenin.

    Science.gov (United States)

    Zeqiraj, Elton; Tang, Xiaojing; Hunter, Roger W; García-Rocha, Mar; Judd, Andrew; Deak, Maria; von Wilamowitz-Moellendorff, Alexander; Kurinov, Igor; Guinovart, Joan J; Tyers, Mike; Sakamoto, Kei; Sicheri, Frank

    2014-07-15

    Glycogen is a primary form of energy storage in eukaryotes that is essential for glucose homeostasis. The glycogen polymer is synthesized from glucose through the cooperative action of glycogen synthase (GS), glycogenin (GN), and glycogen branching enzyme and forms particles that range in size from 10 to 290 nm. GS is regulated by allosteric activation upon glucose-6-phosphate binding and inactivation by phosphorylation on its N- and C-terminal regulatory tails. GS alone is incapable of starting synthesis of a glycogen particle de novo, but instead it extends preexisting chains initiated by glycogenin. The molecular determinants by which GS recognizes self-glucosylated GN, the first step in glycogenesis, are unknown. We describe the crystal structure of Caenorhabditis elegans GS in complex with a minimal GS targeting sequence in GN and show that a 34-residue region of GN binds to a conserved surface on GS that is distinct from previously characterized allosteric and binding surfaces on the enzyme. The interaction identified in the GS-GN costructure is required for GS-GN interaction and for glycogen synthesis in a cell-free system and in intact cells. The interaction of full-length GS-GN proteins is enhanced by an avidity effect imparted by a dimeric state of GN and a tetrameric state of GS. Finally, the structure of the N- and C-terminal regulatory tails of GS provide a basis for understanding phosphoregulation of glycogen synthesis. These results uncover a central molecular mechanism that governs glycogen metabolism.

  19. Cellulose synthase complex organization and cellulose microfibril structure.

    Science.gov (United States)

    Turner, Simon; Kumar, Manoj

    2018-02-13

    Cellulose consists of linear chains of β-1,4-linked glucose units, which are synthesized by the cellulose synthase complex (CSC). In plants, these chains associate in an ordered manner to form the cellulose microfibrils. Both the CSC and the local environment in which the individual chains coalesce to form the cellulose microfibril determine the structure and the unique physical properties of the microfibril. There are several recent reviews that cover many aspects of cellulose biosynthesis, which include trafficking of the complex to the plasma membrane and the relationship between the movement of the CSC and the underlying cortical microtubules (Bringmann et al. 2012 Trends Plant Sci. 17, 666-674 (doi:10.1016/j.tplants.2012.06.003); Kumar & Turner 2015 Phytochemistry 112, 91-99 (doi:10.1016/j.phytochem.2014.07.009); Schneider et al. 2016 Curr. Opin. Plant Biol. 34, 9-16 (doi:10.1016/j.pbi.2016.07.007)). In this review, we will focus on recent advances in cellulose biosynthesis in plants, with an emphasis on our current understanding of the structure of individual catalytic subunits together with the local membrane environment where cellulose synthesis occurs. We will attempt to relate this information to our current knowledge of the structure of the cellulose microfibril and propose a model in which variations in the structure of the CSC have important implications for the structure of the cellulose microfibril produced. This article is part of a discussion meeting issue 'New horizons for cellulose nanotechnology'. © 2017 The Author(s).

  20. Structure of the human beta-ketoacyl [ACP] synthase from the mitochondrial type II fatty acid synthase

    DEFF Research Database (Denmark)

    Christensen, Caspar Elo; Kragelund, Birthe Brandt; Von Wettstein-Knowles, Penny

    2007-01-01

    Two distinct ways of organizing fatty acid biosynthesis exist: the multifunctional type I fatty acid synthase (FAS) of mammals, fungi, and lower eukaryotes with activities residing on one or two polypeptides; and the dissociated type II FAS of prokaryotes, plastids, and mitochondria with individual...... activities encoded by discrete genes. The beta-ketoacyl [ACP] synthase (KAS) moiety of the mitochondrial FAS (mtKAS) is targeted by the antibiotic cerulenin and possibly by the other antibiotics inhibiting prokaryotic KASes: thiolactomycin, platensimycin, and the alpha-methylene butyrolactone, C75. The high...... degree of structural similarity between mitochondrial and prokaryotic KASes complicates development of novel antibiotics targeting prokaryotic KAS without affecting KAS domains of cytoplasmic FAS. KASes catalyze the C(2) fatty acid elongation reaction using either a Cys-His-His or Cys-His-Asn catalytic...

  1. Human hematopoietic prostaglandin D synthase inhibitor complex structures.

    Science.gov (United States)

    Kado, Yuji; Aritake, Kosuke; Uodome, Nobuko; Okano, Yousuke; Okazaki, Nobuo; Matsumura, Hiroyoshi; Urade, Yoshihiro; Inoue, Tsuyoshi

    2012-04-01

    In mast and Th2 cells, hematopoietic prostaglandin (PG) D synthase (H-PGDS) catalyses the isomerization of PGH(2) in the presence of glutathione (GSH) to produce the allergic and inflammatory mediator PGD(2). We determined the X-ray structures of human H-PGDS inhibitor complexes with 1-amino-4-{4-[4-chloro-6-(2-sulpho-phenylamino)-[1,3,5]triazin-2-ylmethyl]-3-sulpho-phenylamino}-9,10-dioxo-9,10-dihydro-anthracene-2-sulphonic acid (Cibacron Blue) and 1-amino-4-(4-aminosulphonyl) phenyl-anthraquinone-2-sulphonic acid (APAS) at 2.0 Å resolution. When complexed with H-PGDS, Cibacron Blue had an IC(50) value of 40 nM and APAS 2.1 μM. The Cibacron Blue molecule was stabilized by four hydrogen bonds and π-π stacking between the anthraquinone ring and Trp104, the ceiling of the active site H-PGDS pocket. Among the four hydrogen bonds, the Cibacron Blue terminal sulphonic group directly interacted with conserved residues Lys112 and Lys198, which recognize the PGH(2) substrate α-chain. In contrast, the APAS anthraquinone ring was inverted to interact with Trp104, while its benzenesulphonic group penetrated the GSH-bound region at the bottom of the active site. Due to the lack of extended aromatic rings, APAS could not directly hydrogen bond with the two conserved lysine residues, thus decreasing the total number of hydrogen bonds from four to one. These factors may contribute to the 50-fold difference in the IC(50) values obtained for the two inhibitors.
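The "50-fold difference" quoted in this record follows from the two IC(50) values once they are put in the same unit. The snippet below is a small sketch of that conversion; the helper name `fold_difference` and the unit table are illustrative choices, not something taken from the paper.

```python
# Conversion factors to nanomolar for the units used in the abstract.
TO_NANOMOLAR = {"nM": 1.0, "uM": 1000.0}


def fold_difference(value_a, unit_a, value_b, unit_b):
    """Return how many times larger concentration b is than concentration a."""
    a_nm = value_a * TO_NANOMOLAR[unit_a]
    b_nm = value_b * TO_NANOMOLAR[unit_b]
    return b_nm / a_nm


# IC(50) values from the abstract: Cibacron Blue 40 nM, APAS 2.1 uM.
ratio = fold_difference(40.0, "nM", 2.1, "uM")
print(round(ratio, 1))  # roughly 50, matching the rounded figure in the text
```

A higher IC(50) means a weaker inhibitor, so the ratio expresses how much less potent APAS is than Cibacron Blue in this assay.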

  2. The Plasmodiophora brassicae genome reveals insights in its life cycle and ancestry of chitin synthases

    Science.gov (United States)

    Schwelm, Arne; Fogelqvist, Johan; Knaust, Andrea; Jülke, Sabine; Lilja, Tua; Bonilla-Rosso, German; Karlsson, Magnus; Shevchenko, Andrej; Dhandapani, Vignesh; Choi, Su Ryun; Kim, Hong Gi; Park, Ju Young; Lim, Yong Pyo; Ludwig-Müller, Jutta; Dixelius, Christina

    2015-01-01

    Plasmodiophora brassicae causes clubroot, a major disease of Brassica oil and vegetable crops worldwide. P. brassicae is a Plasmodiophorid, obligate biotrophic protist in the eukaryotic kingdom of Rhizaria. Here we present the 25.5 Mb genome draft of P. brassicae, developmental stage-specific transcriptomes and a transcriptome of Spongospora subterranea, the Plasmodiophorid causing powdery scab on potato. Like other biotrophic pathogens both Plasmodiophorids are reduced in metabolic pathways. Phytohormones contribute to the gall phenotypes of infected roots. We report a protein (PbGH3) that can modify auxin and jasmonic acid. Plasmodiophorids contain chitin in cell walls of the resilient resting spores. If recognized, chitin can trigger defense responses in plants. Interestingly, chitin-related enzymes of Plasmodiophorids built specific families and the carbohydrate/chitin binding (CBM18) domain is enriched in the Plasmodiophorid secretome. Plasmodiophorids chitin synthases belong to two families, which were present before the split of the eukaryotic Stramenopiles/Alveolates/Rhizaria/Plantae and Metazoa/Fungi/Amoebozoa megagroups, suggesting chitin synthesis to be an ancient feature of eukaryotes. This exemplifies the importance of genomic data from unexplored eukaryotic groups, such as the Plasmodiophorids, to decipher evolutionary relationships and gene diversification of early eukaryotes. PMID:26084520

  3. The Plasmodiophora brassicae genome reveals insights in its life cycle and ancestry of chitin synthases.

    Science.gov (United States)

    Schwelm, Arne; Fogelqvist, Johan; Knaust, Andrea; Jülke, Sabine; Lilja, Tua; Bonilla-Rosso, German; Karlsson, Magnus; Shevchenko, Andrej; Dhandapani, Vignesh; Choi, Su Ryun; Kim, Hong Gi; Park, Ju Young; Lim, Yong Pyo; Ludwig-Müller, Jutta; Dixelius, Christina

    2015-06-18

    Plasmodiophora brassicae causes clubroot, a major disease of Brassica oil and vegetable crops worldwide. P. brassicae is a Plasmodiophorid, obligate biotrophic protist in the eukaryotic kingdom of Rhizaria. Here we present the 25.5 Mb genome draft of P. brassicae, developmental stage-specific transcriptomes and a transcriptome of Spongospora subterranea, the Plasmodiophorid causing powdery scab on potato. Like other biotrophic pathogens both Plasmodiophorids are reduced in metabolic pathways. Phytohormones contribute to the gall phenotypes of infected roots. We report a protein (PbGH3) that can modify auxin and jasmonic acid. Plasmodiophorids contain chitin in cell walls of the resilient resting spores. If recognized, chitin can trigger defense responses in plants. Interestingly, chitin-related enzymes of Plasmodiophorids built specific families and the carbohydrate/chitin binding (CBM18) domain is enriched in the Plasmodiophorid secretome. Plasmodiophorids chitin synthases belong to two families, which were present before the split of the eukaryotic Stramenopiles/Alveolates/Rhizaria/Plantae and Metazoa/Fungi/Amoebozoa megagroups, suggesting chitin synthesis to be an ancient feature of eukaryotes. This exemplifies the importance of genomic data from unexplored eukaryotic groups, such as the Plasmodiophorids, to decipher evolutionary relationships and gene diversification of early eukaryotes.

  4. The Structure of Sucrose Synthase-1 from Arabidopsis thaliana and Its Functional Implications

    Energy Technology Data Exchange (ETDEWEB)

    Zheng, Yi; Anderson, Spencer; Zhang, Yanfeng; Garavito, R. Michael (MSU); (NWU)

    2014-10-02

    Sucrose transport is the central system for the allocation of carbon resources in vascular plants. During growth and development, plants control carbon distribution by coordinating sites of sucrose synthesis and cleavage in different plant organs and different cellular locations. Sucrose synthase, which reversibly catalyzes sucrose synthesis and cleavage, provides a direct and reversible means to regulate sucrose flux. Depending on the metabolic environment, sucrose synthase alters its cellular location to participate in cellulose, callose, and starch biosynthesis through its interactions with membranes, organelles, and cytoskeletal actin. The x-ray crystal structure of sucrose synthase isoform 1 from Arabidopsis thaliana (AtSus1) has been determined as a complex with UDP-glucose and as a complex with UDP and fructose, at 2.8 and 2.85 Å resolution, respectively. The AtSus1 structure provides insights into sucrose catalysis and cleavage, as well as the regulation of sucrose synthase and its interactions with cellular targets.

  5. Functional annotation, genome organization and phylogeny of the grapevine (Vitis vinifera) terpene synthase gene family based on genome assembly, FLcDNA cloning, and enzyme assays.

    Science.gov (United States)

    Martin, Diane M; Aubourg, Sébastien; Schouwey, Marina B; Daviet, Laurent; Schalk, Michel; Toub, Omid; Lund, Steven T; Bohlmann, Jörg

    2010-10-21

    Terpenoids are among the most important constituents of grape flavour and wine bouquet, and serve as useful metabolite markers in viticulture and enology. Based on the initial 8-fold sequencing of a nearly homozygous Pinot noir inbred line, 89 putative terpenoid synthase genes (VvTPS) were predicted by in silico analysis of the grapevine (Vitis vinifera) genome assembly 1. The finding of this very large VvTPS family, combined with the importance of terpenoid metabolism for the organoleptic properties of grapevine berries and finished wines, prompted a detailed examination of this gene family at the genomic level as well as an investigation into VvTPS biochemical functions. We present findings from the analysis of the up-dated 12-fold sequencing and assembly of the grapevine genome that place the number of predicted VvTPS genes at 69 putatively functional VvTPS, 20 partial VvTPS, and 63 VvTPS probable pseudogenes. Gene discovery and annotation included information about gene architecture and chromosomal location. A dense cluster of 45 VvTPS is localized on chromosome 18. Extensive FLcDNA cloning, gene synthesis, and protein expression enabled functional characterization of 39 VvTPS; this is the largest number of functionally characterized TPS for any species reported to date. Of these enzymes, 23 have unique functions and/or phylogenetic locations within the plant TPS gene family. Phylogenetic analyses of the TPS gene family showed that while most VvTPS form species-specific gene clusters, there are several examples of gene orthology with TPS of other plant species, representing perhaps more ancient VvTPS, which have maintained functions independent of speciation. The highly expanded VvTPS gene family underpins the prominence of terpenoid metabolism in grapevine. We provide a detailed experimental functional annotation of 39 members of this important gene family in grapevine and comprehensive information about gene structure and phylogeny for the entire currently

  6. Structural characterization and comparison of three acyl-carrier-protein synthases from pathogenic bacteria

    Energy Technology Data Exchange (ETDEWEB)

    Halavaty, Andrei S. [Center for Structural Genomics of Infectious Diseases, (United States); Northwestern University, Chicago, IL 60611 (United States); Kim, Youngchang [Center for Structural Genomics of Infectious Diseases, (United States); Argonne National Laboratory, Argonne, IL 60439 (United States); University of Chicago, Chicago, IL 60637 (United States); Minasov, George; Shuvalova, Ludmilla; Dubrovska, Ievgeniia; Winsor, James [Center for Structural Genomics of Infectious Diseases, (United States); Northwestern University, Chicago, IL 60611 (United States); Zhou, Min [Center for Structural Genomics of Infectious Diseases, (United States); Argonne National Laboratory, Argonne, IL 60439 (United States); University of Chicago, Chicago, IL 60637 (United States); Onopriyenko, Olena; Skarina, Tatiana [Center for Structural Genomics of Infectious Diseases, (United States); University of Toronto, Toronto, Ontario M5G 1L6 (Canada); Papazisi, Leka; Kwon, Keehwan; Peterson, Scott N. [Center for Structural Genomics of Infectious Diseases, (United States); J. Craig Venter Institute, Rockville, MD 20850 (United States); Joachimiak, Andrzej [Center for Structural Genomics of Infectious Diseases, (United States); Argonne National Laboratory, Argonne, IL 60439 (United States); University of Chicago, Chicago, IL 60637 (United States); Savchenko, Alexei [Center for Structural Genomics of Infectious Diseases, (United States); University of Toronto, Toronto, Ontario M5G 1L6 (Canada); Anderson, Wayne F., E-mail: wf-anderson@northwestern.edu [Center for Structural Genomics of Infectious Diseases, (United States); Northwestern University, Chicago, IL 60611 (United States)

    2012-10-01

    The structural characterization of acyl-carrier-protein synthase (AcpS) from three different pathogenic microorganisms is reported. One interesting finding of the present work is a crystal artifact related to the activity of the enzyme, which fortuitously represents an opportunity for a strategy to design a potential inhibitor of a pathogenic AcpS. Some bacterial type II fatty-acid synthesis (FAS II) enzymes have been shown to be important candidates for drug discovery. The scientific and medical quest for new FAS II protein targets continues to stimulate research in this field. One of the possible additional candidates is the acyl-carrier-protein synthase (AcpS) enzyme. Its holo form post-translationally modifies the apo form of an acyl carrier protein (ACP), which assures the constant delivery of thioester intermediates to the discrete enzymes of FAS II. At the Center for Structural Genomics of Infectious Diseases (CSGID), AcpSs from Staphylococcus aureus (AcpS_SA), Vibrio cholerae (AcpS_VC) and Bacillus anthracis (AcpS_BA) have been structurally characterized in their apo, holo and product-bound forms, respectively. The structure of AcpS_BA is emphasized because of the two 3′,5′-adenosine diphosphate (3′,5′-ADP) product molecules that are found in each of the three coenzyme A (CoA) binding sites of the trimeric protein. One 3′,5′-ADP is bound as the 3′,5′-ADP part of CoA in the known structures of the CoA–AcpS and 3′,5′-ADP–AcpS binary complexes. The position of the second 3′,5′-ADP has never been described before. It is in close proximity to the first 3′,5′-ADP and the ACP-binding site. The coordination of two ADPs in AcpS_BA may possibly be exploited for the design of AcpS inhibitors that can block binding of both CoA and ACP.

  7. Structural characterization and comparison of three acyl-carrier-protein synthases from pathogenic bacteria

    International Nuclear Information System (INIS)

    Halavaty, Andrei S.; Kim, Youngchang; Minasov, George; Shuvalova, Ludmilla; Dubrovska, Ievgeniia; Winsor, James; Zhou, Min; Onopriyenko, Olena; Skarina, Tatiana; Papazisi, Leka; Kwon, Keehwan; Peterson, Scott N.; Joachimiak, Andrzej; Savchenko, Alexei; Anderson, Wayne F.

    2012-01-01

    The structural characterization of acyl-carrier-protein synthase (AcpS) from three different pathogenic microorganisms is reported. One interesting finding of the present work is a crystal artifact related to the activity of the enzyme, which fortuitously represents an opportunity for a strategy to design a potential inhibitor of a pathogenic AcpS. Some bacterial type II fatty-acid synthesis (FAS II) enzymes have been shown to be important candidates for drug discovery. The scientific and medical quest for new FAS II protein targets continues to stimulate research in this field. One of the possible additional candidates is the acyl-carrier-protein synthase (AcpS) enzyme. Its holo form post-translationally modifies the apo form of an acyl carrier protein (ACP), which assures the constant delivery of thioester intermediates to the discrete enzymes of FAS II. At the Center for Structural Genomics of Infectious Diseases (CSGID), AcpSs from Staphylococcus aureus (AcpS_SA), Vibrio cholerae (AcpS_VC) and Bacillus anthracis (AcpS_BA) have been structurally characterized in their apo, holo and product-bound forms, respectively. The structure of AcpS_BA is emphasized because of the two 3′,5′-adenosine diphosphate (3′,5′-ADP) product molecules that are found in each of the three coenzyme A (CoA) binding sites of the trimeric protein. One 3′,5′-ADP is bound as the 3′,5′-ADP part of CoA in the known structures of the CoA–AcpS and 3′,5′-ADP–AcpS binary complexes. The position of the second 3′,5′-ADP has never been described before. It is in close proximity to the first 3′,5′-ADP and the ACP-binding site. The coordination of two ADPs in AcpS_BA may possibly be exploited for the design of AcpS inhibitors that can block binding of both CoA and ACP.

  8. High order quaternary arrangement confers increased structural stability to Brucella Spp. lumazine synthase

    Energy Technology Data Exchange (ETDEWEB)

    Zylberman, V.; Craig, P.O.; Klinke, S.; Cauerhff, A.; Goldbaum, F.A. [Instituto Leloir, Buenos Aires (Argentina); Braden, B.C. [Bowie State Univ., Maryland (United States)

    2004-07-01

    The penultimate step in the pathway of riboflavin biosynthesis is catalyzed by the enzyme lumazine synthase (LS). One of the most distinctive characteristics of this enzyme is the structural quaternary divergence found in different species. The protein exists in pentameric and icosahedral forms, built from practically the same structural monomeric unit. The pentameric structure is formed by five 18 kDa monomers, each extensively contacting neighboring monomers. The icosahedral structure consists of 60 LS monomers arranged as twelve pentamers, giving rise to a capsid exhibiting icosahedral 532 symmetry. In all lumazine synthases studied, the topologically equivalent active sites are located at the interfaces between adjacent subunits in the pentameric modules. The Brucella spp. lumazine synthase (BLS) sequence clearly diverges from those of pentameric and icosahedral enzymes. This unusual divergence prompted us to investigate its quaternary arrangement further. In the present work, we demonstrate by means of solution light scattering and X-ray structural analyses that BLS assembles as a very stable dimer of pentamers, representing a third category of quaternary assembly for lumazine synthases. We also describe, through spectroscopic studies, the thermodynamic stability of this oligomeric protein, and postulate a mechanism for dissociation/unfolding of this macromolecular assembly. The higher molecular order of BLS increases its stability by 20 °C compared with pentameric lumazine synthases. The decameric arrangement described in this work highlights the importance of quaternary interactions in the stabilization of proteins. (author)

  9. 2004 Structural, Function and Evolutionary Genomics

    Energy Technology Data Exchange (ETDEWEB)

    Douglas L. Brutlag; Nancy Ryan Gray

    2005-03-23

    This Gordon conference will cover the areas of structural, functional and evolutionary genomics. It will take a systematic approach to genomics, examining the evolution of proteins, protein functional sites, protein-protein interactions, regulatory networks, and metabolic networks. Emphasis will be placed on what we can learn from comparative genomics and entire genomes and proteomes.

  10. High Polyhydroxybutyrate Production in Pseudomonas extremaustralis Is Associated with Differential Expression of Horizontally Acquired and Core Genome Polyhydroxyalkanoate Synthase Genes

    Science.gov (United States)

    Catone, Mariela V.; Ruiz, Jimena A.; Castellanos, Mildred; Segura, Daniel; Espin, Guadalupe; López, Nancy I.

    2014-01-01

    Pseudomonas extremaustralis produces mainly polyhydroxybutyrate (PHB), a short chain length polyhydroxyalkanoate (sclPHA) infrequently found in Pseudomonas species. Previous studies with this strain demonstrated that PHB genes are located in a genomic island. In this work, the analysis of the genome of P. extremaustralis revealed the presence of another PHB cluster, phbFPX, with high similarity to genes belonging to Burkholderiales, and also a cluster, phaC1ZC2D, coding for medium chain length PHA production (mclPHA). All mclPHA genes showed high similarity to genes from Pseudomonas species and, interestingly, this cluster also showed a natural insertion of seven ORFs not related to mclPHA metabolism. Besides PHB, P. extremaustralis is able to produce mclPHA, although in minor amounts. Complementation analysis demonstrated that both mclPHA synthases, PhaC1 and PhaC2, were functional. RT-qPCR analysis showed different levels of expression for the PHB synthase, phbC, and the mclPHA synthases. The expression level of phbC was significantly higher than that obtained for phaC1 and phaC2 in late exponential phase cultures. The analysis of the proteins bound to the PHA granules showed the presence of PhbC and PhaC1, whilst PhaC2 could not be detected. In addition, two phasin-like proteins (PhbP and PhaI) associated with the production of scl and mcl PHAs, respectively, were detected. The results of this work show the high efficiency of a foreign gene (phbC) in comparison with the mclPHA core genome genes (phaC1 and phaC2), indicating that the ability of P. extremaustralis to produce high amounts of PHB could be explained by the different expression levels of the genes encoding the scl and mcl PHA synthases. PMID:24887088

  11. High polyhydroxybutyrate production in Pseudomonas extremaustralis is associated with differential expression of horizontally acquired and core genome polyhydroxyalkanoate synthase genes.

    Directory of Open Access Journals (Sweden)

    Mariela V Catone

    Pseudomonas extremaustralis produces mainly polyhydroxybutyrate (PHB), a short chain length polyhydroxyalkanoate (sclPHA) infrequently found in Pseudomonas species. Previous studies with this strain demonstrated that PHB genes are located in a genomic island. In this work, the analysis of the genome of P. extremaustralis revealed the presence of another PHB cluster, phbFPX, with high similarity to genes belonging to Burkholderiales, and also a cluster, phaC1ZC2D, coding for medium chain length PHA production (mclPHA). All mclPHA genes showed high similarity to genes from Pseudomonas species and, interestingly, this cluster also showed a natural insertion of seven ORFs not related to mclPHA metabolism. Besides PHB, P. extremaustralis is able to produce mclPHA, although in minor amounts. Complementation analysis demonstrated that both mclPHA synthases, PhaC1 and PhaC2, were functional. RT-qPCR analysis showed different levels of expression for the PHB synthase, phbC, and the mclPHA synthases. The expression level of phbC was significantly higher than that obtained for phaC1 and phaC2 in late exponential phase cultures. The analysis of the proteins bound to the PHA granules showed the presence of PhbC and PhaC1, whilst PhaC2 could not be detected. In addition, two phasin-like proteins (PhbP and PhaI) associated with the production of scl and mcl PHAs, respectively, were detected. The results of this work show the high efficiency of a foreign gene (phbC) in comparison with the mclPHA core genome genes (phaC1 and phaC2), indicating that the ability of P. extremaustralis to produce high amounts of PHB could be explained by the different expression levels of the genes encoding the scl and mcl PHA synthases.

  12. Structure of Salmonella typhimurium OMP Synthase in a Complete Substrate Complex

    DEFF Research Database (Denmark)

    Grubmeyer, Charles; Hansen, Michael Riis; Fedorov, Alexander A.

    2012-01-01

    Dimeric Salmonella typhimurium orotate phosphoribosyltransferase (OMP synthase, EC 2.4.2.10), a key enzyme in de novo pyrimidine nucleotide synthesis, has been cocrystallized in a complete substrate E·MgPRPP·orotate complex and the structure determined to 2.2 Å resolution. This structure resem...

  13. Analyses of the sucrose synthase gene family in cotton: structure, phylogeny and expression patterns

    Directory of Open Access Journals (Sweden)

    Chen Aiqun

    2012-06-01

    Background: In plants, sucrose synthase (Sus) is widely considered a key enzyme involved in sucrose metabolism. Several paralogous genes encoding different isozymes of Sus have been identified and characterized in multiple plant genomes, while limited information on Sus genes is available to date for cotton. Results: Here, we report the molecular cloning, structural organization, phylogenetic evolution and expression profiles of seven Sus genes (GaSus1 to 7) identified from diploid fiber cotton (Gossypium arboreum). Comparisons between cDNA and genomic sequences revealed that the cotton GaSus genes were interrupted by multiple introns. Comparative screening of introns in homologous genes demonstrated that the number and position of Sus introns are highly conserved among Sus genes in cotton and other more distantly related plant species. Phylogenetic analysis showed that GaSus1, GaSus2, GaSus3, GaSus4 and GaSus5 could be clustered together into a dicot Sus group, while GaSus6 and GaSus7 were separated evenly into the other two groups, with members from both dicot and monocot species. Expression profile analyses of the seven Sus genes indicated that, except for GaSus2, whose transcripts were undetectable in all tissues examined, and GaSus7, which was expressed only in stem and petal, the other five paralogues were differentially expressed in a wide range of tissues and showed development-dependent expression profiles in cotton fiber cells. Conclusions: This is a comprehensive study of the Sus gene family in the cotton plant. The results presented in this work provide new insights into the evolutionary conservation and sub-functional divergence of the cotton Sus gene family in response to cotton fiber growth and development.

  14. Crystal Structure of Albaflavenone Monooxygenase Containing a Moonlighting Terpene Synthase Active Site

    Energy Technology Data Exchange (ETDEWEB)

    Zhao, Bin; Lei, Li; Vassylyev, Dmitry G.; Lin, Xin; Cane, David E.; Kelly, Steven L.; Yuan, Hang; Lamb, David C.; Waterman, Michael R.; (Vanderbilt); (UAB); (Brown); (Swansea)

    2010-01-08

    Albaflavenone synthase (CYP170A1) is a monooxygenase catalyzing the final two steps in the biosynthesis of this antibiotic in the soil bacterium, Streptomyces coelicolor A3(2). Interestingly, CYP170A1 shows no stereo selection, forming equal amounts of two albaflavenol epimers, each of which is oxidized in turn to albaflavenone. To explore the structural basis of the reaction mechanism, we have studied the crystal structures of both ligand-free CYP170A1 (2.6 Å) and the complex of endogenous substrate (epi-isozizaene) with CYP170A1 (3.3 Å). The structure of the complex suggests that the proximal epi-isozizaene molecules may bind to the heme iron in two orientations. In addition, much to our surprise, we have found that albaflavenone synthase also has a second, completely distinct catalytic activity corresponding to the synthesis of farnesene isomers from farnesyl diphosphate. Within the cytochrome P450 α-helical domain both the primary sequence and x-ray structure indicate the presence of a novel terpene synthase active site that is moonlighting on the P450 structure. This includes signature sequences for divalent cation binding and an α-helical barrel. This barrel is unusual because it consists of only four helices rather than six found in all other terpene synthases. Mutagenesis establishes that this barrel is essential for the terpene synthase activity of CYP170A1 but not for the monooxygenase activity. This is the first bifunctional P450 discovered to have another active site moonlighting on it and the first time a terpene synthase active site is found moonlighting on another protein.

  15. Controlling Citrate Synthase Expression by CRISPR/Cas9 Genome Editing for n-Butanol Production in Escherichia coli

    DEFF Research Database (Denmark)

    Heo, Min-Ji; Jung, Hwi-Min; Um, Jaeyong

    2017-01-01

    Genome editing using CRISPR/Cas9 was successfully demonstrated in Escherichia coli to effectively produce n-butanol in a defined medium under microaerobic condition. The butanol synthetic pathway genes including those encoding oxygen-tolerant alcohol dehydrogenase were overexpressed in metabolically engineered E. coli, resulting in 0.82 g/L butanol production. To increase butanol production, carbon flux from acetyl-CoA to citric acid cycle should be redirected to acetoacetyl-CoA. For this purpose, the 5′-untranslated region sequence of gltA encoding citrate synthase was designed using an expression... that redistributing carbon flux using genome editing is an efficient engineering tool for metabolite overproduction.

  16. Structural basis for substrate activation and regulation by cystathionine beta-synthase (CBS) domains in cystathionine β-synthase

    Energy Technology Data Exchange (ETDEWEB)

    Koutmos, Markos; Kabil, Omer; Smith, Janet L.; Banerjee, Ruma (Michigan-Med)

    2011-08-17

    The catalytic potential for H2S biogenesis and homocysteine clearance converge at the active site of cystathionine β-synthase (CBS), a pyridoxal phosphate-dependent enzyme. CBS catalyzes β-replacement reactions of either serine or cysteine by homocysteine to give cystathionine and water or H2S, respectively. In this study, high-resolution structures of the full-length enzyme from Drosophila in which a carbanion (1.70 Å) and an aminoacrylate intermediate (1.55 Å) have been captured are reported. Electrostatic stabilization of the zwitterionic carbanion intermediate is afforded by the close positioning of an active site lysine residue that is initially used for Schiff base formation in the internal aldimine and later as a general base. Additional stabilizing interactions between active site residues and the catalytic intermediates are observed. Furthermore, the structure of the regulatory 'energy-sensing' CBS domains, named after this protein, suggests a mechanism for allosteric activation by S-adenosylmethionine.

  17. Functional Annotation, Genome Organization and Phylogeny of the Grapevine (Vitis vinifera) Terpene Synthase Gene Family Based on Genome Assembly, FLcDNA Cloning, and Enzyme Assays

    Directory of Open Access Journals (Sweden)

    Toub Omid

    2010-10-01

    Background: Terpenoids are among the most important constituents of grape flavour and wine bouquet, and serve as useful metabolite markers in viticulture and enology. Based on the initial 8-fold sequencing of a nearly homozygous Pinot noir inbred line, 89 putative terpenoid synthase genes (VvTPS) were predicted by in silico analysis of the grapevine (Vitis vinifera) genome assembly 1. The finding of this very large VvTPS family, combined with the importance of terpenoid metabolism for the organoleptic properties of grapevine berries and finished wines, prompted a detailed examination of this gene family at the genomic level as well as an investigation into VvTPS biochemical functions. Results: We present findings from the analysis of the up-dated 12-fold sequencing and assembly of the grapevine genome that place the number of predicted VvTPS genes at 69 putatively functional VvTPS, 20 partial VvTPS, and 63 VvTPS probable pseudogenes. Gene discovery and annotation included information about gene architecture and chromosomal location. A dense cluster of 45 VvTPS is localized on chromosome 18. Extensive FLcDNA cloning, gene synthesis, and protein expression enabled functional characterization of 39 VvTPS; this is the largest number of functionally characterized TPS for any species reported to date. Of these enzymes, 23 have unique functions and/or phylogenetic locations within the plant TPS gene family. Phylogenetic analyses of the TPS gene family showed that while most VvTPS form species-specific gene clusters, there are several examples of gene orthology with TPS of other plant species, representing perhaps more ancient VvTPS, which have maintained functions independent of speciation. Conclusions: The highly expanded VvTPS gene family underpins the prominence of terpenoid metabolism in grapevine. We provide a detailed experimental functional annotation of 39 members of this important gene family in grapevine and comprehensive information

  18. CELLULOSE SYNTHASE-LIKE A2, a glucomannan synthase, is involved in maintaining adherent mucilage structure in Arabidopsis seed.

    Science.gov (United States)

    Yu, Li; Shi, Dachuan; Li, Junling; Kong, Yingzhen; Yu, Yanchong; Chai, Guohua; Hu, Ruibo; Wang, Juan; Hahn, Michael G; Zhou, Gongke

    2014-04-01

    Mannans are hemicellulosic polysaccharides that are considered to have both structural and storage functions in the plant cell wall. However, it is not yet known how mannans function in Arabidopsis (Arabidopsis thaliana) seed mucilage. In this study, CELLULOSE SYNTHASE-LIKE A2 (CSLA2; At5g22740) expression was observed in several seed tissues, including the epidermal cells of developing seed coats. Disruption of CSLA2 resulted in thinner adherent mucilage halos, although the total amount of the adherent mucilage did not change compared with the wild type. This suggested that the adherent mucilage in the mutant was more compact compared with that of the wild type. In accordance with the role of CSLA2 in glucomannan synthesis, csla2-1 mucilage contained 30% less mannosyl and glucosyl content than did the wild type. No appreciable changes in the composition, structure, or macromolecular properties were observed for nonmannan polysaccharides in mutant mucilage. Biochemical analysis revealed that cellulose crystallinity was substantially reduced in csla2-1 mucilage; this was supported by the removal of most mucilage cellulose through treatment of csla2-1 seeds with endo-β-glucanase. Mutation in CSLA2 also resulted in altered spatial distribution of cellulose and an absence of birefringent cellulose microfibrils within the adherent mucilage. As with the observed changes in crystalline cellulose, the spatial distribution of pectin was also modified in csla2-1 mucilage. Taken together, our results demonstrate that glucomannans synthesized by CSLA2 are involved in modulating the structure of adherent mucilage, potentially through altering cellulose organization and crystallization.

  19. Structure of the ATP Synthase Catalytic Complex (F1) from Escherichia coli in an Autoinhibited conformation

    Energy Technology Data Exchange (ETDEWEB)

    G Cingolani; T Duncan

    2011-12-31

    ATP synthase is a membrane-bound rotary motor enzyme that is critical for cellular energy metabolism in all kingdoms of life. Despite conservation of its basic structure and function, autoinhibition by one of its rotary stalk subunits occurs in bacteria and chloroplasts but not in mitochondria. The crystal structure of the ATP synthase catalytic complex (F1) from Escherichia coli described here reveals the structural basis for this inhibition. The C-terminal domain of subunit ε adopts a heretofore unknown, highly extended conformation that inserts deeply into the central cavity of the enzyme and engages both rotor and stator subunits in extensive contacts that are incompatible with functional rotation. As a result, the three catalytic subunits are stabilized in a set of conformations and rotational positions distinct from previous F1 structures.

  20. Genome based analysis of type-I polyketide synthase and nonribosomal peptide synthetase gene clusters in seven strains of five representative Nocardia species.

    Science.gov (United States)

    Komaki, Hisayuki; Ichikawa, Natsuko; Hosoyama, Akira; Takahashi-Nakaguchi, Azusa; Matsuzawa, Tetsuhiro; Suzuki, Ken-ichiro; Fujita, Nobuyuki; Gonoi, Tohru

    2014-04-30

    Actinobacteria of the genus Nocardia usually live in soil or water and play saprophytic roles, but they also opportunistically infect the respiratory system, skin, and other organs of humans and animals. Primarily because of the clinical importance of the strains, some Nocardia genomes have been sequenced, and genome sequences have accumulated. Genome sizes of Nocardia strains are similar to those of Streptomyces strains, the producers of most antibiotics. In the present work, we compared secondary metabolite biosynthesis gene clusters of type-I polyketide synthase (PKS-I) and nonribosomal peptide synthetase (NRPS) among genomes of representative Nocardia species/strains based on domain organization and amino acid sequence homology. Draft genome sequences of Nocardia asteroides NBRC 15531(T), Nocardia otitidiscaviarum IFM 11049, Nocardia brasiliensis NBRC 14402(T), and N. brasiliensis IFM 10847 were read and compared with published complete genome sequences of Nocardia farcinica IFM 10152, Nocardia cyriacigeorgica GUH-2, and N. brasiliensis HUJEG-1. Genome sizes are as follows: N. farcinica, 6.0 Mb; N. cyriacigeorgica, 6.2 Mb; N. asteroides, 7.0 Mb; N. otitidiscaviarum, 7.8 Mb; and N. brasiliensis, 8.9 - 9.4 Mb. Predicted numbers of PKS-I, NRPS, and PKS-I/NRPS hybrid clusters ranged between 4-11, 7-13, and 1-6, respectively, depending on strains, and tended to increase with increasing genome size. Domain and module structures of representative or unique clusters are discussed in the text. We conclude the following: 1) genomes of Nocardia strains carry as many PKS-I and NRPS gene clusters as those of Streptomyces strains, 2) the number of PKS-I and NRPS gene clusters in Nocardia strains varies substantially depending on species, and N. brasiliensis strains carry the largest numbers of clusters among the species studied, 3) the seven Nocardia strains studied in the present work have seven common PKS-I and/or NRPS clusters, some of whose products are yet to be studied

  1. Structure and Reaction Mechanism of Pyrrolysine Synthase (PylD)

    KAUST Repository

    Quitterer, Felix

    2013-05-29

    The final step in the biosynthesis of the 22nd genetically encoded amino acid, pyrrolysine, is catalyzed by PylD, a structurally and mechanistically unique dehydrogenase. This catalyzed reaction includes an induced-fit mechanism achieved by major structural rearrangements of the N-terminal helix upon substrate binding. Different steps of the reaction trajectory are visualized by complex structures of PylD with substrate and product. Copyright © 2013 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  2. Chemical analysis of a genome wide polyketide synthase gene deletion library in Aspergillus nidulans

    DEFF Research Database (Denmark)

    Larsen, Thomas Ostenfeld; Klejnstrup, Marie Louise; Nielsen, Jakob Blæsbjerg

    to encode polyketide synthases have been individually deleted. This presentation will highlight our recent linking of secondary metabolites in A. nidulans to genes, and in particular describe some recent work on the characterization of ANID_6448 and associated genes responsible for biosynthesis of 3-methyl...

  3. Structural study and thermodynamic characterization of inhibitor binding to lumazine synthase from Bacillus anthracis

    Energy Technology Data Exchange (ETDEWEB)

    Morgunova, Ekaterina [Karolinska Institutet NOVUM, Center of Structural Biochemistry, Hälsovägen 7-9, 141 57 Huddinge (Sweden); Illarionov, Boris; Saller, Sabine [Institut für Lebensmittelchemie, Universität Hamburg, Grindelallee 117, 20146 Hamburg (Germany); Popov, Aleksander [European Synchrotron Radiation Facility, BP 220, F-38043 Grenoble CEDEX 09 (France); Sambaiah, Thota [Department of Medicinal Chemistry and Molecular Pharmacology, Purdue University (United States); Bacher, Adelbert [Chemistry Department, Technical University of Munich, 85747 Garching (Germany); Cushman, Mark [Department of Medicinal Chemistry and Molecular Pharmacology, Purdue University (United States); Fischer, Markus [Institut für Lebensmittelchemie, Universität Hamburg, Grindelallee 117, 20146 Hamburg (Germany); Ladenstein, Rudolf, E-mail: rudolf.ladenstein@ki.se [Karolinska Institutet NOVUM, Center of Structural Biochemistry, Hälsovägen 7-9, 141 57 Huddinge (Sweden)

    2010-09-01

    Crystallographic studies of lumazine synthase, the penultimate enzyme of the riboflavin-biosynthetic pathway in B. anthracis, provide a structural framework for the design of antibiotic inhibitors, together with calorimetric and kinetic investigations of inhibitor binding. The crystal structure of lumazine synthase from Bacillus anthracis was solved by molecular replacement and refined to R_cryst = 23.7% (R_free = 28.4%) at a resolution of 3.5 Å. The structure reveals the icosahedral symmetry of the enzyme and specific features of the active site that are unique in comparison with previously determined orthologues. The application of isothermal titration calorimetry in combination with enzyme kinetics showed that three designed pyrimidine derivatives bind to lumazine synthase with micromolar dissociation constants and competitively inhibit the catalytic reaction. Structure-based modelling suggested the binding modes of the inhibitors in the active site and allowed an estimation of the possible contacts formed upon binding. The results provide a structural framework for the design of antibiotics active against B. anthracis.

  4. Cloning and heterologous expression of a novel subgroup of class IV polyhydroxyalkanoate synthase genes from the genus Bacillus.

    Science.gov (United States)

    Mizuno, Kouhei; Kihara, Takahiro; Tsuge, Takeharu; Lundgren, Benjamin R; Sarwar, Zaara; Pinto, Atahualpa; Nomura, Christopher T

    2017-01-01

    Many microorganisms harbor genes necessary to synthesize biodegradable plastics known as polyhydroxyalkanoates (PHAs). We surveyed a genomic database and discovered a new cluster of class IV PHA synthase genes (phaRC). These genes are different in sequence and operon structure from any previously reported PHA synthase. The newly discovered PhaRC synthase was demonstrated to produce PHAs in recombinant Escherichia coli.

  5. Insights from the sea: structural biology of marine polyketide synthases.

    Science.gov (United States)

    Akey, David L; Gehret, Jennifer J; Khare, Dheeraj; Smith, Janet L

    2012-10-01

    The world's oceans are a rich source of natural products with extremely interesting chemistry. Biosynthetic pathways have been worked out for a few, and the story is being enriched with crystal structures of interesting pathway enzymes. By far, the greatest number of structural insights from marine biosynthetic pathways has originated with studies of curacin A, a poster child for interesting marine chemistry with its cyclopropane and thiazoline rings, internal cis double bond, and terminal alkene. Using the curacin A pathway as a model, structural details are now available for a novel loading enzyme with remarkable dual decarboxylase and acetyltransferase activities, an Fe(2+)/α-ketoglutarate-dependent halogenase that dictates substrate binding order through conformational changes, a decarboxylase that establishes regiochemistry for cyclopropane formation, and a thioesterase with specificity for β-sulfated substrates that lead to terminal alkene offloading. The four curacin A pathway dehydratases reveal an intrinsic flexibility that may accommodate bulky or stiff polyketide intermediates. In the salinosporamide A pathway, active site volume determines the halide specificity of a halogenase that catalyzes for the synthesis of a halogenated building block. Structures of a number of putative polyketide cyclases may help in understanding reaction mechanisms and substrate specificities although their substrates are presently unknown.

  6. Structural and Functional Trends in Dehydrating Bimodules from trans-Acyltransferase Polyketide Synthases

    Energy Technology Data Exchange (ETDEWEB)

    Wagner, Drew T.; Zeng, Jia; Bailey, Constance B.; Gay, Darren C.; Yuan, Fang; Manion, Hannah R.; Keatinge-Clay, Adrian T. (Texas)

    2017-07-01

    In an effort to uncover the structural motifs and biosynthetic logic of the relatively uncharacterized trans-acyltransferase polyketide synthases, we have begun the dissection of the enigmatic dehydrating bimodules common in these enzymatic assembly lines. We report the 1.98 Å resolution structure of a ketoreductase (KR) from the first half of a type A dehydrating bimodule and the 2.22 Å resolution structure of a dehydratase (DH) from the second half of a type B dehydrating bimodule. The KR, from the third module of the bacillaene synthase, and the DH, from the tenth module of the difficidin synthase, possess features not observed in structurally characterized homologs. The DH architecture provides clues for how it catalyzes a unique double dehydration. Correlations between the chemistries proposed for dehydrating bimodules and bioinformatic analysis indicate that type A dehydrating bimodules generally produce an α/β-cis alkene moiety, while type B dehydrating bimodules generally produce an α/β-trans, γ/δ-cis diene moiety.

  7. Atypical composition and structure of the mitochondrial dimeric ATP synthase from Euglena gracilis.

    Science.gov (United States)

    Yadav, K N Sathish; Miranda-Astudillo, Héctor V; Colina-Tenorio, Lilia; Bouillenne, Fabrice; Degand, Hervé; Morsomme, Pierre; González-Halphen, Diego; Boekema, Egbert J; Cardol, Pierre

    2017-04-01

    Mitochondrial respiratory-chain complexes from Euglenozoa comprise classical subunits described in other eukaryotes (i.e. mammals and fungi) and subunits that are restricted to Euglenozoa (e.g. Euglena gracilis and Trypanosoma brucei). Here we studied the mitochondrial F1FO-ATP synthase (or Complex V) from the photosynthetic eukaryote E. gracilis in detail. The enzyme was purified by a two-step chromatographic procedure and its subunit composition was resolved by three-dimensional gel electrophoresis (BN/SDS/SDS). Twenty-two different subunits were identified by mass-spectrometry analyses, among which were the canonical α, β, γ, δ, ε, and OSCP subunits, and at least seven subunits previously found in Trypanosoma. The ADP/ATP carrier was also associated with the ATP synthase in a dimeric ATP synthasome. Single-particle analysis by transmission electron microscopy of the dimeric ATP synthase indicated that the structures of both the catalytic and central rotor parts are conserved while other structural features are original. These new features include a large membrane-spanning region joining the monomers, an external peripheral stalk and a structure that goes through the membrane and reaches the intermembrane space below the c-ring, the latter having not been reported for any mitochondrial F-ATPase. Copyright © 2017 Elsevier B.V. All rights reserved.

  8. Construction of a genomic DNA library with a TA vector and its application in cloning of the phytoene synthase gene from the cyanobacterium Spirulina platensis M-135

    Science.gov (United States)

    Yoshikazu, Kawata; Shin-Ichi, Yano; Hiroyuki, Kojima

    1998-03-01

    An efficient and simple method for constructing a genomic DNA library using a TA cloning vector is presented. It is based on the cleavage of genomic DNA by sonication and modification of fragment ends with Taq DNA polymerase, followed by ligation using a TA vector. This method was applied for cloning of the phytoene synthase gene crtB from Spirulina platensis. This method is useful when genomic DNA cannot be efficiently digested with restriction enzymes, a problem often encountered during the construction of a genomic DNA library of cyanobacteria.

  9. Expanding the chemical space of polyketides through structure-guided mutagenesis of Vitis vinifera stilbene synthase.

    Science.gov (United States)

    Bhan, Namita; Cress, Brady F; Linhardt, Robert J; Koffas, Mattheos

    2015-08-01

    Several natural polyketides (PKs) have been associated with important pharmaceutical properties. Type III polyketide synthases (PKS) that generate aromatic PKs have been studied extensively for their substrate promiscuity and product diversity. Stilbene synthase-like (STS) enzymes are unique in the type III PKS class as they possess a hydrogen-bonding network, furnishing them with thioesterase-like properties, resulting in aldol condensation of the polyketide intermediates formed. Chalcone synthases (CHS), in contrast, lack this hydrogen-bonding network, resulting primarily in the Claisen condensation of the polyketide intermediates formed. We have attempted to expand the chemical space of this interesting class of compounds by creating structure-guided mutants of Vitis vinifera STS. Further, we have utilized a previously established workflow to quickly compare the wild-type reaction products to those generated by the mutants and identify novel PKs formed by using XCMS analysis of LC-MS and LC-MS/MS data. Based on this approach, we were able to generate 15 previously unreported PK molecules by exploring the substrate promiscuity of the wild-type enzyme and all mutants using unnatural substrates. These structures were specific to STSs and cannot be formed by their closely related CHS-like counterparts. Copyright © 2015 Elsevier B.V. and Société Française de Biochimie et Biologie Moléculaire (SFBBM). All rights reserved.

  10. Functional Genomics Reveals That a Compact Terpene Synthase Gene Family Can Account for Terpene Volatile Production in Apple

    Science.gov (United States)

    Nieuwenhuizen, Niels J.; Green, Sol A.; Chen, Xiuyin; Bailleul, Estelle J.D.; Matich, Adam J.; Wang, Mindy Y.; Atkinson, Ross G.

    2013-01-01

    Terpenes are specialized plant metabolites that act as attractants to pollinators and as defensive compounds against pathogens and herbivores, but they also play an important role in determining the quality of horticultural food products. We show that the genome of cultivated apple (Malus domestica) contains 55 putative terpene synthase (TPS) genes, of which only 10 are predicted to be functional. This low number of predicted functional TPS genes compared with other plant species was supported by the identification of only eight potentially functional TPS enzymes in apple ‘Royal Gala’ expressed sequence tag databases, including the previously characterized apple (E,E)-α-farnesene synthase. In planta functional characterization of these TPS enzymes showed that they could account for the majority of terpene volatiles produced in cv Royal Gala, including the sesquiterpenes germacrene-D and (E)-β-caryophyllene, the monoterpenes linalool and α-pinene, and the homoterpene (E)-4,8-dimethyl-1,3,7-nonatriene. Relative expression analysis of the TPS genes indicated that floral and vegetative tissues were the primary sites of terpene production in cv Royal Gala. However, production of cv Royal Gala floral-specific terpenes and TPS genes was observed in the fruit of some heritage apple cultivars. Our results suggest that the apple TPS gene family has been shaped by a combination of ancestral and more recent genome-wide duplication events. The relatively small number of functional enzymes suggests that the remaining terpenes produced in floral and vegetative and fruit tissues are maintained under a positive selective pressure, while the small number of terpenes found in the fruit of modern cultivars may be related to commercial breeding strategies. PMID:23256150

  11. The Crystal Structures of the Open and Catalytically Competent Closed Conformation of Escherichia coli Glycogen Synthase

    Energy Technology Data Exchange (ETDEWEB)

    Sheng, Fang; Jia, Xiaofei; Yep, Alejandra; Preiss, Jack; Geiger, James H.; (MSU)

    2009-07-06

    Escherichia coli glycogen synthase (EcGS, EC 2.4.1.21) is a retaining glycosyltransferase (GT) that transfers glucose from adenosine diphosphate glucose to a glucan chain acceptor with retention of configuration at the anomeric carbon. EcGS belongs to the GT-B structural superfamily. Here we report several EcGS x-ray structures that together shed considerable light on the structure and function of these enzymes. The structure of the wild-type enzyme bound to ADP and glucose revealed a 15.2° overall domain-domain closure and provided for the first time the structure of the catalytically active, closed conformation of a glycogen synthase. The main chain carbonyl group of His-161, Arg-300, and Lys-305 are suggested by the structure to act as critical catalytic residues in the transglycosylation. Glu-377, previously thought to be catalytic, is found on the alpha-face of the glucose and plays an electrostatic role in the active site and as a glucose ring locator. This is also consistent with the structure of the EcGS(E377A)-ADP-HEPPSO complex, where the glucose moiety is either absent or disordered in the active site

  12. Distinct Structural Elements Dictate the Specificity of the Type III Pentaketide Synthase from Neurospora crassa

    Energy Technology Data Exchange (ETDEWEB)

    Rubin-Pitel, Sheryl B.; Zhang, Houjin; Vu, Trang; Brunzelle, Joseph S.; Zhao, Huimin; Nair, Satish K. (UIUC); (NWU)

    2009-01-15

    The fungal type III polyketide synthase 2'-oxoalkylresorcylic acid synthase (ORAS) primes with a range of acyl-Coenzyme A thioesters (C4-C20) and extends using malonyl-Coenzyme A to produce pyrones, resorcinols, and resorcylic acids. To gain insight into this unusual substrate specificity and product profile, we have determined the crystal structures of ORAS to 1.75 Å resolution, the Phe-252→Gly site-directed mutant to 2.1 Å resolution, and a binary complex of ORAS with eicosanoic acid to 2.0 Å resolution. The structures reveal a distinct rearrangement of structural elements near the active site that allows accommodation of long-chain fatty acid esters and a reorientation of the gating mechanism that controls cyclization and polyketide chain length. The roles of these structural elements are further elucidated by characterization of various structure-based site-directed variants. These studies establish an unexpected plasticity to the PKS fold, unanticipated from structural studies of other members of this enzyme family.

  13. Genome-wide analysis of the grapevine stilbene synthase multigenic family: genomic organization and expression profiles upon biotic and abiotic stresses

    Directory of Open Access Journals (Sweden)

    Vannozzi Alessandro

    2012-08-01

    Background: Plant stilbenes are a small group of phenylpropanoids, which have been detected in at least 72 unrelated plant species and accumulate in response to biotic and abiotic stresses such as infection, wounding, UV-C exposure and treatment with chemicals. Stilbenes are formed via the phenylalanine/polymalonate route, the last step of which is catalyzed by the enzyme stilbene synthase (STS), a type III polyketide synthase (PKS). Stilbene synthases are closely related to chalcone synthases (CHS), the key enzymes of the flavonoid pathway, as illustrated by the fact that both enzymes share the same substrates. To date, STSs have been cloned from peanut, pine, sorghum and grapevine, the only stilbene-producing fruiting plant for which the entire genome has been sequenced. Apart from sorghum, STS genes appear to exist as a family of closely related genes in these other plant species. Results: In this study a complete characterization of the STS multigenic family in grapevine has been performed, commencing with the identification, annotation and phylogenetic analysis of all members and integration of this information with a comprehensive set of gene expression analyses including healthy tissues at differential developmental stages and in leaves exposed to both biotic (downy mildew infection) and abiotic (wounding and UV-C exposure) stresses. At least thirty-three full-length sequences encoding VvSTS genes were identified, which, based on predicted amino acid sequences, cluster in 3 principal groups designated A, B and C. The majority of VvSTS genes cluster in groups B and C and are located on chr16 whereas the few gene family members in group A are found on chr10. Microarray and mRNA-seq expression analyses revealed different patterns of transcript accumulation between the different groups of VvSTS family members and between VvSTSs and VvCHSs. Indeed, under certain conditions the transcriptional response of VvSTS and VvCHS genes appears to be

  14. Targeted isolation, sequence assembly and characterization of two white spruce (Picea glauca) BAC clones for terpenoid synthase and cytochrome P450 genes involved in conifer defence reveal insights into a conifer genome

    Directory of Open Access Journals (Sweden)

    Ritland Carol

    2009-08-01

    Background: Conifers are a large group of gymnosperm trees which are separated from the angiosperms by more than 300 million years of independent evolution. Conifer genomes are extremely large and contain considerable amounts of repetitive DNA. Currently, conifer sequence resources exist predominantly as expressed sequence tags (ESTs) and full-length (FL)cDNAs. There is no genome sequence available for a conifer or any other gymnosperm. Conifer defence-related genes often group into large families with closely related members. The goals of this study are to assess the feasibility of targeted isolation and sequence assembly of conifer BAC clones containing specific genes from two large gene families, and to characterize large segments of genomic DNA sequence for the first time from a conifer. Results: We used a PCR-based approach to identify BAC clones for two target genes, a terpene synthase (3-carene synthase; 3CAR) and a cytochrome P450 (CYP720B4), from a non-arrayed genomic BAC library of white spruce (Picea glauca). Shotgun genomic fragments isolated from the BAC clones were sequenced to a depth of 15.6- and 16.0-fold coverage, respectively. Assembly and manual curation yielded sequence scaffolds of 172 kbp (3CAR) and 94 kbp (CYP720B4) long. Inspection of the genomic sequences revealed the intron-exon structures, the putative promoter regions and putative cis-regulatory elements of these genes. Sequences related to transposable elements (TEs), high-complexity repeats and simple repeats were prevalent and comprised approximately 40% of the sequenced genomic DNA. An in silico simulation of the effect of sequencing depth on the quality of the sequence assembly provides direction for future efforts of conifer genome sequencing. Conclusion: We report the first targeted cloning, sequencing, assembly, and annotation of large segments of genomic DNA from a conifer. We demonstrate that genomic BAC clones for individual members of multi-member gene

  15. Biochemistry and Crystal Structure of Ectoine Synthase: A Metal-Containing Member of the Cupin Superfamily.

    Directory of Open Access Journals (Sweden)

    Nils Widderich

    Ectoine is a compatible solute and chemical chaperone widely used by members of the Bacteria and a few Archaea to fend off the detrimental effects of high external osmolarity on cellular physiology and growth. Ectoine synthase (EctC) catalyzes the last step in ectoine production and mediates the ring closure of the substrate N-gamma-acetyl-L-2,4-diaminobutyric acid through a water elimination reaction. However, the crystal structure of ectoine synthase is not known and a clear understanding of how its fold contributes to enzyme activity is thus lacking. Using the ectoine synthase from the cold-adapted marine bacterium Sphingopyxis alaskensis (Sa), we report here both a detailed biochemical characterization of the EctC enzyme and the high-resolution crystal structure of its apo-form. Structural analysis classified the (Sa)EctC protein as a member of the cupin superfamily. EctC forms a dimer with a head-to-tail arrangement, both in solution and in the crystal structure. The interface of the dimer assembly is shaped through backbone contacts and weak hydrophobic interactions mediated by two beta-sheets within each monomer. We show for the first time that ectoine synthase harbors a catalytically important metal co-factor; metal depletion and reconstitution experiments suggest that EctC is probably an iron-dependent enzyme. We found that EctC not only effectively converts its natural substrate N-gamma-acetyl-L-2,4-diaminobutyric acid into ectoine through a cyclocondensation reaction, but that it can also use the isomer N-alpha-acetyl-L-2,4-diaminobutyric acid as its substrate, albeit with substantially reduced catalytic efficiency. Structure-guided site-directed mutagenesis experiments targeting amino acid residues that are evolutionarily highly conserved among the extended EctC protein family, including those forming the presumptive iron-binding site, were conducted to functionally analyze the properties of the resulting EctC variants. An assessment of

  16. Structure of the Y94F mutant of Escherichia coli thymidylate synthase

    International Nuclear Information System (INIS)

    Roberts, Sue A.; Hyatt, David C.; Honts, Jerry E.; Changchien, Liming; Maley, Gladys F.; Maley, Frank; Montfort, William R.

    2006-01-01

    Mutation of Tyr94 of E. coli thymidylate synthase to phenylalanine leads to a protein with kcat reduced by a factor of 400. The Y94F structure is essentially identical to the wild-type structure, which is consistent with a catalytic role for the phenolic OH. Tyr94 of Escherichia coli thymidylate synthase is thought to be involved, either directly or by activation of a water molecule, in the abstraction of a proton from C5 of the 2′-deoxyuridine 5′-monophosphate (dUMP) substrate. Mutation of Tyr94 leads to a 400-fold loss in catalytic activity. The structure of the Y94F mutant has been determined in the native state and as a ternary complex with thymidine 5′-monophosphate (dTMP) and 10-propargyl 5,8-dideazafolate (PDDF). There are no structural changes ascribable to the mutation other than loss of a water molecule hydrogen bonded to the tyrosine OH, which is consistent with a catalytic role for the phenolic OH

  17. Structure-function features of a Mycoplasma glycolipid synthase derived from structural data integration, molecular simulations, and mutational analysis.

    Science.gov (United States)

    Romero-García, Javier; Francisco, Carles; Biarnés, Xevi; Planas, Antoni

    2013-01-01

    Glycoglycerolipids are structural components of mycoplasma membranes with a fundamental role in membrane properties and stability. Their biosynthesis is mediated by glycosyltransferases (GT) that catalyze the transfer of glycosyl units from a sugar nucleotide donor to diacylglycerol. The essential function of glycolipid synthases in mycoplasma viability, and the absence of glycoglycerolipids in animal host cells make these GT enzymes a target for drug discovery by designing specific inhibitors. However, rational drug design has been hampered by the lack of structural information for any mycoplasma GT. Most of the annotated GTs in pathogenic mycoplasmas belong to family GT2. We had previously shown that MG517 in Mycoplasma genitalium is a GT-A family GT2 membrane-associated glycolipid synthase. We present here a series of structural models of MG517 obtained by homology modeling following a multiple-template approach. The models have been validated by mutational analysis and refined by long scale molecular dynamics simulations. Based on the models, key structure-function relationships have been identified: The N-terminal GT domain has a GT-A topology that includes a non-conserved variable region involved in acceptor substrate binding. Glu193 is proposed as the catalytic base in the GT mechanism, and Asp40, Tyr126, Tyr169, Ile170 and Tyr218 define the substrates binding site. Mutation Y169F increases the enzyme activity and significantly alters the processivity (or sequential transferase activity) of the enzyme. This is the first structural model of a GT-A glycoglycerolipid synthase and provides preliminary insights into structure and function relationships in this family of enzymes.

  18. Structure and Mechanism of the Farnesyl Diphosphate Synthase from Trypanosoma cruzi: Implications for Drug Design

    Energy Technology Data Exchange (ETDEWEB)

    Gabelli, S.; McLellan, J.; Montalvetti, A.; Oldfield, E.; Docampo, R.; Amzel, L.

    2006-01-01

    Trypanosoma cruzi, the causative agent of Chagas disease, has recently been shown to be sensitive to the action of the bisphosphonates currently used in bone resorption therapy. These compounds target the mevalonate pathway by inhibiting farnesyl diphosphate synthase (farnesyl pyrophosphate synthase, FPPS), the enzyme that condenses the diphosphates of C5 alcohols (isopentenyl and dimethylallyl) to form C10 and C15 diphosphates (geranyl and farnesyl). The structures of the T. cruzi FPPS (TcFPPS) alone and in two complexes with substrates and inhibitors reveal that following binding of the two substrates and three Mg2+ ions, the enzyme undergoes a conformational change consisting of a hinge-like closure of the binding site. In this conformation, it would be possible for the enzyme to bind a bisphosphonate inhibitor that spans the sites usually occupied by dimethylallyl diphosphate (DMAPP) and the homoallyl moiety of isopentenyl diphosphate. This observation may lead to the design of new, more potent anti-trypanosomal bisphosphonates, because existing FPPS inhibitors occupy only the DMAPP site. In addition, the structures provide an important mechanistic insight: after its formation, geranyl diphosphate can swing without leaving the enzyme, from the product site to the substrate site to participate in the synthesis of farnesyl diphosphate.

  19. Functional and Structural Characterization of a (+)-Limonene Synthase from Citrus sinensis.

    Science.gov (United States)

    Morehouse, Benjamin R; Kumar, Ramasamy P; Matos, Jason O; Olsen, Sarah Naomi; Entova, Sonya; Oprian, Daniel D

    2017-03-28

    Terpenes make up the largest and most diverse class of natural compounds and have important commercial and medical applications. Limonene is a cyclic monoterpene (C10) present in nature as two enantiomers, (+) and (-), which are produced by different enzymes. The mechanism of production of the (-)-enantiomer has been studied in great detail, but to understand how enantiomeric selectivity is achieved in this class of enzymes, it is important to develop a thorough biochemical description of enzymes that generate (+)-limonene, as well. Here we report the first cloning and biochemical characterization of a (+)-limonene synthase from navel orange (Citrus sinensis). The enzyme obeys classical Michaelis-Menten kinetics and produces exclusively the (+)-enantiomer. We have determined the crystal structure of the apoprotein in an "open" conformation at 2.3 Å resolution. Comparison with the structure of (-)-limonene synthase (Mentha spicata), which is representative of a fully closed conformation (Protein Data Bank entry 2ONG), reveals that the short H-α1 helix moves nearly 5 Å inward upon substrate binding, and a conserved Tyr flips to point its hydroxyl group into the active site.

  20. Structural genomic variation in ischemic stroke

    Science.gov (United States)

    Matarin, Mar; Simon-Sanchez, Javier; Fung, Hon-Chung; Scholz, Sonja; Gibbs, J. Raphael; Hernandez, Dena G.; Crews, Cynthia; Britton, Angela; Wavrant De Vrieze, Fabienne; Brott, Thomas G.; Brown, Robert D.; Worrall, Bradford B.; Silliman, Scott; Case, L. Douglas; Hardy, John A.; Rich, Stephen S.; Meschia, James F.; Singleton, Andrew B.

    2008-01-01

    Technological advances in molecular genetics allow rapid and sensitive identification of genomic copy number variants (CNVs). This, in turn, has sparked interest in the function such variation may play in disease. While a role for copy number mutations as a cause of Mendelian disorders is well established, it is unclear whether CNVs may affect risk for common complex disorders. We sought to investigate whether CNVs may modulate risk for ischemic stroke (IS) and to provide a catalog of CNVs in patients with this disorder by analyzing copy number metrics produced as a part of our previous genome-wide single-nucleotide polymorphism (SNP)-based association study of ischemic stroke in a North American white population. We examined CNVs in 263 patients with ischemic stroke (IS). Each identified CNV was compared with changes identified in 275 neurologically normal controls. Our analysis identified 247 CNVs, corresponding to 187 insertions (76%; 135 heterozygous; 25 homozygous duplications or triplications; 2 heterosomic) and 60 deletions (24%; 40 heterozygous deletions; 3 homozygous deletions; 14 heterosomic deletions). Most alterations (81%) were the same as, or overlapped with, previously reported CNVs. We report here the first genome-wide analysis of CNVs in IS patients. In summary, our study did not detect any common genomic structural variation unequivocally linked to IS, although we cannot exclude that smaller CNVs or CNVs in genomic regions poorly covered by this methodology may confer risk for IS. The application of genome-wide SNP arrays now facilitates the evaluation of structural changes through the entire genome as part of a genome-wide genetic association study. PMID:18288507

  1. Structure of the Cellulose Synthase Complex of Gluconacetobacter hansenii at 23.4 Å Resolution.

    Directory of Open Access Journals (Sweden)

    Juan Du

    Full Text Available Bacterial crystalline cellulose is used in biomedical and industrial applications, but the molecular mechanisms of synthesis are unclear. Unlike most bacteria, which make non-crystalline cellulose, Gluconacetobacter hansenii extrudes profuse amounts of crystalline cellulose. Its cellulose synthase (AcsA) exists as a complex with accessory protein AcsB, forming a 'terminal complex' (TC) that has been visualized by freeze-fracture TEM at the base of ribbons of crystalline cellulose. The catalytic AcsAB complex is embedded in the cytoplasmic membrane. The C-terminal portion of AcsC is predicted to form a translocation channel in the outer membrane, with the rest of AcsC possibly interacting with AcsD in the periplasm. It is thus believed that synthesis from an organized array of TCs, coordinated with extrusion by AcsC and AcsD, enables this bacterium to make crystalline cellulose. The only structural data that exist for this system are the above-mentioned freeze-fracture TEM images, fluorescence microscopy images revealing that TCs align in a row, a crystal structure of AcsD bound to cellopentaose, and a crystal structure of the PilZ domain of AcsA. Here we advance our understanding of the structural basis for crystalline cellulose production by bacterial cellulose synthase by determining a negative stain structure resolved to 23.4 Å for highly purified AcsAB complex that catalyzed incorporation of UDP-glucose into β-1,4-glucan chains, and responded to the presence of allosteric activator cyclic diguanylate. Although the AcsAB complex was functional in vitro, the synthesized cellulose was not visible in TEM. The negative stain structure revealed that AcsAB is very similar to that of the BcsAB synthase of Rhodobacter sphaeroides, a non-crystalline cellulose producing bacterium. The results indicate that the crystalline cellulose producing and non-crystalline cellulose producing bacteria share conserved catalytic and membrane translocation components, and

  2. Functional Coverage of the Human Genome by Existing Structures, Structural Genomics Targets, and Homology Models.

    Directory of Open Access Journals (Sweden)

    2005-08-01

    Full Text Available The bias in protein structure and function space resulting from experimental limitations and targeting of particular functional classes of proteins by structural biologists has long been recognized, but never continuously quantified. Using the Enzyme Commission and the Gene Ontology classifications as a reference frame, and integrating structure data from the Protein Data Bank (PDB), target sequences from the structural genomics projects, structure homology derived from the SUPERFAMILY database, and genome annotations from Ensembl and NCBI, we provide a quantified view, both at the domain and whole-protein levels, of the current and projected coverage of protein structure and function space relative to the human genome. Protein structures currently provide at least one domain that covers 37% of the functional classes identified in the genome; whole structure coverage exists for 25% of the genome. If all the structural genomics targets were solved (twice the current number of structures in the PDB), it is estimated that structures of one domain would cover 69% of the functional classes identified and complete structure coverage would be 44%. Homology models from existing experimental structures extend the 37% coverage to 56% of the genome as single domains and 25% to 31% for complete structures. Coverage from homology models is not evenly distributed by protein family, reflecting differing degrees of sequence and structure divergence within families. While these data provide coverage, conversely, they also systematically highlight functional classes of proteins for which structures should be determined. Current key functional families without structure representation are highlighted here; updated information on the "most wanted list" that should be solved is available on a weekly basis from http://function.rcsb.org:8080/pdb/function_distribution/index.html.

  3. Gene structure, phylogeny and expression profile of the sucrose synthase gene family in cacao (Theobroma cacao L.).

    Science.gov (United States)

    Li, Fupeng; Hao, Chaoyun; Yan, Lin; Wu, Baoduo; Qin, Xiaowei; Lai, Jianxiong; Song, Yinghui

    2015-09-01

    In higher plants, sucrose synthase (Sus, EC 2.4.1.13) is widely considered a key enzyme involved in sucrose metabolism. Although several paralogous genes encoding different isozymes of Sus have been identified and characterized in multiple plant genomes, to date detailed information about the Sus genes is lacking for cacao. This study reports the identification of six novel Sus genes from the economically important cacao tree. Analyses of the gene structure and phylogeny of the Sus genes demonstrated evolutionary conservation in the Sus family across cacao and other plant species. The expression of cacao Sus genes was investigated via real-time PCR in various tissues and in different developmental phases of leaf, flower bud and pod. The Sus genes exhibited distinct but partially redundant expression profiles in cacao, with TcSus1, TcSus5 and TcSus6 being the predominant genes in the bark with phloem, and TcSus2 predominantly expressed in the seed during the stereotype stage. TcSus3 and TcSus4 were detected at significantly higher levels in the pod husk and seed coat during pod development, and showed development-dependent expression profiles in the cacao pod. These results provide new insights into the evolution of the cacao Sus gene family, and basic information that will assist in elucidating its functions.

  4. Applications of new biophysical techniques to supramolecular structure of ATP synthase

    International Nuclear Information System (INIS)

    Zhu Jie; Wang Guodong

    2007-01-01

    Modern physical techniques offer an abundant and effective set of methods for studying the structure and function of ATP synthase. We first emphasize the historical interplay between physical techniques and scientific progress, introduce physical techniques commonly used in protein research, such as mass spectrometry, nuclear magnetic resonance, synchrotron X-ray diffraction, infrared spectroscopy and ultraviolet spectroscopy, and review the current status of their application to ATP synthase. We then turn to emerging, less conventional instruments, namely the atomic force microscope and fluorescence resonance energy transfer (FRET), which have attracted professional attention, and introduce their latest applications and research achievements. Comparing the development of these techniques in recent years, we set out the strengths and shortcomings of each kind of equipment introduced. We conclude that the available instruments need to be used effectively and in a way suited to their individual characteristics, and we propose optimal research routes that emphasize new techniques and novel methods, namely the atomic force microscope and FRET. (authors)

  5. Structural Basis for Cyclization Specificity of Two Azotobacter Type III Polyketide Synthases

    Science.gov (United States)

    Satou, Ryutaro; Miyanaga, Akimasa; Ozawa, Hiroki; Funa, Nobutaka; Katsuyama, Yohei; Miyazono, Ken-ichi; Tanokura, Masaru; Ohnishi, Yasuo; Horinouchi, Sueharu

    2013-01-01

    Type III polyketide synthases (PKSs) show diverse cyclization specificity. We previously characterized two Azotobacter type III PKSs (ArsB and ArsC) with different cyclization specificity. ArsB and ArsC, which share a high sequence identity (71%), produce alkylresorcinols and alkylpyrones through aldol condensation and lactonization of the same polyketomethylene intermediate, respectively. Here we identified a key amino acid residue for the cyclization specificity of each enzyme by site-directed mutagenesis. Trp-281 of ArsB corresponded to Gly-284 of ArsC in the amino acid sequence alignment. The ArsB W281G mutant synthesized alkylpyrone but not alkylresorcinol. In contrast, the ArsC G284W mutant synthesized alkylresorcinol with a small amount of alkylpyrone. These results indicate that this amino acid residue (Trp-281 of ArsB or Gly-284 of ArsC) should occupy a critical position for the cyclization specificity of each enzyme. We then determined crystal structures of the wild-type and G284W ArsC proteins at resolutions of 1.76 and 1.99 Å, respectively. Comparison of these two ArsC structures indicates that the G284W substitution brings a steric wall to the active site cavity, resulting in a significant reduction of the cavity volume. We postulate that the polyketomethylene intermediate can be folded to a suitable form for aldol condensation only in such a relatively narrow cavity of ArsC G284W (and presumably ArsB). This is the first report on the alteration of cyclization specificity from lactonization to aldol condensation for a type III PKS. The ArsC G284W structure is significant as it is the first reported structure of a microbial resorcinol synthase. PMID:24100027

  6. Automating gene library synthesis by structure-based combinatorial protein engineering: examples from plant sesquiterpene synthases.

    Science.gov (United States)

    Dokarry, Melissa; Laurendon, Caroline; O'Maille, Paul E

    2012-01-01

    Structure-based combinatorial protein engineering (SCOPE) is a homology-independent recombination method to create multiple crossover gene libraries by assembling defined combinations of structural elements ranging from single mutations to domains of protein structure. SCOPE was originally inspired by DNA shuffling, which mimics recombination during meiosis, where mutations from parental genes are "shuffled" to create novel combinations in the resulting progeny. DNA shuffling utilizes sequence identity between parental genes to mediate template-switching events (the annealing and extension of one parental gene fragment on another) in PCR reassembly reactions to generate crossovers and hence recombination between parental genes. In light of the conservation of protein structure and degeneracy of sequence, SCOPE was developed to enable the "shuffling" of distantly related genes with no requirement for sequence identity. The central principle involves the use of oligonucleotides to encode for crossover regions to choreograph template-switching events during PCR assembly of gene fragments to create chimeric genes. This approach was initially developed to create libraries of hybrid DNA polymerases from distantly related parents, and later developed to create a combinatorial mutant library of sesquiterpene synthases to explore the catalytic landscapes underlying the functional divergence of related enzymes. This chapter presents a simplified protocol of SCOPE that can be integrated with different mutagenesis techniques and is suitable for automation by liquid-handling robots. Two examples are presented to illustrate the application of SCOPE to create gene libraries using plant sesquiterpene synthases as the model system. In the first example, we outline how to create an active-site library as a series of complex mixtures of diverse mutants. 
    In the second example, we outline how to create a focused library as an array of individual clones to distil minimal combinations of
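    To make the idea of a combinatorial mutant library concrete, the sketch below enumerates every combination of allowed residues at a set of variable positions. It illustrates only the combinatorics, not the SCOPE protocol itself; the parent sequence, the positions and the residue choices are all made-up placeholders.

```python
from itertools import product

# Hypothetical parent sequence and variable positions (0-based index ->
# allowed residues). None of these values come from the chapter.
parent = "MKLSTGDE"
variable_positions = {
    2: ["A", "V"],
    5: ["T", "S", "I"],
}


def enumerate_library(parent, variable_positions):
    """Return every sequence obtained by combining the allowed residues."""
    positions = sorted(variable_positions)
    choices = [variable_positions[p] for p in positions]
    library = []
    for combo in product(*choices):
        seq = list(parent)
        for pos, residue in zip(positions, combo):
            seq[pos] = residue
        library.append("".join(seq))
    return library


library = enumerate_library(parent, variable_positions)
print(len(library))
```

    The library size is the product of the number of choices at each position, which is why such libraries grow quickly and why a focused subset of individual clones can be preferable to a complex mixture.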

  7. Benzalacetone Synthase

    Directory of Open Access Journals (Sweden)

    Ikuro Abe

    2012-03-01

    Full Text Available Benzalacetone synthase from the medicinal plant Rheum palmatum (Polygonaceae) (RpBAS) is a plant-specific type III polyketide synthase (PKS) of the chalcone synthase (CHS) superfamily. RpBAS catalyzes the one-step, decarboxylative condensation of 4-coumaroyl-CoA with malonyl-CoA to produce the C6-C4 benzalacetone scaffold. The X-ray crystal structures of RpBAS confirmed that the diketide-forming activity is attributable to the characteristic substitution of the conserved active-site "gatekeeper" Phe with Leu. Furthermore, the crystal structures suggested that RpBAS employs novel catalytic machinery for the thioester bond cleavage of the enzyme-bound diketide intermediate and the final decarboxylation reaction to produce benzalacetone. Finally, by exploiting the remarkable substrate tolerance and catalytic versatility of RpBAS, precursor-directed biosynthesis efficiently generated chemically and structurally divergent, unnatural novel polyketide scaffolds. These findings provided a structural basis for the functional diversity of the type III PKS enzymes.

  8. Using Genomics for Natural Product Structure Elucidation.

    Science.gov (United States)

    Tietz, Jonathan I; Mitchell, Douglas A

    2016-01-01

    Natural products (NPs) are the most historically bountiful source of chemical matter for drug development, especially for anti-infectives. With insights gleaned from genome mining, interest in natural product discovery has been reinvigorated. An essential stage in NP discovery is structural elucidation, which sheds light not only on the chemical composition of a molecule but also its novelty, properties, and derivatization potential. The history of structure elucidation is replete with technique-based revolutions: combustion analysis, crystallography, UV, IR, MS, and NMR have each provided game-changing advances; the latest such advance is genomics. All natural products have a genetic basis, and the ability to obtain and interpret genomic information for structure elucidation is increasingly available at low cost to non-specialists. In this review, we describe the value of genomics as a structural elucidation technique, especially from the perspective of the natural product chemist approaching an unknown metabolite. Herein we first introduce the databases and programs of interest to the natural products chemist, with an emphasis on those currently most suited for general usability. We describe strategies for linking observed natural product-linked phenotypes to their corresponding gene clusters. We then discuss techniques for extracting structural information from genes, illustrated with numerous case examples. We also provide an analysis of the biases and limitations of the field with recommendations for future development. Our overview is not only aimed at biologically-oriented researchers already at ease with bioinformatic techniques, but also, in particular, at natural product, organic, and/or medicinal chemists not previously familiar with genomic techniques.

  9. A data management system for structural genomics

    Directory of Open Access Journals (Sweden)

    O'Toole Nicholas

    2004-06-01

    Full Text Available Abstract Background Structural genomics (SG) projects aim to determine thousands of protein structures by the development of high-throughput techniques for all steps of the experimental structure determination pipeline. Crucial to the success of such endeavours is the careful tracking and archiving of experimental and external data on protein targets. Results We have developed a sophisticated data management system for structural genomics. Central to the system is an Oracle-based, SQL-interfaced database. The database schema deals with all facets of the structure determination process, from target selection to data deposition. Users access the database via any web browser. Experimental data is input by users with pre-defined web forms. Data can be displayed according to numerous criteria. A list of all current target proteins can be viewed, with links for each target to associated entries in external databases. To avoid unnecessary work on targets, our data management system matches protein sequences weekly using BLAST to entries in the Protein Data Bank and to targets of other SG centers worldwide. Conclusion Our system is a working, effective and user-friendly data management tool for structural genomics projects. In this report we present a detailed summary of the various capabilities of the system, using real target data as examples, and indicate our plans for future enhancements.
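    As a rough illustration of how a target-tracking schema of this kind might be organized, the sketch below defines two tables and a join query. It is not the authors' schema: the table and column names are hypothetical, and Python's built-in `sqlite3` module stands in for the Oracle database described in the abstract.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for the Oracle back end.
conn = sqlite3.connect(":memory:")

# Hypothetical schema: one row per target, many experiment rows per target.
conn.executescript("""
CREATE TABLE target (
    target_id INTEGER PRIMARY KEY,
    name      TEXT NOT NULL,
    sequence  TEXT NOT NULL,
    status    TEXT NOT NULL DEFAULT 'selected'
);
CREATE TABLE experiment (
    experiment_id INTEGER PRIMARY KEY,
    target_id     INTEGER NOT NULL REFERENCES target(target_id),
    stage         TEXT NOT NULL,
    result        TEXT
);
""")

# Made-up example records.
conn.execute(
    "INSERT INTO target (name, sequence) VALUES (?, ?)",
    ("T0001", "MKLSTGDE"),
)
conn.execute(
    "INSERT INTO experiment (target_id, stage, result) VALUES (?, ?, ?)",
    (1, "cloning", "success"),
)
conn.commit()

# List each target together with its recorded pipeline stages.
rows = conn.execute("""
    SELECT t.name, t.status, e.stage, e.result
    FROM target AS t
    JOIN experiment AS e ON e.target_id = t.target_id
""").fetchall()
print(rows)
```

    Keeping experiments in a separate table keyed by target lets one target accumulate records for every pipeline stage, from selection through deposition, without changing the schema.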

  10. Interrogating the druggable genome with structural informatics.

    Science.gov (United States)

    Hambly, Kevin; Danzer, Joseph; Muskal, Steven; Debe, Derek A

    2006-08-01

    Structural genomics projects are producing protein structure data at an unprecedented rate. In this paper, we present the Target Informatics Platform (TIP), a novel structural informatics approach for amplifying the rapidly expanding body of experimental protein structure information to enhance the discovery and optimization of small molecule protein modulators on a genomic scale. In TIP, existing experimental structure information is augmented using a homology modeling approach, and binding sites across multiple target families are compared using a clique detection algorithm. We report here a detailed analysis of the structural coverage for the set of druggable human targets, highlighting drug target families where the level of structural knowledge is currently quite high, as well as those areas where structural knowledge is sparse. Furthermore, we demonstrate the utility of TIP's intra- and inter-family binding site similarity analysis using a series of retrospective case studies. Our analysis underscores the utility of a structural informatics infrastructure for extracting drug discovery-relevant information from structural data, aiding researchers in the identification of lead discovery and optimization opportunities as well as potential "off-target" liabilities.
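    The abstract does not say which clique detection algorithm TIP uses. As a general illustration of the technique, the sketch below applies the basic Bron-Kerbosch procedure to a small made-up compatibility graph, in which nodes could stand for matched binding-site features and edges for geometrically consistent pairs. The graph and the function name are assumptions for this example only.

```python
def maximal_cliques(graph):
    """Return all maximal cliques of an undirected graph (Bron-Kerbosch).

    `graph` maps each node to the set of its neighbours.
    """
    cliques = []

    def expand(r, p, x):
        # r: current clique, p: candidates, x: already-processed nodes.
        if not p and not x:
            cliques.append(sorted(r))
            return
        for v in list(p):
            expand(r | {v}, p & graph[v], x & graph[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(graph), set())
    return cliques


# Made-up compatibility graph.
graph = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}

cliques = maximal_cliques(graph)
largest = max(cliques, key=len)
print(largest)
```

    In a binding-site comparison of this general kind, the largest clique would correspond to the biggest set of features that can be matched consistently between two sites.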

  11. Crystal structure of plant acetohydroxyacid synthase, the target for several commercial herbicides.

    Science.gov (United States)

    Garcia, Mario Daniel; Wang, Jian-Guo; Lonhienne, Thierry; Guddat, Luke William

    2017-07-01

    Acetohydroxyacid synthase (AHAS, EC 2.2.1.6) is the first enzyme in the branched-chain amino acid biosynthesis pathway. Five of the most widely used commercial herbicides (i.e. sulfonylureas, imidazolinones, triazolopyrimidines, pyrimidinyl-benzoates and sulfonylamino-carbonyl-triazolinones) target this enzyme. Here we have determined the first crystal structure of a plant AHAS in the absence of any inhibitor (2.9 Å resolution) and it shows that the herbicide-binding site adopts a folded state even in the absence of an inhibitor. This is unexpected because the equivalent regions for herbicide binding in uninhibited Saccharomyces cerevisiae AHAS crystal structures are either disordered, or adopt a different fold when the herbicide is not present. In addition, the structure provides an explanation as to why some herbicides are more potent inhibitors of Arabidopsis thaliana AHAS compared to AHASs from other species (e.g. S. cerevisiae). The elucidation of the native structure of plant AHAS provides a new platform for future rational structure-based herbicide design efforts. The coordinates and structure factors for uninhibited AtAHAS have been deposited in the Protein Data Bank (www.pdb.org) with the PDB ID code 5K6Q. © 2017 Federation of European Biochemical Societies.

  12. Occurrence, structure, and evolution of nitric oxide synthase-like proteins in the plant kingdom.

    Science.gov (United States)

    Jeandroz, Sylvain; Wipf, Daniel; Stuehr, Dennis J; Lamattina, Lorenzo; Melkonian, Michael; Tian, Zhijian; Zhu, Ying; Carpenter, Eric J; Wong, Gane Ka-Shu; Wendehenne, David

    2016-03-01

    Nitric oxide (NO) signaling regulates various physiological processes in both animals and plants. In animals, NO synthesis is mainly catalyzed by NO synthase (NOS) enzymes. Although NOS-like activities that are sensitive to mammalian NOS inhibitors have been detected in plant extracts, few bona fide plant NOS enzymes have been identified. We searched the data set produced by the 1000 Plants (1KP) international consortium for the presence of transcripts encoding NOS-like proteins in over 1000 species of land plants and algae. We also searched for genes encoding NOS-like enzymes in 24 publicly available algal genomes. We identified no typical NOS sequences in 1087 sequenced transcriptomes of land plants. In contrast, we identified NOS-like sequences in 15 of the 265 algal species analyzed. Even if the presence of NOS enzymes assembled from multipolypeptides in plants cannot be conclusively discarded, the emerging data suggest that, instead of generating NO with evolutionarily conserved NOS enzymes, land plants have evolved finely regulated nitrate assimilation and reduction processes to synthesize NO through a mechanism different than that in animals. Copyright © 2016, American Association for the Advancement of Science.

  13. Structural and Kinetic Properties of Lumazine Synthase Isoenzymes in the Order Rhizobiales

    Energy Technology Data Exchange (ETDEWEB)

    Klinke, S.; Zylberman, V.; Bonomi, H.; Haase, I.; Guimaraes, B.; Braden, B.; Bacher, A.; Fischer, M.; Goldbaum, F.

    2007-01-01

    6,7-Dimethyl-8-ribityllumazine synthase (lumazine synthase; LS) catalyzes the penultimate step in the biosynthesis of riboflavin in plants and microorganisms. This protein is known to exhibit different quaternary assemblies between species, existing as free pentamers, decamers (dimers of pentamers) and icosahedrally arranged dodecamers of pentamers. A phylogenetic analysis on eubacterial, fungal and plant LSs allowed us to classify them into two categories: Type I LSs (pentameric or icosahedral) and Type II LSs (decameric). The Rhizobiales represent an order of α-proteobacteria that includes, among others, the genera Mesorhizobium, Agrobacterium and Brucella. Here, we present structural and kinetic studies on several LSs from Rhizobiales. Interestingly, Mesorhizobium and Brucella encode both a Type I LS and a Type II LS called RibH1 and RibH2, respectively. We show that Type II LSs appear to be almost inactive, whereas Type I LSs present a highly variable catalytic activity according to the genus. Additionally, we have solved four RibH1/RibH2 crystallographic structures from the genera Mesorhizobium and Brucella. The relationship between the active-site architecture and catalytic properties in these isoenzymes is discussed, and a model that describes the enzymatic behavior is proposed. Furthermore, sequence alignment studies allowed us to extend our results to the genus Agrobacterium. Our results suggest that the selective pressure controlling the riboflavin pathway favored the evolution of catalysts with low reaction rates, since the excess of flavins in the intracellular pool in Rhizobiales could act as a negative factor when these bacteria are exposed to oxidative or nitrosative stress.

  14. Structure of a NADH-insensitive hexameric citrate synthase that resists acid inactivation.

    Science.gov (United States)

    Francois, Julie A; Starks, Courtney M; Sivanuntakorn, Sasitorn; Jiang, Hong; Ransome, Aaron E; Nam, Jeong-Won; Constantine, Charles Z; Kappock, T Joseph

    2006-11-14

    Acetobacter aceti converts ethanol to acetic acid, and strains highly resistant to both are used to make vinegar. A. aceti survives acetic acid exposure by tolerating cytoplasmic acidification, which implies an unusual adaptation of cytoplasmic components to acidic conditions. A. aceti citrate synthase (AaCS), a hexameric type II citrate synthase, is required for acetic acid resistance and, therefore, would be expected to function at low pH. Recombinant AaCS has intrinsic acid stability that may be a consequence of strong selective pressure to function at low pH, and unexpectedly high thermal stability for a protein that has evolved to function at approximately 30 °C. The crystal structure of AaCS, complexed with oxaloacetate (OAA) and the inhibitor carboxymethyldethia-coenzyme A (CMX), was determined to 1.85 Å resolution using protein purified by a tandem affinity purification procedure. This is the first crystal structure of a "closed" type II CS, and its active site residues interact with OAA and CMX in the same manner observed in the corresponding type I chicken CS.OAA.CMX complex. While AaCS is not regulated by NADH, it retains many of the residues used by Escherichia coli CS (EcCS) for NADH binding. The surface of AaCS is abundantly decorated with basic side chains and has many fewer uncompensated acidic charges than EcCS; this constellation of charged residues is stable in varied pH environments and may be advantageous in the A. aceti cytoplasm.

  15. Structure of the Catalytic Domain of the Class I Polyhydroxybutyrate Synthase from Cupriavidus necator.

    Science.gov (United States)

    Wittenborn, Elizabeth C; Jost, Marco; Wei, Yifeng; Stubbe, JoAnne; Drennan, Catherine L

    2016-11-25

    Polyhydroxybutyrate synthase (PhaC) catalyzes the polymerization of 3-(R)-hydroxybutyryl-coenzyme A as a means of carbon storage in many bacteria. The resulting polymers can be used to make biodegradable materials with properties similar to those of thermoplastics and are an environmentally friendly alternative to traditional petroleum-based plastics. A full biochemical and mechanistic understanding of this process has been hindered in part by a lack of structural information on PhaC. Here we present the first structure of the catalytic domain (residues 201-589) of the class I PhaC from Cupriavidus necator (formerly Ralstonia eutropha) to 1.80 Å resolution. We observe a symmetrical dimeric architecture in which the active site of each monomer is separated from the other by ∼33 Å across an extensive dimer interface, suggesting a mechanism in which polyhydroxybutyrate biosynthesis occurs at a single active site. The structure additionally highlights key side chain interactions within the active site that play likely roles in facilitating catalysis, leading to the proposal of a modified mechanistic scheme involving two distinct roles for the active site histidine. We also identify putative substrate entrance and product egress routes within the enzyme, which are discussed in the context of previously reported biochemical observations. Our structure lays a foundation for further biochemical and structural characterization of PhaC, which could assist in engineering efforts for the production of eco-friendly materials. © 2016 by The American Society for Biochemistry and Molecular Biology, Inc.

  16. Structure of the Catalytic Domain of the Class I Polyhydroxybutyrate Synthase from Cupriavidus necator*

    Science.gov (United States)

    Wittenborn, Elizabeth C.; Jost, Marco; Wei, Yifeng; Stubbe, JoAnne; Drennan, Catherine L.

    2016-01-01

    Polyhydroxybutyrate synthase (PhaC) catalyzes the polymerization of 3-(R)-hydroxybutyryl-coenzyme A as a means of carbon storage in many bacteria. The resulting polymers can be used to make biodegradable materials with properties similar to those of thermoplastics and are an environmentally friendly alternative to traditional petroleum-based plastics. A full biochemical and mechanistic understanding of this process has been hindered in part by a lack of structural information on PhaC. Here we present the first structure of the catalytic domain (residues 201–589) of the class I PhaC from Cupriavidus necator (formerly Ralstonia eutropha) to 1.80 Å resolution. We observe a symmetrical dimeric architecture in which the active site of each monomer is separated from the other by ∼33 Å across an extensive dimer interface, suggesting a mechanism in which polyhydroxybutyrate biosynthesis occurs at a single active site. The structure additionally highlights key side chain interactions within the active site that play likely roles in facilitating catalysis, leading to the proposal of a modified mechanistic scheme involving two distinct roles for the active site histidine. We also identify putative substrate entrance and product egress routes within the enzyme, which are discussed in the context of previously reported biochemical observations. Our structure lays a foundation for further biochemical and structural characterization of PhaC, which could assist in engineering efforts for the production of eco-friendly materials. PMID:27742839

  17. Structural and functional annotation of citrate synthase from Aspergillus niger ANJ-120.

    Science.gov (United States)

    Mustafa, Ghulam; Arif, Rawaba; Bukhari, Shazia Anwer; Ali, Muhammad; Sharif, Sumaira; Atta, Asia

    2018-03-01

    Citrate synthase (CS) is involved in citric acid biosynthesis, which is a well-established metabolic pathway. The condensation of acetyl-CoA with oxaloacetate is catalyzed by CS. Citric acid (CA) has a number of applications in the pharmaceutical industry. CA in combination with bicarbonates is used as an effervescent in the preparation of tablets and powders. It has also been used as an anticoagulant and as an acidulant to form a mild astringent. In the current study, detailed structural and functional analyses of CS protein were carried out using various bioinformatics tools. Structural modeling was also done by building a 3D model of CS from Aspergillus niger ANJ-120 using Modeller 9.16 software. The 3D model was then evaluated using different online approaches. Furthermore, superimposition of query and template structures, calculation of the root mean squared deviation (RMSD) and visualization of the generated model were done with UCSF Chimera 1.5.3. Even though various roles of CS protein were already known and verified experimentally, here we present a structural analysis of CS protein. The structural investigation of CS protein will be helpful for protein engineering strategies and for understanding the interactions among proteins. Due to the large number of applications of citric acid, its production by A. niger and the associated bioinformatics studies will offer substantial improvement in commercial-scale intensification of this useful product.
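    The root mean squared deviation mentioned in this abstract is straightforward to compute once two structures have been superimposed. The sketch below is a minimal pure-Python version that assumes the atoms are already paired one-to-one and already superimposed; it is not the UCSF Chimera implementation, and the coordinates are invented for the example.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean squared deviation between two paired lists of (x, y, z)."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Invented coordinates: the "template" is the "model" shifted by 1 along z.
model = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
template = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (0.0, 1.0, 1.0)]

value = rmsd(model, template)
print(value)
```

    In practice the optimal superposition is found first, for example with a least-squares fit, and the RMSD is then reported over the matched atoms, typically the Cα atoms.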

  18. High-throughput Crystallography for Structural Genomics

    Science.gov (United States)

    Joachimiak, Andrzej

    2009-01-01

    Protein X-ray crystallography recently celebrated its 50th anniversary. The structures of myoglobin and hemoglobin determined by Kendrew and Perutz provided the first glimpses into the complex protein architecture and chemistry. Since then, the field of structural molecular biology has experienced extraordinary progress and now over 53,000 protein structures have been deposited into the Protein Data Bank. In the past decade many advances in macromolecular crystallography have been driven by world-wide structural genomics efforts. This was made possible because of third-generation synchrotron sources, structure phasing approaches using anomalous signal and cryo-crystallography. Complementary progress in molecular biology, proteomics, hardware and software for crystallographic data collection, structure determination and refinement, computer science, databases, robotics and automation improved and accelerated many processes. These advancements provide the robust foundation for structural molecular biology and assure strong contribution to science in the future. In this report we focus mainly on reviewing structural genomics high-throughput X-ray crystallography technologies and their impact. PMID:19765976

  19. Gene Composer in a structural genomics environment

    International Nuclear Information System (INIS)

    Lorimer, Don; Raymond, Amy; Mixon, Mark; Burgin, Alex; Staker, Bart; Stewart, Lance

    2011-01-01

    For structural biology applications, protein-construct engineering is guided by comparative sequence analysis and structural information, which allow the researcher to better define domain boundaries for terminal deletions and nonconserved regions for surface mutants. A database software application called Gene Composer has been developed to facilitate construct design. The structural genomics effort at the Seattle Structural Genomics Center for Infectious Disease (SSGCID) requires the manipulation of large numbers of amino-acid sequences and the underlying DNA sequences which are to be cloned into expression vectors. To improve efficiency in high-throughput protein structure determination, a database software package, Gene Composer, has been developed which facilitates the information-rich design of protein constructs and their underlying gene sequences. With its modular workflow design and numerous graphical user interfaces, Gene Composer enables researchers to perform all common bioinformatics steps used in modern structure-guided protein engineering and synthetic gene engineering. An example of the structure determination of H1N1 RNA-dependent RNA polymerase PB2 subunit is given

  20. Comparative Genome Structure, Secondary Metabolite, and Effector Coding Capacity across Cochliobolus Pathogens

    Energy Technology Data Exchange (ETDEWEB)

    Condon, Bradford J.; Leng, Yueqiang; Wu, Dongliang; Bushley, Kathryn E.; Ohm, Robin A.; Otillar, Robert; Martin, Joel; Schackwitz, Wendy; Grimwood, Jane; MohdZainudin, NurAinlzzati; Xue, Chunsheng; Wang, Rui; Manning, Viola A.; Dhillon, Braham; Tu, Zheng Jin; Steffenson, Brian J.; Salamov, Asaf; Sun, Hui; Lowry, Steve; LaButti, Kurt; Han, James; Copeland, Alex; Lindquist, Erika; Barry, Kerrie; Schmutz, Jeremy; Baker, Scott E.; Ciuffetti, Lynda M.; Grigoriev, Igor V.; Zhong, Shaobin; Turgeon, B. Gillian

    2013-01-24

    The genomes of five Cochliobolus heterostrophus strains, two Cochliobolus sativus strains, three additional Cochliobolus species (Cochliobolus victoriae, Cochliobolus carbonum, Cochliobolus miyabeanus), and closely related Setosphaeria turcica were sequenced at the Joint Genome Institute (JGI). The datasets were used to identify SNPs between strains and species, unique genomic regions, core secondary metabolism genes, and small secreted protein (SSP) candidate effector encoding genes with a view towards pinpointing structural elements and gene content associated with specificity of these closely related fungi to different cereal hosts. Whole-genome alignment shows that three to five percent of each genome differs between strains of the same species, while a quarter of each genome differs between species. On average, SNP counts among field isolates of the same C. heterostrophus species are more than 25× higher than those between inbred lines and 50× lower than SNPs between Cochliobolus species. The suites of nonribosomal peptide synthetase (NRPS), polyketide synthase (PKS), and SSP encoding genes are astoundingly diverse among species but remarkably conserved among isolates of the same species, whether inbred or field strains, except for defining examples that map to unique genomic regions. Functional analysis of several strain-unique PKSs and NRPSs reveals a strong correlation with a role in virulence.

  1. Comparative genome structure, secondary metabolite, and effector coding capacity across Cochliobolus pathogens.

    Directory of Open Access Journals (Sweden)

    Bradford J Condon

    Full Text Available The genomes of five Cochliobolus heterostrophus strains, two Cochliobolus sativus strains, three additional Cochliobolus species (Cochliobolus victoriae, Cochliobolus carbonum, Cochliobolus miyabeanus), and closely related Setosphaeria turcica were sequenced at the Joint Genome Institute (JGI). The datasets were used to identify SNPs between strains and species, unique genomic regions, core secondary metabolism genes, and small secreted protein (SSP) candidate effector encoding genes with a view towards pinpointing structural elements and gene content associated with specificity of these closely related fungi to different cereal hosts. Whole-genome alignment shows that three to five percent of each genome differs between strains of the same species, while a quarter of each genome differs between species. On average, SNP counts among field isolates of the same C. heterostrophus species are more than 25× higher than those between inbred lines and 50× lower than SNPs between Cochliobolus species. The suites of nonribosomal peptide synthetase (NRPS), polyketide synthase (PKS), and SSP-encoding genes are astoundingly diverse among species but remarkably conserved among isolates of the same species, whether inbred or field strains, except for defining examples that map to unique genomic regions. Functional analysis of several strain-unique PKSs and NRPSs reveals a strong correlation with a role in virulence.

  2. Citrate synthase proteins in extremophilic organisms: Studies within a structure-based model

    International Nuclear Information System (INIS)

    Różycki, Bartosz; Cieplak, Marek

    2014-01-01

    We study four citrate synthase homodimeric proteins within a structure-based coarse-grained model. Two of these proteins come from thermophilic bacteria, one from a cryophilic bacterium and one from a mesophilic organism; three are in the closed and two in the open conformations. Even though the proteins belong to the same fold, the model distinguishes the properties of these proteins in a way which is consistent with experiments. For instance, the thermophilic proteins are more stable thermodynamically than their mesophilic and cryophilic homologues, which we observe both in the magnitude of thermal fluctuations near the native state and in the kinetics of thermal unfolding. The level of stability correlates with the average coordination number for amino acid contacts and with the degree of structural compactness. The pattern of positional fluctuations along the sequence in the closed conformation is different than in the open conformation, including within the active site. The modes of correlated and anticorrelated movements of pairs of amino acids forming the active site are very different in the open and closed conformations. Taken together, our results show that the precise location of amino acid contacts in the native structure appears to be a critical element in explaining the similarities and differences in the thermodynamic properties, local flexibility, and collective motions of the different forms of the enzyme

  3. The crystal structure of spermidine synthase with a multisubstrate adduct inhibitor.

    Energy Technology Data Exchange (ETDEWEB)

    Korolev, S.; Ikeguchi, Y.; Skarina, T.; Beasley, S.; Arrowsmith, C.; Edwards, A.; Joachimiak, A.; Pegg, A. E.; Savchenko, A.; Pennsylvania State Univ. Coll. of Medicine; Milton S. Hershey Medical Center; Banting and Best Department of Medical Research; Univ. of Health Network

    2002-01-01

    Polyamines are essential in all branches of life. Spermidine synthase (putrescine aminopropyltransferase, PAPT) catalyzes the biosynthesis of spermidine, a ubiquitous polyamine. The crystal structure of the PAPT from Thermotoga maritima (TmPAPT) has been solved to 1.5 Å resolution in the presence and absence of AdoDATO (S-adenosyl-1,8-diamino-3-thiooctane), a compound containing both substrate and product moieties. This, the first structure of an aminopropyltransferase, reveals deep cavities for binding substrate and cofactor, and a loop that envelops the active site. The AdoDATO binding site is lined with residues conserved in PAPT enzymes from bacteria to humans, suggesting a universal catalytic mechanism. Other conserved residues act sterically to provide a structural basis for polyamine specificity. The enzyme is tetrameric; each monomer consists of a C-terminal domain with a Rossmann-like fold and an N-terminal β-stranded domain. The tetramer is assembled using a novel barrel-type oligomerization motif.

  4. Genome-Wide Identification, Evolutionary and Expression Analyses of the GALACTINOL SYNTHASE Gene Family in Rapeseed and Tobacco

    Directory of Open Access Journals (Sweden)

    Yonghai Fan

    2017-12-01

    Full Text Available Galactinol synthase (GolS) is a key enzyme in raffinose family oligosaccharide (RFO) biosynthesis. The finding that GolS accumulates in plants exposed to abiotic stresses indicates RFOs function in environmental adaptation. However, the evolutionary relationships and biological functions of the GolS family in rapeseed (Brassica napus) and tobacco (Nicotiana tabacum) remain unclear. In this study, we identified 20 BnGolS and 9 NtGolS genes. Subcellular localization predictions showed that most of the proteins are localized to the cytoplasm. Phylogenetic analysis identified a loss event of an ancient GolS copy in the Solanaceae and an ancient duplication event leading to evolution of GolS4/7 in the Brassicaceae. The three-dimensional structures of two GolS proteins were conserved, with an important DxD motif for binding to UDP-galactose (uridine diphosphate-galactose) and inositol. Expression profile analysis indicated that BnGolS and NtGolS genes were expressed in most tissues and highly expressed in one or two specific tissues. Hormone treatments strongly induced the expression of most BnGolS genes and homologous genes in the same subfamilies exhibited divergent-induced expression. Our study provides a comprehensive evolutionary analysis of GolS genes among the Brassicaceae and Solanaceae as well as an insight into the biological function of GolS genes in hormone response in plants.

  5. Fatty acid synthase inhibitors from plants: isolation, structure elucidation, and SAR studies.

    Science.gov (United States)

    Li, Xing-Cong; Joshi, Alpana S; ElSohly, Hala N; Khan, Shabana I; Jacob, Melissa R; Zhang, Zhizheng; Khan, Ikhlas A; Ferreira, Daneel; Walker, Larry A; Broedel, Sheldon E; Raulli, Robert E; Cihlar, Ronald L

    2002-12-01

    Fatty acid synthase (FAS) has been identified as a potential antifungal target. FAS prepared from Saccharomyces cerevisiae was employed for bioactivity-guided fractionation of Chlorophora tinctoria, Paspalum conjugatum, Symphonia globulifera, Buchenavia parviflora, and Miconia pilgeriana. Thirteen compounds (1-13), including three new natural products (1, 4, 12), were isolated and their structures identified by spectroscopic interpretation. They represented five chemotypes, namely, isoflavones, flavones, biflavonoids, hydrolyzable tannin-related derivatives, and triterpenoids. 3'-Formylgenistein (1) and ellagic acid 4-O-α-L-rhamnopyranoside (9) were the most potent compounds against FAS, with IC50 values of 2.3 and 7.5 µg/mL, respectively. Furthermore, 43 (14-56) analogues of the five chemotypes from our natural product repository and commercial sources were tested for their FAS inhibitory activity. Structure-activity relationships for some chemotypes were investigated. All these compounds were further evaluated for antifungal activity against Candida albicans and Cryptococcus neoformans. Although there were several antifungal compounds in the set, correlation between the FAS inhibitory activity and antifungal activity could not be defined.

  6. 7.5-Å cryo-EM structure of the mycobacterial fatty acid synthase.

    Science.gov (United States)

    Boehringer, Daniel; Ban, Nenad; Leibundgut, Marc

    2013-03-11

    The mycobacterial fatty acid synthase (FAS) complex is a giant 2.0-MDa α6 homohexameric multifunctional enzyme that catalyzes synthesis of fatty acid precursors of mycolic acids, which are major components of the cell wall in Mycobacteria and play an important role in pathogenicity. Here, we present a three-dimensional reconstruction of the Mycobacterium smegmatis FAS complex at 7.5 Å, highly homologous to the Mycobacterium tuberculosis multienzyme, by cryo-electron microscopy. Based on the obtained structural data, which allowed us to identify secondary-structure elements, and sequence homology with the fungal FAS, we generated an accurate architectural model of the complex. The FAS system from Mycobacteria resembles a minimized version of the fungal FAS with much larger openings in the reaction chambers. These architectural features of the mycobacterial FAS may be important for the interaction with mycolic acid processing and condensing enzymes that further modify the precursors produced by FAS and for autoactivation of the FAS complex. Copyright © 2012 Elsevier Ltd. All rights reserved.

  7. Structure of Quinolinate Synthase from Pyrococcus horikoshii in the Presence of Its Product, Quinolinic Acid.

    Science.gov (United States)

    Esakova, Olga A; Silakov, Alexey; Grove, Tyler L; Saunders, Allison H; McLaughlin, Martin I; Yennawar, Neela H; Booker, Squire J

    2016-06-15

    Quinolinic acid (QA) is a common intermediate in the biosynthesis of nicotinamide adenine dinucleotide (NAD(+)) and its derivatives in all organisms that synthesize the molecule de novo. In most prokaryotes, it is formed from the condensation of dihydroxyacetone phosphate (DHAP) and aspartate-enamine by the action of quinolinate synthase (NadA). NadA contains a [4Fe-4S] cluster cofactor with a unique, non-cysteinyl-ligated, iron ion (Fea), which is proposed to bind the hydroxyl group of a postulated intermediate in the last step of the reaction to facilitate a dehydration. However, direct evidence for this role in catalysis has yet to be provided. Herein, we present the structure of NadA in the presence of the product of its reaction, QA. We find that N1 and the C7 carboxylate group of QA ligate to Fea in a bidentate fashion, which is confirmed by Hyperfine Sublevel Correlation (HYSCORE) spectroscopy. This binding mode would place the C5 hydroxyl group of the postulated final intermediate distal to Fea and virtually incapable of coordinating to it. The structure shows that three strictly conserved amino acids, Glu198, Tyr109, and Tyr23, are in close proximity to the bound product. Substitution of these amino acids with Gln, Phe, and Phe, respectively, leads to complete loss of activity.

  8. Structure and Subunit Arrangement of the A-type ATP Synthase Complex from the Archaeon Methanococcus jannaschii Visualized by Electron Microscopy

    NARCIS (Netherlands)

    Coskun, Ünal; Chaban, Yuriy L.; Lingl, Astrid; Müller, Volker; Keegstra, Wilko; Boekema, Egbert J.; Grüber, Gerhard

    2004-01-01

    In Archaea, bacteria, and eukarya, ATP provides metabolic energy for energy-dependent processes. It is synthesized by enzymes known as A-type or F-type ATP synthase, which are the smallest rotatory engines in nature. Here, we report the first projected structure of an intact A1A0 ATP synthase from

  9. Characterization and structural features of a chalcone synthase mutation in a white-flowering line of Matthiola incana R. Br. (Brassicaceae).

    Science.gov (United States)

    Hemleben, Vera; Dressel, Angela; Epping, Bernhard; Lukacin, Richard; Martens, Stefan; Austin, Michael

    2004-05-01

    For Matthiola incana (Brassicaceae), used as a model system to study biochemical and genetic aspects of anthocyanin biosynthesis, several nearly isogenic colored wild type lines and white-flowering mutant lines are available, each with a specific defect in the genes responsible for anthocyanin production (genes e, f, and g). For gene f, supposed to code for chalcone synthase (CHS; EC 2.3.1.74), the key enzyme of the flavonoid/anthocyanin biosynthesis pathway belonging to the group of type III polyketide synthases (PKS), the wild type genomic sequence of M. incana line 04 was determined in comparison to the white-flowering CHS mutant line 18. The type of mutation in the chs gene was characterized as a single nucleotide substitution in a triplet AGG coding for an evolutionarily conserved arginine into AGT coding for serine (R72S). Northern blots and RT-PCR demonstrated that the mutated gene is expressed in flower petals. Heterologous expression of the wild type and mutated CHS cDNA in Escherichia coli, verified by Western blotting and enzyme assays with various starter molecules, revealed that the mutant protein had no detectable activity, indicating that the strictly conserved arginine residue is essential for the enzymatic reaction. This mutation, which previously was not detected by mutagenic screening, is discussed in the light of structural and functional information on alfalfa CHS and related type III PKS enzymes.

  10. Eukaryotic beta-alanine synthases are functionally related but have a high degree of structural diversity

    DEFF Research Database (Denmark)

    Gojkovic, Zoran; Sandrini, Michael; Piskur, Jure

    2001-01-01

    beta-Alanine synthase (EC 3.5.1.6), which catalyzes the final step of pyrimidine catabolism, has only been characterized in mammals. A Saccharomyces kluyveri pyd3 mutant that is unable to grow on N-carbamyl-beta-alanine as the sole nitrogen source and exhibits diminished beta-alanine synthase...... no pyrimidine catabolic pathway, it enabled growth on N-carbamyl-beta-alanine as the sole nitrogen source. The D. discoideum and D. melanogaster PYD3 gene products are similar to mammalian beta-alanine synthases. In contrast, the S. kluyveri protein is quite different from these and more similar to bacterial...... N-carbamyl amidohydrolases. All three beta-alanine synthases are to some degree related to various aspartate transcarbamylases, which catalyze the second step of the de novo pyrimidine biosynthetic pathway. PYD3 expression in yeast seems to be inducible by dihydrouracil and N...

  11. Structures of the N-terminal modules imply large domain motions during catalysis by methionine synthase.

    Science.gov (United States)

    Evans, John C; Huddler, Donald P; Hilgers, Mark T; Romanchuk, Gail; Matthews, Rowena G; Ludwig, Martha L

    2004-03-16

    B12-dependent methionine synthase (MetH) is a large modular enzyme that utilizes the cobalamin cofactor as a methyl donor or acceptor in three separate reactions. Each methyl transfer occurs at a different substrate-binding domain and requires a different arrangement of modules. In the catalytic cycle, the cobalamin-binding domain carries methylcobalamin to the homocysteine (Hcy) domain to form methionine and returns cob(I)alamin to the folate (Fol) domain for remethylation by methyltetrahydrofolate (CH3-H4folate). Here, we describe crystal structures of a fragment of MetH from Thermotoga maritima comprising the domains that bind Hcy and CH3-H4folate. These substrate-binding domains are (βα)8 barrels packed tightly against one another with their barrel axes perpendicular. The properties of the domain interface suggest that the two barrels remain associated during catalysis. The Hcy and CH3-H4folate substrates are bound at the C termini of their respective barrels in orientations that position them for reaction with cobalamin, but the two active sites are separated by approximately 50 Å. To complete the catalytic cycle, the cobalamin-binding domain must travel back and forth between these distant active sites.

  12. Crystal structure of 3,4-dihydroxy-2-butanone 4-phosphate synthase of riboflavin biosynthesis

    Energy Technology Data Exchange (ETDEWEB)

    Liao, D.-I.; Calabrese, J.C.; Wawrzak, Z.; Viitanen, P.V.; Jordan, D.B. (DuPont); (NWU)

    2010-03-05

    3,4-Dihydroxy-2-butanone-4-phosphate synthase catalyzes a commitment step in the biosynthesis of riboflavin. On the enzyme, ribulose 5-phosphate is converted to 3,4-dihydroxy-2-butanone 4-phosphate and formate in steps involving enolization, ketonization, dehydration, skeleton rearrangement, and formate elimination. The enzyme is absent in humans and an attractive target for the discovery of antimicrobials for pathogens incapable of acquiring sufficient riboflavin from their hosts. The homodimer of 23 kDa subunits requires Mg2+ for activity. The first three-dimensional structure of the enzyme was determined at 1.4 Å resolution using the multiwavelength anomalous diffraction (MAD) method on Escherichia coli protein crystals containing gold. The protein consists of an α + β fold having a complex linkage of β strands. Intersubunit contacts are mediated by numerous hydrophobic interactions and three hydrogen bond networks. A proposed active site was identified on the basis of amino acid residues that are conserved among the enzyme from 19 species. There are two well-separated active sites per dimer, each of which comprises residues from both subunits. In addition to three arginines and two threonines, which may be used for recognizing the phosphate group of the substrate, the active site consists of three glutamates, two aspartates, two histidines, and a cysteine which may provide the means for general acid and base catalysis and for coordinating the Mg2+ cofactor within the active site.

  13. A Genome-Wide Association Study for Culm Cellulose Content in Barley Reveals Candidate Genes Co-Expressed with Members of the CELLULOSE SYNTHASE A Gene Family

    Science.gov (United States)

    Houston, Kelly; Burton, Rachel A.; Sznajder, Beata; Rafalski, Antoni J.; Dhugga, Kanwarpal S.; Mather, Diane E.; Taylor, Jillian; Steffenson, Brian J.; Waugh, Robbie; Fincher, Geoffrey B.

    2015-01-01

    Cellulose is a fundamentally important component of cell walls of higher plants. It provides a scaffold that allows the development and growth of the plant to occur in an ordered fashion. Cellulose also provides mechanical strength, which is crucial for both normal development and to enable the plant to withstand both abiotic and biotic stresses. We quantified the cellulose concentration in the culm of 288 two-rowed and 288 six-rowed spring type barley accessions that were part of the USDA-funded barley Coordinated Agricultural Project (CAP) program in the USA. When the population structure of these accessions was analysed we identified six distinct populations, four of which we considered to be comprised of a sufficient number of accessions to be suitable for genome-wide association studies (GWAS). These lines had been genotyped with 3072 SNPs so we combined the trait and genetic data to carry out GWAS. The analysis allowed us to identify regions of the genome containing significant associations between molecular markers and cellulose concentration data, including one region cross-validated in multiple populations. To identify candidate genes we assembled the gene content of these regions and used these to query a comprehensive RNA-seq based gene expression atlas. This provided us with gene annotations and associated expression data across multiple tissues, which allowed us to formulate a supported list of candidate genes that regulate cellulose biosynthesis. Several regions identified by our analysis contain genes that are co-expressed with CELLULOSE SYNTHASE A (HvCesA) across a range of tissues and developmental stages. These genes are involved in both primary and secondary cell wall development. In addition, genes that have been previously linked with cellulose synthesis by biochemical methods, such as HvCOBRA, a gene of unknown function, were also associated with cellulose levels in the association panel. Our analyses provide new insights into the

  14. Structural characterization of genomes by large scale sequence-structure threading: application of reliability analysis in structural genomics

    Directory of Open Access Journals (Sweden)

    Brunham Robert C

    2004-07-01

    Full Text Available Abstract Background We establish that the occurrence of protein folds among genomes can be accurately described with a Weibull function. Systems which exhibit Weibull character can be interpreted with reliability theory commonly used in engineering analysis. For instance, Weibull distributions are widely used in reliability, maintainability and safety work to model time-to-failure of mechanical devices, mechanisms, building constructions and equipment. Results We have found that the Weibull function describes protein fold distribution within and among genomes more accurately than conventional power functions which have been used in a number of structural genomic studies reported to date. It has also been found that the Weibull reliability parameter β for protein fold distributions varies between genomes and may reflect differences in rates of gene duplication in evolutionary history of organisms. Conclusions The results of this work demonstrate that reliability analysis can provide useful insights and testable predictions in the fields of comparative and structural genomics.
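
    For readers unfamiliar with the Weibull function mentioned in this abstract, the following is a minimal sketch of the standard two-parameter form used in reliability theory. The abstract does not give the authors' parameterization or fitting procedure, so the scale parameter name `eta`, the function names, and the example values are assumptions made purely for illustration; only the shape parameter β is named in the source.

```python
import math


def weibull_survival(x: float, beta: float, eta: float) -> float:
    """Reliability (survival) function R(x) = exp(-(x/eta)**beta) for x >= 0."""
    return math.exp(-((x / eta) ** beta))


def weibull_hazard(x: float, beta: float, eta: float) -> float:
    """Hazard rate h(x) = (beta/eta) * (x/eta)**(beta - 1) for x > 0."""
    return (beta / eta) * (x / eta) ** (beta - 1)


def weibull_pdf(x: float, beta: float, eta: float) -> float:
    """Density f(x) = h(x) * R(x)."""
    return weibull_hazard(x, beta, eta) * weibull_survival(x, beta, eta)


if __name__ == "__main__":
    # Hypothetical parameter values, not taken from the paper.
    for beta in (0.5, 1.0, 2.0):
        print(beta, [round(weibull_hazard(x, beta, eta=1.0), 3) for x in (0.5, 1.0, 2.0)])
```

    In this parameterization, β < 1 gives a hazard that decreases with x, β = 1 reduces to the exponential distribution with a constant hazard, and β > 1 gives an increasing hazard. This is the sense in which a single shape parameter can be compared between datasets; how the authors map fold occurrence onto x is not stated in the abstract.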

  15. Genome-wide analysis of the cellulose synthase-like (Csl) gene family in bread wheat (Triticum aestivum L.).

    Science.gov (United States)

    Kaur, Simerjeet; Dhugga, Kanwarpal S; Beech, Robin; Singh, Jaswinder

    2017-11-03

    Hemicelluloses are a diverse group of complex, non-cellulosic polysaccharides, which constitute approximately one-third of the plant cell wall and find use as dietary fibres, food additives and raw materials for biofuels. Genes involved in hemicellulose synthesis have not been extensively studied in small grain cereals. In efforts to isolate the sequences for the cellulose synthase-like (Csl) gene family from wheat, we identified 108 genes (hereafter referred to as TaCsl). Each gene was represented by two to three homeoalleles, which are named as TaCslXY_ZA, TaCslXY_ZB, or TaCslXY_ZD, where X denotes the Csl subfamily, Y the gene number and Z the wheat chromosome where it is located. A quarter of these genes were predicted to have 2 to 3 splice variants, resulting in a total of 137 putative translated products. Approximately 45% of TaCsl genes were located on chromosomes 2 and 3. Sequences from the subfamilies C and D were interspersed between the dicots and grasses but those from subfamily A clustered within each group of plants. Proximity of the dicot-specific subfamilies B and G, to the grass-specific subfamilies H and J, respectively, points to their common origin. In silico expression analysis in different tissues revealed that most of the genes were expressed ubiquitously and some were tissue-specific. More than half of the genes had introns in phase 0, one-third in phase 2, and a few in phase 1. Detailed characterization of the wheat Csl genes has enhanced the understanding of their structural, functional, and evolutionary features. This information will be helpful in designing experiments for genetic manipulation of hemicellulose synthesis with the goal of developing improved cultivars for biofuel production and increased tolerance against various stresses.
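
    The naming convention described in this abstract (TaCslXY_ZA, TaCslXY_ZB, TaCslXY_ZD) can be illustrated with a small parser. This is only a sketch under stated assumptions: the abstract does not specify the exact character classes, so the regular expression (one uppercase letter for the subfamily, digits for the gene number, a chromosome digit 1-7, and a subgenome letter A, B or D) and the example name `TaCslA1_2B` are hypothetical.

```python
import re
from typing import NamedTuple


class CslName(NamedTuple):
    subfamily: str    # X in TaCslXY_Z*: the Csl subfamily letter
    gene_number: int  # Y: gene number within the subfamily
    chromosome: int   # Z: wheat chromosome
    subgenome: str    # trailing A, B or D: the homeoallele's subgenome


# Assumed pattern; the source only describes the scheme in words.
_PATTERN = re.compile(r"^TaCsl([A-Z])(\d+)_([1-7])([ABD])$")


def parse_tacsl(name: str) -> CslName:
    """Split a TaCsl homeoallele name into its four components."""
    match = _PATTERN.match(name)
    if match is None:
        raise ValueError(f"not a TaCsl homeoallele name: {name!r}")
    subfamily, number, chromosome, subgenome = match.groups()
    return CslName(subfamily, int(number), int(chromosome), subgenome)


if __name__ == "__main__":
    # Hypothetical name, constructed from the convention rather than taken from the paper.
    print(parse_tacsl("TaCslA1_2B"))
```

    Keeping the subgenome as a separate field makes it straightforward to group the two or three homeoalleles of one gene by their shared (subfamily, gene number, chromosome) key.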

  16. Structural Basis of Neuronal Nitric-oxide Synthase Interaction with Dystrophin Repeats 16 and 17

    Science.gov (United States)

    Molza, Anne-Elisabeth; Mangat, Khushdeep; Le Rumeur, Elisabeth; Hubert, Jean-François; Menhart, Nick; Delalande, Olivier

    2015-01-01

    Duchenne muscular dystrophy is a lethal genetic defect that is associated with the absence of dystrophin protein. Lack of dystrophin protein completely abolishes muscular nitric-oxide synthase (NOS) function as a regulator of blood flow during muscle contraction. In normal muscles, nNOS function is ensured by its localization at the sarcolemma through an interaction of its PDZ domain with dystrophin spectrin-like repeats R16 and R17. Early studies suggested that repeat R17 is the primary site of interaction but ignored the involved nNOS residues, and the R17 binding site has not been described at an atomic level. In this study, we characterized the specific amino acids involved in the binding site of nNOS-PDZ with dystrophin R16–17 using combined experimental biochemical and structural in silico approaches. First, 32 alanine-scanning mutagenesis variants of dystrophin R16–17 indicated the regions where mutagenesis modified the affinity of the dystrophin interaction with the nNOS-PDZ. Second, using small angle x-ray scattering-based models of dystrophin R16–17 and molecular docking methods, we generated atomic models of the dystrophin R16–17·nNOS-PDZ complex that correlated well with the alanine scanning identified regions of dystrophin. The structural regions constituting the dystrophin interaction surface involve the A/B loop and the N-terminal end of helix B of repeat R16 and the N-terminal end of helix A′ and a small fraction of helix B′ and a large part of the helix C′ of repeat R17. The interaction surface of nNOS-PDZ involves its main β-sheet and its specific C-terminal β-finger. PMID:26378238

  17. Structural Basis of Neuronal Nitric-oxide Synthase Interaction with Dystrophin Repeats 16 and 17.

    Science.gov (United States)

    Molza, Anne-Elisabeth; Mangat, Khushdeep; Le Rumeur, Elisabeth; Hubert, Jean-François; Menhart, Nick; Delalande, Olivier

    2015-12-04

    Duchenne muscular dystrophy is a lethal genetic defect that is associated with the absence of dystrophin protein. Lack of dystrophin protein completely abolishes muscular nitric-oxide synthase (NOS) function as a regulator of blood flow during muscle contraction. In normal muscles, nNOS function is ensured by its localization at the sarcolemma through an interaction of its PDZ domain with dystrophin spectrin-like repeats R16 and R17. Early studies suggested that repeat R17 is the primary site of interaction but ignored the involved nNOS residues, and the R17 binding site has not been described at an atomic level. In this study, we characterized the specific amino acids involved in the binding site of nNOS-PDZ with dystrophin R16-17 using combined experimental biochemical and structural in silico approaches. First, 32 alanine-scanning mutagenesis variants of dystrophin R16-17 indicated the regions where mutagenesis modified the affinity of the dystrophin interaction with the nNOS-PDZ. Second, using small angle x-ray scattering-based models of dystrophin R16-17 and molecular docking methods, we generated atomic models of the dystrophin R16-17·nNOS-PDZ complex that correlated well with the alanine scanning identified regions of dystrophin. The structural regions constituting the dystrophin interaction surface involve the A/B loop and the N-terminal end of helix B of repeat R16 and the N-terminal end of helix A' and a small fraction of helix B' and a large part of the helix C' of repeat R17. The interaction surface of nNOS-PDZ involves its main β-sheet and its specific C-terminal β-finger. © 2015 by The American Society for Biochemistry and Molecular Biology, Inc.

  18. Structural studies provide clues for analog design of specific inhibitors of Cryptosporidium hominis thymidylate synthase-dihydrofolate reductase.

    Science.gov (United States)

    Kumar, Vidya P; Cisneros, Jose A; Frey, Kathleen M; Castellanos-Gonzalez, Alejandro; Wang, Yiqiang; Gangjee, Aleem; White, A Clinton; Jorgensen, William L; Anderson, Karen S

    2014-09-01

    Cryptosporidium is the causative agent of a gastrointestinal disease, cryptosporidiosis, which is often fatal in immunocompromised individuals and children. Thymidylate synthase (TS) and dihydrofolate reductase (DHFR) are essential enzymes in the folate biosynthesis pathway and are well established as drug targets in cancer, bacterial infections, and malaria. Cryptosporidium hominis has a bifunctional thymidylate synthase and dihydrofolate reductase enzyme, compared to separate enzymes in the host. We evaluated lead compound 1 from a novel series of antifolates, 2-amino-4-oxo-5-substituted pyrrolo[2,3-d]pyrimidines, as an inhibitor of Cryptosporidium hominis thymidylate synthase with selectivity over the human enzyme. Complementing the enzyme inhibition, compound 1 also has anti-cryptosporidial activity in cell culture. A crystal structure with compound 1 bound to the TS active site is discussed in terms of several van der Waals, hydrophobic and hydrogen bond interactions with the protein residues and the substrate analog 5-fluorodeoxyuridine monophosphate (TS), cofactor NADPH and inhibitor methotrexate (DHFR). Another crystal structure in complex with compound 1 bound in both the TS and DHFR active sites is also reported here. The crystal structures provide clues for analog design and for the design of ChTS-DHFR specific inhibitors. Copyright © 2014. Published by Elsevier Ltd.

  19. CELLULOSE SYNTHASE-LIKE A2, a Glucomannan Synthase, Is Involved in Maintaining Adherent Mucilage Structure in Arabidopsis Seed

    Science.gov (United States)

    Yu, Li; Shi, Dachuan; Li, Junling; Kong, Yingzhen; Yu, Yanchong; Chai, Guohua; Hu, Ruibo; Wang, Juan; Hahn, Michael G.; Zhou, Gongke

    2014-01-01

    Mannans are hemicellulosic polysaccharides that are considered to have both structural and storage functions in the plant cell wall. However, it is not yet known how mannans function in Arabidopsis (Arabidopsis thaliana) seed mucilage. In this study, CELLULOSE SYNTHASE-LIKE A2 (CSLA2; At5g22740) expression was observed in several seed tissues, including the epidermal cells of developing seed coats. Disruption of CSLA2 resulted in thinner adherent mucilage halos, although the total amount of the adherent mucilage did not change compared with the wild type. This suggested that the adherent mucilage in the mutant was more compact compared with that of the wild type. In accordance with the role of CSLA2 in glucomannan synthesis, csla2-1 mucilage contained 30% less mannosyl and glucosyl content than did the wild type. No appreciable changes in the composition, structure, or macromolecular properties were observed for nonmannan polysaccharides in mutant mucilage. Biochemical analysis revealed that cellulose crystallinity was substantially reduced in csla2-1 mucilage; this was supported by the removal of most mucilage cellulose through treatment of csla2-1 seeds with endo-β-glucanase. Mutation in CSLA2 also resulted in altered spatial distribution of cellulose and an absence of birefringent cellulose microfibrils within the adherent mucilage. As with the observed changes in crystalline cellulose, the spatial distribution of pectin was also modified in csla2-1 mucilage. Taken together, our results demonstrate that glucomannans synthesized by CSLA2 are involved in modulating the structure of adherent mucilage, potentially through altering cellulose organization and crystallization. PMID:24569843

  20. Atypical composition and structure of the mitochondrial dimeric ATP synthase from Euglena gracilis

    NARCIS (Netherlands)

    Yadav, K.N. Satish; Miranda-Astudillo, Héctor V; Colina-Tenorio, Lilia; Bouillenne, Fabrice; Degand, Hervé; Morsomme, Pierre; González-Halphen, Diego; Boekema, Egbert J; Cardol, Pierre

    Mitochondrial respiratory-chain complexes from Euglenozoa comprise classical subunits described in other eukaryotes (i.e. mammals and fungi) and subunits that are restricted to Euglenozoa (e.g. Euglena gracilis and Trypanosoma brucei). Here we studied the mitochondrial F1FO-ATP synthase (or Complex

  1. Structure of dimeric, recombinant Sulfolobus solfataricus phosphoribosyl diphosphate synthase

    DEFF Research Database (Denmark)

    Andersen, Rune W.; Lo Leggio, Leila; Hove-Jensen, Bjarne

    2015-01-01

    ion were observed. Sulphate ion, reminiscent of the ammonium sulphate precipitation step of the purification, seems to bind tightly and, therefore, presumably occupies and blocks the ribose 5-phosphate binding site. The activity of S. solfataricus PRPP synthase is independent of phosphate ion....

  2. Confluence of structural and chemical biology: plant polyketide synthases as biocatalysts for a bio-based future.

    Science.gov (United States)

    Stewart, Charles; Vickery, Christopher R; Burkart, Michael D; Noel, Joseph P

    2013-06-01

    Type III plant polyketide synthases (PKSs) biosynthesize a dazzling array of polyphenolic products that serve important roles in both plant and human health. Recent advances in structural characterization of these enzymes and new tools from the field of chemical biology have facilitated exquisite probing of plant PKS iterative catalysis. These tools have also been used to exploit type III PKSs as biocatalysts to generate new chemicals. Going forward, chemical, structural and biochemical analyses will provide an atomic resolution understanding of plant PKSs and will serve as a springboard for bioengineering and scalable production of valuable molecules in vitro, by fermentation and in planta. Copyright © 2013 Elsevier Ltd. All rights reserved.

  3. A genome-wide polyketide synthase deletion library uncovers novel genetic links to polyketides and meroterpenoids in Aspergillus nidulans

    DEFF Research Database (Denmark)

    Nielsen, Michael Lynge; Nielsen, Jakob Blæsbjerg; Rank, Christian

    2011-01-01

    by systematically deleting all 32 individual genes encoding polyketide synthases. Wild-type and all mutant strains were challenged on different complex media to provoke induction of the secondary metabolism. Screening of the mutant library revealed direct genetic links to two austinol meroterpenoids and expanded...... the current understanding of the biosynthetic pathways leading to arugosins and violaceols. We expect that the library will be an important resource towards a systemic understanding of polyketide production in A. nidulans....

  4. Structural and functional analysis of validoxylamine A 7'-phosphate synthase ValL involved in validamycin A biosynthesis.

    Directory of Open Access Journals (Sweden)

    Lina Zheng

    Full Text Available Validamycin A (Val-A) is an effective antifungal agent widely used in Asian countries as a crop protectant. Validoxylamine A, the core structure and intermediate of Val-A, consists of two C(7)-cyclitol units connected by a rare C-N bond. In the Val-A biosynthetic gene cluster in Streptomyces hygroscopicus 5008, the ORF valL was initially annotated as a validoxylamine A 7'-phosphate (V7P) synthase, whose encoded 497-aa protein shows high similarity with trehalose 6-phosphate (T6P) synthase. Gene inactivation of valL abolished both validoxylamine A and validamycin A productivity, and complementation with a cloned valL restored production in the mutant to 10% of the wild-type level, indicating the involvement of ValL in validoxylamine A biosynthesis. We also determined the structures of ValL and the ValL/trehalose complex. The structural data indicate that ValL adopts the typical fold of the GT-B protein family, featuring two Rossmann-fold domains and an active site at the domain junction. The residues in the active site are arranged in a manner homologous to that of Escherichia coli (E. coli) T6P synthase OtsA. However, a significant discrepancy is found in the active-site loop region. Noticeable structural variance is also found around the active-site entrance in the apo ValL structure, while the region takes an ordered configuration upon binding of the product analog trehalose. Furthermore, modeling of V7P in the active site of ValL suggests that ValL might have an SNi-like mechanism similar to that of OtsA.

  5. Genomic hypomethylation in the human germline associates with selective structural mutability in the human genome.

    Directory of Open Access Journals (Sweden)

    Jian Li

    Full Text Available The hotspots of structural polymorphisms and structural mutability in the human genome remain to be explained mechanistically. We examine associations of structural mutability with germline DNA methylation and with non-allelic homologous recombination (NAHR) mediated by low-copy repeats (LCRs). Combined evidence from four human sperm methylome maps, human genome evolution, structural polymorphisms in the human population, and previous genomic and disease studies consistently points to a strong association of germline hypomethylation and genomic instability. Specifically, methylation deserts, the ~1% fraction of the human genome with the lowest methylation in the germline, show a tenfold enrichment for structural rearrangements that occurred in the human genome since the branching of chimpanzee and are highly enriched for fast-evolving loci that regulate tissue-specific gene expression. Analysis of copy number variants (CNVs) from 400 human samples identified using a custom-designed array comparative genomic hybridization (aCGH) chip, combined with publicly available structural variation data, indicates that association of structural mutability with germline hypomethylation is comparable in magnitude to the association of structural mutability with LCR-mediated NAHR. Moreover, rare CNVs occurring in the genomes of individuals diagnosed with schizophrenia, bipolar disorder, and developmental delay and de novo CNVs occurring in those diagnosed with autism are significantly more concentrated within hypomethylated regions. These findings suggest a new connection between the epigenome, selective mutability, evolution, and human disease.

  6. Child Development and Structural Variation in the Human Genome

    Science.gov (United States)

    Zhang, Ying; Haraksingh, Rajini; Grubert, Fabian; Abyzov, Alexej; Gerstein, Mark; Weissman, Sherman; Urban, Alexander E.

    2013-01-01

    Structural variation of the human genome sequence is the insertion, deletion, or rearrangement of stretches of DNA sequence sized from around 1,000 to millions of base pairs. Over the past few years, structural variation has been shown to be far more common in human genomes than previously thought. Very little is currently known about the effects…

  7. The Dictyostelium discoideum cellulose synthase: Structure/function analysis and identification of interacting proteins

    Energy Technology Data Exchange (ETDEWEB)

    Richard L. Blanton

    2004-02-19

    OAK-B135 The major accomplishments of this project were: (1) the initial characterization of dcsA, the gene for the putative catalytic subunit of cellulose synthase in the cellular slime mold Dictyostelium discoideum; (2) the detection of a developmentally regulated event (unidentified, but perhaps a protein modification or association with a protein partner) that is required for cellulose synthase activity (i.e., the dcsA product is necessary, but not sufficient for cellulose synthesis); (3) the continued exploration of the developmental context of cellulose synthesis and DcsA; (4) the isolation of a GFP-DcsA-expressing strain (work in progress); and (5) the identification of Dictyostelium homologues for plant genes whose products play roles in cellulose biosynthesis. Although our progress was slow and many of our results negative, we did develop a number of promising avenues of investigation that can serve as the foundation for future projects.

  8. Structural biology at York Structural Biology Laboratory; laboratory information management systems for structural genomics

    Czech Academy of Sciences Publication Activity Database

    Dohnálek, Jan

    2005-01-01

    Vol. 12, No. 1 (2005), p. 3. ISSN 1211-5894. [Meeting of Structural Biologists /4./. 10.03.2005-12.03.2005, Nové Hrady] R&D Projects: GA MŠk(CZ) 1K05008. Keywords: structural biology * LIMS * structural genomics. Subject RIV: CD - Macromolecular Chemistry

  9. Genome structure analysis of molluscs revealed whole genome duplication and lineage specific repeat variation.

    Science.gov (United States)

    Yoshida, Masa-aki; Ishikura, Yukiko; Moritaki, Takeya; Shoguchi, Eiichi; Shimizu, Kentaro K; Sese, Jun; Ogura, Atsushi

    2011-09-01

    Comparative genome structure analysis allows us to identify novel genes, repetitive sequences and gene duplications. To explore lineage-specific genomic changes in molluscs, which are good models for the development of the nervous system in invertebrates, we conducted comparative genome structure analyses of three molluscs (pygmy squid, nautilus and scallop) using partial genome shotgun sequencing. The elements with the greatest effect on genome structural change are repetitive elements (REs), which expand genome size, and whole genome duplication, which produces large numbers of novel functional genes. Therefore, we investigated the variation and proportion of REs and whole genome duplication. We first identified variations of REs in the three molluscan genomes by homology-based and de novo RE detection. The proportions of REs were 9.2%, 4.0%, and 3.8% in the pygmy squid, nautilus and scallop, respectively. We then estimated the genome sizes of the species as 2.1, 4.2 and 1.8 Gb, respectively, using 2× coverage frequency and DNA sequencing theory. We also performed a gene duplication assay based on coding genes, and found that large-scale duplication events occurred after divergence from the limpet Lottia, an out-group of the three molluscan species. Comparison of all the results suggested that RE expansion was not related to the increase in genome size of nautilus. Despite its close relationship to nautilus, the squid has the largest proportion of REs and a smaller genome size than nautilus. We also identified lineage-specific RE and gene-family expansions, possibly related to the acquisition of the most complicated eye and brain systems in the three species. Copyright © 2011 Elsevier B.V. All rights reserved.
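
    The abstract above does not spell out how genome size was derived from coverage. As a rough illustration only, the sketch below assumes the standard Lander-Waterman (Poisson) reading of "DNA sequencing theory", in which mean coverage c = N·L/G, so that G = N·L/c. The function names and all input numbers are hypothetical and are not taken from the paper.

```python
import math


def genome_size_estimate(total_bases_sequenced, mean_coverage):
    """Rearrange c = (N * L) / G to G = (N * L) / c.

    total_bases_sequenced is N * L (read count times read length).
    """
    if mean_coverage <= 0:
        raise ValueError("mean_coverage must be positive")
    return total_bases_sequenced / mean_coverage


def fraction_at_depth(mean_coverage, k):
    """Poisson probability that a given base is covered exactly k times."""
    return math.exp(-mean_coverage) * mean_coverage ** k / math.factorial(k)


# Hypothetical inputs, chosen only to make the arithmetic easy to follow.
total_bases = 1.0e9   # bases sequenced in a partial shotgun run (assumed)
coverage = 0.5        # estimated mean coverage (assumed)

size = genome_size_estimate(total_bases, coverage)
two_fold = fraction_at_depth(coverage, 2)

print(f"estimated genome size: {size / 1e9:.1f} Gb")
print(f"expected fraction of bases at exactly 2x depth: {two_fold:.4f}")
```

    Under this model, the observed share of positions sequenced exactly twice could in principle be matched against `fraction_at_depth` to back out the coverage, which then gives the genome size. Whether the authors followed this route is not stated in the record.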

  10. Genome Structure of the Genus Azospirillum

    Science.gov (United States)

    Martin-Didonet, Claudia C. G.; Chubatsu, Leda S.; Souza, Emanuel M.; Kleina, Margareth; Rego, Fabiane G. M.; Rigo, Liu U.; Yates, M. Geoffrey; Pedrosa, Fabio O.

    2000-01-01

    Azospirillum species are plant-associated diazotrophs of the alpha subclass of Proteobacteria. The genomes of five of the six Azospirillum species were analyzed by pulsed-field gel electrophoresis. All strains possessed several megareplicons, some probably linear, and 16S ribosomal DNA hybridization indicated multiple chromosomes in genomes ranging in size from 4.8 to 9.7 Mbp. The nifHDK operon was identified in the largest replicon. PMID:10869094

  11. Comparative genomics of the relationship between gene structure and expression

    NARCIS (Netherlands)

    Ren, X.

    2006-01-01

    The relationship between the structure of genes and their expression is a relatively new aspect of genome organization and regulation. With more genome sequences and expression data becoming available, bioinformatics approaches can help the further elucidation of the relationships between gene

  12. Phosphorylation-dependent translocation of glycogen synthase to a novel structure during glycogen resynthesis

    DEFF Research Database (Denmark)

    Prats, Clara; Cadefau, Joan A; Cussó, Roser

    2005-01-01

    Glycogen metabolism has been the subject of extensive research, but the mechanisms by which it is regulated are still not fully understood. It is well accepted that the rate-limiting enzymes in glycogenesis and glycogenolysis are glycogen synthase (GS) and glycogen phosphorylase (GPh), respectively...... stimulation of rabbit tibialis anterior muscle, we show GS and GPh intracellular redistribution at the beginning of glycogen resynthesis after contraction-induced glycogen depletion. We identify a new "player," a new intracellular compartment involved in skeletal muscle glycogen metabolism. They are spherical...

  13. Biochemical and Structural Basis for Inhibition of Enterococcus faecalis Hydroxymethylglutaryl-CoA Synthase, mvaS, by Hymeglusin

    Energy Technology Data Exchange (ETDEWEB)

    Skaff, D. Andrew; Ramyar, Kasra X.; McWhorter, William J.; Barta, Michael L.; Geisbrecht, Brian V.; Miziorko, Henry M. (UMKC)

    2012-07-25

    Hymeglusin (1233A, F244, L-659-699) is established as a specific β-lactone inhibitor of eukaryotic hydroxymethylglutaryl-CoA synthase (HMGCS). Inhibition results from formation of a thioester adduct to the active site cysteine. In contrast, the effects of hymeglusin on bacterial HMG-CoA synthase, mvaS, have been minimally characterized. Hymeglusin blocks growth of Enterococcus faecalis. After removal of the inhibitor from culture media, a growth curve inflection point at 3.1 h is observed (vs 0.7 h for the uninhibited control). Upon hymeglusin inactivation of purified E. faecalis mvaS, the thioester adduct is more stable than that measured for human HMGCS. Hydroxylamine cleaves the thioester adduct; substantial enzyme activity is restored at a rate that is 8-fold faster for human HMGCS than for mvaS. Structural results explain these differences in enzyme-inhibitor thioester adduct stability and solvent accessibility. The E. faecalis mvaS-hymeglusin cocrystal structure (1.95 Å) reveals virtually complete occlusion of the bound inhibitor in a narrow tunnel that is largely sequestered from bulk solvent. In contrast, eukaryotic (Brassica juncea) HMGCS binds hymeglusin in a more solvent-exposed cavity.

  14. Genome Editing of Structural Variations: Modeling and Gene Correction.

    Science.gov (United States)

    Park, Chul-Yong; Sung, Jin Jea; Kim, Dong-Wook

    2016-07-01

    The analysis of chromosomal structural variations (SVs), such as inversions and translocations, was made possible by the completion of the human genome project and the development of genome-wide sequencing technologies. SVs contribute to genetic diversity and evolution, although some SVs can cause diseases such as hemophilia A in humans. Genome engineering technology using programmable nucleases (e.g., ZFNs, TALENs, and CRISPR/Cas9) has been rapidly developed, enabling precise and efficient genome editing for SV research. Here, we review advances in modeling and gene correction of SVs, focusing on inversion, translocation, and nucleotide repeat expansion. Copyright © 2016 Elsevier Ltd. All rights reserved.

  15. Effect of chronic ethanol consumption on the subunit structure of the mitochondrial ATP synthase

    Energy Technology Data Exchange (ETDEWEB)

    Coleman, W.B.; Spach, P.I.; Cunningham, C.C. (Wake Forest Univ., Winston-Salem, NC (United States))

    1991-03-11

    The relative concentrations of several subunits of the mitochondrial F0F1 ATP synthase were determined in hepatic mitochondria and submitochondrial particles (SMP) isolated from ethanol-fed and control rats. Animals were maintained on an ethanol-containing liquid diet for 31 days. The polypeptides were analyzed by densitometry measurements of SDS-polyacrylamide gel electrophoresis patterns. Subunit 8 was decreased 31% in intact mitochondria, whereas subunit 6 could not be measured. In contrast, there were no significant ethanol-related depressions in subunits α, β and OSCP of the F0F1 or the adenine nucleotide carrier (AdNC) in intact mitochondria. In ethanol SMP subunits 6 and 8 were decreased 41 and 34%, respectively; subunits α, β, γ and OSCP were also present in lowered amounts due to the loose attachment of F1 to F0. The remainder of the F1 was observed in the soluble fraction resulting from preparation of SMP. The AdNC was present in normal amounts in ethanol SMP. These results demonstrate that ethanol consumption causes a decrease in the content of the mitochondrially synthesized subunits 6 and 8 whereas no effect is exerted on the concentrations of nuclear gene products of the ATP synthase complex. Likewise, the adenine nucleotide transporter, also a nuclear gene product, is unaffected by ethanol consumption.

  16. [Advances in isoprene synthase research].

    Science.gov (United States)

    Gou, Yan; Liu, Zhongchuan; Wang, Ganggang

    2017-11-25

    Isoprene emission can have significant consequences for atmospheric chemistry. In addition, isoprene is a chemical compound with various industrial applications. In organisms, isoprene is produced by isoprene synthase, which eliminates the pyrophosphate from dimethylallyl diphosphate. As a key enzyme of isoprene formation, isoprene synthase plays an important role in both natural emission and artificial synthesis of isoprene. So far, isoprene synthase has been found in various plants. Isoprene synthases from different sources have conserved structures and similar biochemical properties. In this review, the biochemical and structural characteristics of isoprene synthases from different sources are compared, the catalytic mechanism of isoprene synthase is discussed, and prospective applications of the enzyme in bioengineering are proposed.

  17. Structural and thermodynamic basis of the inhibition of Leishmania major farnesyl diphosphate synthase by nitrogen-containing bisphosphonates

    Energy Technology Data Exchange (ETDEWEB)

    Aripirala, Srinivas [Johns Hopkins University, 725 North Wolfe Street WBSB 605, Baltimore, MD 21210 (United States); Gonzalez-Pacanowska, Dolores [López-Neyra Institute of Parasitology and Biomedicine, 18001 Granada (Spain); Oldfield, Eric [University of Illinois at Urbana-Champaign, Urbana, IL 61801 (United States); Kaiser, Marcel [University of Basel, Petersplatz 1, CH-4003 Basel (Switzerland); Amzel, L. Mario, E-mail: mamzel@jhmi.edu [Johns Hopkins University School of Medicine, 725 N. Wolfe Street WBSB 604, Baltimore, MD 21205 (United States); Gabelli, Sandra B., E-mail: mamzel@jhmi.edu [Johns Hopkins University School of Medicine, 725 N. Wolfe Street WBSB 604, Baltimore, MD 21205 (United States); Johns Hopkins University School of Medicine, Baltimore, MD 21205 (United States); Johns Hopkins University, 725 North Wolfe Street WBSB 605, Baltimore, MD 21210 (United States)

    2014-03-01

    Structural insights into L. major farnesyl diphosphate synthase, a key enzyme in the mevalonate pathway, are described. Farnesyl diphosphate synthase (FPPS) is an essential enzyme involved in the biosynthesis of sterols (cholesterol in humans and ergosterol in yeasts, fungi and trypanosomatid parasites) as well as in protein prenylation. It is inhibited by bisphosphonates, a class of drugs used in humans to treat diverse bone-related diseases. The development of bisphosphonates as antiparasitic compounds targeting ergosterol biosynthesis has become an important route for therapeutic intervention. Here, the X-ray crystallographic structures of complexes of FPPS from Leishmania major (the causative agent of cutaneous leishmaniasis) with three bisphosphonates determined at resolutions of 1.8, 1.9 and 2.3 Å are reported. Two of the inhibitors, 1-(2-hydroxy-2,2-diphosphonoethyl)-3-phenylpyridinium (300B) and 3-butyl-1-(2,2-diphosphonoethyl)pyridinium (476A), co-crystallize with the homoallylic substrate isopentenyl diphosphate (IPP) and three Ca²⁺ ions. A third inhibitor, 3-fluoro-1-(2-hydroxy-2,2-diphosphonoethyl)pyridinium (46I), was found to bind two Mg²⁺ ions but not IPP. Calorimetric studies showed that binding of the inhibitors is entropically driven. Comparison of the structures of L. major FPPS (LmFPPS) and human FPPS provides new information for the design of bisphosphonates that will be more specific for inhibition of LmFPPS. The asymmetric structure of the LmFPPS–46I homodimer indicates that binding of the allylic substrate to both monomers of the dimer results in an asymmetric dimer with one open and one closed homoallylic site. It is proposed that IPP first binds to the open site, which then closes, opening the site on the other monomer, which closes after binding the second IPP, leading to the symmetric fully occupied FPPS dimer observed in other structures.

  18. Structural Genomics of Minimal Organisms: Pipeline and Results

    Energy Technology Data Exchange (ETDEWEB)

    Kim, Sung-Hou; Shin, Dong-Hae; Kim, Rosalind; Adams, Paul; Chandonia, John-Marc

    2007-09-14

    The initial objective of the Berkeley Structural Genomics Center was to obtain near-complete three-dimensional (3D) structural information for all soluble proteins of two minimal organisms, the closely related pathogens Mycoplasma genitalium and M. pneumoniae. The former has fewer than 500 genes and the latter has fewer than 700 genes. A semiautomated structural genomics pipeline was set up, spanning target selection, cloning, expression, purification, and ultimately structural determination. At the time of this writing, structural information for more than 93 percent of all soluble proteins of M. genitalium is available. This chapter summarizes the approaches taken by the authors' center.

  19. Visualization of RNA structure models within the Integrative Genomics Viewer.

    Science.gov (United States)

    Busan, Steven; Weeks, Kevin M

    2017-07-01

    Analyses of the interrelationships between RNA structure and function are increasingly important components of genomic studies. The SHAPE-MaP strategy enables accurate RNA structure probing and realistic structure modeling of kilobase-length noncoding RNAs and mRNAs. Existing tools for visualizing RNA structure models are not suitable for efficient analysis of long, structurally heterogeneous RNAs. In addition, structure models are often advantageously interpreted in the context of other experimental data and gene annotation information, for which few tools currently exist. We have developed a module within the widely used and well supported open-source Integrative Genomics Viewer (IGV) that allows visualization of SHAPE and other chemical probing data, including raw reactivities, data-driven structural entropies, and data-constrained base-pair secondary structure models, in context with linear genomic data tracks. We illustrate the usefulness of visualizing RNA structure in the IGV by exploring structure models for a large viral RNA genome, comparing bacterial mRNA structure in cells with its structure under cell- and protein-free conditions, and comparing a noncoding RNA structure modeled using SHAPE data with a base-pairing model inferred through sequence covariation analysis. © 2017 Busan and Weeks; Published by Cold Spring Harbor Laboratory Press for the RNA Society.

  20. 3D genome structure modeling by Lorentzian objective function.

    Science.gov (United States)

    Trieu, Tuan; Cheng, Jianlin

    2017-02-17

    The 3D structure of the genome plays a vital role in biological processes such as gene interaction, gene regulation, DNA replication and genome methylation. Advanced chromosomal conformation capture techniques, such as Hi-C and tethered conformation capture, can generate chromosomal contact data that can be used to computationally reconstruct 3D structures of the genome. We developed a novel restraint-based method that is capable of reconstructing 3D genome structures utilizing both intra- and inter-chromosomal contact data. Our method was robust to noise and performed well in comparison with a panel of existing methods on a controlled simulated data set. On a real Hi-C data set of the human genome, our method produced chromosome and genome structures that are consistent with 3D FISH data and existing knowledge of the human chromosome and genome, such as chromosome territories and the cluster of small chromosomes in the nucleus center, with the exception of chromosome 18. The tool and experimental data are available at https://missouri.box.com/v/LorDG.
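
    The record above names a Lorentzian objective but gives no formula. The sketch below is a minimal illustration of why a Lorentzian-shaped restraint term tolerates noisy contacts; it assumes the generic form c² / (c² + (d − d_target)²) summed over restrained pairs, which may differ from the objective actually implemented in the tool. The function name, the width parameter `c`, and the toy coordinates are all hypothetical.

```python
import math


def lorentzian_score(coords, restraints, c=1.0):
    """Sum a Lorentzian term over distance restraints.

    coords     : list of (x, y, z) bead positions
    restraints : list of (i, j, target_distance) tuples
    c          : width of the Lorentzian (assumed parameter)

    Each term equals 1 when the restraint is met exactly and decays toward 0
    as the violation grows, so a single bad restraint can lower the score by
    at most 1 instead of dominating it, as a squared-error term would.
    """
    total = 0.0
    for i, j, target in restraints:
        d = math.dist(coords[i], coords[j])
        total += c * c / (c * c + (d - target) ** 2)
    return total


# Toy structure: three beads on a line, one unit apart.
coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]

clean = [(0, 1, 1.0), (1, 2, 1.0), (0, 2, 2.0)]
noisy = [(0, 1, 1.0), (1, 2, 1.0), (0, 2, 100.0)]  # one grossly wrong target

clean_score = lorentzian_score(coords, clean)
noisy_score = lorentzian_score(coords, noisy)

print(clean_score, noisy_score)
```

    In a reconstruction setting the score would be maximized over the bead coordinates, for example by gradient ascent; that optimization step is omitted here.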

  1. An active site–tail interaction in the structure of hexahistidine-tagged Thermoplasma acidophilum citrate synthase

    Energy Technology Data Exchange (ETDEWEB)

    Murphy, Jesse R.; Donini, Stefano; Kappock, T. Joseph, E-mail: kappock@purdue.edu [Purdue University, 175 South University Street, West Lafayette, IN 47907-2063 (United States)

    2015-09-23

    Citrate synthase from the thermophilic euryarchaeon T. acidophilum fused to a hexahistidine tag was purified and biochemically characterized. The structure of the unliganded enzyme at 2.2 Å resolution contains tail–active site contacts in half of the active sites. Citrate synthase (CS) plays a central metabolic role in aerobes and many other organisms. The CS reaction comprises two half-reactions: a Claisen aldol condensation of acetyl-CoA (AcCoA) and oxaloacetate (OAA) that forms citryl-CoA (CitCoA), and CitCoA hydrolysis. Protein conformational changes that ‘close’ the active site play an important role in the assembly of a catalytically competent condensation active site. CS from the thermoacidophile Thermoplasma acidophilum (TpCS) possesses an endogenous Trp fluorophore that can be used to monitor the condensation reaction. The 2.2 Å resolution crystal structure of TpCS fused to a C-terminal hexahistidine tag (TpCSH6) reported here is an ‘open’ structure that, when compared with several liganded TpCS structures, helps to define a complete path for active-site closure. One active site in each dimer binds a neighboring His tag, the first nonsubstrate ligand known to occupy both the AcCoA and OAA binding sites. Solution data collectively suggest that this fortuitous interaction is stabilized by the crystalline lattice. As a polar but almost neutral ligand, the active site–tail interaction provides a new starting point for the design of bisubstrate-analog inhibitors of CS.

  2. Crystallization, preliminary X-ray diffraction and structure solution of MosA, a dihydrodipicolinate synthase from Sinorhizobium meliloti L5-30

    International Nuclear Information System (INIS)

    Leduc, Yvonne A.; Phenix, Christopher P.; Puttick, Jennifer; Nienaber, Kurt; Palmer, David R. J.; Delbaere, Louis T. J.

    2005-01-01

    MosA from S. meliloti L5-30 has been crystallized in solution with pyruvate and the 2.3 Å resolution structure has been solved by molecular replacement using E. coli dihydrodipicolinate synthase as the model. The structure of MosA, a dihydrodipicolinate synthase and reported methyltransferase from Sinorhizobium meliloti, has been solved using molecular replacement with Escherichia coli dihydrodipicolinate synthase as the model. A crystal grown in the presence of pyruvate diffracted X-rays to 2.3 Å resolution using synchrotron radiation and belonged to the orthorhombic space group C222₁, with unit-cell parameters a = 69.14, b = 138.87, c = 124.13 Å.
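
    For an orthorhombic cell all three angles are 90°, so the unit-cell volume follows directly from the reported edge lengths as V = a·b·c. The short sketch below works that arithmetic through for the parameters quoted above; the function name is illustrative and the calculation is not part of the original record.

```python
def orthorhombic_cell_volume(a, b, c):
    """Volume of an orthorhombic unit cell (all angles 90 degrees), in Å^3."""
    return a * b * c


# Unit-cell edges in Å, as reported in the abstract.
a, b, c = 69.14, 138.87, 124.13

volume = orthorhombic_cell_volume(a, b, c)
print(f"unit-cell volume: {volume:,.0f} Å^3")
```

    The result is on the order of 1.19 × 10⁶ Å³.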

  3. Rapid detection of structural variation in a human genome using nanochannel-based genome mapping technology

    DEFF Research Database (Denmark)

    Cao, Hongzhi; Hastie, Alex R.; Cao, Dandan

    2014-01-01

    BACKGROUND: Structural variants (SVs) are less common than single nucleotide polymorphisms and indels in the population, but collectively account for a significant fraction of genetic polymorphism and diseases. Base pair differences arising from SVs are on a much higher order (>100 fold) than point...... mutations; however, none of the current detection methods are comprehensive, and currently available methodologies are incapable of providing sufficient resolution and unambiguous information across complex regions in the human genome. To address these challenges, we applied a high-throughput, cost......-effective genome mapping technology to comprehensively discover genome-wide SVs and characterize complex regions of the YH genome using long single molecules (>150 kb) in a global fashion. RESULTS: Utilizing nanochannel-based genome mapping technology, we obtained 708 insertions/deletions and 17 inversions larger...

  4. Campylobacter jejuni fatty acid synthase II: Structural and functional analysis of β-hydroxyacyl-ACP dehydratase (FabZ)

    Energy Technology Data Exchange (ETDEWEB)

    Kirkpatrick, Andrew S.; Yokoyama, Takeshi; Choi, Kyoung-Jae; Yeo, Hye-Jeong; (Houston)

    2009-08-14

    Fatty acid biosynthesis is crucial for all living cells. In contrast to higher organisms, bacteria use a type II fatty acid synthase (FAS II) composed of a series of individual proteins, making FAS II enzymes excellent targets for antibiotics discovery. The β-hydroxyacyl-ACP dehydratase (FabZ) catalyzes an essential step in the FAS II pathway. Here, we report the structure of Campylobacter jejuni FabZ (CjFabZ), showing a hexamer both in crystals and solution, with each protomer adopting the characteristic hot dog fold. Together with biochemical analysis of CjFabZ, we define the first functional FAS II enzyme from this pathogen, and provide a framework for investigation on roles of FAS II in C. jejuni virulence.

  5. In Silico Structure Prediction of Human Fatty Acid Synthase-Dehydratase: A Plausible Model for Understanding Active Site Interactions.

    Science.gov (United States)

    John, Arun; Umashankar, Vetrivel; Samdani, A; Sangeetha, Manoharan; Krishnakumar, Subramanian; Deepa, Perinkulam Ravi

    2016-01-01

    Fatty acid synthase (FASN, UniProt ID: P49327) is a multienzyme dimer complex that plays a critical role in lipogenesis. Consequently, this lipogenic enzyme has gained tremendous biomedical importance. The role of FASN and its inhibition is being extensively researched in several clinical conditions, such as cancers, obesity, and diabetes. X-ray crystallographic structures of some of its domains, such as β-ketoacyl synthase, acetyl transacylase, malonyl transacylase, enoyl reductase, β-ketoacyl reductase, and thioesterase (TE), are already reported. Here, we have attempted an in silico elucidation of the uncrystallized dehydratase (DH) catalytic domain of human FASN. This theoretical model for the DH domain was predicted using comparative modeling methods. Different stand-alone tools and servers were used to validate and check the reliability of the predicted models, which suggested it to be a highly plausible model. The stereochemical analysis showed 92.0% of residues in the favorable region of the Ramachandran plot. The initial physiological substrate, the β-hydroxybutyryl group, was docked into the active site of the DH domain using Glide. The molecular dynamics simulations carried out for 20 ns in apo and holo states indicated the stability and accuracy of the predicted structure in solvated condition. The predicted model provided useful biochemical insights into the substrate-active site binding mechanisms. This model was then used for identifying potential FASN inhibitors using high-throughput virtual screening of the National Cancer Institute database of chemical ligands. The inhibitory efficacy of the top hit ligands was validated by performing molecular dynamics simulation for 20 ns, wherein the ligand NSC71039 exhibited good enzyme inhibition characteristics and exhibited dose-dependent anticancer cytotoxicity in retinoblastoma cancer cells in vitro.

  6. From DNA Sequences to Chemical Structures – Methods for Mining Microbial Genomic and Metagenomic Data Sets for New Natural Products

    Directory of Open Access Journals (Sweden)

    Jurica Zucko

    2010-01-01

    Full Text Available Rapid mining of large genomic and metagenomic data sets for modular polyketide synthases, non-ribosomal peptide synthetases and hybrid polyketide synthase/non-ribosomal peptide synthetase biosynthetic gene clusters has been achieved using the generic computer program packages ClustScan and CompGen. These program packages perform the annotation with the hierarchical structuring into polypeptides, modules and domains, as well as storage and graphical presentations of the data. This aims to achieve the most accurate predictions of the activities and specificities of catalytically active domains that can be made with present knowledge, leading to a prediction of the most likely chemical structures produced by these enzymes. The program packages also allow generation of novel clusters by homologous recombination of the annotated genes in silico. ClustScan and CompGen were used to construct a custom database of known compounds (CSDB) and of predicted entirely novel recombinant products (r-CSDB) that can be used for in silico screening with computer aided drug design technology. The use of these programs has been exemplified by analysing genomic sequences from terrestrial prokaryotes and eukaryotic microorganisms, a marine metagenomic data set and a newly discovered example of a 'shared metabolic pathway' in marine-microbial endosymbiosis.

  7. Structural dynamics of retroviral genome and the packaging.

    Science.gov (United States)

    Miyazaki, Yasuyuki; Miyake, Ariko; Nomaguchi, Masako; Adachi, Akio

    2011-01-01

    Retroviruses can cause diseases such as AIDS, leukemia, and tumors, but are also used as vectors for human gene therapy. All retroviruses, except foamy viruses, package two copies of unspliced genomic RNA into their progeny viruses. Understanding the molecular mechanisms of retroviral genome packaging will aid the design of new anti-retroviral drugs targeting the packaging process and improve the efficacy of retroviral vectors. Retroviral genomes have to be specifically recognized by the cognate nucleocapsid domain of the Gag polyprotein from among an excess of cellular and spliced viral mRNA. Extensive virological and structural studies have revealed how retroviral genomic RNA is selectively packaged into the viral particles. The genomic area responsible for the packaging is generally located in the 5' untranslated region (5' UTR), and contains dimerization site(s). Recent studies have shown that retroviral genome packaging is modulated by structural changes of RNA at the 5' UTR accompanied by the dimerization. In this review, we focus on three representative retroviruses, Moloney murine leukemia virus, human immunodeficiency virus type 1 and 2, and describe the molecular mechanism of retroviral genome packaging.

  8. Structural dynamics of retroviral genome and the packaging

    Directory of Open Access Journals (Sweden)

    Yasuyuki Miyazaki

    2011-12-01

    Full Text Available Retroviruses can cause diseases such as AIDS, leukemia and tumors, but are also used as vectors for human gene therapy. All retroviruses, except foamy viruses, package two copies of unspliced genomic RNA into their progeny viruses. Understanding the molecular mechanisms of retroviral genome packaging will aid the design of new anti-retroviral drugs targeting the packaging process and improve the efficacy of retroviral vectors. Retroviral genomes have to be specifically recognized by the cognate nucleocapsid (NC) domain of the Gag polyprotein from among an excess of cellular and spliced viral mRNA. Extensive virological and structural studies have revealed how retroviral genomic RNA is selectively packaged into the viral particles. The genomic area responsible for the packaging is generally located in the 5’ untranslated region (5’ UTR), and contains dimerization site(s). Recent studies have shown that retroviral genome packaging is modulated by structural changes of RNA at the 5’ UTR accompanied by the dimerization. In this review, we focus on three representative retroviruses, Moloney murine leukemia virus (MoMLV), human immunodeficiency virus type 1 (HIV-1) and 2 (HIV-2), and describe the molecular mechanism of retroviral genome packaging.

  9. Crystal structures of human HMG-CoA synthase isoforms provide insights into inherited ketogenesis disorders and inhibitor design.

    Science.gov (United States)

    Shafqat, Naeem; Turnbull, Andrew; Zschocke, Johannes; Oppermann, Udo; Yue, Wyatt W

    2010-05-14

    3-Hydroxy-3-methylglutaryl coenzyme A (CoA) synthase (HMGCS) catalyzes the condensation of acetyl-CoA and acetoacetyl-CoA into 3-hydroxy-3-methylglutaryl CoA. It is ubiquitous across the phylogenetic tree and is broadly classified into three classes. The prokaryotic isoform is essential in Gram-positive bacteria for isoprenoid synthesis via the mevalonate pathway. The eukaryotic cytosolic isoform also participates in the mevalonate pathway but its end product is cholesterol. Mammals also contain a mitochondrial isoform; its deficiency results in an inherited disorder of ketone body formation. Here, we report high-resolution crystal structures of the human cytosolic (hHMGCS1) and mitochondrial (hHMGCS2) isoforms in binary product complexes. Our data represent the first structures solved for human HMGCS and the mitochondrial isoform, allowing for the first time structural comparison among the three isoforms. This serves as a starting point for the development of isoform-specific inhibitors that have potential cholesterol-reducing and antibiotic applications. In addition, missense mutations that cause mitochondrial HMGCS deficiency have been mapped onto the hHMGCS2 structure to rationalize the structural basis for the disease pathology. (c) 2010 Elsevier Ltd. All rights reserved.

  10. Genome sequencing-assisted identification and the first functional validation of N-acyl-homoserine-lactone synthases from the Sphingomonadaceae family

    Directory of Open Access Journals (Sweden)

    Han Ming Gan

    2016-08-01

    Full Text Available Background Members of the genus Novosphingobium have been isolated from a variety of environmental niches. Although genomics analyses have suggested the presence of genes associated with quorum sensing signal production, e.g., the N-acyl-homoserine lactone (AHL) synthase (luxI) homologs in various Novosphingobium species, to date, no luxI homologs have been experimentally validated. Methods In this study, we report the draft genome of the AHL-producing bacterium Novosphingobium subterraneum DSM 12447 and validate the functions of predicted luxI homologs from the bacterium through inducible heterologous expression in Agrobacterium tumefaciens strain NTL4. We developed a two-dimensional thin layer chromatography bioassay and used LC-ESI MS/MS analyses to separate, detect and identify the AHL signals produced by the N. subterraneum DSM 12447 strain. Results Three predicted luxI homologs were annotated to the locus tags NJ75_2841 (NovINsub1), NJ75_2498 (NovINsub2), and NJ75_4146 (NovINsub3). Inducible heterologous expression of each luxI homolog followed by LC-ESI MS/MS and two-dimensional reverse phase thin layer chromatography bioassays followed by bioluminescent CCD camera imaging indicates that the three LuxI homologs are able to produce a variety of medium-length AHL compounds. New insights into the LuxI phylogeny were also gleaned through Bayesian inference. Discussion This study significantly adds to our current understanding of quorum sensing in the genus Novosphingobium and provides the framework for future characterization of the phylogenetically interesting LuxI homologs from members of the genus Novosphingobium and more generally the family Sphingomonadaceae.

  11. Structural and functional characterization of three polyketide synthase gene clusters in Bacillus amyloliquefaciens FZB 42.

    Science.gov (United States)

    Chen, Xiao-Hua; Vater, Joachim; Piel, Jörn; Franke, Peter; Scholz, Romy; Schneider, Kathrin; Koumoutsi, Alexandra; Hitzeroth, Gabriele; Grammel, Nicolas; Strittmatter, Axel W; Gottschalk, Gerhard; Süssmuth, Roderich D; Borriss, Rainer

    2006-06-01

    Although bacterial polyketides are of considerable biomedical interest, the molecular biology of polyketide biosynthesis in Bacillus spp., one of the richest bacterial sources of bioactive natural products, remains largely unexplored. Here we assign for the first time complete polyketide synthase (PKS) gene clusters to Bacillus antibiotics. Three giant modular PKS systems of the trans-acyltransferase type were identified in Bacillus amyloliquefaciens FZB 42. One of them, pks1, is an ortholog of the pksX operon with a previously unknown function in the sequenced model strain Bacillus subtilis 168, while the pks2 and pks3 clusters are novel gene clusters. Cassette mutagenesis combined with advanced mass spectrometric techniques such as matrix-assisted laser desorption ionization-time of flight mass spectrometry and liquid chromatography-electrospray ionization mass spectrometry revealed that the pks1 (bae) and pks3 (dif) gene clusters encode the biosynthesis of the polyene antibiotics bacillaene and difficidin or oxydifficidin, respectively. In addition, B. subtilis OKB105 (pheA sfp(0)), a transformant of the B. subtilis 168 derivative JH642, was shown to produce bacillaene, demonstrating that the pksX gene cluster directs the synthesis of that polyketide. The GenBank accession numbers for gene clusters pks1(bae), pks2, and pks3(dif) are AJ 634060.2, AJ 6340601.2, and AJ 6340602.2, respectively.

  12. NMR Crystallography of Enzyme Active Sites: Probing Chemically-Detailed, Three-Dimensional Structure in Tryptophan Synthase

    Science.gov (United States)

    Dunn, Michael F.

    2013-01-01

    crystallography for application to enzyme catalysis. We begin with a brief introduction to NMR crystallography and then define the process that we have employed to probe the active site in the β-subunit of tryptophan synthase with unprecedented atomic-level resolution. This approach has resulted in a novel structural hypothesis for the protonation state of the quinonoid intermediate in tryptophan synthase and its surprising role in directing the next step in the catalysis of L-Trp formation. PMID:23537227

  13. Structural genomics of infectious disease drug targets: the SSGCID

    International Nuclear Information System (INIS)

    Stacy, Robin; Begley, Darren W.; Phan, Isabelle; Staker, Bart L.; Van Voorhis, Wesley C.; Varani, Gabriele; Buchko, Garry W.; Stewart, Lance J.; Myler, Peter J.

    2011-01-01

    An introduction and overview of the focus, goals and overall mission of the Seattle Structural Genomics Center for Infectious Disease (SSGCID) is given. The Seattle Structural Genomics Center for Infectious Disease (SSGCID) is a consortium of researchers at Seattle BioMed, Emerald BioStructures, the University of Washington and Pacific Northwest National Laboratory that was established to apply structural genomics approaches to drug targets from infectious disease organisms. The SSGCID is currently funded over a five-year period by the National Institute of Allergy and Infectious Diseases (NIAID) to determine the three-dimensional structures of 400 proteins from a variety of Category A, B and C pathogens. Target selection engages the infectious disease research and drug-therapy communities to identify drug targets, essential enzymes, virulence factors and vaccine candidates of biomedical relevance to combat infectious diseases. The protein-expression systems, purified proteins, ligand screens and three-dimensional structures produced by SSGCID constitute a valuable resource for drug-discovery research, all of which is made freely available to the greater scientific community. This issue of Acta Crystallographica Section F, entirely devoted to the work of the SSGCID, covers the details of the high-throughput pipeline and presents a series of structures from a broad array of pathogenic organisms. Here, a background is provided on the structural genomics of infectious disease, the essential components of the SSGCID pipeline are discussed and a survey of progress to date is presented.

  14. Modes of Heme-Binding and Substrate Access for Cytochrome P450 CYP74A Revealed by Crystal Structures of Allene Oxide Synthase

    Science.gov (United States)

    Cytochrome P450s exist ubiquitously in all organisms and are involved in many biological processes. Allene oxide synthase (AOS) is a P450 enzyme that plays a key role in the biosynthesis of oxylipin jasmonates which are involved in signal and defense reactions in higher plants. The crystal structure...

  15. Structure of human farnesyl pyrophosphate synthase in complex with an aminopyridine bisphosphonate and two molecules of inorganic phosphate

    Energy Technology Data Exchange (ETDEWEB)

    Park, Jaeok [McGill University, 3655 Promenade Sir William Osler, Montreal, QC H3G 1Y6 (Canada); Lin, Yih-Shyan [McGill University, 801 Rue Sherbrooke Ouest, Montreal, QC H3A 0B8 (Canada); Tsantrizos, Youla S. [McGill University, 3655 Promenade Sir William Osler, Montreal, QC H3G 1Y6 (Canada); McGill University, 801 Rue Sherbrooke Ouest, Montreal, QC H3A 0B8 (Canada); McGill University, 3649 Promenade Sir William Osler, Montreal, QC H3G 0B1 (Canada); Berghuis, Albert M., E-mail: albert.berghuis@mcgill.ca [McGill University, 3655 Promenade Sir William Osler, Montreal, QC H3G 1Y6 (Canada); McGill University, 3649 Promenade Sir William Osler, Montreal, QC H3G 0B1 (Canada); McGill University, 3775 Rue University, Montreal, QC H3A 2B4 (Canada)

    2014-02-19

    A co-crystal structure of human farnesyl pyrophosphate synthase in complex with an aminopyridine bisphosphonate, YS0470, and two molecules of inorganic phosphate has been determined. The identity of the phosphate ligands was confirmed by anomalous diffraction data. Human farnesyl pyrophosphate synthase (hFPPS) produces farnesyl pyrophosphate, an isoprenoid essential for a variety of cellular processes. The enzyme has been well established as the molecular target of the nitrogen-containing bisphosphonates (N-BPs), which are best known for their antiresorptive effects in bone but are also known for their anticancer properties. Crystal structures of hFPPS in ternary complexes with a novel bisphosphonate, YS0470, and the secondary ligands inorganic phosphate (Pi), inorganic pyrophosphate (PPi) and isopentenyl pyrophosphate (IPP) have recently been reported. Only the co-binding of the bisphosphonate with either PPi or IPP resulted in the full closure of the C-terminal tail of the enzyme, a conformational change that is required for catalysis and that is also responsible for the potent in vivo efficacy of N-BPs. In the present communication, a co-crystal structure of hFPPS in complex with YS0470 and two molecules of Pi is reported. The unusually close proximity between these ligands, which was confirmed by anomalous diffraction data, suggests that they interact with one another, with their anionic charges neutralized in their bound state. The structure also showed the tail of the enzyme to be fully disordered, indicating that simultaneous binding of two Pi molecules with a bisphosphonate cannot induce the tail-closing conformational change in hFPPS. Examination of homologous FPPSs suggested that this ligand-dependent tail closure is only conserved in the mammalian proteins. The prevalence of Pi-bound hFPPS structures in the PDB raises a question regarding the in vivo relevance of Pi binding to the function of the enzyme.

  16. Three-dimensional structures of Plasmodium falciparum spermidine synthase with bound inhibitors suggest new strategies for drug design

    Energy Technology Data Exchange (ETDEWEB)

    Sprenger, Janina [Lund University, SE-221 00 Lund (Sweden); Lund University, SE-221 84 Lund (Sweden); Svensson, Bo [Lund University, SE-221 00 Lund (Sweden); SARomics Biostructures AB, Box 724, SE-220 07 Lund (Sweden); Hålander, Jenny [Lund University, SE-221 00 Lund (Sweden); Carey, Jannette [Princeton University, Princeton, New Jersey (United States); Persson, Lo [Lund University, SE-221 84 Lund (Sweden); Al-Karadaghi, Salam, E-mail: salam.al-karadaghi@biochemistry.lu.se [Lund University, SE-221 00 Lund (Sweden)

    2015-03-01

    In this work, X-ray crystallography was used to examine ligand complexes of spermidine synthase from the malaria parasite Plasmodium falciparum (PfSpdS). The enzymes of the polyamine-biosynthesis pathway have been proposed to be promising drug targets in the treatment of malaria. Spermidine synthase (SpdS; putrescine aminopropyltransferase) catalyzes the transfer of the aminopropyl moiety from decarboxylated S-adenosylmethionine to putrescine, leading to the formation of spermidine and 5′-methylthioadenosine (MTA). In this work, X-ray crystallography was used to examine ligand complexes of SpdS from the malaria parasite Plasmodium falciparum (PfSpdS). Five crystal structures were determined of PfSpdS in complex with MTA and the substrate putrescine, with MTA and spermidine, which was obtained as a result of the enzymatic reaction taking place within the crystals, with dcAdoMet and the inhibitor 4-methylaniline, with MTA and 4-aminomethylaniline, and with a compound predicted in earlier in silico screening to bind to the active site of the enzyme, benzimidazol-(2-yl)pentan-1-amine (BIPA). In contrast to the other inhibitors tested, the complex with BIPA was obtained without any ligand bound to the dcAdoMet-binding site of the enzyme. The complexes with the aniline compounds and BIPA revealed a new mode of ligand binding to PfSpdS. The observed binding mode of the ligands, and the interplay between the two substrate-binding sites and the flexible gatekeeper loop, can be used in the design of new approaches in the search for new inhibitors of SpdS.

  17. Structural Genomics and Drug Discovery for Infectious Diseases

    International Nuclear Information System (INIS)

    Anderson, W.F.

    2009-01-01

    The application of structural genomics methods and approaches to proteins from organisms causing infectious diseases is making available the three dimensional structures of many proteins that are potential drug targets and laying the groundwork for structure aided drug discovery efforts. There are a number of structural genomics projects with a focus on pathogens that have been initiated worldwide. The Center for Structural Genomics of Infectious Diseases (CSGID) was recently established to apply state-of-the-art high throughput structural biology technologies to the characterization of proteins from the National Institute for Allergy and Infectious Diseases (NIAID) category A-C pathogens and organisms causing emerging, or re-emerging infectious diseases. The target selection process emphasizes potential biomedical benefits. Selected proteins include known drug targets and their homologs, essential enzymes, virulence factors and vaccine candidates. The Center also provides a structure determination service for the infectious disease scientific community. The ultimate goal is to generate a library of structures that are available to the scientific community and can serve as a starting point for further research and structure aided drug discovery for infectious diseases. To achieve this goal, the CSGID will determine protein crystal structures of 400 proteins and protein-ligand complexes using proven, rapid, highly integrated, and cost-effective methods for such determination, primarily by X-ray crystallography. High throughput crystallographic structure determination is greatly aided by frequent, convenient access to high-performance beamlines at third-generation synchrotron X-ray sources.

  18. Structural Genomics and Drug Discovery for Infectious Diseases

    Energy Technology Data Exchange (ETDEWEB)

    Anderson, W.F.

    2010-09-03

    The application of structural genomics methods and approaches to proteins from organisms causing infectious diseases is making available the three dimensional structures of many proteins that are potential drug targets and laying the groundwork for structure aided drug discovery efforts. There are a number of structural genomics projects with a focus on pathogens that have been initiated worldwide. The Center for Structural Genomics of Infectious Diseases (CSGID) was recently established to apply state-of-the-art high throughput structural biology technologies to the characterization of proteins from the National Institute for Allergy and Infectious Diseases (NIAID) category A-C pathogens and organisms causing emerging, or re-emerging infectious diseases. The target selection process emphasizes potential biomedical benefits. Selected proteins include known drug targets and their homologs, essential enzymes, virulence factors and vaccine candidates. The Center also provides a structure determination service for the infectious disease scientific community. The ultimate goal is to generate a library of structures that are available to the scientific community and can serve as a starting point for further research and structure aided drug discovery for infectious diseases. To achieve this goal, the CSGID will determine protein crystal structures of 400 proteins and protein-ligand complexes using proven, rapid, highly integrated, and cost-effective methods for such determination, primarily by X-ray crystallography. High throughput crystallographic structure determination is greatly aided by frequent, convenient access to high-performance beamlines at third-generation synchrotron X-ray sources.

  19. Genome segment 6 of Antheraea mylitta cypovirus encodes a structural protein with ATPase activity

    International Nuclear Information System (INIS)

    Chavali, Venkata R.M.; Madhurantakam, Chaithanya; Ghorai, Suvankar; Roy, Sobhan; Das, Amit K.; Ghosh, Ananta K.

    2008-01-01

    The genome segment 6 (S6) of the 11 double-stranded RNA genome segments from Antheraea mylitta cypovirus (AmCPV) was converted into cDNA, cloned and sequenced. S6 consisted of 1944 nucleotides with an ORF of 607 amino acids and could encode a protein of 68 kDa, termed P68. Motif scan and molecular docking analysis of P68 showed the presence of two cystathionine beta synthase (CBS) domains and ATP binding sites. The ORF of AmCPV S6 was expressed in E. coli as a His-tag fusion protein and a polyclonal antibody was raised. Immunoblot analysis of virus-infected gut cells and purified polyhedra using the raised anti-P68 polyclonal antibody showed that S6 encodes a viral structural protein. Fluorescence and ATPase assays of soluble P68 produced in Sf-9 cells via a baculovirus expression system showed its ability to bind and cleave ATP. These results suggest that P68 may bind viral RNA through CBS domains and help in replication and transcription through ATP binding and hydrolysis.
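
The figures quoted in this abstract (a 1944-nucleotide segment, a 607-residue ORF, a protein of about 68 kDa) can be cross-checked with simple arithmetic. The sketch below is only an illustration: the average residue mass of 110 Da is a common rule of thumb and is not taken from the abstract, and the function names are hypothetical.

```python
# Rough consistency check of ORF length and protein mass.
# ASSUMPTION: ~110 Da per amino-acid residue is a rule-of-thumb average,
# not a value reported in the abstract.
AVG_RESIDUE_MASS_DA = 110.0


def orf_length_nt(n_residues: int, include_stop: bool = True) -> int:
    """Nucleotides needed to encode n_residues (3 per codon, plus an optional stop codon)."""
    return 3 * n_residues + (3 if include_stop else 0)


def approx_mass_kda(n_residues: int, avg_residue_mass_da: float = AVG_RESIDUE_MASS_DA) -> float:
    """Approximate protein mass in kDa from the residue count."""
    return n_residues * avg_residue_mass_da / 1000.0


if __name__ == "__main__":
    segment_nt = 1944   # segment length reported in the abstract
    residues = 607      # ORF length reported in the abstract

    coding_nt = orf_length_nt(residues)
    print("coding nucleotides (incl. stop):", coding_nt)
    print("non-coding nucleotides left over:", segment_nt - coding_nt)
    print("approximate mass (kDa):", round(approx_mass_kda(residues), 2))
```

Under these assumptions the ORF occupies most of the segment and the estimated mass falls close to the reported 68 kDa, so the three numbers are mutually consistent.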

  20. Structural determinants and mechanism of HIV-1 genome packaging.

    Science.gov (United States)

    Lu, Kun; Heng, Xiao; Summers, Michael F

    2011-07-22

    Like all retroviruses, the human immunodeficiency virus selectively packages two copies of its unspliced RNA genome, both of which are utilized for strand-transfer-mediated recombination during reverse transcription, a process that enables rapid evolution under environmental and chemotherapeutic pressures. The viral RNA appears to be selected for packaging as a dimer, and there is evidence that dimerization and packaging are mechanistically coupled. Both processes are mediated by interactions between the nucleocapsid domains of a small number of assembling viral Gag polyproteins and RNA elements within the 5'-untranslated region of the genome. A number of secondary structures have been predicted for regions of the genome that are responsible for packaging, and high-resolution structures have been determined for a few small RNA fragments and protein-RNA complexes. However, major questions regarding the RNA structures (and potentially the structural changes) that are responsible for dimeric genome selection remain unanswered. Here, we review efforts that have been made to identify the molecular determinants and mechanism of human immunodeficiency virus type 1 genome packaging. Copyright © 2011 Elsevier Ltd. All rights reserved.

  1. Polyketide synthase from Fusarium

    DEFF Research Database (Denmark)

    Kvesel, Kasper; Wimmer, Reinhard; Sørensen, Jens Laurids

    Fungi produce a wide array of secondary metabolites with interesting bioactivities, with the help of a number of enzyme complexes. Polyketide synthases (PKSs) are a class of multidomain enzymes that produce a class of secondary metabolites called polyketides. Only a few structures of PKSs have been...

  2. The Impact of Structural Genomics: Expectations and Outcomes

    Energy Technology Data Exchange (ETDEWEB)

    Chandonia, John-Marc; Brenner, Steven E.

    2005-12-21

    Structural Genomics (SG) projects aim to expand our structural knowledge of biological macromolecules, while lowering the average costs of structure determination. We quantitatively analyzed the novelty, cost, and impact of structures solved by SG centers, and contrast these results with traditional structural biology. The first structure from a protein family is particularly important to reveal the fold and ancient relationships to other proteins. In the last year, approximately half of such structures were solved at a SG center rather than in a traditional laboratory. Furthermore, the cost of solving a structure at the most efficient U.S. center has now dropped to one-quarter the estimated cost of solving a structure by traditional methods. However, top structural biology laboratories are much more efficient than the average, and comparable to SG centers despite working on very challenging structures. Moreover, traditional structural biology papers are cited significantly more often, suggesting greater current impact.

  3. Enzyme That Makes You Cry–Crystal Structure of Lachrymatory Factor Synthase from Allium cepa

    Energy Technology Data Exchange (ETDEWEB)

    Silvaroli, Josie A.; Pleshinger, Matthew J. [College of Wooster, Wooster, Ohio (United States)]; Banerjee, Surajit; Kiser, Philip D.; Golczak, Marcin

    2017-07-26

    The biochemical pathway that gives onions their savor is part of the chemical warfare against microbes and animals. This defense mechanism involves formation of a volatile lachrymatory factor (LF) ((Z)-propanethial S-oxide) that causes familiar eye irritation associated with onion chopping. LF is produced in a reaction catalyzed by lachrymatory factor synthase (LFS). The principles by which LFS facilitates conversion of a sulfenic acid substrate into LF have been difficult to experimentally examine owing to the inherent substrate reactivity and lability of LF. To shed light on the mechanism of LF production in the onion, we solved crystal structures of LFS in an apo-form and in complex with a substrate analogue, crotyl alcohol. The enzyme closely resembles the helix-grip fold characteristic for plant representatives of the START (star-related lipid transfer) domain-containing protein superfamily. By comparing the structures of LFS to that of the abscisic acid receptor, PYL10, a representative of the START protein superfamily, we elucidated structural adaptations underlying the catalytic activity of LFS. We also delineated the architecture of the active site, and based on the orientation of the ligand, we propose a mechanism of catalysis that involves sequential proton transfer accompanied by formation of a carbanion intermediate. These findings reconcile chemical and biochemical information regarding thioaldehyde S-oxide formation and close a long-lasting gap in understanding of the mechanism responsible for LF production in the onion.

  4. Effect of Phosphate Ion on the Structure of Lumazine Synthase, an Antigen Presentation System From Bacillus anthracis.

    Science.gov (United States)

    Wei, Yangjie; Wahome, Newton; Kumar, Prashant; Whitaker, Neal; Picking, Wendy L; Middaugh, C Russell

    2018-03-01

    Lumazine synthase (LS) is an oligomeric enzyme involved in the biosynthesis of riboflavin in microorganisms, fungi, and plants. LS has become of significant interest to biomedical science because of its critical biological role and attractive structural properties for antigen presentation in vaccines. LS derived from Bacillus anthracis (BaLS) consists of 60 identical subunits forming an icosahedron. Its crystal structure has been solved, but its dynamic conformational properties have not yet been studied. We investigated the conformation of BaLS in response to different stress conditions (e.g., chemical denaturants, pH, and temperature) using a variety of biophysical techniques. The physical basis for these thermal transitions was studied, indicating that a molten globular state was present during chemical unfolding by guanidine HCl. In addition, BaLS showed 2 distinct thermal transitions in phosphate-containing buffers. The first transition was due to the dissociation of phosphate ions from BaLS and the second one came from the dissociation and conformational alteration of its icosahedral structure. A small conformational alteration was induced by the binding/dissociation of phosphate ions to BaLS. This work provides a closer view of the conformational behavior of BaLS and provides important information for the formulation of vaccines which use this protein. Copyright © 2018 American Pharmacists Association®. Published by Elsevier Inc. All rights reserved.

  5. Multi-scale structural community organisation of the human genome.

    Science.gov (United States)

    Boulos, Rasha E; Tremblay, Nicolas; Arneodo, Alain; Borgnat, Pierre; Audit, Benjamin

    2017-04-11

    Structural interaction frequency matrices between all genome loci are now experimentally achievable thanks to high-throughput chromosome conformation capture technologies. This raises a new methodological challenge for computational biology, which consists in objectively extracting from these data the structural motifs characteristic of genome organisation. We deployed the fast multi-scale community mining algorithm based on spectral graph wavelets to characterise the networks of intra-chromosomal interactions in human cell lines. We observed that there exist structural domains of all sizes up to chromosome length and demonstrated that the set of structural communities forms a hierarchy of chromosome segments. Hence, at all scales, chromosome folding predominantly involves interactions between neighbouring sites rather than the formation of links between distant loci. Multi-scale structural decomposition of human chromosomes provides an original framework to question structural organisation and its relationship to functional regulation across the scales. By construction the proposed methodology is independent of the precise assembly of the reference genome and is thus directly applicable to genomes whose assembly is not fully determined.

  6. Delineating the structural, functional and evolutionary relationships of sucrose phosphate synthase gene family II in wheat and related grasses

    Directory of Open Access Journals (Sweden)

    Khalil Zaynali

    2010-06-01

    Full Text Available Abstract Background Sucrose phosphate synthase (SPS) is an important component of the plant sucrose biosynthesis pathway. In the monocotyledonous Poaceae, five SPS genes have been identified. Here we present a detailed analysis of the SPSII family in wheat. A set of homoeologue-specific primers was developed in order to permit both the detection of sequence variation, and the dissection of the individual contribution of each homoeologue to the global expression of SPSII. Results The expression in bread wheat over the course of development of various sucrose biosynthesis genes monitored on an Affymetrix array showed that the SPS genes were regulated over time and space. SPSII homoeologue-specific assays were used to show that the three homoeologues contributed differentially to the global expression of SPSII. Genetic mapping placed the set of homoeoloci on the short arms of the homoeologous group 3 chromosomes. A resequencing of the A and B genome copies allowed the detection of four haplotypes at each locus. The 3B copy includes an unspliced intron. A comparison of the sequences of the wheat SPSII orthologues present in the diploid progenitors einkorn, goatgrass and Triticum speltoides, as well as in the more distantly related species barley, rice, sorghum and purple false brome demonstrated that intronic sequence was less well conserved than exonic. Comparative sequence and phylogenetic analysis of the SPSII gene showed that purple false brome was more similar to Triticeae than to rice. Wheat - rice synteny was found to be perturbed at the SPS region. Conclusion The homoeologue-specific assays will be suitable to derive associations between SPS functionality and key phenotypic traits. The amplicon sequences derived from the homoeologue-specific primers are informative regarding the evolution of SPSII in a polyploid context.

  7. Evolutionary genomics and population structure of Entamoeba histolytica

    Directory of Open Access Journals (Sweden)

    Koushik Das

    2014-11-01

    Full Text Available Amoebiasis caused by the gastrointestinal parasite Entamoeba histolytica has diverse disease outcomes. Studying the genome and evolution of this parasite will help us understand the basis of its virulence and explain why, when and how it causes disease. In this review, we summarize current knowledge of the evolutionary genomics of E. histolytica and discuss its association with parasite phenotypes and differential pathogenic behavior. We also discuss how genetic diversity reveals parasite population structure, and highlight open questions concerning its evolution and population structure. The large amount of genomic data now available will improve our knowledge of this pathogenic species of Entamoeba.

  8. Structural and kinetic analysis of the unnatural fusion protein 4-coumaroyl-CoA ligase::stilbene synthase

    Energy Technology Data Exchange (ETDEWEB)

    Wang, Yechun; Yi, Hankuil; Wang, Melissa; Yu, Oliver; Jez, Joseph M. (WU); (Danforth)

    2012-10-24

    To increase the biochemical efficiency of biosynthetic systems, metabolic engineers have explored different approaches for organizing enzymes, including the generation of unnatural fusion proteins. Previous work aimed at improving the biosynthesis of resveratrol, a stilbene associated with a range of health-promoting activities, in yeast used an unnatural engineered fusion protein of Arabidopsis thaliana (thale cress) 4-coumaroyl-CoA ligase (At4CL1) and Vitis vinifera (grape) stilbene synthase (VvSTS) to increase resveratrol levels 15-fold relative to yeast expressing the individual enzymes. Here we present the crystallographic and biochemical analysis of the 4CL::STS fusion protein. Determination of the X-ray crystal structure of 4CL::STS provides the first molecular view of an artificial didomain adenylation/ketosynthase fusion protein. Comparison of the steady-state kinetic properties of At4CL1, VvSTS, and 4CL::STS demonstrates that the fusion protein improves catalytic efficiency of either reaction less than 3-fold. Structural and kinetic analysis suggests that colocalization of the two enzyme active sites within 70 Å of each other provides the basis for enhanced in vivo synthesis of resveratrol.

  9. Structural and functional analysis of two di-domain aromatase/cyclases from type II polyketide synthases.

    Science.gov (United States)

    Caldara-Festin, Grace; Jackson, David R; Barajas, Jesus F; Valentic, Timothy R; Patel, Avinash B; Aguilar, Stephanie; Nguyen, MyChi; Vo, Michael; Khanna, Avinash; Sasaki, Eita; Liu, Hung-Wen; Tsai, Shiou-Chuan

    2015-12-15

    Aromatic polyketides make up a large class of natural products with diverse bioactivity. During biosynthesis, linear poly-β-ketone intermediates are regiospecifically cyclized, yielding molecules with defined cyclization patterns that are crucial for polyketide bioactivity. The aromatase/cyclases (ARO/CYCs) are responsible for regiospecific cyclization of bacterial polyketides. The two most common cyclization patterns are C7-C12 and C9-C14 cyclizations. We have previously characterized three monodomain ARO/CYCs: ZhuI, TcmN, and WhiE. The last remaining uncharacterized class of ARO/CYCs is the di-domain ARO/CYCs, which catalyze C7-C12 cyclization and/or aromatization. Di-domain ARO/CYCs can further be separated into two subclasses: "nonreducing" ARO/CYCs, which act on nonreduced poly-β-ketones, and "reducing" ARO/CYCs, which act on cyclized C9 reduced poly-β-ketones. For years, the functional role of each domain in cyclization and aromatization for di-domain ARO/CYCs has remained a mystery. Here we present what is to our knowledge the first structural and functional analysis, along with an in-depth comparison, of the nonreducing (StfQ) and reducing (BexL) di-domain ARO/CYCs. This work completes the structural and functional characterization of mono- and di-domain ARO/CYCs in bacterial type II polyketide synthases and lays the groundwork for engineered biosynthesis of new bioactive polyketides.

  10. Snapshots of catalysis: Structure of covalently bound substrate trapped in Mycobacterium tuberculosis thiazole synthase (ThiG).

    Science.gov (United States)

    Zhang, Jia; Zhang, Bing; Zhao, Yao; Yang, Xiuna; Huang, Min; Cui, Peng; Zhang, Wenhong; Li, Jun; Zhang, Ying

    2018-02-26

    Increasing drug resistance in Mycobacterium tuberculosis (Mtb) has necessitated the design of new anti-mycobacterial drugs with novel targets. Thiazole synthase (ThiG) is an essential enzyme and a potential drug target in Mtb that catalyzes the formation of the thiazole moiety of thiamin-pyrophosphate from 1-deoxy-d-xylulose-5-phosphate (DXP), dehydroglycine and ThiS-thiocarboxylate. To uncover the catalytic mechanism and design potent and selective anti-mycobacterial compounds targeting ThiG, we determined the crystal structure of MtbThiG at 1.5 Å resolution, for the first time, snapshotting a covalently bound substrate trapped in the catalytic pocket. The structure showed a (β/α)8 barrel overall fold as well as the dimer form of MtbThiG existing in solution. In the central pocket, Lys98 is the key residue forming a protonated carbinolamine intermediate, a functional Schiff base precursor, with DXP. The carbinolamine is further stabilized by active site residues mainly through hydrogen bonds. This work revealed that a protonated carbinolamine is initially formed and then it is dehydrated to the imine form of Schiff base during the early catalysis steps. Our research will provide useful information for understanding the ThiG function and lay the basis for future drug design by targeting this essential protein. Copyright © 2018 Elsevier Inc. All rights reserved.

  11. Analysis of the Sequences, Structures, and Functions of Product-Releasing Enzyme Domains in Fungal Polyketide Synthases

    Directory of Open Access Journals (Sweden)

    Lu Liu

    2017-09-01

    Full Text Available Product-releasing enzyme (PRE) domains in fungal non-reducing polyketide synthases (NR-PKSs) play a crucial role in catalysis and editing during polyketide biosynthesis, especially in accelerating the final biosynthetic reactions that accompany product offloading. However, to date, systematic knowledge about PRE domains is lacking. In the present study, the relationships between sequences, structures, and functions of PRE domains were analyzed with 574 NR-PKSs of eight groups (I–VIII). It was found that the PRE domains in NR-PKSs could be mainly classified into three types: thioesterase (TE), reductase (R), and metallo-β-lactamase-type TE (MβL-TE). The widely distributed TE or TE-like domains were involved in NR-PKSs of groups I–IV, VI, and VIII. The R domains appeared in NR-PKSs of groups IV and VII, while the physically discrete MβL-TE domains were employed by most NR-PKSs of group V. The changes of catalytic sites and structural characteristics resulted in PRE functional differentiations. The phylogeny revealed that the evolution of TE domains was accompanied by complex functional divergence. The diverse sequence lengths of TE lid-loops affected substrate specificity with different chain lengths. The volume diversification of TE catalytic pockets contributed to catalytic mechanisms with functional differentiations. The above findings may help to understand the crucial catalysis of fungal aromatic polyketide biosyntheses and govern recombination of NR-PKSs to obtain unnatural target products.

  12. Structural Genomics of Bacterial Virulence Factors

    Science.gov (United States)

    2006-05-01

    membrane-inserted PA pore. The model is based on the pre-pore PA63 crystal structure, channel conductance studies, and the crystal structure of α... Cyanobacteria BXA0032 and BXA0033 (pXO1-22), if fused, would belong to the COG0175 family, members of the 3’- phosphoadenosine 5’-phosphosulfate...and thiol sulfur atom directed toward the zinc. For the LF(E687C)–GM6001–Zn2+ complex (Fig. 2c–e), where LF(E687C) represents the LF E687C mutant, the

  13. Decoding the fine-scale structure of a breast cancer genome and transcriptome

    OpenAIRE

    Volik, Stanislav; Raphael, Benjamin J.; Huang, Guiqing; Stratton, Michael R.; Bignel, Graham; Murnane, John; Brebner, John H.; Bajsarowicz, Krystyna; Paris, Pamela L.; Tao, Quanzhou; Kowbel, David; Lapuk, Anna; Shagin, Dmitri A.; Shagina, Irina A.; Gray, Joe W.

    2006-01-01

    A comprehensive understanding of cancer is predicated upon knowledge of the structure of malignant genomes underlying its many variant forms and the molecular mechanisms giving rise to them. It is well established that solid tumor genomes accumulate a large number of genome rearrangements during tumorigenesis. End Sequence Profiling (ESP) maps and clones genome breakpoints associated with all types of genome rearrangements elucidating the structural organization of tumor genomes. Here we exte...

  14. A survey of plant and algal genomes and transcriptomes reveals new insights into the evolution and function of the cellulose synthase superfamily

    Science.gov (United States)

    2014-01-01

    Background Enzymes of the cellulose synthase (CesA) family and CesA-like (Csl) families are responsible for the synthesis of celluloses and hemicelluloses, and thus are of great interest to bioenergy research. We studied the occurrences and phylogenies of CesA/Csl families in diverse plants and algae by comprehensive data mining of 82 genomes and transcriptomes. Results We found that 1) charophytic green algae (CGA) have orthologous genes in CesA, CslC and CslD families; 2) liverwort genes are found in the CesA, CslA, CslC and CslD families; 3) The fern Pteridium aquilinum not only has orthologs in these conserved families but also in the CslB, CslH and CslE families; 4) basal angiosperms, e.g. Aristolochia fimbriata, have orthologs in these families too; 5) gymnosperms have genes forming clusters ancestral to CslB/H and to CslE/J/G respectively; 6) CslG is found in switchgrass and basal angiosperms; 7) CslJ is widely present in dicots and monocots; 8) CesA subfamilies have already diversified in ferns. Conclusions We speculate that: (i) ferns and horsetails might both have CslH enzymes, responsible for the synthesis of mixed-linkage glucans and (ii) CslD and similar genes might be responsible for the synthesis of mannans in CGA. Our findings led to a more detailed model of cell wall evolution and suggested that gene loss played an important role in the evolution of Csl families. We also demonstrated the usefulness of transcriptome data in the study of plant cell wall evolution and diversity. PMID:24708035

  15. Structured RNAs and synteny regions in the pig genome

    DEFF Research Database (Denmark)

    Anthon, Christian; Tafer, Hakim; Havgaard, Jakob Hull

    2014-01-01

    for Laurasiatheria (pig, cow, dolphin, horse, cat, dog, hedgehog). CONCLUSIONS: We have obtained one of the most comprehensive annotations for structured ncRNAs of a mammalian genome, which is likely to play central roles in both health modelling and production. The core annotation is available in Ensembl 70...

  16. Structure and sequence motifs in the HIV-1 RNA genome

    NARCIS (Netherlands)

    van Bel, N.

    2015-01-01

    The untranslated leader of the HIV-1 RNA genome contains some 350 nucleotides and is highly conserved among virus isolates. Several characteristic hairpin structures that regulate important virus replication steps, such as dimerization and packaging in virion particles, are clustered in this leader.

  17. cDNA structure, genomic organization and expression patterns of ...

    African Journals Online (AJOL)

    Visfatin was a newly identified adipocytokine, which was involved in various physiologic and pathologic processes of organisms. The cDNA structure, genomic organization and expression patterns of silver Prussian carp visfatin were described in this report. The silver Prussian carp visfatin cDNA cloned from the liver was ...

  18. cDNA structure, genomic organization and expression patterns of ...

    African Journals Online (AJOL)

    2011-11-23

    Nov 23, 2011 ... Visfatin was a newly identified adipocytokine, which was involved in various physiologic and pathologic processes of organisms. The cDNA structure, genomic organization and expression patterns of silver Prussian carp visfatin were described in this report. The silver Prussian carp visfatin. cDNA cloned ...

  19. Structured RNAs and synteny regions in the pig genome

    DEFF Research Database (Denmark)

    Anthon, Christian; Tafer, Hakim; Havgaard, Jakob H

    2014-01-01

    BACKGROUND: Annotating mammalian genomes for noncoding RNAs (ncRNAs) is nontrivial since far from all ncRNAs are known and the computational models are resource demanding. Currently, the human genome holds the best mammalian ncRNA annotation, a result of numerous efforts by several groups. However..., a more direct strategy is desired for the increasing number of sequenced mammalian genomes of which some, such as the pig, are relevant as disease models and production animals. RESULTS: We present a comprehensive annotation of structured RNAs in the pig genome. Combining sequence and structure... lncRNA loci, 11 conflicts of annotation, and 3,183 ncRNA genes. The ncRNA genes comprise 359 miRNAs, 8 ribozymes, 185 rRNAs, 638 snoRNAs, 1,030 snRNAs, 810 tRNAs and 153 ncRNA genes not belonging to the aforementioned classes. When running the pipeline on a local shuffled version of the genome...

  20. Structural variation in two human genomes mapped at single-nucleotide resolution by whole genome de novo assembly

    DEFF Research Database (Denmark)

    Li, Yingrui; Zheng, Hancheng; Luo, Ruibang

    2011-01-01

    Here we use whole-genome de novo assembly of second-generation sequencing reads to map structural variation (SV) in an Asian genome and an African genome. Our approach identifies small- and intermediate-size homozygous variants (1-50 kb) including insertions, deletions, inversions and their precise...

  1. Human mitochondrial HMG CoA synthase: Liver cDNA and partial genomic cloning, chromosome mapping to 1p12-p13, and possible role in vertebrate evolution

    Energy Technology Data Exchange (ETDEWEB)

    Boukaftane, Y.; Robert, M.F.; Mitchell, G.A. [Hopital Sainte-Justine, Montreal (Canada)] [and others

    1994-10-01

    Mitochondrial 3-hydroxy-3-methylglutaryl CoA synthase (mHS) is the first enzyme of ketogenesis, whereas the cytoplasmic HS isozyme (cHS) mediates an early step in cholesterol synthesis. We here report the sequence of human and mouse liver mHS cDNAs, the sequence of an HS-like cDNA from Caenorhabditis elegans, the structure of a partial human mHS genomic clone, and the mapping of the human mHS gene to chromosome 1p12-p13. The nucleotide sequence of the human mHS cDNA encodes a mature mHS peptide of 471 residues, with a mean amino acid identity of 66.5% with cHS from mammals and chicken. Comparative analysis of all known mHS and cHS protein and DNA sequences shows a high degree of conservation near the N-terminus that decreases progressively toward the C-terminus and suggests that the two isozymes arose from a common ancestor gene 400-900 million years ago. Comparison of the gene structure of mHS and cHS is also consistent with a recent duplication event. We hypothesize that the physiologic result of the HS gene duplication was the appearance of HS within the mitochondria around the time of emergence of early vertebrates, which linked preexisting pathways of beta oxidation and leucine catabolism and created the HMG CoA pathway of ketogenesis, thus providing a lipid-derived energy source for the vertebrate brain. 56 refs., 4 figs., 2 tabs.

  2. 6-Pyruvoyltetrahydropterin synthase orthologs of either a single or dual domain structure are responsible for tetrahydrobiopterin synthesis in bacteria.

    Science.gov (United States)

    Kong, Jin Sun; Kang, Ji-Youn; Kim, Hye Lim; Kwon, O-Seob; Lee, Kon Ho; Park, Young Shik

    2006-09-04

    6-Pyruvoyltetrahydropterin synthase (PTPS) catalyzes the second step of tetrahydrobiopterin (BH4) synthesis. We previously identified PTPS orthologs (bPTPS-Is) in bacteria which do not produce BH4. In this study we disrupted the gene encoding bPTPS-I in Synechococcus sp. PCC 7942, which produces BH4-glucoside. The mutant was normal in BH4-glucoside production, demonstrating that bPTPS-I does not participate in BH4 synthesis in vivo and leading us to a new PTPS ortholog (bPTPS-II) consisting of a bimodular polypeptide. The recombinant Synechococcus bPTPS-II was assayed in vitro and showed PTPS activity higher than that of the human enzyme. Further computational analysis revealed the presence of mono- and bimodular bPTPS-II orthologs mostly in green sulfur bacteria and cyanobacteria, respectively, which are well known for BH4-glycoside production. In summary, we found new bacterial PTPS orthologs that have either a single or dual domain structure and are responsible for BH4 synthesis in vivo, thereby disclosing all the bacterial PTPS homologs.

  3. Structures of Mycobacterium Tuberculosis Folylpolyglutamate Synthase Complexed With ADP And AMPPCP

    Energy Technology Data Exchange (ETDEWEB)

    Young, P.G.; Smith, C.A.; Metcalf, P.; Baker, E.N.

    2009-05-28

    Folate derivatives are essential vitamins for cell growth and replication, primarily because of their central role in reactions of one-carbon metabolism. Folates require polyglutamation to be efficiently retained within the cell and folate-dependent enzymes have a higher affinity for the polyglutamylated forms of this cofactor. Polyglutamylation is dependent on the enzyme folylpolyglutamate synthetase (FPGS), which catalyzes the sequential addition of several glutamates to folate. FPGS is essential for the growth and survival of important bacterial species, including Mycobacterium tuberculosis, and is a potential drug target. Here, the crystal structures of M. tuberculosis FPGS in complex with ADP and AMPPCP are reported at 2.0 and 2.3 angstroms resolution, respectively. The structures reveal a deeply buried nucleotide-binding site, as in the Escherichia coli and Lactobacillus casei FPGS structures, and a long extended groove for the binding of folate substrates. Differences from the E. coli and L. casei FPGS structures are seen in the binding of a key divalent cation, the carbamylation state of an essential lysine side chain and the adoption of an 'open' position by the active-site beta5-alpha6 loop. These changes point to coordinated events that are associated with dihydropteroate/folate binding and the catalysis of the new amide bond with an incoming glutamate residue.

  4. Yeast beta-alanine synthase shares a structural scaffold and origin with dizinc-dependent exopeptidases

    DEFF Research Database (Denmark)

    Lundgren, S.; Gojkovic, Zoran; Piskur, Jure

    2003-01-01

    of the intersubunit contacts. Both domains exhibit a mixed alpha/beta-topology. Surprisingly, the observed high structural homology to a family of dizinc-dependent exopeptidases suggests that these two enzyme groups have a common origin. Alterations in the ligand composition of the metal-binding site can be explained...

  5. Two crystal structures of dihydrofolate reductase-thymidylate synthase from Cryptosporidium hominis reveal protein–ligand interactions including a structural basis for observed antifolate resistance

    Energy Technology Data Exchange (ETDEWEB)

    Anderson, Amy C., E-mail: aca@dartmouth.edu [Dartmouth College, Department of Chemistry, Burke Laboratories, Hanover, NH 03755 (United States)

    2005-03-01

    An analysis of the protein–ligand interactions in two crystal structures of DHFR-TS from C. hominis reveals a possible structural basis for observed antifolate resistance in C. hominis DHFR. A comparison with the structure of human DHFR reveals residue substitutions that may be exploited for the design of species-selective inhibitors. Cryptosporidium hominis is a protozoan parasite that causes acute gastrointestinal illness. There are no effective therapies for cryptosporidiosis, highlighting the need for new drug-lead discovery. An analysis of the protein–ligand interactions in two crystal structures of dihydrofolate reductase-thymidylate synthase (DHFR-TS) from C. hominis, determined at 2.8 and 2.87 Å resolution, reveals that the interactions of residues Ile29, Thr58 and Cys113 in the active site of C. hominis DHFR provide a possible structural basis for the observed antifolate resistance. A comparison with the structure of human DHFR reveals active-site differences that may be exploited for the design of species-selective inhibitors.

  6. Putative Chitin Synthases from Branchiostoma floridae Show Extracellular Matrix-related Domains and Mosaic Structures

    OpenAIRE

    Guerriero, Gea

    2012-01-01

    The transition from unicellular to multicellular life forms requires the development of a specialized structural component, the extracellular matrix (ECM). In Metazoans, there are two main supportive systems, which are based on chitin and collagen/hyaluronan, respectively. Chitin is the major constituent of fungal cell walls and arthropod exoskeleton. However, presence of chitin/chitooligosaccharides has been reported in lower chordates and during specific stages of vertebrate development. In...

  7. Structure-Based Alignment and Consensus Secondary Structures for Three HIV-Related RNA Genomes.

    Science.gov (United States)

    Lavender, Christopher A; Gorelick, Robert J; Weeks, Kevin M

    2015-05-01

    HIV and related primate lentiviruses possess single-stranded RNA genomes. Multiple regions of these genomes participate in critical steps in the viral replication cycle, and the functions of many RNA elements are dependent on the formation of defined structures. The structures of these elements are still not fully understood, and additional functional elements likely exist that have not been identified. In this work, we compared three full-length HIV-related viral genomes: HIV-1NL4-3, SIVcpz, and SIVmac (the latter two strains are progenitors for all HIV-1 and HIV-2 strains, respectively). Model-free RNA structure comparisons were performed using whole-genome structure information experimentally derived from nucleotide-resolution SHAPE reactivities. Consensus secondary structures were constructed for strongly correlated regions by taking into account both SHAPE probing structural data and nucleotide covariation information from structure-based alignments. In these consensus models, all known functional RNA elements were recapitulated with high accuracy. In addition, we identified multiple previously unannotated structural elements in the HIV-1 genome likely to function in translation, splicing and other replication cycle processes; these are compelling targets for future functional analyses. The structure-informed alignment strategy developed here will be broadly useful for efficient RNA motif discovery.

  8. Crystal Structure of Methylornithine Synthase (PylB): Insights into the Pyrrolysine Biosynthesis

    KAUST Repository

    Quitterer, Felix

    2011-11-16

    Made by the barrel load: The biosynthetic pathway of the recently discovered 22nd amino acid, pyrrolysine, starts with an isomerization of lysine to methylornithine, catalyzed by PylB. The X-ray crystal structure of PylB has been determined and shows a TIM barrel fold. The sealed central cavity contains a [4Fe-4S] cluster, S-adenosylmethionine (SAM), and methylornithine, whose 2R,3R configuration could be confirmed. The data suggest a fragmentation-recombination mechanism via a glycyl radical intermediate. Copyright © 2012 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  9. Genetic linkage map of a wild genome: genomic structure, recombination and sexual dimorphism in bighorn sheep

    Science.gov (United States)

    2010-01-01

    Background The construction of genetic linkage maps in free-living populations is a promising tool for the study of evolution. However, such maps are rare because it is difficult to develop both wild pedigrees and corresponding sets of molecular markers that are sufficiently large. We took advantage of two long-term field studies of pedigreed individuals and genomic resources originally developed for domestic sheep (Ovis aries) to construct a linkage map for bighorn sheep, Ovis canadensis. We then assessed variability in genomic structure and recombination rates between bighorn sheep populations and sheep species. Results Bighorn sheep population-specific maps differed slightly in contiguity but were otherwise very similar in terms of genomic structure and recombination rates. The joint analysis of the two pedigrees resulted in a highly contiguous map composed of 247 microsatellite markers distributed along all 26 autosomes and the X chromosome. The map is estimated to cover about 84% of the bighorn sheep genome and contains 240 unique positions spanning a sex-averaged distance of 3051 cM with an average inter-marker distance of 14.3 cM. Marker synteny, order, sex-averaged interval lengths and sex-averaged total map lengths were all very similar between sheep species. However, in contrast to domestic sheep, but consistent with the usual pattern for a placental mammal, recombination rates in bighorn sheep were significantly greater in females than in males (~12% difference), resulting in an autosomal female map of 3166 cM and an autosomal male map of 2831 cM. Despite differing genome-wide patterns of heterochiasmy between the sheep species, sexual dimorphism in recombination rates was correlated between orthologous intervals. Conclusions We have developed a first-generation bighorn sheep linkage map that will facilitate future studies of the genetic architecture of trait variation in this species. 
    While domestication has been hypothesized to be responsible for the
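
    The map statistics quoted in this abstract can be cross-checked with a little arithmetic. The sketch below is only an illustration: it assumes that the 240 unique positions are spread over 27 linkage groups (26 autosomes plus the X chromosome), so that the number of inter-marker intervals is 240 minus 27. That interval count is an assumption made here, not something the abstract states.

```python
# Figures quoted in the abstract above.
unique_positions = 240
linkage_groups = 26 + 1           # 26 autosomes plus the X chromosome
sex_averaged_cm = 3051            # sex-averaged map length in cM
female_autosomal_cm = 3166
male_autosomal_cm = 2831

# Assumption: each linkage group with k positions contributes k - 1 intervals,
# so the total number of intervals is positions minus linkage groups.
intervals = unique_positions - linkage_groups

# Mean distance between adjacent unique positions.
mean_spacing_cm = sex_averaged_cm / intervals

# Relative excess of the female autosomal map over the male one.
female_excess = (female_autosomal_cm - male_autosomal_cm) / male_autosomal_cm

print(f"intervals: {intervals}")
print(f"mean inter-marker distance: {mean_spacing_cm:.1f} cM")
print(f"female map excess over male map: {female_excess:.1%}")
```

    Under these assumptions the mean spacing comes out at about 14.3 cM and the female excess at roughly 12%, in line with the values reported in the abstract.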

  10. Chromatin structure and evolution in the human genome

    Directory of Open Access Journals (Sweden)

    Dunlop Malcolm G

    2007-05-01

    Full Text Available Abstract Background Evolutionary rates are not constant across the human genome but genes in close proximity have been shown to experience similar levels of divergence and selection. The higher-order organisation of chromosomes has often been invoked to explain such phenomena but previously there has been insufficient data on chromosome structure to investigate this rigorously. Using the results of a recent genome-wide analysis of open and closed human chromatin structures we have investigated the global association between divergence, selection and chromatin structure for the first time. Results In this study we have shown that, paradoxically, synonymous site divergence (dS) at non-CpG sites is highest in regions of open chromatin, primarily as a result of an increased number of transitions, while the rates of other traditional measures of mutation (intergenic, intronic and ancient repeat divergence as well as SNP density) are highest in closed regions of the genome. Analysis of human-chimpanzee divergence across intron-exon boundaries indicates that although genes in relatively open chromatin generally display little selection at their synonymous sites, those in closed regions show markedly lower divergence at their fourfold degenerate sites than in neighbouring introns and intergenic regions. Exclusion of known Exonic Splice Enhancer hexamers has little effect on the divergence observed at fourfold degenerate sites across chromatin categories; however, we show that closed chromatin is enriched with certain classes of ncRNA genes whose RNA secondary structure may be particularly important. Conclusion We conclude that, overall, non-CpG mutation rates are lowest in open regions of the genome and that regions of the genome with a closed chromatin structure have the highest background mutation rate. This might reflect lower rates of DNA damage or enhanced DNA repair processes in regions of open chromatin. Our results also indicate that dS is a poor

  11. Highly divergent mitochondrial ATP synthase complexes in Tetrahymena thermophila.

    Directory of Open Access Journals (Sweden)

    Praveen Balabaskaran Nina

    2010-07-01

    Full Text Available The F-type ATP synthase complex is a rotary nano-motor driven by proton motive force to synthesize ATP. Its F(1) sector catalyzes ATP synthesis, whereas the F(o) sector conducts the protons and provides a stator for the rotary action of the complex. Components of both F(1) and F(o) sectors are highly conserved across prokaryotes and eukaryotes. Therefore, it was a surprise that genes encoding the a and b subunits as well as other components of the F(o) sector were undetectable in the sequenced genomes of a variety of apicomplexan parasites. While the parasitic existence of these organisms could explain the apparent incomplete nature of ATP synthase in Apicomplexa, genes for these essential components were absent even in Tetrahymena thermophila, a free-living ciliate belonging to a sister clade of Apicomplexa, which demonstrates robust oxidative phosphorylation. This observation raises the possibility that the entire clade of Alveolata may have invented novel means to operate ATP synthase complexes. To assess this remarkable possibility, we have carried out an investigation of the ATP synthase from T. thermophila. Blue native polyacrylamide gel electrophoresis (BN-PAGE) revealed the ATP synthase to be present as a large complex. Structural study based on single particle electron microscopy analysis suggested the complex to be a dimer with several unique structures including an unusually large domain on the intermembrane side of the ATP synthase and novel domains flanking the c subunit rings. The two monomers were in a parallel configuration rather than the angled configuration previously observed in other organisms. Proteomic analyses of well-resolved ATP synthase complexes from 2-D BN/BN-PAGE identified orthologs of seven canonical ATP synthase subunits, and at least 13 novel proteins that constitute subunits apparently limited to the ciliate lineage. A mitochondrially encoded protein, Ymf66, with predicted eight transmembrane domains could be a

  12. Crystal Structures of the Iron–Sulfur Cluster-Dependent Quinolinate Synthase in Complex with Dihydroxyacetone Phosphate, Iminoaspartate Analogues, and Quinolinate

    Energy Technology Data Exchange (ETDEWEB)

    Fenwick, Michael K. [Cornell Univ., Ithaca, NY (United States); Ealick, Steven E. [Cornell Univ., Ithaca, NY (United States)

    2016-07-12

    The quinolinate synthase of prokaryotes and photosynthetic eukaryotes, NadA, contains a [4Fe-4S] cluster with unknown function. We report crystal structures of Pyrococcus horikoshii NadA in complex with dihydroxyacetone phosphate (DHAP), iminoaspartate analogues, and quinolinate. DHAP adopts a nearly planar conformation and chelates the [4Fe-4S] cluster via its keto and hydroxyl groups. The active site architecture suggests that the cluster acts as a Lewis acid in enediolate formation, like zinc in class II aldolases. The DHAP and putative iminoaspartate structures suggest a model for a condensed intermediate. The ensemble of structures suggests a two-state system, which may be exploited in early steps.

  13. Elucidation of Operon Structures across Closely Related Bacterial Genomes

    Science.gov (United States)

    Li, Guojun

    2014-01-01

    About half of the protein-coding genes in prokaryotic genomes are organized into operons to facilitate co-regulation during transcription. With the evolution of genomes, operon structures are undergoing changes which could coordinate diverse gene expression patterns in response to various stimuli during the life cycle of a bacterial cell. Here we developed a graph-based model to elucidate the diversity of operon structures across a set of closely related bacterial genomes. In the constructed graph, each node represents one orthologous gene group (OGG) and a pair of nodes will be connected if any two genes, from the corresponding two OGGs respectively, are located in the same operon as immediate neighbors in any of the considered genomes. Through identifying the connected components in the above graph, we found that genes in a connected component are likely to be functionally related and these identified components tend to form treelike topology, such as paths and stars, corresponding to different biological mechanisms in transcriptional regulation as follows. Specifically, (i) a path-structure component integrates genes encoding a protein complex, such as ribosome; and (ii) a star-structure component not only groups related genes together, but also reflects the key functional roles of the central node of this component, such as the ABC transporter with a transporter permease and substrate-binding proteins surrounding it. Most interestingly, the genes from organisms with highly diverse living environments, i.e., biomass degraders and animal pathogens of clostridia in our study, can be clearly classified into different topological groups on some connected components. PMID:24959722
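
    The graph construction described in this abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the gene identifiers, the OGG labels and the simple degree-based path/star test are all hypothetical and chosen only to mirror the description (nodes are OGGs, an edge joins two OGGs whose genes are immediate neighbours in the same operon in any genome, and connected components are then inspected for their shape).

```python
from collections import defaultdict


def build_ogg_graph(operons, gene_to_ogg):
    """Build an undirected graph whose nodes are orthologous gene groups (OGGs).

    operons: list of operons, each an ordered list of gene IDs (any genome).
    gene_to_ogg: mapping from gene ID to its OGG label.
    """
    graph = defaultdict(set)
    for operon in operons:
        for gene in operon:
            graph[gene_to_ogg[gene]]  # make sure every OGG is a node
        for left, right in zip(operon, operon[1:]):
            a, b = gene_to_ogg[left], gene_to_ogg[right]
            if a != b:  # ignore self-loops from adjacent paralogs
                graph[a].add(b)
                graph[b].add(a)
    return dict(graph)


def connected_components(graph):
    """Return the connected components of the graph as a list of sets."""
    seen, components = set(), []
    for start in graph:
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            component.add(node)
            for neighbour in graph[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        components.append(component)
    return components


def classify(component, graph):
    """Label a component by shape using node degrees (a simplified heuristic)."""
    n = len(component)
    degrees = sorted(len(graph[node]) for node in component)
    if n == 1:
        return "singleton"
    if sum(degrees) // 2 != n - 1:
        return "other"  # more edges than a tree, so it contains a cycle
    if degrees[-1] <= 2:
        return "path"
    if degrees[-1] == n - 1:
        return "star"
    return "tree"


# Hypothetical toy data: two genomes sharing a ribosomal-like operon (path)
# and three operons pairing one permease OGG with different binding proteins (star).
gene_to_ogg = {
    "a1": "rpsA", "a2": "rpsB", "a3": "rpsC",
    "b2": "rpsB", "b3": "rpsC", "b4": "rpsD",
    "p1": "perm", "p2": "perm", "p3": "perm",
    "s1": "sbp1", "s2": "sbp2", "s3": "sbp3",
}
operons = [
    ["a1", "a2", "a3"], ["b2", "b3", "b4"],
    ["p1", "s1"], ["p2", "s2"], ["p3", "s3"],
]

graph = build_ogg_graph(operons, gene_to_ogg)
components = connected_components(graph)
for component in components:
    print(sorted(component), classify(component, graph))
```

    In the toy data the ribosomal-like OGGs chain into a path, while the permease OGG sits at the centre of a star, matching the two topologies the abstract singles out.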

  14. Elucidation of operon structures across closely related bacterial genomes.

    Science.gov (United States)

    Zhou, Chuan; Ma, Qin; Li, Guojun

    2014-01-01

    About half of the protein-coding genes in prokaryotic genomes are organized into operons to facilitate co-regulation during transcription. With the evolution of genomes, operon structures are undergoing changes which could coordinate diverse gene expression patterns in response to various stimuli during the life cycle of a bacterial cell. Here we developed a graph-based model to elucidate the diversity of operon structures across a set of closely related bacterial genomes. In the constructed graph, each node represents one orthologous gene group (OGG) and a pair of nodes will be connected if any two genes, from the corresponding two OGGs respectively, are located in the same operon as immediate neighbors in any of the considered genomes. Through identifying the connected components in the above graph, we found that genes in a connected component are likely to be functionally related and these identified components tend to form treelike topologies, such as paths and stars, corresponding to different biological mechanisms in transcriptional regulation as follows. Specifically, (i) a path-structure component integrates genes encoding a protein complex, such as the ribosome; and (ii) a star-structure component not only groups related genes together, but also reflects the key functional roles of the central node of this component, such as the ABC transporter with a transporter permease and substrate-binding proteins surrounding it. Most interestingly, the genes from organisms with highly diverse living environments, i.e., biomass degraders and animal pathogens of clostridia in our study, can be clearly classified into different topological groups on some connected components.
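
    The graph construction described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the gene names, the gene-to-OGG mapping, and the helper functions build_ogg_graph, connected_components and classify are hypothetical, and the path/star test is a simple degree-based check assumed here for illustration.

```python
from collections import defaultdict

def build_ogg_graph(operons, gene_to_ogg):
    """Link two OGGs when genes from them are immediate neighbours in any operon."""
    graph = defaultdict(set)
    for operon in operons:
        oggs = [gene_to_ogg[gene] for gene in operon]
        for ogg in oggs:
            graph[ogg]  # register the node even if it ends up isolated
        for a, b in zip(oggs, oggs[1:]):
            if a != b:
                graph[a].add(b)
                graph[b].add(a)
    return graph

def connected_components(graph):
    """Return the connected components of an undirected graph as sets of nodes."""
    seen, components = set(), []
    for start in graph:
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            component.add(node)
            for neighbour in graph[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        components.append(component)
    return components

def classify(component, graph):
    """Label a component as singleton, path, star, tree or other (contains a cycle)."""
    n = len(component)
    degrees = sorted(len(graph[node] & component) for node in component)
    if n == 1:
        return "singleton"
    if sum(degrees) // 2 != n - 1:  # a connected graph is a tree iff it has n - 1 edges
        return "other"
    if degrees == [1, 1] + [2] * (n - 2):
        return "path"
    if degrees == [1] * (n - 1) + [n - 1]:
        return "star"
    return "tree"

# Hypothetical operons from three genomes (g1, g2, g3); each operon is an ordered gene list.
operons = [
    ["g1_rplA", "g1_rplB", "g1_rplC", "g1_rplD"],
    ["g1_perm", "g1_sbpA"],
    ["g2_perm", "g2_sbpB"],
    ["g3_perm", "g3_sbpC"],
]
gene_to_ogg = {
    "g1_rplA": "L1", "g1_rplB": "L2", "g1_rplC": "L3", "g1_rplD": "L4",
    "g1_perm": "permease", "g2_perm": "permease", "g3_perm": "permease",
    "g1_sbpA": "SBP_A", "g2_sbpB": "SBP_B", "g3_sbpC": "SBP_C",
}

graph = build_ogg_graph(operons, gene_to_ogg)
for component in connected_components(graph):
    print(sorted(component), classify(component, graph))
```

    In this toy input the ribosomal-protein groups form a path, while the permease group sits at the centre of a star of substrate-binding-protein groups, mirroring the two topologies named in the abstract.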

  15. Novel Structural and Functional Motifs in cellulose synthase (CesA) Genes of Bread Wheat (Triticum aestivum, L.).

    Directory of Open Access Journals (Sweden)

    Simerjeet Kaur

    Full Text Available Cellulose is the primary determinant of mechanical strength in plant tissues. Late-season lodging is inversely related to the amount of cellulose in a unit length of the stem. Wheat is the most widely grown of all the crops globally, yet information on its CesA gene family is limited. We have identified 22 CesA genes from bread wheat, which include homoeologs from each of the three genomes, and named them as TaCesAXA, TaCesAXB or TaCesAXD, where X denotes the gene number and the last suffix stands for the respective genome. Sequence analyses of the CESA proteins from wheat and their orthologs from barley, maize, rice, and several dicot species (Arabidopsis, beet, cotton, poplar, potato, rose gum and soybean) revealed motifs unique to monocots (Poales) or dicots. Novel structural motifs CQIC and SVICEXWFA were identified, which distinguished the CESAs involved in the formation of primary and secondary cell wall (PCW and SCW) in all the species. We also identified several new motifs specific to monocots or dicots. The conserved motifs identified in this study possibly play functional roles specific to PCW or SCW formation. The new insights from this study advance our knowledge about the structure, function and evolution of the CesA family in plants in general and wheat in particular. This information will be useful in improving culm strength to reduce lodging or alter wall composition to improve biofuel production.

  16. Implications of structural genomics target selection strategies: Pfam5000, whole genome, and random approaches

    Energy Technology Data Exchange (ETDEWEB)

    Chandonia, John-Marc; Brenner, Steven E.

    2004-07-14

    The structural genomics project is an international effort to determine the three-dimensional shapes of all important biological macromolecules, with a primary focus on proteins. Target proteins should be selected according to a strategy which is medically and biologically relevant, of good value, and tractable. As an option to consider, we present the Pfam5000 strategy, which involves selecting the 5000 most important families from the Pfam database as sources for targets. We compare the Pfam5000 strategy to several other proposed strategies that would require similar numbers of targets. These include complete solution of several small to moderately sized bacterial proteomes, partial coverage of the human proteome, and random selection of approximately 5000 targets from sequenced genomes. We measure the impact that successful implementation of these strategies would have upon structural interpretation of the proteins in Swiss-Prot, TrEMBL, and 131 complete proteomes (including 10 of eukaryotes) from the Proteome Analysis database at EBI. Solving the structures of proteins from the 5000 largest Pfam families would allow accurate fold assignment for approximately 68 percent of all prokaryotic proteins (covering 59 percent of residues) and 61 percent of eukaryotic proteins (40 percent of residues). More fine-grained coverage which would allow accurate modeling of these proteins would require an order of magnitude more targets. The Pfam5000 strategy may be modified in several ways, for example to focus on larger families, bacterial sequences, or eukaryotic sequences; as long as secondary consideration is given to large families within Pfam, coverage results vary only slightly. In contrast, focusing structural genomics on a single tractable genome would have only a limited impact in structural knowledge of other proteomes: a significant fraction (about 30-40 percent of the proteins, and 40-60 percent of the residues) of each proteome is classified in small

  17. Functional genomic analysis supports conservation of function among cellulose synthase-like a gene family members and suggests diverse roles of mannans in plants

    DEFF Research Database (Denmark)

    Liepman, Aaron H; Nairn, C Joseph; Willats, William G T

    2007-01-01

    in insect cells, and each CslA protein catalyzed mannan and glucomannan synthase reactions in vitro. Microarray mining and quantitative real-time reverse transcription-polymerase chain reaction analysis demonstrated that transcripts of Arabidopsis and loblolly pine (Pinus taeda) CslA genes display tissue...... they are prevalent at cell junctions and in buds. Taken together, these results demonstrate that members of the CslA gene family from diverse plant species encode glucomannan synthases and support the hypothesis that mannans function in metabolic networks devoted to other cellular processes in addition to cell wall...

  18. The mitochondrial genome of soybean reveals complex genome structures and gene evolution at intercellular and phylogenetic levels.

    Directory of Open Access Journals (Sweden)

    Shengxin Chang

    Full Text Available Determining mitochondrial genomes is important for elucidating vital activities of seed plants. Mitochondrial genomes are specific to each plant species because of their variable size, complex structures and patterns of gene losses and gains during evolution. This complexity has made research on the soybean mitochondrial genome difficult compared with its nuclear and chloroplast genomes. The present study helps to solve a 30-year mystery regarding the most complex mitochondrial genome structure, showing that pairwise rearrangements among the many large repeats may produce an enriched molecular pool of 760 circles in seed plants. The soybean mitochondrial genome harbors 58 genes of known function in addition to 52 predicted open reading frames of unknown function. The genome contains sequences of multiple identifiable origins, including 6.8 kb and 7.1 kb DNA fragments that have been transferred from the nuclear and chloroplast genomes, respectively, and some horizontal DNA transfers. The soybean mitochondrial genome has lost 16 genes, including nine protein-coding genes and seven tRNA genes; however, it has acquired five chloroplast-derived genes during evolution. Four tRNA genes, common among the three genomes, are derived from the chloroplast. Sizeable DNA transfers to the nucleus, with pericentromeric regions as hotspots, are observed, including DNA transfers of 125.0 kb and 151.6 kb identified unambiguously from the soybean mitochondrial and chloroplast genomes, respectively. The soybean nuclear genome has acquired five genes from its mitochondrial genome. These results provide biological insights into the mitochondrial genome of seed plants, and are especially helpful for deciphering vital activities in soybean.

  19. CLYBL is a polymorphic human enzyme with malate synthase and β-methylmalate synthase activity

    Science.gov (United States)

    Strittmatter, Laura; Li, Yang; Nakatsuka, Nathan J.; Calvo, Sarah E.; Grabarek, Zenon; Mootha, Vamsi K.

    2014-01-01

    CLYBL is a human mitochondrial enzyme of unknown function that is found in multiple eukaryotic taxa and conserved to bacteria. The protein is expressed in the mitochondria of all mammalian organs, with highest expression in brown fat and kidney. Approximately 5% of all humans harbor a premature stop polymorphism in CLYBL that has been associated with reduced levels of circulating vitamin B12. Using comparative genomics, we now show that CLYBL is strongly co-expressed with and co-evolved specifically with other components of the mitochondrial B12 pathway. We confirm that the premature stop polymorphism in CLYBL leads to a loss of protein expression. To elucidate the molecular function of CLYBL, we used comparative operon analysis, structural modeling and enzyme kinetics. We report that CLYBL encodes a malate/β-methylmalate synthase, converting glyoxylate and acetyl-CoA to malate, or glyoxylate and propionyl-CoA to β-methylmalate. Malate synthases are best known for their established role in the glyoxylate shunt of plants and lower organisms and are traditionally described as not occurring in humans. The broader role of a malate/β-methylmalate synthase in human physiology and its mechanistic link to vitamin B12 metabolism remain unknown. PMID:24334609

  20. Analysis of the Thiocapsa pfennigii polyhydroxyalkanoate synthase: subcloning, molecular characterization and generation of hybrid synthases with the corresponding Chromatium vinosum enzyme.

    Science.gov (United States)

    Liebergesell, M; Rahalkar, S; Steinbüchel, A

    2000-08-01

    The PHA synthase structural gene of Thiocapsa pfennigii was identified and subcloned on a 2.8-kbp BamHI restriction fragment, which was cloned recently from a genomic 15.6-kbp EcoRI restriction fragment. Nucleotide sequence analysis of this fragment revealed three open reading frames (ORFs), representing coding regions. Two ORFs encoded the PhaE (Mr 40,950) and PhaC (Mr 40,190) subunits of the PHA synthase from T. pfennigii and exhibited high homology with the corresponding proteins of the Chromatium vinosum (52.8% and 85.2% amino acid identity) and the Thiocystis violacea (52.5% and 82.4%) PHA synthases, respectively. This confirmed that the T. pfennigii PHA synthase was composed of two different subunits. Also, with respect to the molecular organization of phaE and phaC, this region of the T. pfennigii genome closely resembled the corresponding regions of C. vinosum and of Thiocystis violacea. A recombinant strain of Pseudomonas putida, which overexpressed phaE and phaC from T. pfennigii, was used to isolate the PHA synthase by a two-step procedure including chromatography on Procion Blue H-ERD and hydroxyapatite. The isolated PHA synthase consisted of two proteins exhibiting the molecular weights predicted for PhaE and PhaC. Hybrid PHA synthases composed of PhaE from T. pfennigii and PhaC from C. vinosum and vice versa were constructed and functionally expressed in a PHA-negative mutant of P. putida; and the resulting PHAs were analyzed.

  1. Structure-function mapping of key determinants for hydrocarbon biosynthesis by squalene and squalene synthase-like enzymes from the green alga Botryococcus braunii race B.

    Science.gov (United States)

    Bell, Stephen A; Niehaus, Thomas D; Nybo, S Eric; Chappell, Joseph

    2014-12-09

    Squalene and botryococcene are branched-chain, triterpene compounds that arise from the head-to-head condensation of two molecules of farnesyl diphosphate to yield 1'-1 and 1'-3 linkages, respectively. The enzymes that catalyze their formation have attracted considerable interest from the medical field as potential drug targets and the renewable energy sector for metabolic engineering efforts. Recently, the enzymes responsible for botryococcene and squalene biosynthesis in the green alga Botryococcus braunii race B were characterized. To better understand how the specificity for the 1'-1 and 1'-3 linkages was controlled, we attempted to identify the functional residues and/or domains responsible for this step in the catalytic cascade. Existing crystal structures for the mammalian squalene synthase and Staphylococcus dehydrosqualene synthase enzymes were exploited to develop molecular models for the B. braunii botryococcene and squalene synthase enzymes. Residues within the active sites that could mediate catalytic specificity were identified, and reciprocal mutants were created in an attempt to interconvert the reaction product specificity of the enzymes. We report here the identification of several amino acid positions contributing to the rearrangement of the cyclopropyl intermediate to squalene, but these same positions do not appear to be sufficient to account for the cyclopropyl rearrangement to give botryococcene.

  2. Solution structure of the tandem acyl carrier protein domains from a polyunsaturated fatty acid synthase reveals beads-on-a-string configuration.

    Directory of Open Access Journals (Sweden)

    Uldaeliz Trujillo

    Full Text Available The polyunsaturated fatty acid (PUFA) synthases from deep-sea bacteria invariably contain multiple acyl carrier protein (ACP) domains in tandem. This conserved tandem arrangement has been implicated in both amplification of fatty acid production (additive effect) and in structural stabilization of the multidomain protein (synergistic effect). While the more accepted model is one in which domains act independently, recent reports suggest that ACP domains may form higher oligomers. Elucidating the three-dimensional structure of tandem arrangements may therefore give important insights into the functional relevance of these structures, and hence guide bioengineering strategies. In an effort to elucidate the three-dimensional structure of tandem repeats from deep-sea anaerobic bacteria, we have expressed and purified a fragment consisting of five tandem ACP domains from the PUFA synthase from Photobacterium profundum. Analysis of the tandem ACP fragment by analytical gel filtration chromatography showed a retention time suggestive of a multimeric protein. However, small angle X-ray scattering (SAXS) revealed that the multi-ACP fragment is an elongated monomer which does not form a globular unit. Stokes radii calculated from atomic monomeric SAXS models were comparable to those measured by analytical gel filtration chromatography, showing that in the gel filtration experiment, the molecular weight was overestimated due to the elongated protein shape. Thermal denaturation monitored by circular dichroism showed that unfolding of the tandem construct was not cooperative, and that the tandem arrangement did not stabilize the protein. Taken together, these data are consistent with an elongated beads-on-a-string arrangement of the tandem ACP domains in PUFA synthases, and speak against synergistic biocatalytic effects promoted by quaternary structuring. Thus, it is possible to envision bioengineering strategies which simply involve the artificial linking of

  3. Solution Structure of the Tandem Acyl Carrier Protein Domains from a Polyunsaturated Fatty Acid Synthase Reveals Beads-on-a-String Configuration

    KAUST Repository

    Trujillo, Uldaeliz

    2013-02-28

    The polyunsaturated fatty acid (PUFA) synthases from deep-sea bacteria invariably contain multiple acyl carrier protein (ACP) domains in tandem. This conserved tandem arrangement has been implicated in both amplification of fatty acid production (additive effect) and in structural stabilization of the multidomain protein (synergistic effect). While the more accepted model is one in which domains act independently, recent reports suggest that ACP domains may form higher oligomers. Elucidating the three-dimensional structure of tandem arrangements may therefore give important insights into the functional relevance of these structures, and hence guide bioengineering strategies. In an effort to elucidate the three-dimensional structure of tandem repeats from deep-sea anaerobic bacteria, we have expressed and purified a fragment consisting of five tandem ACP domains from the PUFA synthase from Photobacterium profundum. Analysis of the tandem ACP fragment by analytical gel filtration chromatography showed a retention time suggestive of a multimeric protein. However, small angle X-ray scattering (SAXS) revealed that the multi-ACP fragment is an elongated monomer which does not form a globular unit. Stokes radii calculated from atomic monomeric SAXS models were comparable to those measured by analytical gel filtration chromatography, showing that in the gel filtration experiment, the molecular weight was overestimated due to the elongated protein shape. Thermal denaturation monitored by circular dichroism showed that unfolding of the tandem construct was not cooperative, and that the tandem arrangement did not stabilize the protein. Taken together, these data are consistent with an elongated beads-on-a-string arrangement of the tandem ACP domains in PUFA synthases, and speak against synergistic biocatalytic effects promoted by quaternary structuring. Thus, it is possible to envision bioengineering strategies which simply involve the artificial linking of multiple ACP

  4. Fine population structure analysis method for genomes of many.

    Science.gov (United States)

    Pan, Xuedong; Wang, Yi; Wong, Emily H M; Telenti, Amalio; Venter, J Craig; Jin, Li

    2017-10-03

    Fine population structure can be examined through the clustering of individuals into subpopulations. The clustering of individuals in large sequence datasets into subpopulations makes the calculation of subpopulation specific allele frequency possible, which may shed light on selection of candidate variants for rare diseases. However, as the magnitude of the data increases, computational burden becomes a challenge in fine population structure analysis. To address this issue, we propose fine population structure analysis (FIPSA), which is an individual-based non-parametric method for dissecting fine population structure. FIPSA maximizes the likelihood ratio of the contingency table of the allele counts multiplied by the group. We demonstrated that its speed and accuracy were superior to existing non-parametric methods when the simulated sample size was up to 5,000 individuals. When applied to real data, the method showed high resolution on the Human Genome Diversity Project (HGDP) East Asian dataset. FIPSA was independently validated on 11,257 human genomes. The group assignment given by FIPSA was 99.1% similar to those assigned based on supervised learning. Thus, FIPSA provides high resolution and is compatible with a real dataset of more than ten thousand individuals.
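
    The abstract gives no formula for the likelihood ratio that FIPSA maximizes. As a loose illustration only, the sketch below computes the standard likelihood-ratio (G) statistic for a groups-by-alleles contingency table of allele counts; the function name g_statistic and the example tables are assumptions, and this is a generic G-test, not the FIPSA algorithm itself.

```python
import math

def g_statistic(table):
    """Likelihood-ratio (G) statistic for a contingency table.

    table[i][j] is the count of allele j observed in group i.
    G = 2 * sum(O * ln(O / E)), with E = row_total * column_total / grand_total.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    g = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            if observed > 0:  # cells with zero count contribute nothing
                expected = row_totals[i] * col_totals[j] / grand_total
                g += observed * math.log(observed / expected)
    return 2.0 * g

# Hypothetical allele counts for one biallelic site under two candidate groupings.
mixed_groups = [[10, 10], [10, 10]]     # grouping carries no allele-frequency signal
separated_groups = [[20, 0], [0, 20]]   # grouping separates the alleles completely

print(g_statistic(mixed_groups))
print(g_statistic(separated_groups))
```

    A grouping that tracks real allele-frequency differences yields a larger statistic than an uninformative one, which is the intuition behind searching for an assignment that maximizes a likelihood ratio.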

  5. Target selection and deselection at the Berkeley Structural Genomics Center.

    Science.gov (United States)

    Chandonia, John-Marc; Kim, Sung-Hou; Brenner, Steven E

    2006-02-01

    At the Berkeley Structural Genomics Center (BSGC), our goal is to obtain a near-complete structural complement of proteins in the minimal organisms Mycoplasma genitalium and M. pneumoniae, two closely related pathogens. Current targets for structure determination have been selected in six major stages, starting with those predicted to be most tractable to high throughput study and likely to yield new structural information. We report on the process used to select these proteins, as well as our target deselection procedure. Target deselection reduces experimental effort by eliminating targets similar to those recently solved by the structural biology community or other centers. We measure the impact of the 69 structures solved at the BSGC as of July 2004 on structure prediction coverage of the M. pneumoniae and M. genitalium proteomes. The number of Mycoplasma proteins for which the fold could first be reliably assigned based on structures solved at the BSGC (24 M. pneumoniae and 21 M. genitalium) is approximately 25% of the total resulting from work at all structural genomics centers and the worldwide structural biology community (94 M. pneumoniae and 86 M. genitalium) during the same period. As the number of structures contributed by the BSGC during that period is less than 1% of the total worldwide output, the benefits of a focused target selection strategy are apparent. If the structures of all current targets were solved, the percentage of M. pneumoniae proteins for which folds could be reliably assigned would increase from approximately 57% (391 of 687) at present to around 80% (550 of 687), and the percentage of the proteome that could be accurately modeled would increase from around 37% (254 of 687) to about 64% (438 of 687). In M. genitalium, the percentage of the proteome that could be structurally annotated based on structures of our remaining targets would rise from 72% (348 of 486) to around 76% (371 of 486), with the percentage of accurately modeled
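
    The percentages quoted in this abstract follow directly from the protein counts it reports. The short check below recomputes them; the dictionary labels and the helper percent are illustrative names, and only the counts come from the abstract.

```python
def percent(part, whole):
    """Return part/whole as a percentage."""
    return 100.0 * part / whole

# (numerator, denominator) pairs taken from the abstract.
coverage = {
    "M. pneumoniae, folds assigned now": (391, 687),
    "M. pneumoniae, folds assigned if all targets solved": (550, 687),
    "M. pneumoniae, accurately modeled now": (254, 687),
    "M. pneumoniae, accurately modeled if all targets solved": (438, 687),
    "M. genitalium, structurally annotated now": (348, 486),
    "M. genitalium, structurally annotated with remaining targets": (371, 486),
    "BSGC share of first fold assignments, M. pneumoniae": (24, 94),
    "BSGC share of first fold assignments, M. genitalium": (21, 86),
}

for label, (part, whole) in coverage.items():
    print(f"{label}: {part}/{whole} = {percent(part, whole):.1f}%")
```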

  6. Suppression of the phytoene synthase gene (EgcrtB) alters carotenoid content and intracellular structure of Euglena gracilis.

    Science.gov (United States)

    Kato, Shota; Soshino, Mika; Takaichi, Shinichi; Ishikawa, Takahiro; Nagata, Noriko; Asahina, Masashi; Shinomura, Tomoko

    2017-07-17

    Photosynthetic organisms utilize carotenoids for photoprotection as well as light harvesting. Our previous study revealed that high-intensity light increases the expression of the gene for phytoene synthase (EgcrtB) in Euglena gracilis (a unicellular phytoflagellate); the encoded enzyme catalyzes the first committed step of the carotenoid biosynthesis pathway. To examine carotenoid synthesis of E. gracilis in response to light stress, we analyzed carotenoid species and content in cells grown under various light intensities. In addition, we investigated the effect of suppressing EgcrtB with RNA interference (RNAi) on growth and carotenoid content. After cultivation for 7 days under continuous light at 920 μmol m⁻² s⁻¹, β-carotene, diadinoxanthin (Ddx), and diatoxanthin (Dtx) content in cells was significantly increased compared with standard light intensity (55 μmol m⁻² s⁻¹). The high-intensity light (920 μmol m⁻² s⁻¹) increased the pool size of diadinoxanthin cycle pigments (i.e., Ddx + Dtx) by 1.2-fold and the Dtx/Ddx ratio from 0.05 (control) to 0.09. In contrast, the higher-intensity light treatment caused a 58% decrease in chlorophyll (a + b) content and diminished the number of thylakoid membranes in chloroplasts by approximately half compared with control cells, suggesting that the high-intensity light-induced accumulation of carotenoids is associated with an increase in both the number and size of lipid globules in chloroplasts and the cytoplasm. Transient suppression of EgcrtB in this alga by RNAi resulted in significant decreases in cell number, chlorophyll, and total major carotenoid content by 82, 82 and 86%, respectively, relative to non-electroporated cells. Furthermore, suppression of EgcrtB decreased the number of chloroplasts and thylakoid membranes and increased the Dtx/Ddx ratio by 1.6-fold under continuous illumination even at the standard light intensity, indicating that blocking carotenoid synthesis increased the

  7. Structure of the Varicella Zoster Virus Thymidylate Synthase Establishes Functional and Structural Similarities as the Human Enzyme and Potentiates Itself as a Target of Brivudine.

    Directory of Open Access Journals (Sweden)

    Kelly Hew

    Full Text Available Varicella zoster virus (VZV) is a highly infectious human herpesvirus that is the causative agent for chicken pox and shingles. VZV encodes a functional thymidylate synthase (TS), which is the sole enzyme that produces dTMP from dUMP de novo. To study substrate binding, the complex structure of TSVZV with dUMP was determined to a resolution of 2.9 Å. In the absence of a folate co-substrate, dUMP binds in the conserved TS active site and is coordinated as in the human-encoded TS (TSHS) in an open conformation. The interactions of TSVZV with dUMP and a cofactor analog, raltitrexed, were also studied using differential scanning fluorimetry (DSF), suggesting that TSVZV binds dUMP and raltitrexed in a sequential binding mode like other TS. The DSF also revealed interactions between TSVZV and in vitro phosphorylated brivudine (BVDUP), a highly potent anti-herpesvirus drug against VZV infections. The binding of BVDUP to TSVZV was further confirmed by the complex structure of TSVZV and BVDUP solved at a resolution of 2.9 Å. BVDUP binds similarly to dUMP in TSHS but it induces a closed conformation of the active site. The structure supports that the 5-bromovinyl substituent on BVDUP is likely to inhibit TSVZV by preventing the transfer of a methylene group from its cofactor and the subsequent formation of dTMP. The interactions between TSVZV and BVDUP are consistent with TSVZV indeed being a target of brivudine in vivo. The work also provided the structural basis for rational design of more specific TSVZV inhibitors.

  8. Macromolecular structure determination in the post-genome era

    CERN Document Server

    Kuhn, P

    2001-01-01

    Recent advances in genetics, molecular biology and crystallographic instrumentation and methodology have led to a revolution in the field of Structural Molecular Biology (SMB). These combined advances have paved the way to a more complete and detailed understanding of the biological macromolecules that make up an organism, both in terms of their individual functions and also the interactions between them. In this paper we describe a large-scale, genomic approach to the three-dimensional structure determination of macromolecules and their complexes, using high-throughput methodology to streamline all aspects of the process. This task requires the development of automated high-intensity synchrotron beam lines for X-ray diffraction data collection from single crystal samples. Furthermore, these beam lines must be operated within a sophisticated software and hardware environment, which is capable of delivering a completely automated structure determination pipeline. The SMB resource at SSRL is developing a system...

  9. Refining the structure and content of clinical genomic reports.

    Science.gov (United States)

    Dorschner, Michael O; Amendola, Laura M; Shirts, Brian H; Kiedrowski, Lesli; Salama, Joseph; Gordon, Adam S; Fullerton, Stephanie M; Tarczy-Hornoch, Peter; Byers, Peter H; Jarvik, Gail P

    2014-03-01

    To effectively articulate the results of exome and genome sequencing we refined the structure and content of molecular test reports. To communicate results of a randomized control trial aimed at the evaluation of exome sequencing for clinical medicine, we developed a structured narrative report. With feedback from genetics and non-genetics professionals, we developed separate indication-specific and incidental findings reports. Standard test report elements were supplemented with research study-specific language, which highlighted the limitations of exome sequencing and provided detailed, structured results, and interpretations. The report format we developed to communicate research results can easily be transformed for clinical use by removal of research-specific statements and disclaimers. The development of clinical reports for exome sequencing has shown that accurate and open communication between the clinician and laboratory is ideally an ongoing process to address the increasing complexity of molecular genetic testing. © 2014 Wiley Periodicals, Inc.

  10. Structural Injury after Lithium Treatment in Human and Rat Kidney involves Glycogen Synthase Kinase-3β Positive Epithelium

    DEFF Research Database (Denmark)

    Kjærsgaard, Gitte; Madsen, Kirsten; Marcussen, Niels

    2011-01-01

    of glycogen synthase kinase-3β (GSK-3β). GSK-3β and pGSK-3β was investigated in a developing series of rat kidney cortex and medulla. Li+ was given to female wistar rats with litters through food pellets at postnatal (P) days 7-28. In human fetal and adult kidney the expression of GSK-3β was examined and also...

  11. Training set optimization under population structure in genomic selection.

    Science.gov (United States)

    Isidro, Julio; Jannink, Jean-Luc; Akdemir, Deniz; Poland, Jesse; Heslot, Nicolas; Sorrells, Mark E

    2015-01-01

    Population structure must be evaluated before optimization of the training set population. Maximizing the phenotypic variance captured by the training set is important for optimal performance. The optimization of the training set (TRS) in genomic selection has received much interest in both animal and plant breeding, because it is critical to the accuracy of the prediction models. In this study, five different TRS sampling algorithms, stratified sampling, mean of the coefficient of determination (CDmean), mean of predictor error variance (PEVmean), stratified CDmean (StratCDmean) and random sampling, were evaluated for prediction accuracy in the presence of different levels of population structure. In the presence of population structure, the most phenotypic variation captured by a sampling method in the TRS is desirable. The wheat dataset showed mild population structure, and CDmean and stratified CDmean methods showed the highest accuracies for all the traits except for test weight and heading date. The rice dataset had strong population structure and the approach based on stratified sampling showed the highest accuracies for all traits. In general, CDmean minimized the relationship between genotypes in the TRS, maximizing the relationship between TRS and the test set. This makes it suitable as an optimization criterion for long-term selection. Our results indicated that the best selection criterion used to optimize the TRS seems to depend on the interaction of trait architecture and population structure.
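
    Of the five sampling schemes compared in this abstract, stratified sampling is the simplest to sketch. The example below assumes proportional allocation across subpopulations, which the abstract does not specify; the function stratified_training_set, the line names, and the subpopulation labels are hypothetical. CDmean and PEVmean are not shown because they require the mixed-model quantities of the original study.

```python
import random
from collections import defaultdict

def stratified_training_set(subpop_of, size, seed=0):
    """Draw a training set with per-subpopulation quotas proportional to subpopulation size.

    subpop_of maps each line (genotype) name to its subpopulation label.
    Quotas are rounded, so the returned set can differ slightly from `size`.
    """
    rng = random.Random(seed)
    groups = defaultdict(list)
    for line, subpop in subpop_of.items():
        groups[subpop].append(line)
    total = len(subpop_of)
    chosen = []
    for subpop, members in sorted(groups.items()):
        quota = min(round(size * len(members) / total), len(members))
        chosen.extend(rng.sample(sorted(members), quota))
    return chosen

# Hypothetical panel: 100 lines in three subpopulations of unequal size.
subpop_of = {f"line{i:03d}": "A" for i in range(60)}
subpop_of.update({f"line{i:03d}": "B" for i in range(60, 90)})
subpop_of.update({f"line{i:03d}": "C" for i in range(90, 100)})

training = stratified_training_set(subpop_of, size=20)
counts = defaultdict(int)
for line in training:
    counts[subpop_of[line]] += 1
print(dict(counts))
```

    Random sampling, the baseline in the study, would correspond to a single draw from all lines without the per-subpopulation quotas.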

  12. The complete chloroplast genome sequence of Podocarpus lambertii: genome structure, evolutionary aspects, gene content and SSR detection.

    Directory of Open Access Journals (Sweden)

    Leila do Nascimento Vieira

    Full Text Available BACKGROUND: Podocarpus lambertii (Podocarpaceae) is a native conifer from the Brazilian Atlantic Forest Biome, which is considered one of the 25 biodiversity hotspots in the world. The advancement of next-generation sequencing technologies has enabled the rapid acquisition of whole chloroplast (cp) genome sequences at low cost. Several studies have proven the potential of cp genomes as tools to understand enigmatic and basal phylogenetic relationships at different taxonomic levels, as well as further probe the structural and functional evolution of plants. In this work, we present the complete cp genome sequence of P. lambertii. METHODOLOGY/PRINCIPAL FINDINGS: The P. lambertii cp genome is 133,734 bp in length, and similar to other sequenced cupressophytes, it lacks one of the large inverted repeat regions (IR). It contains 118 unique genes and one duplicated tRNA (trnN-GUU), which occurs as an inverted repeat sequence. The rps16 gene was not found, which was previously reported for the plastid genome of another Podocarpaceae (Nageia nagi) and Araucariaceae (Agathis dammara). Structurally, P. lambertii shows 4 inversions of a large DNA fragment ∼20,000 bp compared to the Podocarpus totara cp genome. These unexpected characteristics may be attributed to geographical distance and different adaptive needs. The P. lambertii cp genome presents a total of 28 tandem repeats and 156 SSRs, with homo- and dipolymers being the most common and tri-, tetra-, penta-, and hexapolymers occurring with less frequency. CONCLUSION: The complete cp genome sequence of P. lambertii revealed significant structural changes, even in species from the same genus. These results reinforce the apparent loss of rps16 gene in Podocarpaceae cp genome. In addition, several SSRs in the P. lambertii cp genome are likely intraspecific polymorphism sites, which may allow highly sensitive phylogeographic and population structure studies, as well as phylogenetic studies of species of
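
    The abstract reports SSR counts by motif length but not the detection tool or thresholds. The sketch below shows one simple way such repeats can be located; the minimum repeat numbers in MIN_REPEATS, the function names, and the test sequence are all assumptions made for illustration.

```python
# Assumed minimum number of repeat units per motif length (1 = mono ... 6 = hexa).
MIN_REPEATS = {1: 8, 2: 4, 3: 4, 4: 3, 5: 3, 6: 3}

def is_primitive(motif):
    """True if the motif is not itself a repetition of a shorter unit (e.g. 'ATAT')."""
    length = len(motif)
    return not any(
        length % d == 0 and motif == motif[:d] * (length // d)
        for d in range(1, length)
    )

def find_ssrs(sequence):
    """Return sorted (start, motif, repeat_count) tuples for simple sequence repeats."""
    seq = sequence.upper()
    hits = []
    for k, min_repeats in MIN_REPEATS.items():
        i = 0
        while i + k <= len(seq):
            motif = seq[i:i + k]
            if not set(motif) <= set("ACGT") or not is_primitive(motif):
                i += 1
                continue
            repeats = 1
            while seq[i + repeats * k:i + (repeats + 1) * k] == motif:
                repeats += 1
            if repeats >= min_repeats:
                hits.append((i, motif, repeats))
                i += repeats * k  # skip past the repeat that was just recorded
            else:
                i += 1
    return sorted(hits)

# Hypothetical sequence containing one mono-, one di- and one trinucleotide repeat.
example = "GGC" + "A" * 10 + "CGC" + "AT" * 5 + "GTT" + "AGC" * 4 + "TT"
for start, motif, repeats in find_ssrs(example):
    print(start, motif, repeats)
```

    Requiring a primitive motif keeps a run such as AAAAAAAA from being counted again as a dinucleotide or trinucleotide repeat.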

  13. Recognizing genes and other components of genomic structure

    Energy Technology Data Exchange (ETDEWEB)

    Burks, C. (Los Alamos National Lab., NM (USA)); Myers, E. (Arizona Univ., Tucson, AZ (USA). Dept. of Computer Science); Stormo, G.D. (Colorado Univ., Boulder, CO (USA). Dept. of Molecular, Cellular and Developmental Biology)

    1991-01-01

    The Aspen Center for Physics (ACP) sponsored a three-week workshop, with 26 scientists participating, from 28 May to 15 June, 1990. The workshop, entitled Recognizing Genes and Other Components of Genomic Structure, focussed on discussion of current needs and future strategies for developing the ability to identify and predict the presence of complex functional units on sequenced, but otherwise uncharacterized, genomic DNA. We addressed the need for computationally-based, automatic tools for synthesizing available data about individual consensus sequences and local compositional patterns into the composite objects (e.g., genes) that are -- as composite entities -- the true object of interest when scanning DNA sequences. The workshop was structured to promote sustained informal contact and exchange of expertise between molecular biologists, computer scientists, and mathematicians. No participant stayed for less than one week, and most attended for two or three weeks. Computers, software, and databases were available for use as "electronic blackboards" and as the basis for collaborative exploration of ideas being discussed and developed at the workshop. 23 refs., 2 tabs.

  14. Structural constraints in the packaging of bluetongue virus genomic segments.

    Science.gov (United States)

    Burkhardt, Christiane; Sung, Po-Yu; Celma, Cristina C; Roy, Polly

    2014-10-01

    The mechanism used by bluetongue virus (BTV) to ensure the sorting and packaging of its 10 genomic segments is still poorly understood. In this study, we investigated the packaging constraints for two BTV genomic segments from two different serotypes. Segment 4 (S4) of BTV serotype 9 was mutated sequentially and packaging of mutant ssRNAs was investigated by two newly developed RNA packaging assay systems, one in vivo and the other in vitro. Modelling of the mutated ssRNA followed by biochemical data analysis suggested that a conformational motif formed by interaction of the 5' and 3' ends of the molecule was necessary and sufficient for packaging. A similar structural signal was also identified in S8 of BTV serotype 1. Furthermore, the same conformational analysis of secondary structures for positive-sense ssRNAs was used to generate a chimeric segment that maintained the putative packaging motif but contained unrelated internal sequences. This chimeric segment was packaged successfully, confirming that the motif identified directs the correct packaging of the segment. © 2014 The Authors.

  15. Identification of genomic indels and structural variations using split reads

    Directory of Open Access Journals (Sweden)

    Urban Alexander E

    2011-07-01

    Full Text Available Abstract Background Recent studies have demonstrated the genetic significance of insertions, deletions, and other more complex structural variants (SVs) in the human population. With the development of the next-generation sequencing technologies, high-throughput surveys of SVs on the whole-genome level have become possible. Here we present split-read identification, calibrated (SRiC), a sequence-based method for SV detection. Results We start by mapping each read to the reference genome in standard fashion using gapped alignment. Then to identify SVs, we score each of the many initial mappings with an assessment strategy designed to take into account both sequencing and alignment errors (e.g. scoring more highly events gapped in the center of a read). All current SV calling methods have multilevel biases in their identifications due to both experimental and computational limitations (e.g. calling more deletions than insertions). A key aspect of our approach is that we calibrate all our calls against synthetic data sets generated from simulations of high-throughput sequencing (with realistic error models). This allows us to calculate sensitivity and the positive predictive value under different parameter-value scenarios and for different classes of events (e.g. long deletions vs. short insertions). We run our calculations on representative data from the 1000 Genomes Project. Coupling the observed numbers of events on chromosome 1 with the calibrations gleaned from the simulations (for different length events) allows us to construct a relatively unbiased estimate for the total number of SVs in the human genome across a wide range of length scales. We estimate in particular that an individual genome contains ~670,000 indels/SVs. Conclusions Compared with the existing read-depth and read-pair approaches for SV identification, our method can pinpoint the exact breakpoints of SV events, reveal the actual sequence content of insertions, and cover the whole
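
    The calibration idea in this record (compare calls against simulated events, then use sensitivity and positive predictive value to correct observed counts) can be illustrated with a small sketch. This is not the SRiC implementation: the event tuples, the function names calibrate and corrected_count, and the simple correction observed * PPV / sensitivity are all assumptions made for illustration.

```python
def calibrate(simulated_events, called_events):
    """Compare calls with simulated truth; return (sensitivity, ppv)."""
    truth = set(simulated_events)
    calls = set(called_events)
    true_pos = len(truth & calls)
    # Sensitivity: fraction of simulated events that were recovered.
    sensitivity = true_pos / len(truth) if truth else 0.0
    # PPV: fraction of calls that correspond to a simulated event.
    ppv = true_pos / len(calls) if calls else 0.0
    return sensitivity, ppv


def corrected_count(observed_calls, sensitivity, ppv):
    """Hypothetical bias correction: true count ~ observed * PPV / sensitivity."""
    return observed_calls * ppv / sensitivity


# Toy events as (chromosome, position, kind, length); values are made up.
simulated = [
    ("chr1", 100, "DEL", 5),
    ("chr1", 250, "INS", 3),
    ("chr1", 400, "DEL", 12),
    ("chr1", 900, "INS", 7),
]
called = [
    ("chr1", 100, "DEL", 5),
    ("chr1", 400, "DEL", 12),
    ("chr1", 700, "DEL", 4),  # false positive
]

sens, ppv = calibrate(simulated, called)
estimate = corrected_count(300, sens, ppv)
```

    In practice the two rates would be computed separately for each event class and length bin, since the abstract notes that biases differ between, for example, long deletions and short insertions.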

  16. Structure determination of glycogen synthase kinase-3 from Leishmania major and comparative inhibitor structure-activity relationships with Trypanosoma brucei GSK-3

    Energy Technology Data Exchange (ETDEWEB)

    Ojo, Kayode K; Arakaki, Tracy L; Napuli, Alberto J; Inampudi, Krishna K; Keyloun, Katelyn R; Zhang, Li; Hol, Wim G.J.; Verlind, Christophe L.M.J.; Merritt, Ethan A; Van Voorhis, Wesley C [UWASH

    2012-04-24

    Glycogen synthase kinase-3 (GSK-3) is a drug target under intense investigation in pharmaceutical companies and constitutes an attractive piggyback target for eukaryotic pathogens. Two different GSKs are found in trypanosomatids, one about 150 residues shorter than the other. GSK-3 short (GeneDB: Tb927.10.13780) has previously been validated genetically as a drug target in Trypanosoma brucei by RNAi induced growth retardation; and chemically by correlation between enzyme and in vitro growth inhibition. Here, we report investigation of the equivalent GSK-3 short enzymes of L. major (LmjF18.0270) and L. infantum (LinJ18_V3.0270, identical in amino acid sequences to LdonGSK-3 short) and a crystal structure of LmajGSK-3 short at 2 Å resolution. The inhibitor structure-activity relationships (SARs) of L. major and L. infantum are virtually identical, suggesting that inhibitors could be useful for both cutaneous and visceral leishmaniasis. Leishmania spp. GSK-3 short has different inhibitor SARs than TbruGSK-3 short, which can be explained mostly by two variant residues in the ATP-binding pocket. Indeed, mutating these residues in the ATP-binding site of LmajGSK-3 short to the TbruGSK-3 short equivalents results in a mutant LmajGSK-3 short enzyme with SAR more similar to that of TbruGSK-3 short. The differences between human GSK-3β (HsGSK-3β) and LmajGSK-3 short SAR suggest that compounds which selectively inhibit LmajGSK-3 short may be found.

  17. Novel class III phosphoribosyl diphosphate synthase: structure and properties of the tetrameric, phosphate-activated, non-allosterically inhibited enzyme from Methanocaldococcus jannaschii

    DEFF Research Database (Denmark)

    Kadziola, Anders; Jepsen, Clemens H; Johansson, Eva

    2005-01-01

    The prs gene encoding phosphoribosyl diphosphate (PRPP) synthase of the hyperthermophilic autotrophic methanogenic archaeon Methanocaldococcus jannaschii has been cloned and expressed in Escherichia coli. Subsequently, M.jannaschii PRPP synthase has been purified, characterised, crystallised, and...

  18. High-resolution haplotype block structure in the cattle genome

    Directory of Open Access Journals (Sweden)

    Choi Jungwoo

    2009-04-01

    similarities in haplotype block structure between dairy and beef breeds make them non-differentiable. Finally, our findings suggest that ~30,000 uniformly distributed SNPs would be necessary to construct a complete genome LD map in Bos taurus breeds, and ~580,000 SNPs would be necessary to characterize the haplotype block structure across the complete cattle genome.

  19. Draft Genome Sequence of Saccharomonospora sp. Strain LRS4.154, a Moderately Halophilic Actinobacterium with the Biotechnologically Relevant Polyketide Synthase and Nonribosomal Peptide Synthetase Systems

    Science.gov (United States)

    Alonso-Carmona, Scarlett; Vera-Gargallo, Blanca; de la Haba, Rafael R.; Ventosa, Antonio; Sandoval-Trujillo, Horacio

    2017-01-01

    ABSTRACT The draft genome sequence of Saccharomonospora sp. strain LRS4.154, a moderately halophilic actinobacterium, has been determined. The genome has 4,860,108 bp, a G+C content of 71.0%, and 4,525 open reading frames (ORFs). The clusters of PKS and NRPS genes, responsible for the biosynthesis of a large number of biomolecules, were identified in the genome. PMID:28546487

  20. RNA structural constraints in the evolution of the influenza A virus genome NP segment

    NARCIS (Netherlands)

    A.P. Gultyaev (Alexander); A. Tsyganov-Bodounov (Anton); M.I. Spronken (Monique); S. Van Der Kooij (Sander); R.A.M. Fouchier (Ron); R.C.L. Olsthoorn (René)

    2014-01-01

    Conserved RNA secondary structures were predicted in the nucleoprotein (NP) segment of the influenza A virus genome using comparative sequence and structure analysis. A number of structural elements exhibiting nucleotide covariations were identified over the whole segment length,

  1. Multi-scale coding of genomic information: From DNA sequence to genome structure and function

    Energy Technology Data Exchange (ETDEWEB)

    Arneodo, Alain, E-mail: alain.arneodo@ens-lyon.f [Universite de Lyon, F-69000 Lyon (France); Laboratoire Joliot-Curie and Laboratoire de Physique, CNRS, Ecole Normale Superieure de Lyon, F-69007 Lyon (France); Vaillant, Cedric, E-mail: cedric.vaillant@ens-lyon.f [Universite de Lyon, F-69000 Lyon (France); Laboratoire Joliot-Curie and Laboratoire de Physique, CNRS, Ecole Normale Superieure de Lyon, F-69007 Lyon (France); Audit, Benjamin, E-mail: benjamin.audit@ens-lyon.f [Universite de Lyon, F-69000 Lyon (France); Laboratoire Joliot-Curie and Laboratoire de Physique, CNRS, Ecole Normale Superieure de Lyon, F-69007 Lyon (France); Argoul, Francoise, E-mail: francoise.argoul@ens-lyon.f [Universite de Lyon, F-69000 Lyon (France); Laboratoire Joliot-Curie and Laboratoire de Physique, CNRS, Ecole Normale Superieure de Lyon, F-69007 Lyon (France); D' Aubenton-Carafa, Yves, E-mail: daubenton@cgm.cnrs-gif.f [Centre de Genetique Moleculaire, CNRS, Allee de la Terrasse, 91198 Gif-sur-Yvette (France); Thermes, Claude, E-mail: claude.thermes@cgm.cnrs-gif.f [Centre de Genetique Moleculaire, CNRS, Allee de la Terrasse, 91198 Gif-sur-Yvette (France)

    2011-02-15

    Understanding how chromatin is spatially and dynamically organized in the nucleus of eukaryotic cells and how this affects genome functions is one of the main challenges of cell biology. Since the different orders of packaging in the hierarchical organization of DNA condition the accessibility of DNA sequence elements to trans-acting factors that control the transcription and replication processes, there is actually a wealth of structural and dynamical information to learn in the primary DNA sequence. In this review, we show that when using concepts, methodologies, numerical and experimental techniques coming from statistical mechanics and nonlinear physics combined with wavelet-based multi-scale signal processing, we are able to decipher the multi-scale sequence encoding of chromatin condensation-decondensation mechanisms that play a fundamental role in regulating many molecular processes involved in nuclear functions.

  2. Full-length RNA structure prediction of the HIV-1 genome reveals a conserved core domain

    DEFF Research Database (Denmark)

    Sükösd, Zsuzsanna; Andersen, Ebbe Sloth; Seemann, Ernst Stefan

    2015-01-01

    of the HIV-1 genome is highly variable in most regions, with a limited number of stable and conserved RNA secondary structures. Most interesting, a set of long distance interactions form a core organizing structure (COS) that organize the genome into three major structural domains. Despite overlapping...

  3. Regulation of Aerobic Energy Metabolism in Podospora anserina by Two Paralogous Genes Encoding Structurally Different c-Subunits of ATP Synthase.

    Directory of Open Access Journals (Sweden)

    Carole H Sellem

    2016-07-01

    Full Text Available Most of the ATP in living cells is produced by an F-type ATP synthase. This enzyme uses the energy of a transmembrane electrochemical proton gradient to synthesize ATP from ADP and inorganic phosphate. Proton movements across the membrane domain (FO) of the ATP synthase drive the rotation of a ring of 8-15 c-subunits, which induces conformational changes in the catalytic part (F1) of the enzyme that ultimately promote ATP synthesis. Two paralogous nuclear genes, called Atp9-5 and Atp9-7, encode structurally different c-subunits in the filamentous fungus Podospora anserina. We have in this study identified differences in the expression pattern for the two genes that correlate with the mitotic activity of cells in vegetative mycelia: Atp9-7 is transcriptionally active in non-proliferating (stationary) cells while Atp9-5 is expressed in the cells at the extremity (apex) of filaments that divide and are responsible for mycelium growth. When active, the Atp9-5 gene sustains a much higher rate of c-subunit synthesis than Atp9-7. We further show that the ATP9-7 and ATP9-5 proteins have antagonistic effects on the longevity of P. anserina. Finally, we provide evidence that the ATP9-5 protein sustains a higher rate of mitochondrial ATP synthesis and yield in ATP molecules per electron transferred to oxygen than the c-subunit encoded by Atp9-7. These findings reveal that the c-subunit genes play a key role in the modulation of ATP synthase production and activity along the life cycle of P. anserina. Such a degree of sophistication for regulating aerobic energy metabolism has not been described before.

  4. Genomic structure and expression of immunoglobulins in Squamata.

    Science.gov (United States)

    Olivieri, David N; Garet, Elina; Estevez, Olivia; Sánchez-Espinel, Christian; Gambón-Deza, Francisco

    2016-04-01

    The Squamata order represents a major evolutionary reptile lineage, yet the structure and expression of immunoglobulins in this order has been scarcely studied in detail. From the genome sequences of four Squamata species (Gekko japonicus, Ophisaurus gracilis, Pogona vitticeps and Ophiophagus hannah) and RNA-seq datasets from 18 other Squamata species, we identified the immunoglobulins present in these animals as well as the tissues in which they are found. All Squamata have at least three immunoglobulin classes; namely, the immunoglobulins M, D, and Y. Unlike mammals, however, we provide evidence that some Squamata lineages possess more than one Cμ gene which is located downstream from the Cδ gene. The existence of two evolutionary lineages of immunoglobulin Y is shown. Additionally, it is demonstrated that while all Squamata species possess the λ light chain, only Iguanidae species possess the κ light chain. Copyright © 2016 Elsevier Ltd. All rights reserved.

  5. Structural characterization of genomes by large scale sequence-structure threading

    Directory of Open Access Journals (Sweden)

    Cherkasov Artem

    2004-04-01

    Full Text Available Abstract Background Using sequence-structure threading we have conducted structural characterization of complete proteomes of 37 archaeal, bacterial and eukaryotic organisms (including worm, fly, mouse and human) totaling 167,888 genes. Results The reported data represent the first rather general evaluation of the performance of full sequence-structure threading on multiple genomes, providing an opportunity to evaluate its general applicability for large scale studies. According to the estimated results the sequence-structure threading has assigned protein folds to more than 60% of eukaryotic, 68% of archaeal and 70% of bacterial proteomes. The repertoires of protein classes, architectures, topologies and homologous superfamilies (according to the CATH 2.4 classification) have been established for distant organisms and superkingdoms. It has been found that the average abundance of CATH classes decreases from "alpha and beta" to "mainly beta", followed by "mainly alpha" and "few secondary structures". 3-Layer (aba) Sandwich has been characterized as the most abundant protein architecture and Rossmann fold as the most common topology. Conclusion The analysis of genomic occurrences of CATH 2.4 protein homologous superfamilies and topologies has revealed the power-law character of their distributions. The corresponding double logarithmic "frequency – genomic occurrence" dependences characteristic of scale-free systems have been established for individual organisms and for three superkingdoms. Supplementary materials to this work are available at 1.
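
    A power law y = c * x^a appears as a straight line with slope a on double logarithmic axes, which is the kind of "frequency vs. genomic occurrence" dependence this record describes. The sketch below fits such a line by ordinary least squares on the logarithms. The function name fit_power_law and the toy data are assumptions for illustration; the record does not say which fitting procedure the authors used.

```python
import math


def fit_power_law(xs, ys):
    """Least-squares line through (log x, log y); returns (exponent, prefactor)."""
    log_x = [math.log(x) for x in xs]
    log_y = [math.log(y) for y in ys]
    n = len(log_x)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    sxx = sum((a - mean_x) ** 2 for a in log_x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(log_x, log_y))
    slope = sxy / sxx                      # exponent of the power law
    intercept = mean_y - slope * mean_x    # log of the prefactor
    return slope, math.exp(intercept)


# Made-up data lying exactly on y = 1000 * x^(-1.5).
occurrences = [1, 2, 4, 8, 16]
frequencies = [1000 * k ** -1.5 for k in occurrences]

exponent, prefactor = fit_power_law(occurrences, frequencies)
```

    A log-log regression is the simplest check for scale-free behaviour; on real count data it is sensitive to binning and to the sparse tail, so the fitted exponent should be read as a rough estimate.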

  6. A physical map for the Amborella trichopoda genome sheds light on the evolution of angiosperm genome structure

    Science.gov (United States)

    2011-01-01

    Background Recent phylogenetic analyses have identified Amborella trichopoda, an understory tree species endemic to the forests of New Caledonia, as sister to a clade including all other known flowering plant species. The Amborella genome is a unique reference for understanding the evolution of angiosperm genomes because it can serve as an outgroup to root comparative analyses. A physical map, BAC end sequences and sample shotgun sequences provide a first view of the 870 Mbp Amborella genome. Results Analysis of Amborella BAC ends sequenced from each contig suggests that the density of long terminal repeat retrotransposons is negatively correlated with that of protein coding genes. Syntenic, presumably ancestral, gene blocks were identified in comparisons of the Amborella BAC contigs and the sequenced Arabidopsis thaliana, Populus trichocarpa, Vitis vinifera and Oryza sativa genomes. Parsimony mapping of the loss of synteny corroborates previous analyses suggesting that the rate of structural change has been more rapid on lineages leading to Arabidopsis and Oryza compared with lineages leading to Populus and Vitis. The gamma paleohexaploidy event identified in the Arabidopsis, Populus and Vitis genomes is shown to have occurred after the divergence of all other known angiosperms from the lineage leading to Amborella. Conclusions When placed in the context of a physical map, BAC end sequences representing just 5.4% of the Amborella genome have facilitated reconstruction of gene blocks that existed in the last common ancestor of all flowering plants. The Amborella genome is an invaluable reference for inferences concerning the ancestral angiosperm and subsequent genome evolution. PMID:21619600

  7. Insular Celtic population structure and genomic footprints of migration.

    Science.gov (United States)

    Byrne, Ross P; Martiniano, Rui; Cassidy, Lara M; Carrigan, Matthew; Hellenthal, Garrett; Hardiman, Orla; Bradley, Daniel G; McLaughlin, Russell L

    2018-01-01

    Previous studies of the genetic landscape of Ireland have suggested homogeneity, with population substructure undetectable using single-marker methods. Here we have harnessed the haplotype-based method fineSTRUCTURE in an Irish genome-wide SNP dataset, identifying 23 discrete genetic clusters which segregate with geographical provenance. Cluster diversity is pronounced in the west of Ireland but reduced in the east where older structure has been eroded by historical migrations. Accordingly, when populations from the neighbouring island of Britain are included, a west-east cline of Celtic-British ancestry is revealed along with a particularly striking correlation between haplotypes and geography across both islands. A strong relationship is revealed between subsets of Northern Irish and Scottish populations, where discordant genetic and geographic affinities reflect major migrations in recent centuries. Additionally, Irish genetic proximity of all Scottish samples likely reflects older strata of communication across the narrowest inter-island crossing. Using GLOBETROTTER we detected Irish admixture signals from Britain and Europe and estimated dates for events consistent with the historical migrations of the Norse-Vikings, the Anglo-Normans and the British Plantations. The influence of the former is greater than previously estimated from Y chromosome haplotypes. In all, we paint a new picture of the genetic landscape of Ireland, revealing structure which should be considered in the design of studies examining rare genetic variation and its association with traits.

  8. Insular Celtic population structure and genomic footprints of migration.

    Directory of Open Access Journals (Sweden)

    Ross P Byrne

    2018-01-01

    Full Text Available Previous studies of the genetic landscape of Ireland have suggested homogeneity, with population substructure undetectable using single-marker methods. Here we have harnessed the haplotype-based method fineSTRUCTURE in an Irish genome-wide SNP dataset, identifying 23 discrete genetic clusters which segregate with geographical provenance. Cluster diversity is pronounced in the west of Ireland but reduced in the east where older structure has been eroded by historical migrations. Accordingly, when populations from the neighbouring island of Britain are included, a west-east cline of Celtic-British ancestry is revealed along with a particularly striking correlation between haplotypes and geography across both islands. A strong relationship is revealed between subsets of Northern Irish and Scottish populations, where discordant genetic and geographic affinities reflect major migrations in recent centuries. Additionally, Irish genetic proximity of all Scottish samples likely reflects older strata of communication across the narrowest inter-island crossing. Using GLOBETROTTER we detected Irish admixture signals from Britain and Europe and estimated dates for events consistent with the historical migrations of the Norse-Vikings, the Anglo-Normans and the British Plantations. The influence of the former is greater than previously estimated from Y chromosome haplotypes. In all, we paint a new picture of the genetic landscape of Ireland, revealing structure which should be considered in the design of studies examining rare genetic variation and its association with traits.

  9. seq-seq-pan: building a computational pan-genome data structure on whole genome alignment.

    Science.gov (United States)

    Jandrasits, Christine; Dabrowski, Piotr W; Fuchs, Stephan; Renard, Bernhard Y

    2018-01-15

    The increasing application of next generation sequencing technologies has led to the availability of thousands of reference genomes, often providing multiple genomes for the same or closely related species. The current approach to represent a species or a population with a single reference sequence and a set of variations cannot represent their full diversity and introduces bias towards the chosen reference. There is a need for the representation of multiple sequences in a composite way that is compatible with existing data sources for annotation and suitable for established sequence analysis methods. At the same time, this representation needs to be easily accessible and extendable to account for the constant change of available genomes. We introduce seq-seq-pan, a framework that provides methods for adding or removing new genomes from a set of aligned genomes and uses these to construct a whole genome alignment. Throughout the sequential workflow the alignment is optimized for generating a representative linear presentation of the aligned set of genomes, that enables its usage for annotation and in downstream analyses. By providing dynamic updates and optimized processing, our approach enables the usage of whole genome alignment in the field of pan-genomics. In addition, the sequential workflow can be used as a fast alternative to existing whole genome aligners for aligning closely related genomes. seq-seq-pan is freely available at https://gitlab.com/rki_bioinformatics.

  10. Implications of secondary structure prediction and amino acid sequence comparison of class I and class II phosphoribosyl diphosphate synthases on catalysis, regulation, and quaternary structure

    DEFF Research Database (Denmark)

    Krath, B N; Hove-Jensen, B

    2001-01-01

    Spinach 5-phospho-D-ribosyl alpha-1-diphosphate (PRPP) synthase isozyme 4 was synthesized in Escherichia coli and purified to near homogeneity. The activity of the enzyme is independent of P(i); it is inhibited by ADP in a competitive manner, indicating a lack of an allosteric site; and it accepts...

  11. Structural basis for cyclization specificity of two Azotobacter type III polyketide synthases: a single amino acid substitution reverses their cyclization specificity.

    Science.gov (United States)

    Satou, Ryutaro; Miyanaga, Akimasa; Ozawa, Hiroki; Funa, Nobutaka; Katsuyama, Yohei; Miyazono, Ken-ichi; Tanokura, Masaru; Ohnishi, Yasuo; Horinouchi, Sueharu

    2013-11-22

    Type III polyketide synthases (PKSs) show diverse cyclization specificity. We previously characterized two Azotobacter type III PKSs (ArsB and ArsC) with different cyclization specificity. ArsB and ArsC, which share a high sequence identity (71%), produce alkylresorcinols and alkylpyrones through aldol condensation and lactonization of the same polyketomethylene intermediate, respectively. Here we identified a key amino acid residue for the cyclization specificity of each enzyme by site-directed mutagenesis. Trp-281 of ArsB corresponded to Gly-284 of ArsC in the amino acid sequence alignment. The ArsB W281G mutant synthesized alkylpyrone but not alkylresorcinol. In contrast, the ArsC G284W mutant synthesized alkylresorcinol with a small amount of alkylpyrone. These results indicate that this amino acid residue (Trp-281 of ArsB or Gly-284 of ArsC) should occupy a critical position for the cyclization specificity of each enzyme. We then determined crystal structures of the wild-type and G284W ArsC proteins at resolutions of 1.76 and 1.99 Å, respectively. Comparison of these two ArsC structures indicates that the G284W substitution brings a steric wall to the active site cavity, resulting in a significant reduction of the cavity volume. We postulate that the polyketomethylene intermediate can be folded to a suitable form for aldol condensation only in such a relatively narrow cavity of ArsC G284W (and presumably ArsB). This is the first report on the alteration of cyclization specificity from lactonization to aldol condensation for a type III PKS. The ArsC G284W structure is significant as it is the first reported structure of a microbial resorcinol synthase.

  12. Learning directed acyclic graphical structures with genetical genomics data.

    Science.gov (United States)

    Gao, Bin; Cui, Yuehua

    2015-12-15

    A large amount of research effort has focused on estimating gene networks based on gene expression data to understand the functional basis of a living organism. Such networks are often obtained by considering pairwise correlations between genes, and thus may not reflect the true connectivity between genes. By treating gene expressions as quantitative traits while considering genetic markers, genetical genomics analysis has shown its power in enhancing the understanding of gene regulations. Previous works have shown improved performance in estimating the undirected network graphical structure by incorporating genetic markers as covariates. Knowing that gene expressions are often due to directed regulations, it is more meaningful to estimate the directed graphical network. In this article, we introduce a covariate-adjusted Gaussian graphical model to estimate the Markov equivalence class of the directed acyclic graphs (DAGs) in a genetical genomics analysis framework. We develop a two-stage estimation procedure to first estimate the regression coefficient matrix by [Formula: see text] penalization. The estimated coefficient matrix is then used to estimate the mean values in our multi-response Gaussian model to estimate the regulatory networks of gene expressions using PC-algorithm. The estimation consistency for high dimensional sparse DAGs is established. Simulations are conducted to demonstrate our theoretical results. The method is applied to a human Alzheimer's disease dataset in which differential DAGs are identified between cases and controls. R code for implementing the method can be downloaded at http://www.stt.msu.edu/∼cui. R code for implementing the method is freely available at http://www.stt.msu.edu/∼cui/software.html. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
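
    The first stage of the procedure above is a penalized regression of gene expression on genetic markers. The penalty itself is elided in this record ("[Formula: see text]"), so the sketch below assumes an L1 (lasso) penalty purely for illustration and fits one response by coordinate descent. The names soft_threshold and lasso_cd and the toy data are hypothetical, and this is not the authors' R code.

```python
def soft_threshold(z, lam):
    """Shrink z towards zero by lam; values within [-lam, lam] become 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_cd(X, y, lam, n_iter=200):
    """Minimise (1/2n)*||y - X b||^2 + lam*||b||_1 by coordinate descent."""
    n, p = len(y), len(X[0])
    b = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # correlation of feature j with the partial residual
            zj = 0.0   # squared norm of feature j
            for i in range(n):
                others = sum(X[i][k] * b[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - others)
                zj += X[i][j] ** 2
            if zj == 0.0:
                continue  # an all-zero column carries no information
            b[j] = soft_threshold(rho / n, lam) / (zj / n)
    return b


# Toy data: two orthogonal marker columns; expression depends on the first only.
markers = [[1.0, 1.0], [-1.0, 1.0], [1.0, -1.0], [-1.0, -1.0]]
expression = [2.0, -2.0, 2.0, -2.0]

coef = lasso_cd(markers, expression, lam=0.5)

# Marker-adjusted residuals: the kind of input a second stage would work on.
residuals = [
    yi - sum(xij * bj for xij, bj in zip(xi, coef))
    for xi, yi in zip(markers, expression)
]
```

    With several genes, the same fit would be repeated per expression trait to build the coefficient matrix; the second stage (the PC algorithm on the covariate-adjusted model) is not sketched here.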

  13. Endothelial nitric oxide synthase gene polymorphisms associated ...

    African Journals Online (AJOL)

    Endothelial nitric oxide synthase (NOS3) is involved in key steps of immune response. Genetic factors predispose individuals to periodontal disease. This study's aim was to explore the association between NOS3 gene polymorphisms and clinical parameters in patients with periodontal disease. Genomic DNA was obtained ...

  14. From structure prediction to genomic screens for novel non-coding RNAs

    DEFF Research Database (Denmark)

    Gorodkin, Jan; Hofacker, Ivo L.

    2011-01-01

    ... This and the increased amount of available genomes have made it possible to employ structure-based methods for genomic screens. The field has moved from folding prediction of single sequences to computational screens for ncRNAs in genomic sequence using the RNA structure as the main characteristic feature. Whereas early methods focused on energy-directed folding of single sequences, comparative analysis based on structure preserving changes of base pairs has been efficient in improving accuracy, and today this constitutes a key component in genomic screens. Here, we cover the basic principles of RNA folding and touch upon some of the concepts in current methods that have been applied in genomic screens for de novo RNA structures in searches for novel ncRNA genes and regulatory RNA structure on mRNAs. We discuss the strengths and weaknesses of the different strategies and how they can complement each other....

  15. Producing genome structure populations with the dynamic and automated PGS software.

    Science.gov (United States)

    Hua, Nan; Tjong, Harianto; Shin, Hanjun; Gong, Ke; Zhou, Xianghong Jasmine; Alber, Frank

    2018-05-01

    Chromosome conformation capture technologies such as Hi-C are widely used to investigate the spatial organization of genomes. Because genome structures can vary considerably between individual cells of a population, interpreting ensemble-averaged Hi-C data can be challenging, in particular for long-range and interchromosomal interactions. We pioneered a probabilistic approach for the generation of a population of distinct diploid 3D genome structures consistent with all the chromatin-chromatin interaction probabilities from Hi-C experiments. Each structure in the population is a physical model of the genome in 3D. Analysis of these models yields new insights into the causes and the functional properties of the genome's organization in space and time. We provide a user-friendly software package, called PGS, which runs on local machines (for practice runs) and high-performance computing platforms. PGS takes a genome-wide Hi-C contact frequency matrix, along with information about genome segmentation, and produces an ensemble of 3D genome structures entirely consistent with the input. The software automatically generates an analysis report, and provides tools to extract and analyze the 3D coordinates of specific domains. Basic Linux command-line knowledge is sufficient for using this software. A typical running time of the pipeline is ∼3 d with 300 cores on a computer cluster to generate a population of 1,000 diploid genome structures at topological-associated domain (TAD)-level resolution.

  16. Structure-based inference of molecular functions of proteins of unknown function from Berkeley Structural Genomics Center

    Energy Technology Data Exchange (ETDEWEB)

    Kim, Sung-Hou; Shin, Dong Hae; Hou, Jingtong; Chandonia, John-Marc; Das, Debanu; Choi, In-Geol; Kim, Rosalind; Kim, Sung-Hou

    2007-09-02

    Advances in sequence genomics have resulted in an accumulation of a huge number of protein sequences derived from genome sequences. However, the functions of a large portion of them cannot be inferred based on the current methods of sequence homology detection to proteins of known functions. Three-dimensional structure can have an important impact in providing inference of molecular function (physical and chemical function) of a protein of unknown function. Structural genomics centers worldwide have been determining many 3-D structures of the proteins of unknown functions, and possible molecular functions of them have been inferred based on their structures. Combined with bioinformatics and enzymatic assay tools, the successful acceleration of the process of protein structure determination through high throughput pipelines enables the rapid functional annotation of a large fraction of hypothetical proteins. We present a brief summary of the process we used at the Berkeley Structural Genomics Center to infer molecular functions of proteins of unknown function.

  17. Structure of the genome of equine herpesvirus type 3.

    Science.gov (United States)

    Sullivan, D C; Atherton, S S; Staczek, J; O'Callaghan, D J

    1984-01-30

    Restriction endonuclease mapping studies were performed to determine the molecular structure of the genome of equine herpesvirus type 3 (EHV-3). Purified EHV-3 DNA, either unlabeled or 32P-labeled, was analyzed using the restriction enzymes BamHI, BclI, BglII, EcoRI, and HindIII. The findings that four 0.5 M (molar) fragments were present, that two of these were terminal fragments, and that all 0.5 M fragments contained homologous DNA sequences as judged by DNA hybridization analyses indicated that DNA sequences located at one terminus are repeated within the molecule and that two populations of molecules exist with regard to the arrangement of this pair of shared sequences. Mapping of BamHI, BclI, BglII, EcoRI, and HindIII fragments by double digestion of intact EHV-3 DNA, reciprocal digestion of isolated restriction enzyme fragments, and blot hybridization experiments revealed that the EHV-3 genome is a linear, double-stranded DNA molecule with a molecular size of 96.2 +/- 0.48 MDa and is comprised of two covalently linked segments, designated L (long) and S (short). The S region is approximately 22.9 MDa in size and consists of a unique segment (Us) of approximately 5.8 MDa bracketed by 8.5 MDa inverted repeat sequences that allow the S region to invert relative to the fixed L region which is approximately 73.3 MDa in size and consists only of unique sequences. Thus, these data confirm that EHV-3 DNA exists in two isomeric forms and has a molecular structure similar to that of the genomes of EHV-1 (B. E. Henry, S. A. Robinson, S. A. Dauenhauer, S. S. Atherton, G. S. Hayward, and D. J. O'Callaghan, Virology 115, 97-114, 1981; D. J. O'Callaghan, G. A. Gentry, and C. C. Randall, "The Herpesvirus," Vol. 2, pp. 215-318, Plenum, New York, 1983; D. J. O'Callaghan, B. E. Henry, J. H. Wharton, S. A. Dauenhauer, R. B. Vance, J. Staczek, and R. A. Robinson, "Developments in Molecular Virology," Vol. 1, pp. 387-418, Nijhoff, The Hague, 1981; W. T. Ruyechan, S. A. 
Dauenhauer

  18. Modes of heme binding and substrate access for cytochrome P450 CYP74A revealed by crystal structures of allene oxide synthase

    Energy Technology Data Exchange (ETDEWEB)

    Li, Lenong; Chang, Zhenzhan; Pan, Zhiqiang; Fu, Zheng-Qing; Wang, Xiaoqiang (US-Agriculture); (SRNF); (Georgia)

    2009-01-12

    Cytochrome P450s exist ubiquitously in all organisms and are involved in many biological processes. Allene oxide synthase (AOS) is a P450 enzyme that plays a key role in the biosynthesis of oxylipin jasmonates, which are involved in signal and defense reactions in higher plants. The crystal structures of guayule (Parthenium argentatum) AOS (CYP74A2) and its complex with the substrate analog 13(S)-hydroxyoctadeca-9Z,11E-dienoic acid have been determined. The structures exhibit a classic P450 fold but possess a heme-binding mode with an unusually long heme binding loop and a unique I-helix. The structures also reveal two channels through which substrate and product may access and leave the active site. The entrances are defined by a loop between β3-2 and β3-3. Asn-276 in the substrate binding site may interact with the substrate's hydroperoxy group and play an important role in catalysis, and Lys-282 at the entrance may control substrate access and binding. These studies provide both structural insights into AOS and related P450s and a structural basis to understand the distinct reaction mechanism.

  19. Crystal structure of mouse thymidylate synthase in tertiary complex with dUMP and raltitrexed reveals N-terminus architecture and two different active site conformations.

    Science.gov (United States)

    Dowierciał, Anna; Wilk, Piotr; Rypniewski, Wojciech; Rode, Wojciech; Jarmuła, Adam

    2014-01-01

    The crystal structure of mouse thymidylate synthase (mTS) in complex with substrate dUMP and antifolate inhibitor Raltitrexed is reported. The structure reveals, for the first time in the group of mammalian TS structures, a well-ordered segment of 13 N-terminal amino acids, whose ordered conformation is stabilized due to specific crystal packing. The structure consists of two homodimers, differing in conformation, one being more closed (dimer AB) and thus supporting tighter binding of ligands, and the other being more open (dimer CD) and thus allowing weaker binding of ligands. This difference indicates an asymmetrical effect of the binding of Raltitrexed to two independent mTS molecules. Conformational changes leading to a ligand-induced closing of the active site cleft are observed by comparing the crystal structures of mTS in three different states along the catalytic pathway: ligand-free, dUMP-bound, and dUMP- and Raltitrexed-bound. Possible interaction routes between hydrophobic residues of the mTS protein N-terminal segment and the active site are also discussed.

  20. Structural and In Vivo Studies on Trehalose-6-Phosphate Synthase from Pathogenic Fungi Provide Insights into Its Catalytic Mechanism, Biological Necessity, and Potential for Novel Antifungal Drug Design

    Directory of Open Access Journals (Sweden)

    Yi Miao

    2017-07-01

    The disaccharide trehalose is critical to the survival of pathogenic fungi in their human host. Trehalose-6-phosphate synthase (Tps1) catalyzes the first step of trehalose biosynthesis in fungi. Here, we report the first structures of eukaryotic Tps1s in complex with substrates or substrate analogues. The overall structures of Tps1 from Candida albicans and Aspergillus fumigatus are essentially identical and reveal N- and C-terminal Rossmann fold domains that form the glucose-6-phosphate and UDP-glucose substrate binding sites, respectively. These Tps1 structures with substrates or substrate analogues reveal key residues involved in recognition and catalysis. Disruption of these key residues severely impaired Tps1 enzymatic activity. Subsequent cellular analyses also highlight the enzymatic function of Tps1 in thermotolerance, yeast-hypha transition, and biofilm development. These results suggest that Tps1 enzymatic functionality is essential for the fungal stress response and virulence. Furthermore, structures of Tps1 in complex with the nonhydrolyzable inhibitor, validoxylamine A, visualize the transition state and support an internal return-like catalytic mechanism that is generalizable to other GT-B-fold retaining glycosyltransferases. Collectively, our results depict key Tps1-substrate interactions, unveil the enzymatic mechanism of these fungal proteins, and pave the way for high-throughput inhibitor screening buttressed and guided by the current structures and those of high-affinity ligand-Tps1 complexes.

  1. Alignment-free comparative genomic screen for structured RNAs using coarse-grained secondary structure dot plots

    DEFF Research Database (Denmark)

    Kato, Yuki; Gorodkin, Jan; Havgaard, Jakob Hull

    2017-01-01

    . Methods: Here we present a fast and efficient method, DotcodeR, for detecting structurally similar RNAs in genomic sequences by comparing their corresponding coarse-grained secondary structure dot plots at string level. This allows us to perform an all-against-all scan of all window pairs from two genomes...... without alignment. Results: Our computational experiments with simulated data and real chromosomes demonstrate that the presented method has good sensitivity. Conclusions: DotcodeR can be useful as a pre-filter in a genomic comparative scan for structured RNAs....

  2. Defining the genome structure of 'Tongil' rice, an important cultivar in the Korean "Green Revolution".

    Science.gov (United States)

    Kim, Backki; Kim, Dong-Gwan; Lee, Gileung; Seo, Jeonghwan; Choi, Ik-Young; Choi, Beom-Soon; Yang, Tae-Jin; Kim, Kwang Soo; Lee, Joohyun; Chin, Joong Hyoun; Koh, Hee-Jong

    2014-12-01

    Tongil (IR667-98-1-2) rice, developed in 1972, is a high-yield rice variety derived from a three-way cross between indica and japonica varieties. Tongil contributed to the self-sufficiency of staple food production in Korea during a period known as the 'Korean Green Revolution'. We analyzed the nucleotide-level genome structure of Tongil rice and compared it to those of the parental varieties. A total of 17.3 billion Illumina HiSeq reads, 47× genome coverage, were generated for Tongil rice. Three parental accessions of Tongil rice, two indica types and one japonica type, were also sequenced at approximately 30× genome coverage. A total of 2,149,991 SNPs were detected between Tongil and Nipponbare varieties. The average SNP frequency of Tongil was 5.77 per kb. Genome composition was determined based on SNP data by comparing Tongil with three parental genome sequences using the sliding window approach. Analyses revealed that 91.8% of the Tongil genome originated from the indica parents and 7.9% from the japonica parent. Copy numbers of SSR motifs, ORF gene distribution throughout the whole genome, gene ontology (GO) annotation, and some yield-related QTLs or gene locations were also comparatively analyzed between Tongil and parental varieties using sequence-based tools. Each genetic factor was transferred from the parents into Tongil rice in amounts that were in proportion to the whole genome composition. Tongil was derived from a three-way cross among two indica and one japonica varieties. Defining the genome structure of Tongil rice demonstrates that the Tongil genome is derived primarily from the indica genome with a small proportion of japonica genome introgression. Comparative gene distribution, SSR, GO, and yield-related gene analysis support the finding that the Tongil genome is primarily made up of the indica genome.
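
    The sliding-window comparison mentioned in the abstract above can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the window size, the input layout (one hypothetical label per SNP saying which parent's allele the progeny carries) and the majority-vote rule are all assumptions made for illustration.

```python
from collections import Counter


def window_origins(snp_labels, window=5):
    """Assign each non-overlapping window of SNPs to its majority parent.

    snp_labels: hypothetical per-SNP calls such as "indica" or "japonica",
    ordered by genomic position. The last window may be shorter.
    """
    origins = []
    for start in range(0, len(snp_labels), window):
        chunk = snp_labels[start:start + window]
        # most_common(1) yields [(label, count)]; ties go to the label seen first
        origins.append(Counter(chunk).most_common(1)[0][0])
    return origins


def composition(origins):
    """Fraction of windows assigned to each parent."""
    counts = Counter(origins)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


# Toy input: 20 SNP calls, five of them japonica-like (assumed data)
labels = ["indica"] * 5 + ["japonica"] * 5 + ["indica"] * 10
origins = window_origins(labels, window=5)
shares = composition(origins)
```

    With real data the windows would normally be defined in base pairs rather than SNP counts, and a window with too few informative SNPs would be left unassigned; both refinements are omitted here for brevity.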

  3. Decoding the fine-scale structure of a breast cancer genome and transcriptome.

    Science.gov (United States)

    Volik, Stanislav; Raphael, Benjamin J; Huang, Guiqing; Stratton, Michael R; Bignel, Graham; Murnane, John; Brebner, John H; Bajsarowicz, Krystyna; Paris, Pamela L; Tao, Quanzhou; Kowbel, David; Lapuk, Anna; Shagin, Dmitri A; Shagina, Irina A; Gray, Joe W; Cheng, Jan-Fang; de Jong, Pieter J; Pevzner, Pavel; Collins, Colin

    2006-03-01

    A comprehensive understanding of cancer is predicated upon knowledge of the structure of malignant genomes underlying its many variant forms and the molecular mechanisms giving rise to them. It is well established that solid tumor genomes accumulate a large number of genome rearrangements during tumorigenesis. End Sequence Profiling (ESP) maps and clones genome breakpoints associated with all types of genome rearrangements elucidating the structural organization of tumor genomes. Here we extend the ESP methodology in several directions using the breast cancer cell line MCF-7. First, targeted ESP is applied to multiple amplified loci, revealing a complex process of rearrangement and co-amplification in these regions reminiscent of breakage/fusion/bridge cycles. Second, genome breakpoints identified by ESP are confirmed using a combination of DNA sequencing and PCR. Third, in vitro functional studies assign biological function to a rearranged tumor BAC clone, demonstrating that it encodes anti-apoptotic activity. Finally, ESP is extended to the transcriptome identifying four novel fusion transcripts and providing evidence that expression of fusion genes may be common in tumors. These results demonstrate the distinct advantages of ESP including: (1) the ability to detect all types of rearrangements and copy number changes; (2) straightforward integration of ESP data with the annotated genome sequence; (3) immortalization of the genome; (4) ability to generate tumor-specific reagents for in vitro and in vivo functional studies. Given these properties, ESP could play an important role in a tumor genome project.

  4. Association between population structure and allele frequencies of the glycogen synthase 1 mutation in the Austrian Noriker draft horse.

    Science.gov (United States)

    Druml, T; Grilz-Seger, G; Neuditschko, M; Brem, G

    2017-02-01

    The aim of this study was to determine the allele frequency of the glycogen synthase 1 (GYS1) mutation associated with polysaccharide storage myopathy type 1 in the Austrian Noriker horse. Furthermore, we examined the influence of population substructures on the allele distribution. The study was based upon a comprehensive population sample (208 breeding stallions and 309 mares) and a complete cohort of unselected offspring from the year 2014 (1553 foals). The mean proportion of GYS1 carrier animals in the foal cohort was 33%, ranging from 15% to 50% according to population substructures based on coat colours. In 517 mature breeding horses the mutation carrier frequency reached 34%, ranging on a wider scale from 4% to 62% within genetic substructures. We could show that the occurrence of the mutated GYS1 allele is influenced by coat colour; genetic bottlenecks; and assortative, rotating and random mating strategies. Highest GYS1 carrier frequencies were observed in the chestnut sample comprising 50% in foals, 54% in mares and 62% in breeding stallions. The mean inbreeding of homozygous carrier animals reached 4.10%, whereas non-carrier horses were characterized by an inbreeding coefficient of 3.48%. Lowest GYS1 carrier frequencies were observed in the leopard spotted Noriker subpopulation. Here the mean carrier frequency reached 15% in foals, 17% in mares and 4% in stallions and inbreeding decreased from 3.28% in homozygous non-carrier horses to 2.70% in heterozygous horses and 0.94% in homozygous carriers. This study illustrates that lineage breeding and specified mating strategies result in genetic substructures, which affect the frequencies of the GYS1 gene mutation. © 2016 Stichting International Foundation for Animal Genetics.

  5. Full-length RNA structure prediction of the HIV-1 genome reveals a conserved core domain.

    Science.gov (United States)

    Sükösd, Zsuzsanna; Andersen, Ebbe S; Seemann, Stefan E; Jensen, Mads Krogh; Hansen, Mathias; Gorodkin, Jan; Kjems, Jørgen

    2015-12-02

    A distance constrained secondary structural model of the ≈10 kb RNA genome of HIV-1 has been predicted but higher-order structures, involving long distance interactions, are currently unknown. We present the first global RNA secondary structure model for the HIV-1 genome, which integrates both comparative structure analysis and information from experimental data in a full-length prediction without distance constraints. Besides recovering known structural elements, we predict several novel structural elements that are conserved in HIV-1 evolution. Our results also indicate that the structure of the HIV-1 genome is highly variable in most regions, with a limited number of stable and conserved RNA secondary structures. Most interestingly, a set of long distance interactions form a core organizing structure (COS) that organizes the genome into three major structural domains. Despite overlapping protein-coding regions the COS is supported by a particularly high frequency of compensatory base changes, suggesting functional importance for this element. This new structural element potentially organizes the whole genome into three major domains protruding from a conserved core structure with potential roles in replication and evolution for the virus. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

  6. Cellulose synthesis via the FEI2 RLK/SOS5 pathway and cellulose synthase 5 is required for the structure of seed coat mucilage in Arabidopsis.

    Science.gov (United States)

    Harpaz-Saad, Smadar; McFarlane, Heather E; Xu, Shouling; Divi, Uday K; Forward, Bronwen; Western, Tamara L; Kieber, Joseph J

    2011-12-01

    The seeds of Arabidopsis thaliana and many other plants are surrounded by a pectinaceous mucilage that aids in seed hydration and germination. Mucilage is synthesized during seed development within maternally derived seed coat mucilage secretory cells (MSCs), and is released to surround the seed upon imbibition. The FEI1/FEI2 receptor-like kinases and the SOS5 extracellular GPI-anchored protein were shown previously to act on a pathway that regulates the synthesis of cellulose in Arabidopsis roots. Here, we demonstrate that both FEI2 and SOS5 also play a role in the synthesis of seed mucilage. Disruption of FEI2 or SOS5 leads to a reduction in the rays of cellulose observed across the seed mucilage inner layer, which alters the structure of the mucilage in response to hydration. Mutations in CESA5, which disrupts an isoform of cellulose synthase involved in primary cell wall synthesis, result in a similar seed mucilage phenotype. The data indicate that CESA5-derived cellulose plays an important role in the synthesis and structure of seed coat mucilage and that the FEI2/SOS5 pathway plays a role in the regulation of cellulose synthesis in MSCs. Moreover, these results establish a novel structural role for cellulose in anchoring the pectic component of seed coat mucilage to the seed surface. © 2011 The Authors. The Plant Journal © 2011 Blackwell Publishing Ltd.

  7. The Chloroplast Genome of Symplocarpus renifolius: A Comparison of Chloroplast Genome Structure in Araceae

    Science.gov (United States)

    Park, Kyu Tae

    2017-01-01

    Symplocarpus renifolius is a member of Araceae family that is extraordinarily diverse in appearance. Previous studies on chloroplast genomes in Araceae were focused on duckweeds (Lemnoideae) and root crops (Colocasia, commonly known as taro). Here, we determined the chloroplast genome of Symplocarpus renifolius and compared the factors, such as genes and inverted repeat (IR) junctions and performed phylogenetic analysis using other Araceae species. The chloroplast genome of S. renifolius is 158,521 bp and includes 113 genes. A comparison among the Araceae chloroplast genomes showed that infA in Lemna, Spirodela, Wolffiella, Wolffia, Dieffenbachia and Colocasia has been lost or has become a pseudogene and has only been retained in Symplocarpus. In the Araceae chloroplast DNA (cpDNA), psbZ is retained. However, psbZ duplication occurred in Wolffia species and tandem repeats were noted around the duplication regions. A comparison of the IR junction in Araceae species revealed the presence of ycf1 and rps15 in the small single copy region, whereas duckweed species contained ycf1 and rps15 in the IR region. The phylogenetic analyses of the chloroplast genomes revealed that Symplocarpus are a basal group and are sister to the other Araceae species. Consequently, infA deletion or pseudogene events in Araceae occurred after the divergence of Symplocarpus and aquatic plants (duckweeds) in Araceae and duplication events of rps15 and ycf1 occurred in the IR region. PMID:29144427

  8. The Chloroplast Genome of Symplocarpus renifolius: A Comparison of Chloroplast Genome Structure in Araceae.

    Science.gov (United States)

    Choi, Kyoung Su; Park, Kyu Tae; Park, SeonJoo

    2017-11-16

    Symplocarpus renifolius is a member of Araceae family that is extraordinarily diverse in appearance. Previous studies on chloroplast genomes in Araceae were focused on duckweeds (Lemnoideae) and root crops (Colocasia, commonly known as taro). Here, we determined the chloroplast genome of Symplocarpus renifolius and compared the factors, such as genes and inverted repeat (IR) junctions and performed phylogenetic analysis using other Araceae species. The chloroplast genome of S. renifolius is 158,521 bp and includes 113 genes. A comparison among the Araceae chloroplast genomes showed that infA in Lemna, Spirodela, Wolffiella, Wolffia, Dieffenbachia and Colocasia has been lost or has become a pseudogene and has only been retained in Symplocarpus. In the Araceae chloroplast DNA (cpDNA), psbZ is retained. However, psbZ duplication occurred in Wolffia species and tandem repeats were noted around the duplication regions. A comparison of the IR junction in Araceae species revealed the presence of ycf1 and rps15 in the small single copy region, whereas duckweed species contained ycf1 and rps15 in the IR region. The phylogenetic analyses of the chloroplast genomes revealed that Symplocarpus are a basal group and are sister to the other Araceae species. Consequently, infA deletion or pseudogene events in Araceae occurred after the divergence of Symplocarpus and aquatic plants (duckweeds) in Araceae and duplication events of rps15 and ycf1 occurred in the IR region.

  9. Local chromatin structure of heterochromatin regulates repeated DNA stability, nucleolus structure, and genome integrity

    Energy Technology Data Exchange (ETDEWEB)

    Peng, Jamy C. [Univ. of California, Berkeley, CA (United States)]

    2007-01-01

    Heterochromatin constitutes a significant portion of the genome in higher eukaryotes; approximately 30% in Drosophila and human. Heterochromatin contains a high repeat DNA content and a low density of protein-encoding genes. In contrast, euchromatin is composed mostly of unique sequences and contains the majority of single-copy genes. Genetic and cytological studies demonstrated that heterochromatin exhibits regulatory roles in chromosome organization, centromere function and telomere protection. As an epigenetically regulated structure, heterochromatin formation is not defined by any DNA sequence consensus. Heterochromatin is characterized by its association with nucleosomes containing methylated-lysine 9 of histone H3 (H3K9me), heterochromatin protein 1 (HP1) that binds H3K9me, and Su(var)3-9, which methylates H3K9 and binds HP1. Heterochromatin formation and functions are influenced by HP1, Su(var)3-9, and the RNA interference (RNAi) pathway. My thesis project investigates how heterochromatin formation and function impact nuclear architecture, repeated DNA organization, and genome stability in Drosophila melanogaster. H3K9me-based chromatin reduces extrachromosomal DNA formation; most likely by restricting the access of repair machineries to repeated DNAs. Reducing extrachromosomal ribosomal DNA stabilizes rDNA repeats and the nucleolus structure. H3K9me-based chromatin also inhibits DNA damage in heterochromatin. Cells with compromised heterochromatin structure, due to Su(var)3-9 or dcr-2 (a component of the RNAi pathway) mutations, display severe DNA damage in heterochromatin compared to wild type. In these mutant cells, accumulated DNA damage leads to chromosomal defects such as translocations, defective DNA repair response, and activation of the G2-M DNA repair and mitotic checkpoints that ensure cellular and animal viability. My thesis research suggests that DNA replication, repair, and recombination mechanisms in heterochromatin differ from those in

  10. Sequence analysis and structure prediction of type II Pseudomonas sp. USM 4–55 PHA synthase and an insight into its catalytic mechanism

    Directory of Open Access Journals (Sweden)

    Ahmad Khairudin Nurul

    2006-11-01

    Background: Polyhydroxyalkanoates (PHA) are biodegradable polyesters derived from many microorganisms such as the pseudomonads. These polyesters are in great demand especially in the packaging industries, the medical line as well as the paint industries. The enzyme responsible for catalyzing the formation of PHA is PHA synthase. Because structural information is limited, its functional properties, including catalysis, are poorly understood. Therefore, this study seeks to investigate the structural properties as well as its catalytic mechanism by predicting the three-dimensional (3D) model of the Type II Pseudomonas sp. USM 4–55 PHA synthase 1 (PhaC1P.sp USM 4–55). Results: Sequence analysis demonstrated that PhaC1P.sp USM 4–55 lacked similarity with all known structures in databases. PSI-BLAST and HMM Superfamily analyses demonstrated that this enzyme belongs to the alpha/beta hydrolase fold family. Threading approach revealed that the most suitable template to use was the human gastric lipase (PDB ID: 1HLG). The superimposition of the predicted PhaC1P.sp USM 4–55 model with 1HLG covering 86.2% of the backbone atoms showed an RMSD of 1.15 Å. The catalytic residues, comprising Cys296, Asp451 and His479, were found to be conserved and located adjacent to each other. In addition to this, an extension to the catalytic mechanism was also proposed whereby two tetrahedral intermediates were believed to form during the PHA biosynthesis. These transition state intermediates were further postulated to be stabilized by the formation of oxyanion holes. Based on the sequence analysis and the deduced model, Ser297 was postulated to contribute to the formation of the oxyanion hole. Conclusion: The 3D model of the core region of PhaC1P.sp USM 4–55 from residue 267 to residue 484 was developed using computational techniques and the locations of the catalytic residues were identified. Results from this study for the first time highlighted Ser297 potentially

  11. A sequence-based survey of the complex structural organization of tumor genomes

    Energy Technology Data Exchange (ETDEWEB)

    Collins, Colin; Raphael, Benjamin J.; Volik, Stanislav; Yu, Peng; Wu, Chunxiao; Huang, Guiqing; Linardopoulou, Elena V.; Trask, Barbara J.; Waldman, Frederic; Costello, Joseph; Pienta, Kenneth J.; Mills, Gordon B.; Bajsarowicz, Krystyna; Kobayashi, Yasuko; Sridharan, Shivaranjani; Paris, Pamela; Tao, Quanzhou; Aerni, Sarah J.; Brown, Raymond P.; Bashir, Ali; Gray, Joe W.; Cheng, Jan-Fang; de Jong, Pieter; Nefedov, Mikhail; Ried, Thomas; Padilla-Nash, Hesed M.; Collins, Colin C.

    2008-04-03

    The genomes of many epithelial tumors exhibit extensive chromosomal rearrangements. All classes of genome rearrangements can be identified using End Sequencing Profiling (ESP), which relies on paired-end sequencing of cloned tumor genomes. In this study, brain, breast, ovary and prostate tumors along with three breast cancer cell lines were surveyed with ESP yielding the largest available collection of sequence-ready tumor genome breakpoints and providing evidence that some rearrangements may be recurrent. Sequencing and fluorescence in situ hybridization (FISH) confirmed translocations and complex tumor genome structures that include coamplification and packaging of disparate genomic loci with associated molecular heterogeneity. Comparison of the tumor genomes suggests recurrent rearrangements. Some are likely to be novel structural polymorphisms, whereas others may be bona fide somatic rearrangements. A recurrent fusion transcript in breast tumors and a constitutional fusion transcript resulting from a segmental duplication were identified. Analysis of end sequences for single nucleotide polymorphisms (SNPs) revealed candidate somatic mutations and an elevated rate of novel SNPs in an ovarian tumor. These results suggest that the genomes of many epithelial tumors may be far more dynamic and complex than previously appreciated and that genomic fusions including fusion transcripts and proteins may be common, possibly yielding tumor-specific biomarkers and therapeutic targets.

  12. From structure prediction to genomic screens for novel non-coding RNAs.

    Science.gov (United States)

    Gorodkin, Jan; Hofacker, Ivo L

    2011-08-01

    Non-coding RNAs (ncRNAs) are receiving more and more attention not only as an abundant class of genes, but also as regulatory structural elements (some located in mRNAs). A key feature of RNA function is its structure. Computational methods were developed early for folding and prediction of RNA structure with the aim of assisting in functional analysis. With the discovery of more and more ncRNAs, it has become clear that a large fraction of these are highly structured. Interestingly, a large part of the structure is comprised of regular Watson-Crick and GU wobble base pairs. This and the increased amount of available genomes have made it possible to employ structure-based methods for genomic screens. The field has moved from folding prediction of single sequences to computational screens for ncRNAs in genomic sequence using the RNA structure as the main characteristic feature. Whereas early methods focused on energy-directed folding of single sequences, comparative analysis based on structure preserving changes of base pairs has been efficient in improving accuracy, and today this constitutes a key component in genomic screens. Here, we cover the basic principles of RNA folding and touch upon some of the concepts in current methods that have been applied in genomic screens for de novo RNA structures in searches for novel ncRNA genes and regulatory RNA structure on mRNAs. We discuss the strengths and weaknesses of the different strategies and how they can complement each other.

  13. Genomes

    National Research Council Canada - National Science Library

    Brown, T. A. (Terence A.)

    2002-01-01

    ... of genome expression and replication processes, and transcriptomics and proteomics. This text is richly illustrated with clear, easy-to-follow, full color diagrams, which are downloadable from the book's website...

  14. Large-scale trends in the evolution of gene structures within 11 animal genomes.

    Directory of Open Access Journals (Sweden)

    Mark Yandell

    2006-03-01

    We have used the annotations of six animal genomes (Homo sapiens, Mus musculus, Ciona intestinalis, Drosophila melanogaster, Anopheles gambiae, and Caenorhabditis elegans) together with the sequences of five unannotated Drosophila genomes to survey changes in protein sequence and gene structure over a variety of timescales--from the less than 5 million years since the divergence of D. simulans and D. melanogaster to the more than 500 million years that have elapsed since the Cambrian explosion. To do so, we have developed a new open-source software library called CGL (for "Comparative Genomics Library"). Our results demonstrate that change in intron-exon structure is gradual, clock-like, and largely independent of coding-sequence evolution. This means that genome annotations can be used in new ways to inform, corroborate, and test conclusions drawn from comparative genomics analyses that are based upon protein and nucleotide sequence similarities.

  15. Chiral hydroxylation at the mononuclear nonheme Fe(II) center of 4-(S) hydroxymandelate synthase--a structure-activity relationship analysis.

    Directory of Open Access Journals (Sweden)

    Cristiana M L Di Giuro

    Full Text Available (S)-Hydroxymandelate synthase (Hms) is a nonheme Fe(II)-dependent dioxygenase that catalyzes the oxidation of 4-hydroxyphenylpyruvate to (S)-4-hydroxymandelate by molecular oxygen. In this work, the substrate promiscuity of Hms is characterized in order to assess its potential for the biosynthesis of chiral α-hydroxy acids. Enzyme kinetic analyses, the characterization of product spectra, quantitative structure-activity relationship (QSAR) analyses and in silico docking studies are used to characterize the impact of substrate properties on particular steps of catalysis. Hms is found to accept a range of α-oxo acids, whereby the presence of an aromatic substituent is crucial for efficient substrate turnover. A hydrophobic substrate binding pocket is identified as the likely determinant of substrate specificity. Upon introduction of a steric barrier, which is suspected to obstruct the accommodation of the aromatic ring in the hydrophobic pocket during the final hydroxylation step, the racemization of product is obtained. A steady state kinetic analysis reveals that the turnover number of Hms strongly correlates with substrate hydrophobicity. The analysis of product spectra demonstrates high regioselectivity of oxygenation and a strong coupling efficiency of C-C bond cleavage and subsequent hydroxylation for the tested substrates. Based on these findings the structural basis of enantioselectivity and enzymatic activity is discussed.

  16. Effects of aminoguanidine, a potent nitric oxide synthase inhibitor, on myocardial and organ structure in a rat model of hemorrhagic shock

    Directory of Open Access Journals (Sweden)

    Mona M Soliman

    2014-01-01

    Full Text Available Background: Nitric oxide (NO) has been shown to increase following hemorrhagic shock (HS). Peroxynitrite, produced by the reaction of NO with reactive oxygen species, leads to nitrosative stress-mediated organ injury. We examined the protective effects of a potent inhibitor of NO synthase, aminoguanidine (AG), on myocardial and multiple organ structure in a rat model of HS. Materials and Methods: Male Sprague Dawley rats (300-350 g) were assigned to 3 experimental groups (n = 6 per group): (1) normotensive rats (N), (2) HS rats and (3) HS rats treated with AG (HS-AG). Rats were hemorrhaged over 60 min to reach a mean arterial blood pressure of 40 mmHg. Rats were treated with 1 ml of 60 mg/kg AG intra-arterially after 60 min HS. Resuscitation was performed in vivo by the reinfusion of the shed blood for 30 min to restore normotension. Biopsy samples were taken for light and electron microscopy. Results: Histological examination of hemorrhagic shocked untreated rats revealed structural damage. Less histological damage was observed in multiple organs in AG-treated rats. AG treatment decreased the number of inflammatory cells and mitochondrial swelling in myocardial cells. Conclusion: AG treatment reduced microscopic damage and injury in multiple organs in an HS model in rats.

  17. Structural analysis of a 3-deoxy-D-arabino-heptulosonate 7-phosphate synthase with an N-terminal chorismate mutase-like regulatory domain

    Energy Technology Data Exchange (ETDEWEB)

    Light, Samuel H.; Halavaty, Andrei S.; Minasov, George; Shuvalova, Ludmilla; Anderson, Wayne F. (NWU)

    2012-06-27

    3-Deoxy-D-arabino-heptulosonate 7-phosphate synthase (DAHPS) catalyzes the first step in the biosynthesis of a number of aromatic metabolites. Likely because this reaction is situated at a pivotal biosynthetic gateway, several DAHPS classes distinguished by distinct mechanisms of allosteric regulation have independently evolved. One class of DAHPSs contains a regulatory domain with sequence homology to chorismate mutase - an enzyme further downstream of DAHPS that catalyzes the first committed step in tyrosine/phenylalanine biosynthesis - and is inhibited by chorismate mutase substrate (chorismate) and product (prephenate). Described in this work, structures of the Listeria monocytogenes chorismate/prephenate regulated DAHPS in complex with Mn²⁺ and Mn²⁺ + phosphoenolpyruvate reveal an unusual quaternary architecture: DAHPS domains assemble as a tetramer, from either side of which chorismate mutase-like (CML) regulatory domains asymmetrically emerge to form a pair of dimers. This domain organization suggests that chorismate/prephenate binding promotes a stable interaction between the discrete regulatory and catalytic domains and supports a mechanism of allosteric inhibition similar to tyrosine/phenylalanine control of a related DAHPS class. We argue that the structural similarity of chorismate mutase enzyme and CML regulatory domain provides a unique opportunity for the design of a multitarget antibacterial.

  18. Structural and mechanistic studies on carboxymethylproline synthase (CarB), a unique member of the crotonase superfamily catalyzing the first step in carbapenem biosynthesis.

    Science.gov (United States)

    Sleeman, Mark C; Sorensen, John L; Batchelar, Edward T; McDonough, Michael A; Schofield, Christopher J

    2005-10-14

    The first step in the biosynthesis of the medicinally important carbapenem family of beta-lactam antibiotics is catalyzed by carboxymethylproline synthase (CarB), a unique member of the crotonase superfamily. CarB catalyzes formation of (2S,5S)-carboxymethylproline [(2S,5S)-t-CMP] from malonyl-CoA and l-glutamate semialdehyde. In addition to using a cosubstrate, CarB catalyzes C-C and C-N bond formation processes as well as an acyl-coenzyme A hydrolysis reaction. We describe the crystal structure of CarB in the presence and absence of acetyl-CoA at 2.24 Å and 3.15 Å resolution, respectively. The structures reveal that CarB contains a conserved oxy-anion hole probably required for decarboxylation of malonyl-CoA and stabilization of the resultant enolate. Comparison of the structures reveals that conformational changes (involving His(229)) in the cavity predicted to bind l-glutamate semialdehyde occur on (co)substrate binding. Mechanisms for the formation of the carboxymethylproline ring are discussed in the light of the structures and the accompanying studies using isotopically labeled substrates; cyclization via 1,4-addition is consistent with the observed labeling results (providing that hydrogen exchange at the C-6 position of carboxymethylproline does not occur). The side chain of Glu(131) appears to be positioned to be involved in hydrolysis of the carboxymethylproline-CoA ester intermediate. Labeling experiments ruled out the possibility that hydrolysis proceeds via an anhydride in which water attacks a carbonyl derived from Glu(131), as proposed for 3-hydroxyisobutyryl-CoA hydrolase. The structural work will aid in mutagenesis studies directed at altering the selectivity of CarB to provide intermediates for the production of clinically useful carbapenems.

  19. Structural constraints in the packaging of bluetongue virus genomic segments

    OpenAIRE

    Burkhardt, Christiane; Sung, Po-Yu; Celma, Cristina C.; Roy, Polly

    2014-01-01

    The mechanism used by bluetongue virus (BTV) to ensure the sorting and packaging of its 10 genomic segments is still poorly understood. In this study, we investigated the packaging constraints for two BTV genomic segments from two different serotypes. Segment 4 (S4) of BTV serotype 9 was mutated sequentially and packaging of mutant ssRNAs was investigated by two newly developed RNA packaging assay systems, one in vivo and the other in vitro. Modelling of the mutated ssRNA followed by bioche...

  20. Comparative genomics of 274 Vibrio cholerae genomes reveals mobile functions structuring three niche dimensions

    NARCIS (Netherlands)

    Dutilh, Bas E; Thompson, Cristiane C; Vicente, Ana C P; Marin, Michel A; Lee, Clarence; Silva, Genivaldo G Z; Schmieder, Robert; Andrade, Bruno G N; Chimetto, Luciane; Cuevas, Daniel; Garza, Daniel R; Okeke, Iruka N; Aboderin, Aaron Oladipo; Spangler, Jessica; Ross, Tristen; Dinsdale, Elizabeth A; Thompson, Fabiano L; Harkins, Timothy T; Edwards, Robert A

    2014-01-01

    BACKGROUND: Vibrio cholerae is a globally dispersed pathogen that has evolved with humans for centuries, but also includes non-pathogenic environmental strains. Here, we identify the genomic variability underlying this remarkable persistence across the three major niche dimensions space, time, and

  1. The discrepancies in the results of bioinformatics tools for genomic structural annotation

    Science.gov (United States)

    Pawełkowicz, Magdalena; Nowak, Robert; Osipowski, Paweł; Rymuszka, Jacek; Świerkula, Katarzyna; Wojcieszek, Michał; Przybecki, Zbigniew

    2014-11-01

    A major focus of sequencing projects is to identify genes in genomes. However, it is necessary to define the variety of genes and the criteria for identifying them. In this work we present discrepancies and dependencies arising from the application of different bioinformatic programs for structural annotation to the cucumber data set from the Polish Consortium of Cucumber Genome Sequencing. We used Fgenesh, GenScan and GeneMark for automated structural annotation, and the results were compared with the reference annotation.
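
    A comparison of predicted and reference annotations of the kind described above is often summarized at the exon level. The following is a minimal sketch under stated assumptions: exons are represented as `(seqid, start, end, strand)` tuples, only exact coordinate matches count as correct, and the function name `exon_level_accuracy` is hypothetical. It is not the procedure used in the study, which the abstract does not specify.

```python
def exon_level_accuracy(predicted, reference):
    """Compare two exon sets and return (sensitivity, precision).

    Each exon is a (seqid, start, end, strand) tuple; this representation
    is an assumption for illustration. An exon counts as correct only if
    all four fields match a reference exon exactly.
    """
    pred, ref = set(predicted), set(reference)
    exact = pred & ref
    # Sensitivity: fraction of reference exons that were recovered.
    sensitivity = len(exact) / len(ref) if ref else 0.0
    # Precision: fraction of predicted exons that are in the reference.
    precision = len(exact) / len(pred) if pred else 0.0
    return sensitivity, precision


# Toy data, invented purely to exercise the function.
reference = [("chr1", 100, 200, "+"), ("chr1", 300, 400, "+"), ("chr1", 500, 650, "+")]
predicted = [("chr1", 100, 200, "+"), ("chr1", 300, 410, "+")]
sens, prec = exon_level_accuracy(predicted, reference)
```

    Running the same comparison for each gene finder against one reference makes disagreements between tools visible as differences in the two ratios.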

  2. Studying Cattle Genomic Structural Variations in the Green Economy Era

    Science.gov (United States)

    Transgenic cattle carrying multiple genomic modifications have been produced by serial rounds of somatic cell chromatin transfer (cloning) of sequentially genetically targeted somatic cells. However, cloning efficiency tends to decline with the increase of rounds of cloning. It is possible that mult...

  3. Structures of dihydrofolate reductase-thymidylate synthase of Trypanosoma cruzi in the folate-free state and in complex with two antifolate drugs, trimetrexate and methotrexate

    Energy Technology Data Exchange (ETDEWEB)

    Senkovich, Olga; Schormann, Norbert; Chattopadhyay, Debasish; (UAB)

    2010-11-22

    The flagellate protozoan parasite Trypanosoma cruzi is the pathogenic agent of Chagas disease (also called American trypanosomiasis), which causes approximately 50 000 deaths annually. The disease is endemic in South and Central America. The parasite is usually transmitted by a blood-feeding insect vector, but can also be transmitted via blood transfusion. In the chronic form, Chagas disease causes severe damage to the heart and other organs. There is no satisfactory treatment for chronic Chagas disease and no vaccine is available. There is an urgent need for the development of chemotherapeutic agents for the treatment of T. cruzi infection and therefore for the identification of potential drug targets. The dihydrofolate reductase activity of T. cruzi, which is expressed as part of a bifunctional enzyme, dihydrofolate reductase-thymidylate synthase (DHFR-TS), is a potential target for drug development. In order to gain a detailed understanding of the structure-function relationship of T. cruzi DHFR, the three-dimensional structure of this protein in complex with various ligands is being studied. Here, the crystal structures of T. cruzi DHFR-TS with three different compositions of the DHFR domain are reported: the folate-free state, the complex with the lipophilic antifolate trimetrexate (TMQ) and the complex with the classical antifolate methotrexate (MTX). These structures reveal that the enzyme is a homodimer with substantial interactions between the two TS domains of neighboring subunits. In contrast to the enzymes from Cryptosporidium hominis and Plasmodium falciparum, the DHFR and TS active sites of T. cruzi lie on the same side of the monomer. As in other parasitic DHFR-TS proteins, the N-terminal extension of the T. cruzi enzyme is involved in extensive interactions between the two domains. The DHFR active site of the T. cruzi enzyme shows subtle differences compared with its human counterpart. These differences may be exploited for the development of

  4. Population Structure Analysis of Bull Genomes of European and Western Ancestry

    DEFF Research Database (Denmark)

    Chung, Neo Christopher; Szyda, Joanna; Frąszczak, Magdalena

    2017-01-01

    for individual-specific allele frequencies that directly capture a wide range of complex structure from genome-wide genotypes. As measured by magnitude of differentiation, selection pressure on SNPs within genes is substantially greater than that on intergenic regions. Additionally, broad regions of chromosome 6...... harboring largest genetic differentiation suggest positive selection underlying population structure. We carried out gene set analysis using SNP annotations to identify enriched functional categories such as energy-related processes and multiple development stages. Our population structure analysis of bull...... genomes can support genetic management strategies that capture structural complexity and promote sustainable genetic breadth....

  5. A soluble starch synthase I gene, IbSSI, alters the content, composition, granule size and structure of starch in transgenic sweet potato.

    Science.gov (United States)

    Wang, Yannan; Li, Yan; Zhang, Huan; Zhai, Hong; Liu, Qingchang; He, Shaozhen

    2017-05-24

    Soluble starch synthase I (SSI) is a key enzyme in the biosynthesis of plant amylopectin. In this study, a gene named IbSSI was cloned from sweet potato, an important starch crop. A high expression level of IbSSI was detected in the leaves and storage roots of the sweet potato. Its overexpression significantly increased the content and granule size of starch and the proportion of amylopectin by up-regulating starch biosynthetic genes in the transgenic plants compared with wild-type plants (WT) and RNA interference plants. The frequency of chains with degree of polymerization (DP) 5-8 decreased in the amylopectin fraction of starch, whereas the proportion of chains with DP 9-25 increased in the IbSSI-overexpressing plants compared with WT plants. Further analysis demonstrated that IbSSI was responsible for the synthesis of chains with DP ranging from 9 to 17, which represents a different chain length spectrum in vivo from its counterparts in rice and wheat. These findings suggest that the IbSSI gene plays important roles in determining the content, composition, granule size and structure of starch in sweet potato. This gene may be utilized to improve the content and quality of starch in sweet potato and other plants.

  6. RNA 3D modules in genome-wide predictions of RNA 2D structure

    DEFF Research Database (Denmark)

    Theis, Corinna; Zirbel, Craig L; Zu Siederdissen, Christian Höner

    2015-01-01

    Recent experimental and computational progress has revealed a large potential for RNA structure in the genome. This has been driven by computational strategies that exploit multiple genomes of related organisms to identify common sequences and secondary structures. However, these computational...... approaches have two main challenges: they are computationally expensive and they have a relatively high false discovery rate (FDR). Simultaneously, RNA 3D structure analysis has revealed modules composed of non-canonical base pairs which occur in non-homologous positions, apparently by independent evolution....... These modules can, for example, occur inside structural elements which in RNA 2D predictions appear as internal loops. Hence one question is if the use of such RNA 3D information can improve the prediction accuracy of RNA secondary structure at a genome-wide level. Here, we use RNAz in combination with 3D...

  7. Structural genomic variation as risk factor for idiopathic recurrent miscarriage

    DEFF Research Database (Denmark)

    Nagirnaja, Liina; Palta, Priit; Kasak, Laura

    2014-01-01

    Recurrent miscarriage (RM) is a multifactorial disorder with acknowledged genetic heritability that affects ∼3% of couples aiming at childbirth. As copy number variants (CNVs) have been shown to contribute to reproductive disease susceptibility, we aimed to describe genome-wide profile of CNVs an...... similar low duplication prevalence worldwide (0.7%-1.2%) compared to RM cases of this study (6.6%-7.5%). The CNV disrupts PDZD2 and GOLPH3 genes predominantly expressed in placenta and it may represent a novel risk factor for pregnancy complications....... and identify common rearrangements modulating risk to RM. Genome-wide screening of Estonian RM patients and fertile controls identified excessive cumulative burden of CNVs (5.4 and 6.1 Mb per genome) in two RM cases possibly increasing their individual disease risk. Functional profiling of all rearranged genes...... within RM study group revealed significant enrichment of loci related to innate immunity and immunoregulatory pathways essential for immune tolerance at fetomaternal interface. As a major finding, we report a multicopy duplication (61.6 kb) at 5p13.3 conferring increased maternal risk to RM in Estonia...

  8. Structure and genome organization of AFV2, a novel archaeal lipothrixvirus with unusual terminal and core structures

    DEFF Research Database (Denmark)

    Häring, Monika; Vestergaard, Gisle Alberg; Brügger, Kim

    2005-01-01

    A novel filamentous virus, AFV2, from the hyperthermophilic archaeal genus Acidianus shows structural similarity to lipothrixviruses but differs from them in its unusual terminal and core structures. The double-stranded DNA genome contains 31,787 bp and carries eight open reading frames homologous...

  9. Split photosystem protein, linear-mapping topology, and growth of structural complexity in the plastid genome of Chromera velia

    KAUST Repository

    Janouškovec, Jan

    2013-08-22

    The canonical photosynthetic plastid genomes consist of a single circular-mapping chromosome that encodes a highly conserved protein core, involved in photosynthesis and ATP generation. Here, we demonstrate that the plastid genome of the photosynthetic relative of apicomplexans, Chromera velia, departs from this view in several unique ways. Core photosynthesis proteins PsaA and AtpB have been broken into two fragments, which we show are independently transcribed, oligoU-tailed, translated, and assembled into functional photosystem I and ATP synthase complexes. Genome-wide transcription profiles support expression of many other highly modified proteins, including several that contain extensions amounting to hundreds of amino acids in length. Canonical gene clusters and operons have been fragmented and reshuffled into novel putative transcriptional units. Massive genomic coverage by paired-end reads, coupled with pulsed-field gel electrophoresis and polymerase chain reaction, consistently indicate that the C. velia plastid genome is linear-mapping, a unique state among all plastids. Abundant intragenomic duplication probably mediated by recombination can explain protein splits, extensions, and genome linearization and is perhaps the key driving force behind the many features that defy the conventional ways of plastid genome architecture and function. © The Author 2013.

  10. G2S: A web-service for annotating genomic variants on 3D protein structures.

    Science.gov (United States)

    Wang, Juexin; Sheridan, Robert; Sumer, S Onur; Schultz, Nikolaus; Xu, Dong; Gao, Jianjiong

    2018-01-27

    Accurately mapping and annotating genomic locations on 3D protein structures is a key step in structure-based analysis of genomic variants detected by recent large-scale sequencing efforts. There are several mapping resources currently available, but none of them provides a web API (Application Programming Interface) that supports programmatic access. We present G2S, a real-time web API that provides automated mapping of genomic variants on 3D protein structures. G2S can align genomic locations of variants, protein locations, or protein sequences to protein structures and retrieve the mapped residues from structures. The G2S API follows a REST-inspired design and can be used by various clients such as web browsers, command terminals, programming languages and other bioinformatics tools for bringing 3D structures into genomic variant analysis. The web server and source code are freely available at https://g2s.genomenexus.org. Contact: g2s@genomenexus.org. Supplementary data are available at Bioinformatics online. © The Author (2018). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com
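
    The first step in placing a genomic variant on a structure is converting a genomic coordinate into a protein residue index. The sketch below shows only that arithmetic for a forward-strand transcript. It is not the G2S implementation and does not call the G2S API; the function name `genomic_to_residue` and the exon representation are assumptions made for illustration.

```python
def genomic_to_residue(pos, cds_exons):
    """Map a 1-based genomic position to a 1-based protein residue index.

    cds_exons is a list of (start, end) coding segments, 1-based and
    inclusive, sorted in ascending order on the forward strand (an assumed
    input format). Returns None if pos falls outside the coding sequence.
    """
    offset = 0  # number of coding bases that precede the current segment
    for start, end in cds_exons:
        if start <= pos <= end:
            # 0-based position within the spliced coding sequence
            cds_index = offset + (pos - start)
            # three bases per codon, residues numbered from 1
            return cds_index // 3 + 1
        offset += end - start + 1
    return None


# Toy transcript with two coding segments of 6 and 9 bases (5 codons total).
exons = [(100, 105), (200, 208)]
```

    With these toy exons, position 103 falls in the second codon and position 200 starts the third. A service such as G2S then additionally needs a sequence alignment between the protein and the structure to translate the residue index into structure numbering.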

  11. Three-dimensional Structure of a Viral Genome-delivery Portal Vertex

    Energy Technology Data Exchange (ETDEWEB)

    A Olia; P Prevelige Jr.; J Johnson; G Cingolani

    2011-12-31

    DNA viruses such as bacteriophages and herpesviruses deliver their genome into and out of the capsid through large proteinaceous assemblies, known as portal proteins. Here, we report two snapshots of the dodecameric portal protein of bacteriophage P22. The 3.25-Å-resolution structure of the portal-protein core bound to 12 copies of gene product 4 (gp4) reveals a ~1.1-MDa assembly formed by 24 proteins. Unexpectedly, a lower-resolution structure of the full-length portal protein unveils the unique topology of the C-terminal domain, which forms a ~200-Å-long α-helical barrel. This domain inserts deeply into the virion and is highly conserved in the Podoviridae family. We propose that the barrel domain facilitates genome spooling onto the interior surface of the capsid during genome packaging and, in analogy to a rifle barrel, increases the accuracy of genome ejection into the host cell.

  12. Structure-based mutational studies of O-acetylserine sulfhydrylase reveal the reason for the loss of cysteine synthase complex formation in Brucella abortus.

    Science.gov (United States)

    Dharavath, Sudhaker; Raj, Isha; Gourinath, Samudrala

    2017-03-23

    Cysteine biosynthesis takes place via a two-step pathway in bacteria, fungi, plants and protozoan parasites, but not in humans, and hence, the machinery of cysteine biosynthesis is an opportune target for therapeutics. The decameric cysteine synthase complex (CSC) is formed when the C-terminal tail of serine acetyltransferase (SAT) binds in the active site of O-acetylserine sulfhydrylase (OASS), playing a role in the regulation of this pathway. Here, we show that OASS from Brucella abortus (BaOASS) does not interact with its cognate SAT C-terminal tail. Crystal structures of native BaOASS showed that residues Gln96 and Tyr125 occupy the active-site pocket and interfere with the entry of the SAT C-terminal tail. The BaOASS (Q96A-Y125A) mutant showed relatively strong binding (Kd = 32.4 μM) to BaSAT C-terminal peptides in comparison with native BaOASS. The mutant structure looks similar except that the active-site pocket has enough space to bind the SAT C-terminal end. Surface plasmon resonance results showed a relatively strong (Kd = 7.3 μM) interaction between BaSAT and the BaOASS (Q96A-Y125A), but no interaction with native BaOASS. Taken together, our observations suggest that the CSC does not form in B. abortus. © 2017 The Author(s); published by Portland Press Limited on behalf of the Biochemical Society.

  13. New families of human regulatory RNA structures identified by comparative analysis of vertebrate genomes

    DEFF Research Database (Denmark)

    Parker, Brian John; Moltke, Ida; Roth, Adam

    2011-01-01

    a comparative method, EvoFam, for genome-wide identification of families of regulatory RNA structures, based on primary sequence and secondary structure similarity. We apply EvoFam to a 41-way genomic vertebrate alignment. Genome-wide, we identify 220 human, high-confidence families outside protein...... identify tens of new families supported by strong evolutionary evidence and other statistical evidence, such as GO term enrichments. For some of these, detailed analysis has led to the formulation of specific functional hypotheses. Examples include two hypothesized auto-regulatory feedback mechanisms: one...... involving six long hairpins in the 3'-UTR of MAT2A, a key metabolic gene that produces the primary human methyl donor S-adenosylmethionine; the other involving a tRNA-like structure in the intron of the tRNA maturation gene POP1. We experimentally validate the predicted MAT2A structures. Finally, we...

  14. Matrix attachment regions and structural colinearity in the genomes of two grass species.

    OpenAIRE

    Avramova, Z; Tikhonov, A; Chen, M; Bennetzen, J L

    1998-01-01

    In order to gain insights into the relationship between spatial organization of the genome and genome function we have initiated studies of the co-linear Sh2/A1-homologous regions of rice (30 kb) and sorghum (50 kb). We have identified the locations of matrix attachment regions (MARs) in these homologous chromosome segments, which could serve as anchors for individual structural units or loops. Despite the fact that the nucleotide sequences serving as MARs were not detectably conserved, the ...

  15. Structured RNAs in the ENCODE selected regions of the human genome

    DEFF Research Database (Denmark)

    Washietl, Stefan; Pedersen, Jakob Skou; Korbel, Jan O

    2007-01-01

    Functional RNA structures play an important role both in the context of noncoding RNA transcripts as well as regulatory elements in mRNAs. Here we present a computational study to detect functional RNA structures within the ENCODE regions of the human genome. Since structural RNAs in general lack...... with the GENCODE annotation points to functional RNAs in all genomic contexts, with a slightly increased density in 3'-UTRs. While we estimate a significant false discovery rate of approximately 50%-70% many of the predictions can be further substantiated by additional criteria: 248 loci are predicted by both RNAz...

  16. CTP synthase forms cytoophidia in the cytoplasm and nucleus

    Energy Technology Data Exchange (ETDEWEB)

    Gou, Ke-Mian [MRC Functional Genomics Unit, Department of Physiology, Anatomy and Genetics, University of Oxford, Oxford OX1 3PT (United Kingdom); State Key Laboratory for Agrobiotechnology, College of Biological Sciences, China Agricultural University, Beijing 100193 (China); Chang, Chia-Chun [Institute of Biotechnology, National Taiwan University, Taipei, Taiwan, ROC (China); Shen, Qing-Ji [MRC Functional Genomics Unit, Department of Physiology, Anatomy and Genetics, University of Oxford, Oxford OX1 3PT (United Kingdom); Sung, Li-Ying, E-mail: liyingsung@ntu.edu.tw [Institute of Biotechnology, National Taiwan University, Taipei, Taiwan, ROC (China); Agricultural Biotechnology Research Center, Academia Sinica, Taipei 115, Taiwan, ROC (China); Liu, Ji-Long, E-mail: jilong.liu@dpag.ox.ac.uk [MRC Functional Genomics Unit, Department of Physiology, Anatomy and Genetics, University of Oxford, Oxford OX1 3PT (United Kingdom)

    2014-04-15

    CTP synthase is an essential metabolic enzyme responsible for the de novo synthesis of CTP. Multiple studies have recently shown that CTP synthase protein molecules form filamentous structures termed cytoophidia or CTP synthase filaments in the cytoplasm of eukaryotic cells, as well as in bacteria. Here we report that CTP synthase can form cytoophidia not only in the cytoplasm, but also in the nucleus of eukaryotic cells. Both glutamine deprivation and glutamine analog treatment promote formation of cytoplasmic cytoophidia (C-cytoophidia) and nuclear cytoophidia (N-cytoophidia). N-cytoophidia are generally shorter and thinner than their cytoplasmic counterparts. In mammalian cells, both CTP synthase 1 and CTP synthase 2 can form cytoophidia. Using live imaging, we have observed that both C-cytoophidia and N-cytoophidia undergo multiple rounds of fusion upon glutamine analog treatment. Our study reveals the coexistence of cytoophidia in the cytoplasm and nucleus, therefore providing a good opportunity to investigate the intracellular compartmentation of CTP synthase. - Highlights: • CTP synthase forms cytoophidia not only in the cytoplasm but also in the nucleus. • Glutamine deprivation and glutamine analogs promote cytoophidium formation. • N-cytoophidia exhibit distinct morphology when compared to C-cytoophidia. • Both CTP synthase 1 and CTP synthase 2 form cytoophidia in mammalian cells. • Fusions of cytoophidia occur in the cytoplasm and nucleus.

  17. Genome-wide identification of structural variants in genes encoding drug targets

    DEFF Research Database (Denmark)

    Rasmussen, Henrik Berg; Dahmcke, Christina Mackeprang

    2012-01-01

    The objective of the present study was to identify structural variants of drug target-encoding genes on a genome-wide scale. We also aimed at identifying drugs that are potentially amenable for individualization of treatments based on knowledge about structural variation in the genes encoding...

  18. IDENTIFICATION AND CHARACTERIZATION OF THE SUCROSE SYNTHASE 2 GENE (Sus2) IN DURUM WHEAT

    Directory of Open Access Journals (Sweden)

    Mariateresa Volpicella

    2016-03-01

    Full Text Available Sucrose transport is the central system for the allocation of carbon resources in vascular plants. Sucrose synthase, which reversibly catalyzes sucrose synthesis and cleavage, represents a key enzyme in the control of the flow of carbon into starch biosynthesis. In the present study the genomic identification and characterization of the Sus2-2A and Sus2-2B genes coding for sucrose synthase in durum wheat (cultivars Ciccio and Svevo) is reported. The genes were analyzed for their expression in different tissues and at different seed maturation stages, in four tetraploid wheat genotypes (Svevo, Ciccio, Primadur and 5-BIL42). The activity of the encoded proteins was evaluated by specific activity assays on endosperm extracts and their structure established by modelling approaches. The combined results of SUS2 expression and activity levels were then considered in the light of their possible involvement in starch yield.

  19. Integrating sequencing technologies in personal genomics: optimal low cost reconstruction of structural variants.

    Directory of Open Access Journals (Sweden)

    Jiang Du

    2009-07-01

    Full Text Available The goal of human genome re-sequencing is obtaining an accurate assembly of an individual's genome. Recently, there has been great excitement in the development of many technologies for this (e.g. medium and short read sequencing from companies such as 454 and SOLiD, and high-density oligo-arrays from Affymetrix and NimbleGen), with even more expected to appear. The costs and sensitivities of these technologies differ considerably from each other. As an important goal of personal genomics is to reduce the cost of re-sequencing to an affordable point, it is worthwhile to consider optimally integrating technologies. Here, we build a simulation toolbox that will help us optimally combine different technologies for genome re-sequencing, especially in reconstructing large structural variants (SVs). SV reconstruction is considered the most challenging step in human genome re-sequencing. (It is sometimes even harder than de novo assembly of small genomes because of the duplications and repetitive sequences in the human genome.) To this end, we formulate canonical problems that are representative of issues in reconstruction and are of small enough scale to be computationally tractable and simulatable. Using semi-realistic simulations, we show how we can combine different technologies to optimally solve the assembly at low cost. With mapability maps, our simulations efficiently handle the inhomogeneous repeat-containing structure of the human genome and the computational complexity of practical assembly algorithms. They quantitatively show how combining different read lengths is more cost-effective than using one length, how an optimal mixed sequencing strategy for reconstructing large novel SVs usually also gives accurate detection of SNPs/indels, how paired-end reads can improve reconstruction efficiency, and how adding in arrays is more efficient than just sequencing for disentangling some complex SVs. Our strategy should facilitate the sequencing of

  20. Crystallization and preliminary crystallographic analysis of mannosyl-3-phosphoglycerate synthase from Rubrobacter xylanophilus

    International Nuclear Information System (INIS)

    Sá-Moura, Bebiana; Albuquerque, Luciana; Empadinhas, Nuno; Costa, Milton S. da; Pereira, Pedro José Barbosa; Macedo-Ribeiro, Sandra

    2008-01-01

    The enzyme mannosyl-3-phosphoglycerate synthase from R. xylanophilus has been expressed, purified and crystallized. The crystals belong to the hexagonal space group P6₅22 and diffract to 2.2 Å resolution. Rubrobacter xylanophilus is the only Gram-positive bacterium known to synthesize the compatible solute mannosylglycerate (MG), which is commonly found in hyperthermophilic archaea and some thermophilic bacteria. Unlike the salt-dependent pattern of accumulation observed in (hyper)thermophiles, in R. xylanophilus MG accumulates constitutively. The synthesis of MG in R. xylanophilus was tracked from GDP-mannose and 3-phosphoglycerate, but the genome sequence of the organism failed to reveal any of the genes known to be involved in this pathway. The native enzyme was purified and its N-terminal sequence was used to identify the corresponding gene (mpgS) in the genome of R. xylanophilus. The gene encodes a highly divergent mannosyl-3-phosphoglycerate synthase (MpgS) without relevant sequence homology to known mannosylphosphoglycerate synthases. In order to understand the specificity and enzymatic mechanism of this novel enzyme, it was expressed in Escherichia coli, purified and crystallized. The crystals thus obtained belonged to the hexagonal space group P6₅22 and contained two protein molecules per asymmetric unit. The structure was solved by SIRAS using a mercury derivative.

  1. Crystallization and preliminary crystallographic analysis of mannosyl-3-phosphoglycerate synthase from Rubrobacter xylanophilus

    Energy Technology Data Exchange (ETDEWEB)

    Sá-Moura, Bebiana [IBMC - Instituto de Biologia Molecular e Celular, Universidade do Porto, Porto (Portugal); Albuquerque, Luciana; Empadinhas, Nuno [Centro de Neurociências e Biologia Celular, Departamento de Zoologia, Universidade de Coimbra, Coimbra (Portugal); Costa, Milton S. da [Departamento de Bioquímica, Universidade de Coimbra, Coimbra (Portugal); Pereira, Pedro José Barbosa; Macedo-Ribeiro, Sandra, E-mail: sribeiro@ibmc.up.pt [IBMC - Instituto de Biologia Molecular e Celular, Universidade do Porto, Porto (Portugal)

    2008-08-01

    The enzyme mannosyl-3-phosphoglycerate synthase from R. xylanophilus has been expressed, purified and crystallized. The crystals belong to the hexagonal space group P6₅22 and diffract to 2.2 Å resolution. Rubrobacter xylanophilus is the only Gram-positive bacterium known to synthesize the compatible solute mannosylglycerate (MG), which is commonly found in hyperthermophilic archaea and some thermophilic bacteria. Unlike the salt-dependent pattern of accumulation observed in (hyper)thermophiles, in R. xylanophilus MG accumulates constitutively. The synthesis of MG in R. xylanophilus was tracked from GDP-mannose and 3-phosphoglycerate, but the genome sequence of the organism failed to reveal any of the genes known to be involved in this pathway. The native enzyme was purified and its N-terminal sequence was used to identify the corresponding gene (mpgS) in the genome of R. xylanophilus. The gene encodes a highly divergent mannosyl-3-phosphoglycerate synthase (MpgS) without relevant sequence homology to known mannosylphosphoglycerate synthases. In order to understand the specificity and enzymatic mechanism of this novel enzyme, it was expressed in Escherichia coli, purified and crystallized. The crystals thus obtained belonged to the hexagonal space group P6₅22 and contained two protein molecules per asymmetric unit. The structure was solved by SIRAS using a mercury derivative.

  2. In silico Prediction and Docking of Tertiary Structure of LuxI, an Inducer Synthase of Vibrio fischeri

    Directory of Open Access Journals (Sweden)

    Mohammed Zaghlool Saeed Al-Khayyat

    2016-05-01

    Full Text Available Background: LuxI is a component of the quorum sensing signaling pathway in Vibrio fischeri responsible for the inducer synthesis that is essential for bioluminescence. Methods: Homology modeling of LuxI was carried out using Phyre2 and refined with the GalaxyWEB server. Five models were generated and evaluated by ERRAT, ANOLEA, QMEAN6, and Procheck. Results: Five refined models were generated by the GalaxyWEB server, with Model 4 having the greatest quality based on the QMEAN6 score of 0.732. ERRAT analysis revealed an overall quality of 98.9%, while the overall quality of the initial model was 54%. The mean force potential energy, as analyzed by ANOLEA, was better compared to the initial model. Stereochemical quality estimation by Procheck showed that the refined Model 4 had a reliable structure, and was therefore submitted to the protein model database. Drug Discovery Workbench V.2 was used to screen 2700 experimental compounds from the DrugBank database to identify inhibitors that can bind to the active site between amino acids 24 and 110. Ten compounds with high negative scores were selected as the best in binding. Conclusion: The model produced, and the predicted acetyltransferase binding site, could be useful in modeling homologous sequences from other microorganisms and the design of new antimicrobials.

  3. From structure prediction to genomic screens for novel non-coding RNAs.

    Directory of Open Access Journals (Sweden)

    Jan Gorodkin

    2011-08-01

    Full Text Available Non-coding RNAs (ncRNAs) are receiving more and more attention not only as an abundant class of genes, but also as regulatory structural elements (some located in mRNAs). A key feature of RNA function is its structure. Computational methods were developed early for folding and prediction of RNA structure with the aim of assisting in functional analysis. With the discovery of more and more ncRNAs, it has become clear that a large fraction of these are highly structured. Interestingly, a large part of the structure is comprised of regular Watson-Crick and GU wobble base pairs. This and the increased amount of available genomes have made it possible to employ structure-based methods for genomic screens. The field has moved from folding prediction of single sequences to computational screens for ncRNAs in genomic sequence using the RNA structure as the main characteristic feature. Whereas early methods focused on energy-directed folding of single sequences, comparative analysis based on structure preserving changes of base pairs has been efficient in improving accuracy, and today this constitutes a key component in genomic screens. Here, we cover the basic principles of RNA folding and touch upon some of the concepts in current methods that have been applied in genomic screens for de novo RNA structures in searches for novel ncRNA genes and regulatory RNA structure on mRNAs. We discuss the strengths and weaknesses of the different strategies and how they can complement each other.
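
    As a rough illustration of single-sequence folding over Watson-Crick and GU wobble pairs, the sketch below counts the maximum number of nested base pairs with a Nussinov-style dynamic program. This is a simplified stand-in chosen for brevity, not the energy-directed folding the abstract refers to; the function name `max_base_pairs` and the `min_loop` parameter are hypothetical.

```python
# Nussinov-style base-pair maximization (a simplification: real folding
# tools minimize free energy rather than count pairs).
# Allowed pairs: Watson-Crick (A-U, G-C) plus the G-U wobble pair.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def max_base_pairs(seq: str, min_loop: int = 3) -> int:
    """Return the maximum number of nested base pairs in an RNA sequence.

    min_loop is the minimum number of unpaired bases enclosed by any pair
    (an assumed hairpin-loop constraint, commonly set to 3).
    """
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        return 0
    # dp[i][j] holds the best pair count for the subsequence seq[i..j].
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]  # case 1: base j stays unpaired
            # case 2: base j pairs with some k, leaving >= min_loop bases between
            for k in range(i, j - min_loop):
                if (seq[k], seq[j]) in PAIRS:
                    left = dp[i][k - 1] if k > i else 0
                    best = max(best, left + 1 + dp[k + 1][j - 1])
            dp[i][j] = best
    return dp[0][n - 1]


if __name__ == "__main__":
    # A small hairpin: G-C, G-C and G-U pairs closing an AAA loop.
    print(max_base_pairs("GGGAAAUCC"))
```

    Counting pairs instead of summing stacking energies keeps the recursion short; comparative methods of the kind described above would additionally score compensatory base-pair changes across aligned sequences.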

  4. Protein Production for Structural Genomics Using E. coli Expression

    OpenAIRE

    Makowska-Grzyska, Magdalena; Kim, Youngchang; Maltseva, Natalia; Li, Hui; Zhou, Min; Joachimiak, Grazyna; Babnigg, Gyorgy; Joachimiak, Andrzej

    2014-01-01

    The goal of structural biology is to reveal details of the molecular structure of proteins in order to understand their function and mechanism. X-ray crystallography and NMR are the two best methods for atomic level structure determination. However, these methods require milligram quantities of proteins. In this chapter a reproducible methodology for large-scale protein production applicable to a diverse set of proteins is described. The approach is based on protein expression in E. coli as a...

  5. Structure and expression profile of the sucrose synthase gene family in the rubber tree: indicative of roles in stress response and sucrose utilization in the laticifers.

    Science.gov (United States)

    Xiao, Xiaohu; Tang, Chaorong; Fang, Yongjun; Yang, Meng; Zhou, Binhui; Qi, Jiyan; Zhang, Yi

    2014-01-01

    Sucrose synthase (Sus, EC 2.4.1.13) is widely recognized as a key enzyme in sucrose metabolism in plants. However, nothing is known about this gene family in Hevea brasiliensis (para rubber tree). Here, we identified six Sus genes in H. brasiliensis that comprise the entire Sus family in this species. Analysis of the gene structure and phylogeny of the Sus genes demonstrates evolutionary conservation in the Sus families across Hevea and other plant species. The expression of Sus genes was investigated via Solexa sequencing and quantitative PCR in various tissues, at various phases of leaf development, and under abiotic stresses and ethylene treatment. The Sus genes exhibited distinct but partially redundant expression profiles. Each tissue has one abundant Sus isoform, with HbSus3, 4 and 5 being the predominant isoforms in latex (cytoplasm of rubber-producing laticifers), bark and root, respectively. HbSus1 and 6 were barely expressed in any tissue examined. In mature leaves (source), all HbSus genes were expressed at low levels, but HbSus3 and 4 were abundantly expressed in immature leaves (sink). Low temperature and drought treatments conspicuously induced HbSus5 expression in root and leaf, suggesting a role in stress responses. HbSus2 and 3 transcripts were decreased by ethylene treatment, consistent with the reduced sucrose-synthesizing activity of Sus enzymes in the latex in response to ethylene stimulation. Our results are beneficial to further determination of functions for the Sus genes in Hevea trees, especially roles in regulating latex regeneration. © 2013 FEBS.

  6. Trehalose metabolism in the blue crab Callinectes sapidus: isolation of multiple structural cDNA isoforms of trehalose-6-phosphate synthase and their expression in muscles.

    Science.gov (United States)

    Shi, Q; Chung, J Sook

    2014-02-15

    Adult blue crab Callinectes sapidus exhibit behavioral and ecological dimorphisms: females migrating from the low salinity water to the high salinity area vs. males remaining in the same areas. The flesh basal muscle of the swimming paddle shows a dimorphic color pattern in that levator (Lev) and depressor (Dep) of females tend to be much darker than those of males, while both genders have the same light colored remoter (Rem) and promoter (Pro). The full-length cDNA sequence of four structural isoforms of trehalose-6-phosphate synthase (TPS) is isolated from chela muscles of an adult female, C. sapidus. Two isoforms of the C. sapidus TPS encode functional domains of TPS and trehalose-6-phosphorylase (TPP) in tandem as a fused gene product of Escherichia coli Ost A and Ost B. The other two isoforms contain only a single TPS domain. In both males and females, the darker (Lev+Dep) muscles exhibit greater amounts of trehalose, TPS and trehalase activities than the light colored (Rem+Pro). The fact that adult females show higher levels of trehalase activity in the basal muscles and of glucose in Lev+Dep than those of adult males suggests that there may be a metabolic dimorphism. Moreover, the involvement of trehalose in energy metabolism that was examined under the condition of strenuous swimming activity mimicked in adult females demonstrates the intrinsic trehalose metabolism in Lev+Dep, which subsequently results in hemolymphatic hyperglycemia and hyperlactemia. Our data support that trehalose serves as an additional carbohydrate source of hemolymphatic hyperglycemia in this species. Behavioral and ecological dimorphisms of C. sapidus adults may be supported by a functional dimorphism in energy metabolism. Copyright © 2013 Elsevier B.V. All rights reserved.

  7. The FEI2-SOS5 pathway and CELLULOSE SYNTHASE 5 are required for cellulose biosynthesis in the Arabidopsis seed coat and affect pectin mucilage structure.

    Science.gov (United States)

    Harpaz-Saad, Smadar; Western, Tamara L; Kieber, Joseph J

    2012-02-01

    A common adaptation in angiosperms is the deposition of hydrophilic mucilage into the apoplast of seed coat epidermal cells during the course of their differentiation. Upon imbibition, seed mucilage, composed mainly of carbohydrates (i.e. pectins, hemicelluloses and glycans) expands rapidly, encapsulating the seed and aiding in seed dispersal and germination. The FEI1/FEI2 receptor-like kinases and the SOS5 extracellular GPI-anchored protein were previously shown to act on a pathway regulating cellulose biosynthesis during Arabidopsis root elongation. In the highlighted study, we demonstrated that FEI2 and SOS5 regulate the production of the cellulosic rays deposited across the inner adherent-layer of seed mucilage. Mutations in either fei2 or sos5 disrupted the formation of rays, which was associated with an increase in the soluble, outer layer of pectin mucilage and accompanied by a reduction in the inner adherent-layer. Mutations in CELLULOSE SYNTHASE 5 also led to reduced rays and mal-partitioning of the pectic component of seed mucilage, further establishing a structural role for cellulose in seed mucilage. Here, we show that FEI2 expressed from a CaMV 35S promoter complemented both root and seed mucilage defects of the fei1 fei2 double mutant. In contrast, expression of FEI1 from a 35S promoter complemented the root, but not the seed phenotype of the fei1 fei2 double mutant, suggesting that unlike in the root, FEI2 plays a unique and non-redundant role in the regulation of cellulose synthesis in seed mucilage. Altogether, these data suggest a novel role for cellulose in anchoring the pectic component of seed mucilage to the seed surface and indicate that the FEI2 protein has a function distinct from that of FEI1, despite the high sequence similarity of these RLKs.

  8. Exploring the role of genome and structural ions in preventing viral capsid collapse during dehydration

    Science.gov (United States)

    Martín-González, Natalia; Guérin Darvas, Sofía M.; Durana, Aritz; Marti, Gerardo A.; Guérin, Diego M. A.; de Pablo, Pedro J.

    2018-03-01

    Even though viruses evolve mainly in liquid milieu, their horizontal transmission routes often include episodes of dry environment. Along their life cycle, some insect viruses, such as viruses from the Dicistroviridae family, withstand dehydrated conditions with presently unknown consequences to their structural stability. Here, we use atomic force microscopy to monitor the structural changes of viral particles of Triatoma virus (TrV) after desiccation. Our results demonstrate that TrV capsids preserve their genome inside, conserving their height after exposure to dehydrating conditions, which is in stark contrast with other viruses that expel their genome when desiccated. Moreover, empty capsids (without genome) resulted in collapsed particles after desiccation. We also explored the role of structural ions in the dehydration process of the virions (capsid containing genome) by chelating the accessible cations from the external solvent milieu. We observed that ion suppression helps to keep the virus height upon desiccation. Our results show that under drying conditions, the genome of TrV prevents the capsid from collapsing during dehydration, while the structural ions are responsible for promoting solvent exchange through the virion wall.

  9. The admixed population structure in Danish Jersey dairy cattle challenges accurate genomic predictions

    DEFF Research Database (Denmark)

    Thomasen, Jørn Rind; Sørensen, Anders Christian; Su, Guosheng

    2013-01-01

    The main purpose of this study is to evaluate whether the population structure in Danish Jersey known from the history of the breed also is reflected in the markers. This is done by comparing the linkage disequilibrium and persistence of phase for subgroups of Jersey animals with high proportions...... of Danish or US origin. Furthermore, it is investigated whether a model explicitly incorporating breed origin of animals, inferred either through the known pedigree or from SNP marker data, leads to improved genomic predictions compared to a model ignoring breed origin. The study of the population structure...... origin were analyzed and compared to a basic genomic model that assumes a homogeneous breed structure. The main finding in this study is that the importation of germ plasma from the US Jersey population is readily reflected in the genomes of modern Danish Jersey animals. Firstly, linkage disequilibrium...

  10. GeneViTo: Visualizing gene-product functional and structural features in genomic datasets

    Directory of Open Access Journals (Sweden)

    Promponas Vasilis J

    2003-10-01

    Full Text Available Abstract Background The availability of increasing amounts of sequence data from completely sequenced genomes boosts the development of new computational methods for automated genome annotation and comparative genomics. Therefore, there is a need for tools that facilitate the visualization of raw data and results produced by bioinformatics analysis, providing new means for interactive genome exploration. Visual inspection can be used as a basis to assess the quality of various analysis algorithms and to aid in-depth genomic studies. Results GeneViTo is a JAVA-based computer application that serves as a workbench for genome-wide analysis through visual interaction. The application deals with various experimental information concerning both DNA and protein sequences (derived from public sequence databases or proprietary data sources) and meta-data obtained by various prediction algorithms, classification schemes or user-defined features. Interaction with a Graphical User Interface (GUI) allows easy extraction of genomic and proteomic data referring to the sequence itself, sequence features, or general structural and functional features. Emphasis is laid on the potential comparison between annotation and prediction data in order to offer a supplement to the provided information, especially in cases of "poor" annotation, or an evaluation of available predictions. Moreover, desired information can be output in high quality JPEG image files for further elaboration and scientific use. A compilation of properly formatted GeneViTo input data for demonstration is available to interested readers for two completely sequenced prokaryotes, Chlamydia trachomatis and Methanococcus jannaschii. Conclusions GeneViTo offers an inspectional view of genomic functional elements, concerning data stemming both from database annotation and analysis tools for an overall analysis of existing genomes. The application is compatible with Linux or Windows ME-2000-XP operating

  11. SL1 revisited: functional analysis of the structure and conformation of HIV-1 genome RNA.

    Science.gov (United States)

    Sakuragi, Sayuri; Yokoyama, Masaru; Shioda, Tatsuo; Sato, Hironori; Sakuragi, Jun-Ichi

    2016-11-11

    The dimer initiation site/dimer linkage sequence (DIS/DLS) region of HIV is located on the 5' end of the viral genome and suggested to form complex secondary/tertiary structures. Within this structure, stem-loop 1 (SL1) is believed to be most important and an essential key to dimerization, since the sequence and predicted secondary structure of SL1 are highly stable and conserved among various virus subtypes. In particular, a six-base palindromic sequence is always present at the hairpin loop of SL1 and the formation of kissing-loop structure at this position between the two strands of genomic RNA is suggested to trigger dimerization. Although the higher-order structure model of SL1 is well accepted and perhaps even undoubted lately, there could still be room for consideration to depict the functional SL1 structure while in vivo (in virion or cell). In this study, we performed several analyses to identify the nucleotides and/or basepairing within SL1 which are necessary for HIV-1 genome dimerization, encapsidation, recombination and infectivity. We unexpectedly found that some nucleotides that are believed to contribute to the formation of the stem do not impact dimerization or infectivity. On the other hand, we found that one G-C basepair involved in stem formation may serve as an alternative dimer interactive site. We also report on our further investigation of the roles of the palindromic sequences on viral replication. Collectively, we aim to assemble a more-comprehensive functional map of SL1 on the HIV-1 viral life cycle. We discovered several possibilities for a novel structure of SL1 in HIV-1 DLS. The newly proposed structure model suggested that the hairpin loop of SL1 appeared larger, and genome dimerization process might consist of more complicated mechanism than previously understood. Further investigations would be still required to fully understand the genome packaging and dimerization of HIV.
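
    A six-base palindromic RNA sequence, in the sense used above, is one that equals its own reverse complement, which is what lets two identical loops pair with each other. The sketch below checks that property and scans a sequence for such hexamers. The function names and the test sequences are illustrative assumptions and are not taken from the abstract.

```python
# Self-complementary ("palindromic") hexamer check for RNA.
# A sequence is palindromic here if it equals its own reverse complement,
# so two copies of the same loop can base-pair with each other.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA string (A, C, G, U only)."""
    return "".join(COMPLEMENT[base] for base in reversed(rna.upper()))


def is_self_complementary(rna: str) -> bool:
    """True if the sequence reads the same as its reverse complement."""
    return reverse_complement(rna) == rna.upper()


def find_palindromic_kmers(rna: str, k: int = 6):
    """Return (offset, k-mer) for every self-complementary window of length k."""
    rna = rna.upper()
    return [
        (i, rna[i:i + k])
        for i in range(len(rna) - k + 1)
        if is_self_complementary(rna[i:i + k])
    ]


if __name__ == "__main__":
    # Illustrative input only: a made-up loop containing one palindromic hexamer.
    print(find_palindromic_kmers("AAGCGCGCAA"))
```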

  12. Prioritisation of structural variant calls in cancer genomes

    Directory of Open Access Journals (Sweden)

    Miika J. Ahdesmäki

    2017-04-01

    Full Text Available Sensitivity of short read DNA-sequencing for gene fusion detection is improving, but is hampered by the significant amount of noise composed of uninteresting or false positive hits in the data. In this paper we describe a tiered prioritisation approach to extract high impact gene fusion events from existing structural variant calls. Using cell line and patient DNA sequence data we improve the annotation and interpretation of structural variant calls to best highlight likely cancer driving fusions. We also considerably improve on the automated visualisation of the high impact structural variants to highlight the effects of the variants on the resulting transcripts. The resulting framework greatly improves on readily detecting clinically actionable structural variants.

  13. Protein structure similarity clustering (PSSC) and natural product structure as inspiration sources for drug development and chemical genomics

    NARCIS (Netherlands)

    Dekker, Frank J; Koch, Marcus A; Waldmann, Herbert; Dekker, Frans

    Finding small molecules that modulate protein function is of primary importance in drug development and in the emerging field of chemical genomics. To facilitate the identification of such molecules, we developed a novel strategy making use of structural conservatism found in protein domain

  14. Unique opportunities for NMR methods in structural genomics.

    Science.gov (United States)

    Montelione, Gaetano T; Arrowsmith, Cheryl; Girvin, Mark E; Kennedy, Michael A; Markley, John L; Powers, Robert; Prestegard, James H; Szyperski, Thomas

    2009-04-01

    This Perspective, arising from a workshop held in July 2008 in Buffalo NY, provides an overview of the role NMR has played in the United States Protein Structure Initiative (PSI), and a vision of how NMR will contribute to the forthcoming PSI-Biology program. NMR has contributed in key ways to structure production by the PSI, and new methods have been developed which are impacting the broader protein NMR community.

  15. Gene order data from a model amphibian (Ambystoma): new perspectives on vertebrate genome structure and evolution

    Directory of Open Access Journals (Sweden)

    Voss S Randal

    2006-08-01

    Full Text Available Abstract Background Because amphibians arise from a branch of the vertebrate evolutionary tree that is juxtaposed between fishes and amniotes, they provide important comparative perspective for reconstructing character changes that have occurred during vertebrate evolution. Here, we report the first comparative study of vertebrate genome structure that includes a representative amphibian. We used 491 transcribed sequences from a salamander (Ambystoma) genetic map and whole genome assemblies for human, mouse, rat, dog, chicken, zebrafish, and the freshwater pufferfish Tetraodon nigroviridis to compare gene orders and rearrangement rates. Results Ambystoma has experienced a rate of genome rearrangement that is substantially lower than mammalian species but similar to that of chicken and fish. Overall, we found greater conservation of genome structure between Ambystoma and tetrapod vertebrates, nevertheless, 57% of Ambystoma-fish orthologs are found in conserved syntenies of four or more genes. Comparisons between Ambystoma and amniotes reveal extensive conservation of segmental homology for 57% of the presumptive Ambystoma-amniote orthologs. Conclusion Our analyses suggest relatively constant interchromosomal rearrangement rates from the euteleost ancestor to the origin of mammals and illustrate the utility of amphibian mapping data in establishing ancestral amniote and tetrapod gene orders. Comparisons between Ambystoma and amniotes reveal some of the key events that have structured the human genome since diversification of the ancestral amniote lineage.

  16. The genomic structure of the human UFO receptor.

    Science.gov (United States)

    Schulz, A S; Schleithoff, L; Faust, M; Bartram, C R; Janssen, J W

    1993-02-01

    Using a DNA transfection-tumorigenicity assay we have recently identified the UFO oncogene. It encodes a tyrosine kinase receptor characterized by the juxtaposition of two immunoglobulin-like and two fibronectin type III repeats in its extracellular domain. Here we describe the genomic organization of the human UFO locus. The UFO receptor is encoded by 20 exons that are distributed over a region of 44 kb. Different isoforms of UFO mRNA are generated by alternative splicing of exon 10 and differential usage of two imperfect polyadenylation sites resulting in the presence or absence of 1.5-kb 3' untranslated sequences. Primer extension and S1 nuclease analyses revealed multiple transcriptional initiation sites including a major site 169 bp upstream of the translation start site. The promoter region is GC rich, lacks TATA and CAAT boxes, but contains potential recognition sites for a variety of trans-acting factors, including Sp1, AP-2 and the cyclic AMP response element-binding protein. Proto-UFO and its oncogenic counterpart exhibit identical cDNA and promoter region sequences. Possible modes of UFO activation are discussed.

  17. Genome structure and primitive sex chromosome revealed in Populus

    Energy Technology Data Exchange (ETDEWEB)

    Tuskan, Gerald A [ORNL; Yin, Tongming [ORNL; Gunter, Lee E [ORNL; Blaudez, D [UMR, France

    2008-01-01

    We constructed a comprehensive genetic map for Populus and ordered 332 Mb of sequence scaffolds along the 19 haploid chromosomes in order to compare chromosomal regions among diverse members of the genus. These efforts led us to conclude that chromosome XIX in Populus is evolving into a sex chromosome. Consistent segregation distortion in favor of the sub-genera Tacamahaca alleles provided evidence of divergent selection among species, particularly at the proximal end of chromosome XIX. A large microsatellite marker (SSR) cluster was detected in the distorted region even though the genome-wide distribution of SSR sites was uniform across the physical map. The differences between the genetic map and physical sequence data suggested recombination suppression was occurring in the distorted region. A gender-determination locus and an overabundance of NBS-LRR genes were also co-located to the distorted region and were put forth as the cause for divergent selection and recombination suppression. This hypothesis was verified by using fine-scale mapping of an integrated scaffold in the vicinity of the gender-determination locus. As such it appears that chromosome XIX in Populus is in the process of evolving from an autosome into a sex chromosome and that NBS-LRR genes may play an important role in the chromosomal diversification process in Populus.

  18. 3D-GNOME: an integrated web service for structural modeling of the 3D genome.

    Science.gov (United States)

    Szalaj, Przemyslaw; Michalski, Paul J; Wróblewski, Przemysław; Tang, Zhonghui; Kadlof, Michal; Mazzocco, Giovanni; Ruan, Yijun; Plewczynski, Dariusz

    2016-07-08

    Recent advances in high-throughput chromosome conformation capture (3C) technology, such as Hi-C and ChIA-PET, have demonstrated the importance of 3D genome organization in development, cell differentiation and transcriptional regulation. There is now a widespread need for computational tools to generate and analyze 3D structural models from 3C data. Here we introduce our 3D GeNOme Modeling Engine (3D-GNOME), a web service which generates 3D structures from 3C data and provides tools to visually inspect and annotate the resulting structures, in addition to a variety of statistical plots and heatmaps which characterize the selected genomic region. Users submit a bedpe (paired-end BED format) file containing the locations and strengths of long range contact points, and 3D-GNOME simulates the structure and provides a convenient user interface for further analysis. Alternatively, a user may generate structures using published ChIA-PET data for the GM12878 cell line by simply specifying a genomic region of interest. 3D-GNOME is freely available at http://3dgnome.cent.uw.edu.pl/. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
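
    To make the input described above concrete, the sketch below parses BEDPE-style lines into contact records. The first six columns (two chromosome/start/end triples) follow the usual BEDPE layout; treating the seventh column as the contact strength is an assumption for illustration, as are the names `Contact` and `parse_bedpe`, and the exact layout 3D-GNOME expects should be checked against its own documentation.

```python
from typing import List, NamedTuple


class Contact(NamedTuple):
    """One long-range contact between two genomic anchors."""
    chrom1: str
    start1: int
    end1: int
    chrom2: str
    start2: int
    end2: int
    strength: float


def parse_bedpe(text: str, strength_col: int = 6) -> List[Contact]:
    """Parse whitespace-separated BEDPE-like text into Contact records.

    strength_col is the zero-based column holding the contact strength;
    index 6 (the seventh column) is an assumed default, not a documented one.
    """
    contacts = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        if len(fields) <= max(5, strength_col):
            raise ValueError(f"too few columns in line: {raw!r}")
        contacts.append(
            Contact(
                fields[0], int(fields[1]), int(fields[2]),
                fields[3], int(fields[4]), int(fields[5]),
                float(fields[strength_col]),
            )
        )
    return contacts


def anchor_distance(c: Contact) -> int:
    """Midpoint distance between anchors; -1 for interchromosomal contacts."""
    if c.chrom1 != c.chrom2:
        return -1
    mid1 = (c.start1 + c.end1) // 2
    mid2 = (c.start2 + c.end2) // 2
    return abs(mid2 - mid1)


if __name__ == "__main__":
    # Made-up coordinates for demonstration only.
    sample = "chr1\t1000\t2000\tchr1\t51000\t52000\t7\n"
    for contact in parse_bedpe(sample):
        print(contact, anchor_distance(contact))
```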

  19. Thousands of corresponding human and mouse genomic regions unalignable in primary sequence contain common RNA structure

    DEFF Research Database (Denmark)

    Torarinsson, Elfar; Sawera, Milena; Havgaard, Jakob Hull

    2006-01-01

    Human and mouse genome sequences contain roughly 100,000 regions that are unalignable in primary sequence and neighbor corresponding alignable regions between both organisms. These pairs are generally assumed to be nonconserved, although the level of structural conservation between these has never...... been investigated. Owing to the limitations in computational methods, comparative genomics has been lacking the ability to compare such nonconserved sequence regions for conserved structural RNA elements. We have investigated the presence of structural RNA elements by conducting a local structural...... alignment, using FOLDALIGN, on a subset of these 100,000 corresponding regions and estimate that 1800 contain common RNA structures. Comparing our results with the recent mapping of transcribed fragments (transfrags) in human, we find that high-scoring candidates are twice as likely to be found in regions...

  20. Bioinformatical approaches to RNA structure prediction & Sequencing of an ancient human genome

    DEFF Research Database (Denmark)

    Lindgreen, Stinus

    tools that exist. The second part has been focused on the mapping and genotyping of ancient genomic DNA. The development of next generation sequencing technologies combined with the use of ancient DNA material present the researchers with some special challenges in the analyses. This work resulted...... in the publication of the first genome of an ancient human individual, where close to the theoretical maximum of the genome sequence was recovered with high confidence. Part of the project was the development of the program SNPest for genotyping and SNP calling that models various sources of error and predicts...... in families of related RNA sequences. Also, the program MASTR was developed to perform simultaneous alignment of multiple RNA sequences and prediction of a common secondary structure. The webserver WAR was developed to make it easy for non-computer savvy researchers to use the many RNA structure prediction...

  1. Detection of Genomic Structural Variants from Next-Generation Sequencing Data

    Directory of Open Access Journals (Sweden)

    Lorenzo Tattini

    2015-06-01

    Full Text Available Structural variants are genomic rearrangements larger than 50 bp accounting for around 1% of the variation among human genomes. They impact on phenotypic diversity and play a role in various diseases including neurological/neurocognitive disorders and cancer development and progression. Dissecting structural variants from next-generation sequencing data presents several challenges and a number of approaches have been proposed in the literature. In this mini review we describe and summarise the latest tools – and their underlying algorithms – designed for the analysis of whole-genome sequencing, whole-exome sequencing, custom captures and amplicon sequencing data, pointing out the major advantages/drawbacks. We also report a summary of the most recent applications of third-generation sequencing platforms. This assessment provides a guided indication – with particular emphasis on human genetics and copy number variants – for researchers involved in the investigation of these genomic events.

  2. Evolution of the Exon-Intron Structure in Ciliate Genomes.

    Directory of Open Access Journals (Sweden)

    Vladyslav S Bondarenko

    Full Text Available A typical eukaryotic gene is comprised of alternating stretches of regions, exons and introns, retained in and spliced out of a mature mRNA, respectively. Although the length of introns may vary substantially among organisms, a large fraction of genes contains short introns in many species. Notably, some Ciliates (Paramecium and Nyctotherus) possess only ultra-short introns, around 25 bp long. In Paramecium, ultra-short introns with length divisible by three (3n) are under strong evolutionary pressure and have a high frequency of in-frame stop codons, which, in the case of intron retention, cause premature termination of mRNA translation and consequent degradation of the mis-spliced mRNA by the nonsense-mediated decay mechanism. Here, we analyzed introns in five genera of Ciliates: Paramecium, Tetrahymena, Ichthyophthirius, Oxytricha, and Stylonychia. Introns can be classified into two length classes in Tetrahymena and Ichthyophthirius (with means 48 bp, 69 bp, and 55 bp, 64 bp, respectively), but, surprisingly, comprise three distinct length classes in Oxytricha and Stylonychia (with means 33-35 bp, 47-51 bp, and 78-80 bp). In most ranges of the intron lengths, 3n introns are underrepresented and have a high frequency of in-frame stop codons in all studied species. Introns of Paramecium, Tetrahymena, and Ichthyophthirius are preferentially located at the 5' and 3' ends of genes, whereas introns of Oxytricha and Stylonychia are strongly skewed towards the 5' end. Analysis of evolutionary conservation shows that, in each studied genome, a significant fraction of intron positions is conserved between the orthologs, but intron lengths are not correlated between the species. In summary, our study provides a detailed characterization of introns in several genera of Ciliates and highlights some of their distinctive properties, which, together, indicate that splicing spellchecking is a universal and evolutionarily conserved process in the biogenesis of short

  3. Genomic data illuminates demography, genetic structure and selection of a popular dog breed.

    Science.gov (United States)

    Wiener, Pamela; Sánchez-Molano, Enrique; Clements, Dylan N; Woolliams, John A; Haskell, Marie J; Blott, Sarah C

    2017-08-14

    Genomic methods have proved to be important tools in the analysis of genetic diversity across the range of species and can be used to reveal processes underlying both short- and long-term evolutionary change. This study applied genomic methods to investigate population structure and inbreeding in a common UK dog breed, the Labrador Retriever. We found substantial within-breed genetic differentiation, which was associated with the role of the dog (i.e. working, pet, show) and also with coat colour (i.e. black, yellow, brown). There was little evidence of geographical differentiation. Highly differentiated genomic regions contained genes and markers associated with skull shape, suggesting that at least some of the differentiation is related to human-imposed selection on this trait. We also found that the total length of homozygous segments (runs of homozygosity, ROHs) was highly correlated with inbreeding coefficient. This study demonstrates that high-density genomic data can be used to quantify genetic diversity and to decipher demographic and selection processes. Analysis of genetically differentiated regions in the UK Labrador Retriever population suggests the possibility of human-imposed selection on craniofacial characteristics. The high correlation between estimates of inbreeding from genomic and pedigree data for this breed demonstrates that genomic approaches can be used to quantify inbreeding levels in dogs, which will be particularly useful where pedigree information is missing.
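    As an illustration of how the total length of homozygous segments relates to an inbreeding estimate, the sketch below computes the commonly used genomic inbreeding coefficient F_ROH, the summed ROH length divided by the length of genome covered. The abstract does not state which formula the authors used, so this definition, the minimum-length filter and all numbers are assumptions for illustration only.

```python
# Minimal sketch of F_ROH = (total ROH length) / (genome length covered).
# Assumptions: segments are (start_bp, end_bp) pairs on a single coordinate
# axis and do not overlap; min_length_bp is a hypothetical filter.

def f_roh(roh_segments, genome_length_bp, min_length_bp=0):
    """Fraction of the genome lying in runs of homozygosity."""
    if genome_length_bp <= 0:
        raise ValueError("genome_length_bp must be positive")
    total = sum(
        end - start
        for start, end in roh_segments
        if end - start >= min_length_bp
    )
    return total / genome_length_bp


# Hypothetical individual: two ROHs (1 Mb and 2 Mb) on a 100 Mb genome.
segments = [(0, 1_000_000), (5_000_000, 7_000_000)]
print(f_roh(segments, 100_000_000))
print(f_roh(segments, 100_000_000, min_length_bp=1_500_000))
```

    Raising `min_length_bp` restricts the estimate to longer segments, which generally reflect more recent inbreeding.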

  4. Characteristics of de novo structural changes in the human genome

    NARCIS (Netherlands)

    Kloosterman, Wigard P.; Francioli, Laurent C.; Hormozdiari, Fereydoun; Marschall, Tobias; Hehir-Kwa, Jayne Y.; Abdellaoui, Abdel; Lameijer, Eric-Wubbo; Moed, Matthijs H.; Koval, Vyacheslav; Renkens, Ivo; van Roosmalen, Markus J.; Arp, Pascal; Karssen, Lennart C.; Coe, Bradley P.; Handsaker, Robert E.; Suchiman, Eka D.; Cuppen, Edwin; Thung, Djie Tjwan; McVey, Mitch; Wendl, Michael C.; Uitterlinden, Andre; van Duijn, Cornelia M.; Swertz, Morris A.; Wijmenga, Cisca; van Ommen, GertJan B.; Slagboom, P. Eline; Boomsma, Dorret I.; Schoenhuth, Alexander; Eichler, Evan E.; de Bakker, Paul I. W.; Ye, Kai; Guryev, Victor

    Small insertions and deletions (indels) and large structural variations (SVs) are major contributors to human genetic diversity and disease. However, mutation rates and characteristics of de novo indels and SVs in the general population have remained largely unexplored. We report 332 validated de

  5. Visualizing the global secondary structure of a viral RNA genome with cryo-electron microscopy.

    Science.gov (United States)

    Garmann, Rees F; Gopal, Ajaykumar; Athavale, Shreyas S; Knobler, Charles M; Gelbart, William M; Harvey, Stephen C

    2015-05-01

    The lifecycle, and therefore the virulence, of single-stranded (ss)-RNA viruses is regulated not only by their particular protein gene products, but also by the secondary and tertiary structure of their genomes. The secondary structure of the entire genomic RNA of satellite tobacco mosaic virus (STMV) was recently determined by selective 2'-hydroxyl acylation analyzed by primer extension (SHAPE). The SHAPE analysis suggested a single highly extended secondary structure with much less branching than occurs in the ensemble of structures predicted by purely thermodynamic algorithms. Here we examine the solution-equilibrated STMV genome by direct visualization with cryo-electron microscopy (cryo-EM), using an RNA of similar length transcribed from the yeast genome as a control. The cryo-EM data reveal an ensemble of branching patterns that are collectively consistent with the SHAPE-derived secondary structure model. Thus, our results both elucidate the statistical nature of the secondary structure of large ss-RNAs and give visual support for modern RNA structure determination methods. Additionally, this work introduces cryo-EM as a means to distinguish between competing secondary structure models if the models differ significantly in terms of the number and/or length of branches. Furthermore, with the latest advances in cryo-EM technology, we suggest the possibility of developing methods that incorporate restraints from cryo-EM into the next generation of algorithms for the determination of RNA secondary and tertiary structures. © 2015 Garmann et al.; Published by Cold Spring Harbor Laboratory Press for the RNA Society.

  6. Integrated view of genome structure and sequence of a single DNA molecule in a nanofluidic device

    DEFF Research Database (Denmark)

    Marie, Rodolphe; Pedersen, Jonas Nyvold; L. V. Bauer, David

    2013-01-01

    We show how a bird’s-eye view of genomic structure can be obtained at ∼1-kb resolution from long (∼2 Mb) DNA molecules extracted from whole chromosomes in a nanofluidic laboratory-on-a-chip. We use an improved single-molecule denaturation mapping approach to detect repetitive elements and known...

  7. Geranyl diphosphate synthase from mint

    Energy Technology Data Exchange (ETDEWEB)

    Croteau, R.B.; Wildung, M.R.; Burke, C.C.; Gershenzon, J.

    1999-03-02

    A cDNA encoding geranyl diphosphate synthase from peppermint has been isolated and sequenced, and the corresponding amino acid sequence has been determined. Accordingly, an isolated DNA sequence (SEQ ID No:1) is provided which codes for the expression of geranyl diphosphate synthase (SEQ ID No:2) from peppermint (Mentha piperita). In other aspects, replicable recombinant cloning vehicles are provided which code for geranyl diphosphate synthase or for a base sequence sufficiently complementary to at least a portion of the geranyl diphosphate synthase DNA or RNA to enable hybridization therewith (e.g., antisense geranyl diphosphate synthase RNA or fragments of complementary geranyl diphosphate synthase DNA which are useful as polymerase chain reaction primers or as probes for geranyl diphosphate synthase or related genes). In yet other aspects, modified host cells are provided that have been transformed, transfected, infected and/or injected with a recombinant cloning vehicle and/or DNA sequence encoding geranyl diphosphate synthase. Thus, systems and methods are provided for the recombinant expression of geranyl diphosphate synthase that may be used to facilitate the production, isolation and purification of significant quantities of recombinant geranyl diphosphate synthase for subsequent use, to obtain expression or enhanced expression of geranyl diphosphate synthase in plants in order to enhance the production of monoterpenoids, to produce geranyl diphosphate in cancerous cells as a precursor to monoterpenoids having anti-cancer properties or may be otherwise employed for the regulation or expression of geranyl diphosphate synthase or the production of geranyl diphosphate. 5 figs.

  8. Geranyl diphosphate synthase from mint

    Energy Technology Data Exchange (ETDEWEB)

    Croteau, Rodney Bruce (Pullman, WA); Wildung, Mark Raymond (Colfax, WA); Burke, Charles Cullen (Moscow, ID); Gershenzon, Jonathan (Jena, DE)

    1999-01-01

    A cDNA encoding geranyl diphosphate synthase from peppermint has been isolated and sequenced, and the corresponding amino acid sequence has been determined. Accordingly, an isolated DNA sequence (SEQ ID No:1) is provided which codes for the expression of geranyl diphosphate synthase (SEQ ID No:2) from peppermint (Mentha piperita). In other aspects, replicable recombinant cloning vehicles are provided which code for geranyl diphosphate synthase or for a base sequence sufficiently complementary to at least a portion of the geranyl diphosphate synthase DNA or RNA to enable hybridization therewith (e.g., antisense geranyl diphosphate synthase RNA or fragments of complementary geranyl diphosphate synthase DNA which are useful as polymerase chain reaction primers or as probes for geranyl diphosphate synthase or related genes). In yet other aspects, modified host cells are provided that have been transformed, transfected, infected and/or injected with a recombinant cloning vehicle and/or DNA sequence encoding geranyl diphosphate synthase. Thus, systems and methods are provided for the recombinant expression of geranyl diphosphate synthase that may be used to facilitate the production, isolation and purification of significant quantities of recombinant geranyl diphosphate synthase for subsequent use, to obtain expression or enhanced expression of geranyl diphosphate synthase in plants in order to enhance the production of monoterpenoids, to produce geranyl diphosphate in cancerous cells as a precursor to monoterpenoids having anti-cancer properties or may be otherwise employed for the regulation or expression of geranyl diphosphate synthase or the production of geranyl diphosphate.

  9. Genomic analysis of the hierarchical structure of regulatory networks

    Science.gov (United States)

    Yu, Haiyuan; Gerstein, Mark

    2006-01-01

    A fundamental question in biology is how the cell uses transcription factors (TFs) to coordinate the expression of thousands of genes in response to various stimuli. The relationships between TFs and their target genes can be modeled in terms of directed regulatory networks. These relationships, in turn, can be readily compared with commonplace “chain-of-command” structures in social networks, which have characteristic hierarchical layouts. Here, we develop algorithms for identifying generalized hierarchies (allowing for various loop structures) and use these approaches to illuminate extensive pyramid-shaped hierarchical structures existing in the regulatory networks of representative prokaryotes (Escherichia coli) and eukaryotes (Saccharomyces cerevisiae), with most TFs at the bottom levels and only a few master TFs on top. These masters are situated near the center of the protein–protein interaction network, a different type of network from the regulatory one, and they receive most of the input for the whole regulatory hierarchy through protein interactions. Moreover, they have maximal influence over other genes, in terms of affecting expression-level changes. Surprisingly, however, TFs at the bottom of the regulatory hierarchy are more essential to the viability of the cell. Finally, one might think master TFs achieve their wide influence through directly regulating many targets, but TFs with most direct targets are in the middle of the hierarchy. We find, in fact, that these midlevel TFs are “control bottlenecks” in the hierarchy, and this great degree of control for “middle managers” has parallels in efficient social structures in various corporate and governmental settings. PMID:17003135
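    One simple way to assign hierarchy levels in a directed regulatory network is a breadth-first search upward from the bottom: TFs that regulate no other TF form level 1, and every other TF gets a level one higher than the nearest bottom TF it can reach. The sketch below is only an illustration under that assumption; the abstract does not spell out the authors' algorithm, and the network and function name are hypothetical.

```python
# Minimal sketch: bottom-up BFS levels in a TF-TF regulatory network.
# Assumptions: 'regulates' maps each TF to the set of TFs it regulates;
# self-regulation is ignored when deciding whether a TF is at the bottom.
from collections import deque


def hierarchy_levels(regulates):
    tfs = set(regulates) | {t for targets in regulates.values() for t in targets}
    # Level 1: TFs with no TF targets other than themselves.
    level = {tf: 1 for tf in tfs if not (set(regulates.get(tf, ())) - {tf})}
    # Reverse edges so the search can walk from targets up to regulators.
    regulators = {tf: set() for tf in tfs}
    for src, targets in regulates.items():
        for dst in targets:
            if src != dst:
                regulators[dst].add(src)
    queue = deque(sorted(level))
    while queue:
        tf = queue.popleft()
        for reg in sorted(regulators[tf]):
            if reg not in level:  # first visit gives the shortest distance
                level[reg] = level[tf] + 1
                queue.append(reg)
    return level


# Hypothetical network: M is a master regulator, X and Y sit at the bottom.
network = {"M": {"A", "B"}, "A": {"X"}, "B": {"X", "Y"}, "X": set(), "Y": {"Y"}}
print(hierarchy_levels(network))
```

    TFs that sit in a loop with no path to any bottom TF receive no level in this sketch; a generalized hierarchy that allows loop structures, as the abstract describes, would need additional handling.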

  10. Effects of Supervised Structured Aerobic Exercise Training Program on Interleukin-6, Nitric Oxide Synthase-1, and Cyclooxygenase-2 in Type 2 Diabetes Mellitus.

    Science.gov (United States)

    Karimi, Hossein; Rehman, Syed Shakil Ur; Gillani, Syed Amir

    2017-06-01

    To determine the effects of supervised structured aerobic exercise training (SSAET) program on interleukin-6 (IL-6), nitric oxide synthase 1 (NOS-1), and cyclooxygenase-2 (COX-2) in type 2 diabetes mellitus (T2DM). Randomized controlled trial. Riphah Rehabilitation and Research Centre, Railways General Hospital, Rawalpindi, from January 2015 to June 2016. Patients of either gender with a minimum one-year history of T2DM, ranging from 40-70 years of age, were included. Those with chronic systemic diseases, history of regular exercise, smoking, and those on dietary plan were excluded. A total of 195 patients were screened; 120 were selected and 102 agreed to participate in the study. They were randomly placed into experimental and control groups. SSAET program, routine medication, and dietary plan were applied in the experimental group, whereas the control group was managed with routine medication and dietary plan for 25 weeks. IL-6, NOS-1, and COX-2 were assessed at baseline and 25 weeks. SSAET program, routine medication and dietary plan showed significantly improved IL-6 (pre-mean=0.25 ±0.11 ng/ml, post-mean=0.19 ±0.04 ng/ml), NOS-1 (pre-median=4.65 ng/ml, IQ range=1.04 ng/ml; post-median=2.72 ng/ml, IQ range=1.60 ng/ml), and COX-2 (pre-mean=18.72 ±4.42 ng/ml, post-mean=15.18 ±2.63 ng/ml) in the experimental group, as compared with the control group managed by routine medication and dietary plan, where deterioration was noted in IL-6 (pre-mean=0.23 ±0.08 ng/ml, post-mean=0.27 ±0.08 ng/ml) and COX-2 (pre-mean=18.49 ±4.56 ng/ml, post-mean=19.10 ±4.76 ng/ml), while NOS-1 showed slight improvement (pre-mean=4.99 ng/ml, IQ range=2.67 ng/ml; post-mean=4.56 ng/ml, IQ range=3.85 ng/ml). Statistically, at baseline the p-values were not significant (p>0.05) in both experimental and control groups for IL-6, COX-2 and NOS-1, while after 25 weeks of intervention the experimental group showed significant improvement (p<0.05) in comparison with the control group. SSAET program, routine

  11. Mapping the structure and dynamics of genomics-related MeSH terms complex networks.

    Science.gov (United States)

    Siqueiros-García, Jesús M; Hernández-Lemus, Enrique; García-Herrera, Rodrigo; Robina-Galatas, Andrea

    2014-01-01

    It has been proposed that the history and evolution of scientific ideas may reflect certain aspects of the underlying socio-cognitive frameworks in which science itself is developing. Systematic analyses of the development of scientific knowledge may help us to construct models of the collective dynamics of science. Aiming at scientific rigor, these models should be built upon solid empirical evidence, analyzed with formal tools leading to ever-improving results that support the related conclusions. Along these lines we studied the dynamics and structure of the development of research in genomics as represented by the entire collection of genomics-related scientific papers contained in the PubMed database. The analyzed corpus consisted of more than 49,000 articles published in the years 1987 (first appearance of the term Genomics) to 2011, categorized by means of the Medical Subject Headings (MeSH) content-descriptors. Complex networks were built where two MeSH terms were connected if they are descriptors of the same article(s). The analysis of such networks revealed a complex structure and dynamics that to a certain extent resembled small-world networks. The evolution of such networks in time reflected interesting phenomena in the historical development of genomic research, including what seems to be a phase-transition in a period marked by the completion of the first draft of the Human Genome Project. We also found that different disciplinary areas have different dynamic evolution patterns in their MeSH connectivity networks. In the case of areas related to science, changes in topology were somewhat fast while retaining a certain core-structure, whereas in the humanities the evolution was rather slow and the resulting structure was highly redundant, and in the case of technology-related issues the evolution was very fast and the structure remained tree-like with almost no overlapping terms.
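    The network construction rule stated in this abstract, two MeSH terms are connected if they describe the same article, can be sketched in a few lines of Python. The article records below are invented for illustration, and the edge weight (number of shared articles) is an added assumption; the abstract only says that an edge exists.

```python
# Minimal sketch: build a MeSH co-occurrence network from per-article
# descriptor lists. Edge weights count shared articles (an assumption).
from collections import Counter
from itertools import combinations


def cooccurrence_edges(articles):
    """Return a Counter mapping (term_a, term_b) -> number of shared articles."""
    edges = Counter()
    for terms in articles:
        # Sort and deduplicate so each unordered pair has one canonical key.
        for a, b in combinations(sorted(set(terms)), 2):
            edges[(a, b)] += 1
    return edges


# Hypothetical records, not taken from PubMed.
articles = [
    ["Genomics", "Humans", "Proteomics"],
    ["Genomics", "Humans"],
    ["Ethics", "Genomics"],
]
for pair, weight in sorted(cooccurrence_edges(articles).items()):
    print(pair, weight)
```

    Building one such edge list per publication year would give the time series of networks whose topology the abstract describes.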

  12. Mapping the structure and dynamics of genomics-related MeSH terms complex networks.

    Directory of Open Access Journals (Sweden)

    Jesús M Siqueiros-García

    Full Text Available It has been proposed that the history and evolution of scientific ideas may reflect certain aspects of the underlying socio-cognitive frameworks in which science itself is developing. Systematic analyses of the development of scientific knowledge may help us to construct models of the collective dynamics of science. Aiming at scientific rigor, these models should be built upon solid empirical evidence, analyzed with formal tools leading to ever-improving results that support the related conclusions. Along these lines we studied the dynamics and structure of the development of research in genomics as represented by the entire collection of genomics-related scientific papers contained in the PubMed database. The analyzed corpus consisted of more than 49,000 articles published in the years 1987 (first appearance of the term Genomics) to 2011, categorized by means of the Medical Subject Headings (MeSH) content-descriptors. Complex networks were built where two MeSH terms were connected if they are descriptors of the same article(s). The analysis of such networks revealed a complex structure and dynamics that to a certain extent resembled small-world networks. The evolution of such networks in time reflected interesting phenomena in the historical development of genomic research, including what seems to be a phase-transition in a period marked by the completion of the first draft of the Human Genome Project. We also found that different disciplinary areas have different dynamic evolution patterns in their MeSH connectivity networks. In the case of areas related to science, changes in topology were somewhat fast while retaining a certain core-structure, whereas in the humanities the evolution was rather slow and the resulting structure was highly redundant, and in the case of technology-related issues the evolution was very fast and the structure remained tree-like with almost no overlapping terms.

  13. High density LD-based structural variations analysis in cattle genome.

    Directory of Open Access Journals (Sweden)

    Ricardo Salomon-Torres

    Full Text Available Genomic structural variations represent an important source of genetic variation in mammalian genomes; thus, they are commonly related to phenotypic expressions. In this work, ∼770,000 single nucleotide polymorphism genotypes from 506 animals from 19 cattle breeds were analyzed. A simple LD-based structural variation was defined, and a genome-wide analysis was performed. After applying some quality control filters, for each breed and each chromosome we calculated the linkage disequilibrium (r2) of short range (≤ 100 Kb). We sorted SNP pairs by distance and obtained a set of LD means (called the expected means) using bins of 5 Kb. We identified 15,246 segments of at least 1 Kb, among the 19 breeds, consisting of sets of at least 3 adjacent SNPs so that, for each SNP, the r2 values with its neighbors in a 100 Kb range, to the right side of that SNP, were all bigger than, or all smaller than, the corresponding expected mean, and their P-values were significant after a Benjamini-Hochberg multiple testing correction. In addition, to account just for homogeneously distributed regions we considered only SNPs having at least 15 SNP neighbors within 100 Kb. We defined such segments as structural variations. By grouping all variations across all animals in the sample we defined 9,146 regions, involving a total of 53,137 SNPs, representing 6.40% (160.98 Mb) of the bovine genome. The identified structural variations covered 3,109 genes. Clustering analysis showed the relatedness of breeds given the geographic region in which they are evolving. In summary, we present an analysis of structural variations based on the deviation of the expected short range LD between SNPs in the bovine genome. With an intuitive and simple definition based only on SNP data it was possible to discern closeness of breeds due to grouping by geographic region in which they are evolving.
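    The "expected means" step described above, averaging short-range r2 over 5 Kb distance bins and then asking whether a SNP's r2 values all lie on one side of the expectation, can be sketched as follows. This is a simplified illustration, not the authors' pipeline: the bin boundaries, function names and toy values are assumptions, and the significance testing with Benjamini-Hochberg correction is omitted.

```python
# Minimal sketch: expected LD (mean r2) per distance bin, plus a check that
# one SNP's r2 values are all above or all below the expectation.
# Assumptions: distances are positive integers in bp; bin k covers
# (k * bin_size, (k + 1) * bin_size].
from collections import defaultdict


def distance_bin(dist_bp, bin_size_bp=5_000):
    return (dist_bp - 1) // bin_size_bp


def expected_ld_means(pairs, bin_size_bp=5_000, max_dist_bp=100_000):
    """pairs: iterable of (distance_bp, r2). Returns {bin_index: mean r2}."""
    sums, counts = defaultdict(float), defaultdict(int)
    for dist, r2 in pairs:
        if 0 < dist <= max_dist_bp:
            b = distance_bin(dist, bin_size_bp)
            sums[b] += r2
            counts[b] += 1
    return {b: sums[b] / counts[b] for b in sums}


def deviates_consistently(snp_pairs, expected, bin_size_bp=5_000):
    """True if every r2 is above, or every r2 is below, its bin's expected mean."""
    diffs = [r2 - expected[distance_bin(d, bin_size_bp)] for d, r2 in snp_pairs]
    return bool(diffs) and (all(x > 0 for x in diffs) or all(x < 0 for x in diffs))


# Toy data (hypothetical): the 150 kb pair is ignored as out of range.
background = [(1_000, 0.8), (4_000, 0.6), (7_000, 0.4), (150_000, 0.9)]
expected = expected_ld_means(background)
print(expected)
print(deviates_consistently([(2_000, 0.9), (6_000, 0.5)], expected))
```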

  14. Structural genomic variation in childhood epilepsies with complex phenotypes

    DEFF Research Database (Denmark)

    Helbig, Ingo; Swinkels, Marielle E M; Aten, Emmelien

    2014-01-01

    A genetic contribution to a broad range of epilepsies has been postulated, and particularly copy number variations (CNVs) have emerged as significant genetic risk factors. However, the role of CNVs in patients with epilepsies with complex phenotypes is not known. Therefore, we investigated the role...... of CNVs in patients with unclassified epilepsies and complex phenotypes. A total of 222 patients from three European countries, including patients with structural lesions on magnetic resonance imaging (MRI), dysmorphic features, and multiple congenital anomalies, were clinically evaluated and screened...

  15. Structural genomics reveals EVE as a new ASCH/PUA-related domain

    Science.gov (United States)

    Bertonati, Claudia; Punta, Marco; Fischer, Markus; Yachdav, Guy; Forouhar, Farhad; Zhou, Weihong; Kuzin, Alexander P.; Seetharaman, Jayaraman; Abashidze, Mariam; Ramelot, Theresa A.; Kennedy, Michael A.; Cort, John R.; Belachew, Adam; Hunt, John F.; Tong, Liang; Montelione, Gaetano T.; Rost, Burkhard

    2014-01-01

    Summary We report on several proteins recently solved by structural genomics consortia, in particular by the Northeast Structural Genomics consortium (NESG). The proteins considered in this study differ substantially in their sequences but they share a similar structural core, characterized by a pseudobarrel five-stranded beta sheet. This core corresponds to the PUA domain-like architecture in the SCOP database. By connecting sequence information with structural knowledge, we characterize a new subgroup of these proteins that we propose to be distinctly different from previously described PUA domain-like domains such as PUA proper or ASCH. We refer to these newly defined domains as EVE. Although EVE may have retained the ability of PUA domains to bind RNA, the available experimental and computational data suggest that both the details of its molecular function and its cellular function differ from those of other PUA domain-like domains. This study of EVE and its relatives illustrates how the combination of structure and genomics creates new insights by connecting a cornucopia of structures that map to the same evolutionary potential. Primary sequence information alone would not have been sufficient to reveal these evolutionary links. PMID:19191354

  16. Genomic structure in Europeans dating back at least 36,200 years

    DEFF Research Database (Denmark)

    Seguin-Orlando, Andaine; Korneliussen, Thorfinn Sand; Sikora, Martin

    2014-01-01

    The origin of contemporary Europeans remains contentious. We obtained a genome sequence from Kostenki 14 in European Russia dating from 38,700 to 36,200 years ago, one of the oldest fossils of anatomically modern humans from Europe. We find that Kostenki 14 shares a close ancestry with the 24...... European Neolithic farmers. We find that Kostenki 14 contains more Neandertal DNA that is contained in longer tracts than present Europeans. Our findings reveal the timing of divergence of western Eurasians and East Asians to be more than 36,200 years ago and that European genomic structure today dates...

  17. Gene finding with a hidden Markov model of genome structure and evolution

    DEFF Research Database (Denmark)

    Pedersen, Jakob Skou; Hein, Jotun

    2003-01-01

    the model are linear in alignment length and genome number. The model is applied to the problem of gene finding. The benefit of modelling sequence evolution is demonstrated both in a range of simulations and on a set of orthologous human/mouse gene pairs. AVAILABILITY: Free availability over the Internet...... annotation. The modelling of evolution by the existing comparative gene finders leaves room for improvement. Results: A probabilistic model of both genome structure and evolution is designed. This type of model is called an Evolutionary Hidden Markov Model (EHMM), being composed of an HMM and a set of region...

  18. Fosmid library end sequencing reveals a rarely known genome structure of marine shrimp Penaeus monodon

    Directory of Open Access Journals (Sweden)

    Chen Ming

    2011-05-01

    Full Text Available Abstract Background The black tiger shrimp (Penaeus monodon) is one of the most important aquaculture species in the world, representing the crustacean lineage which possesses the greatest species diversity among marine invertebrates. Yet, we barely know anything about their genomic structure. To understand the organization and evolution of the P. monodon genome, a fosmid library consisting of 288,000 colonies was constructed, equivalent to 5.3-fold coverage of the 2.17 Gb genome. Approximately 11.1 Mb of fosmid end sequences (FESs) from 20,926 non-redundant reads representing 0.45% of the P. monodon genome were obtained for repetitive and protein-coding sequence analyses. Results We found that microsatellite sequences were highly abundant in the P. monodon genome, comprising 8.3% of the total length. The density and the average length of microsatellites were evidently higher in comparison to those of other taxa. AT-rich microsatellite motifs, especially poly(AT) and poly(AAT), were the most abundant. A high abundance of microsatellite sequences was also found in the transcribed regions. Furthermore, via self-BlastN analysis we identified 103 novel repetitive element families which were categorized into four groups, i.e., 33 WSSV-like repeats, 14 retrotransposons, 5 gene-like repeats, and 51 unannotated repeats. Overall, various types of repeats comprise 51.18% of the P. monodon genome in length. Approximately 7.4% of the FESs contained protein-coding sequences, and the Inhibitor of Apoptosis Protein (IAP) gene and the Innexin 3 gene homologues appear to be present in high abundance in the P. monodon genome. Conclusions The redundancy of various repeat types in the P. monodon genome illustrates its highly repetitive nature. In particular, long and dense microsatellite sequences as well as abundant WSSV-like sequences highlight the uniqueness of genome organization of penaeid shrimp from those of other taxa. These results provide substantial
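    The library figures quoted above are linked by the usual coverage relation, coverage = (number of clones × mean insert size) / genome size. The abstract does not give the mean insert size, so the sketch below back-calculates the value implied by the stated numbers; the function names are hypothetical and the result is an inference, not a reported figure.

```python
# Minimal sketch of the clone-library coverage relation.
# Inputs from the abstract: 288,000 colonies, 5.3-fold coverage, 2.17 Gb genome.
# The mean insert size is NOT stated there; it is derived here as an illustration.

def library_coverage(n_clones, mean_insert_bp, genome_size_bp):
    """Fold coverage of a genome by a clone library."""
    return n_clones * mean_insert_bp / genome_size_bp


def implied_mean_insert(n_clones, coverage, genome_size_bp):
    """Mean insert size (bp) implied by a stated coverage."""
    return coverage * genome_size_bp / n_clones


insert_bp = implied_mean_insert(288_000, 5.3, 2.17e9)
print(f"implied mean insert: {insert_bp / 1000:.1f} kb")
print(f"coverage check: {library_coverage(288_000, insert_bp, 2.17e9):.2f}x")
```

    The implied value of roughly 40 kb is in the range typical of fosmid inserts, so the stated colony count, coverage and genome size are mutually consistent.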

  19. Evidence-based gene models for structural and functional annotations of the oil palm genome.

    Science.gov (United States)

    Chan, Kuang-Lim; Tatarinova, Tatiana V; Rosli, Rozana; Amiruddin, Nadzirah; Azizi, Norazah; Halim, Mohd Amin Ab; Sanusi, Nik Shazana Nik Mohd; Jayanthi, Nagappan; Ponomarenko, Petr; Triska, Martin; Solovyev, Victor; Firdaus-Raih, Mohd; Sambanthamurthi, Ravigadevi; Murphy, Denis; Low, Eng-Ti Leslie

    2017-09-08

    Oil palm is an important source of edible oil. The importance of the crop, as well as its long breeding cycle (10-12 years), led to the sequencing of its genome in 2013 to pave the way for genomics-guided breeding. Nevertheless, the first set of gene predictions, although useful, had many fragmented genes. Classification and characterization of genes associated with traits of interest, such as those for fatty acid biosynthesis and disease resistance, were also limited. Lipid-related genes, especially fatty acid (FA)-related genes, are of particular interest for the oil palm as they specify oil yields and quality. This paper presents the characterization of the oil palm genome using different gene prediction methods and comparative genomics analysis, identification of FA biosynthesis and disease resistance genes, and the development of an annotation database and bioinformatics tools. Using two independent gene-prediction pipelines, Fgenesh++ and Seqping, 26,059 oil palm genes with transcriptome and RefSeq support were identified from the oil palm genome. These coding regions of the genome have a characteristic broad distribution of GC3 (fraction of cytosine and guanine in the third position of a codon) with over half the GC3-rich genes (GC3 ≥ 0.75286) being intronless. In comparison, only one-seventh of the oil palm genes identified are intronless. Using comparative genomics analysis, characterization of conserved domains and active sites, and expression analysis, 42 key genes involved in FA biosynthesis in oil palm were identified. For three of them, namely EgFABF, EgFABH and EgFAD3, segmental duplication events were detected. Our analysis also identified 210 candidate resistance genes in six classes, grouped by their protein domain structures. We present an accurate and comprehensive annotation of the oil palm genome, focusing on analysis of important categories of genes (GC3-rich and intronless), as well as those associated with important functions, such as FA
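    GC3, defined in the abstract as the fraction of cytosine and guanine in the third position of a codon, is straightforward to compute from a coding sequence. The sketch below is illustrative only: the function names and the toy sequence are hypothetical, and whether the stop codon is counted is a convention the abstract does not specify (here every complete codon is counted). The 0.75286 threshold is the one quoted in the abstract.

```python
# Minimal sketch: GC3 of a coding sequence and the GC3-rich test.
# Assumptions: the sequence starts in frame; trailing bases that do not
# complete a codon are ignored; the stop codon, if present, is counted.

GC3_RICH_THRESHOLD = 0.75286  # value quoted in the abstract


def gc3(cds: str) -> float:
    """Fraction of codons whose third base is G or C."""
    seq = cds.upper()
    usable = len(seq) - len(seq) % 3
    third_bases = seq[2:usable:3]
    if not third_bases:
        raise ValueError("sequence contains no complete codon")
    return sum(base in "GC" for base in third_bases) / len(third_bases)


def is_gc3_rich(cds: str, threshold: float = GC3_RICH_THRESHOLD) -> bool:
    return gc3(cds) >= threshold


# Toy CDS (hypothetical): codons ATG GCC AAG TAA, third bases G, C, G, A.
toy = "ATGGCCAAGTAA"
print(gc3(toy), is_gc3_rich(toy))
```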

  20. Complete Chloroplast Genomes of Papaver rhoeas and Papaver orientale: Molecular Structures, Comparative Analysis, and Phylogenetic Analysis

    Directory of Open Access Journals (Sweden)

    Jianguo Zhou

    2018-02-01

    Full Text Available Papaver rhoeas L. and P. orientale L., which belong to the family Papaveraceae, are used as ornamental and medicinal plants. The chloroplast genome has been used for molecular markers, evolutionary biology, and barcoding identification. In this study, the complete chloroplast genome sequences of P. rhoeas and P. orientale are reported. Results show that the complete chloroplast genomes of P. rhoeas and P. orientale have typical quadripartite structures, which are comprised of circular 152,905 and 152,799-bp-long molecules, respectively. A total of 130 genes were identified in each genome, including 85 protein-coding genes, 37 tRNA genes, and 8 rRNA genes. Sequence divergence analysis of four species from Papaveraceae indicated that the most divergent regions are found in the non-coding spacers with minimal differences among three Papaver species. These differences include the ycf1 gene and intergenic regions, such as rpoB-trnC, trnD-trnT, petA-psbJ, psbE-petL, and ccsA-ndhD. These regions are hypervariable regions, which can be used as specific DNA barcodes. This finding suggested that the chloroplast genome could be used as a powerful tool to resolve the phylogenetic positions and relationships of Papaveraceae. These results offer valuable information for future research in the identification of Papaver species and will benefit further investigations of these species.

  1. De novo prediction of human chromosome structures: Epigenetic marking patterns encode genome architecture

    Science.gov (United States)

    Di Pierro, Michele; Cheng, Ryan R.; Lieberman Aiden, Erez; Wolynes, Peter G.; Onuchic, José N.

    2017-01-01

    Inside the cell nucleus, genomes fold into organized structures that are characteristic of cell type. Here, we show that this chromatin architecture can be predicted de novo using epigenetic data derived from chromatin immunoprecipitation-sequencing (ChIP-Seq). We exploit the idea that chromosomes encode a 1D sequence of chromatin structural types. Interactions between these chromatin types determine the 3D structural ensemble of chromosomes through a process similar to phase separation. First, a neural network is used to infer the relation between the epigenetic marks present at a locus, as assayed by ChIP-Seq, and the genomic compartment in which those loci reside, as measured by DNA-DNA proximity ligation (Hi-C). Next, types inferred from this neural network are used as an input to an energy landscape model for chromatin organization [Minimal Chromatin Model (MiChroM)] to generate an ensemble of 3D chromosome conformations at a resolution of 50 kilobases (kb). After training the model, dubbed Maximum Entropy Genomic Annotation from Biomarkers Associated to Structural Ensembles (MEGABASE), on odd-numbered chromosomes, we predict the sequences of chromatin types and the subsequent 3D conformational ensembles for the even chromosomes. We validate these structural ensembles by using ChIP-Seq tracks alone to predict Hi-C maps, as well as distances measured using 3D fluorescence in situ hybridization (FISH) experiments. Both sets of experiments support the hypothesis of phase separation being the driving process behind compartmentalization. These findings strongly suggest that epigenetic marking patterns encode sufficient information to determine the global architecture of chromosomes and that de novo structure prediction for whole genomes may be increasingly possible. PMID:29087948

  2. De novo prediction of human chromosome structures: Epigenetic marking patterns encode genome architecture.

    Science.gov (United States)

    Di Pierro, Michele; Cheng, Ryan R; Lieberman Aiden, Erez; Wolynes, Peter G; Onuchic, José N

    2017-11-14

    Inside the cell nucleus, genomes fold into organized structures that are characteristic of cell type. Here, we show that this chromatin architecture can be predicted de novo using epigenetic data derived from chromatin immunoprecipitation-sequencing (ChIP-Seq). We exploit the idea that chromosomes encode a 1D sequence of chromatin structural types. Interactions between these chromatin types determine the 3D structural ensemble of chromosomes through a process similar to phase separation. First, a neural network is used to infer the relation between the epigenetic marks present at a locus, as assayed by ChIP-Seq, and the genomic compartment in which those loci reside, as measured by DNA-DNA proximity ligation (Hi-C). Next, types inferred from this neural network are used as an input to an energy landscape model for chromatin organization [Minimal Chromatin Model (MiChroM)] to generate an ensemble of 3D chromosome conformations at a resolution of 50 kilobases (kb). After training the model, dubbed Maximum Entropy Genomic Annotation from Biomarkers Associated to Structural Ensembles (MEGABASE), on odd-numbered chromosomes, we predict the sequences of chromatin types and the subsequent 3D conformational ensembles for the even chromosomes. We validate these structural ensembles by using ChIP-Seq tracks alone to predict Hi-C maps, as well as distances measured using 3D fluorescence in situ hybridization (FISH) experiments. Both sets of experiments support the hypothesis of phase separation being the driving process behind compartmentalization. These findings strongly suggest that epigenetic marking patterns encode sufficient information to determine the global architecture of chromosomes and that de novo structure prediction for whole genomes may be increasingly possible. Copyright © 2017 the Author(s). Published by PNAS.

  3. Complete plastid genomes from Ophioglossum californicum, Psilotum nudum, and Equisetum hyemale reveal an ancestral land plant genome structure and resolve the position of Equisetales among monilophytes

    Directory of Open Access Journals (Sweden)

    Grewe Felix

    2013-01-01

    Full Text Available Abstract Background Plastid genome structure and content is remarkably conserved in land plants. This widespread conservation has facilitated taxon-rich phylogenetic analyses that have resolved organismal relationships among many land plant groups. However, the relationships among major fern lineages, especially the placement of Equisetales, remain enigmatic. Results In order to understand the evolution of plastid genomes and to establish phylogenetic relationships among ferns, we sequenced the plastid genomes from three early diverging species: Equisetum hyemale (Equisetales), Ophioglossum californicum (Ophioglossales), and Psilotum nudum (Psilotales). A comparison of fern plastid genomes showed that some lineages have retained inverted repeat (IR) boundaries originating from the common ancestor of land plants, while other lineages have experienced multiple IR changes including expansions and inversions. Genome content has remained stable throughout ferns, except for a few lineage-specific losses of genes and introns. Notably, the losses of the rps16 gene and the rps12i346 intron are shared among Psilotales, Ophioglossales, and Equisetales, while the gain of a mitochondrial atp1 intron is shared between Marattiales and Polypodiopsida. These genomic structural changes support the placement of Equisetales as sister to Ophioglossales + Psilotales and Marattiales as sister to Polypodiopsida. This result is augmented by some molecular phylogenetic analyses that recover the same relationships, whereas others suggest a relationship between Equisetales and Polypodiopsida. Conclusions Although molecular analyses were inconsistent with respect to the position of Marattiales and Equisetales, several genomic structural changes have for the first time provided a clear placement of these lineages within the ferns.
These results further demonstrate the power of using rare genomic structural changes in cases where molecular data fail to provide strong phylogenetic

  4. Elucidating the role of transcription in shaping the 3D structure of the bacterial genome

    Science.gov (United States)

    Brandao, Hugo B.; Wang, Xindan; Rudner, David Z.; Mirny, Leonid

    Active transcription has been linked to several genome conformation changes in bacteria, including the recruitment of chromosomal DNA to the cell membrane and formation of nucleoid clusters. Using genomic and imaging data as input into mathematical models and polymer simulations, we sought to explore the extent to which bacterial 3D genome structure could be explained by 1D transcription tracks. Using B. subtilis as a model organism, we investigated via polymer simulations the role of loop extrusion and DNA super-coiling on the formation of interaction domains and other fine-scale features that are visible in chromosome conformation capture (Hi-C) data. We then explored the role of the condensin structural maintenance of chromosome complex on the alignment of chromosomal arms. A parameter-free transcription traffic model demonstrated that mean chromosomal arm alignment can be quantitatively explained, and the effects on arm alignment in genomically rearranged strains of B. subtilis were accurately predicted. H.B. acknowledges support from the Natural Sciences and Engineering Research Council of Canada for a PGS-D fellowship.

  5. Cleaning up polyketide synthases.

    Science.gov (United States)

    Kwan, Jason C; Schmidt, Eric W

    2012-03-23

    Complex biosynthetic enzymes such as polyketide synthases make mistakes. In this issue of Chemistry & Biology, Jensen et al. report that a discrete family of acyltransferases is responsible for error correction, hydrolyzing key biosynthetic intermediates from a multi-enzyme complex. This activity might find use in understanding polyketide biosynthesis, particularly in uncultivated organisms and in tailoring the synthesis of small molecules. Copyright © 2012 Elsevier Ltd. All rights reserved.

  6. Hybrid polyketide synthases

    Energy Technology Data Exchange (ETDEWEB)

    Fortman, Jeffrey L.; Hagen, Andrew; Katz, Leonard; Keasling, Jay D.; Poust, Sean; Zhang, Jingwei; Zotchev, Sergey

    2016-05-10

    The present invention provides for a polyketide synthase (PKS) capable of synthesizing an even-chain or odd-chain diacid or lactam or diamine. The present invention also provides for a host cell comprising the PKS and when cultured produces the even-chain diacid, odd-chain diacid, or KAPA. The present invention also provides for a host cell comprising the PKS capable of synthesizing a pimelic acid or KAPA, and when cultured produces biotin.

  7. Biophysical characterization of recombinant proteins: A key to higher structural genomics success

    Science.gov (United States)

    Vedadi, Masoud; Arrowsmith, Cheryl H.; Allali-Hassani, Abdellah; Senisterra, Guillermo; Wasney, Gregory A.

    2010-01-01

    Hundreds of genomes have been successfully sequenced to date, and the data are publicly available. At the same time, the advances in large-scale expression and purification of recombinant proteins have paved the way for structural genomics efforts. Frequently, however, little is known about newly expressed proteins calling for large-scale protein characterization to better understand their biochemical roles and to enable structure–function relationship studies. In the Structural Genomics Consortium (SGC), we have established a platform to characterize large numbers of purified proteins. This includes screening for ligands, enzyme assays, peptide arrays and peptide displacement in a 384-well format. In this review, we describe this platform in more detail and report on how our approach significantly increases the success rate for structure determination. Coupled with high-resolution X-ray crystallography and structure-guided methods, this platform can also be used toward the development of chemical probes through screening families of proteins against a variety of chemical series and focused chemical libraries. PMID:20466062

  8. Structural features based genome-wide characterization and prediction of nucleosome organization

    Directory of Open Access Journals (Sweden)

    Gan Yanglan

    2012-03-01

    Full Text Available Abstract Background Nucleosome distribution along chromatin dictates genomic DNA accessibility and thus profoundly influences gene expression. However, the underlying mechanism of nucleosome formation remains elusive. Here, taking a structural perspective, we systematically explored the nucleosome formation potential of genomic sequences and its effect on chromatin organization and gene expression in S. cerevisiae. Results We analyzed twelve structural features related to flexibility, curvature and energy of DNA sequences. The results showed that some structural features, such as DNA denaturation, DNA-bending stiffness, stacking energy, Z-DNA, propeller twist and free energy, were highly correlated with in vitro and in vivo nucleosome occupancy. Specifically, they can be classified into two classes, one positively and the other negatively correlated with nucleosome occupancy. These two kinds of structural features facilitated nucleosome binding in centromere regions and repressed nucleosome formation in the promoter regions of protein-coding genes to mediate transcriptional regulation. Based on these analyses, we integrated all twelve structural features in a model that predicts nucleosome occupancy in vivo more accurately than the existing methods, which mainly depend on sequence compositional features. Furthermore, we developed a novel approach, named DLaNe, that located nucleosomes by detecting peaks of structural profiles, and built a meta predictor to integrate information from different structural features. As a comparison, we also constructed a hidden Markov model (HMM) to locate nucleosomes based on the profiles of these structural features. The result showed that the meta DLaNe and HMM-based method performed better than the existing methods, demonstrating the power of these structural features in predicting nucleosome positions.
Conclusions Our analysis revealed that DNA structures significantly contribute to nucleosome organization and influence
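
    The abstract above says only that DLaNe locates nucleosomes by detecting peaks of structural profiles and gives no algorithmic detail. As a rough illustration of what peak detection on a per-base profile can look like, here is a generic greedy local-maximum sketch. It is not DLaNe itself: the smoothing window, the 147 bp default spacing (the commonly cited length of nucleosomal DNA) and all function names are assumptions.

```python
def moving_average(values, window):
    """Smooth a profile with a centered moving average (edges use a shorter window)."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed


def find_peaks(profile, window=5, min_distance=147):
    """Return indices of local maxima that are at least `min_distance` apart.

    Hypothetical sketch: candidates are local maxima of the smoothed
    profile; the highest candidates are kept first, and any candidate
    closer than `min_distance` to an already kept peak is discarded.
    """
    smooth = moving_average(profile, window)
    candidates = [
        i for i in range(1, len(smooth) - 1)
        if smooth[i - 1] < smooth[i] >= smooth[i + 1]
    ]
    # Greedy selection: tallest peaks claim their neighbourhood first.
    candidates.sort(key=lambda i: smooth[i], reverse=True)
    kept = []
    for i in candidates:
        if all(abs(i - j) >= min_distance for j in kept):
            kept.append(i)
    return sorted(kept)
```

    In practice a library routine such as SciPy's `scipy.signal.find_peaks` would likely be preferable; the pure-Python version is used here only to keep the sketch dependency-free.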

  9. Nonclinical and Clinical Enterococcus faecium Strains, but Not Enterococcus faecalis Strains, Have Distinct Structural and Functional Genomic Features

    Science.gov (United States)

    Kim, Eun Bae

    2014-01-01

    Certain strains of Enterococcus faecium and Enterococcus faecalis contribute beneficially to animal health and food production, while others are associated with nosocomial infections. To determine whether there are structural and functional genomic features that are distinct between nonclinical (NC) and clinical (CL) strains of those species, we analyzed the genomes of 31 E. faecium and 38 E. faecalis strains. Hierarchical clustering of 7,017 orthologs found in the E. faecium pangenome revealed that NC strains clustered into two clades and are distinct from CL strains. NC E. faecium genomes are significantly smaller than CL genomes, and this difference was partly explained by significantly fewer mobile genetic elements (ME), virulence factors (VF), and antibiotic resistance (AR) genes. E. faecium ortholog comparisons identified 68 and 153 genes that are enriched for NC and CL strains, respectively. Proximity analysis showed that CL-enriched loci, and not NC-enriched loci, are more frequently colocalized on the genome with ME. In CL genomes, AR genes are also colocalized with ME, and VF are more frequently associated with CL-enriched loci. Genes in 23 functional groups are also differentially enriched between NC and CL E. faecium genomes. In contrast, differences were not observed between NC and CL E. faecalis genomes despite their having larger genomes than E. faecium. Our findings show that unlike E. faecalis, NC and CL E. faecium strains are equipped with distinct structural and functional genomic features indicative of adaptation to different environments. PMID:24141120

  10. Using reference-free compressed data structures to analyze sequencing reads from thousands of human genomes.

    Science.gov (United States)

    Dolle, Dirk D; Liu, Zhicheng; Cotten, Matthew; Simpson, Jared T; Iqbal, Zamin; Durbin, Richard; McCarthy, Shane A; Keane, Thomas M

    2017-02-01

    We are rapidly approaching the point where we have sequenced millions of human genomes. There is a pressing need for new data structures to store raw sequencing data and efficient algorithms for population scale analysis. Current reference-based data formats do not fully exploit the redundancy in population sequencing nor take advantage of shared genetic variation. In recent years, the Burrows-Wheeler transform (BWT) and FM-index have been widely employed as a full-text searchable index for read alignment and de novo assembly. We introduce the concept of a population BWT and use it to store and index the sequencing reads of 2705 samples from the 1000 Genomes Project. A key feature is that, as more genomes are added, identical read sequences are increasingly observed, and compression becomes more efficient. We assess the support in the 1000 Genomes read data for every base position of two human reference assembly versions, identifying that 3.2 Mbp with population support was lost in the transition from GRCh37 with 13.7 Mbp added to GRCh38. We show that the vast majority of variant alleles can be uniquely described by overlapping 31-mers and show how rapid and accurate SNP and indel genotyping can be carried out across the genomes in the population BWT. We use the population BWT to carry out nonreference queries to search for the presence of all known viral genomes and discover human T-lymphotropic virus 1 integrations in six samples in a recognized epidemiological distribution. © 2017 Dolle et al.; Published by Cold Spring Harbor Laboratory Press.
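
    The Burrows-Wheeler transform and FM-index mentioned in the abstract above can be sketched on a single short string. This is a textbook illustration, not the population BWT of the paper: it builds the transform by naively sorting rotations (quadratic memory, fine only for toy inputs) and counts pattern occurrences by backward search. The function names and the `$` terminator are assumptions.

```python
from collections import Counter


def bwt(text: str) -> str:
    """Burrows-Wheeler transform via naive rotation sorting.

    Assumes '$' does not occur in `text`; it is appended as a terminator
    that sorts before the letters A-Z and a-z.
    """
    text += "$"
    rotations = sorted(text[i:] + text[:i] for i in range(len(text)))
    return "".join(rotation[-1] for rotation in rotations)


def count_occurrences(bwt_string: str, pattern: str) -> int:
    """Count occurrences of `pattern` in the original text by backward search."""
    # first_col_start[c] = number of characters in the text smaller than c.
    counts = Counter(bwt_string)
    first_col_start = {}
    total = 0
    for char in sorted(counts):
        first_col_start[char] = total
        total += counts[char]

    def rank(char: str, i: int) -> int:
        # Occurrences of `char` in bwt_string[:i]; a real FM-index would
        # answer this from sampled tables instead of rescanning.
        return bwt_string[:i].count(char)

    lo, hi = 0, len(bwt_string)
    for char in reversed(pattern):
        if char not in first_col_start:
            return 0
        lo = first_col_start[char] + rank(char, lo)
        hi = first_col_start[char] + rank(char, hi)
        if lo >= hi:
            return 0
    return hi - lo
```

    Because identical substrings end up adjacent in the sorted rotations, repeated sequences produce long runs in the transform, which is consistent with the abstract's observation that compression improves as more identical reads are added.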

  11. Effects of aneuploidy on genome structure, expression, and interphase organization in Arabidopsis thaliana.

    Directory of Open Access Journals (Sweden)

    Bruno Huettel

    2008-10-01

    Full Text Available Aneuploidy refers to losses and/or gains of individual chromosomes from the normal chromosome set. The resulting gene dosage imbalance has a noticeable effect on the phenotype, as illustrated by aneuploid syndromes, including Down syndrome in humans, and by human solid tumor cells, which are highly aneuploid. Although the phenotypic manifestations of aneuploidy are usually apparent, information about the underlying alterations in structure, expression, and interphase organization of unbalanced chromosome sets is still sparse. Plants generally tolerate aneuploidy better than animals, and, through colchicine treatment and breeding strategies, it is possible to obtain inbred sibling plants with different numbers of chromosomes. This possibility, combined with the genetic and genomics tools available for Arabidopsis thaliana, provides a powerful means to assess systematically the molecular and cytological consequences of aberrant numbers of specific chromosomes. Here, we report on the generation of Arabidopsis plants in which chromosome 5 is present in triplicate. We compare the global transcript profiles of normal diploids and chromosome 5 trisomics, and assess genome integrity using array comparative genome hybridization. We use live cell imaging to determine the interphase 3D arrangement of transgene-encoded fluorescent tags on chromosome 5 in trisomic and triploid plants. The results indicate that trisomy 5 disrupts gene expression throughout the genome and supports the production and/or retention of truncated copies of chromosome 5. Although trisomy 5 does not grossly distort the interphase arrangement of fluorescent-tagged sites on chromosome 5, it may somewhat enhance associations between transgene alleles. Our analysis reveals the complex genomic changes that can occur in aneuploids and underscores the importance of using multiple experimental approaches to investigate how chromosome numerical changes condition abnormal phenotypes and

  12. Effect of supervised structured aerobic exercise training program on interleukin-6, nitric oxide synthase-1, and cyclooxygenase-2 in type 2 diabetes mellitus

    International Nuclear Information System (INIS)

    Karimi, H.; Gillani, S.A.; Rehman, S.S.U.

    2017-01-01

    To determine the effects of supervised structured aerobic exercise training (SSAET) program on interleukin-6 (IL-6), nitric oxide synthase 1 (NOS-1), and cyclooxygenase-2 (COX-2) in type 2 diabetes mellitus (T2DM). Study Design: Randomized controlled trial. Place and Duration of Study: Riphah Rehabilitation and Research Centre, Railways General Hospital, Rawalpindi, from January 2015 to June 2016. Methodology: Patients of either gender, aged 40-70 years, with at least a one-year history of T2DM were included. Those with chronic systemic diseases, a history of regular exercise, smoking, and those on a dietary plan were excluded. A total of 195 patients were screened; 120 were selected and 102 agreed to participate in the study. They were randomly placed into experimental and control groups. SSAET program, routine medication, and dietary plan were applied in the experimental group, whereas the control group was managed with routine medication and dietary plan for 25 weeks. IL-6, NOS-1, and COX-2 were assessed at baseline and 25 weeks. Results: SSAET program, routine medication and dietary plan showed significantly improved IL-6 (pre-mean=0.25 ± 0.11 ng/ml, post-mean=0.19 ± 0.04 ng/ml), NOS-1 (pre-median=4.65 ng/ml, IQ range=1.04 ng/ml; post-median=2.72 ng/ml, IQ range=1.60 ng/ml), and COX-2 (pre-mean=18.72 ± 4.42 ng/ml, post-mean=15.18 ± 2.63 ng/ml) in the experimental group, as compared with the control group managed by routine medication and dietary plan, where deterioration was noted in IL-6 (pre-mean=0.23 ± 0.08 ng/ml, post-mean=0.27 ± 0.08 ng/ml) and COX-2 (pre-mean=18.49 ± 4.56 ng/ml, post-mean=19.10 ± 4.76 ng/ml), while NOS-1 showed slight improvement (pre-mean=4.99 ng/ml, IQ range=2.67 ng/ml; post-mean=4.56 ng/ml, IQ range=3.85 ng/ml). Statistically at the baseline the p-values were not significant (p>0.05) in both experimental and control groups for IL-6, COX-2 and NOS-1; while after 25 weeks of intervention, the experimental group showed significant improvement (p<0

  13. High-throughput SHAPE analysis reveals structures in HIV-1 genomic RNA strongly conserved across distinct biological states.

    Directory of Open Access Journals (Sweden)

    Kevin A Wilkinson

    2008-04-01

    Full Text Available Replication and pathogenesis of the human immunodeficiency virus (HIV) is tightly linked to the structure of its RNA genome, but genome structure in infectious virions is poorly understood. We invent high-throughput SHAPE (selective 2'-hydroxyl acylation analyzed by primer extension) technology, which uses many of the same tools as DNA sequencing, to quantify RNA backbone flexibility at single-nucleotide resolution and from which robust structural information can be immediately derived. We analyze the structure of HIV-1 genomic RNA in four biologically instructive states, including the authentic viral genome inside native particles. Remarkably, given the large number of plausible local structures, the first 10% of the HIV-1 genome exists in a single, predominant conformation in all four states. We also discover that noncoding regions functioning in a regulatory role have significantly lower (p-value < 0.0001) SHAPE reactivities, and hence more structure, than do viral coding regions that function as the template for protein synthesis. By directly monitoring protein binding inside virions, we identify the RNA recognition motif for the viral nucleocapsid protein. Seven structurally homologous binding sites occur in a well-defined domain in the genome, consistent with a role in directing specific packaging of genomic RNA into nascent virions. In addition, we identify two distinct motifs that are targets for the duplex destabilizing activity of this same protein. The nucleocapsid protein destabilizes local HIV-1 RNA structure in ways likely to facilitate initial movement both of the retroviral reverse transcriptase from its tRNA primer and of the ribosome in coding regions. Each of the three nucleocapsid interaction motifs falls in a specific genome domain, indicating that local protein interactions can be organized by the long-range architecture of an RNA. High-throughput SHAPE reveals a comprehensive view of HIV-1 RNA genome structure, and further

  14. Whole genome comparison between table and wine grapes reveals a comprehensive catalog of structural variants.

    Science.gov (United States)

    Di Genova, Alex; Almeida, Andrea Miyasaka; Muñoz-Espinoza, Claudia; Vizoso, Paula; Travisany, Dante; Moraga, Carol; Pinto, Manuel; Hinrichsen, Patricio; Orellana, Ariel; Maass, Alejandro

    2014-01-07

    Grapevine (Vitis vinifera L.) is the most important Mediterranean fruit crop, used to produce both wine and spirits as well as table grape and raisins. Wine and table grape cultivars represent two divergent germplasm pools with different origins and domestication history, as well as differential characteristics for berry size, cluster architecture and berry chemical profile, among others. 'Sultanina' plays a pivotal role in modern table grape breeding providing the main source of seedlessness. This cultivar is also one of the most planted for fresh consumption and raisins production. Given its importance, we sequenced it and implemented a novel strategy for the de novo assembly of its highly heterozygous genome. Our approach produced a draft genome of 466 Mb, recovering 82% of the genes present in the grapevine reference genome; in addition, we identified 240 novel genes. A large number of structural variants and SNPs were identified. Among them, 45 (21 SNPs and 24 INDELs) were experimentally confirmed in 'Sultanina' and six SNPs in other 23 table grape varieties. Transposable elements corresponded to ca. 80% of the repetitive sequences involved in structural variants and more than 2,000 genes were affected in their structure by these variants. Some of these genes are likely involved in embryo development, suggesting that they may contribute to seedlessness, a key trait for table grapes. This work produced the first structural variants and SNPs catalog for grapevine, constituting a novel and very powerful tool for genomic studies in this key fruit crop, particularly useful to support marker assisted breeding in table grapes.

  15. Universal Internucleotide Statistics in Full Genomes: A Footprint of the DNA Structure and Packaging?

    OpenAIRE

    Bogachev, Mikhail I.; Kayumov, Airat R.; Bunde, Armin

    2014-01-01

    Uncovering the fundamental laws that govern the complex DNA structural organization remains challenging and is largely based upon reconstructions from the primary nucleotide sequences. Here we investigate the distributions of the internucleotide intervals and their persistence properties in complete genomes of various organisms from Archaea and Bacteria to H. sapiens, aiming to reveal the manifestation of the universal DNA architecture. We find that in all considered organisms the internucleot...

  16. Fungal type III polyketide synthases.

    Science.gov (United States)

    Hashimoto, Makoto; Nonaka, Takamasa; Fujii, Isao

    2014-10-01

    This article covers the literature on fungal type III polyketide synthases (PKSs) published from 2005 to 2014. Since the first discovery of fungal type III PKS genes in Aspergillus oryzae, reported in 2005, putative genes for type III PKSs have been discovered in fungal genomes. Compared with type I PKSs, type III PKSs are much less abundant in fungi. However, type III PKSs could have some critical roles in fungi. This article summarizes the studies on fungal type III PKS functional analysis, including Neurospora crassa ORAS, Aspergillus niger AnPKS, Botrytis cinerea BPKS and Aspergillus oryzae CsyA and CsyB. It is mostly in vitro analysis using their recombinant enzymes that has revealed their starter and product specificities. Of these, CsyB was found to be a new kind of type III PKS that catalyses the coupling of two β-keto fatty acyl CoAs. Homology modelling reported in this article supports the importance of the capacity of the acyl binding tunnel and active site cavity in fungal type III PKSs.

  17. ASMPKS: an analysis system for modular polyketide synthases

    Directory of Open Access Journals (Sweden)

    Kong Eun-Bae

    2007-09-01

    Full Text Available Abstract Background Polyketides are secondary metabolites of microorganisms with diverse biological activities, including pharmacological functions such as antibiotic, antitumor and agrochemical properties. Polyketides are synthesized by serialized reactions of a set of enzymes called polyketide synthases (PKSs), which coordinate the elongation of carbon skeletons by the stepwise condensation of short carbon precursors. Due to their importance as drugs, the volume of data on polyketides is rapidly increasing and creating a need for computational analysis methods for efficient polyketide research. Moreover, the increasing use of genetic engineering to research new kinds of polyketides requires genome-wide analysis. Results We describe a system named ASMPKS (Analysis System for Modular Polyketide Synthesis) for computational analysis of PKSs against genome sequences. It also provides overall management of information on modular PKS, including polyketide database construction, new PKS assembly, and chain visualization. ASMPKS operates on a web interface to construct the database and to analyze PKSs, allowing polyketide researchers to add their data to this database and to use it easily. In addition, ASMPKS can predict functional modules for a protein sequence submitted by users, estimate the chemical composition of a polyketide synthesized from the modules, and display the carbon chain structure on the web interface. Conclusion ASMPKS has powerful computation features to aid modular PKS research. As various factors, such as starter units and post-processing, are related to polyketide biosynthesis, ASMPKS will be improved through further development to study these factors.

  18. Combining functional and structural genomics to sample the essential Burkholderia structome.

    Directory of Open Access Journals (Sweden)

    Loren Baugh

    Full Text Available The genus Burkholderia includes pathogenic gram-negative bacteria that cause melioidosis, glanders, and pulmonary infections of patients with cancer and cystic fibrosis. Drug resistance has made development of new antimicrobials critical. Many approaches to discovering new antimicrobials, such as structure-based drug design and whole cell phenotypic screens followed by lead refinement, require high-resolution structures of proteins essential to the parasite. We experimentally identified 406 putative essential genes in B. thailandensis, a low-virulence species phylogenetically similar to B. pseudomallei, the causative agent of melioidosis, using saturation-level transposon mutagenesis and next-generation sequencing (Tn-seq). We selected 315 protein products of these genes based on structure-determination criteria, such as excluding very large and/or integral membrane proteins, and entered them into the Seattle Structural Genomics Center for Infection Disease (SSGCID) structure determination pipeline. To maximize structural coverage of these targets, we applied an "ortholog rescue" strategy for those producing insoluble or difficult to crystallize proteins, resulting in the addition of 387 orthologs (or paralogs) from seven other Burkholderia species into the SSGCID pipeline. This structural genomics approach yielded structures from 31 putative essential targets from B. thailandensis, and 25 orthologs from other Burkholderia species, yielding an overall structural coverage for 49 of the 406 essential gene families, with a total of 88 depositions into the Protein Data Bank. Of these, 25 proteins have properties of a potential antimicrobial drug target i.e., no close human homolog, part of an essential metabolic pathway, and a deep binding pocket. We describe the structures of several potential drug targets in detail. This collection of structures, solubility and experimental essentiality data provides a resource for development of drugs against

  19. Combining functional and structural genomics to sample the essential Burkholderia structome.

    Science.gov (United States)

    Baugh, Loren; Gallagher, Larry A; Patrapuvich, Rapatbhorn; Clifton, Matthew C; Gardberg, Anna S; Edwards, Thomas E; Armour, Brianna; Begley, Darren W; Dieterich, Shellie H; Dranow, David M; Abendroth, Jan; Fairman, James W; Fox, David; Staker, Bart L; Phan, Isabelle; Gillespie, Angela; Choi, Ryan; Nakazawa-Hewitt, Steve; Nguyen, Mary Trang; Napuli, Alberto; Barrett, Lynn; Buchko, Garry W; Stacy, Robin; Myler, Peter J; Stewart, Lance J; Manoil, Colin; Van Voorhis, Wesley C

    2013-01-01

    The genus Burkholderia includes pathogenic gram-negative bacteria that cause melioidosis, glanders, and pulmonary infections of patients with cancer and cystic fibrosis. Drug resistance has made development of new antimicrobials critical. Many approaches to discovering new antimicrobials, such as structure-based drug design and whole cell phenotypic screens followed by lead refinement, require high-resolution structures of proteins essential to the parasite. We experimentally identified 406 putative essential genes in B. thailandensis, a low-virulence species phylogenetically similar to B. pseudomallei, the causative agent of melioidosis, using saturation-level transposon mutagenesis and next-generation sequencing (Tn-seq). We selected 315 protein products of these genes based on structure-determination criteria, such as excluding very large and/or integral membrane proteins, and entered them into the Seattle Structural Genomics Center for Infectious Disease (SSGCID) structure determination pipeline. To maximize structural coverage of these targets, we applied an "ortholog rescue" strategy for those producing insoluble or difficult to crystallize proteins, resulting in the addition of 387 orthologs (or paralogs) from seven other Burkholderia species into the SSGCID pipeline. This structural genomics approach yielded structures from 31 putative essential targets from B. thailandensis, and 25 orthologs from other Burkholderia species, yielding an overall structural coverage for 49 of the 406 essential gene families, with a total of 88 depositions into the Protein Data Bank. Of these, 25 proteins have properties of a potential antimicrobial drug target, i.e., no close human homolog, part of an essential metabolic pathway, and a deep binding pocket. We describe the structures of several potential drug targets in detail. This collection of structures, solubility and experimental essentiality data provides a resource for development of drugs against infections and diseases

  20. A structural model of the genome packaging process in a membrane-containing double stranded DNA virus.

    Directory of Open Access Journals (Sweden)

    Chuan Hong

    2014-12-01

    Full Text Available Two crucial steps in the virus life cycle are genome encapsidation to form an infective virion and genome exit to infect the next host cell. In most icosahedral double-stranded (ds) DNA viruses, the viral genome enters and exits the capsid through a unique vertex. Internal membrane-containing viruses possess additional complexity as the genome must be translocated through the viral membrane bilayer. Here, we report the structure of the genome packaging complex with a membrane conduit essential for viral genome encapsidation in the tailless icosahedral membrane-containing bacteriophage PRD1. We utilize single particle electron cryo-microscopy (cryo-EM) and symmetry-free image reconstruction to determine structures of PRD1 virion, procapsid, and packaging deficient mutant particles. At the unique vertex of PRD1, the packaging complex replaces the regular 5-fold structure and crosses the lipid bilayer. These structures reveal that the packaging ATPase P9 and the packaging efficiency factor P6 form a dodecameric portal complex external to the membrane moiety, surrounded by ten major capsid protein P3 trimers. The viral transmembrane density at the special vertex is assigned to be a hexamer of heterodimer of proteins P20 and P22. The hexamer functions as a membrane conduit for the DNA and as a nucleating site for the unique vertex assembly. Our structures show a conformational alteration in the lipid membrane after the P9 and P6 are recruited to the virion. The P8-genome complex is then packaged into the procapsid through the unique vertex while the genome terminal protein P8 functions as a valve that closes the channel once the genome is inside. Comparing mature virion, procapsid, and mutant particle structures led us to propose an assembly pathway for the genome packaging apparatus in the PRD1 virion.

  1. A structural model of the genome packaging process in a membrane-containing double stranded DNA virus.

    Science.gov (United States)

    Hong, Chuan; Oksanen, Hanna M; Liu, Xiangan; Jakana, Joanita; Bamford, Dennis H; Chiu, Wah

    2014-12-01

    Two crucial steps in the virus life cycle are genome encapsidation to form an infective virion and genome exit to infect the next host cell. In most icosahedral double-stranded (ds) DNA viruses, the viral genome enters and exits the capsid through a unique vertex. Internal membrane-containing viruses possess additional complexity as the genome must be translocated through the viral membrane bilayer. Here, we report the structure of the genome packaging complex with a membrane conduit essential for viral genome encapsidation in the tailless icosahedral membrane-containing bacteriophage PRD1. We utilize single particle electron cryo-microscopy (cryo-EM) and symmetry-free image reconstruction to determine structures of PRD1 virion, procapsid, and packaging deficient mutant particles. At the unique vertex of PRD1, the packaging complex replaces the regular 5-fold structure and crosses the lipid bilayer. These structures reveal that the packaging ATPase P9 and the packaging efficiency factor P6 form a dodecameric portal complex external to the membrane moiety, surrounded by ten major capsid protein P3 trimers. The viral transmembrane density at the special vertex is assigned to be a hexamer of heterodimer of proteins P20 and P22. The hexamer functions as a membrane conduit for the DNA and as a nucleating site for the unique vertex assembly. Our structures show a conformational alteration in the lipid membrane after the P9 and P6 are recruited to the virion. The P8-genome complex is then packaged into the procapsid through the unique vertex while the genome terminal protein P8 functions as a valve that closes the channel once the genome is inside. Comparing mature virion, procapsid, and mutant particle structures led us to propose an assembly pathway for the genome packaging apparatus in the PRD1 virion.

  2. Genome-wide population structure and evolutionary history of the Frizarta dairy sheep.

    Science.gov (United States)

    Kominakis, A; Hager-Theodorides, A L; Saridaki, A; Antonakos, G; Tsiamis, G

    2017-10-01

    In the present study, we used genomic data, generated with a medium density single nucleotide polymorphisms (SNP) array, to acquire more information on the population structure and evolutionary history of the synthetic Frizarta dairy sheep. First, two typical measures of linkage disequilibrium (LD) were estimated at various physical distances that were then used to make inferences on the effective population size at key past time points. Population structure was also assessed by both multidimensional scaling analysis and k-means clustering on the distance matrix obtained from the animals' genomic relationships. The Wright's fixation F_ST index was also employed to assess herds' genetic homogeneity and to indirectly estimate past migration rates. The Wright's fixation F_IS index and genomic inbreeding coefficients based on the genomic relationship matrix as well as on runs of homozygosity were also estimated. The Frizarta breed displays relatively low LD levels with r² and |D'| equal to 0.18 and 0.50, respectively, at an average inter-marker distance of 31 kb. Linkage disequilibrium decayed rapidly by distance and persisted over just a few thousand base pairs. Rate of LD decay (β) varied widely among the 26 autosomes with larger values estimated for shorter chromosomes (e.g. β=0.057, for OAR6) and smaller values for longer ones (e.g. β=0.022, for OAR2). The inferred effective population size at the beginning of the breed's formation was as high as 549, was then reduced to 463 in 1981 (end of the breed's formation) and further declined to 187, one generation ago. Multidimensional scaling analysis and k-means clustering suggested a genetically homogenous population, F_ST estimates indicated relatively low genetic differentiation between herds, whereas a heat map of the animals' genomic kinship relationships revealed a stratified population, at a herd level. Estimates of genomic inbreeding coefficients suggested that most recent parental relatedness may have been a

  3. Genomic Structure of an Economically Important Cyanobacterium, Arthrospira (Spirulina) platensis NIES-39

    Science.gov (United States)

    Fujisawa, Takatomo; Narikawa, Rei; Okamoto, Shinobu; Ehira, Shigeki; Yoshimura, Hidehisa; Suzuki, Iwane; Masuda, Tatsuru; Mochimaru, Mari; Takaichi, Shinichi; Awai, Koichiro; Sekine, Mitsuo; Horikawa, Hiroshi; Yashiro, Isao; Omata, Seiha; Takarada, Hiromi; Katano, Yoko; Kosugi, Hiroki; Tanikawa, Satoshi; Ohmori, Kazuko; Sato, Naoki; Ikeuchi, Masahiko; Fujita, Nobuyuki; Ohmori, Masayuki

    2010-01-01

    A filamentous non-N2-fixing cyanobacterium, Arthrospira (Spirulina) platensis, is an important organism for industrial applications and as a food supply. Almost the complete genome of A. platensis NIES-39 was determined in this study. The genome structure of A. platensis is estimated to be a single, circular chromosome of 6.8 Mb, based on optical mapping. Annotation of this 6.7 Mb sequence yielded 6630 protein-coding genes as well as two sets of rRNA genes and 40 tRNA genes. Of the protein-coding genes, 78% are similar to those of other organisms; the remaining 22% are currently unknown. A total of 612 kb of the genome comprises group II introns, insertion sequences and some repetitive elements. Group I introns are located in a protein-coding region. Abundant restriction-modification systems were determined. Unique features in the gene composition were noted, particularly in a large number of genes for adenylate cyclase and haemolysin-like Ca2+-binding proteins and in chemotaxis proteins. Filament-specific genes were highlighted by comparative genomic analysis. PMID:20203057

  4. Single-cell paired-end genome sequencing reveals structural variation per cell cycle

    Science.gov (United States)

    Voet, Thierry; Kumar, Parveen; Van Loo, Peter; Cooke, Susanna L.; Marshall, John; Lin, Meng-Lay; Zamani Esteki, Masoud; Van der Aa, Niels; Mateiu, Ligia; McBride, David J.; Bignell, Graham R.; McLaren, Stuart; Teague, Jon; Butler, Adam; Raine, Keiran; Stebbings, Lucy A.; Quail, Michael A.; D’Hooghe, Thomas; Moreau, Yves; Futreal, P. Andrew; Stratton, Michael R.; Vermeesch, Joris R.; Campbell, Peter J.

    2013-01-01

    The nature and pace of genome mutation is largely unknown. Because standard methods sequence DNA from populations of cells, the genetic composition of individual cells is lost, de novo mutations in cells are concealed within the bulk signal and per cell cycle mutation rates and mechanisms remain elusive. Although single-cell genome analyses could resolve these problems, such analyses are error-prone because of whole-genome amplification (WGA) artefacts and are limited in the types of DNA mutation that can be discerned. We developed methods for paired-end sequence analysis of single-cell WGA products that enable (i) detecting multiple classes of DNA mutation, (ii) distinguishing DNA copy number changes from allelic WGA-amplification artefacts by the discovery of matching aberrantly mapping read pairs among the surfeit of paired-end WGA and mapping artefacts and (iii) delineating the break points and architecture of structural variants. By applying the methods, we capture DNA copy number changes acquired over one cell cycle in breast cancer cells and in blastomeres derived from a human zygote after in vitro fertilization. Furthermore, we were able to discover and fine-map a heritable inter-chromosomal rearrangement t(1;16)(p36;p12) by sequencing a single blastomere. The methods will expedite applications in basic genome research and provide a stepping stone to novel approaches for clinical genetic diagnosis. PMID:23630320

  5. Structural genomics: keeping up with expanding knowledge of the protein universe

    Science.gov (United States)

    Grabowski, Marek; Joachimiak, Andrzej; Otwinowski, Zbyszek; Minor, Wladek

    2010-01-01

    Structural characterization of the protein universe is the main mission of Structural Genomics (SG) programs. However, progress in gene sequencing technology, set in motion in the 1990s, has resulted in rapid expansion of protein sequence space — a twelvefold increase in the past seven years. For the SG field, this creates new challenges and necessitates a reassessment of its strategies. Nevertheless, despite the growth of sequence space, at present nearly half of the content of the Swiss-Prot database and over 40% of Pfam protein families can be structurally modeled based on structures determined so far, with SG projects making an increasingly significant contribution. The SG contribution of new Pfam structures nearly doubled from 27.2% in 2003 to 51.6% in 2006. PMID:17587562

  6. TRFolder-W: a web server for telomerase RNA structure prediction in yeast genomes.

    Science.gov (United States)

    Zhang, Dong; Xue, Xingran; Malmberg, Russell L; Cai, Liming

    2012-10-15

    TRFolder-W is a web server capable of predicting core structures of telomerase RNA (TR) in yeast genomes. TRFolder is a command-line Python toolkit for TR-specific structure prediction. We developed a web version built on the Django web framework, leveraging the work done previously, to include enhancements to increase flexibility of usage. To date, there are five core sub-structures commonly found in TR of fungal species, which are the template region, downstream pseudoknot, boundary element, core-closing stem and triple helix. The aim of TRFolder-W is to use the five core structures as fundamental units to predict potential TR genes for yeast, and to provide a user-friendly interface. Moreover, the application of TRFolder-W can be extended to predict the characteristic structure in species other than fungal species. The web server TRFolder-W is available at http://rna-informatics.uga.edu/?f=software&p=TRFolder-w.

  7. An Arabidopsis callose synthase

    DEFF Research Database (Denmark)

    Ostergaard, Lars; Petersen, Morten; Mattsson, Ole

    2002-01-01

    in the Arabidopsis mpk4 mutant which exhibits systemic acquired resistance (SAR), elevated beta-1,3-glucan synthase activity, and increased callose levels. In addition, AtGsl5 is a likely target of salicylic acid (SA)-dependent SAR, since AtGsl5 mRNA accumulation is induced by SA in wild-type plants, while...... expression of the nahG salicylate hydroxylase reduces AtGsl5 mRNA levels in the mpk4 mutant. These results indicate that AtGsl5 is likely involved in callose synthesis in flowering tissues and in the mpk4 mutant....

  8. StructureFold: genome-wide RNA secondary structure mapping and reconstruction in vivo.

    Science.gov (United States)

    Tang, Yin; Bouvier, Emil; Kwok, Chun Kit; Ding, Yiliang; Nekrutenko, Anton; Bevilacqua, Philip C; Assmann, Sarah M

    2015-08-15

    RNAs fold into complex structures that are integral to the diverse mechanisms underlying RNA regulation of gene expression. Recent development of transcriptome-wide RNA structure profiling through the application of structure-probing enzymes or chemicals combined with high-throughput sequencing has opened a new field that greatly expands the amount of in vitro and in vivo RNA structural information available. The resultant datasets provide the opportunity to investigate RNA structural information on a global scale. However, the analysis of high-throughput RNA structure profiling data requires considerable computational effort and expertise. We present a new platform, StructureFold, that provides an integrated computational solution designed specifically for large-scale RNA structure mapping and reconstruction across any transcriptome. StructureFold automates the processing and analysis of raw high-throughput RNA structure profiling data, allowing the seamless incorporation of wet-bench structural information from chemical probes and/or ribonucleases to restrain RNA secondary structure prediction via the RNAstructure and ViennaRNA package algorithms. StructureFold performs read mapping and alignment, normalization and reactivity derivation, and RNA structure prediction in a single user-friendly web interface or via local installation. The variation in transcript abundance and length that prevails in living cells and consequently causes variation in the counts of structure-probing events between transcripts is accounted for. Accordingly, StructureFold is applicable to RNA structural profiling data obtained in vivo as well as to in vitro or in silico datasets. StructureFold is deployed via the Galaxy platform. StructureFold is freely available as a component of Galaxy available at: https://usegalaxy.org/. Contact: yxt148@psu.edu or sma3@psu.edu. Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press. All rights

  9. Novel proteases from the genome of the carnivorous plant Drosera capensis: Structural prediction and comparative analysis.

    Science.gov (United States)

    Butts, Carter T; Bierma, Jan C; Martin, Rachel W

    2016-10-01

    In his 1875 monograph on insectivorous plants, Darwin described the feeding reactions of Drosera flypaper traps and predicted that their secretions contained a "ferment" similar to mammalian pepsin, an aspartic protease. Here we report a high-quality draft genome sequence for the cape sundew, Drosera capensis, the first genome of a carnivorous plant from order Caryophyllales, which also includes the Venus flytrap (Dionaea) and the tropical pitcher plants (Nepenthes). This species was selected in part for its hardiness and ease of cultivation, making it an excellent model organism for further investigations of plant carnivory. Analysis of predicted protein sequences yields genes encoding proteases homologous to those found in other plants, some of which display sequence and structural features that suggest novel functionalities. Because the sequence similarity to proteins of known structure is in most cases too low for traditional homology modeling, 3D structures of representative proteases are predicted using comparative modeling with all-atom refinement. Although the overall folds and active residues for these proteins are conserved, we find structural and sequence differences consistent with a diversity of substrate recognition patterns. Finally, we predict differences in substrate specificities using in silico experiments, providing targets for structure/function studies of novel enzymes with biological and technological significance. Proteins 2016; 84:1517-1533. © 2016 Wiley Periodicals, Inc.

  10. Monoterpene synthases from common sage (Salvia officinalis)

    Energy Technology Data Exchange (ETDEWEB)

    Croteau, Rodney Bruce (Pullman, WA); Wise, Mitchell Lynn (Pullman, WA); Katahira, Eva Joy (Pullman, WA); Savage, Thomas Jonathan (Christchurch 5, NZ)

    1999-01-01

    cDNAs encoding (+)-bornyl diphosphate synthase, 1,8-cineole synthase and (+)-sabinene synthase from common sage (Salvia officinalis) have been isolated and sequenced, and the corresponding amino acid sequences have been determined. Accordingly, isolated DNA sequences (SEQ ID No:1; SEQ ID No:3 and SEQ ID No:5) are provided which code for the expression of (+)-bornyl diphosphate synthase (SEQ ID No:2), 1,8-cineole synthase (SEQ ID No:4) and (+)-sabinene synthase (SEQ ID No:6), respectively, from sage (Salvia officinalis). In other aspects, replicable recombinant cloning vehicles are provided which code for (+)-bornyl diphosphate synthase, 1,8-cineole synthase or (+)-sabinene synthase, or for a base sequence sufficiently complementary to at least a portion of (+)-bornyl diphosphate synthase, 1,8-cineole synthase or (+)-sabinene synthase DNA or RNA to enable hybridization therewith. In yet other aspects, modified host cells are provided that have been transformed, transfected, infected and/or injected with a recombinant cloning vehicle and/or DNA sequence encoding (+)-bornyl diphosphate synthase, 1,8-cineole synthase or (+)-sabinene synthase. Thus, systems and methods are provided for the recombinant expression of the aforementioned recombinant monoterpene synthases that may be used to facilitate their production, isolation and purification in significant amounts. Recombinant (+)-bornyl diphosphate synthase, 1,8-cineole synthase and (+)-sabinene synthase may be used to obtain expression or enhanced expression of (+)-bornyl diphosphate synthase, 1,8-cineole synthase and (+)-sabinene synthase in plants in order to enhance the production of monoterpenoids, or may be otherwise employed for the regulation or expression of (+)-bornyl diphosphate synthase, 1,8-cineole synthase and (+)-sabinene synthase, or the production of their products.

  11. Complete sequence and structure of the mitochondrial genome of the human tapeworm, Taenia asiatica (Platyhelminthes; Cestoda).

    Science.gov (United States)

    Jeon, H K; Lee, K H; Kim, K H; Hwang, U W; Eom, K S

    2005-06-01

    The complete Taenia asiatica mitochondrial genome was amplified by long extension polymerase chain reaction (long PCR) to yield overlapping fragments that were then completely sequenced. The whole mitochondrial genome was 13 703 bp long and contained 12 protein-encoding, 2 ribosomal RNA (small and large subunits), 22 transfer RNA genes and a short non-coding region. Thus, its gene contents are like those typically found in metazoan animal mitochondrial genomes (apart from the absence of atp8). All the genes were transcribed from the same strand. The 3' end 34 bp region of nad4L overlapped with the 5' end portion of nad4. The tRNA genes were 61-69 bp long, and the secondary structures of 18 tRNAs had typical clover-leaf shapes with paired DHU arms. However, trnC, trnS1, trnS2 and trnR had unpaired DHU arms that were 7-12 bp in length. The tRNAs that transferred serine lacked a DHU arm, as is also observed in a number of parasitic platyhelminths and metazoans. However, the trematode trnRs have paired DHU arms. The T. asiatica mtDNA non-coding region was like that in other cestodes since it was composed of a short non-coding region of 72 nucleotides and a long non-coding region of 176 nucleotides separated by a trnL1/trnS2/trnL2/trnR/nad5 gene cluster. The sequences of the cox1 genes between T. asiatica and T. saginata differ by 4.6%, while the T. asiatica cob gene differs by 4.1% and 12.9% from the cob genes of T. saginata and T. solium, respectively. In conclusion, the T. asiatica mitochondrial genome should provide a resource for comparative mitochondrial genomics and systematic studies of parasitic cestodes.
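    The percentage differences quoted for the cox1 and cob genes are the kind of figure obtained by comparing two aligned sequences position by position. The record does not say how the authors computed them, so the following is only a minimal sketch of an uncorrected pairwise difference, assuming the two sequences are already aligned to equal length and that gap positions ("-") are skipped; the function name percent_difference is hypothetical.

```python
def percent_difference(seq_a: str, seq_b: str) -> float:
    """Percentage of aligned, ungapped positions at which two sequences differ.

    Assumes both sequences come from the same alignment (equal length).
    Positions where either sequence has a gap character '-' are ignored.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    compared = 0
    mismatches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # skip alignment gaps
        compared += 1
        if a != b:
            mismatches += 1

    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * mismatches / compared


if __name__ == "__main__":
    # Toy sequences for illustration only; not real cox1 data.
    print(percent_difference("ACGTACGTAC", "ACGTACGTAA"))
```

    This uncorrected measure does not account for multiple substitutions at the same site; a model-based distance may be more appropriate for strongly diverged sequences.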

  12. The Diversity, Structure, and Function of Heritable Adaptive Immunity Sequences in the Aedes aegypti Genome.

    Science.gov (United States)

    Whitfield, Zachary J; Dolan, Patrick T; Kunitomi, Mark; Tassetto, Michel; Seetin, Matthew G; Oh, Steve; Heiner, Cheryl; Paxinos, Ellen; Andino, Raul

    2017-11-20

    The Aedes aegypti mosquito transmits arboviruses, including dengue, chikungunya, and Zika virus. Understanding the mechanisms underlying mosquito immunity could provide new tools to control arbovirus spread. Insects exploit two different RNAi pathways to combat viral and transposon infection: short interfering RNAs (siRNAs) and PIWI-interacting RNAs (piRNAs) [1, 2]. Endogenous viral elements (EVEs) are sequences from non-retroviral viruses that are inserted into the mosquito genome and can act as templates for the production of piRNAs [3, 4]. EVEs therefore represent a record of past infections and a reservoir of potential immune memory [5]. The large-scale organization of EVEs has been difficult to resolve with short-read sequencing because they tend to integrate into repetitive regions of the genome. To define the diversity, organization, and function of EVEs, we took advantage of the contiguity associated with long-read sequencing to generate a high-quality assembly of the Ae. aegypti-derived Aag2 cell line genome, an important and widely used model system. We show EVEs are acquired through recombination with specific classes of long terminal repeat (LTR) retrotransposons and organize into large loci (>50 kbp) characterized by high LTR density. These EVE-containing loci have increased density of piRNAs compared to similar regions without EVEs. Furthermore, we detected EVE-derived piRNAs consistent with a targeted processing of persistently infecting virus genomes. We propose that comparisons of EVEs across mosquito populations may explain differences in vector competence, and further study of the structure and function of these elements in the genome of mosquitoes may lead to epidemiological interventions. Copyright © 2017 Elsevier Ltd. All rights reserved.

  13. GW-SEM: A Statistical Package to Conduct Genome-Wide Structural Equation Modeling.

    Science.gov (United States)

    Verhulst, Brad; Maes, Hermine H; Neale, Michael C

    2017-05-01

    Improving the accuracy of phenotyping through the use of advanced psychometric tools will increase the power to find significant associations with genetic variants and expand the range of possible hypotheses that can be tested on a genome-wide scale. Multivariate methods, such as structural equation modeling (SEM), are valuable in the phenotypic analysis of psychiatric and substance use phenotypes, but these methods have not been integrated into standard genome-wide association analyses because fitting a SEM at each single nucleotide polymorphism (SNP) along the genome was hitherto considered to be too computationally demanding. By developing a method that can efficiently fit SEMs, it is possible to expand the set of models that can be tested. This is particularly necessary in psychiatric and behavioral genetics, where the statistical methods are often handicapped by phenotypes with large components of stochastic variance. Due to the enormous amount of data that genome-wide scans produce, the statistical methods used to analyze the data are relatively elementary, do not directly correspond with the rich theoretical development, and lack the potential to test more complex hypotheses about the measurement of, and interaction between, comorbid traits. In this paper, we present a method to test the association of a SNP with multiple phenotypes or a latent construct on a genome-wide basis using a diagonally weighted least squares (DWLS) estimator for four common SEMs: a one-factor model, a one-factor residuals model, a two-factor model, and a latent growth model. We demonstrate that the DWLS parameters and p-values strongly correspond with the more traditional full information maximum likelihood parameters and p-values. We also present the timing of simulations and power analyses and a comparison with an existing multivariate GWAS software package.

  14. Aphis Glycines Virus 2, a Novel Insect Virus with a Unique Genome Structure

    Directory of Open Access Journals (Sweden)

    Sijun Liu

    2016-11-01

    Full Text Available The invasive soybean aphid, Aphis glycines, is a major pest in soybeans, resulting in substantial economic loss. We analyzed the A. glycines transcriptome to identify sequences derived from viruses of A. glycines. We identified sequences derived from a novel virus named Aphis glycines virus 2 (ApGlV2). The assembled virus genome sequence was confirmed by reverse transcription polymerase chain reaction (RT-PCR) and Sanger sequencing, conserved domains were characterized, and distribution and transmission were examined. This virus has a positive sense, single-stranded RNA genome of ~4850 nt that encodes three proteins. The RNA-dependent RNA polymerase (RdRp) of ApGlV2 is a permuted RdRp similar to those of some tetraviruses, while the capsid protein is structurally similar to the capsid proteins of plant sobemoviruses. ApGlV2 also encodes a larger minor capsid protein, which is translated by a readthrough mechanism. ApGlV2 appears to be widespread in A. glycines populations and to persistently infect aphids with a 100% vertical transmission rate. ApGlV2 is susceptible to the antiviral RNA interference (RNAi) pathway. This virus, with its unique genome structure with both plant- and insect-virus characteristics, is of particular interest from an evolutionary standpoint.

  15. Reactivation of methionine synthase from Thermotoga maritima (TM0268) requires the downstream gene product TM0269.

    Science.gov (United States)

    Huang, Sha; Romanchuk, Gail; Pattridge, Katherine; Lesley, Scott A; Wilson, Ian A; Matthews, Rowena G; Ludwig, Martha

    2007-08-01

    The crystal structure of the Thermotoga maritima gene product TM0269, determined as part of genome-wide structural coverage of T. maritima by the Joint Center for Structural Genomics, revealed structural homology with the fourth module of the cobalamin-dependent methionine synthase (MetH) from Escherichia coli, despite the lack of significant sequence homology. The gene specifying TM0269 lies in close proximity to another gene, TM0268, which shows sequence homology with the first three modules of E. coli MetH. The fourth module of E. coli MetH is required for reductive remethylation of the cob(II)alamin form of the cofactor and binds the methyl donor for this reactivation, S-adenosylmethionine (AdoMet). Measurements of the rates of methionine formation in the presence and absence of TM0269 and AdoMet demonstrate that both TM0269 and AdoMet are required for reactivation of the inactive cob(II)alamin form of TM0268. These activity measurements confirm the structure-based assignment of the function of the TM0269 gene product. In the presence of TM0269, AdoMet, and reductants, the measured activity of T. maritima MetH is maximal near 80 degrees C, where the specific activity of the purified protein is approximately 15% of that of E. coli methionine synthase (MetH) at 37 degrees C. Comparisons of the structures and sequences of TM0269 and the reactivation domain of E. coli MetH suggest that AdoMet may be bound somewhat differently by the homologous proteins. However, the conformation of a hairpin that is critical for cobalamin binding in E. coli MetH, which constitutes an essential structural element, is retained in the T. maritima reactivation protein despite striking divergence of the sequences.

  16. Extensive loss of translational genes in the structurally dynamic mitochondrial genome of the angiosperm Silene latifolia

    Directory of Open Access Journals (Sweden)

    Sloan Daniel B

    2010-09-01

    Full Text Available Abstract Background: Mitochondrial gene loss and functional transfer to the nucleus is an ongoing process in many lineages of plants, resulting in substantial variation across species in mitochondrial gene content. The Caryophyllaceae represents one lineage that has experienced a particularly high rate of mitochondrial gene loss relative to other angiosperms. Results: In this study, we report the first complete mitochondrial genome sequence from a member of this family, Silene latifolia. The genome can be mapped as a 253,413 bp circle, but its structure is complicated by a large repeated region that is present in 6 copies. Active recombination among these copies produces a suite of alternative genome configurations that appear to be at or near "recombinational equilibrium". The genome contains the fewest genes of any angiosperm mitochondrial genome sequenced to date, with intact copies of only 25 of the 41 protein genes inferred to be present in the common ancestor of angiosperms. As observed more broadly in angiosperms, ribosomal proteins have been especially prone to gene loss in the S. latifolia lineage. The genome has also experienced a major reduction in tRNA gene content, including loss of functional tRNAs of both native and chloroplast origin. Even assuming expanded wobble-pairing rules, the mitochondrial genome can support translation of only 17 of the 61 sense codons, which code for only 9 of the 20 amino acids. In addition, genes encoding 18S and, especially, 5S rRNA exhibit exceptional sequence divergence relative to other plants. Divergence in one region of 18S rRNA appears to be the result of a gene conversion event, in which recombination with a homologous gene of chloroplast origin led to the complete replacement of a helix in this ribosomal RNA. Conclusions: These findings suggest a markedly expanded role for nuclear gene products in the translation of mitochondrial genes in S. latifolia and raise the possibility of altered

  17. Maintenance of genome stability in plants: repairing DNA double strand breaks and chromatin structure stability

    Directory of Open Access Journals (Sweden)

    Sujit Roy

    2014-09-01

    Full Text Available Plant cells are subject to high levels of DNA damage resulting from plants' obligatory dependence on sunlight and the associated exposure to environmental stresses like solar UV radiation, high soil salinity, drought, chilling injury and other air and soil pollutants including heavy metals and metabolic byproducts from endogenous processes. The irreversible DNA damage generated by environmental and genotoxic stresses affects plant growth and development, reproduction and crop productivity. Thus, for maintaining genome stability, plants have developed an extensive array of mechanisms for the detection and repair of DNA damage. This review will focus on recent advances in our understanding of mechanisms regulating plant genome stability in the context of repair of double strand breaks and chromatin structure maintenance.

  18. A penalized linear mixed model for genomic prediction using pedigree structures.

    Science.gov (United States)

    Yang, Can; Li, Cong; Chen, Mengjie; Chen, Xiaowei; Hou, Lin; Zhao, Hongyu

    2014-01-01

    Genetic Analysis Workshop 18 provided a platform for evaluating genomic prediction power based on single-nucleotide polymorphisms from single-nucleotide polymorphism array data and sequencing data. Also, Genetic Analysis Workshop 18 provided a diverse pedigree structure to be explored in prediction. In this study, we attempted to combine pedigree information with single-nucleotide polymorphism data to predict systolic blood pressure. Our results suggested that the prediction power based on pedigree information alone could be unsatisfactory. Using additional information such as single-nucleotide polymorphism genotypes would improve prediction accuracy. In particular, the improvement can be significant when there exist a few single-nucleotide polymorphisms with relatively larger effect sizes. We also compared the prediction performance based on genome-wide association study data (i.e., common variants) and sequencing data (i.e., common variants plus low-frequency variants). The experimental results showed that inclusion of low-frequency variants did not improve prediction accuracy.

  19. Structure and mechanism of the ATPase that powers viral genome packaging.

    Science.gov (United States)

    Hilbert, Brendan J; Hayes, Janelle A; Stone, Nicholas P; Duffy, Caroline M; Sankaran, Banumathi; Kelch, Brian A

    2015-07-21

    Many viruses package their genomes into procapsids using an ATPase machine that is among the most powerful known biological motors. However, how this motor couples ATP hydrolysis to DNA translocation is still unknown. Here, we introduce a model system with unique properties for studying motor structure and mechanism. We describe crystal structures of the packaging motor ATPase domain that exhibit nucleotide-dependent conformational changes involving a large rotation of an entire subdomain. We also identify the arginine finger residue that catalyzes ATP hydrolysis in a neighboring motor subunit, illustrating that previous models for motor structure need revision. Our findings allow us to derive a structural model for the motor ring, which we validate using small-angle X-ray scattering and comparisons with previously published data. We illustrate the model's predictive power by identifying the motor's DNA-binding and assembly motifs. Finally, we integrate our results to propose a mechanistic model for DNA translocation by this molecular machine.

  20. Engineering of plant type III polyketide synthases.

    Science.gov (United States)

    Wakimoto, Toshiyuki; Morita, Hiroyuki; Abe, Ikuro

    2012-01-01

    Members of the chalcone synthase superfamily of type III polyketide synthases (PKSs) catalyze iterative condensations of CoA thioesters to produce a variety of polyketide scaffolds with remarkable structural diversity and biological activities. The homodimeric type III PKSs share a common three-dimensional overall fold with a conserved Cys-His-Asn catalytic triad; notably, only a slight modification of the active site dramatically expands the catalytic repertoire of the enzymes. In addition, the enzymes exhibit extremely promiscuous substrate specificities, and accept a variety of nonphysiological substrates, making the type III PKSs an excellent platform for the further production of unnatural, novel polyketide scaffolds with promising biological activities. This chapter summarizes recent advances in the engineering of plant type III PKS enzymes in our laboratories, using approaches combining structure-based enzyme engineering and precursor-directed biosynthesis with rationally designed substrate analogs. Copyright © 2012 Elsevier Inc. All rights reserved.

  1. Fast and accurate search for non-coding RNA pseudoknot structures in genomes.

    Science.gov (United States)

    Huang, Zhibin; Wu, Yong; Robertson, Joseph; Feng, Liang; Malmberg, Russell L; Cai, Liming

    2008-10-15

    Searching genomes for non-coding RNAs (ncRNAs) by their secondary structure has become an important goal for bioinformatics. For pseudoknot-free structures, ncRNA search can be effective based on the covariance model and CYK-type dynamic programming. However, the computational difficulty in aligning an RNA sequence to a pseudoknot has prohibited fast and accurate search of arbitrary RNA structures. Our previous work introduced a graph model for RNA pseudoknots and proposed to solve the structure-sequence alignment by graph optimization. Given k candidate regions in the target sequence for each of the n stems in the structure, we could compute a best alignment in time O(k^t n) based upon a tree width t decomposition of the structure graph. However, to implement this method in programs that can routinely perform fast yet accurate RNA pseudoknot searches, we need novel heuristics to ensure that, without degrading the accuracy, only a small number of stem candidates need to be examined and a tree decomposition of a small tree width can always be found for the structure graph. The current work builds on the previous one with newly developed preprocessing algorithms to reduce the values for parameters k and t and to implement the search method into a practical program, called RNATOPS, for RNA pseudoknot search. In particular, we introduce techniques, based on probabilistic profiling and distance penalty functions, which can identify for every stem just a small number k (e.g. k algorithm that can yield tree decomposition of small tree width t (e.g. t search prokaryotic and eukaryotic genomes for specific RNA structures of medium to large sizes, including pseudoknots, with high sensitivity and high specificity, and in a reasonable amount of time.
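
To make the role of the two parameters concrete, the following minimal sketch evaluates the stated running-time bound, read here as k raised to the power t, times n. The function name and the example values of k, t and n are hypothetical illustrations and are not taken from the abstract or from RNATOPS itself.

```python
def alignment_cost_bound(k: int, t: int, n: int) -> int:
    """Return k**t * n, the asymptotic bound quoted in the abstract.

    k: candidate regions per stem, t: tree width, n: number of stems.
    This is only the growth term, not a measured running time.
    """
    if k < 1 or t < 0 or n < 0:
        raise ValueError("expected k >= 1, t >= 0, n >= 0")
    return (k ** t) * n


if __name__ == "__main__":
    # Hypothetical values: shrinking k and t dominates any change in n.
    for k, t in [(50, 6), (50, 3), (10, 3)]:
        print(f"k={k:>2} t={t} n=20 -> bound {alignment_cost_bound(k, t, 20):,}")
```

Because k enters the bound exponentially in t, preprocessing that lowers either parameter pays off far more than reducing the number of stems, which is the motivation the abstract gives for the new heuristics.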

  2. Structural genomic alterations in primary mediastinal large B-cell lymphoma.

    Science.gov (United States)

    Twa, David D W; Steidl, Christian

    2015-01-01

    Primary mediastinal large B-cell lymphoma (PMBCL) is an aggressive non-Hodgkin lymphoma that displays phenotypic and genotypic similarity to Hodgkin lymphoma and diffuse large B-cell lymphoma. Studies using genome-wide discovery tools have revealed specific, recurrent structural aberrations as critical somatic events in the pathogenesis of PMBCL. These structural alterations prominently include transcript and protein altering rearrangements and copy number variations of the programmed death ligands 1 (CD274) and 2 (PDCD1LG2), CIITA, JAK2 and REL. Importantly, evidence is emerging that these acquired structural genomic changes, in synergy with other somatic alterations, contribute to PMBCL pathogenesis by influencing tumor microenvironment interactions that favor malignant B-cell growth. The means by which these rearrangements arise are not well understood. However, analysis of breakpoint junctions at base-pair resolution provides preliminary insight into putative rearrangement mechanisms. As the field also anticipates predictive value and therapeutic targeting of structural changes involving programmed death ligands and JAK2, a review of therapies that will likely shape future lymphoma treatment is needed.

  3. Probing Retroviral and Retrotransposon Genome Structures: The “SHAPE” of Things to Come

    Directory of Open Access Journals (Sweden)

    Joanna Sztuba-Solinska

    2012-01-01

    Full Text Available Understanding the nuances of RNA structure as they pertain to biological function remains a formidable challenge for retrovirus research and development of RNA-based therapeutics, an area of particular importance with respect to combating HIV infection. Although a variety of chemical and enzymatic RNA probing techniques have been successfully employed for more than 30 years, they primarily interrogate small (100–500 nt) RNAs that have been removed from their biological context, potentially eliminating long-range tertiary interactions (such as kissing loops and pseudoknots) that may play a critical regulatory role. Selective 2′ hydroxyl acylation analyzed by primer extension (SHAPE), pioneered recently by Merino and colleagues, represents a facile, user-friendly technology capable of interrogating RNA structure with a single reagent and, combined with automated capillary electrophoresis, can analyze an entire 10,000-nucleotide RNA genome in a matter of weeks. Despite these obvious advantages, SHAPE essentially provides a nucleotide “connectivity map,” conversion of which into a 3-D structure requires a variety of complementary approaches. This paper summarizes contributions from SHAPE towards our understanding of the structure of retroviral genomes, modifications to the technology that have been developed to address some of its limitations, and future challenges.

  4. Damming the genomic data flood using a comprehensive analysis and storage data structure.

    Science.gov (United States)

    Bouffard, Marc; Phillips, Michael S; Brown, Andrew M K; Marsh, Sharon; Tardif, Jean-Claude; van Rooij, Tibor

    2010-01-01

    Data generation, driven by rapid advances in genomic technologies, is fast outpacing our analysis capabilities. Faced with this flood of data, more hardware and software resources are added to accommodate data sets whose structure has not specifically been designed for analysis. This leads to unnecessarily lengthy processing times and excessive data handling and storage costs. Current efforts to address this have centered on developing new indexing schemas and analysis algorithms, whereas the root of the problem lies in the format of the data itself. We have developed a new data structure for storing and analyzing genotype and phenotype data. By leveraging data normalization techniques, database management system capabilities and the use of a novel multi-table, multidimensional database structure we have eliminated the following: (i) unnecessarily large data set size due to high levels of redundancy, (ii) sequential access to these data sets and (iii) common bottlenecks in analysis times. The resulting novel data structure horizontally divides the data to circumvent traditional problems associated with the use of databases for very large genomic data sets. The resulting data set required 86% less disk space and performed analytical calculations 6248 times faster compared to a standard approach without any loss of information. Database URL: http://castor.pharmacogenomics.ca.
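
The normalization idea in this abstract can be illustrated with a small sketch. The schema below is an assumption made for illustration only: the table names, column names and toy genotypes are hypothetical and do not reproduce the authors' multi-table, multidimensional design. It shows only how storing sample and marker descriptions once, and referencing them by key, removes the redundancy of a flat file.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with a normalized genotype layout (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE sample (
            sample_id INTEGER PRIMARY KEY,
            label     TEXT NOT NULL UNIQUE
        );
        CREATE TABLE marker (
            marker_id    INTEGER PRIMARY KEY,
            marker_label TEXT NOT NULL UNIQUE,
            chromosome   TEXT NOT NULL,
            position     INTEGER NOT NULL
        );
        -- One row per (sample, marker); marker details are never repeated here.
        CREATE TABLE genotype (
            sample_id    INTEGER NOT NULL REFERENCES sample(sample_id),
            marker_id    INTEGER NOT NULL REFERENCES marker(marker_id),
            allele_count INTEGER NOT NULL CHECK (allele_count IN (0, 1, 2)),
            PRIMARY KEY (sample_id, marker_id)
        );
        """
    )
    # Toy data, invented for the example.
    conn.executemany("INSERT INTO sample VALUES (?, ?)",
                     [(1, "S1"), (2, "S2"), (3, "S3")])
    conn.executemany("INSERT INTO marker VALUES (?, ?, ?, ?)",
                     [(1, "m1", "1", 1000), (2, "m2", "1", 2000)])
    conn.executemany("INSERT INTO genotype VALUES (?, ?, ?)",
                     [(1, 1, 0), (2, 1, 1), (3, 1, 2),
                      (1, 2, 2), (2, 2, 2), (3, 2, 0)])
    return conn


def allele_frequency(conn: sqlite3.Connection, marker_label: str):
    """Frequency of the counted allele at one marker, computed inside the database."""
    row = conn.execute(
        """
        SELECT SUM(g.allele_count) * 1.0 / (2 * COUNT(*))
        FROM genotype AS g
        JOIN marker AS m ON m.marker_id = g.marker_id
        WHERE m.marker_label = ?
        """,
        (marker_label,),
    ).fetchone()
    return row[0]


if __name__ == "__main__":
    db = build_demo_db()
    for label in ("m1", "m2"):
        print(label, allele_frequency(db, label))
```

Keyed lookups of this kind replace a sequential scan of a flat file, which is the general type of saving the abstract describes; the reported 86% and 6248-fold figures refer to the authors' own structure, not to this toy layout.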

  5. Novel insights through the integration of structural and functional genomics data with protein networks.

    Science.gov (United States)

    Clarke, Declan; Bhardwaj, Nitin; Gerstein, Mark B

    2012-09-01

    In recent years, major advances in genomics, proteomics, macromolecular structure determination, and the computational resources capable of processing and disseminating the large volumes of data generated by each have played major roles in advancing a more systems-oriented appreciation of biological organization. One product of systems biology has been the delineation of graph models for describing genome-wide protein-protein interaction networks. The network organization and topology which emerges in such models may be used to address fundamental questions in an array of cellular processes, as well as biological features intrinsic to the constituent proteins (or "nodes") themselves. However, graph models alone constitute an abstraction which neglects the underlying biological and physical reality that the network's nodes and edges are highly heterogeneous entities. Here, we explore some of the advantages of introducing a protein structural dimension to such models, as the marriage of conventional network representations with macromolecular structural data helps to place static node and edge constructs in a biologically more meaningful context. We emphasize that 3D protein structures constitute a valuable conceptual and predictive framework by discussing examples of the insights provided, such as enabling in silico predictions of protein-protein interactions, providing rational and compelling classification schemes for network elements, as well as revealing interesting intrinsic differences between distinct node types, such as disorder and evolutionary features, which may then be rationalized in light of their respective functions within networks. Copyright © 2012 Elsevier Inc. All rights reserved.

  6. Structure, High Affinity, and Negative Cooperativity of the Escherichia coli Holo-(Acyl Carrier Protein):Holo-(Acyl Carrier Protein) Synthase Complex

    Energy Technology Data Exchange (ETDEWEB)

    Marcella, Aaron M.; Culbertson, Sannie J.; Shogren-Knaak, Michael A.; Barb, Adam W.

    2017-11-01

    The Escherichia coli holo-(acyl carrier protein) synthase (ACPS) catalyzes the coenzyme A-dependent activation of apo-ACPP to generate holo-(acyl carrier protein) (holo-ACPP) in an early step of fatty acid biosynthesis. E. coli ACPS is sufficiently different from the human fatty acid synthase to justify the development of novel ACPS-targeting antibiotics. Models of E. coli ACPS in unliganded and holo-ACPP-bound forms solved by X-ray crystallography to 2.05 and 4.10 Å, respectively, revealed that ACPS bound three product holo-ACPP molecules to form a 3:3 hexamer. Solution NMR spectroscopy experiments validated the ACPS binding interface on holo-ACPP using chemical shift perturbations and by determining the relative orientation of holo-ACPP to ACPS by fitting residual dipolar couplings. The binding interface is organized to arrange contacts between positively charged ACPS residues and the holo-ACPP phosphopantetheine moiety, indicating product contains more stabilizing interactions than expected in the enzyme:substrate complex. Indeed, holo-ACPP bound the enzyme with greater affinity than the substrate, apo-ACPP, and with negative cooperativity. The first equivalent of holo-ACPP bound with a KD = 62 ± 13 nM, followed by the binding of two more equivalents of holo-ACPP with KD = 1.2 ± 0.2 μM. Cooperativity was not observed for apo-ACPP which bound with KD = 2.4 ± 0.1 μM. Strong product binding and high levels of holo-ACPP in the cell identify a potential regulatory role of ACPS in fatty acid biosynthesis.
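
As a rough illustration of what the reported dissociation constants imply, the sketch below evaluates a simple single-site binding isotherm, fraction bound = [L] / (KD + [L]), for each KD quoted in the abstract. This is a simplification and an assumption of the example: it treats each class of site independently, assumes free ligand equals total ligand, and does not model the negative cooperativity itself. The ligand concentrations are hypothetical.

```python
# Dissociation constants quoted in the abstract, converted to molar units.
KD_HOLO_FIRST = 62e-9    # first equivalent of holo-ACPP, 62 nM
KD_HOLO_LATER = 1.2e-6   # second and third equivalents, 1.2 uM
KD_APO = 2.4e-6          # apo-ACPP, 2.4 uM


def fraction_bound(ligand_molar: float, kd_molar: float) -> float:
    """Fractional occupancy of one site under a simple 1:1 isotherm.

    Assumes free ligand ~ total ligand (no depletion), a simplification.
    """
    if ligand_molar < 0 or kd_molar <= 0:
        raise ValueError("ligand must be >= 0 and KD must be > 0")
    return ligand_molar / (kd_molar + ligand_molar)


if __name__ == "__main__":
    # Hypothetical ligand concentrations spanning the three KD values.
    for conc in (10e-9, 100e-9, 1e-6, 10e-6):
        print(
            f"[L]={conc:.0e} M  "
            f"holo first={fraction_bound(conc, KD_HOLO_FIRST):.2f}  "
            f"holo later={fraction_bound(conc, KD_HOLO_LATER):.2f}  "
            f"apo={fraction_bound(conc, KD_APO):.2f}"
        )
```

Under these assumptions the first holo-ACPP site is half-occupied at 62 nM, a concentration at which the apo-ACPP site is still almost empty, which is consistent with the abstract's point that product binds more tightly than substrate.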

  7. Population Structure and Genomic Breed Composition in an Angus–Brahman Crossbred Cattle Population

    Directory of Open Access Journals (Sweden)

    Mesfin Gobena

    2018-03-01

    Full Text Available Crossbreeding is a common strategy used in tropical and subtropical regions to enhance beef production, and having accurate knowledge of breed composition is essential for the success of a crossbreeding program. Although pedigree records have been traditionally used to obtain the breed composition of crossbred cattle, the accuracy of pedigree-based breed composition can be reduced by inaccurate and/or incomplete records and Mendelian sampling. Breed composition estimation from genomic data has multiple advantages including higher accuracy without being affected by missing, incomplete, or inaccurate records and the ability to be used as independent authentication of breed in breed-labeled beef products. The present study was conducted with 676 Angus–Brahman crossbred cattle with genotype and pedigree information to evaluate the feasibility and accuracy of using genomic data to determine breed composition. We used genomic data in parametric and non-parametric methods to detect population structure due to differences in breed composition while accounting for the confounding effect of close familial relationships. By applying principal component analysis (PCA) and the maximum likelihood method of ADMIXTURE to genomic data, it was possible to successfully characterize population structure resulting from heterogeneous breed ancestry, while accounting for close familial relationships. PCA results offered additional insight into the different hierarchies of genetic variation structuring. The first principal component was strongly correlated with Angus–Brahman proportions, and the second represented variation within animals that have a relatively more extended Brangus lineage, indicating the presence of a distinct pattern of genetic variation in these cattle. Although there was strong agreement between breed proportions estimated from pedigree and genetic information, there were significant discrepancies between these two methods for certain animals

  8. Whole genome PCR scanning reveals the syntenic genome structure of toxigenic Vibrio cholerae strains in the O1/O139 population.

    Directory of Open Access Journals (Sweden)

    Bo Pang

    Full Text Available Vibrio cholerae is commonly found in estuarine water systems. Toxigenic O1 and O139 V. cholerae strains have caused cholera epidemics and pandemics, whereas the nontoxigenic strains within these serogroups only occasionally lead to disease. To understand the differences in the genome and clonality between the toxigenic and nontoxigenic strains of V. cholerae serogroups O1 and O139, we employed a whole genome PCR scanning (WGPScanning) method, an rrn operon-mediated fragment rearrangement analysis and comparative genomic hybridization (CGH) to analyze the genome structure of different strains. WGPScanning in conjunction with CGH revealed that the genomic contents of the toxigenic strains were conserved, except for a few indels located mainly in mobile elements. Minor nucleotide variation in orthologous genes appeared to be the major difference between the toxigenic strains. rrn operon-mediated rearrangements were infrequent in El Tor toxigenic strains tested using I-CeuI digested pulsed-field gel electrophoresis (PFGE) analysis and PCR analysis based on flanking sequence of rrn operons. Using these methods, we found that the genomic structures of toxigenic El Tor and O139 strains were syntenic. The nontoxigenic strains exhibited more extensive sequence variations, but toxin coregulated pilus positive (TCP+) strains had a similar structure. TCP+ nontoxigenic strains could be subdivided into multiple lineages according to the TCP type, suggesting the existence of complex intermediates in the evolution of toxigenic strains. The data indicate that toxigenic O1 El Tor and O139 strains were derived from a single lineage of intermediates from complex clones in the environment. The nontoxigenic strains with non-El Tor type TCP may yet evolve into new epidemic clones after attaining toxigenic attributes.

  9. One amino acid makes the difference: the formation of ent-kaurene and 16α-hydroxy-ent-kaurane by diterpene synthases in poplar.

    Science.gov (United States)

    Irmisch, Sandra; Müller, Andrea T; Schmidt, Lydia; Günther, Jan; Gershenzon, Jonathan; Köllner, Tobias G

    2015-10-28

    Labdane-related diterpenoids form the largest group among the diterpenes. They fulfill important functions in primary metabolism as essential plant growth hormones and are known to function in secondary metabolism as, for example, phytoalexins. The biosynthesis of labdane-related diterpenes is mediated by the action of class II and class I diterpene synthases. Although terpene synthases have been well investigated in poplar, little is known about diterpene formation in this woody perennial plant species. The recently sequenced genome of Populus trichocarpa possesses two putative copalyl diphosphate synthase genes (CPS, class II) and two putative kaurene synthase genes (KS, class I), which most likely arose through a genome duplication and a recent tandem gene duplication, respectively. We showed that the CPS-like gene PtTPS17 encodes an ent-copalyl diphosphate synthase (ent-CPS), while the protein encoded by the putative CPS gene PtTPS18 showed no enzymatic activity. The putative kaurene synthases PtTPS19 and PtTPS20 both accepted ent-copalyl diphosphate (ent-CPP) as substrate. However, despite their high sequence similarity, they produced different diterpene products. While PtTPS19 formed exclusively ent-kaurene, PtTPS20 generated mainly the diterpene alcohol, 16α-hydroxy-ent-kaurane. Using homology-based structure modeling and site-directed mutagenesis, we demonstrated that one amino acid residue determines the different product specificity of PtTPS19 and PtTPS20. A reciprocal exchange of methionine 607 and threonine 607 in the active sites of PtTPS19 and PtTPS20, respectively, led to a complete interconversion of the enzyme product profiles. Gene expression analysis revealed that the diterpene synthase genes characterized showed organ-specific expression with the highest abundance of PtTPS17 and PtTPS20 transcripts in poplar roots. The poplar diterpene synthases PtTPS17, PtTPS19, and PtTPS20 contribute to the production of ent-kaurene and 16

  10. The contribution of co-transcriptional RNA:DNA hybrid structures to DNA damage and genome instability.

    Science.gov (United States)

    Hamperl, Stephan; Cimprich, Karlene A

    2014-07-01

    Accurate DNA replication and DNA repair are crucial for the maintenance of genome stability, and it is generally accepted that failure of these processes is a major source of DNA damage in cells. Intriguingly, recent evidence suggests that DNA damage is more likely to occur at genomic loci with high transcriptional activity. Furthermore, loss of certain RNA processing factors in eukaryotic cells is associated with increased formation of co-transcriptional RNA:DNA hybrid structures known as R-loops, resulting in double-strand breaks (DSBs) and DNA damage. However, the molecular mechanisms by which R-loop structures ultimately lead to DNA breaks and genome instability are not well understood. In this review, we summarize the current knowledge about the formation, recognition and processing of RNA:DNA hybrids, and discuss possible mechanisms by which these structures contribute to DNA damage and genome instability in the cell. Copyright © 2014 Elsevier B.V. All rights reserved.

  11. DMS-MaPseq for genome-wide or targeted RNA structure probing in vivo.

    Science.gov (United States)

    Zubradt, Meghan; Gupta, Paromita; Persad, Sitara; Lambowitz, Alan M; Weissman, Jonathan S; Rouskin, Silvi

    2017-01-01

    Coupling of structure-specific in vivo chemical modification to next-generation sequencing is transforming RNA secondary structure studies in living cells. The dominant strategy for detecting in vivo chemical modifications uses reverse transcriptase truncation products, which introduce biases and necessitate population-average assessments of RNA structure. Here we present dimethyl sulfate (DMS) mutational profiling with sequencing (DMS-MaPseq), which encodes DMS modifications as mismatches using a thermostable group II intron reverse transcriptase. DMS-MaPseq yields a high signal-to-noise ratio, can report multiple structural features per molecule, and allows both genome-wide studies and focused in vivo investigations of even low-abundance RNAs. We apply DMS-MaPseq for the first analysis of RNA structure within an animal tissue and to identify a functional structure involved in noncanonical translation initiation. Additionally, we use DMS-MaPseq to compare the in vivo structure of pre-mRNAs with their mature isoforms. These applications illustrate DMS-MaPseq's capacity to dramatically expand in vivo analysis of RNA structure.

  12. Primary structure of the human follistatin precursor and its genomic organization

    International Nuclear Information System (INIS)

    Shimasaki, Shunichi; Koga, Makoto; Esch, F.

    1988-01-01

    Follistatin is a single-chain gonadal protein that specifically inhibits follicle-stimulating hormone release. By use of the recently characterized porcine follistatin cDNA as a probe to screen a human testis cDNA library and a genomic library, the structure of the complete human follistatin precursor as well as its genomic organization have been determined. Three of eight cDNA clones that were sequenced predicted a precursor with 344 amino acids, whereas the remaining five cDNA clones encoded a 317 amino acid precursor, resulting from alternative splicing of the precursor mRNA. Mature follistatins contain four contiguous domains that are encoded by precisely separated exons; three of the domains are highly similar to each other, as well as to human epidermal growth factor and human pancreatic secretory trypsin inhibitor. The genomic organization of the human follistatin is similar to that of the human epidermal growth factor gene and thus supports the notion of exon shuffling during evolution

  13. The genomic structure of human BTK, the defective gene in X-linked agammaglobulinemia

    Energy Technology Data Exchange (ETDEWEB)

    Rohrer, J.; Parolini, O. [St. Jude Children's Research Hospital, Memphis, TN (United States)]; Conley, M.E. [St. Jude Children's Research Hospital, Memphis, TN (United States); Univ. of Tennessee College of Medicine, Memphis, TN (United States)]; Belmont, J.W. [Baylor College of Medicine, Houston, TX (United States)]

    1994-12-31

    It has recently been demonstrated that mutations in the gene for Bruton's tyrosine kinase (BTK) are responsible for X-linked agammaglobulinemia. Southern blot analysis and sequencing of cDNA were used to document deletions, insertions, and single base pair substitutions. To facilitate analysis of BTK regulation and to permit the development of assays that could be used to screen genomic DNA for mutations in BTK, the authors determined the genomic organization of this gene. Subcloning of a cosmid and a yeast artificial chromosome showed that BTK is divided into 19 exons spanning 37 kilobases of genomic DNA. Analysis of the region 5′ to the first untranslated exon revealed no consensus TATAA or CAAT boxes; however, three retinoic acid binding sites were identified in this region. Comparison of the structure of BTK with that of other nonreceptor tyrosine kinases, including SRC, FES, and CSK, demonstrated a lack of conservation of exon borders. Information obtained in this study will contribute to understanding of the evolution of nonreceptor tyrosine kinases. It will also be useful in diagnostic studies, including carrier detection, and in studies directed towards gene therapy or gene replacement. 29 refs., 2 figs., 2 tabs.

  14. Ultra-deep sequencing reveals the subclonal structure and genomic evolution of oral squamous cell carcinoma

    DEFF Research Database (Denmark)

    Tabatabaeifar, Siavosh; Thomassen, Mads; Larsen, Martin Jakob

    Background: Oral squamous cell carcinoma (OSCC), a subgroup of head and neck squamous cell carcinoma (HNSCC), is primarily caused by alcohol consumption and tobacco use. Recent DNA sequencing studies suggest that HNSCC are very heterogeneous between patients; however, the intra-patient subclonal...... structure remains unexplored due to a lack of sampling of multiple tumor biopsies from each patient. Materials and methods: To examine the clonal structure and describe the genomic cancer evolution we applied whole-exome sequencing combined with targeted ultra-deep sequencing on biopsies from 5 stage IV...... of unprecedented high resolution enabling clear detection of subclonal structure and observation of otherwise undetectable mutations. Furthermore, we demonstrate that OSCC show a high degree of inter-patient heterogeneity but a low degree of intra-patient/tumor heterogeneity. However, some OSCC cancers contain...

  15. Update on the Pfam5000 Strategy for Selection of Structural Genomics Targets

    Energy Technology Data Exchange (ETDEWEB)

    Chandonia, John-Marc; Brenner, Steven E.

    2005-06-27

    Structural Genomics is an international effort to determine the three-dimensional shapes of all important biological macromolecules, with a primary focus on proteins. Target proteins should be selected according to a strategy that is medically and biologically relevant, of good financial value, and tractable. In 2003, we presented the "Pfam5000" strategy, which involves selecting the 5,000 most important families from the Pfam database as sources for targets. In this update, we show that although both the Pfam database and the number of sequenced genomes have increased in size, the expected benefits of the Pfam5000 strategy have not changed substantially. Solving the structures of proteins from the 5,000 largest Pfam families would allow accurate fold assignment for approximately 65 percent of all prokaryotic proteins (covering 54 percent of residues) and 63 percent of eukaryotic proteins (42 percent of residues). Fewer than 2,300 of the largest families on this list remain to be solved, making the project feasible in the next five years given the expected throughput to be achieved in the production phase of the Protein Structure Initiative.

  16. Genomic epidemiology and population structure of Neisseria gonorrhoeae from remote highly endemic Western Australian populations.

    Science.gov (United States)

    Al Suwayyid, Barakat A; Coombs, Geoffrey W; Speers, David J; Pearson, Julie; Wise, Michael J; Kahler, Charlene M

    2018-02-27

    Neisseria gonorrhoeae causes gonorrhoea, the second most commonly notified sexually transmitted infection in Australia. One of the highest notification rates of gonorrhoea is found in the remote regions of Western Australia (WA). Unlike isolates from the major Australian population centres, the remote community isolates have low rates of antimicrobial resistance (AMR). Population structure and whole-genome comparison of 59 isolates from the Western Australian N. gonorrhoeae collection were used to investigate relatedness of isolates cultured in the metropolitan and remote areas. Core genome phylogeny, multilocus sequence typing (MLST), N. gonorrhoeae multi-antigen sequence typing (NG-MAST) and N. gonorrhoeae sequence typing for antimicrobial resistance (NG-STAR), in addition to hierarchical clustering of sequences, were used to characterize the isolates. Population structure analysis of the 59 isolates together with 72 isolates from an international collection revealed six population groups, suggesting that N. gonorrhoeae is a weakly clonal species. Two distinct population groups, Aus1 and Aus2, represented 63% of WA isolates and were mostly composed of the remote community isolates that carried no chromosomal AMR genotypes. In contrast, the Western Australian metropolitan isolates were frequently multi-drug resistant and belonged to population groups found in the international database, suggesting international transmission of the isolates. Our study suggests that the population structure of N. gonorrhoeae is distinct between the communities in remote and metropolitan WA. Given the high rate of AMR in metropolitan regions, ongoing surveillance is essential to ensure the enduring efficacy of the empiric gonorrhoea treatment in remote WA.

  17. Structural variation in the chicken genome identified by paired-end next-generation DNA sequencing of reduced representation libraries

    Directory of Open Access Journals (Sweden)

    Okimoto Ron

    2011-02-01

    Full Text Available Abstract Background Variation within individual genomes ranges from single nucleotide polymorphisms (SNPs) to kilobase, and even megabase, sized structural variants (SVs), such as deletions, insertions, inversions, and more complex rearrangements. Although much is known about the extent of SVs in humans and mice, species in which they exert significant effects on phenotypes, very little is known about the extent of SVs in the 2.5-times smaller and less repetitive genome of the chicken. Results We identified hundreds of shared and divergent SVs in four commercial chicken lines relative to the reference chicken genome. The majority of SVs were found in intronic and intergenic regions, and we also found SVs in the coding regions. To identify the SVs, we combined high-throughput short read paired-end sequencing of genomic reduced representation libraries (RRLs) of pooled samples from 25 individuals and computational mapping of DNA sequences from a reference genome. Conclusion We provide a first glimpse of the high abundance of small structural genomic variations in the chicken. Extrapolating our results, we estimate that there are thousands of rearrangements in the chicken genome, the majority of which are located in non-coding regions. We observed that structural variation contributes to genetic differentiation among current domesticated chicken breeds and the Red Jungle Fowl. We expect that, because of their high abundance, SVs might explain phenotypic differences and play a role in the evolution of the chicken genome. Finally, our study exemplifies an efficient and cost-effective approach for identifying structural variation in sequenced genomes.

  18. Structural and functional insights of β-glucosidases identified from the genome of Aspergillus fumigatus

    Science.gov (United States)

    Dodda, Subba Reddy; Aich, Aparajita; Sarkar, Nibedita; Jain, Piyush; Jain, Sneha; Mondal, Sudipa; Aikat, Kaustav; Mukhopadhyay, Sudit S.

    2018-03-01

    Thermostable, glucose-tolerant β-glucosidases from Aspergillus species have attracted worldwide interest for their potential in industrial applications and bioethanol production. A strain of Aspergillus fumigatus (AfNITDGPKA3) identified by our laboratory from straw retting ground showed higher cellulase activity, specifically β-glucosidase activity, compared to other contemporary strains. Though A. fumigatus has been known for high cellulase activity, detailed identification and characterization of the cellulase genes from its genome is yet to be done. In this work we have analyzed the cellulase genes from the genome sequence database of Aspergillus fumigatus (Af293). Genome analysis suggests that two cellobiohydrolase, eleven endoglucanase and seventeen β-glucosidase genes are present. β-Glucosidase genes belong to either the Glycohydro1 (GH1 or Bgl1) or the Glycohydro3 (GH3 or Bgl3) family. The sequence similarity suggests that Bgl1 and Bgl3 of A. fumigatus are phylogenetically close to those of A. fischeri and A. oryzae. The modelled structure of Bgl1 predicts a (β/α)8 barrel type structure with a deep and narrow active site, whereas Bgl3 shows a (α/β)8 barrel and (α/β)6 sandwich structure with a shallow and open active site. Docking results suggest that amino acids Glu544, Glu466, Trp408, Trp567, Tyr44, Tyr222, Tyr770, Asp844, Asp537, Asn212 and Asn217 of Bgl3 and Asp224, Asn242, Glu440, Glu445, Tyr367, Tyr365, Thr994, Trp435 and Trp446 of Bgl1 are involved in the hydrolysis. Binding affinity analyses suggest that the Bgl3 and Bgl1 enzymes are more active on substrates like 4-methylumbelliferyl glycoside (MUG) and p-nitrophenyl-β-D-1,4-glucopyranoside (pNPG) than on cellobiose. Further docking with glucose suggests that Bgl1 is more glucose tolerant than Bgl3. Analysis of the Aspergillus fumigatus genome may help to identify a β-glucosidase enzyme with better properties, and the structural information may help to develop an engineered recombinant enzyme.

  19. Structural and In Vivo Studies on Trehalose-6-Phosphate Synthase from Pathogenic Fungi Provide Insights into Its Catalytic Mechanism, Biological Necessity, and Potential for Novel Antifungal Drug Design

    Energy Technology Data Exchange (ETDEWEB)

    Miao, Yi; Tenor, Jennifer L.; Toffaletti, Dena L.; Maskarinec, Stacey A.; Liu, Jiuyu; Lee, Richard E.; Perfect, John R.; Brennan, Richard G.; Hendrickson, Wayne A.

    2017-07-25

    ABSTRACT

    The disaccharide trehalose is critical to the survival of pathogenic fungi in their human host. Trehalose-6-phosphate synthase (Tps1) catalyzes the first step of trehalose biosynthesis in fungi. Here, we report the first structures of eukaryotic Tps1s in complex with substrates or substrate analogues. The overall structures of Tps1 from Candida albicans and Aspergillus fumigatus are essentially identical and reveal N- and C-terminal Rossmann fold domains that form the glucose-6-phosphate and UDP-glucose substrate binding sites, respectively. These Tps1 structures with substrates or substrate analogues reveal key residues involved in recognition and catalysis. Disruption of these key residues severely impaired Tps1 enzymatic activity. Subsequent cellular analyses also highlight the enzymatic function of Tps1 in thermotolerance, yeast-hypha transition, and biofilm development. These results suggest that Tps1 enzymatic functionality is essential for the fungal stress response and virulence. Furthermore, structures of Tps1 in complex with the nonhydrolyzable inhibitor, validoxylamine A, visualize the transition state and support an internal return-like catalytic mechanism that is generalizable to other GT-B-fold retaining glycosyltransferases. Collectively, our results depict key Tps1-substrate interactions, unveil the enzymatic mechanism of these fungal proteins, and pave the way for high-throughput inhibitor screening buttressed and guided by the current structures and those of high-affinity ligand-Tps1 complexes.

    IMPORTANCE Invasive fungal diseases have emerged as major threats, resulting in more than 1.5 million deaths annually worldwide. This epidemic has been further complicated by increasing resistance to all major classes of antifungal drugs in the clinic. Trehalose biosynthesis is essential for the fungal stress response and virulence. Critically, this biosynthetic pathway is absent in

  20. A forest-based feature screening approach for large-scale genome data with complex structures.

    Science.gov (United States)

    Wang, Gang; Fu, Guifang; Corcoran, Christopher

    2015-12-23

    Genome-wide association studies (GWAS) interrogate large-scale whole-genome data to characterize the complex genetic architecture of biomedical traits. When the number of SNPs increases dramatically to half a million but the sample size is still limited to thousands, the traditional p-value based statistical approaches suffer from unprecedented limitations. Feature screening has proved to be an effective and powerful approach to handle ultrahigh dimensional data statistically, yet it has not received much attention in GWAS. Feature screening reduces the feature space from millions to hundreds by removing non-informative noise. However, the univariate measures used to rank features are mainly based on individual effects without considering the mutual interactions with other features. In this article, we explore the performance of a random forest (RF) based feature screening procedure to emphasize the SNPs that have complex effects on a continuous phenotype. Both simulation and real data analysis are conducted to examine the power of the forest-based feature screening. We compare it with five other popular feature screening approaches via simulation and conclude that RF can serve as a decent feature screening tool to accommodate complex genetic effects such as nonlinear, interactive, correlative, and joint effects. Unlike the traditional p-value based Manhattan plot, we use the Permutation Variable Importance Measure (PVIM) to display the relative significance and believe that it will provide as much useful information as the traditional plot. Most complex traits are found to be regulated by epistatic and polygenic variants. The forest-based feature screening is proven to be an efficient, easily implemented, and accurate approach to cope with whole genome data with complex structures. Our explorations should add to the growing body of work extending feature screening to better serve the demands of contemporary genome data.
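The idea behind a permutation variable importance measure, as named in the abstract above, can be sketched in a few lines: shuffle one feature, and record how much the prediction error grows. This is only a minimal illustration under stated assumptions, not the authors' procedure. The fixed `toy_predict` function stands in for a fitted random forest, the two-SNP data set is invented, and the function names are hypothetical. A real PVIM is usually computed per tree on out-of-bag samples.

```python
import random


def mean_squared_error(predict, rows, targets):
    """Average squared difference between predictions and observed targets."""
    return sum((predict(r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows)


def permutation_importance(predict, rows, targets, feature, n_repeats=20, seed=0):
    """Mean increase in MSE after shuffling one feature column.

    `predict` is any fitted model exposed as a function of one row;
    here it is an assumption standing in for a random forest.
    """
    rng = random.Random(seed)
    baseline = mean_squared_error(predict, rows, targets)
    increases = []
    for _ in range(n_repeats):
        column = [r[feature] for r in rows]
        rng.shuffle(column)  # break the link between this feature and the target
        permuted = [r[:feature] + [v] + r[feature + 1:] for r, v in zip(rows, column)]
        increases.append(mean_squared_error(predict, permuted, targets) - baseline)
    return sum(increases) / n_repeats


# Invented toy data: genotype codes 0/1/2 for two SNPs; only SNP 0 drives the trait.
rows = [[i % 3, (i // 3) % 3] for i in range(30)]
targets = [2.0 * r[0] for r in rows]


def toy_predict(row):
    """Hypothetical stand-in model that uses SNP 0 and ignores SNP 1."""
    return 2.0 * row[0]


importance_snp0 = permutation_importance(toy_predict, rows, targets, feature=0)
importance_snp1 = permutation_importance(toy_predict, rows, targets, feature=1)
```

Because the stand-in model ignores SNP 1, shuffling that column leaves the error unchanged, while shuffling SNP 0 raises it. Ranking features by this increase is what allows screening without per-SNP p-values.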

  1. Mitochondrial Genome Sequences and Structures Aid in the Resolution of Piroplasmida phylogeny

    Science.gov (United States)

    Marr, Henry S.; Tarigo, Jaime L.; Cohn, Leah A.; Bird, David M.; Scholl, Elizabeth H.; Levy, Michael G.; Wiegmann, Brian M.; Birkenheuer, Adam J.

    2016-01-01

    The taxonomy of the order Piroplasmida, which includes a number of clinically and economically relevant organisms, is a hotly debated topic amongst parasitologists. Three genera (Babesia, Theileria, and Cytauxzoon) are recognized based on parasite life cycle characteristics, but molecular phylogenetic analyses of 18S sequences have suggested the presence of five or more distinct Piroplasmida lineages. Despite these important advancements, a few studies have been unable to define the taxonomic relationships of some organisms (e.g. C. felis and T. equi) with respect to other Piroplasmida. Additional evidence from mitochondrial genome sequences and synteny should aid in the inference of Piroplasmida phylogeny and resolution of taxonomic uncertainties. In this study, we have amplified, sequenced, and annotated seven previously uncharacterized mitochondrial genomes (Babesia canis, Babesia vogeli, Babesia rossi, Babesia sp. Coco, Babesia conradae, Babesia microti-like sp., and Cytauxzoon felis) and identified additional ribosomal fragments in ten previously characterized mitochondrial genomes. Phylogenetic analysis of concatenated mitochondrial and 18S sequences as well as cox1 amino acid sequence identified five distinct Piroplasmida groups, each of which possesses a unique mitochondrial genome structure. Specifically, our results confirm the existence of four previously identified clades (B. microti group, Babesia sensu stricto, Theileria equi, and a Babesia sensu latu group that includes B. conradae) while supporting the integration of Theileria and Cytauxzoon species into a single fifth taxon. Although known biological characteristics of Piroplasmida corroborate the proposed phylogeny, more investigation into parasite life cycles is warranted to further understand the evolution of the Piroplasmida. 
Our results provide an evolutionary framework for comparative biology of these important animal and human pathogens and help focus renewed efforts toward understanding the

  2. Mitochondrial Genome Sequences and Structures Aid in the Resolution of Piroplasmida phylogeny.

    Directory of Open Access Journals (Sweden)

    Megan E Schreeg

    Full Text Available The taxonomy of the order Piroplasmida, which includes a number of clinically and economically relevant organisms, is a hotly debated topic amongst parasitologists. Three genera (Babesia, Theileria, and Cytauxzoon) are recognized based on parasite life cycle characteristics, but molecular phylogenetic analyses of 18S sequences have suggested the presence of five or more distinct Piroplasmida lineages. Despite these important advancements, a few studies have been unable to define the taxonomic relationships of some organisms (e.g. C. felis and T. equi) with respect to other Piroplasmida. Additional evidence from mitochondrial genome sequences and synteny should aid in the inference of Piroplasmida phylogeny and resolution of taxonomic uncertainties. In this study, we have amplified, sequenced, and annotated seven previously uncharacterized mitochondrial genomes (Babesia canis, Babesia vogeli, Babesia rossi, Babesia sp. Coco, Babesia conradae, Babesia microti-like sp., and Cytauxzoon felis) and identified additional ribosomal fragments in ten previously characterized mitochondrial genomes. Phylogenetic analysis of concatenated mitochondrial and 18S sequences as well as cox1 amino acid sequence identified five distinct Piroplasmida groups, each of which possesses a unique mitochondrial genome structure. Specifically, our results confirm the existence of four previously identified clades (B. microti group, Babesia sensu stricto, Theileria equi, and a Babesia sensu latu group that includes B. conradae) while supporting the integration of Theileria and Cytauxzoon species into a single fifth taxon. Although known biological characteristics of Piroplasmida corroborate the proposed phylogeny, more investigation into parasite life cycles is warranted to further understand the evolution of the Piroplasmida. 
Our results provide an evolutionary framework for comparative biology of these important animal and human pathogens and help focus renewed efforts toward

  3. Farnesyl pyrophosphate synthase enantiospecificity with a chiral risedronate analog, [6,7-dihydro-5H-cyclopenta[c]pyridin-7-yl(hydroxy)methylene]bis(phosphonic acid) (NE-10501): Synthetic, structural, and modeling studies

    Energy Technology Data Exchange (ETDEWEB)

    Deprele, Sylvine; Kashemirov, Boris A.; Hogan, James M.; Ebetino, Frank H.; Barnett, Bobby L.; Evdokimov, Artem; McKenna, Charles E. (USC); (UCIN); (PG)

    2008-08-19

    The complex formed from crystallization of human farnesyl pyrophosphate synthase (hFPPS) from a solution of racemic [6,7-dihydro-5H-cyclopenta[c]pyridin-7-yl(hydroxy)methylene]bis(phosphonic acid) (NE-10501, 8), a chiral analog of the anti-osteoporotic drug risedronate, contained the R enantiomer in the enzyme active site. This enantiospecificity was assessed by computer modeling of inhibitor-active site interactions using Autodock 3, which was also evaluated for predictive ability in calculations of the known configurations of risedronate, zoledronate, and minodronate complexed in the active site of hFPPS. In comparison with these structures, the 8 complex exhibited certain differences, including the presence of only one Mg2+, which could contribute to its 100-fold higher IC50. An improved synthesis of 8 is described, which decreases the number of steps from 12 to 8 and increases the overall yield by 17-fold.

  4. Functional characterization of nine Norway Spruce TPS genes and evolution of gymnosperm terpene synthases of the TPS-d subfamily.

    Science.gov (United States)

    Martin, Diane M; Fäldt, Jenny; Bohlmann, Jörg

    2004-08-01

    Constitutive and induced terpenoids are important defense compounds for many plants against potential herbivores and pathogens. In Norway spruce (Picea abies L. Karst), treatment with methyl jasmonate induces complex chemical and biochemical terpenoid defense responses associated with traumatic resin duct development in stems and volatile terpenoid emissions in needles. The cloning of (+)-3-carene synthase was the first step in characterizing this system at the molecular genetic level. Here we report the isolation and functional characterization of nine additional terpene synthase (TPS) cDNAs from Norway spruce. These cDNAs encode four monoterpene synthases, myrcene synthase, (-)-limonene synthase, (-)-alpha/beta-pinene synthase, and (-)-linalool synthase; three sesquiterpene synthases, longifolene synthase, E,E-alpha-farnesene synthase, and E-alpha-bisabolene synthase; and two diterpene synthases, isopimara-7,15-diene synthase and levopimaradiene/abietadiene synthase, each with a unique product profile. To our knowledge, genes encoding isopimara-7,15-diene synthase and longifolene synthase have not been previously described, and this linalool synthase is the first described from a gymnosperm. These functionally diverse TPS account for much of the structural diversity of constitutive and methyl jasmonate-induced terpenoids in foliage, xylem, bark, and volatile emissions from needles of Norway spruce. Phylogenetic analyses based on the inclusion of these TPS into the TPS-d subfamily revealed that functional specialization of conifer TPS occurred before speciation of Pinaceae. Furthermore, based on TPS enclaves created by distinct branching patterns, the TPS-d subfamily is divided into three groups according to sequence similarities and functional assessment. Similarities of TPS evolution in angiosperms and modeling of TPS protein structures are discussed.

  5. Target Selection and Deselection at the Berkeley Structural Genomics Center

    Energy Technology Data Exchange (ETDEWEB)

    Chandonia, John-Marc; Kim, Sung-Hou; Brenner, Steven E.

    2005-03-22

    At the Berkeley Structural Genomics Center (BSGC), our goal is to obtain a near-complete structural complement of proteins in the minimal organisms Mycoplasma genitalium and M. pneumoniae, two closely related pathogens. Current targets for structure determination have been selected in six major stages, starting with those predicted to be most tractable to high throughput study and likely to yield new structural information. We report on the process used to select these proteins, as well as our target deselection procedure. Target deselection reduces experimental effort by eliminating targets similar to those recently solved by the structural biology community or other centers. We measure the impact of the 69 structures solved at the BSGC as of July 2004 on structure prediction coverage of the M. pneumoniae and M. genitalium proteomes. The number of Mycoplasma proteins for which the fold could first be reliably assigned based on structures solved at the BSGC (24 M. pneumoniae and 21 M. genitalium) is approximately 25 percent of the total resulting from work at all structural genomics centers and the worldwide structural biology community (94 M. pneumoniae and 86 M. genitalium) during the same period. As the number of structures contributed by the BSGC during that period is less than 1 percent of the total worldwide output, the benefits of a focused target selection strategy are apparent. If the structures of all current targets were solved, the percentage of M. pneumoniae proteins for which folds could be reliably assigned would increase from approximately 57 percent (391 of 687) at present to around 80 percent (550 of 687), and the percentage of the proteome that could be accurately modeled would increase from around 37 percent (254 of 687) to about 64 percent (438 of 687). In M. genitalium, the percentage of the proteome that could be structurally annotated based on structures of our remaining targets would rise from 72 percent (348 of 486) to around 76 percent (371 of 486), with the
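The coverage figures quoted in the abstract above can be recomputed directly from the counts it gives. The short sketch below does only that arithmetic. The dictionary labels, the function names and the one-percentage-point tolerance are choices made here for illustration, not anything taken from the source.

```python
# (numerator, denominator, percentage as stated in the abstract)
reported = {
    "BSGC share of first fold assignments, M. pneumoniae": (24, 94, 25),
    "BSGC share of first fold assignments, M. genitalium": (21, 86, 25),
    "M. pneumoniae folds assigned, present": (391, 687, 57),
    "M. pneumoniae folds assigned, projected": (550, 687, 80),
    "M. pneumoniae accurately modeled, present": (254, 687, 37),
    "M. pneumoniae accurately modeled, projected": (438, 687, 64),
    "M. genitalium structurally annotated, present": (348, 486, 72),
    "M. genitalium structurally annotated, projected": (371, 486, 76),
}


def percent(numerator, denominator):
    """Share of the denominator, expressed in percent."""
    return 100.0 * numerator / denominator


def check(table, tolerance=1.0):
    """Map each label to True when the stated percentage matches the counts."""
    return {
        label: abs(percent(n, d) - stated) <= tolerance
        for label, (n, d, stated) in table.items()
    }


results = check(reported)
```

Every stated percentage falls within one point of the value implied by its counts, which is consistent with the abstract's "approximately" and "around" wording.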

  6. Identification and phylogenetic analysis of a novel starch synthase in maize

    Directory of Open Access Journals (Sweden)

    Hanmei Liu

    2015-11-01

    Full Text Available Starch is an important reserve of carbon and energy in plants, providing the majority of calories in the human diet and animal feed. Its synthesis is orchestrated by several key enzymes, and the amount and structure of starch, affecting crop yield and quality, are determined mainly by starch synthase (SS) activity. To date, five SS isoforms, including SSI-IV and Granule Bound Starch Synthase (GBSS), have been identified and their physiological functions have been well characterized. Here, we report the identification of a new SS isoform in maize, designated SSV. By searching sequenced genomes, SSV has been found in all green plants with conserved sequences and gene structures. Our phylogenetic analysis based on 780 base pairs has suggested that SSIV and SSV resulted from a gene duplication event, which may have occurred before the algae formation. An expression profile analysis of SSV in maize has indicated that ZmSSV is mainly transcribed in the kernel and ear leaf during the grain filling stage, which is partly similar to other SS isoforms. Therefore, it is likely that SSV may play an important role in starch biosynthesis. Subsequent analysis of SSV function may facilitate understanding the mechanism of starch granules formation, number and structure.

  7. Structure, proteome and genome of Sinorhizobium meliloti phage ΦM5: A virus with LUZ24-like morphology and a highly mosaic genome.

    Science.gov (United States)

    Johnson, Matthew C; Sena-Velez, Marta; Washburn, Brian K; Platt, Georgia N; Lu, Stephen; Brewer, Tess E; Lynn, Jason S; Stroupe, M Elizabeth; Jones, Kathryn M

    2017-12-01

    Bacteriophages of nitrogen-fixing rhizobial bacteria are revealing a wealth of novel structures, diverse enzyme combinations and genomic features. Here we report the cryo-EM structure of the phage capsid at 4.9-5.7Å-resolution, the phage particle proteome, and the genome of the Sinorhizobium meliloti-infecting Podovirus ΦM5. This is the first structure of a phage with a capsid and capsid-associated structural proteins related to those of the LUZ24-like viruses that infect Pseudomonas aeruginosa. Like many other Podoviruses, ΦM5 is a T=7 icosahedron with a smooth capsid and short, relatively featureless tail. Nonetheless, this group is phylogenetically quite distinct from Podoviruses of the well-characterized T7, P22, and epsilon 15 supergroups. Structurally, a distinct bridge of density that appears unique to ΦM5 reaches down the body of the coat protein to the extended loop that interacts with the next monomer in a hexamer, perhaps stabilizing the mature capsid. Further, the predicted tail fibers of ΦM5 are quite different from those of enteric bacteria phages, but have domains in common with other rhizophages. Genomically, ΦM5 is highly mosaic. The ΦM5 genome is 44,005bp with 357bp direct terminal repeats (DTRs) and 58 unique ORFs. Surprisingly, the capsid structural module, the tail module, the DNA-packaging terminase, the DNA replication module and the integrase each appear to be from a different lineage. One of the most unusual features of ΦM5 is its terminase whose large subunit is quite different from previously-described short-DTR-generating packaging machines and does not fit into any of the established phylogenetic groups. Copyright © 2017 The Authors. Published by Elsevier Inc. All rights reserved.

  8. Mapping the structure of folding cores in TIM barrel proteins by hydrogen exchange mass spectrometry: the roles of motif and sequence for the indole-3-glycerol phosphate synthase from Sulfolobus solfataricus.

    Science.gov (United States)

    Gu, Zhenyu; Zitzewitz, Jill A; Matthews, C Robert

    2007-04-27

    To test the roles of motif and amino acid sequence in the folding mechanisms of TIM barrel proteins, hydrogen-deuterium exchange was used to explore the structure of the stable folding intermediates for the indole-3-glycerol phosphate synthase from Sulfolobus solfataricus (sIGPS). Previous studies of the urea denaturation of sIGPS revealed the presence of an intermediate that is highly populated at approximately 4.5 M urea and contains approximately 50% of the secondary structure of the native (N) state. Kinetic studies showed that this apparent equilibrium intermediate is actually comprised of two thermodynamically distinct species, I(a) and I(b). To probe the location of the secondary structure in this pair of stable on-pathway intermediates, the equilibrium unfolding process of sIGPS was monitored by hydrogen-deuterium exchange mass spectrometry. The intact protein and pepsin-digested fragments were studied at various concentrations of urea by electrospray and matrix-assisted laser desorption ionization time-of-flight mass spectrometry, respectively. Intact sIGPS strongly protects at least 54 amide protons from hydrogen-deuterium exchange in the intermediate states, demonstrating the presence of stable folded cores. When the protection patterns and the exchange mechanisms for the peptides are considered with the proposed folding mechanism, the results can be interpreted to define the structural boundaries of I(a) and I(b). Comparison of these results with previous hydrogen-deuterium exchange studies on another TIM barrel protein of low sequence identity, alpha-tryptophan synthase (alphaTS), indicates that the thermodynamic states corresponding to the folding intermediates are better conserved than their structures. Although the TIM barrel motif appears to define the basic features of the folding free energy surface, the structures of the partially folded states that appear during the folding reaction depend on the amino acid sequence. Markedly, the good

  9. Modeling structure of G protein-coupled receptors in human genome

    KAUST Repository

    Zhang, Yang

    2016-01-26

    G protein-coupled receptors (or GPCRs) are integral transmembrane proteins responsible for various cellular signal transductions. Human GPCR proteins are encoded by 5% of human genes but account for the targets of 40% of the FDA approved drugs. Due to difficulties in crystallization, experimental structure determination remains extremely difficult for human GPCRs, which has been a major barrier in modern structure-based drug discovery. We proposed a new hybrid protocol, GPCR-I-TASSER, to construct GPCR structure models by integrating experimental mutagenesis data with ab initio transmembrane-helix assembly simulations, assisted by the predicted transmembrane-helix interaction networks. The method was tested in recent community-wide GPCRDock experiments and constructed models with a root mean square deviation of 1.26 Å for Dopamine-3 and 2.08 Å for Chemokine-4 receptors in the transmembrane domain regions, which were significantly closer to the native than the best templates available in the PDB. GPCR-I-TASSER has been applied to model all 1,026 putative GPCRs in the human genome, where 923 are found to have correct folds based on the confidence score analysis and mutagenesis data comparison. The successfully modeled GPCRs contain many pharmaceutically important families that do not have previously solved structures, including Trace amine, Prostanoids, Releasing hormones, Melanocortins, Vasopressin and Neuropeptide Y receptors. All the human GPCR models have been made publicly available through the GPCR-HGmod database at http://zhanglab.ccmb.med.umich.edu/GPCR-HGmod/. The results demonstrate new progress on genome-wide structure modeling of transmembrane proteins which should bring useful impact on the effort of GPCR-targeted drug discovery.
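The root mean square deviation quoted in the abstract above measures the average distance between matched atoms of a model and a reference structure. As a minimal sketch, assuming the two structures are already superimposed and their atoms are listed in the same order, it could be computed as below. The function name and the toy coordinates are hypothetical, and the optimal superposition step used in real structure comparison is deliberately left out.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equally long lists of (x, y, z).

    Assumes the structures are pre-aligned; no superposition is performed.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b):
        total += (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
    return math.sqrt(total / len(coords_a))


# Invented coordinates in angstroms: the "model" is the reference shifted 1 Å along x.
reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
model = [(x + 1.0, y, z) for x, y, z in reference]

shifted_rmsd = rmsd(reference, model)
```

A uniform 1 Å shift of every atom gives an RMSD of exactly 1 Å, which gives a sense of scale for the 1.26 Å and 2.08 Å values reported for the transmembrane regions.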

  10. Assessment of Genetic Heterogeneity in Structured Plant Populations Using Multivariate Whole-Genome Regression Models.

    Science.gov (United States)

    Lehermeier, Christina; Schön, Chris-Carolin; de Los Campos, Gustavo

    2015-09-01

    Plant breeding populations exhibit varying levels of structure and admixture; these features are likely to induce heterogeneity of marker effects across subpopulations. Traditionally, structure has been dealt with as a potential confounder, and various methods exist to "correct" for population stratification. However, these methods induce a mean correction that does not account for heterogeneity of marker effects. The animal breeding literature offers a few recent studies that consider modeling genetic heterogeneity in multibreed data, using multivariate models. However, these methods have received little attention in plant breeding where population structure can have different forms. In this article we address the problem of analyzing data from heterogeneous plant breeding populations, using three approaches: (a) a model that ignores population structure [A-genome-based best linear unbiased prediction (A-GBLUP)], (b) a stratified (i.e., within-group) analysis (W-GBLUP), and (c) a multivariate approach that uses multigroup data and accounts for heterogeneity (MG-GBLUP). The performance of the three models was assessed on three different data sets: a diversity panel of rice (Oryza sativa), a maize (Zea mays L.) half-sib panel, and a wheat (Triticum aestivum L.) data set that originated from plant breeding programs. The estimated genomic correlations between subpopulations varied from null to moderate, depending on the genetic distance between subpopulations and traits. Our assessment of prediction accuracy features cases where ignoring population structure leads to a parsimonious more powerful model as well as others where the multivariate and stratified approaches have higher predictive power. In general, the multivariate approach appeared slightly more robust than either the A- or the W-GBLUP. Copyright © 2015 by the Genetics Society of America.

  11. Molecular docking and molecular dynamics simulation study of inositol phosphorylceramide synthase – inhibitor complex in leishmaniasis: Insight into the structure based drug design [version 2; referees: 2 approved]

    Directory of Open Access Journals (Sweden)

    Vineetha Mandlik

    2016-09-01

    Full Text Available Inositol phosphorylceramide synthase (IPCS) has emerged as an important, interesting and attractive target in the sphingolipid metabolism of Leishmania. IPCS catalyzes the conversion of ceramide to IPC which forms the most predominant sphingolipid in Leishmania. IPCS has no mammalian equivalent and also plays an important role in maintaining the infectivity and viability of the parasite. The present study explores the possibility of targeting IPCS; development of suitable inhibitors for the same would serve as a treatment strategy for the infectious disease leishmaniasis. Five coumarin derivatives were developed as inhibitors of IPCS protein. Molecular dynamics simulations of the complexes of IPCS with these inhibitors were performed which provided insights into the binding modes of the inhibitors. In vitro screening of the top three compounds has resulted in the identification of one of the compounds (compound 3) which shows little cytotoxic effects. This compound therefore represents a good starting point for further in vivo experimentation and could possibly serve as an important drug candidate for the treatment of leishmaniasis.

  12. A quantitative structure-activity relationship (QSAR) study on a few series of potent, highly selective inhibitors of nitric oxide synthase.

    Science.gov (United States)

    Bharti, Vishwa Deepak; Gupta, Satya P; Kumar, Harish

    2014-02-01

    QSAR study was performed on a series of 1,2-dihydro-4-quinazolinamines, 4,5-dialkylsubstituted-2-imino-1,3-thiazolidine derivatives and 4,5-disubstituted-1,3-oxazolidin-2-imine derivatives studied by Tinker et al. [J Med Chem (2003), 46, 913-916], Ueda et al. [Bioorg Med Chem (2004) 12, 4101-4116] and Ueda et al. [Bioorg Med Chem Lett (2004) 14, 313-316], respectively, as potent, highly selective inhibitors of inducible nitric oxide synthase (iNOS). The iNOS inhibition activity of the whole series of compounds was analyzed in relation to the physicochemical and molecular properties of the compounds. The QSAR analysis revealed that the inhibition potency of the compounds was controlled by a topological parameter 1chi(v) (Kier's first order valence molecular connectivity index), density (D), surface tension (St) and length (steric parameter) of a substituent. This suggested that the drug-receptor interaction predominantly involved the dispersion interaction, but the bulky molecule would face steric problem because of which the molecule may not completely fit in active sites of the receptor and thus may not have the optimum interaction.

  13. ViVar: a comprehensive platform for the analysis and visualization of structural genomic variation.

    Directory of Open Access Journals (Sweden)

    Tom Sante

    Full Text Available Structural genomic variations play an important role in human disease and phenotypic diversity. With the rise of high-throughput sequencing tools, mate-pair/paired-end/single-read sequencing has become an important technique for the detection and exploration of structural variation. Several analysis tools exist to handle different parts and aspects of such sequencing-based structural variation analysis pipelines. A comprehensive analysis platform to handle all steps, from processing the sequencing data to the discovery and visualization of structural variants, is missing. The ViVar platform is built to handle the discovery of structural variants, from Depth Of Coverage analysis and aberrant read pair clustering to split read analysis. ViVar provides you with powerful visualization options, enables easy reporting of results and offers better usability and data management. The platform facilitates the processing, analysis and visualization of structural variation based on massive parallel sequencing data, enabling the rapid identification of disease loci or genes. ViVar allows you to scale your analysis with your work load over multiple (cloud) servers, has user access control to keep your data safe and is easily expandable as analysis techniques advance. URL: https://www.cmgg.be/vivar/

  14. From Genome to Structure and Back Again: A Family Portrait of the Transcarbamylases

    Directory of Open Access Journals (Sweden)

    Dashuang Shi

    2015-08-01

    Full Text Available Enzymes in the transcarbamylase family catalyze the transfer of a carbamyl group from carbamyl phosphate (CP) to an amino group of a second substrate. The two best-characterized members, aspartate transcarbamylase (ATCase) and ornithine transcarbamylase (OTCase), are present in most organisms from bacteria to humans. Recently, structures of four new transcarbamylase members, N-acetyl-l-ornithine transcarbamylase (AOTCase), N-succinyl-l-ornithine transcarbamylase (SOTCase), ygeW encoded transcarbamylase (YTCase) and putrescine transcarbamylase (PTCase), have also been determined. Crystal structures of these enzymes have shown that they have a common overall fold with a trimer as their basic biological unit. The monomer structures share a common CP binding site in their N-terminal domain, but have different second substrate binding sites in their C-terminal domain. The discovery of three new transcarbamylases, l-2,3-diaminopropionate transcarbamylase (DPTCase), l-2,4-diaminobutyrate transcarbamylase (DBTCase) and ureidoglycine transcarbamylase (UGTCase), demonstrates that our knowledge and understanding of the spectrum of the transcarbamylase family is still incomplete. In this review, we summarize studies on the structures and function of transcarbamylases demonstrating how structural information helps to define biological function and how small structural differences govern enzyme specificity. Such information is important for correctly annotating transcarbamylase sequences in the genome databases and for identifying new members of the transcarbamylase family.

  15. Universal internucleotide statistics in full genomes: a footprint of the DNA structure and packaging?

    Directory of Open Access Journals (Sweden)

    Mikhail I Bogachev

    Full Text Available Uncovering the fundamental laws that govern the complex DNA structural organization remains challenging and is largely based upon reconstructions from the primary nucleotide sequences. Here we investigate the distributions of the internucleotide intervals and their persistence properties in complete genomes of various organisms from Archaea and Bacteria to H. sapiens, aiming to reveal the manifestation of the universal DNA architecture. We find that in all considered organisms the internucleotide interval distributions exhibit the same [Formula: see text]-exponential form. While in prokaryotes a single [Formula: see text]-exponential function makes the best fit, in eukaryotes the PDF contains additionally a second [Formula: see text]-exponential, which in the human genome makes a perfect approximation over nearly 10 decades. We suggest that this functional form is a footprint of the heterogeneous DNA structure, where the first [Formula: see text]-exponential reflects the universal helical pitch that appears both in pro- and eukaryotic DNA, while the second [Formula: see text]-exponential is a specific marker of the large-scale eukaryotic DNA organization.
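
    The interval statistic discussed in the record above can be illustrated with a minimal Python sketch. It assumes that an internucleotide interval is the distance between consecutive occurrences of the same nucleotide along one strand; that definition, the function names and the toy sequence are illustrative assumptions rather than the authors' code, and no distribution fitting is attempted.

```python
from collections import Counter


def internucleotide_intervals(sequence, nucleotide):
    """Distances between consecutive occurrences of `nucleotide` in `sequence`.

    Assumption: an interval is the index difference between neighbouring
    identical nucleotides on a single strand.
    """
    target = nucleotide.upper()
    positions = [i for i, base in enumerate(sequence.upper()) if base == target]
    return [later - earlier for earlier, later in zip(positions, positions[1:])]


def interval_distribution(sequence, nucleotide):
    """Empirical probability of each interval length (empty dict if no intervals)."""
    intervals = internucleotide_intervals(sequence, nucleotide)
    if not intervals:
        return {}
    counts = Counter(intervals)
    total = len(intervals)
    return {length: counts[length] / total for length in sorted(counts)}


if __name__ == "__main__":
    toy_sequence = "ACGTAACGA"  # hypothetical input, not real genome data
    print(internucleotide_intervals(toy_sequence, "A"))
    print(interval_distribution(toy_sequence, "A"))
```

    On a complete genome the same functions would be applied per nucleotide, and the resulting empirical distribution could then be compared with a candidate functional form.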

  16. Inferring network structure in non-normal and mixed discrete-continuous genomic data.

    Science.gov (United States)

    Bhadra, Anindya; Rao, Arvind; Baladandayuthapani, Veerabhadran

    2018-03-01

    Inferring dependence structure through undirected graphs is crucial for uncovering the major modes of multivariate interaction among high-dimensional genomic markers that are potentially associated with cancer. Traditionally, conditional independence has been studied using sparse Gaussian graphical models for continuous data and sparse Ising models for discrete data. However, there are two clear situations when these approaches are inadequate. The first occurs when the data are continuous but display non-normal marginal behavior such as heavy tails or skewness, rendering an assumption of normality inappropriate. The second occurs when a part of the data is ordinal or discrete (e.g., presence or absence of a mutation) and the other part is continuous (e.g., expression levels of genes or proteins). In this case, the existing Bayesian approaches typically employ a latent variable framework for the discrete part that precludes inferring conditional independence among the data that are actually observed. The current article overcomes these two challenges in a unified framework using Gaussian scale mixtures. Our framework is able to handle continuous data that are not normal and data that are of mixed continuous and discrete nature, while still being able to infer a sparse conditional sign independence structure among the observed data. Extensive performance comparison in simulations with alternative techniques and an analysis of a real cancer genomics data set demonstrate the effectiveness of the proposed approach. © 2017, The International Biometric Society.

  17. Genomic structure and evolution of the mating type locus in the green seaweed Ulva partita.

    Science.gov (United States)

    Yamazaki, Tomokazu; Ichihara, Kensuke; Suzuki, Ryogo; Oshima, Kenshiro; Miyamura, Shinichi; Kuwano, Kazuyoshi; Toyoda, Atsushi; Suzuki, Yutaka; Sugano, Sumio; Hattori, Masahira; Kawano, Shigeyuki

    2017-09-15

    The evolution of sex chromosomes and mating loci in organisms with UV systems of sex/mating type determination in haploid phases via genes on UV chromosomes is not well understood. We report the structure of the mating type (MT) locus and its evolutionary history in the green seaweed Ulva partita, which is a multicellular organism with an isomorphic haploid-diploid life cycle and mating type determination in the haploid phase. Comprehensive comparison of a total of 12.0 and 16.6 Gb of genomic next-generation sequencing data for mt- and mt+ strains identified highly rearranged MT loci of 1.0 and 1.5 Mb in size and containing 46 and 67 genes, respectively, including 23 gametologs. Molecular evolutionary analyses suggested that the MT loci diverged over a prolonged period in the individual mating types after their establishment in an ancestor. A gene encoding an RWP-RK domain-containing protein was found in the mt- MT locus but was not an ortholog of the chlorophycean mating type determination gene MID. Taken together, our results suggest that the genomic structure and its evolutionary history in the U. partita MT locus are similar to those on other UV chromosomes and that the MT locus genes are quite different from those of Chlorophyceae.

  18. Comparative Annotation of Viral Genomes with Non-Conserved Gene Structure

    DEFF Research Database (Denmark)

    de Groot, Saskia; Mailund, Thomas; Hein, Jotun

    2007-01-01

    allows for coding in unidirectional nested and overlapping reading frames, to annotate two homologous aligned viral genomes. Our method does not insist on conserved gene structure between the two sequences, thus making it applicable for the pairwise comparison of more distantly related sequences. Results...... for simultaneously in one direction. Conventional HMM based gene finding algorithms may find it difficult — if not impossible — to identify multiple coding regions, since in general their topologies do not allow for the presence of overlapping or nested genes. Comparative methods have therefore been restricted...... and HIV2, as well as of two different Hepatitis Viruses, attaining results of ~87% sensitivity and ~98.5% specificity. We subsequently incorporate prior knowledge by "knowing" the gene structure of one sequence and annotating the other conditional on it. Boosting accuracy close to perfect we demonstrate...

  19. Ultra-deep sequencing reveals the subclonal structure and genomic evolution of oral squamous cell carcinoma

    DEFF Research Database (Denmark)

    Tabatabaeifar, Siavosh; Thomassen, Mads; Larsen, Martin Jakob

    Background: Oral squamous cell carcinoma (OSCC), a subgroup of head and neck squamous cell carcinoma (HNSCC), is primarily caused by alcohol consumption and tobacco use. Recent DNA sequencing studies suggest that HNSCC are very heterogeneous between patients; however the intra-patient subclonal...... structure remains unexplored due to lack of sampling multiple tumor biopsies from each patient. Materials and methods: To examine the clonal structure and describe the genomic cancer evolution we applied whole-exome sequencing combined with targeted ultra-deep sequencing on biopsies from 5 stage IV...... OSCC patients. From each patient, a series of biopsies were sampled from 3 distinct geographical sites in primary tumor and 1 lymph node metastasis. A whole blood sample was taken as the matched reference. Results and discussion: Our results demonstrate that ultra-deep sequencing gives a level...

  20. Structure of the acidianus filamentous virus 3 and comparative genomics of related archaeal lipothrixviruses

    DEFF Research Database (Denmark)

    Vestergaard, Gisle Alberg; Aramayo, Ricardo; Basta, Tamara

    2008-01-01

    Four novel filamentous viruses with double-stranded DNA genomes, namely, Acidianus filamentous virus 3 (AFV3), AFV6, AFV7, and AFV8, have been characterized from the hyperthermophilic archaeal genus Acidianus, and they are assigned to the Betalipothrixvirus genus of the family Lipothrixviridae.... The structures of the approximately 2-μm-long virions are similar, and one of them, AFV3, was studied in detail. It consists of a cylindrical envelope containing globular subunits arranged in a helical formation that is unique for any known double-stranded DNA virus. The envelope is 3.1 nm thick and encases...... structural proteins; (iii) multiple overlapping open reading frames, which may be indicative of gene recoding; (iv) putative 12-bp genetic elements; and (v) partial gene sequences corresponding closely to spacer sequences of chromosomal repeat clusters....

  1. Population genomics of dengue virus serotype 4: insights into genetic structure and evolution.

    Science.gov (United States)

    Waman, Vaishali P; Kasibhatla, Sunitha Manjari; Kale, Mohan M; Kulkarni-Kale, Urmila

    2016-08-01

    The spread of dengue disease has become a global public health concern. Dengue is caused by dengue virus, which is a mosquito-borne arbovirus of the genus Flavivirus, family Flaviviridae. There are four dengue virus serotypes (1-4), each of which is known to trigger mild to severe disease. Dengue virus serotype 4 (DENV-4) has four genotypes and is increasingly being reported to be re-emerging in various parts of the world. Therefore, the population structure and factors shaping the evolution of DENV-4 strains across the world were studied using genome-based population genetic, phylogenetic and selection pressure analysis methods. The population genomics study helped to reveal the spatiotemporal structure of the DENV-4 population and its primary division into two spatially distinct clusters: American and Asian. These spatial clusters show further time-dependent subdivisions within genotypes I and II. Thus, the DENV-4 population is observed to be stratified into eight genetically distinct lineages, two of which are formed by American strains and six of which are formed by Asian strains. Episodic positive selection was observed in the structural (E) and non-structural (NS2A and NS3) genes, which appears to be responsible for diversification of Asian lineages in general and that of modern lineages of genotype I and II in particular. In summary, the global DENV-4 population is stratified into eight genetically distinct lineages, in a spatiotemporal manner with limited recombination. The significant role of adaptive evolution in causing diversification of DENV-4 lineages is discussed. The evolution of DENV-4 appears to be governed by interplay between spatiotemporal distribution, episodic positive selection and intra/inter-genotype recombination.

  2. AluScan: a method for genome-wide scanning of sequence and structure variations in the human genome

    Directory of Open Access Journals (Sweden)

    Mei Lingling

    2011-11-01

    Full Text Available Abstract Background To complement next-generation sequencing technologies, there is a pressing need for efficient pre-sequencing capture methods with reduced costs and DNA requirement. The Alu family of short interspersed nucleotide elements is the most abundant type of transposable elements in the human genome and a recognized source of genome instability. With over one million Alu elements distributed throughout the genome, they are well positioned to facilitate genome-wide sequence amplification and capture of regions likely to harbor genetic variation hotspots of biological relevance. Results Here we report on the use of inter-Alu PCR with an enhanced range of amplicons in conjunction with next-generation sequencing to generate an Alu-anchored scan, or 'AluScan', of DNA sequences between Alu transposons, where Alu consensus sequence-based 'H-type' PCR primers that elongate outward from the head of an Alu element are combined with 'T-type' primers elongating from the poly-A containing tail to achieve huge amplicon range. To illustrate the method, glioma DNA was compared with white blood cell control DNA of the same patient by means of AluScan. The over 10 Mb sequences obtained, derived from more than 8,000 genes spread over all the chromosomes, revealed a highly reproducible capture of genomic sequences enriched in genic sequences and cancer candidate gene regions. Requiring only sub-micrograms of sample DNA, the power of AluScan as a discovery tool for genetic variations was demonstrated by the identification of 357 instances of loss of heterozygosity, 341 somatic indels, 274 somatic SNVs, and seven potential somatic SNV hotspots between control and glioma DNA. Conclusions AluScan, implemented with just a small number of H-type and T-type inter-Alu PCR primers, provides an effective capture of a diversity of genome-wide sequences for analysis. The method, by enabling an examination of gene-enriched regions containing exons, introns, and

  3. SCHEMA computational design of virus capsid chimeras: calibrating how genome packaging, protection, and transduction correlate with calculated structural disruption.

    Science.gov (United States)

    Ho, Michelle L; Adler, Benjamin A; Torre, Michael L; Silberg, Jonathan J; Suh, Junghae

    2013-12-20

    Adeno-associated virus (AAV) recombination can result in chimeric capsid protein subunits whose ability to assemble into an oligomeric capsid, package a genome, and transduce cells depends on the inheritance of sequence from different AAV parents. To develop quantitative design principles for guiding site-directed recombination of AAV capsids, we have examined how capsid structural perturbations predicted by the SCHEMA algorithm correlate with experimental measurements of disruption in seventeen chimeric capsid proteins. In our small chimera population, created by recombining AAV serotypes 2 and 4, we found that protection of viral genomes and cellular transduction were inversely related to calculated disruption of the capsid structure. Interestingly, however, we did not observe a correlation between genome packaging and calculated structural disruption; a majority of the chimeric capsid proteins formed at least partially assembled capsids and more than half packaged genomes, including those with the highest SCHEMA disruption. These results suggest that the sequence space accessed by recombination of divergent AAV serotypes is rich in capsid chimeras that assemble into 60-mer capsids and package viral genomes. Overall, the SCHEMA algorithm may be useful for delineating quantitative design principles to guide the creation of libraries enriched in genome-protecting virus nanoparticles that can effectively transduce cells. Such improvements to the virus design process may help advance not only gene therapy applications but also other bionanotechnologies dependent upon the development of viruses with new sequences and functions.

  4. Report on three Genomes to Life Workshops: Data Infrastructure, Modeling and Simulation, and Protein Structure Prediction

    Energy Technology Data Exchange (ETDEWEB)

    Geist, GA

    2003-09-16

    On July 22, 23, 24, 2003, three one day workshops were held in Gaithersburg, Maryland. Each was attended by about 30 computational biologists, mathematicians, and computer scientists who were experts in the respective workshop areas. The first workshop discussed the data infrastructure needs for the Genomes to Life (GTL) program with the objective to identify gaps in the present GTL data infrastructure and define the GTL data infrastructure required for the success of the proposed GTL facilities. The second workshop discussed the modeling and simulation needs for the next phase of the GTL program and defined how these relate to the experimental data generated by genomics, proteomics, and metabolomics. The third workshop identified emerging technical challenges in computational protein structure prediction for DOE missions and outlined specific goals for the next phase of GTL. The workshops were attended by representatives from both OBER and OASCR. The invited experts at each of the workshops made short presentations on what they perceived as the key needs in the GTL data infrastructure, modeling and simulation, and structure prediction respectively. Each presentation was followed by a lively discussion by all the workshop attendees. The following findings and recommendations were derived from the three workshops. A seamless integration of GTL data spanning the entire range of genomics, proteomics, and metabolomics will be extremely challenging but it has to be treated as the first-class component of the GTL program to assure GTL's chances for success. High-throughput GTL facilities and ultrascale computing will make it possible to address the ultimate goal of modern biology: to achieve a fundamental, comprehensive, and systematic understanding of life. But first the GTL community needs to address the problem of the massive quantities and increased complexity of biological data produced by experiments and computations. Genome-scale collection, analysis

  5. New Structure Sheds Light on Selective HIV-1 Genomic RNA Packaging.

    Science.gov (United States)

    Olson, Erik D; Cantara, William A; Musier-Forsyth, Karin

    2015-08-24

    Two copies of unspliced human immunodeficiency virus (HIV)-1 genomic RNA (gRNA) are preferentially selected for packaging by the group-specific antigen (Gag) polyprotein into progeny virions as a dimer during the late stages of the viral lifecycle. Elucidating the RNA features responsible for selective recognition of the full-length gRNA in the presence of an abundance of other cellular RNAs and spliced viral RNAs remains an area of intense research. The recent nuclear magnetic resonance (NMR) structure by Keane et al. [1] expands upon previous efforts to determine the conformation of the HIV-1 RNA packaging signal. The data support a secondary structure wherein sequences that constitute the major splice donor site are sequestered through base pairing, and a tertiary structure that adopts a tandem 3-way junction motif that exposes the dimerization initiation site and unpaired guanosines for specific recognition by Gag. While it remains to be established whether this structure is conserved in the context of larger RNA constructs or in the dimer, this study serves as the basis for characterizing large RNA structures using novel NMR techniques, and as a major advance toward understanding how the HIV-1 gRNA is selectively packaged.

  6. Losing identity: structural diversity of transposable elements belonging to different classes in the genome of Anopheles gambiae

    Directory of Open Access Journals (Sweden)

    Fernández-Medina Rita D

    2012-06-01

    Full Text Available Abstract Background Transposable elements (TEs), both DNA transposons and retrotransposons, are genetic elements with the main characteristic of being able to mobilize and amplify their own representation within genomes, utilizing different mechanisms of transposition. An almost universal feature of TEs in eukaryotic genomes is their inability to transpose by themselves, mainly as the result of sequence degeneration (by either mutations or deletions). Most of the elements are thus either inactive or non-autonomous. Considering that the bulk of some eukaryotic genomes derive from TEs, they have been conceived as “TE graveyards.” It has been shown that once an element has been inactivated, it progressively accumulates mutations and deletions at neutral rates until completely losing its identity or being lost from the host genome; however, it has also been shown that these “neutral sequences” might serve as raw material for domestication by host genomes. Results We have analyzed the sequence structural variations, nucleotide divergence, and pattern of insertions and deletions of several superfamilies of TEs belonging to both class I (long terminal repeats [LTRs] and non-LTRs [NLTRs]) and II in the genome of Anopheles gambiae, aiming at describing the landscape of deterioration of these elements in this particular genome. Our results describe a great diversity in patterns of deterioration, indicating lineage-specific differences including the presence of Solo-LTRs in the LTR lineage, 5′-deleted NLTRs, and several non-autonomous and MITEs in the class II families. Interestingly, we found fragments of NLTRs corresponding to the RT domain, which preserves high identity among them, suggesting a possible remaining genomic role for these domains. Conclusions We show here that the TEs in the An. gambiae genome deteriorate in different ways according to the class to which they belong. This diversity certainly has implications not only at the host

  7. Structure modeling of all identified G protein-coupled receptors in the human genome.

    Directory of Open Access Journals (Sweden)

    Yang Zhang

    2006-02-01

    Full Text Available G protein-coupled receptors (GPCRs), encoded by about 5% of human genes, comprise the largest family of integral membrane proteins and act as cell surface receptors responsible for the transduction of endogenous signal into a cellular response. Although tertiary structural information is crucial for function annotation and drug design, there are few experimentally determined GPCR structures. To address this issue, we employ the recently developed threading assembly refinement (TASSER) method to generate structure predictions for all 907 putative GPCRs in the human genome. Unlike traditional homology modeling approaches, TASSER modeling does not require solved homologous template structures; moreover, it often refines the structures closer to native. These features are essential for the comprehensive modeling of all human GPCRs when close homologous templates are absent. Based on a benchmarked confidence score, approximately 820 predicted models should have the correct folds. The majority of GPCR models share the characteristic seven-transmembrane helix topology, but 45 ORFs are predicted to have different structures. This is due to GPCR fragments that are predominantly from extracellular or intracellular domains as well as database annotation errors. Our preliminary validation includes the automated modeling of bovine rhodopsin, the only solved GPCR in the Protein Data Bank. With homologous templates excluded, the final model built by TASSER has a global Cα root-mean-squared deviation from native of 4.6 angstroms, with a root-mean-squared deviation in the transmembrane helix region of 2.1 angstroms. Models of several representative GPCRs are compared with mutagenesis and affinity labeling data, and consistent agreement is demonstrated. Structure clustering of the predicted models shows that GPCRs with similar structures tend to belong to a similar functional class even when their sequences are diverse. These results demonstrate the usefulness
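
    The root-mean-squared deviation quoted in the record above is a standard measure of how far a model lies from a reference structure. The Python sketch below shows the plain formula only, assuming the two sets of alpha-carbon coordinates are already paired atom by atom and already superimposed; the function name and the toy coordinates are hypothetical, and the optimal superposition step normally performed before reporting an RMSD is not included.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-squared deviation between two paired lists of (x, y, z) points.

    Assumption: the structures are already superimposed, so no rotation or
    translation is applied here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))


if __name__ == "__main__":
    # Hypothetical coordinates in angstroms, for illustration only.
    native = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    model = [(0.0, 0.0, 0.0), (1.0, 0.0, 2.0)]
    print(round(rmsd(native, model), 3))
```

    Restricting both lists to a subset of residues, such as a transmembrane helix region, gives a regional value computed in the same way.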

  8. Comparative Genome Analyses Reveal Distinct Structure in the Saltwater Crocodile MHC

    Science.gov (United States)

    Jaratlerdsiri, Weerachai; Deakin, Janine; Godinez, Ricardo M.; Shan, Xueyan; Peterson, Daniel G.; Marthey, Sylvain; Lyons, Eric; McCarthy, Fiona M.; Isberg, Sally R.; Higgins, Damien P.; Chong, Amanda Y.; John, John St; Glenn, Travis C.; Ray, David A.; Gongora, Jaime

    2014-01-01

    The major histocompatibility complex (MHC) is a dynamic genome region with an essential role in the adaptive immunity of vertebrates, especially antigen presentation. The MHC is generally divided into subregions (classes I, II and III) containing genes of similar function across species, but with different gene number and organisation. Crocodylia (crocodilians) are widely distributed and represent an evolutionary distinct group among higher vertebrates, but the genomic organisation of MHC within this lineage has been largely unexplored. Here, we studied the MHC region of the saltwater crocodile (Crocodylus porosus) and compared it with that of other taxa. We characterised genomic clusters encompassing MHC class I and class II genes in the saltwater crocodile based on sequencing of bacterial artificial chromosomes. Six gene clusters spanning ∼452 kb were identified to contain nine MHC class I genes, six MHC class II genes, three TAP genes, and a TRIM gene. These MHC class I and class II genes were in separate scaffold regions and were greater in length (2–6 times longer) than their counterparts in well-studied fowl B loci, suggesting that the compaction of avian MHC occurred after the crocodilian-avian split. Comparative analyses between the saltwater crocodile MHC and that from the alligator and gharial showed large syntenic areas (>80% identity) with similar gene order. Comparisons with other vertebrates showed that the saltwater crocodile had MHC class I genes located along with TAP, consistent with birds studied. Linkage between MHC class I and TRIM39 observed in the saltwater crocodile resembled MHC in eutherians compared, but absent in avian MHC, suggesting that the saltwater crocodile MHC appears to have gene organisation intermediate between these two lineages. These observations suggest that the structure of the saltwater crocodile MHC, and other crocodilians, can help determine the MHC that was present in the ancestors of archosaurs. PMID:25503521

  9. Genome characterization and population genetic structure of the zoonotic pathogen, Streptococcus canis

    Directory of Open Access Journals (Sweden)

    Richards Vincent P

    2012-12-01

    Full Text Available Abstract Background Streptococcus canis is an important opportunistic pathogen of dogs and cats that can also infect a wide range of additional mammals including cows where it can cause mastitis. It is also an emerging human pathogen. Results Here we provide characterization of the first genome sequence for this species, strain FSL S3-227 (milk isolate from a cow with an intra-mammary infection). A diverse array of putative virulence factors was encoded by the S. canis FSL S3-227 genome. Approximately 75% of these gene sequences were homologous to known Streptococcal virulence factors involved in invasion, evasion, and colonization. Present in the genome are multiple potentially mobile genetic elements (MGEs) [plasmid, phage, integrative conjugative element (ICE)] and comparison to other species provided convincing evidence for lateral gene transfer (LGT) between S. canis and two additional bovine mastitis causing pathogens (Streptococcus agalactiae, and Streptococcus dysgalactiae subsp. dysgalactiae), with this transfer possibly contributing to host adaptation. Population structure among isolates obtained from Europe and USA [bovine = 56, canine = 26, and feline = 1] was explored. Ribotyping of all isolates and multi locus sequence typing (MLST) of a subset of the isolates (n = 45) detected significant differentiation between bovine and canine isolates (Fisher exact test: P = 0.0000 [ribotypes], P = 0.0030 [sequence types]), suggesting possible host adaptation of some genotypes. Concurrently, the ancestral clonal complex (54% of isolates) occurred in many tissue types, all hosts, and all geographic locations suggesting the possibility of a wide and diverse niche. Conclusion This study provides evidence highlighting the importance of LGT in the evolution of the bacteria S. canis, specifically, its possible role in host adaptation and acquisition of virulence factors. Furthermore, recent LGT detected between S. canis and human

  10. Genome characterization and population genetic structure of the zoonotic pathogen, Streptococcus canis.

    Science.gov (United States)

    Richards, Vincent P; Zadoks, Ruth N; Pavinski Bitar, Paulina D; Lefébure, Tristan; Lang, Ping; Werner, Brenda; Tikofsky, Linda; Moroni, Paolo; Stanhope, Michael J

    2012-12-18

    Streptococcus canis is an important opportunistic pathogen of dogs and cats that can also infect a wide range of additional mammals including cows where it can cause mastitis. It is also an emerging human pathogen. Here we provide characterization of the first genome sequence for this species, strain FSL S3-227 (milk isolate from a cow with an intra-mammary infection). A diverse array of putative virulence factors was encoded by the S. canis FSL S3-227 genome. Approximately 75% of these gene sequences were homologous to known Streptococcal virulence factors involved in invasion, evasion, and colonization. Present in the genome are multiple potentially mobile genetic elements (MGEs) [plasmid, phage, integrative conjugative element (ICE)] and comparison to other species provided convincing evidence for lateral gene transfer (LGT) between S. canis and two additional bovine mastitis causing pathogens (Streptococcus agalactiae, and Streptococcus dysgalactiae subsp. dysgalactiae), with this transfer possibly contributing to host adaptation. Population structure among isolates obtained from Europe and USA [bovine = 56, canine = 26, and feline = 1] was explored. Ribotyping of all isolates and multi locus sequence typing (MLST) of a subset of the isolates (n = 45) detected significant differentiation between bovine and canine isolates (Fisher exact test: P = 0.0000 [ribotypes], P = 0.0030 [sequence types]), suggesting possible host adaptation of some genotypes. Concurrently, the ancestral clonal complex (54% of isolates) occurred in many tissue types, all hosts, and all geographic locations suggesting the possibility of a wide and diverse niche. This study provides evidence highlighting the importance of LGT in the evolution of the bacteria S. canis, specifically, its possible role in host adaptation and acquisition of virulence factors. Furthermore, recent LGT detected between S. canis and human bacteria (Streptococcus urinalis) is cause for concern

  11. Population genomic structure and adaptation in the zoonotic malaria parasite Plasmodium knowlesi

    KAUST Repository

    Assefa, Samuel

    2015-10-06

    Malaria cases caused by the zoonotic parasite Plasmodium knowlesi are being increasingly reported throughout Southeast Asia and in travelers returning from the region. To test for evidence of signatures of selection or unusual population structure in this parasite, we surveyed genome sequence diversity in 48 clinical isolates recently sampled from Malaysian Borneo and in five lines maintained in laboratory rhesus macaques after isolation in the 1960s from Peninsular Malaysia and the Philippines. Overall genomewide nucleotide diversity (π = 6.03 × 10) was much higher than has been seen in worldwide samples of either of the major endemic malaria parasite species Plasmodium falciparum and Plasmodium vivax. A remarkable substructure is revealed within P. knowlesi, consisting of two major sympatric clusters of the clinical isolates and a third cluster comprising the laboratory isolates. There was deep differentiation between the two clusters of clinical isolates [mean genomewide fixation index (FST) = 0.21, with 9,293 SNPs having fixed differences of FST = 1.0]. This differentiation showed marked heterogeneity across the genome, with mean FST values of different chromosomes ranging from 0.08 to 0.34 and with further significant variation across regions within several chromosomes. Analysis of the largest cluster (cluster 1, 38 isolates) indicated long-term population growth, with negatively skewed allele frequency distributions (genomewide average Tajima's D = -1.35). Against this background there was evidence of balancing selection on particular genes, including the circumsporozoite protein (csp) gene, which had the top Tajima's D value (1.57), and scans of haplotype homozygosity implicate several genomic regions as being under recent positive selection.
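
    The record above reports nucleotide diversity and Tajima's D. As a rough illustration of what those statistics measure, the Python sketch below computes both from a small set of aligned haploid sequences using the textbook Tajima (1989) formulas. The function name and the toy alignments are hypothetical, missing data and multi-allelic sites receive no special treatment, and this is not the pipeline used in the study.

```python
import math
from itertools import combinations


def diversity_and_tajimas_d(sequences):
    """Return (pi_per_site, segregating_sites, tajimas_d) for aligned sequences.

    tajimas_d is NaN when it is undefined (no segregating sites, or zero variance).
    """
    n = len(sequences)
    if n < 2 or len({len(s) for s in sequences}) != 1:
        raise ValueError("need at least two sequences of equal length")
    length = len(sequences[0])

    # Mean number of pairwise differences across all sequence pairs.
    pairs = list(combinations(sequences, 2))
    mean_diff = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs) / len(pairs)

    # Number of alignment columns carrying more than one allele.
    seg_sites = sum(1 for column in zip(*sequences) if len(set(column)) > 1)

    # Constants from Tajima (1989).
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)

    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    if seg_sites == 0 or variance <= 0:
        return mean_diff / length, seg_sites, float("nan")
    return mean_diff / length, seg_sites, (mean_diff - seg_sites / a1) / math.sqrt(variance)


if __name__ == "__main__":
    # Hypothetical toy alignments, for illustration only.
    rare_variants = ["AAAA", "AAAT", "AATA", "ATAA"]   # every variant is a singleton
    common_variants = ["AA", "AA", "TT", "TT"]         # variants at intermediate frequency
    print(diversity_and_tajimas_d(rare_variants))
    print(diversity_and_tajimas_d(common_variants))
```

    An excess of rare variants pushes D below zero, which is the pattern expected under population growth, while an excess of intermediate-frequency variants pushes it above zero, as expected under balancing selection.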

  12. Comparative genome analyses reveal distinct structure in the saltwater crocodile MHC.

    Directory of Open Access Journals (Sweden)

    Weerachai Jaratlerdsiri

    Full Text Available The major histocompatibility complex (MHC) is a dynamic genome region with an essential role in the adaptive immunity of vertebrates, especially antigen presentation. The MHC is generally divided into subregions (classes I, II and III) containing genes of similar function across species, but with different gene number and organisation. Crocodylia (crocodilians) are widely distributed and represent an evolutionary distinct group among higher vertebrates, but the genomic organisation of MHC within this lineage has been largely unexplored. Here, we studied the MHC region of the saltwater crocodile (Crocodylus porosus) and compared it with that of other taxa. We characterised genomic clusters encompassing MHC class I and class II genes in the saltwater crocodile based on sequencing of bacterial artificial chromosomes. Six gene clusters spanning ∼452 kb were identified to contain nine MHC class I genes, six MHC class II genes, three TAP genes, and a TRIM gene. These MHC class I and class II genes were in separate scaffold regions and were greater in length (2-6 times longer) than their counterparts in well-studied fowl B loci, suggesting that the compaction of avian MHC occurred after the crocodilian-avian split. Comparative analyses between the saltwater crocodile MHC and that from the alligator and gharial showed large syntenic areas (>80% identity) with similar gene order. Comparisons with other vertebrates showed that the saltwater crocodile had MHC class I genes located along with TAP, consistent with birds studied. Linkage between MHC class I and TRIM39 observed in the saltwater crocodile resembled MHC in eutherians compared, but absent in avian MHC, suggesting that the saltwater crocodile MHC appears to have gene organisation intermediate between these two lineages. These observations suggest that the structure of the saltwater crocodile MHC, and other crocodilians, can help determine the MHC that was present in the ancestors of archosaurs.

  13. Variation in the OC locus of Acinetobacter baumannii genomes predicts extensive structural diversity in the lipooligosaccharide.

    Directory of Open Access Journals (Sweden)

    Johanna J Kenyon

    Full Text Available Lipooligosaccharide (LOS) is a complex surface structure that is linked to many pathogenic properties of Acinetobacter baumannii. In A. baumannii, the genes responsible for the synthesis of the outer core (OC) component of the LOS are located between ilvE and aspS. The content of the OC locus is usually variable within a species, and examination of 6 complete and 227 draft A. baumannii genome sequences available in GenBank non-redundant and Whole Genome Shotgun databases revealed nine distinct new types, OCL4-OCL12, in addition to the three known ones. The twelve gene clusters fell into two distinct groups, designated Group A and Group B, based on similarities in the genes present. OCL6 (Group B) was unique in that it included genes for the synthesis of L-Rhamnosep. Genetic exchange of the different configurations between strains has occurred as some OC forms were found in several different sequence types (STs). OCL1 (Group A) was the most widely distributed being present in 18 STs, and OCL6 was found in 16 STs. Variation within clones was also observed, with more than one OC locus type found in the two globally disseminated clones, GC1 and GC2, that include the majority of multiply antibiotic resistant isolates. OCL1 was the most abundant gene cluster in both GC1 and GC2 genomes but GC1 isolates also carried OCL2, OCL3 or OCL5, and OCL3 was also present in GC2. As replacement of the OC locus in the major global clones indicates the presence of sub-lineages, a PCR typing scheme was developed to rapidly distinguish Group A and Group B types, and to distinguish the specific forms found in GC1 and GC2 isolates.

  14. Genome characterization and population genetic structure of the zoonotic pathogen, Streptococcus canis

    Science.gov (United States)

    2012-01-01

    Background Streptococcus canis is an important opportunistic pathogen of dogs and cats that can also infect a wide range of additional mammals including cows where it can cause mastitis. It is also an emerging human pathogen. Results Here we provide characterization of the first genome sequence for this species, strain FSL S3-227 (milk isolate from a cow with an intra-mammary infection). A diverse array of putative virulence factors was encoded by the S. canis FSL S3-227 genome. Approximately 75% of these gene sequences were homologous to known Streptococcal virulence factors involved in invasion, evasion, and colonization. Present in the genome are multiple potentially mobile genetic elements (MGEs) [plasmid, phage, integrative conjugative element (ICE)] and comparison to other species provided convincing evidence for lateral gene transfer (LGT) between S. canis and two additional bovine mastitis causing pathogens (Streptococcus agalactiae, and Streptococcus dysgalactiae subsp. dysgalactiae), with this transfer possibly contributing to host adaptation. Population structure among isolates obtained from Europe and USA [bovine = 56, canine = 26, and feline = 1] was explored. Ribotyping of all isolates and multi locus sequence typing (MLST) of a subset of the isolates (n = 45) detected significant differentiation between bovine and canine isolates (Fisher exact test: P = 0.0000 [ribotypes], P = 0.0030 [sequence types]), suggesting possible host adaptation of some genotypes. Concurrently, the ancestral clonal complex (54% of isolates) occurred in many tissue types, all hosts, and all geographic locations suggesting the possibility of a wide and diverse niche. Conclusion This study provides evidence highlighting the importance of LGT in the evolution of the bacteria S. canis, specifically, its possible role in host adaptation and acquisition of virulence factors. Furthermore, recent LGT detected between S. canis and human bacteria (Streptococcus

  15. A Therapeutic Connection between Dietary Phytochemicals and ATP Synthase.

    Science.gov (United States)

    Ahmad, Zulfiqar; Hassan, Sherif S; Azim, Sofiya

    2017-11-20

    For centuries, phytochemicals have been used to prevent and cure multiple health ailments. Phytochemicals have been reported to have antioxidant, antidiabetic, antitussive, antiparasitic, anticancer, and antimicrobial properties. Generally, the therapeutic use of phytochemicals is based on tradition or word of mouth with few evidence-based studies. Moreover, molecular level interactions or molecular targets for the majority of phytochemicals are unknown. In recent years, antibiotic resistance by microbes has become a major healthcare concern. As such, the use of phytochemicals with antimicrobial properties has become pertinent. Natural compounds from plants, vegetables, herbs, and spices with strong antimicrobial properties present an excellent opportunity for preventing and combating antibiotic resistant microbial infections. ATP synthase is the fundamental means of cellular energy. Inhibition of ATP synthase may deprive cells of required energy leading to cell death, and a variety of dietary phytochemicals are known to inhibit ATP synthase. Structural modifications of phytochemicals have been shown to increase the inhibitory potency and extent of inhibition. Site-directed mutagenic analysis has elucidated the binding site(s) for some phytochemicals on ATP synthase. Amino acid variations in and around the phytochemical binding sites can result in selective binding and inhibition of microbial ATP synthase. In this review, the therapeutic connection between dietary phytochemicals and ATP synthase is summarized based on the inhibition of ATP synthase by dietary phytochemicals. Research suggests selective targeting of ATP synthase is a valuable alternative molecular level approach to combat antibiotic resistant microbial infections. Copyright© Bentham Science Publishers.

  16. The population genomics of begomoviruses: global scale population structure and gene flow

    Directory of Open Access Journals (Sweden)

    Prasanna HC

    2010-09-01

    Full Text Available Abstract Background The rapidly growing availability of diverse full genome sequences from across the world is increasing the feasibility of studying the large-scale population processes that underlie observable patterns of virus diversity. In particular, characterizing the genetic structure of virus populations could potentially reveal much about how factors such as geographical distributions, host ranges and gene flow between populations combine to produce the discontinuous patterns of genetic diversity that we perceive as distinct virus species. Among the richest and most diverse full genome datasets that are available is that for the dicotyledonous plant infecting genus, Begomovirus, in the Family Geminiviridae. The begomoviruses all share the same whitefly vector, are highly recombinogenic and are distributed throughout tropical and subtropical regions where they seriously threaten the food security of the world's poorest people. Results We focus here on using a model-based population genetic approach to identify the genetically distinct sub-populations within the global begomovirus meta-population. We demonstrate the existence of at least seven major sub-populations that can further be sub-divided into as many as thirty-four significantly differentiated and genetically cohesive minor sub-populations. Using the population structure framework revealed in the present study, we further explored the extent of gene flow and recombination between genetic populations. Conclusions Although geographical barriers are apparently the most significant underlying cause of the seven major population sub-divisions, within the framework of these sub-divisions, we explore patterns of gene flow to reveal that both host range differences and genetic barriers to recombination have probably been major contributors to the minor population sub-divisions that we have identified. We believe that the global Begomovirus population structure revealed here could

  17. Suites of terpene synthases explain differential terpenoid production in ginger and turmeric tissues.

    Directory of Open Access Journals (Sweden)

    Hyun Jo Koo

    Full Text Available The essential oils of ginger (Zingiber officinale) and turmeric (Curcuma longa) contain a large variety of terpenoids, some of which possess anticancer, antiulcer, and antioxidant properties. Despite their importance, only four terpene synthases have been identified from the Zingiberaceae family: (+)-germacrene D synthase and (S)-β-bisabolene synthase from ginger rhizome, and α-humulene synthase and β-eudesmol synthase from shampoo ginger (Zingiber zerumbet) rhizome. We report the identification of 25 mono- and 18 sesquiterpene synthases from ginger and turmeric, with 13 and 11, respectively, being functionally characterized. Novel terpene synthases, (−)-caryolan-1-ol synthase and α-zingiberene/β-sesquiphellandrene synthase, which is responsible for formation of the major sesquiterpenoids in ginger and turmeric rhizomes, were also discovered. These suites of enzymes are responsible for formation of the majority of the terpenoids present in these two plants. Structures of several were modeled, and a comparison of sets of paralogs suggests how the terpene synthases in ginger and turmeric evolved. The most abundant and most important sesquiterpenoids in turmeric rhizomes, (+)-α-turmerone and (+)-β-turmerone, are produced from (−)-α-zingiberene and (−)-β-sesquiphellandrene, respectively, via α-zingiberene/β-sesquiphellandrene oxidase and a still unidentified dehydrogenase.

  18. Suites of Terpene Synthases Explain Differential Terpenoid Production in Ginger and Turmeric Tissues

    Science.gov (United States)

    Koo, Hyun Jo; Gang, David R.

    2012-01-01

    The essential oils of ginger (Zingiber officinale) and turmeric (Curcuma longa) contain a large variety of terpenoids, some of which possess anticancer, antiulcer, and antioxidant properties. Despite their importance, only four terpene synthases have been identified from the Zingiberaceae family: (+)-germacrene D synthase and (S)-β-bisabolene synthase from ginger rhizome, and α-humulene synthase and β-eudesmol synthase from shampoo ginger (Zingiber zerumbet) rhizome. We report the identification of 25 mono- and 18 sesquiterpene synthases from ginger and turmeric, with 13 and 11, respectively, being functionally characterized. Novel terpene synthases, (−)-caryolan-1-ol synthase and α-zingiberene/β-sesquiphellandrene synthase, which is responsible for formation of the major sesquiterpenoids in ginger and turmeric rhizomes, were also discovered. These suites of enzymes are responsible for formation of the majority of the terpenoids present in these two plants. Structures of several were modeled, and a comparison of sets of paralogs suggests how the terpene synthases in ginger and turmeric evolved. The most abundant and most important sesquiterpenoids in turmeric rhizomes, (+)-α-turmerone and (+)-β-turmerone, are produced from (−)-α-zingiberene and (−)-β-sesquiphellandrene, respectively, via α-zingiberene/β-sesquiphellandrene oxidase and a still unidentified dehydrogenase. PMID:23272109

  19. Population genomic analysis of ancient and modern genomes yields new insights into the genetic ancestry of the Tyrolean Iceman and the genetic structure of Europe.

    Directory of Open Access Journals (Sweden)

    Martin Sikora

    2014-05-01

    Full Text Available Genome sequencing of the 5,300-year-old mummy of the Tyrolean Iceman, found in 1991 on a glacier near the border of Italy and Austria, has yielded new insights into his origin and relationship to modern European populations. A key finding of that study was an apparent recent common ancestry with individuals from Sardinia, based largely on the Y chromosome haplogroup and common autosomal SNP variation. Here, we compiled and analyzed genomic datasets from both modern and ancient Europeans, including genome sequence data from over 400 Sardinians and two ancient Thracians from Bulgaria, to investigate this result in greater detail and determine its implications for the genetic structure of Neolithic Europe. Using whole-genome sequencing data, we confirm that the Iceman is, indeed, most closely related to Sardinians. Furthermore, we show that this relationship extends to other individuals from cultural contexts associated with the spread of agriculture during the Neolithic transition, in contrast to individuals from a hunter-gatherer context. We hypothesize that this genetic affinity of ancient samples from different parts of Europe with Sardinians represents a common genetic component that was geographically widespread across Europe during the Neolithic, likely related to migrations and population expansions associated with the spread of agriculture.

  20. Phylogenomic and functional domain analysis of polyketide synthases in Fusarium

    Energy Technology Data Exchange (ETDEWEB)

    Brown, Daren W.; Butchko, Robert A.; Baker, Scott E.; Proctor, Robert H.

    2012-02-01

    Fusarium species are ubiquitous in nature, cause a range of plant diseases, and produce a variety of chemicals often referred to as secondary metabolites. Although some fungal secondary metabolites affect plant growth or protect plants from other fungi and bacteria, their presence in grain based food and feed is more often associated with a variety of diseases in plants and in animals. Many of these structurally diverse metabolites are derived from a family of related enzymes called polyketide synthases (PKSs). A search of genomic sequence of Fusarium verticillioides, F. graminearum, F. oxysporum and Nectria haematococca (anamorph F. solani) identified a total of 58 PKS genes. To gain insight into how this gene family evolved and to guide future studies, we conducted a phylogenomic and functional domain analysis. The resulting genealogy suggested that Fusarium PKSs represent 34 different groups responsible for synthesis of different core metabolites. The analyses indicate that variation in the Fusarium PKS gene family is due to gene duplication and loss events as well as enzyme gain-of-function due to the acquisition of new domains or loss-of-function due to nucleotide mutations. Transcriptional analyses indicate that the 16 F. verticillioides PKS genes are expressed under a range of conditions, further evidence that they are functional genes that confer the ability to produce secondary metabolites.

  1. Crystal Structures of DNA-Whirly Complexes and Their Role in Arabidopsis Organelle Genome Repair

    Energy Technology Data Exchange (ETDEWEB)

    Cappadocia, Laurent; Maréchal, Alexandre; Parent, Jean-Sébastien; Lepage, Étienne; Sygusch, Jurgen; Brisson, Normand (Montreal)

    2010-09-07

    DNA double-strand breaks are highly detrimental to all organisms and need to be quickly and accurately repaired. Although several proteins are known to maintain plastid and mitochondrial genome stability in plants, little is known about the mechanisms of DNA repair in these organelles and the roles of specific proteins. Here, using ciprofloxacin as a DNA damaging agent specific to the organelles, we show that plastids and mitochondria can repair DNA double-strand breaks through an error-prone pathway similar to the microhomology-mediated break-induced replication observed in humans, yeast, and bacteria. This pathway is negatively regulated by the single-stranded DNA (ssDNA) binding proteins from the Whirly family, thus indicating that these proteins could contribute to the accurate repair of plant organelle genomes. To understand the role of Whirly proteins in this process, we solved the crystal structures of several Whirly-DNA complexes. These reveal a nonsequence-specific ssDNA binding mechanism in which DNA is stabilized between domains of adjacent subunits and rendered unavailable for duplex formation and/or protein interactions. Our results suggest a model in which the binding of Whirly proteins to ssDNA would favor accurate repair of DNA double-strand breaks over an error-prone microhomology-mediated break-induced replication repair pathway.

  2. Structural variation discovery in the cancer genome using next generation sequencing: Computational solutions and perspectives

    Science.gov (United States)

    Liu, Biao; Conroy, Jeffrey M.; Morrison, Carl D.; Odunsi, Adekunle O.; Qin, Maochun; Wei, Lei; Trump, Donald L.; Johnson, Candace S.; Liu, Song; Wang, Jianmin

    2015-01-01

    Somatic Structural Variations (SVs) are a complex collection of chromosomal mutations that could directly contribute to carcinogenesis. Next Generation Sequencing (NGS) technology has emerged as the primary means of interrogating the SVs of the cancer genome in recent investigations. Sophisticated computational methods are required to accurately identify the SV events and delineate their breakpoints from the massive amounts of reads generated by a NGS experiment. In this review, we provide an overview of current analytic tools used for SV detection in NGS-based cancer studies. We summarize the features of common SV groups and the primary types of NGS signatures that can be used in SV detection methods. We discuss the principles and key similarities and differences of existing computational programs and comment on unresolved issues related to this research field. The aim of this article is to provide a practical guide of relevant concepts, computational methods, software tools and important factors for analyzing and interpreting NGS data for the detection of SVs in the cancer genome. PMID:25849937

  3. Bloom Filter Trie: an alignment-free and reference-free data structure for pan-genome storage.

    Science.gov (United States)

    Holley, Guillaume; Wittler, Roland; Stoye, Jens

    2016-01-01

    High throughput sequencing technologies have become fast and cheap in the past years. As a result, large-scale projects started to sequence tens to several thousands of genomes per species, producing a high number of sequences sampled from each genome. Such a highly redundant collection of very similar sequences is called a pan-genome. It can be transformed into a set of sequences "colored" by the genomes to which they belong. A colored de Bruijn graph (C-DBG) extracts from the sequences all colored k-mers, strings of length k, and stores them in vertices. In this paper, we present an alignment-free, reference-free and incremental data structure for storing a pan-genome as a C-DBG: the bloom filter trie (BFT). The data structure allows storing and compressing a set of colored k-mers, and also traversing the graph efficiently. Bloom filter trie was used to index and query different pangenome datasets. Compared to another state-of-the-art data structure, BFT was up to two times faster to build while using about the same amount of main memory. For querying k-mers, BFT was about 52-66 times faster while using about 5.5-14.3 times less memory. We present a novel succinct data structure called the Bloom Filter Trie for indexing a pan-genome as a colored de Bruijn graph. The trie stores k-mers and their colors based on a new representation of vertices that compress and index shared substrings. Vertices use basic data structures for lightweight substrings storage as well as Bloom filters for efficient trie and graph traversals. Experimental results prove better performance compared to another state-of-the-art data structure. https://www.github.com/GuillaumeHolley/BloomFilterTrie.
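
    The notion of colored k-mers described in this abstract can be illustrated with a minimal Python sketch. It is not the Bloom Filter Trie itself: it uses a plain dictionary in place of the compressed trie, and the function names and toy genomes are assumptions made for illustration only.

```python
from collections import defaultdict


def colored_kmers(genomes, k):
    """Map each k-mer to the set of genome names ("colors") that contain it.

    `genomes` is a dict of name -> sequence (hypothetical toy input).
    """
    colors = defaultdict(set)
    for name, seq in genomes.items():
        for i in range(len(seq) - k + 1):
            colors[seq[i:i + k]].add(name)
    return dict(colors)


def successors(kmer, colors):
    """Return indexed k-mers that overlap `kmer` by k-1 characters.

    These are the outgoing edges of `kmer` in the de Bruijn graph.
    """
    suffix = kmer[1:]
    return [suffix + base for base in "ACGT" if suffix + base in colors]


# Toy pan-genome with two short "genomes" (illustrative values only).
genomes = {"g1": "ACGTA", "g2": "CGTAG"}
index = colored_kmers(genomes, 3)
```

    A dictionary keeps the example readable but stores every k-mer in full; the abstract's point is that a trie with Bloom filters can compress shared substrings while still supporting this kind of lookup and traversal.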

  4. Strain Prioritization and Genome Mining for Enediyne Natural Products

    Science.gov (United States)

    Yan, Xiaohui; Ge, Huiming; Huang, Tingting; Hindra; Yang, Dong; Teng, Qihui; Crnovčić, Ivana; Li, Xiuling; Rudolf, Jeffrey D.; Lohman, Jeremy R.; Gansemans, Yannick; Zhu, Xiangcheng; Huang, Yong; Zhao, Li-Xing; Jiang, Yi; Van Nieuwerburgh, Filip; Rader, Christoph

    2016-01-01

    ABSTRACT The enediyne family of natural products has had a profound impact on modern chemistry, biology, and medicine, and yet only 11 enediynes have been structurally characterized to date. Here we report a genome survey of 3,400 actinomycetes, identifying 81 strains that harbor genes encoding the enediyne polyketide synthase cassettes that could be grouped into 28 distinct clades based on phylogenetic analysis. Genome sequencing of 31 representative strains confirmed that each clade harbors a distinct enediyne biosynthetic gene cluster. A genome neighborhood network allows prediction of new structural features and biosynthetic insights that could be exploited for enediyne discovery. We confirmed one clade as new C-1027 producers, with a significantly higher C-1027 titer than the original producer, and discovered a new family of enediyne natural products, the tiancimycins (TNMs), that exhibit potent cytotoxicity against a broad spectrum of cancer cell lines. Our results demonstrate the feasibility of rapid discovery of new enediynes from a large strain collection. PMID:27999165

  5. Exploiting the Biosynthetic Potential of Type III Polyketide Synthases

    Directory of Open Access Journals (Sweden)

    Yan Ping Lim

    2016-06-01

    Full Text Available Polyketides are structurally and functionally diverse secondary metabolites that are biosynthesized by polyketide synthases (PKSs) using acyl-CoA precursors. Recent studies in the engineering and structural characterization of PKSs have facilitated the use of target enzymes as biocatalysts to produce novel functionally optimized polyketides. These compounds may serve as potential drug leads. This review summarizes the insights gained from research on type III PKSs, from the discovery of chalcone synthase in plants to novel PKSs in bacteria and fungi. To date, at least 15 families of type III PKSs have been characterized, highlighting the utility of PKSs in the development of natural product libraries for therapeutic development.

  6. Distinct Mechanisms of Nuclease-Directed DNA-Structure-Induced Genetic Instability in Cancer Genomes.

    Science.gov (United States)

    Zhao, Junhua; Wang, Guliang; Del Mundo, Imee M; McKinney, Jennifer A; Lu, Xiuli; Bacolla, Albino; Boulware, Stephen B; Zhang, Changsheng; Zhang, Haihua; Ren, Pengyu; Freudenreich, Catherine H; Vasquez, Karen M

    2018-01-30

    Sequences with the capacity to adopt alternative DNA structures have been implicated in cancer etiology; however, the mechanisms are unclear. For example, H-DNA-forming sequences within oncogenes have been shown to stimulate genetic instability in mammals. Here, we report that H-DNA-forming sequences are enriched at translocation breakpoints in human cancer genomes, further implicating them in cancer etiology. H-DNA-induced mutations were suppressed in human cells deficient in the nucleotide excision repair nucleases, ERCC1-XPF and XPG, but were stimulated in cells deficient in FEN1, a replication-related endonuclease. Further, we found that these nucleases cleaved H-DNA conformations, and the interactions of modeled H-DNA with ERCC1-XPF, XPG, and FEN1 proteins were explored at the sub-molecular level. The results suggest mechanisms of genetic instability triggered by H-DNA through distinct structure-specific, cleavage-based replication-independent and replication-dependent pathways, providing critical evidence for a role of the DNA structure itself in the etiology of cancer and other human diseases. Copyright © 2018 The Authors. Published by Elsevier Inc. All rights reserved.

  7. PRISM 3: expanded prediction of natural product chemical structures from microbial genomes.

    Science.gov (United States)

    Skinnider, Michael A; Merwin, Nishanth J; Johnston, Chad W; Magarvey, Nathan A

    2017-07-03

    Microbial natural products represent a rich resource of pharmaceutically and industrially important compounds. Genome sequencing has revealed that the majority of natural products remain undiscovered, and computational methods to connect biosynthetic gene clusters to their corresponding natural products therefore have the potential to revitalize natural product discovery. Previously, we described PRediction Informatics for Secondary Metabolomes (PRISM), a combinatorial approach to chemical structure prediction for genetically encoded nonribosomal peptides and type I and II polyketides. Here, we present a ground-up rewrite of the PRISM structure prediction algorithm to derive prediction of natural products arising from non-modular biosynthetic paradigms. Within this new version, PRISM 3, natural product scaffolds are modeled as chemical graphs, permitting structure prediction for aminocoumarins, antimetabolites, bisindoles and phosphonate natural products, and building upon the addition of ribosomally synthesized and post-translationally modified peptides. Further, with the addition of cluster detection for 11 new cluster types, PRISM 3 expands to detect 22 distinct natural product cluster types. Other major modifications to PRISM include improved sequence input and ORF detection, user-friendliness and output. Distribution of PRISM 3 over a 300-core server grid improves the speed and capacity of the web application. PRISM 3 is available at http://magarveylab.ca/prism/. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  8. Detection of structural mosaicism from targeted and whole-genome sequencing data.

    Science.gov (United States)

    King, Daniel A; Sifrim, Alejandro; Fitzgerald, Tomas W; Rahbari, Raheleh; Hobson, Emma; Homfray, Tessa; Mansour, Sahar; Mehta, Sarju G; Shehla, Mohammed; Tomkins, Susan E; Vasudevan, Pradeep C; Hurles, Matthew E

    2017-10-01

    Structural mosaic abnormalities are large post-zygotic mutations present in a subset of cells and have been implicated in developmental disorders and cancer. Such mutations have been conventionally assessed in clinical diagnostics using cytogenetic or microarray testing. Modern disease studies rely heavily on exome sequencing, yet an adequate method for the detection of structural mosaicism using targeted sequencing data is lacking. Here, we present a method, called MrMosaic, to detect structural mosaic abnormalities using deviations in allele fraction and read coverage from next-generation sequencing data. Whole-exome sequencing (WES) and whole-genome sequencing (WGS) simulations were used to calculate detection performance across a range of mosaic event sizes, types, clonalities, and sequencing depths. The tool was applied to 4911 patients with undiagnosed developmental disorders, and 11 events among nine patients were detected. For eight of these 11 events, mosaicism was observed in saliva but not blood, suggesting that assaying blood alone would miss a large fraction, possibly >50%, of mosaic diagnostic chromosomal rearrangements. © 2017 King et al.; Published by Cold Spring Harbor Laboratory Press.
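
    The two signals named in this abstract, deviation in allele fraction and deviation in read coverage, can be sketched in a few lines of Python. This is not the MrMosaic implementation: the function name, the input layout, and the use of 0.5 as the expected allele fraction at heterozygous sites are assumptions made for illustration.

```python
import math


def mosaic_signals(sites, expected_depth):
    """Compute two per-site signals at heterozygous SNPs.

    `sites` is a list of (ref_reads, alt_reads) tuples (hypothetical layout).
    Returns a list of (allele_fraction_deviation, log2_coverage_ratio).
    """
    signals = []
    for ref_reads, alt_reads in sites:
        depth = ref_reads + alt_reads
        if depth == 0:
            continue  # no reads, so neither signal is defined
        allele_fraction = alt_reads / depth
        # A germline heterozygous site is expected near 0.5 (assumption).
        deviation = abs(allele_fraction - 0.5)
        log2_ratio = math.log2(depth / expected_depth)
        signals.append((deviation, log2_ratio))
    return signals


# Toy read counts (illustrative values only).
result = mosaic_signals([(50, 50), (30, 10), (0, 0)], expected_depth=100)
```

    A run of adjacent sites with a consistent allele-fraction deviation, with or without a coverage shift, is the kind of pattern such a method would look for; calling and statistical testing are deliberately left out of this sketch.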

  9. Genome Structural Diversity among 31 Bordetella pertussis Isolates from Two Recent U.S. Whooping Cough Statewide Epidemics.

    Science.gov (United States)

    Bowden, Katherine E; Weigand, Michael R; Peng, Yanhui; Cassiday, Pamela K; Sammons, Scott; Knipe, Kristen; Rowe, Lori A; Loparev, Vladimir; Sheth, Mili; Weening, Keeley; Tondella, M Lucia; Williams, Margaret M

    2016-01-01

    During 2010 and 2012, California and Vermont, respectively, experienced statewide epidemics of pertussis with differences seen in the demographic affected, case clinical presentation, and molecular epidemiology of the circulating strains. To overcome limitations of the current molecular typing methods for pertussis, we utilized whole-genome sequencing to gain a broader understanding of how current circulating strains are causing large epidemics. Through the use of combined next-generation sequencing technologies, this study compared de novo, single-contig genome assemblies from 31 out of 33 Bordetella pertussis isolates collected during two separate pertussis statewide epidemics and 2 resequenced vaccine strains. Final genome architecture assemblies were verified with whole-genome optical mapping. Sixteen distinct genome rearrangement profiles were observed in epidemic isolate genomes, all of which were distinct from the genome structures of the two resequenced vaccine strains. These rearrangements appear to be mediated by repetitive sequence elements, such as high-copy-number mobile genetic elements and rRNA operons. Additionally, novel and previously identified single nucleotide polymorphisms were detected in 10 virulence-related genes in the epidemic isolates. Whole-genome variation analysis identified state-specific variants, and coding regions bearing nonsynonymous mutations were classified into functional annotated orthologous groups. Comprehensive studies on whole genomes are needed to understand the resurgence of pertussis and develop novel tools to better characterize the molecular epidemiology of evolving B. pertussis populations. IMPORTANCE Pertussis, or whooping cough, is the most poorly controlled vaccine-preventable bacterial disease in the United States, which has experienced a resurgence for more than a decade. Once viewed as a monomorphic pathogen, B. pertussis strains circulating during epidemics exhibit diversity visible on a genome structural

  10. The cellulose synthase superfamily in fully sequenced plants and algae

    Directory of Open Access Journals (Sweden)

    Xu Ying

    2009-07-01

    Full Text Available Abstract Background The cellulose synthase superfamily has been classified into nine cellulose synthase-like (Csl) families and one cellulose synthase (CesA) family. The Csl families have been proposed to be involved in the synthesis of the backbones of hemicelluloses of plant cell walls. With 17 plant and algal genomes fully sequenced, we sought to conduct a genome-wide and systematic investigation of this superfamily through in-depth phylogenetic analyses. Results A single-copy gene is found in the six chlorophyte green algae, which is most closely related to the CslA and CslC families that are present in the seven land plants investigated in our analyses. Six proteins from poplar, grape and sorghum form a distinct family (CslJ), providing further support for the conclusions from two recent studies. CslB/E/G/H/J families have evolved significantly more rapidly than their widely distributed relatives, and tend to have intragenomic duplications, in particular in the grape genome. Conclusion Our data suggest that the CslA and CslC families originated through an ancient gene duplication event in land plants. We speculate that the single-copy Csl gene in green algae may encode a mannan synthase. We confirm that the rest of the Csl families have a different evolutionary origin than CslA and CslC, and have proposed a model for the divergence order among them. Our study provides new insights about the evolution of this important gene family in plants.

  11. Heterologous expression of an active chitin synthase from Rhizopus oryzae.

    Science.gov (United States)

    Salgado-Lugo, Holjes; Sánchez-Arreguín, Alejandro; Ruiz-Herrera, José

    2016-12-01

    Chitin synthases are highly important enzymes in nature, where they synthesize structural components in species belonging to different eukaryotic kingdoms, including kingdom Fungi. Unfortunately, their structure and the molecular mechanism of synthesis of their microfibrillar product remain largely unknown, probably because no fungal active chitin synthases have been isolated, possibly due to their extreme hydrophobicity. In this study we have turned to the heterologous expression of the transcript from a small chitin synthase of Rhizopus oryzae (RO3G_00942, Chs1) in Escherichia coli. The enzyme was active, but accumulated mostly in inclusion bodies. High concentrations of arginine or urea solubilized the enzyme, but their dilution led to its denaturation and precipitation. Nevertheless, use of urea permitted the purification of small amounts of the enzyme. The properties of Chs1 (Km, optimum temperature and pH, effect of GlcNAc) were abnormal, probably because it lacks the hydrophobic transmembrane regions characteristic of chitin synthases. The product of the enzyme, in contrast to chitin made by membrane-bound Chs's and chitosomes, was only partially in the form of short microfibrils of low crystallinity. This approach may lead to future developments to obtain active chitin synthases that permit understanding their molecular mechanism of activity, and microfibril assembly. Copyright © 2016. Published by Elsevier Inc.

  12. Selectivity of the surface binding site (SBS) on barley starch synthase I

    DEFF Research Database (Denmark)

    Wilkens, Casper; Cuesta-Seijo, Jose A.; Palcic, Monica

    2014-01-01

    Starch synthase I (SSI) from various sources has been shown to preferentially elongate branch chains of degree of polymerisation (DP) from 6–7 to produce chains of DP 8–12. In the recently determined crystal structure of barley starch synthase I (HvSSI) a so-called surface binding site (SBS) was ...

  13. Novel applications of plant polyketide synthases.

    Science.gov (United States)

    Abe, Ikuro

    2012-04-01

    The structurally and mechanistically simple type III polyketide synthases (PKSs) catalyze iterative condensations of CoA thioesters to produce a variety of polyketide scaffolds with remarkably diverse structures and biological activities. By exploiting the enzymes, we combined precursor-directed biosynthesis with nitrogen-containing substrates and structure-based enzyme engineering and generated unnatural, novel polyketide-alkaloid scaffolds with promising biological activities. The nucleophilic nitrogen atom and the engineered enzymes thus facilitated the formation of additional C-C and C-N bonds during the enzymatic transformations. The methodology will contribute to the further production of chemically and structurally divergent, unnatural natural products, as well as the rational design of novel biocatalysts with unprecedented catalytic functions. Copyright © 2011 Elsevier Ltd. All rights reserved.

  14. Structures of the N-acetyltransferase domain of Xylella fastidiosa N-acetyl-L-glutamate synthase/kinase with and without a His tag bound to N-acetyl-L-glutamate.

    Science.gov (United States)

    Zhao, Gengxiang; Jin, Zhongmin; Allewell, Norma M; Tuchman, Mendel; Shi, Dashuang

    2015-01-01

    Structures of the catalytic N-acetyltransferase (NAT) domain of the bifunctional N-acetyl-L-glutamate synthase/kinase (NAGS/K) from Xylella fastidiosa bound to N-acetyl-L-glutamate (NAG) with and without an N-terminal His tag have been solved and refined at 1.7 and 1.4 Å resolution, respectively. The NAT domain with an N-terminal His tag crystallized in space group P4(1)2(1)2, with unit-cell parameters a=b=51.72, c=242.31 Å. Two subunits form a molecular dimer in the asymmetric unit, which contains ∼41% solvent. The NAT domain without an N-terminal His tag crystallized in space group P2(1), with unit-cell parameters a=63.48, b=122.34, c=75.88 Å, β=107.6°. Eight subunits, which form four molecular dimers, were identified in the asymmetric unit, which contains ∼38% solvent. The structures with and without the N-terminal His tag provide an opportunity to evaluate how the His tag affects structure and function. Furthermore, multiple subunits in different packing environments allow an assessment of the plasticity of the NAG binding site, which might be relevant to substrate binding and product release. The dimeric structure of the X. fastidiosa N-acetyltransferase (xfNAT) domain is very similar to that of human N-acetyltransferase (hNAT), reinforcing the notion that mammalian NAGS is evolutionarily derived from bifunctional bacterial NAGS/K.

  15. Comparisons of Copy Number, Genomic Structure, and Conserved Motifs for α-Amylase Genes from Barley, Rice, and Wheat

    Directory of Open Access Journals (Sweden)

    Qisen Zhang

    2017-10-01

    Full Text Available Barley is an important crop for the production of malt and beer. However, crops such as rice and wheat are rarely used for malting. α-amylase is the key enzyme that degrades starch during malting. In this study, we compared the genomic properties, gene copies, and conserved promoter motifs of α-amylase genes in barley, rice, and wheat. In all three crops, α-amylase consists of four subfamilies designated amy1, amy2, amy3, and amy4. In wheat and barley, members of amy1 and amy2 genes are localized on chromosomes 6 and 7, respectively. In rice, members of amy1 genes are found on chromosomes 1 and 2, and amy2 genes on chromosome 6. The barley genome has six amy1 members and three amy2 members. The wheat B genome contains four amy1 members and three amy2 members, while the rice genome has three amy1 members and one amy2 member. The B genome has mostly amy1 and amy2 members among the three wheat genomes. Amy1 promoters from all three crop genomes contain a GA-responsive complex consisting of a GA-responsive element (CAATAAA), pyrimidine box (CCTTTT) and TATCCAT/C box. This study has shown that amy1 and amy2 from both wheat and barley have similar genomic properties, including exon/intron structures and GA-responsive elements on promoters, but these differ in rice. Like barley, wheat should have sufficient amy activity to degrade starch completely during malting. Other factors, such as high protein with haze issues and the lack of husk causing lautering difficulty, may limit the use of wheat for brewing.

  16. Comparisons of Copy Number, Genomic Structure, and Conserved Motifs for α-Amylase Genes from Barley, Rice, and Wheat.

    Science.gov (United States)

    Zhang, Qisen; Li, Chengdao

    2017-01-01

    Barley is an important crop for the production of malt and beer. However, crops such as rice and wheat are rarely used for malting. α-amylase is the key enzyme that degrades starch during malting. In this study, we compared the genomic properties, gene copies, and conserved promoter motifs of α-amylase genes in barley, rice, and wheat. In all three crops, α-amylase consists of four subfamilies designated amy1, amy2, amy3, and amy4. In wheat and barley, members of amy1 and amy2 genes are localized on chromosomes 6 and 7, respectively. In rice, members of amy1 genes are found on chromosomes 1 and 2, and amy2 genes on chromosome 6. The barley genome has six amy1 members and three amy2 members. The wheat B genome contains four amy1 members and three amy2 members, while the rice genome has three amy1 members and one amy2 member. The B genome has mostly amy1 and amy2 members among the three wheat genomes. Amy1 promoters from all three crop genomes contain a GA-responsive complex consisting of a GA-responsive element (CAATAAA), pyrimidine box (CCTTTT) and TATCCAT/C box. This study has shown that amy1 and amy2 from both wheat and barley have similar genomic properties, including exon/intron structures and GA-responsive elements on promoters, but these differ in rice. Like barley, wheat should have sufficient amy activity to degrade starch completely during malting. Other factors, such as high protein with haze issues and the lack of husk causing lautering difficulty, may limit the use of wheat for brewing.

  17. Genome-Wide Mapping of Structural Variations Reveals a Copy Number Variant That Determines Reproductive Morphology in Cucumber

    NARCIS (Netherlands)

    Zhang, Z.; Mao, L.; Chen, Junshi; Bu, F.; Li, G.; Sun, J.; Li, S.; Sun, H.; Jiao, C.; Blakely, R.; Pan, J.; Cai, R.; Luo, R.; Peer, Van de Y.; Jacobsen, E.; Fei, Z.; Huang, S.

    2015-01-01

    Structural variations (SVs) represent a major source of genetic diversity. However, the functional impact and formation mechanisms of SVs in plant genomes remain largely unexplored. Here, we report a nucleotide-resolution SV map of cucumber (Cucumis sativus) that comprises 26,788 SVs based on deep

  18. Insight into structure and assembly of the nuclear pore complex by utilizing the genome of a eukaryotic thermophile

    DEFF Research Database (Denmark)

    Amlacher, Stefan; Sarges, Phillip; Flemming, Dirk

    2011-01-01

    Despite decades of research, the structure and assembly of the nuclear pore complex (NPC), which is composed of ~30 nucleoporins (Nups), remain elusive. Here, we report the genome of the thermophilic fungus Chaetomium thermophilum (ct) and identify the complete repertoire of Nups therein. The the...... of a thermophilic eukaryote for studying complex molecular machines....

  19. Selection Effects on the Positioning of Genes and Gene Structures from the Interplay of Replication and Transcription in Bacterial Genomes

    Directory of Open Access Journals (Sweden)

    Kazuharu Arakawa

    2007-01-01

    Full Text Available Bacterial chromosomes are partly shaped by the functional requirements for efficient replication, which lead to strand bias as commonly characterized by the excess of guanines over cytosines in the leading strand. Gene structures are also highly organized within bacterial genomes as a result of such functional constraints, displaying characteristic positioning and structuring along the genome. Here we analyze the gene structures in completely sequenced bacterial chromosomes to observe the positional constraints on gene orientation, length, and codon usage with regard to the positions of replication origin and terminus. Selection on these gene features is different in regions surrounding the terminus of replication from the rest of the genome, but the selection could be either positive or negative depending on the species, and these positional effects are partly attributed to the A-T enrichment near the terminus. Characteristic gene structuring relative to the position of replication origin and terminus is commonly observed among most bacterial species with circular chromosomes, and therefore we argue that the highly organized gene positioning as well as the strand bias should be considered for genomics studies of bacteria.
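    The strand bias mentioned in this abstract (an excess of guanines over cytosines in the leading strand) is commonly summarized as GC skew, (G - C) / (G + C), computed in windows along the chromosome. The abstract does not give the authors' procedure, so the following is only a minimal sketch; the function names and the non-overlapping window scheme are assumptions made for illustration.

```python
def gc_skew(seq: str, window: int) -> list[float]:
    """Return (G - C) / (G + C) for consecutive non-overlapping windows.

    Windows with no G or C are reported as 0.0. A trailing partial
    window is ignored.
    """
    seq = seq.upper()
    skews = []
    for start in range(0, len(seq) - window + 1, window):
        chunk = seq[start:start + window]
        g = chunk.count("G")
        c = chunk.count("C")
        skews.append((g - c) / (g + c) if (g + c) else 0.0)
    return skews


def cumulative(values: list[float]) -> list[float]:
    """Running sum of per-window skew values."""
    total = 0.0
    out = []
    for v in values:
        total += v
        out.append(total)
    return out
```

    Plotting the cumulative skew against window position is one common way to look for the sign change that separates leading-strand from lagging-strand regions; whether the extremes coincide with the replication origin and terminus has to be checked per genome.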

  20. Genome-wide analysis reveals population structure and selection in Chinese indigenous sheep breeds.

    Science.gov (United States)

    Wei, Caihong; Wang, Huihua; Liu, Gang; Wu, Mingming; Cao, Jiaxve; Liu, Zhen; Liu, Ruizao; Zhao, Fuping; Zhang, Li; Lu, Jian; Liu, Chousheng; Du, Lixin

    2015-03-17

    Traditionally, Chinese indigenous sheep were classified geographically and morphologically into three groups: Mongolian, Kazakh and Tibetan. Herein, we aimed to evaluate the population structure and genome selection among 140 individuals from ten representative Chinese indigenous sheep breeds: Ujimqin, Hu, Tong, Large-Tailed Han and Lop breed (Mongolian group); Duolang and Kazakh (Kazakh group); and Diqing, Plateau-type Tibetan, and Valley-type Tibetan breed (Tibetan group). We analyzed the population using principal component analysis (PCA), STRUCTURE and a Neighbor-Joining (NJ)-tree. In the PCA plot, the Tibetan and Mongolian groups were clustered as expected; however, Duolang and Kazakh (Kazakh group) were segregated. STRUCTURE analyses suggested two subpopulations: one from North China (Kazakh and Mongolian groups) and the other from the Southwest (Tibetan group). In the NJ-tree, the Tibetan group formed an independent branch and the Kazakh and Mongolian groups were mixed. We then used the di statistic approach to reveal selection in Chinese indigenous sheep breeds. Among the 599 genome sequence windows analyzed, sixteen (2.7%) exhibited signatures of selection in four or more breeds. We detected three strong selection windows involving three functional genes: RXFP2, PPP1CC and PDGFD. PDGFD, one of the four subfamilies of PDGF, which promotes proliferation and inhibits differentiation of preadipocytes, was significantly selected in fat type breeds by the Rsb (across pairs of populations) approach. Two consecutive selection regions in Duolang sheep were obviously different from those in other breeds. One region was in OAR2, including three genes (NPR2, SPAG8 and HINT2) that influence growth traits. The other region was in OAR6, including four genes (PKD2, SPP1, MEPE, and IBSP) associated with a milk production quantitative trait locus. We also identified known candidate genes such as BMPR1B, MSRB3, and three genes (KIT, MC1R, and FRY) that influence lambing percentage, ear size

  1. The first insight into the Salvia (Lamiaceae) genome via BAC library construction and high-throughput sequencing of target BAC clones

    International Nuclear Information System (INIS)

    Hao, D.C.; Vautrin, S.; Berges, H.; Chen, S.L.

    2015-01-01

    Salvia is a representative genus of Lamiaceae, a eudicot family with significant species diversity and population adaptability. One of the key goals of Salvia genomics research is to identify genes of adaptive significance. This information may help to improve the conservation of adaptive genetic variation and the management of medicinal plants to increase their health and productivity. Large-insert genomic libraries are a fundamental tool for achieving this purpose. We report herein the construction, characterization and screening of a gridded BAC library for Salvia officinalis (sage). The S. officinalis BAC library consists of 17,764 clones and the average insert size is 107 Kb, corresponding to 3 haploid genome equivalents. Seventeen positive clones (average insert size 115 Kb) containing five terpene synthase (TPS) genes were screened out by PCR and 12 of them were subjected to Illumina HiSeq 2000 sequencing, which yielded 28,097,480 90-bp raw reads (2.53 Gb). Scaffolds containing sabinene synthase (Sab), a Sab homolog, TPS3 (kaurene synthase-like 2), copalyl diphosphate synthase 2 and one cytochrome P450 gene were retrieved via de novo assembly and annotation; these scaffolds also contain flanking noncoding sequences, including predicted promoters and repeat sequences. Among 2,638 repeat sequences, there are 330 amplifiable microsatellites. This BAC library provides a new resource for Lamiaceae genomic studies, including microsatellite marker development, physical mapping, comparative genomics and genome sequencing. Characterization of positive clones provided insights into the structure of the Salvia genome. These sequences will be used in the assembly of a future genome sequence for S. officinalis. (author)
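    The library and sequencing figures quoted in this abstract can be cross-checked with simple arithmetic. The sketch below assumes that "genome equivalents" means total cloned insert length divided by haploid genome size; the abstract does not state the genome size, so the value computed here is only what the quoted numbers imply. Function names are hypothetical.

```python
def library_total_bp(n_clones: int, mean_insert_bp: int) -> int:
    """Total cloned DNA in the library, in base pairs."""
    return n_clones * mean_insert_bp


def implied_genome_size_bp(total_bp: int, genome_equivalents: float) -> float:
    """Haploid genome size implied by a stated coverage (an assumption:
    coverage = total insert length / genome size)."""
    return total_bp / genome_equivalents


def read_yield_bp(n_reads: int, read_length_bp: int) -> int:
    """Raw sequencing yield in base pairs."""
    return n_reads * read_length_bp


# Figures taken from the abstract above.
total = library_total_bp(17_764, 107_000)      # about 1.90 Gb of inserts
genome = implied_genome_size_bp(total, 3)      # roughly 0.63 Gb implied
reads = read_yield_bp(28_097_480, 90)          # about 2.53 Gb, as reported
print(f"{total / 1e9:.2f} Gb cloned, {genome / 1e6:.0f} Mb implied genome, "
      f"{reads / 1e9:.2f} Gb raw reads")
```

    The read-yield figure reproduces the 2.53 Gb stated in the abstract; the implied genome size is a derived estimate, not a reported result.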

  2. Unexpected structural complexity of supernumerary marker chromosomes characterized by microarray comparative genomic hybridization

    Directory of Open Access Journals (Sweden)

    Hing Anne V

    2008-04-01

    Full Text Available Abstract Background Supernumerary marker chromosomes (SMCs) are structurally abnormal extra chromosomes that cannot be unambiguously identified by conventional banding techniques. In the past, SMCs have been characterized using a variety of different molecular cytogenetic techniques. Although these techniques can sometimes identify the chromosome of origin of SMCs, they are cumbersome to perform and are not available in many clinical cytogenetic laboratories. Furthermore, they cannot precisely determine the region or breakpoints of the chromosome(s) involved. In this study, we describe four patients who possess one or more SMCs (a total of eight SMCs in all four patients) that were characterized by microarray comparative genomic hybridization (array CGH). Results In at least one SMC from all four patients, array CGH uncovered unexpected complexity, in the form of complex rearrangements, that could have gone undetected using other molecular cytogenetic techniques. Although array CGH accurately defined the chromosome content of all but two minute SMCs, fluorescence in situ hybridization was necessary to determine the structure of the markers. Conclusion The increasing use of array CGH in clinical cytogenetic laboratories will provide an efficient method for more comprehensive characterization of SMCs. Improved SMC characterization, facilitated by array CGH, will allow for more accurate SMC/phenotype correlation.

  3. Genomic structure, expression and association study of the porcine FSD2.

    Science.gov (United States)

    Lim, Kyu-Sang; Lee, Kyung-Tai; Lee, Si-Woo; Chai, Han-Ha; Jang, Gulwon; Hong, Ki-Chang; Kim, Tae-Hun

    2016-09-01

    The fibronectin type III and SPRY domain containing 2 (FSD2) on porcine chromosome 7 is considered a candidate gene for pork quality, since its two domains, the fibronectin type III and SPRY domains, were first identified in fibronectin and ryanodine receptor, respectively, which are candidate genes for meat quality. The aim of this study was to elucidate the genomic structure of FSD2 and functions of single nucleotide polymorphisms (SNPs) within FSD2 that are related to meat quality in pigs. Using a bacterial artificial chromosome clone sequence, we revealed that porcine FSD2 consisted of 13 exons encoding 750 amino acids. In addition, FSD2 was expressed in heart, longissimus dorsi muscle, psoas muscle, and tendon among 23 kinds of porcine tissues tested. A total of ten SNPs, including four missense mutations, were identified in the exonic region of FSD2, and two major haplotypes were obtained based on the SNP genotypes of 633 Berkshire pigs. Both haplotypes were associated significantly with intramuscular fat content (IMF, P meat color, affecting yellowness (P = 0.002). These haplotype effects were further supported by the alteration of putative protein structures with amino acid substitutions. Taken together, our results suggest that FSD2 haplotypes are involved in regulating meat quality including IMF, MP, and meat color in pigs, and may be used as meaningful molecular markers to identify pigs with preferable pork quality.

  4. Functional and Structural Overview of G-Protein-Coupled Receptors Comprehensively Obtained from Genome Sequences

    Directory of Open Access Journals (Sweden)

    Makiko Suwa

    2011-04-01

    Full Text Available An understanding of the functional mechanisms of G-protein-coupled receptors (GPCRs) is very important for GPCR-related drug design. We have developed an integrated GPCR database (SEVENS, http://sevens.cbrc.jp/) that includes 64,090 reliable GPCR genes comprehensively identified from 56 eukaryote genome sequences, and overviewed the sequence and structure spaces of the GPCRs. In vertebrates, the number of receptors for biological amines, peptides, etc. is conserved in most species, whereas the number of chemosensory receptors for odorants, pheromones, etc. differs significantly among species. The latter receptors tend to be of the single-exon or few-exon type and make up a large share of the total number of GPCRs, whereas some families, such as Class B and Class C receptors, are long owing to the presence of many exons. Statistical analyses of amino acid residues reveal that most of the conserved residues in Class A GPCRs are found in the cytoplasmic half regions of transmembrane (TM) helices, while residues characteristic of each subfamily are found in the extracellular half regions. Sixty-nine Protein Data Bank (PDB) entries of complete or fragmentary structures could be mapped onto the TM/loop regions of Class A GPCRs, covering 14 subfamilies.

  5. Genome Analysis of Structure-Function Relationships in Respiratory Complex I, an Ancient Bioenergetic Enzyme.

    Science.gov (United States)

    Degli Esposti, Mauro

    2015-11-27

    Respiratory complex I (NADH:ubiquinone oxidoreductase) is a ubiquitous bioenergetic enzyme formed by over 40 subunits in eukaryotes and a minimum of 11 subunits in bacteria. Recently, crystal structures have greatly advanced our knowledge of complex I but have not clarified the details of its reaction with ubiquinone (Q). This reaction is essential for bioenergy production and takes place in a large cavity embedded within a conserved module that is homologous to the catalytic core of Ni-Fe hydrogenases. However, how a hydrogenase core has evolved into the protonmotive Q reductase module of complex I has remained unclear. This work has exploited the abundant genomic information that is currently available to deduce structure-function relationships in complex I that indicate the evolutionary steps of Q reactivity and its adaptation to natural Q substrates. The results provide answers to fundamental questions regarding various aspects of the complex I reaction with Q and help redefine the old concept that this reaction may involve two Q or inhibitor sites. The redefinition leads to a simplified classification of the plethora of complex I inhibitors while shedding new light on the evolution of the enzyme function. © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  6. Development of bioinformatics resources for display and analysis of copy number and other structural variants in the human genome.

    Science.gov (United States)

    Zhang, J; Feuk, L; Duggan, G E; Khaja, R; Scherer, S W

    2006-01-01

    The discovery of an abundance of copy number variants (CNVs; gains and losses of DNA sequences >1 kb) and other structural variants in the human genome is influencing the way research and diagnostic analyses are being designed and interpreted. As such, comprehensive databases with the most relevant information will be critical to fully understand the results and have impact in a diverse range of disciplines ranging from molecular biology to clinical genetics. Here, we describe the development of bioinformatics resources to facilitate these studies. The Database of Genomic Variants (http://projects.tcag.ca/variation/) is a comprehensive catalogue of structural variation in the human genome. The database currently contains 1,267 regions reported to contain copy number variation or inversions in apparently healthy human cases. We describe the current contents of the database and how it can serve as a resource for interpretation of array comparative genomic hybridization (array CGH) and other DNA copy imbalance data. We also present the structure of the database, which was built using a new data modeling methodology termed Cross-Referenced Tables (XRT). This is a generic and easy-to-use platform, which is strong in handling textual data and complex relationships. Web-based presentation tools have been built allowing publication of XRT data to the web immediately along with rapid sharing of files with other databases and genome browsers. We also describe a novel tool named eFISH (electronic fluorescence in situ hybridization) (http://projects.tcag.ca/efish/), a BLAST-based program that was developed to facilitate the choice of appropriate clones for FISH and CGH experiments, as well as interpretation of results in which genomic DNA probes are used in hybridization-based experiments. Copyright (c) 2006 S. Karger AG, Basel.

  7. Promoter prediction and annotation of microbial genomes based on DNA sequence and structural responses to superhelical stress

    Directory of Open Access Journals (Sweden)

    Benham Craig J

    2006-05-01

    Full Text Available Abstract Background In our previous studies, we found that the sites in prokaryotic genomes which are most susceptible to duplex destabilization under the negative superhelical stresses that occur in vivo are statistically highly significantly associated with intergenic regions that are known or inferred to contain promoters. In this report we investigate how this structural property, either alone or together with other structural and sequence attributes, may be used to search prokaryotic genomes for promoters. Results We show that the propensity for stress-induced DNA duplex destabilization (SIDD) is closely associated with specific promoter regions. The extent of destabilization in promoter-containing regions is found to be bimodally distributed. When compared with DNA curvature, deformability, thermostability or sequence motif scores within the -10 region, SIDD is found to be the most informative DNA property regarding promoter locations in the E. coli K12 genome. SIDD properties alone perform better at detecting promoter regions than other programs trained on this genome. Because this approach has a very low false positive rate, it can be used to predict with high confidence the subset of promoters that are strongly destabilized. When SIDD properties are combined with -10 motif scores in a linear classification function, they predict promoter regions with better than 80% accuracy. When these methods were tested with promoter and non-promoter sequences from Bacillus subtilis, they achieved similar or higher accuracies. We also present a strictly SIDD-based predictor for annotating promoter sequences in complete microbial genomes. Conclusion In this report we show that the propensity to undergo stress-induced duplex destabilization (SIDD) is a distinctive structural attribute of many prokaryotic promoter sequences. We have developed methods to identify promoter sequences in prokaryotic genomes that use SIDD either as a sole predictor or in

  8. Promoter prediction and annotation of microbial genomes based on DNA sequence and structural responses to superhelical stress.

    Science.gov (United States)

    Wang, Huiquan; Benham, Craig J

    2006-05-05

    In our previous studies, we found that the sites in prokaryotic genomes which are most susceptible to duplex destabilization under the negative superhelical stresses that occur in vivo are statistically highly significantly associated with intergenic regions that are known or inferred to contain promoters. In this report we investigate how this structural property, either alone or together with other structural and sequence attributes, may be used to search prokaryotic genomes for promoters. We show that the propensity for stress-induced DNA duplex destabilization (SIDD) is closely associated with specific promoter regions. The extent of destabilization in promoter-containing regions is found to be bimodally distributed. When compared with DNA curvature, deformability, thermostability or sequence motif scores within the -10 region, SIDD is found to be the most informative DNA property regarding promoter locations in the E. coli K12 genome. SIDD properties alone perform better at detecting promoter regions than other programs trained on this genome. Because this approach has a very low false positive rate, it can be used to predict with high confidence the subset of promoters that are strongly destabilized. When SIDD properties are combined with -10 motif scores in a linear classification function, they predict promoter regions with better than 80% accuracy. When these methods were tested with promoter and non-promoter sequences from Bacillus subtilis, they achieved similar or higher accuracies. We also present a strictly SIDD-based predictor for annotating promoter sequences in complete microbial genomes. In this report we show that the propensity to undergo stress-induced duplex destabilization (SIDD) is a distinctive structural attribute of many prokaryotic promoter sequences. We have developed methods to identify promoter sequences in prokaryotic genomes that use SIDD either as a sole predictor or in combination with other DNA structural and sequence properties

  9. Glycogen Synthase Kinase-3β

    DEFF Research Database (Denmark)

    Munkholm, Klaus; Lenskjold, Toke; Jacoby, Anne Sophie

    2016-01-01

    Evidence indicates a role for glycogen synthase kinase-3β (GSK-3β) in the pathophysiology of mood disorders and in cognitive disturbances; however, the natural variation in GSK-3β activity over time is unknown. We aimed to investigate GSK-3β activity over time and its possible correlation...

  10. Cloning of rat thymic stromal lymphopoietin receptor (TSLPR) and characterization of genomic structure of murine Tslpr gene

    DEFF Research Database (Denmark)

    Blagoev, Blagoy; Nielsen, Mogens M; Angrist, Misha

    2002-01-01

    IL-2 receptor common gamma chain (Il2rg). Use of an alternative splice acceptor site leads to two alternatively spliced transcript variants of murine TSLPR, both of which are functional receptors. Finally, using linkage analysis, we mapped the murine Tslpr gene to mouse chromosome 5 between the Ecm2...... expressed in rats suggesting that TSLPR may have roles in signaling outside the hematopoietic system. A zooblot analysis revealed that TSLPR is expressed in all vertebrate species examined. The absence of TSLPR in Saccharomyces cerevisiae, Drosophila melanogaster and Caenorhabditis elegans genomes...... is similar to the expression of several other cytokine receptors that have been characterized thus far. We have also characterized the genomic structure of the murine Tslpr gene which shows that in addition to primary sequence homology, it shares a common genomic organization of coding exons with the murine...

  11. Predicting the catalytic sites of isopenicillin N synthase (IPNS ...

    African Journals Online (AJOL)

    Isopenicillin N synthase (IPNS)-related non-haem iron-dependent oxygenases and oxidases (NHIDOX) show striking structural conservation, even with low protein sequence homology. It is evident that these enzymes have an architecturally similar catalytic centre with active ligands lining the reactive pocket.

  12. Characterising the cellulose synthase complexes of cell walls

    NARCIS (Netherlands)

    Mansoori Zangir, N.

    2012-01-01

    One of the characteristics of the plant kingdom is the presence of a structural cell wall. Cellulose is a major component in both the primary and secondary cell walls of plants. In higher plants cellulose is synthesized by so called rosette protein complexes with cellulose synthases (CESAs) as

  13. Characterising the cellulose synthase complexes of cell walls

    NARCIS (Netherlands)

    Mansoori Zangir, N.

    2012-01-01

    One of the characteristics of the plant kingdom is the presence of a structural cell wall. Cellulose is a major component in both the primary and secondary cell walls of plants. In higher plants cellulose is synthesized by so called rosette protein complexes with cellulose synthases (CESAs) as the

  14. A high quality assembly of the Nile Tilapia (Oreochromis niloticus) genome reveals the structure of two sex determination regions.

    Science.gov (United States)

    Conte, Matthew A; Gammerdinger, William J; Bartie, Kerry L; Penman, David J; Kocher, Thomas D

    2017-05-02

    Tilapias are the second most farmed fishes in the world and a sustainable source of food. Like many other fish, tilapias are sexually dimorphic and sex is a commercially important trait in these fish. In this study, we developed a significantly improved assembly of the tilapia genome using the latest genome sequencing methods and show how it improves the characterization of two sex determination regions in two tilapia species. A homozygous clonal XX female Nile tilapia (Oreochromis niloticus) was sequenced to 44X coverage using Pacific Biosciences (PacBio) SMRT sequencing. Dozens of candidate de novo assemblies were generated and an optimal assembly (contig NG50 of 3.3 Mbp) was selected using principal component analysis of likelihood scores calculated from several paired-end sequencing libraries. Comparison of the new assembly to the previous O. niloticus genome assembly reveals that recently duplicated portions of the genome are now well represented. The overall number of genes in the new assembly increased by 27.3%, including a 67% increase in pseudogenes. The new tilapia genome assembly correctly represents two recent vasa gene duplication events that have been verified with BAC sequencing. A total of 146 Mbp of additional transposable element sequence is now assembled, a large proportion of which consists of recent insertions. Large centromeric satellite repeats are assembled and annotated in cichlid fish for the first time. Finally, the new assembly identifies the long-range structure of both a ~9 Mbp XY sex determination region on LG1 in O. niloticus, and a ~50 Mbp WZ sex determination region on LG3 in the related species O. aureus. This study highlights the use of long read sequencing to correctly assemble recent duplications and to characterize repeat-filled regions of the genome. The study serves as an example of the need for high quality genome assemblies and provides a framework for identifying sex determining genes in tilapia and related fish species.
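    The contig NG50 quoted in this abstract is a standard assembly contiguity statistic: the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the estimated genome size. The abstract does not describe how the authors computed it, so the following is a generic sketch with hypothetical function names and toy inputs.

```python
from typing import Optional


def ng50(contig_lengths: list[int], genome_size: int) -> Optional[int]:
    """Return the contig NG50, or None if the contigs together cover
    less than half of the estimated genome size."""
    target = genome_size / 2
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if running >= target:
            return length
    return None


def n50(contig_lengths: list[int]) -> Optional[int]:
    """N50 is the same statistic with the assembly's own total length
    used in place of the estimated genome size."""
    return ng50(contig_lengths, sum(contig_lengths))
```

    Because NG50 is normalized by genome size rather than assembly size, it allows candidate assemblies of different total length to be compared on the same footing, which is presumably why it suits a setting where dozens of candidates are evaluated.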

  15. Defining the diverse spectrum of inversions, complex structural variation, and chromothripsis in the morbid human genome.

    Science.gov (United States)

    Collins, Ryan L; Brand, Harrison; Redin, Claire E; Hanscom, Carrie; Antolik, Caroline; Stone, Matthew R; Glessner, Joseph T; Mason, Tamara; Pregno, Giulia; Dorrani, Naghmeh; Mandrile, Giorgia; Giachino, Daniela; Perrin, Danielle; Walsh, Cole; Cipicchio, Michelle; Costello, Maura; Stortchevoi, Alexei; An, Joon-Yong; Currall, Benjamin B; Seabra, Catarina M; Ragavendran, Ashok; Margolin, Lauren; Martinez-Agosto, Julian A; Lucente, Diane; Levy, Brynn; Sanders, Stephan J; Wapner, Ronald J; Quintero-Rivera, Fabiola; Kloosterman, Wigard; Talkowski, Michael E

    2017-03-06

    Structural variation (SV) influences genome organization and contributes to human disease. However, the complete mutational spectrum of SV has not been routinely captured in disease association studies. We sequenced 689 participants with autism spectrum disorder (ASD) and other developmental abnormalities to construct a genome-wide map of large SV. Using long-insert jumping libraries at 105X mean physical coverage and linked-read whole-genome sequencing from 10X Genomics, we document seven major SV classes at ~5 kb SV resolution. Our results encompass 11,735 distinct large SV sites, 38.1% of which are novel and 16.8% of which are balanced or complex. We characterize 16 recurrent subclasses of complex SV (cxSV), revealing that: (1) cxSV are larger and rarer than canonical SV; (2) each genome harbors 14 large cxSV on average; (3) 84.4% of large cxSVs involve inversion; and (4) most large cxSV (93.8%) have not been delineated in previous studies. Rare SVs are more likely to disrupt coding and regulatory non-coding loci, particularly when truncating constrained and disease-associated genes. We also identify multiple cases of catastrophic chromosomal rearrangements known as chromoanagenesis, including somatic chromoanasynthesis, and extreme balanced germline chromothripsis events involving up to 65 breakpoints and 60.6 Mb across four chromosomes, further defining rare categories of extreme cxSV. These data provide a foundational map of large SV in the morbid human genome and demonstrate a previously underappreciated abundance and diversity of cxSV that should be considered in genomic studies of human disease.

  16. Reproductive Mode and the Evolution of Genome Size and Structure in Caenorhabditis Nematodes.

    Science.gov (United States)

    Fierst, Janna L; Willis, John H; Thomas, Cristel G; Wang, Wei; Reynolds, Rose M; Ahearne, Timothy E; Cutter, Asher D; Phillips, Patrick C

    2015-06-01

    The self-fertile nematode worms Caenorhabditis elegans, C. briggsae, and C. tropicalis evolved independently from outcrossing male-female ancestors and have genomes 20-40% smaller than closely related outcrossing relatives. This pattern of smaller genomes for selfing species and larger genomes for closely related outcrossing species is also seen in plants. We use comparative genomics, including the first high quality genome assembly for an outcrossing member of the genus (C. remanei) to test several hypotheses for the evolution of genome reduction under a change in mating system. Unlike plants, it does not appear that reductions in the number of repetitive elements, such as transposable elements, are an important contributor to the change in genome size. Instead, all functional genomic categories are lost in approximately equal proportions. Theory predicts that self-fertilization should equalize the effective population size, as well as the resulting effects of genetic drift, between the X chromosome and autosomes. Contrary to this, we find that the self-fertile C. briggsae and C. elegans have larger intergenic spaces and larger protein-coding genes on the X chromosome when compared to autosomes, while C. remanei actually has smaller introns on the X chromosome than either self-reproducing species. Rather than being driven by mutational biases and/or genetic drift caused by a reduction in effective population size under self reproduction, changes in genome size in this group of nematodes appear to be caused by genome-wide patterns of gene loss, most likely generated by genomic adaptation to self reproduction per se.

  17. Reproductive Mode and the Evolution of Genome Size and Structure in Caenorhabditis Nematodes.

    Directory of Open Access Journals (Sweden)

    Janna L Fierst

    2015-06-01

    The self-fertile nematode worms Caenorhabditis elegans, C. briggsae, and C. tropicalis evolved independently from outcrossing male-female ancestors and have genomes 20-40% smaller than closely related outcrossing relatives. This pattern of smaller genomes for selfing species and larger genomes for closely related outcrossing species is also seen in plants. We use comparative genomics, including the first high quality genome assembly for an outcrossing member of the genus (C. remanei) to test several hypotheses for the evolution of genome reduction under a change in mating system. Unlike plants, it does not appear that reductions in the number of repetitive elements, such as transposable elements, are an important contributor to the change in genome size. Instead, all functional genomic categories are lost in approximately equal proportions. Theory predicts that self-fertilization should equalize the effective population size, as well as the resulting effects of genetic drift, between the X chromosome and autosomes. Contrary to this, we find that the self-fertile C. briggsae and C. elegans have larger intergenic spaces and larger protein-coding genes on the X chromosome when compared to autosomes, while C. remanei actually has smaller introns on the X chromosome than either self-reproducing species. Rather than being driven by mutational biases and/or genetic drift caused by a reduction in effective population size under self reproduction, changes in genome size in this group of nematodes appear to be caused by genome-wide patterns of gene loss, most likely generated by genomic adaptation to self reproduction per se.

  18. Functional Characterization of Nine Norway Spruce TPS Genes and Evolution of Gymnosperm Terpene Synthases of the TPS-d Subfamily

    Science.gov (United States)

    Martin, Diane M.; Fäldt, Jenny; Bohlmann, Jörg

    2004-01-01

    Constitutive and induced terpenoids are important defense compounds for many plants against potential herbivores and pathogens. In Norway spruce (Picea abies L. Karst), treatment with methyl jasmonate induces complex chemical and biochemical terpenoid defense responses associated with traumatic resin duct development in stems and volatile terpenoid emissions in needles. The cloning of (+)-3-carene synthase was the first step in characterizing this system at the molecular genetic level. Here we report the isolation and functional characterization of nine additional terpene synthase (TPS) cDNAs from Norway spruce. These cDNAs encode four monoterpene synthases, myrcene synthase, (−)-limonene synthase, (−)-α/β-pinene synthase, and (−)-linalool synthase; three sesquiterpene synthases, longifolene synthase, E,E-α-farnesene synthase, and E-α-bisabolene synthase; and two diterpene synthases, isopimara-7,15-diene synthase and levopimaradiene/abietadiene synthase, each with a unique product profile. To our knowledge, genes encoding isopimara-7,15-diene synthase and longifolene synthase have not been previously described, and this linalool synthase is the first described from a gymnosperm. These functionally diverse TPS account for much of the structural diversity of constitutive and methyl jasmonate-induced terpenoids in foliage, xylem, bark, and volatile emissions from needles of Norway spruce. Phylogenetic analyses based on the inclusion of these TPS into the TPS-d subfamily revealed that functional specialization of conifer TPS occurred before speciation of Pinaceae. Furthermore, based on TPS enclaves created by distinct branching patterns, the TPS-d subfamily is divided into three groups according to sequence similarities and functional assessment. Similarities of TPS evolution in angiosperms and modeling of TPS protein structures are discussed. PMID:15310829

  19. Cell-of-origin-specific 3D genome structure acquired during somatic cell reprogramming

    NARCIS (Netherlands)

    Krijger, Peter Hugo Lodewijk; Di Stefano, Bruno; de Wit, Elzo; Limone, Francesco; Van Oevelen, Chris; De Laat, Wouter; Graf, Thomas

    2016-01-01

    Forced expression of reprogramming factors can convert somatic cells into induced pluripotent stem cells (iPSCs). Here we studied genome topology dynamics during reprogramming of different somatic cell types with highly distinct genome conformations. We find large-scale topologically associated

  1. Structure and expression of the tomato spotted wilt virus genome : a plant-infecting bunyavirus

    NARCIS (Netherlands)

    Kormelink, R.J.M.

    1994-01-01

    This thesis describes studies which are aimed at the elucidation of the genetic organisation and expression strategy of the tomato spotted wilt virus (TSWV) RNA genome.

    Using specific cDNA clones, corresponding to all three genomic RNA segments, the synthesis of virus specific RNA

  2. The roles of adenoviral vectors and donor DNA structures on genome editing

    NARCIS (Netherlands)

    Holkers, Maarten

    2016-01-01

    Accurate and efficient genome editing is primarily dependent on the generation of a sequence-specific, genomic double-stranded DNA break (DSB) combined with the introduction of an exogenous DNA template into target cells. The exogenous template, called donor DNA, normally contains the foreign

  3. Structure-function analysis of Drosophila Notch using genomic rescue transgenes.

    Science.gov (United States)

    Leonardi, Jessica; Jafar-Nejad, Hamed

    2014-01-01

    One of the evolutionarily conserved posttranslational modifications of the Notch receptors is the addition of an O-linked glucose to epidermal growth factor-like (EGF) repeats with a specific consensus sequence by the protein O-glucosyltransferase Rumi (POGLUT1 in human). Loss of rumi in flies results in a temperature-sensitive loss of Notch signaling. To demonstrate that the Notch receptor itself is the biologically relevant target of Rumi in flies, and to determine the role of the 18 Rumi target sites on Notch in regulating Notch signaling, we have performed an in vivo structure-function analysis of Drosophila Notch. In this chapter, we provide a detailed protocol for this analysis. To avoid the potential artifacts associated with overexpression of Notch and random insertion of transgenes, we have used recombineering and site-specific integration technologies, which have been adapted for usage in Drosophila in recent years. Using gene synthesis and site-directed mutagenesis, we generated a series of Notch genomic transgenes which harbor mutations in all or specific subsets of Notch O-glucose sites. Gene dosage and rescue experiments in animals raised at various temperatures allowed us to dissect the contribution of O-glucosylation sites to the regulation of the Notch signaling strength. The reagents and methods presented here can be used to address similar questions about other posttranslational modifications of Notch or other Drosophila proteins.

  4. Current status of potential applications of repurposed Cas9 for structural and functional genomics of plants.

    Science.gov (United States)

    Seth, Kunal; Harish

    2016-11-25

    Redesigned Cas9 has emerged as a tool with various applications such as gene editing, gene regulation, epigenetic modification and chromosomal imaging. Target-specific single guide RNA (sgRNA) can be used with Cas9 for precise gene editing with higher efficiency than previously known methods. Further, nuclease-deactivated Cas9 (dCas9) can be fused with an activator or repressor for activation (CRISPRa) and repression (CRISPRi) of gene expression, respectively. dCas9 fused with an epigenetic modifier such as a methylase or acetylase further expands the scope of this technique. Fluorescent probes can be tagged to dCas9 to visualize the chromosome. Due to its widespread application, simplicity, accessibility, efficacy and universality, this technique is expanding structural and functional genomic studies of plants and the development of CRISPR crops. The present review focuses on the current status of using the repurposed Cas9 system in these various areas, with a major focus on application in plants. Major challenges, concerns and future directions of using this technique are discussed in brief. Copyright © 2016 Elsevier Inc. All rights reserved.

  5. [Structural mechanism of immune evasion of HIV-1 gp120 by genomic, computational, and experimental science].

    Science.gov (United States)

    Yokoyama, Masaru

    2011-06-01

    The third variable region (V3) of the human immunodeficiency virus type 1 (HIV-1) envelope gp120 subunit participates in determining viral co-receptor tropism and host humoral immune responses. The positive charge of V3 plays a key role in determining viral co-receptor tropism. In our previous papers, we showed a key role of the V3 net positive charge in immunological escape and co-receptor tropism evolution in vivo. On the other hand, several papers have suggested that trimeric gp120s are protected from the immune system by occlusion on the oligomer, by mutational variation, by carbohydrate masking and by conformational masking. If the mechanism of neutralization escape can be revealed, we expect that it will become possible to regulate the neutralization of HIV-1. In this review, we give an overview of the structural mechanism of neutralization escape of HIV-1 gp120 as examined by computational science. Computational science for virology can provide more valuable information when combined with genomic and experimental science.
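
    The "net positive charge" of a peptide segment such as V3 can be approximated by counting charged residues. The sketch below uses one common simplification, counting lysine and arginine as +1 and aspartate and glutamate as -1 and ignoring histidine and the termini; that convention, the function name `net_charge`, and the toy peptide are assumptions made for illustration and are not taken from the review.

```python
# Simplified residue charges near neutral pH (assumed convention):
# Lys (K) and Arg (R) count as +1, Asp (D) and Glu (E) as -1, all others as 0.
RESIDUE_CHARGE = {"K": 1, "R": 1, "D": -1, "E": -1}


def net_charge(peptide):
    """Return the approximate net charge of a one-letter amino acid string."""
    return sum(RESIDUE_CHARGE.get(residue, 0) for residue in peptide.upper())


# Hypothetical toy peptide, not a real V3 sequence.
toy_peptide = "CTRPNKRGDA"
print(net_charge(toy_peptide))  # two R and one K minus one D
```

    More careful estimates would account for pH-dependent side-chain ionization (including histidine), which this counting rule deliberately leaves out.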

  6. Structural and functional characterization of the Helicobacter pylori cytidine 5'-monophosphate-pseudaminic acid synthase PseF: molecular insight into substrate recognition and catalysis mechanism

    Directory of Open Access Journals (Sweden)

    Wahid SUH

    2017-10-01

    Syeda Umme Habiba Wahid, Department of Microbiology, University of Chittagong, Chittagong, Bangladesh. Abstract: The bacterium Helicobacter pylori is a human gastric pathogen that can cause a wide range of diseases, including chronic gastritis, peptic ulcer and gastric carcinoma. It is classified as a definitive (class I) human carcinogen by the International Agency for Research on Cancer. Flagella-mediated motility is essential for H. pylori to initiate colonization and for the development of infection in human beings. Glycosylation of the H. pylori flagellum with pseudaminic acid (Pse; 5,7-diacetamido-3,5,7,9-tetradeoxy-l-glycero-l-manno-nonulosonic acid) is essential for flagella assembly and function. The sixth step in the Pse biosynthesis pathway, activation of Pse by addition of a cytidine 5′-monophosphate (CMP) to generate CMP-Pse, is catalyzed by a metal-dependent enzyme, pseudaminic acid biosynthesis protein F (PseF), using cytidine 5′-triphosphate (CTP) as a cofactor. No crystal–structural information for PseF is available. This study describes the first three-dimensional model of H. pylori PseF obtained using biocomputational tools. PseF harbors an α/β-type hydrolase fold with a β-hairpin (HP) dimerization domain. Comparison of PseF with other structural homologs allowed identification of crucial residues for substrate recognition and the catalytic mechanism. This structural information would pave the way to design novel therapeutics to combat bacterial infection. Keywords: H. pylori, motility, glycosylation, homology modeling, pseudaminic acid

  7. Identification and classification of conserved RNA secondary structures in the human genome

    DEFF Research Database (Denmark)

    Pedersen, Jakob Skou; Bejerano, Gill; Siepel, Adam

    2006-01-01

    The discoveries of microRNAs and riboswitches, among others, have shown functional RNAs to be biologically more important and genomically more prevalent than previously anticipated. We have developed a general comparative genomics method based on phylogenetic stochastic context-free grammars for identifying functional RNAs encoded in the human genome and used it to survey an eight-way genome-wide alignment of the human, chimpanzee, mouse, rat, dog, chicken, zebra-fish, and puffer-fish genomes for deeply conserved functional RNAs. At a loose threshold for acceptance, this search resulted in a set ... , the results nevertheless provide evidence for many new human functional RNAs and present specific predictions to facilitate their further characterization.

  8. Recovering Genomics Clusters of Secondary Metabolites from Lakes Using Genome-Resolved Metagenomics

    Directory of Open Access Journals (Sweden)

    Rafael R. C. Cuadrat

    2018-02-01

    Metagenomic approaches have become increasingly popular in the past decades due to decreasing costs of DNA sequencing and developments in bioinformatics. So far, however, the recovery of long genes coding for secondary metabolites still represents a big challenge. Often, the quality of metagenome assemblies is poor, especially in environments with a high microbial diversity where sequence coverage is low and the complexity of natural communities is high. Recently, new and improved algorithms for binning environmental reads and contigs have been developed to overcome such limitations. Some of these algorithms use a similarity detection approach to classify the obtained reads into taxonomical units and to assemble draft genomes. This approach, however, is quite limited since it can classify exclusively sequences similar to those available (and well classified) in the databases. In this work, we used draft genomes from Lake Stechlin, north-eastern Germany, recovered by MetaBat, an efficient binning tool that integrates empirical probabilistic distances of genome abundance and tetranucleotide frequency for accurate metagenome binning. These genomes were screened for secondary metabolism genes, such as polyketide synthases (PKS) and non-ribosomal peptide synthases (NRPS), using the Anti-SMASH and NAPDOS workflows. With this approach we were able to identify 243 secondary metabolite clusters from 121 genomes recovered from our lake samples. A total of 18 NRPS, 19 PKS, and 3 hybrid PKS/NRPS clusters were found. In addition, it was possible to predict the partial structure of several secondary metabolite clusters, allowing for taxonomical classifications and phylogenetic inferences. Our approach revealed a high potential to recover and study secondary metabolite genes from any aquatic ecosystem.
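
    Tetranucleotide frequency, one of the two signals this abstract says the binning tool integrates, can be illustrated with a minimal sketch. This is not MetaBat's implementation: the function name, the choice to skip windows containing ambiguous bases, and the toy sequence are assumptions made for illustration. Binning tools often also merge each 4-mer with its reverse complement, which this sketch does not do.

```python
from collections import Counter
from itertools import product


def tetranucleotide_frequencies(sequence):
    """Return the relative frequency of each of the 256 DNA 4-mers.

    Overlapping windows are counted; windows containing any character other
    than A, C, G or T are skipped. All frequencies are 0.0 when no valid
    window exists.
    """
    seq = sequence.upper()
    counts = Counter()
    for start in range(len(seq) - 3):
        kmer = seq[start:start + 4]
        if set(kmer) <= set("ACGT"):
            counts[kmer] += 1
    total = sum(counts.values())
    all_kmers = ["".join(bases) for bases in product("ACGT", repeat=4)]
    if total == 0:
        return {kmer: 0.0 for kmer in all_kmers}
    return {kmer: counts[kmer] / total for kmer in all_kmers}


# Toy contig for illustration only.
freqs = tetranucleotide_frequencies("ACGTACGT")
print(freqs["ACGT"])  # 2 of the 5 overlapping windows are ACGT
```

    Contigs from the same genome tend to have similar frequency vectors, which is why such vectors can be compared alongside abundance when grouping contigs into draft genomes.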

  9. Salmonella strains isolated from Galápagos iguanas show spatial structuring of serovar and genomic diversity.

    Directory of Open Access Journals (Sweden)

    Emily W Lankau

    It is thought that dispersal limitation primarily structures host-associated bacterial populations because host distributions inherently limit transmission opportunities. However, enteric bacteria may disperse great distances during food-borne outbreaks. It is unclear if such rapid long-distance dispersal events happen regularly in natural systems or if these events represent an anthropogenic exception. We characterized Salmonella enterica isolates from the feces of free-living Galápagos land and marine iguanas from five sites on four islands using serotyping and genomic fingerprinting. Each site hosted unique and nearly exclusive serovar assemblages. Genomic fingerprint analysis offered a more complex model of S. enterica biogeography, with evidence of both unique strain pools and of spatial population structuring along a geographic gradient. These findings suggest that even relatively generalist enteric bacteria may be strongly dispersal limited in a natural system with strong barriers, such as oceanic divides. Yet, the differing results from the two typing methods also suggest that genomic variation is less dispersal limited, allowing for different ecological processes to shape biogeographical patterns of the core and flexible portions of this bacterial species' genome.

  10. Salmonella strains isolated from Galápagos iguanas show spatial structuring of serovar and genomic diversity.

    Science.gov (United States)

    Lankau, Emily W; Cruz Bedon, Lenin; Mackie, Roderick I

    2012-01-01

    It is thought that dispersal limitation primarily structures host-associated bacterial populations because host distributions inherently limit transmission opportunities. However, enteric bacteria may disperse great distances during food-borne outbreaks. It is unclear if such rapid long-distance dispersal events happen regularly in natural systems or if these events represent an anthropogenic exception. We characterized Salmonella enterica isolates from the feces of free-living Galápagos land and marine iguanas from five sites on four islands using serotyping and genomic fingerprinting. Each site hosted unique and nearly exclusive serovar assemblages. Genomic fingerprint analysis offered a more complex model of S. enterica biogeography, with evidence of both unique strain pools and of spatial population structuring along a geographic gradient. These findings suggest that even relatively generalist enteric bacteria may be strongly dispersal limited in a natural system with strong barriers, such as oceanic divides. Yet, the differing results from the two typing methods also suggest that genomic variation is less dispersal limited, allowing for different ecological processes to shape biogeographical patterns of the core and flexible portions of this bacterial species' genome.

  11. Salmonella Strains Isolated from Galápagos Iguanas Show Spatial Structuring of Serovar and Genomic Diversity

    Science.gov (United States)

    Lankau, Emily W.; Cruz Bedon, Lenin; Mackie, Roderick I.

    2012-01-01

    It is thought that dispersal limitation primarily structures host-associated bacterial populations because host distributions inherently limit transmission opportunities. However, enteric bacteria may disperse great distances during food-borne outbreaks. It is unclear if such rapid long-distance dispersal events happen regularly in natural systems or if these events represent an anthropogenic exception. We characterized Salmonella enterica isolates from the feces of free-living Galápagos land and marine iguanas from five sites on four islands using serotyping and genomic fingerprinting. Each site hosted unique and nearly exclusive serovar assemblages. Genomic fingerprint analysis offered a more complex model of S. enterica biogeography, with evidence of both unique strain pools and of spatial population structuring along a geographic gradient. These findings suggest that even relatively generalist enteric bacteria may be strongly dispersal limited in a natural system with strong barriers, such as oceanic divides. Yet, the differing results from the two typing methods also suggest that genomic variation is less dispersal limited, allowing for different ecological processes to shape biogeographical patterns of the core and flexible portions of this bacterial species' genome. PMID:22615968

  12. SeqFold: genome-scale reconstruction of RNA secondary structure integrating high-throughput sequencing data.

    Science.gov (United States)

    Ouyang, Zhengqing; Snyder, Michael P; Chang, Howard Y

    2013-02-01

    We present an integrative approach, SeqFold, that combines high-throughput RNA structure profiling data with computational prediction for genome-scale reconstruction of RNA secondary structures. SeqFold transforms experimental RNA structure information into a structure preference profile (SPP) and uses it to select stable RNA structure candidates representing the structure ensemble. Under a high-dimensional classification framework, SeqFold efficiently matches a given SPP to the most likely cluster of structures sampled from the Boltzmann-weighted ensemble. SeqFold is able to incorporate diverse types of RNA structure profiling data, including parallel analysis of RNA structure (PARS), selective 2'-hydroxyl acylation analyzed by primer extension sequencing (SHAPE-Seq), fragmentation sequencing (FragSeq) data generated by deep sequencing, and conventional SHAPE data. Using the known structures of a wide range of mRNAs and noncoding RNAs as benchmarks, we demonstrate that SeqFold outperforms or matches existing approaches in accuracy and is more robust to noise in experimental data. Application of SeqFold to reconstruct the secondary structures of the yeast transcriptome reveals the diverse impact of RNA secondary structure on gene regulation, including translation efficiency, transcription initiation, and protein-RNA interactions. SeqFold can be easily adapted to incorporate any new types of high-throughput RNA structure profiling data and is widely applicable to analyze RNA structures in any transcriptome.

  13. Bioinformatical approaches to RN