WorldWideScience

Sample records for elucidating genome structure

  1. Using Genomics for Natural Product Structure Elucidation.

    Science.gov (United States)

    Tietz, Jonathan I; Mitchell, Douglas A

    2016-01-01

    Natural products (NPs) are the most historically bountiful source of chemical matter for drug development-especially for anti-infectives. With insights gleaned from genome mining, interest in natural product discovery has been reinvigorated. An essential stage in NP discovery is structural elucidation, which sheds light not only on the chemical composition of a molecule but also its novelty, properties, and derivatization potential. The history of structure elucidation is replete with techniquebased revolutions: combustion analysis, crystallography, UV, IR, MS, and NMR have each provided game-changing advances; the latest such advance is genomics. All natural products have a genetic basis, and the ability to obtain and interpret genomic information for structure elucidation is increasingly available at low cost to non-specialists. In this review, we describe the value of genomics as a structural elucidation technique, especially from the perspective of the natural product chemist approaching an unknown metabolite. Herein we first introduce the databases and programs of interest to the natural products chemist, with an emphasis on those currently most suited for general usability. We describe strategies for linking observed natural product-linked phenotypes to their corresponding gene clusters. We then discuss techniques for extracting structural information from genes, illustrated with numerous case examples. We also provide an analysis of the biases and limitations of the field with recommendations for future development. Our overview is not only aimed at biologically-oriented researchers already at ease with bioinformatic techniques, but also, in particular, at natural product, organic, and/or medicinal chemists not previously familiar with genomic techniques.

  2. Elucidation of Operon Structures across Closely Related Bacterial Genomes

    Science.gov (United States)

    Li, Guojun

    2014-01-01

    About half of the protein-coding genes in prokaryotic genomes are organized into operons to facilitate co-regulation during transcription. With the evolution of genomes, operon structures are undergoing changes which could coordinate diverse gene expression patterns in response to various stimuli during the life cycle of a bacterial cell. Here we developed a graph-based model to elucidate the diversity of operon structures across a set of closely related bacterial genomes. In the constructed graph, each node represents one orthologous gene group (OGG) and a pair of nodes will be connected if any two genes, from the corresponding two OGGs respectively, are located in the same operon as immediate neighbors in any of the considered genomes. Through identifying the connected components in the above graph, we found that genes in a connected component are likely to be functionally related and these identified components tend to form treelike topology, such as paths and stars, corresponding to different biological mechanisms in transcriptional regulation as follows. Specifically, (i) a path-structure component integrates genes encoding a protein complex, such as ribosome; and (ii) a star-structure component not only groups related genes together, but also reflects the key functional roles of the central node of this component, such as the ABC transporter with a transporter permease and substrate-binding proteins surrounding it. Most interestingly, the genes from organisms with highly diverse living environments, i.e., biomass degraders and animal pathogens of clostridia in our study, can be clearly classified into different topological groups on some connected components. PMID:24959722

  3. Comparative genomics of the relationship between gene structure and expression

    NARCIS (Netherlands)

    Ren, X.

    2006-01-01

    The relationship between the structure of genes and their expression is a relatively new aspect of genome organization and regulation. With more genome sequences and expression data becoming available, bioinformatics approaches can help the further elucidation of the relationships between gene

  4. Structure elucidation of secondary natural products

    International Nuclear Information System (INIS)

    Seger, C.

    2001-06-01

    The presented thesis deals with the structure elucidation of secondary natural products. Most of the compounds under investigation were terpenes, especially triterpenes, alkaloids and stilbenoids. Besides characterizing a multitude of already known and also new compounds, it was possible to detect and correct wrongly assigned literature data. The methodological aspect of this thesis lies - beside in the utilization of modern 2D NMR spectroscopy - in the evaluation of computer assisted structure elucidation (CASE) techniques in the course of spectroscopy supported structure elucidation processes. (author)

  5. Blind trials of computer-assisted structure elucidation software

    Directory of Open Access Journals (Sweden)

    Moser Arvin

    2012-02-01

    Full Text Available Abstract Background One of the largest challenges in chemistry today remains that of efficiently mining through vast amounts of data in order to elucidate the chemical structure for an unknown compound. The elucidated candidate compound must be fully consistent with the data and any other competing candidates efficiently eliminated without doubt by using additional data if necessary. It has become increasingly necessary to incorporate an in silico structure generation and verification tool to facilitate this elucidation process. An effective structure elucidation software technology aims to mimic the skills of a human in interpreting the complex nature of spectral data while producing a solution within a reasonable amount of time. This type of software is known as computer-assisted structure elucidation or CASE software. A systematic trial of the ACD/Structure Elucidator CASE software was conducted over an extended period of time by analysing a set of single and double-blind trials submitted by a global audience of scientists. The purpose of the blind trials was to reduce subjective bias. Double-blind trials comprised of data where the candidate compound was unknown to both the submitting scientist and the analyst. The level of expertise of the submitting scientist ranged from novice to expert structure elucidation specialists with experience in pharmaceutical, industrial, government and academic environments. Results Beginning in 2003, and for the following nine years, the algorithms and software technology contained within ACD/Structure Elucidator have been tested against 112 data sets; many of these were unique challenges. Of these challenges 9% were double-blind trials. The results of eighteen of the single-blind trials were investigated in detail and included problems of a diverse nature with many of the specific challenges associated with algorithmic structure elucidation such as deficiency in protons, structure symmetry, a large number of

  6. Elucidating the triplicated ancestral genome structure of radish based on chromosome-level comparison with the Brassica genomes.

    Science.gov (United States)

    Jeong, Young-Min; Kim, Namshin; Ahn, Byung Ohg; Oh, Mijin; Chung, Won-Hyong; Chung, Hee; Jeong, Seongmun; Lim, Ki-Byung; Hwang, Yoon-Jung; Kim, Goon-Bo; Baek, Seunghoon; Choi, Sang-Bong; Hyung, Dae-Jin; Lee, Seung-Won; Sohn, Seong-Han; Kwon, Soo-Jin; Jin, Mina; Seol, Young-Joo; Chae, Won Byoung; Choi, Keun Jin; Park, Beom-Seok; Yu, Hee-Ju; Mun, Jeong-Hwan

    2016-07-01

    This study presents a chromosome-scale draft genome sequence of radish that is assembled into nine chromosomal pseudomolecules. A comprehensive comparative genome analysis with the Brassica genomes provides genomic evidences on the evolution of the mesohexaploid radish genome. Radish (Raphanus sativus L.) is an agronomically important root vegetable crop and its origin and phylogenetic position in the tribe Brassiceae is controversial. Here we present a comprehensive analysis of the radish genome based on the chromosome sequences of R. sativus cv. WK10039. The radish genome was sequenced and assembled into 426.2 Mb spanning >98 % of the gene space, of which 344.0 Mb were integrated into nine chromosome pseudomolecules. Approximately 36 % of the genome was repetitive sequences and 46,514 protein-coding genes were predicted and annotated. Comparative mapping of the tPCK-like ancestral genome revealed that the radish genome has intermediate characteristics between the Brassica A/C and B genomes in the triplicated segments, suggesting an internal origin from the genus Brassica. The evolutionary characteristics shared between radish and other Brassica species provided genomic evidences that the current form of nine chromosomes in radish was rearranged from the chromosomes of hexaploid progenitor. Overall, this study provides a chromosome-scale draft genome sequence of radish as well as novel insight into evolution of the mesohexaploid genomes in the tribe Brassiceae.

  7. Decoding the non-coding genome: elucidating genetic risk outside the coding genome.

    Science.gov (United States)

    Barr, C L; Misener, V L

    2016-01-01

    Current evidence emerging from genome-wide association studies indicates that the genetic underpinnings of complex traits are likely attributable to genetic variation that changes gene expression, rather than (or in combination with) variation that changes protein-coding sequences. This is particularly compelling with respect to psychiatric disorders, as genetic changes in regulatory regions may result in differential transcriptional responses to developmental cues and environmental/psychosocial stressors. Until recently, however, the link between transcriptional regulation and psychiatric genetic risk has been understudied. Multiple obstacles have contributed to the paucity of research in this area, including challenges in identifying the positions of remote (distal from the promoter) regulatory elements (e.g. enhancers) and their target genes and the underrepresentation of neural cell types and brain tissues in epigenome projects - the availability of high-quality brain tissues for epigenetic and transcriptome profiling, particularly for the adolescent and developing brain, has been limited. Further challenges have arisen in the prediction and testing of the functional impact of DNA variation with respect to multiple aspects of transcriptional control, including regulatory-element interaction (e.g. between enhancers and promoters), transcription factor binding and DNA methylation. Further, the brain has uncommon DNA-methylation marks with unique genomic distributions not found in other tissues - current evidence suggests the involvement of non-CG methylation and 5-hydroxymethylation in neurodevelopmental processes but much remains unknown. We review here knowledge gaps as well as both technological and resource obstacles that will need to be overcome in order to elucidate the involvement of brain-relevant gene-regulatory variants in genetic risk for psychiatric disorders. © 2015 John Wiley & Sons Ltd and International Behavioural and Neural Genetics Society.

  8. Spectroscopic databases - A tool for structure elucidation

    Energy Technology Data Exchange (ETDEWEB)

    Luksch, P [Fachinformationszentrum Karlsruhe, Gesellschaft fuer Wissenschaftlich-Technische Information mbH, Eggenstein-Leopoldshafen (Germany)

    1990-05-01

    Spectroscopic databases have developed to useful tools in the process of structure elucidation. Besides the conventional library searches, new intelligent programs have been added, that are able to predict structural features from measured spectra or to simulate for a given structure. The example of the C13NMR/IR database developed at BASF and available on STN is used to illustrate the present capabilities of online database. New developments in the field of spectrum simulation and methods for the prediction of complete structures from spectroscopic information are reviewed. (author). 10 refs, 5 figs.

  9. NMR spectroscopy: structure elucidation of cycloelatanene A: a natural product case study.

    Science.gov (United States)

    Urban, Sylvia; Dias, Daniel Anthony

    2013-01-01

    The structure elucidation of new secondary metabolites derived from marine and terrestrial sources is frequently a challenging task. The hurdles include the ability to isolate stable secondary metabolites of sufficient purity that are often present in products that the compound may rapidly degrade during and/or after the isolation, due to sensitivity to light, air oxidation, and/or temperature. In this way, precautions need to be taken, as much as possible to avoid any such chemical inter-conversions and/or degradations. Immediately after purification, the next step is to rapidly acquire all analytical spectroscopic data in order to complete the characterization of the isolated secondary metabolite(s), prior to any possible decomposition. The final hurdle in this multiple step process, especially in the acquisition of the NMR spectroscopic and other analytical data (mass spectra, infrared and ultra-violet spectra, optical rotation, etc.), is to assemble the structural moieties/units in an effort to complete the structure elucidation. Often ambiguity with the elucidation of the final structure remains when structural fragments identified are difficult to piece together on the basis of the HMBC NMR correlations or when the relative configuration cannot be unequivocally identified on the basis of NOE NMR enhancements observed. Herein, we describe the methodology used to carry out the structure elucidation of a new C16 chamigrene, cycloelatanene A (5) which was isolated from the southern Australian marine alga Laurencia elata (Rhodomelaceae). The general approach and principles used in the structure determination of this compound can be applied to the structure elucidation of other small molecular weight compounds derived from either natural or synthetic sources.

  10. Functional Coverage of the Human Genome by Existing Structures, Structural Genomics Targets, and Homology Models.

    Directory of Open Access Journals (Sweden)

    2005-08-01

    Full Text Available The bias in protein structure and function space resulting from experimental limitations and targeting of particular functional classes of proteins by structural biologists has long been recognized, but never continuously quantified. Using the Enzyme Commission and the Gene Ontology classifications as a reference frame, and integrating structure data from the Protein Data Bank (PDB, target sequences from the structural genomics projects, structure homology derived from the SUPERFAMILY database, and genome annotations from Ensembl and NCBI, we provide a quantified view, both at the domain and whole-protein levels, of the current and projected coverage of protein structure and function space relative to the human genome. Protein structures currently provide at least one domain that covers 37% of the functional classes identified in the genome; whole structure coverage exists for 25% of the genome. If all the structural genomics targets were solved (twice the current number of structures in the PDB, it is estimated that structures of one domain would cover 69% of the functional classes identified and complete structure coverage would be 44%. Homology models from existing experimental structures extend the 37% coverage to 56% of the genome as single domains and 25% to 31% for complete structures. Coverage from homology models is not evenly distributed by protein family, reflecting differing degrees of sequence and structure divergence within families. While these data provide coverage, conversely, they also systematically highlight functional classes of proteins for which structures should be determined. Current key functional families without structure representation are highlighted here; updated information on the "most wanted list" that should be solved is available on a weekly basis from http://function.rcsb.org:8080/pdb/function_distribution/index.html.

  11. Chemical synthesis and structure elucidation of bovine κ-casein (1-44)

    International Nuclear Information System (INIS)

    Bansal, Paramjit S.; Grieve, Paul A.; Marschke, Ronald J.; Daly, Norelle L.; McGhie, Emily; Craik, David J.; Alewood, Paul F.

    2006-01-01

    The caseins (α s1 , α s2 , β, and κ) are phosphoproteins present in bovine milk that have been studied for over a century and whose structures remain obscure. Here we describe the chemical synthesis and structure elucidation of the N-terminal segment (1-44) of bovine κ-casein, the protein which maintains the micellar structure of the caseins. κ-Casein (1-44) was synthesised by highly optimised Boc solid-phase peptide chemistry and characterised by mass spectrometry. Structure elucidation was carried out by circular dichroism and nuclear magnetic resonance spectroscopy. CD analysis demonstrated that the segment was ill defined in aqueous medium but in 30% trifluoroethanol it exhibited considerable helical structure. Further, NMR analysis showed the presence of a helical segment containing 26 residues which extends from Pro 8 to Arg 34 . This is First report which demonstrates extensive secondary structure within the casein class of proteins

  12. Collision-Induced Dissociation Mass Spectrometry: A Powerful Tool for Natural Product Structure Elucidation.

    Science.gov (United States)

    Johnson, Andrew R; Carlson, Erin E

    2015-11-03

    Mass spectrometry is a powerful tool in natural product structure elucidation, but our ability to directly correlate fragmentation spectra to these structures lags far behind similar efforts in peptide sequencing and proteomics. Often, manual data interpretation is required and our knowledge of the expected fragmentation patterns for many scaffolds is limited, further complicating analysis. Here, we summarize advances in natural product structure elucidation based upon the application of collision induced dissociation fragmentation mechanisms.

  13. Computer-assisted methods for molecular structure elucidation: realizing a spectroscopist's dream

    Directory of Open Access Journals (Sweden)

    Elyashberg Mikhail

    2009-03-01

    Full Text Available Abstract Background This article coincides with the 40 year anniversary of the first published works devoted to the creation of algorithms for computer-aided structure elucidation (CASE. The general principles on which CASE methods are based will be reviewed and the present state of the art in this field will be described using, as an example, the expert system Structure Elucidator. Results The developers of CASE systems have been forced to overcome many obstacles hindering the development of a software application capable of drastically reducing the time and effort required to determine the structures of newly isolated organic compounds. Large complex molecules of up to 100 or more skeletal atoms with topological peculiarity can be quickly identified using the expert system Structure Elucidator based on spectral data. Logical analysis of 2D NMR data frequently allows for the detection of the presence of COSY and HMBC correlations of "nonstandard" length. Fuzzy structure generation provides a possibility to obtain the correct solution even in those cases when an unknown number of nonstandard correlations of unknown length are present in the spectra. The relative stereochemistry of big rigid molecules containing many stereocenters can be determined using the StrucEluc system and NOESY/ROESY 2D NMR data for this purpose. Conclusion The StrucEluc system continues to be developed in order to expand the general applicability, provide improved workflows, usability of the system and increased reliability of the results. It is expected that expert systems similar to that described in this paper will receive increasing acceptance in the next decade and will ultimately be integrated directly to analytical instruments for the purpose of organic analysis. Work in this direction is in progress. In spite of the fact that many difficulties have already been overcome to deliver on the spectroscopist's dream of "fully automated structure elucidation" there is

  14. Structure elucidation of a novel oligosaccharide (Medalose) from camel milk

    Science.gov (United States)

    Gangwar, Lata; Singh, Rinku; Deepak, Desh

    2018-02-01

    Free oligosaccharides are the third most abundant solid component in milk after lactose and lipids. The study of milk oligosaccharides indicate that nutrients are not only benefits the infant's gut but also perform a number of other functions which include stimulation of growth, receptor analogues to inhibit binding of pathogens and substances that promote postnatal brain development. Surveys reveal that camel milk oligosaccharides possess varied biological activities that help in the treatment of diabetes, asthma, anaemia, piles and also a food supplement to milking mothers. In this research, camel milk was selected for its oligosaccharide contents, which was then processed by Kobata and Ginsburg method followed by the HPLC and CC techniques. Structure elucidation of isolated compound was done by the chemical degradation, chemical transformation and comparison of chemical shift of NMR data of natural and acetylated oligosaccharide structure reporter group theory, the 1H, 13C NMR, 2D-NMR (COSY, TOCSY and HSQC) techniques, and mass spectrometry. The structure was elucidated as under: MEDALOSE

  15. Chemical biology on the genome.

    Science.gov (United States)

    Balasubramanian, Shankar

    2014-08-15

    In this article I discuss studies towards understanding the structure and function of DNA in the context of genomes from the perspective of a chemist. The first area I describe concerns the studies that led to the invention and subsequent development of a method for sequencing DNA on a genome scale at high speed and low cost, now known as Solexa/Illumina sequencing. The second theme will feature the four-stranded DNA structure known as a G-quadruplex with a focus on its fundamental properties, its presence in cellular genomic DNA and the prospects for targeting such a structure in cels with small molecules. The final topic for discussion is naturally occurring chemically modified DNA bases with an emphasis on chemistry for decoding (or sequencing) such modifications in genomic DNA. The genome is a fruitful topic to be further elucidated by the creation and application of chemical approaches. Copyright © 2014 Elsevier Ltd. All rights reserved.

  16. In silico functional elucidation of uncharacterized proteins of Chlamydia abortus strain LLG.

    Science.gov (United States)

    Singh, Gagandeep; Sharma, Dixit; Singh, Vikram; Rani, Jyoti; Marotta, Francessco; Kumar, Manoj; Mal, Gorakh; Singh, Birbal

    2017-03-01

    This study reports structural modeling, molecular dynamics profiling of hypothetical proteins in Chlamydia abortus genome database. The hypothetical protein sequences were extracted from C. abortus LLG Genome Database for functional elucidation using in silico methods. Fifty-one proteins with their roles in defense, binding and transporting other biomolecules were unraveled. Forty-five proteins were found to be nonhomologous to proteins present in hosts infected by C. abortus . Of these, 31 proteins were related to virulence. The structural modeling of two proteins, first, WP_006344020.1 (phosphorylase) and second, WP_006344325.1 (chlamydial protease/proteasome-like activity factor) were accomplished. The conserved active sites necessary for the catalytic function were analyzed. The finally concluded proteins are envisioned as possible targets for developing drugs to curtail chlamydial infections, however, and should be validated by molecular biological methods.

  17. Genomic mid-range inhomogeneity correlates with an abundance of RNA secondary structures

    Directory of Open Access Journals (Sweden)

    Song Jun

    2008-06-01

    Full Text Available Abstract Background Genomes possess different levels of non-randomness, in particular, an inhomogeneity in their nucleotide composition. Inhomogeneity is manifest from the short-range where neighboring nucleotides influence the choice of base at a site, to the long-range, commonly known as isochores, where a particular base composition can span millions of nucleotides. A separate genomic issue that has yet to be thoroughly elucidated is the role that RNA secondary structure (SS plays in gene expression. Results We present novel data and approaches that show that a mid-range inhomogeneity (~30 to 1000 nt not only exists in mammalian genomes but is also significantly associated with strong RNA SS. A whole-genome bioinformatics investigation of local SS in a set of 11,315 non-redundant human pre-mRNA sequences has been carried out. Four distinct components of these molecules (5'-UTRs, exons, introns and 3'-UTRs were considered separately, since they differ in overall nucleotide composition, sequence motifs and periodicities. For each pre-mRNA component, the abundance of strong local SS ( Conclusion We demonstrate that the excess of strong local SS in pre-mRNAs is linked to the little explored phenomenon of genomic mid-range inhomogeneity (MRI. MRI is an interdependence between nucleotide choice and base composition over a distance of 20–1000 nt. Additionally, we have created a public computational resource to support further study of genomic MRI.

  18. Informational laws of genome structures

    Science.gov (United States)

    Bonnici, Vincenzo; Manca, Vincenzo

    2016-06-01

    In recent years, the analysis of genomes by means of strings of length k occurring in the genomes, called k-mers, has provided important insights into the basic mechanisms and design principles of genome structures. In the present study, we focus on the proper choice of the value of k for applying information theoretic concepts that express intrinsic aspects of genomes. The value k = lg2(n), where n is the genome length, is determined to be the best choice in the definition of some genomic informational indexes that are studied and computed for seventy genomes. These indexes, which are based on information entropies and on suitable comparisons with random genomes, suggest five informational laws, to which all of the considered genomes obey. Moreover, an informational genome complexity measure is proposed, which is a generalized logistic map that balances entropic and anti-entropic components of genomes and is related to their evolutionary dynamics. Finally, applications to computational synthetic biology are briefly outlined.

  19. 2004 Structural, Function and Evolutionary Genomics

    Energy Technology Data Exchange (ETDEWEB)

    Douglas L. Brutlag Nancy Ryan Gray

    2005-03-23

    This Gordon conference will cover the areas of structural, functional and evolutionary genomics. It will take a systematic approach to genomics, examining the evolution of proteins, protein functional sites, protein-protein interactions, regulatory networks, and metabolic networks. Emphasis will be placed on what we can learn from comparative genomics and entire genomes and proteomes.

  20. Genomic hypomethylation in the human germline associates with selective structural mutability in the human genome.

    Directory of Open Access Journals (Sweden)

    Jian Li

    Full Text Available The hotspots of structural polymorphisms and structural mutability in the human genome remain to be explained mechanistically. We examine associations of structural mutability with germline DNA methylation and with non-allelic homologous recombination (NAHR mediated by low-copy repeats (LCRs. Combined evidence from four human sperm methylome maps, human genome evolution, structural polymorphisms in the human population, and previous genomic and disease studies consistently points to a strong association of germline hypomethylation and genomic instability. Specifically, methylation deserts, the ~1% fraction of the human genome with the lowest methylation in the germline, show a tenfold enrichment for structural rearrangements that occurred in the human genome since the branching of chimpanzee and are highly enriched for fast-evolving loci that regulate tissue-specific gene expression. Analysis of copy number variants (CNVs from 400 human samples identified using a custom-designed array comparative genomic hybridization (aCGH chip, combined with publicly available structural variation data, indicates that association of structural mutability with germline hypomethylation is comparable in magnitude to the association of structural mutability with LCR-mediated NAHR. Moreover, rare CNVs occurring in the genomes of individuals diagnosed with schizophrenia, bipolar disorder, and developmental delay and de novo CNVs occurring in those diagnosed with autism are significantly more concentrated within hypomethylated regions. These findings suggest a new connection between the epigenome, selective mutability, evolution, and human disease.

  1. Genomics technologies to study structural variations in the grapevine genome

    Directory of Open Access Journals (Sweden)

    Cardone Maria Francesca

    2016-01-01

    Full Text Available Grapevine is one of the most important crop plants in the world. Recently there was great expansion of genomics resources about grapevine genome, thus providing increasing efforts for molecular breeding. Current cultivars display a great level of inter-specific differentiation that needs to be investigated to reach a comprehensive understanding of the genetic basis of phenotypic differences, and to find responsible genes selected by cross breeding programs. While there have been significant advances in resolving the pattern and nature of single nucleotide polymorphisms (SNPs on plant genomes, few data are available on copy number variation (CNV. Furthermore association between structural variations and phenotypes has been described in only a few cases. We combined high throughput biotechnologies and bioinformatics tools, to reveal the first inter-varietal atlas of structural variation (SV for the grapevine genome. We sequenced and compared four table grape cultivars with the Pinot noir inbred line PN40024 genome as the reference. We detected roughly 8% of the grapevine genome affected by genomic variations. Taken into account phenotypic differences existing among the studied varieties we performed comparison of SVs among them and the reference and next we performed an in-depth analysis of gene content of polymorphic regions. This allowed us to identify genes showing differences in copy number as putative functional candidates for important traits in grapevine cultivation.

  2. The fundamentals behind solving for unknown molecular structures using computer-assisted structure elucidation: a free software package at the undergraduate and graduate levels.

    Science.gov (United States)

    Moser, Arvin; Pautler, Brent G

    2016-05-15

    The successful elucidation of an unknown compound's molecular structure often requires an analyst with profound knowledge and experience of advanced spectroscopic techniques, such as Nuclear Magnetic Resonance (NMR) spectroscopy and mass spectrometry. The implementation of Computer-Assisted Structure Elucidation (CASE) software in solving for unknown structures, such as isolated natural products and/or reaction impurities, can serve both as elucidation and teaching tools. As such, the introduction of CASE software with 112 exercises to train students in conjunction with the traditional pen and paper approach will strengthen their overall understanding of solving unknowns and explore of various structural end points to determine the validity of the results quickly. Copyright © 2016 John Wiley & Sons, Ltd. Copyright © 2016 John Wiley & Sons, Ltd.

  3. Structural elucidation of an antibiotic from the fungus Fusarium avenaceum Fries Sacc.; an amended structure for lateropyrone

    International Nuclear Information System (INIS)

    Gorst-Allman, C.P.; Van Rooyen, P.H.; Wnuk, S.; Golinski, P.; Chelkowski, J.

    1986-01-01

    The structural elucidation by X-ray crystallography of an antibiotic produced by Fusarium avenaceum Fries Sacc. is described. Some chemical reactions of the metabolite are reported, and the identity of the metabolite with lateropyrone is proposed. The structure reported for lateropyrone is amended. 1 H n.m.r. and 13 C n.m.r. are used in this study

  4. Functional Insights from Structural Genomics

    Energy Technology Data Exchange (ETDEWEB)

    Forouhar,F.; Kuzin, A.; Seetharaman, J.; Lee, I.; Zhou, W.; Abashidze, M.; Chen, Y.; Montelione, G.; Tong, L.; et al

    2007-01-01

    Structural genomics efforts have produced structural information, either directly or by modeling, for thousands of proteins over the past few years. While many of these proteins have known functions, a large percentage of them have not been characterized at the functional level. The structural information has provided valuable functional insights on some of these proteins, through careful structural analyses, serendipity, and structure-guided functional screening. Some of the success stories based on structures solved at the Northeast Structural Genomics Consortium (NESG) are reported here. These include a novel methyl salicylate esterase with important role in plant innate immunity, a novel RNA methyltransferase (H. influenzae yggJ (HI0303)), a novel spermidine/spermine N-acetyltransferase (B. subtilis PaiA), a novel methyltransferase or AdoMet binding protein (A. fulgidus AF{_}0241), an ATP:cob(I)alamin adenosyltransferase (B. subtilis YvqK), a novel carboxysome pore (E. coli EutN), a proline racemase homolog with a disrupted active site (B. melitensis BME11586), an FMN-dependent enzyme (S. pneumoniae SP{_}1951), and a 12-stranded {beta}-barrel with a novel fold (V. parahaemolyticus VPA1032).

  5. Nature's Migraine Treatment: Isolation and Structure Elucidation of Parthenolide from "Tanacetum parthenium"

    Science.gov (United States)

    Walsh, Emma L.; Ashe, Siobhan; Walsh, John J.

    2012-01-01

    The purpose of this experiment is to provide students with the essential skills and knowledge required to perform the extraction, isolation, and structural elucidation of parthenolide from "Tanacetum parthenium" or feverfew. Students are introduced to a background of the traditional medicinal uses of parthenolide, while more modern applications of…

  6. The Jujube Genome Provides Insights into Genome Evolution and the Domestication of Sweetness/Acidity Taste in Fruit Trees.

    Science.gov (United States)

    Huang, Jian; Zhang, Chunmei; Zhao, Xing; Fei, Zhangjun; Wan, KangKang; Zhang, Zhong; Pang, Xiaoming; Yin, Xiao; Bai, Yang; Sun, Xiaoqing; Gao, Lizhi; Li, Ruiqiang; Zhang, Jinbo; Li, Xingang

    2016-12-01

    Jujube (Ziziphus jujuba Mill.) belongs to the Rhamnaceae family and is a popular fruit tree species with immense economic and nutritional value. Here, we report a draft genome of the dry jujube cultivar 'Junzao' and the genome resequencing of 31 geographically diverse accessions of cultivated and wild jujubes (Ziziphus jujuba var. spinosa). Comparative analysis revealed that the genome of 'Dongzao', a fresh jujube, was ~86.5 Mb larger than that of the 'Junzao', partially due to the recent insertions of transposable elements in the 'Dongzao' genome. We constructed eight proto-chromosomes of the common ancestor of Rhamnaceae and Rosaceae, two sister families in the order Rosales, and elucidated the evolutionary processes that have shaped the genome structures of modern jujubes. Population structure analysis revealed the complex genetic background of jujubes resulting from extensive hybridizations between jujube and its wild relatives. Notably, several key genes that control fruit organic acid metabolism and sugar content were identified in the selective sweep regions. We also identified S-locus genes controlling gametophytic self-incompatibility and investigated haplotype patterns of the S locus in the jujube genomes, which would provide a guideline for parent selection for jujube crossbreeding. This study provides valuable genomic resources for jujube improvement, and offers insights into jujube genome evolution and its population structure and domestication.

  7. Insights into structural variations and genome rearrangements in prokaryotic genomes.

    Science.gov (United States)

    Periwal, Vinita; Scaria, Vinod

    2015-01-01

    Structural variations (SVs) are genomic rearrangements that affect fairly large fragments of DNA. Most of the SVs such as inversions, deletions and translocations have been largely studied in context of genetic diseases in eukaryotes. However, recent studies demonstrate that genome rearrangements can also have profound impact on prokaryotic genomes, leading to altered cell phenotype. In contrast to single-nucleotide variations, SVs provide a much deeper insight into organization of bacterial genomes at a much better resolution. SVs can confer change in gene copy number, creation of new genes, altered gene expression and many other functional consequences. High-throughput technologies have now made it possible to explore SVs at a much refined resolution in bacterial genomes. Through this review, we aim to highlight the importance of the less explored field of SVs in prokaryotic genomes and their impact. We also discuss its potential applicability in the emerging fields of synthetic biology and genome engineering where targeted SVs could serve to create sophisticated and accurate genome editing. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  8. Structural genomic variation in ischemic stroke

    Science.gov (United States)

    Matarin, Mar; Simon-Sanchez, Javier; Fung, Hon-Chung; Scholz, Sonja; Gibbs, J. Raphael; Hernandez, Dena G.; Crews, Cynthia; Britton, Angela; Wavrant De Vrieze, Fabienne; Brott, Thomas G.; Brown, Robert D.; Worrall, Bradford B.; Silliman, Scott; Case, L. Douglas; Hardy, John A.; Rich, Stephen S.; Meschia, James F.; Singleton, Andrew B.

    2008-01-01

    Technological advances in molecular genetics allow rapid and sensitive identification of genomic copy number variants (CNVs). This, in turn, has sparked interest in the function such variation may play in disease. While a role for copy number mutations as a cause of Mendelian disorders is well established, it is unclear whether CNVs may affect risk for common complex disorders. We sought to investigate whether CNVs may modulate risk for ischemic stroke (IS) and to provide a catalog of CNVs in patients with this disorder by analyzing copy number metrics produced as a part of our previous genome-wide single-nucleotide polymorphism (SNP)-based association study of ischemic stroke in a North American white population. We examined CNVs in 263 patients with ischemic stroke (IS). Each identified CNV was compared with changes identified in 275 neurologically normal controls. Our analysis identified 247 CNVs, corresponding to 187 insertions (76%; 135 heterozygous; 25 homozygous duplications or triplications; 2 heterosomic) and 60 deletions (24%; 40 heterozygous deletions;3 homozygous deletions; 14 heterosomic deletions). Most alterations (81%) were the same as, or overlapped with, previously reported CNVs. We report here the first genome-wide analysis of CNVs in IS patients. In summary, our study did not detect any common genomic structural variation unequivocally linked to IS, although we cannot exclude that smaller CNVs or CNVs in genomic regions poorly covered by this methodology may confer risk for IS. The application of genome-wide SNP arrays now facilitates the evaluation of structural changes through the entire genome as part of a genome-wide genetic association study. PMID:18288507

  9. Isolation and structural elucidation of secondary metabolites from plants of the Rutaceae family, Rubiaceae, Euphorbiaceae and Salicaceae

    International Nuclear Information System (INIS)

    Calderon Castro, Carlos

    2014-01-01

    A phytochemical study was conducted of the Zuelania guidonia plants (Salicaceae), croton ovalifolius (Euphorbiaceae) erythrochiton gymnanthus (Rutaceae) and Faramea occidentalis (Rubiaceae). Purification of the compounds was carried out using chromatographic techniques while structural elucidation was performed by experiments using nuclear magnetic resonance (NMR) and mass spectrometry (MS). Of Z. guidonia has been possible the purification and structural elucidation of 22 compounds (Z1-Z22), two labdane type diterpenes and 20 clerodane-type diterpenes. The clerodanes have presented 16 innovative structure, highlighting the presence of a group of 3,6-dihydro -1.2-dioxin and xylose group in some of them. In addition, 11 of the clerodanes were evaluated with cytotoxicity assays in three cancer cell lines CCRF-CEM (acute lymphoblastic leukemia), CEM-ADR5000 (acute lymphoblastic leukemia resistant to doxorubicin) MIA-Paca-2 (metastatic pancreas) and a line of healthy cells PBMC (peripheral blood mononuclear cells). The Z4, Z6 and Z15 compounds stood out as the most cytotoxic, particularly against CCRF-CEM cells with IC 50 values between 1.6 and 2.5 μM. Seven compounds identified as glutarimide alkaloids (C1-C7) were isolated and elucidated, five of which have presented a novel structure from C. ovalifolius. Three compounds (E1-E3) that are triterpenes derivatives of known structure sitosterol, were isolated and elucidated from E. gymnanthus plant. From F. occidentalis was obtained the structure of a pure compound (F1], which is a flavonoid of known structure. (author) [es

  10. Simultaneous Structural Variation Discovery in Multiple Paired-End Sequenced Genomes

    Science.gov (United States)

    Hormozdiari, Fereydoun; Hajirasouliha, Iman; McPherson, Andrew; Eichler, Evan E.; Sahinalp, S. Cenk

    Next generation sequencing technologies have been decreasing the costs and increasing the world-wide capacity for sequence production at an unprecedented rate, making the initiation of large scale projects aiming to sequence almost 2000 genomes [1]. Structural variation detection promises to be one of the key diagnostic tools for cancer and other diseases with genomic origin. In this paper, we study the problem of detecting structural variation events in two or more sequenced genomes through high throughput sequencing . We propose to move from the current model of (1) detecting genomic variations in single next generation sequenced (NGS) donor genomes independently, and (2) checking whether two or more donor genomes indeed agree or disagree on the variations (in this paper we name this framework Independent Structural Variation Discovery and Merging - ISV&M), to a new model in which we detect structural variation events among multiple genomes simultaneously.

  11. Wild emmer genome architecture and diversity elucidate wheat evolution and domestication.

    Science.gov (United States)

    Avni, Raz; Nave, Moran; Barad, Omer; Baruch, Kobi; Twardziok, Sven O; Gundlach, Heidrun; Hale, Iago; Mascher, Martin; Spannagl, Manuel; Wiebe, Krystalee; Jordan, Katherine W; Golan, Guy; Deek, Jasline; Ben-Zvi, Batsheva; Ben-Zvi, Gil; Himmelbach, Axel; MacLachlan, Ron P; Sharpe, Andrew G; Fritz, Allan; Ben-David, Roi; Budak, Hikmet; Fahima, Tzion; Korol, Abraham; Faris, Justin D; Hernandez, Alvaro; Mikel, Mark A; Levy, Avraham A; Steffenson, Brian; Maccaferri, Marco; Tuberosa, Roberto; Cattivelli, Luigi; Faccioli, Primetta; Ceriotti, Aldo; Kashkush, Khalil; Pourkheirandish, Mohammad; Komatsuda, Takao; Eilam, Tamar; Sela, Hanan; Sharon, Amir; Ohad, Nir; Chamovitz, Daniel A; Mayer, Klaus F X; Stein, Nils; Ronen, Gil; Peleg, Zvi; Pozniak, Curtis J; Akhunov, Eduard D; Distelfeld, Assaf

    2017-07-07

    Wheat ( Triticum spp.) is one of the founder crops that likely drove the Neolithic transition to sedentary agrarian societies in the Fertile Crescent more than 10,000 years ago. Identifying genetic modifications underlying wheat's domestication requires knowledge about the genome of its allo-tetraploid progenitor, wild emmer ( T. turgidum ssp. dicoccoides ). We report a 10.1-gigabase assembly of the 14 chromosomes of wild tetraploid wheat, as well as analyses of gene content, genome architecture, and genetic diversity. With this fully assembled polyploid wheat genome, we identified the causal mutations in Brittle Rachis 1 ( TtBtr1 ) genes controlling shattering, a key domestication trait. A study of genomic diversity among wild and domesticated accessions revealed genomic regions bearing the signature of selection under domestication. This reference assembly will serve as a resource for accelerating the genome-assisted improvement of modern wheat varieties. Copyright © 2017, American Association for the Advancement of Science.

  12. The Jujube Genome Provides Insights into Genome Evolution and the Domestication of Sweetness/Acidity Taste in Fruit Trees.

    Directory of Open Access Journals (Sweden)

    Jian Huang

    2016-12-01

    Full Text Available Jujube (Ziziphus jujuba Mill. belongs to the Rhamnaceae family and is a popular fruit tree species with immense economic and nutritional value. Here, we report a draft genome of the dry jujube cultivar 'Junzao' and the genome resequencing of 31 geographically diverse accessions of cultivated and wild jujubes (Ziziphus jujuba var. spinosa. Comparative analysis revealed that the genome of 'Dongzao', a fresh jujube, was ~86.5 Mb larger than that of the 'Junzao', partially due to the recent insertions of transposable elements in the 'Dongzao' genome. We constructed eight proto-chromosomes of the common ancestor of Rhamnaceae and Rosaceae, two sister families in the order Rosales, and elucidated the evolutionary processes that have shaped the genome structures of modern jujubes. Population structure analysis revealed the complex genetic background of jujubes resulting from extensive hybridizations between jujube and its wild relatives. Notably, several key genes that control fruit organic acid metabolism and sugar content were identified in the selective sweep regions. We also identified S-locus genes controlling gametophytic self-incompatibility and investigated haplotype patterns of the S locus in the jujube genomes, which would provide a guideline for parent selection for jujube crossbreeding. This study provides valuable genomic resources for jujube improvement, and offers insights into jujube genome evolution and its population structure and domestication.

  13. Structural genomic variations and Parkinson's disease.

    Science.gov (United States)

    Bandrés-Ciga, Sara; Ruz, Clara; Barrero, Francisco J; Escamilla-Sevilla, Francisco; Pelegrina, Javier; Vives, Francisco; Duran, Raquel

    2017-10-01

    Parkinson's disease (PD) is the second most common neurodegenerative disease, whose prevalence is projected to be between 8.7 and 9.3 million by 2030. Until about 20 years ago, PD was considered to be the textbook example of a "non-genetic" disorder. Nowadays, PD is generally considered a multifactorial disorder that arises from the combination and complex interaction of genes and environmental factors. To date, a total of 7 genes including SNCA, LRRK2, PARK2, DJ-1, PINK 1, VPS35 and ATP13A2 have been seen to cause unequivocally Mendelian PD. Also, variants with incomplete penetrance in the genes LRRK2 and GBA are considered to be strong risk factors for PD worldwide. Although genetic studies have provided valuable insights into the pathogenic mechanisms underlying PD, the role of structural variation in PD has been understudied in comparison with other genomic variations. Structural genomic variations might substantially account for such genetic substrates yet to be discovered. The present review aims to provide an overview of the structural genomic variants implicated in the pathogenesis of PD.

  14. Structural dynamics of retroviral genome and the packaging.

    Science.gov (United States)

    Miyazaki, Yasuyuki; Miyake, Ariko; Nomaguchi, Masako; Adachi, Akio

    2011-01-01

    Retroviruses can cause diseases such as AIDS, leukemia, and tumors, but are also used as vectors for human gene therapy. All retroviruses, except foamy viruses, package two copies of unspliced genomic RNA into their progeny viruses. Understanding the molecular mechanisms of retroviral genome packaging will aid the design of new anti-retroviral drugs targeting the packaging process and improve the efficacy of retroviral vectors. Retroviral genomes have to be specifically recognized by the cognate nucleocapsid domain of the Gag polyprotein from among an excess of cellular and spliced viral mRNA. Extensive virological and structural studies have revealed how retroviral genomic RNA is selectively packaged into the viral particles. The genomic area responsible for the packaging is generally located in the 5' untranslated region (5' UTR), and contains dimerization site(s). Recent studies have shown that retroviral genome packaging is modulated by structural changes of RNA at the 5' UTR accompanied by the dimerization. In this review, we focus on three representative retroviruses, Moloney murine leukemia virus, human immunodeficiency virus type 1 and 2, and describe the molecular mechanism of retroviral genome packaging.

  15. Nature's Anti-Alzheimer's Drug: Isolation and Structure Elucidation of Galantamine from "Leucojum Aestivum"

    Science.gov (United States)

    Halpin, Catherine M.; Reilly, Ciara; Walsh, John J.

    2010-01-01

    The discovery that galantamine penetrates the blood-brain barrier has led to its clinical use in the treatment of choline-deficiency conditions in the brain, such as Alzheimer's disease. This experiment involves the isolation and structure elucidation of galantamine from "Leucojum aestivum". Isolation of the alkaloid constituents in "L. aestivum"…

  16. Mass Spectra-Based Framework for Automated Structural Elucidation of Metabolome Data to Explore Phytochemical Diversity

    Science.gov (United States)

    Matsuda, Fumio; Nakabayashi, Ryo; Sawada, Yuji; Suzuki, Makoto; Hirai, Masami Y.; Kanaya, Shigehiko; Saito, Kazuki

    2011-01-01

    A novel framework for automated elucidation of metabolite structures in liquid chromatography–mass spectrometer metabolome data was constructed by integrating databases. High-resolution tandem mass spectra data automatically acquired from each metabolite signal were used for database searches. Three distinct databases, KNApSAcK, ReSpect, and the PRIMe standard compound database, were employed for the structural elucidation. The outputs were retrieved using the CAS metabolite identifier for identification and putative annotation. A simple metabolite ontology system was also introduced to attain putative characterization of the metabolite signals. The automated method was applied for the metabolome data sets obtained from the rosette leaves of 20 Arabidopsis accessions. Phenotypic variations in novel Arabidopsis metabolites among these accessions could be investigated using this method. PMID:22645535

  17. Mass spectra-based framework for automated structural elucidation of metabolome data to explore phytochemical diversity

    Directory of Open Access Journals (Sweden)

    Fumio eMatsuda

    2011-08-01

    Full Text Available A novel framework for automated elucidation of metabolite structures in liquid chromatography-mass spectrometer (LC-MS metabolome data was constructed by integrating databases. High-resolution tandem mass spectra data automatically acquired from each metabolite signal were used for database searches. Three distinct databases, KNApSAcK, ReSpect, and the PRIMe standard compound database, were employed for the structural elucidation. The outputs were retrieved using the CAS metabolite identifier for identification and putative annotation. A simple metabolite ontology system was also introduced to attain putative characterization of the metabolite signals. The automated method was applied for the metabolome data sets obtained from the rosette leaves of 20 Arabidopsis accessions. Phenotypic variations in novel Arabidopsis metabolites among these accessions could be investigated using this method.

  18. Structural determinants and mechanism of HIV-1 genome packaging.

    Science.gov (United States)

    Lu, Kun; Heng, Xiao; Summers, Michael F

    2011-07-22

    Like all retroviruses, the human immunodeficiency virus selectively packages two copies of its unspliced RNA genome, both of which are utilized for strand-transfer-mediated recombination during reverse transcription-a process that enables rapid evolution under environmental and chemotherapeutic pressures. The viral RNA appears to be selected for packaging as a dimer, and there is evidence that dimerization and packaging are mechanistically coupled. Both processes are mediated by interactions between the nucleocapsid domains of a small number of assembling viral Gag polyproteins and RNA elements within the 5'-untranslated region of the genome. A number of secondary structures have been predicted for regions of the genome that are responsible for packaging, and high-resolution structures have been determined for a few small RNA fragments and protein-RNA complexes. However, major questions regarding the RNA structures (and potentially the structural changes) that are responsible for dimeric genome selection remain unanswered. Here, we review efforts that have been made to identify the molecular determinants and mechanism of human immunodeficiency virus type 1 genome packaging. Copyright © 2011 Elsevier Ltd. All rights reserved.

  19. Multi-scale structural community organisation of the human genome.

    Science.gov (United States)

    Boulos, Rasha E; Tremblay, Nicolas; Arneodo, Alain; Borgnat, Pierre; Audit, Benjamin

    2017-04-11

    Structural interaction frequency matrices between all genome loci are now experimentally achievable thanks to high-throughput chromosome conformation capture technologies. This ensues a new methodological challenge for computational biology which consists in objectively extracting from these data the structural motifs characteristic of genome organisation. We deployed the fast multi-scale community mining algorithm based on spectral graph wavelets to characterise the networks of intra-chromosomal interactions in human cell lines. We observed that there exist structural domains of all sizes up to chromosome length and demonstrated that the set of structural communities forms a hierarchy of chromosome segments. Hence, at all scales, chromosome folding predominantly involves interactions between neighbouring sites rather than the formation of links between distant loci. Multi-scale structural decomposition of human chromosomes provides an original framework to question structural organisation and its relationship to functional regulation across the scales. By construction the proposed methodology is independent of the precise assembly of the reference genome and is thus directly applicable to genomes whose assembly is not fully determined.

  20. Structural Genomics of Minimal Organisms: Pipeline and Results

    Energy Technology Data Exchange (ETDEWEB)

    Kim, Sung-Hou; Shin, Dong-Hae; Kim, Rosalind; Adams, Paul; Chandonia, John-Marc

    2007-09-14

    The initial objective of the Berkeley Structural Genomics Center was to obtain a near complete three-dimensional (3D) structural information of all soluble proteins of two minimal organisms, closely related pathogens Mycoplasma genitalium and M. pneumoniae. The former has fewer than 500 genes and the latter has fewer than 700 genes. A semiautomated structural genomics pipeline was set up from target selection, cloning, expression, purification, and ultimately structural determination. At the time of this writing, structural information of more than 93percent of all soluble proteins of M. genitalium is avail able. This chapter summarizes the approaches taken by the authors' center.

  1. Structural dynamics of retroviral genome and the packaging

    Directory of Open Access Journals (Sweden)

    Yasuyuki eMiyazaki

    2011-12-01

    Full Text Available Retroviruses can cause diseases such as AIDS, leukemia and tumors, but are also used as vectors for human gene therapy. All retroviruses, except foamy viruses, package two copies of unspliced genomic RNA into their progeny viruses. Understanding the molecular mechanisms of retroviral genome packaging will aid the design of new anti-retroviral drugs targeting the packaging process and improve the efficacy of retroviral vectors. Retroviral genomes have to be specifically recognized by the cognate nucleocapsid (NC domain of the Gag polyprotein from among an excess of cellular and spliced viral mRNA. Extensive virological and structural studies have revealed how retroviral genomic RNA is selectively packaged into the viral particles. The genomic area responsible for the packaging is generally located in the 5’ untranslated region (5’ UTR, and contains dimerization site(s. Recent studies have shown that retroviral genome packaging is modulated by structural changes of RNA at the 5’ UTR accompanied by the dimerization. In this review, we focus on three representative retroviruses, Moloney murine leukemia virus (MoMLV, human immunodeficiency virus type 1 (HIV-1 and 2 (HIV-2, and describe the molecular mechanism of retroviral genome packaging.

  2. Comprehensive secondary structure elucidation of four genera of the family Pospiviroidae.

    Directory of Open Access Journals (Sweden)

    Tamara Giguère

    Full Text Available Viroids are small, circular, single stranded RNA molecules that infect plants. Since they are non-coding, their structures play a critical role in their life cycles. To date, little effort has been spend on elucidating viroid structures in solution due to both the experimental difficulties and the time-consuming nature of the methodologies implicated. Recently, the technique of high-throughput selective 2'-hydroxyl acylation analyzed by primer extension (SHAPE was adapted for the probing of the members of family Avsunviroidae, all of whom replicate in the chloroplast and demonstrate ribozyme activity. In the present work, twelve viroid species belonging to four different genera of the family Pospiviroidae, whose members are characterized by the presence of a central conserved region (CCR and who replicate in nucleus of the host, were probed. Given that the structures of five distinct viroid species from the family Pospiviroidae have been previously reported, an overview of the different structural characteristics for all genera and the beginning of a manual classification of the different viroids based on their structural features are presented here.

  3. General scheme for elucidating the structure of organic compounds using spectroscopic and spectrometric methods

    International Nuclear Information System (INIS)

    Ribeiro, Carlos Magno R.; Souza, Nelson Angelo de

    2007-01-01

    This work describes a systematic method to be applied in undergraduate courses of organic chemistry, correlating infrared spectra, hydrogen and carbon-13 nuclear magnetic resonance, and mass spectra. To this end, a scheme and a table were developed to conduct the elucidation of the structure of organic compounds initially using infrared spectra. Interpretation of hydrogen and carbon-13 nuclear magnetic resonance spectra and of mass spectra is used to confirm the proposed structure. (author)

  4. Interrogating the druggable genome with structural informatics.

    Science.gov (United States)

    Hambly, Kevin; Danzer, Joseph; Muskal, Steven; Debe, Derek A

    2006-08-01

    Structural genomics projects are producing protein structure data at an unprecedented rate. In this paper, we present the Target Informatics Platform (TIP), a novel structural informatics approach for amplifying the rapidly expanding body of experimental protein structure information to enhance the discovery and optimization of small molecule protein modulators on a genomic scale. In TIP, existing experimental structure information is augmented using a homology modeling approach, and binding sites across multiple target families are compared using a clique detection algorithm. We report here a detailed analysis of the structural coverage for the set of druggable human targets, highlighting drug target families where the level of structural knowledge is currently quite high, as well as those areas where structural knowledge is sparse. Furthermore, we demonstrate the utility of TIP's intra- and inter-family binding site similarity analysis using a series of retrospective case studies. Our analysis underscores the utility of a structural informatics infrastructure for extracting drug discovery-relevant information from structural data, aiding researchers in the identification of lead discovery and optimization opportunities as well as potential "off-target" liabilities.

  5. Visualization of RNA structure models within the Integrative Genomics Viewer.

    Science.gov (United States)

    Busan, Steven; Weeks, Kevin M

    2017-07-01

    Analyses of the interrelationships between RNA structure and function are increasingly important components of genomic studies. The SHAPE-MaP strategy enables accurate RNA structure probing and realistic structure modeling of kilobase-length noncoding RNAs and mRNAs. Existing tools for visualizing RNA structure models are not suitable for efficient analysis of long, structurally heterogeneous RNAs. In addition, structure models are often advantageously interpreted in the context of other experimental data and gene annotation information, for which few tools currently exist. We have developed a module within the widely used and well supported open-source Integrative Genomics Viewer (IGV) that allows visualization of SHAPE and other chemical probing data, including raw reactivities, data-driven structural entropies, and data-constrained base-pair secondary structure models, in context with linear genomic data tracks. We illustrate the usefulness of visualizing RNA structure in the IGV by exploring structure models for a large viral RNA genome, comparing bacterial mRNA structure in cells with its structure under cell- and protein-free conditions, and comparing a noncoding RNA structure modeled using SHAPE data with a base-pairing model inferred through sequence covariation analysis. © 2017 Busan and Weeks; Published by Cold Spring Harbor Laboratory Press for the RNA Society.

  6. Producing genome structure populations with the dynamic and automated PGS software.

    Science.gov (United States)

    Hua, Nan; Tjong, Harianto; Shin, Hanjun; Gong, Ke; Zhou, Xianghong Jasmine; Alber, Frank

    2018-05-01

    Chromosome conformation capture technologies such as Hi-C are widely used to investigate the spatial organization of genomes. Because genome structures can vary considerably between individual cells of a population, interpreting ensemble-averaged Hi-C data can be challenging, in particular for long-range and interchromosomal interactions. We pioneered a probabilistic approach for the generation of a population of distinct diploid 3D genome structures consistent with all the chromatin-chromatin interaction probabilities from Hi-C experiments. Each structure in the population is a physical model of the genome in 3D. Analysis of these models yields new insights into the causes and the functional properties of the genome's organization in space and time. We provide a user-friendly software package, called PGS, which runs on local machines (for practice runs) and high-performance computing platforms. PGS takes a genome-wide Hi-C contact frequency matrix, along with information about genome segmentation, and produces an ensemble of 3D genome structures entirely consistent with the input. The software automatically generates an analysis report, and provides tools to extract and analyze the 3D coordinates of specific domains. Basic Linux command-line knowledge is sufficient for using this software. A typical running time of the pipeline is ∼3 d with 300 cores on a computer cluster to generate a population of 1,000 diploid genome structures at topological-associated domain (TAD)-level resolution.

  7. Alignment-free comparative genomic screen for structured RNAs using coarse-grained secondary structure dot plots

    DEFF Research Database (Denmark)

    Kato, Yuki; Gorodkin, Jan; Havgaard, Jakob Hull

    2017-01-01

    . Methods: Here we present a fast and efficient method, DotcodeR, for detecting structurally similar RNAs in genomic sequences by comparing their corresponding coarse-grained secondary structure dot plots at string level. This allows us to perform an all-against-all scan of all window pairs from two genomes...... without alignment. Results: Our computational experiments with simulated data and real chromosomes demonstrate that the presented method has good sensitivity. Conclusions: DotcodeR can be useful as a pre-filter in a genomic comparative scan for structured RNAs....

  8. Pathgroups, a dynamic data structure for genome reconstruction problems.

    Science.gov (United States)

    Zheng, Chunfang

    2010-07-01

    Ancestral gene order reconstruction problems, including the median problem, quartet construction, small phylogeny, guided genome halving and genome aliquoting, are NP hard. Available heuristics dedicated to each of these problems are computationally costly for even small instances. We present a data structure enabling rapid heuristic solution to all these ancestral genome reconstruction problems. A generic greedy algorithm with look-ahead based on an automatically generated priority system suffices for all the problems using this data structure. The efficiency of the algorithm is due to fast updating of the structure during run time and to the simplicity of the priority scheme. We illustrate with the first rapid algorithm for quartet construction and apply this to a set of yeast genomes to corroborate a recent gene sequence-based phylogeny. http://albuquerque.bioinformatics.uottawa.ca/pathgroup/Quartet.html chunfang313@gmail.com Supplementary data are available at Bioinformatics online.

  9. The complete chloroplast genome sequence of Podocarpus lambertii: genome structure, evolutionary aspects, gene content and SSR detection.

    Directory of Open Access Journals (Sweden)

    Leila do Nascimento Vieira

    Full Text Available BACKGROUND: Podocarpus lambertii (Podocarpaceae is a native conifer from the Brazilian Atlantic Forest Biome, which is considered one of the 25 biodiversity hotspots in the world. The advancement of next-generation sequencing technologies has enabled the rapid acquisition of whole chloroplast (cp genome sequences at low cost. Several studies have proven the potential of cp genomes as tools to understand enigmatic and basal phylogenetic relationships at different taxonomic levels, as well as further probe the structural and functional evolution of plants. In this work, we present the complete cp genome sequence of P. lambertii. METHODOLOGY/PRINCIPAL FINDINGS: The P. lambertii cp genome is 133,734 bp in length, and similar to other sequenced cupressophytes, it lacks one of the large inverted repeat regions (IR. It contains 118 unique genes and one duplicated tRNA (trnN-GUU, which occurs as an inverted repeat sequence. The rps16 gene was not found, which was previously reported for the plastid genome of another Podocarpaceae (Nageia nagi and Araucariaceae (Agathis dammara. Structurally, P. lambertii shows 4 inversions of a large DNA fragment ∼20,000 bp compared to the Podocarpus totara cp genome. These unexpected characteristics may be attributed to geographical distance and different adaptive needs. The P. lambertii cp genome presents a total of 28 tandem repeats and 156 SSRs, with homo- and dipolymers being the most common and tri-, tetra-, penta-, and hexapolymers occurring with less frequency. CONCLUSION: The complete cp genome sequence of P. lambertii revealed significant structural changes, even in species from the same genus. These results reinforce the apparently loss of rps16 gene in Podocarpaceae cp genome. In addition, several SSRs in the P. lambertii cp genome are likely intraspecific polymorphism sites, which may allow highly sensitive phylogeographic and population structure studies, as well as phylogenetic studies of species of

  10. Streamlined structure elucidation of an unknown compound in a pigment formulation.

    Science.gov (United States)

    Yüce, Imanuel; Morlock, Gertrud E

    2016-10-21

    A fast and reliable quality control is important for ink manufacturers to ensure a constant production grade of mixtures and chemical formulations, and unknown components attract their attention. Structure elucidating techniques seem time-consuming in combination with column-based methods, but especially the low solubility of pigment formulations is challenging the analysis. In contrast, layer chromatography is more tolerant with regard to pigment particles. One PLC plate for NMR and FTIR analyses and one HPTLC plate for recording of high resolution mass spectra, MS/MS spectra and for gathering information on polarity and spectral properties were needed to characterize a structure, exemplarily shown for an unknown component in pigment Red 57:1 to be 3-hydroxy-2-naphtoic acid. A preparative layer chromatography (PLC) workflow was developed that used an Automated Multiple Development 2 (AMD 2) system. The 0.5-mm PLC plate could still be operated in the AMD 2 system and allowed a smooth switch from the analytical to the preparative gradient separation. Through automated gradient development and the resulting focusing of bands, the sharpness of the PLC bands was improved. For NMR, the necessary high load of the target compound on the PLC plate was achieved via a selective solvent extraction that discriminated the polar sample matrix and thus increased the application volume of the extract that could maximally be applied without overloading. By doing so, the yield for NMR analysis was improved by a factor of 9. The effectivity gain through a simple, but thoroughly chosen extraction solvent is often overlooked, and for educational purpose, it was clearly illustrated and demonstrated by an extended solvent screening. Thus, PLC using an automated gradient development after a selective extraction was proven to be a new powerful combination for structural elucidation by NMR. Copyright © 2016 Elsevier B.V. All rights reserved.

  11. Chromatin structure and evolution in the human genome

    Directory of Open Access Journals (Sweden)

    Dunlop Malcolm G

    2007-05-01

    Full Text Available Abstract Background Evolutionary rates are not constant across the human genome but genes in close proximity have been shown to experience similar levels of divergence and selection. The higher-order organisation of chromosomes has often been invoked to explain such phenomena but previously there has been insufficient data on chromosome structure to investigate this rigorously. Using the results of a recent genome-wide analysis of open and closed human chromatin structures we have investigated the global association between divergence, selection and chromatin structure for the first time. Results In this study we have shown that, paradoxically, synonymous site divergence (dS at non-CpG sites is highest in regions of open chromatin, primarily as a result of an increased number of transitions, while the rates of other traditional measures of mutation (intergenic, intronic and ancient repeat divergence as well as SNP density are highest in closed regions of the genome. Analysis of human-chimpanzee divergence across intron-exon boundaries indicates that although genes in relatively open chromatin generally display little selection at their synonymous sites, those in closed regions show markedly lower divergence at their fourfold degenerate sites than in neighbouring introns and intergenic regions. Exclusion of known Exonic Splice Enhancer hexamers has little affect on the divergence observed at fourfold degenerate sites across chromatin categories; however, we show that closed chromatin is enriched with certain classes of ncRNA genes whose RNA secondary structure may be particularly important. Conclusion We conclude that, overall, non-CpG mutation rates are lowest in open regions of the genome and that regions of the genome with a closed chromatin structure have the highest background mutation rate. This might reflect lower rates of DNA damage or enhanced DNA repair processes in regions of open chromatin. Our results also indicate that dS is a poor

  12. The Past, Present, and Future of Human Centromere Genomics

    Directory of Open Access Journals (Sweden)

    Megan E. Aldrup-MacDonald

    2014-01-01

    Full Text Available The centromere is the chromosomal locus essential for chromosome inheritance and genome stability. Human centromeres are located at repetitive alpha satellite DNA arrays that compose approximately 5% of the genome. Contiguous alpha satellite DNA sequence is absent from the assembled reference genome, limiting current understanding of centromere organization and function. Here, we review the progress in centromere genomics spanning the discovery of the sequence to its molecular characterization and the work done during the Human Genome Project era to elucidate alpha satellite structure and sequence variation. We discuss exciting recent advances in alpha satellite sequence assembly that have provided important insight into the abundance and complex organization of this sequence on human chromosomes. In light of these new findings, we offer perspectives for future studies of human centromere assembly and function.

  13. Evolutionary genomics and population structure of Entamoeba histolytica

    Directory of Open Access Journals (Sweden)

    Koushik Das

    2014-11-01

    Full Text Available Amoebiasis caused by the gastrointestinal parasite Entamoeba histolytica has diverse disease outcomes. Study of genome and evolution of this fascinating parasite will help us to understand the basis of its virulence and explain why, when and how it causes diseases. In this review, we have summarized current knowledge regarding evolutionary genomics of E. histolytica and discussed their association with parasite phenotypes and its differential pathogenic behavior. How genetic diversity reveals parasite population structure has also been discussed. Queries concerning their evolution and population structure which were required to be addressed have also been highlighted. This significantly large amount of genomic data will improve our knowledge about this pathogenic species of Entamoeba.

  14. Child Development and Structural Variation in the Human Genome

    Science.gov (United States)

    Zhang, Ying; Haraksingh, Rajini; Grubert, Fabian; Abyzov, Alexej; Gerstein, Mark; Weissman, Sherman; Urban, Alexander E.

    2013-01-01

    Structural variation of the human genome sequence is the insertion, deletion, or rearrangement of stretches of DNA sequence sized from around 1,000 to millions of base pairs. Over the past few years, structural variation has been shown to be far more common in human genomes than previously thought. Very little is currently known about the effects…

  15. The SGC beyond structural genomics: redefining the role of 3D structures by coupling genomic stratification with fragment-based discovery.

    Science.gov (United States)

    Bradley, Anthony R; Echalier, Aude; Fairhead, Michael; Strain-Damerell, Claire; Brennan, Paul; Bullock, Alex N; Burgess-Brown, Nicola A; Carpenter, Elisabeth P; Gileadi, Opher; Marsden, Brian D; Lee, Wen Hwa; Yue, Wyatt; Bountra, Chas; von Delft, Frank

    2017-11-08

    The ongoing explosion in genomics data has long since outpaced the capacity of conventional biochemical methodology to verify the large number of hypotheses that emerge from the analysis of such data. In contrast, it is still a gold-standard for early phenotypic validation towards small-molecule drug discovery to use probe molecules (or tool compounds), notwithstanding the difficulty and cost of generating them. Rational structure-based approaches to ligand discovery have long promised the efficiencies needed to close this divergence; in practice, however, this promise remains largely unfulfilled, for a host of well-rehearsed reasons and despite the huge technical advances spearheaded by the structural genomics initiatives of the noughties. Therefore the current, fourth funding phase of the Structural Genomics Consortium (SGC), building on its extensive experience in structural biology of novel targets and design of protein inhibitors, seeks to redefine what it means to do structural biology for drug discovery. We developed the concept of a Target Enabling Package (TEP) that provides, through reagents, assays and data, the missing link between genetic disease linkage and the development of usefully potent compounds. There are multiple prongs to the ambition: rigorously assessing targets' genetic disease linkages through crowdsourcing to a network of collaborating experts; establishing a systematic approach to generate the protocols and data that comprise each target's TEP; developing new, X-ray-based fragment technologies for generating high quality chemical matter quickly and cheaply; and exploiting a stringently open access model to build multidisciplinary partnerships throughout academia and industry. By learning how to scale these approaches, the SGC aims to make structures finally serve genomics, as originally intended, and demonstrate how 3D structures systematically allow new modes of druggability to be discovered for whole classes of targets. © 2017 The

  16. Catharanthus alkaloids XXXII: isolation of alkaloids from Catharanthus trichophyllus roots and structure elucidation of cathaphylline.

    Science.gov (United States)

    Cordell, G A; Farnsworth, N R

    1976-03-01

    Further examination of the cytotoxic alkaloid fractions of Catharanthus trichophyllus roots afforded nine alkaloids. Two of these alkaloids, lochnericine and horhammericine, are responsible for part of the cytotoxic activity. The structure elucidation of cathaphylline, a new beta-anilino acrylate derivative, is described.

  17. Structural Genomics and Drug Discovery for Infectious Diseases

    International Nuclear Information System (INIS)

    Anderson, W.F.

    2009-01-01

    The application of structural genomics methods and approaches to proteins from organisms causing infectious diseases is making available the three dimensional structures of many proteins that are potential drug targets and laying the groundwork for structure aided drug discovery efforts. There are a number of structural genomics projects with a focus on pathogens that have been initiated worldwide. The Center for Structural Genomics of Infectious Diseases (CSGID) was recently established to apply state-of-the-art high throughput structural biology technologies to the characterization of proteins from the National Institute for Allergy and Infectious Diseases (NIAID) category A-C pathogens and organisms causing emerging, or re-emerging infectious diseases. The target selection process emphasizes potential biomedical benefits. Selected proteins include known drug targets and their homologs, essential enzymes, virulence factors and vaccine candidates. The Center also provides a structure determination service for the infectious disease scientific community. The ultimate goal is to generate a library of structures that are available to the scientific community and can serve as a starting point for further research and structure aided drug discovery for infectious diseases. To achieve this goal, the CSGID will determine protein crystal structures of 400 proteins and protein-ligand complexes using proven, rapid, highly integrated, and cost-effective methods for such determination, primarily by X-ray crystallography. High throughput crystallographic structure determination is greatly aided by frequent, convenient access to high-performance beamlines at third-generation synchrotron X-ray sources.

  18. Genomics, transcriptomics and proteomics to elucidate the pathogenesis of rheumatoid arthritis.

    Science.gov (United States)

    Song, Xinqiang; Lin, Qingsong

    2017-08-01

    Rheumatoid arthritis is an autoimmune disease that affects several organs and tissues, predominantly the synovial joints. The pathogenesis of this disease is not completely understood, which maybe involved in the genomic variations, gene expression, protein translation and post-translational modifications. These system variations in genomics, transcriptomics and proteomics are dynamic in nature and their crosstalk is overwhelmingly complex, thus analyzing them separately may not be very informative. However, various '-omics' techniques developed in recent years have opened up new possibilities for clarifying disease pathways and thereby facilitating early diagnosis and specific therapies. This review examines how recent advances in the fields of genomics, transcriptomics and proteomics have contributed to our understanding of rheumatoid arthritis.

  19. Structural genomics of infectious disease drug targets: the SSGCID

    International Nuclear Information System (INIS)

    Stacy, Robin; Begley, Darren W.; Phan, Isabelle; Staker, Bart L.; Van Voorhis, Wesley C.; Varani, Gabriele; Buchko, Garry W.; Stewart, Lance J.; Myler, Peter J.

    2011-01-01

    An introduction and overview of the focus, goals and overall mission of the Seattle Structural Genomics Center for Infectious Disease (SSGCID) is given. The Seattle Structural Genomics Center for Infectious Disease (SSGCID) is a consortium of researchers at Seattle BioMed, Emerald BioStructures, the University of Washington and Pacific Northwest National Laboratory that was established to apply structural genomics approaches to drug targets from infectious disease organisms. The SSGCID is currently funded over a five-year period by the National Institute of Allergy and Infectious Diseases (NIAID) to determine the three-dimensional structures of 400 proteins from a variety of Category A, B and C pathogens. Target selection engages the infectious disease research and drug-therapy communities to identify drug targets, essential enzymes, virulence factors and vaccine candidates of biomedical relevance to combat infectious diseases. The protein-expression systems, purified proteins, ligand screens and three-dimensional structures produced by SSGCID constitute a valuable resource for drug-discovery research, all of which is made freely available to the greater scientific community. This issue of Acta Crystallographica Section F, entirely devoted to the work of the SSGCID, covers the details of the high-throughput pipeline and presents a series of structures from a broad array of pathogenic organisms. Here, a background is provided on the structural genomics of infectious disease, the essential components of the SSGCID pipeline are discussed and a survey of progress to date is presented

  20. Detecting uber-operons in prokaryotic genomes.

    Science.gov (United States)

    Che, Dongsheng; Li, Guojun; Mao, Fenglou; Wu, Hongwei; Xu, Ying

    2006-01-01

    We present a study on computational identification of uber-operons in a prokaryotic genome, each of which represents a group of operons that are evolutionarily or functionally associated through operons in other (reference) genomes. Uber-operons represent a rich set of footprints of operon evolution, whose full utilization could lead to new and more powerful tools for elucidation of biological pathways and networks than what operons have provided, and a better understanding of prokaryotic genome structures and evolution. Our prediction algorithm predicts uber-operons through identifying groups of functionally or transcriptionally related operons, whose gene sets are conserved across the target and multiple reference genomes. Using this algorithm, we have predicted uber-operons for each of a group of 91 genomes, using the other 90 genomes as references. In particular, we predicted 158 uber-operons in Escherichia coli K12 covering 1830 genes, and found that many of the uber-operons correspond to parts of known regulons or biological pathways or are involved in highly related biological processes based on their Gene Ontology (GO) assignments. For some of the predicted uber-operons that are not parts of known regulons or pathways, our analyses indicate that their genes are highly likely to work together in the same biological processes, suggesting the possibility of new regulons and pathways. We believe that our uber-operon prediction provides a highly useful capability and a rich information source for elucidation of complex biological processes, such as pathways in microbes. All the prediction results are available at our Uber-Operon Database: http://csbl.bmb.uga.edu/uber, the first of its kind.

  1. The Human Genome Initiative of the Department of Energy

    Science.gov (United States)

    1988-01-01

    The structural characterization of genes and elucidation of their encoded functions have become a cornerstone of modern health research, biology and biotechnology. A genome program is an organized effort to locate and identify the functions of all the genes of an organism. Beginning with the DOE-sponsored, 1986 human genome workshop at Santa Fe, the value of broadly organized efforts supporting total genome characterization became a subject of intensive study. There is now national recognition that benefits will rapidly accrue from an effective scientific infrastructure for total genome research. In the US genome research is now receiving dedicated funds. Several other nations are implementing genome programs. Supportive infrastructure is being improved through both national and international cooperation. The Human Genome Initiative of the Department of Energy (DOE) is a focused program of Resource and Technology Development, with objectives of speeding and bringing economies to the national human genome effort. This report relates the origins and progress of the Initiative.

  2. Gene Composer in a structural genomics environment

    International Nuclear Information System (INIS)

    Lorimer, Don; Raymond, Amy; Mixon, Mark; Burgin, Alex; Staker, Bart; Stewart, Lance

    2011-01-01

    For structural biology applications, protein-construct engineering is guided by comparative sequence analysis and structural information, which allow the researcher to better define domain boundaries for terminal deletions and nonconserved regions for surface mutants. A database software application called Gene Composer has been developed to facilitate construct design. The structural genomics effort at the Seattle Structural Genomics Center for Infectious Disease (SSGCID) requires the manipulation of large numbers of amino-acid sequences and the underlying DNA sequences which are to be cloned into expression vectors. To improve efficiency in high-throughput protein structure determination, a database software package, Gene Composer, has been developed which facilitates the information-rich design of protein constructs and their underlying gene sequences. With its modular workflow design and numerous graphical user interfaces, Gene Composer enables researchers to perform all common bioinformatics steps used in modern structure-guided protein engineering and synthetic gene engineering. An example of the structure determination of H1N1 RNA-dependent RNA polymerase PB2 subunit is given

  3. Structural biology at York Structural Biology Laboratory; laboratory information management systems for structural genomics

    Czech Academy of Sciences Publication Activity Database

    Dohnálek, Jan

    2005-01-01

    Roč. 12, č. 1 (2005), s. 3 ISSN 1211-5894. [Meeting of Structural Biologists /4./. 10.03.2005-12.03.2005, Nové Hrady] R&D Projects: GA MŠk(CZ) 1K05008 Keywords : structural biology * LIMS * structural genomics Subject RIV: CD - Macromolecular Chemistry

  4. Structure-based inference of molecular functions of proteins of unknown function from Berkeley Structural Genomics Center

    Energy Technology Data Exchange (ETDEWEB)

    Kim, Sung-Hou; Shin, Dong Hae; Hou, Jingtong; Chandonia, John-Marc; Das, Debanu; Choi, In-Geol; Kim, Rosalind; Kim, Sung-Hou

    2007-09-02

    Advances in sequence genomics have resulted in an accumulation of a huge number of protein sequences derived from genome sequences. However, the functions of a large portion of them cannot be inferred based on the current methods of sequence homology detection to proteins of known functions. Three-dimensional structure can have an important impact in providing inference of molecular function (physical and chemical function) of a protein of unknown function. Structural genomics centers worldwide have been determining many 3-D structures of the proteins of unknown functions, and possible molecular functions of them have been inferred based on their structures. Combined with bioinformatics and enzymatic assay tools, the successful acceleration of the process of protein structure determination through high throughput pipelines enables the rapid functional annotation of a large fraction of hypothetical proteins. We present a brief summary of the process we used at the Berkeley Structural Genomics Center to infer molecular functions of proteins of unknown function.

  5. From structure prediction to genomic screens for novel non-coding RNAs

    DEFF Research Database (Denmark)

    Gorodkin, Jan; Hofacker, Ivo L.

    2011-01-01

    Abstract: Non-coding RNAs (ncRNAs) are receiving more and more attention not only as an abundant class of genes, but also as regulatory structural elements (some located in mRNAs). A key feature of RNA function is its structure. Computational methods were developed early for folding and prediction....... This and the increased amount of available genomes have made it possible to employ structure-based methods for genomic screens. The field has moved from folding prediction of single sequences to computational screens for ncRNAs in genomic sequence using the RNA structure as the main characteristic feature. Whereas early...... upon some of the concepts in current methods that have been applied in genomic screens for de novo RNA structures in searches for novel ncRNA genes and regulatory RNA structure on mRNAs. We discuss the strengths and weaknesses of the different strategies and how they can complement each other....

  6. Structured RNAs and synteny regions in the pig genome

    DEFF Research Database (Denmark)

    Anthon, Christian; Tafer, Hakim; Havgaard, Jakob H

    2014-01-01

    BACKGROUND: Annotating mammalian genomes for noncoding RNAs (ncRNAs) is nontrivial since far from all ncRNAs are known and the computational models are resource demanding. Currently, the human genome holds the best mammalian ncRNA annotation, a result of numerous efforts by several groups. However......, a more direct strategy is desired for the increasing number of sequenced mammalian genomes of which some, such as the pig, are relevant as disease models and production animals. RESULTS: We present a comprehensive annotation of structured RNAs in the pig genome. Combining sequence and structure...... lncRNA loci, 11 conflicts of annotation, and 3,183 ncRNA genes. The ncRNA genes comprise 359 miRNAs, 8 ribozymes, 185 rRNAs, 638 snoRNAs, 1,030 snRNAs, 810 tRNAs and 153 ncRNA genes not belonging to the here fore mentioned classes. When running the pipeline on a local shuffled version of the genome...

  7. Structural variation in two human genomes mapped at single-nucleotide resolution by whole genome de novo assembly

    DEFF Research Database (Denmark)

    Li, Yingrui; Zheng, Hancheng; Luo, Ruibang

    2011-01-01

    Here we use whole-genome de novo assembly of second-generation sequencing reads to map structural variation (SV) in an Asian genome and an African genome. Our approach identifies small- and intermediate-size homozygous variants (1-50 kb) including insertions, deletions, inversions and their precise...

  8. From structure prediction to genomic screens for novel non-coding RNAs.

    Science.gov (United States)

    Gorodkin, Jan; Hofacker, Ivo L

    2011-08-01

    Non-coding RNAs (ncRNAs) are receiving more and more attention not only as an abundant class of genes, but also as regulatory structural elements (some located in mRNAs). A key feature of RNA function is its structure. Computational methods were developed early for folding and prediction of RNA structure with the aim of assisting in functional analysis. With the discovery of more and more ncRNAs, it has become clear that a large fraction of these are highly structured. Interestingly, a large part of the structure is comprised of regular Watson-Crick and GU wobble base pairs. This and the increased amount of available genomes have made it possible to employ structure-based methods for genomic screens. The field has moved from folding prediction of single sequences to computational screens for ncRNAs in genomic sequence using the RNA structure as the main characteristic feature. Whereas early methods focused on energy-directed folding of single sequences, comparative analysis based on structure preserving changes of base pairs has been efficient in improving accuracy, and today this constitutes a key component in genomic screens. Here, we cover the basic principles of RNA folding and touch upon some of the concepts in current methods that have been applied in genomic screens for de novo RNA structures in searches for novel ncRNA genes and regulatory RNA structure on mRNAs. We discuss the strengths and weaknesses of the different strategies and how they can complement each other.

  9. Functional RNA structures throughout the Hepatitis C Virus genome.

    Science.gov (United States)

    Adams, Rebecca L; Pirakitikulr, Nathan; Pyle, Anna Marie

    2017-06-01

    The single-stranded Hepatitis C Virus (HCV) genome adopts a set of elaborate RNA structures that are involved in every stage of the viral lifecycle. Recent advances in chemical probing, sequencing, and structural biology have facilitated analysis of RNA folding on a genome-wide scale, revealing novel structures and networks of interactions. These studies have underscored the active role played by RNA in every function of HCV and they open the door to new types of RNA-targeted therapeutics. Copyright © 2017 Elsevier B.V. All rights reserved.

  10. Genome3D: a UK collaborative project to annotate genomic sequences with predicted 3D structures based on SCOP and CATH domains.

    Science.gov (United States)

    Lewis, Tony E; Sillitoe, Ian; Andreeva, Antonina; Blundell, Tom L; Buchan, Daniel W A; Chothia, Cyrus; Cuff, Alison; Dana, Jose M; Filippis, Ioannis; Gough, Julian; Hunter, Sarah; Jones, David T; Kelley, Lawrence A; Kleywegt, Gerard J; Minneci, Federico; Mitchell, Alex; Murzin, Alexey G; Ochoa-Montaño, Bernardo; Rackham, Owen J L; Smith, James; Sternberg, Michael J E; Velankar, Sameer; Yeats, Corin; Orengo, Christine

    2013-01-01

    Genome3D, available at http://www.genome3d.eu, is a new collaborative project that integrates UK-based structural resources to provide a unique perspective on sequence-structure-function relationships. Leading structure prediction resources (DomSerf, FUGUE, Gene3D, pDomTHREADER, Phyre and SUPERFAMILY) provide annotations for UniProt sequences to indicate the locations of structural domains (structural annotations) and their 3D structures (structural models). Structural annotations and 3D model predictions are currently available for three model genomes (Homo sapiens, E. coli and baker's yeast), and the project will extend to other genomes in the near future. As these resources exploit different strategies for predicting structures, the main aim of Genome3D is to enable comparisons between all the resources so that biologists can see where predictions agree and are therefore more trusted. Furthermore, as these methods differ in whether they build their predictions using CATH or SCOP, Genome3D also contains the first official mapping between these two databases. This has identified pairs of similar superfamilies from the two resources at various degrees of consensus (532 bronze pairs, 527 silver pairs and 370 gold pairs).

  11. The eastern oyster genome: A resource for comparative genomics in shellfish aquaculture species

    Science.gov (United States)

    Oyster aquaculture is an important sector of world food production. As such, it is imperative to develop a high quality reference genome for the eastern oyster, Crassostrea virginica, to assist in the elucidation of the genomic basis of commercially important traits. All genetic, gene expression and...

  12. Statistical properties of thermodynamically predicted RNA secondary structures in viral genomes

    Science.gov (United States)

    Spanò, M.; Lillo, F.; Miccichè, S.; Mantegna, R. N.

    2008-10-01

    By performing a comprehensive study on 1832 segments of 1212 complete genomes of viruses, we show that in viral genomes the hairpin structures of thermodynamically predicted RNA secondary structures are more abundant than expected under a simple random null hypothesis. The detected hairpin structures of RNA secondary structures are present both in coding and in noncoding regions for the four groups of viruses categorized as dsDNA, dsRNA, ssDNA and ssRNA. For all groups, hairpin structures of RNA secondary structures are detected more frequently than expected for a random null hypothesis in noncoding rather than in coding regions. However, potential RNA secondary structures are also present in coding regions of dsDNA group. In fact, we detect evolutionary conserved RNA secondary structures in conserved coding and noncoding regions of a large set of complete genomes of dsDNA herpesviruses.

  13. From structure prediction to genomic screens for novel non-coding RNAs.

    Directory of Open Access Journals (Sweden)

    Jan Gorodkin

    2011-08-01

    Full Text Available Non-coding RNAs (ncRNAs are receiving more and more attention not only as an abundant class of genes, but also as regulatory structural elements (some located in mRNAs. A key feature of RNA function is its structure. Computational methods were developed early for folding and prediction of RNA structure with the aim of assisting in functional analysis. With the discovery of more and more ncRNAs, it has become clear that a large fraction of these are highly structured. Interestingly, a large part of the structure is comprised of regular Watson-Crick and GU wobble base pairs. This and the increased amount of available genomes have made it possible to employ structure-based methods for genomic screens. The field has moved from folding prediction of single sequences to computational screens for ncRNAs in genomic sequence using the RNA structure as the main characteristic feature. Whereas early methods focused on energy-directed folding of single sequences, comparative analysis based on structure preserving changes of base pairs has been efficient in improving accuracy, and today this constitutes a key component in genomic screens. Here, we cover the basic principles of RNA folding and touch upon some of the concepts in current methods that have been applied in genomic screens for de novo RNA structures in searches for novel ncRNA genes and regulatory RNA structure on mRNAs. We discuss the strengths and weaknesses of the different strategies and how they can complement each other.

  14. Production of specific-structured lipids by enzymatic interesterification: elucidation of acyl migration by response surface design

    DEFF Research Database (Denmark)

    Xu, Xuebing; Skands, Anja; Høy, Carl-Erik

    1998-01-01

    Production of specific-structured lipids (SSL) by lipase-catalyzed interesterification has been attracting more and more attention recently. However, it was found that acyl migration occurs during the reaction and causes the production of by-products. In this paper, the elucidation of acyl...

  15. Structure Elucidation and Cytotoxic Evaluation of New Polyacetylenes from a Marine Sponge Petrosia sp.

    Directory of Open Access Journals (Sweden)

    Yung-Shun Juan

    2014-09-01

    Full Text Available The sponge Petrosia sp. yielded five polyacetylenic compounds (1–5, including two new polyacetylenes, petrosianynes A (1 and B (2. The structures of these compounds were elucidated by detailed spectroscopic analysis and by comparison with the physical and spectral data of related known analogues. Compounds 1–5 exhibited significant cytotoxic activity against a limited panel of cancer cell lines.

  16. Technologies and techniques for analysis and use of genome information, 1997; Genome joho kaidoku riyo gijutsu no chosa kenkyu

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    1998-03-01

    The paper clarified the whole image of cell functions by elucidating the function and manifestation control mechanism of genes existing in genomes, and the network of their interactions, and surveyed applicability of the useful functions obtained of cells and proteins to the industrial field. The survey was made from a viewpoint of the fields of both biology and information science. Especially, based on the function-known DNA base sequence database, the following technologies were surveyed: technology to predict the function of the function-unknown DNA base sequence, search/separation technology to acquire the genes to be functionally elucidated in a state of being suitable for manifestation, technology to get perfect proteins by effectively manifesting the genes to be functionally elucidated, and technology to analyze the function of the proteins obtained by manifestation of genes. Further, the International Symposium was held which is titled `Genome Research Opens a New World to Bioindustry (New Developments in Genome Informatics Technologies). With the future progress of technology to decipher and use genome information, the construction of much newer genome industry is anticipated. 165 refs., 44 figs., 10 tabs.

  17. Structural Elucidation of Novel Saponins in the Sea Cucumber Holothuria lessoni

    Science.gov (United States)

    Bahrami, Yadollah; Zhang, Wei; Chataway, Tim; Franco, Chris

    2014-01-01

    Sea cucumbers are prolific producers of a wide range of bioactive compounds. This study aimed to purify and characterize one class of compound, the saponins, from the viscera of the Australian sea cucumber Holothuria lessoni. The saponins were obtained by ethanolic extraction of the viscera and enriched by a liquid-liquid partition process and adsorption column chromatography. A high performance centrifugal partition chromatography (HPCPC) was applied to the saponin-enriched mixture to obtain saponins with high purity. The resultant purified saponins were profiled using MALDI-MS/MS and ESI-MS/MS which revealed the structure of isomeric saponins to contain multiple aglycones and/or sugar residues. We have elucidated the structure of five novel saponins, Holothurins D/E and Holothurinosides X/Y/Z, along with seven reported triterpene glycosides, including sulfated and non-sulfated saponins containing a range of aglycones and sugar moieties, from the viscera of H. lessoni. The abundance of novel compounds from this species holds promise for biotechnological applications. PMID:25110919

  18. Elucidation of Peptide-Directed Palladium Surface Structure for Biologically Tunable Nanocatalysts

    Energy Technology Data Exchange (ETDEWEB)

    Bedford, Nicholas M.; Ramezani-Dakhel, Hadi; Slocik, Joseph M.; Briggs, Beverly D.; Ren, Yang; Frenkel, Anatoly I.; Petkov, Valeri; Heinz, Hendrik; Naik, Rajesh R.; Knecht, Mark R.

    2015-05-01

    Peptide-enabled synthesis of inorganic nanostructures represents an avenue to access catalytic materials with tunable and optimized properties. This is achieved via peptide complexity and programmability that is missing in traditional ligands for catalytic nanomaterials. Unfortunately, there is limited information available to correlate peptide sequence to particle structure and catalytic activity to date. As such, the application of peptide-enabled nanocatalysts remains limited to trial and error approaches. In this paper, a hybrid experimental and computational approach is introduced to systematically elucidate biomolecule-dependent structure/function relationships for peptide-capped Pd nanocatalysts. Synchrotron X-ray techniques were used to uncover substantial particle surface structural disorder, which was dependent upon the amino acid sequence of the peptide capping ligand. Nanocatalyst configurations were then determined directly from experimental data using reverse Monte Carlo methods and further refined using molecular dynamics simulation, obtaining thermodynamically stable peptide-Pd nanoparticle configurations. Sequence-dependent catalytic property differences for C-C coupling and olefin hydrogenation were then eluddated by identification of the catalytic active sites at the atomic level and quantitative prediction of relative reaction rates. This hybrid methodology provides a clear route to determine peptide-dependent structure/function relationships, enabling the generation of guidelines for catalyst design through rational tailoring of peptide sequences

  19. Implications of structural genomics target selection strategies: Pfam5000, whole genome, and random approaches

    Energy Technology Data Exchange (ETDEWEB)

    Chandonia, John-Marc; Brenner, Steven E.

    2004-07-14

    The structural genomics project is an international effort to determine the three-dimensional shapes of all important biological macromolecules, with a primary focus on proteins. Target proteins should be selected according to a strategy which is medically and biologically relevant, of good value, and tractable. As an option to consider, we present the Pfam5000 strategy, which involves selecting the 5000 most important families from the Pfam database as sources for targets. We compare the Pfam5000 strategy to several other proposed strategies that would require similar numbers of targets. These include including complete solution of several small to moderately sized bacterial proteomes, partial coverage of the human proteome, and random selection of approximately 5000 targets from sequenced genomes. We measure the impact that successful implementation of these strategies would have upon structural interpretation of the proteins in Swiss-Prot, TrEMBL, and 131 complete proteomes (including 10 of eukaryotes) from the Proteome Analysis database at EBI. Solving the structures of proteins from the 5000 largest Pfam families would allow accurate fold assignment for approximately 68 percent of all prokaryotic proteins (covering 59 percent of residues) and 61 percent of eukaryotic proteins (40 percent of residues). More fine-grained coverage which would allow accurate modeling of these proteins would require an order of magnitude more targets. The Pfam5000 strategy may be modified in several ways, for example to focus on larger families, bacterial sequences, or eukaryotic sequences; as long as secondary consideration is given to large families within Pfam, coverage results vary only slightly. In contrast, focusing structural genomics on a single tractable genome would have only a limited impact in structural knowledge of other proteomes: a significant fraction (about 30-40 percent of the proteins, and 40-60 percent of the residues) of each proteome is classified in small

  20. Computer-assisted structure elucidation from 13C-NMR-Spectra. I. The development of a three-dimensional structure code. II. The development of an isomer generating program

    International Nuclear Information System (INIS)

    Schuetz, V.

    1999-05-01

    The presented thesis consists of two separate programs which both aid the automated structure elucidation in the CSEARCH database system. A successful utilization of a large collection of NMR reference spectra for the prediction of chemical shift values is dependent on a strong correlation between the spectral data and the structural information via unique coding. By now, this was done using the two-dimensional HOSE code, which turned out to be insufficient whenever stereochemical effects other than cis/trans-isomerism contribute to the chemical shift values. Therefore, this new algorithm has been developed to derive the demanded three-dimensional descriptors. The calculation is performed by matching the query structures against pattern molecules taken from a carefully selected library of ring skeletons. No three-dimensional coordinates are necessary, since the algorithm elucidates the descriptors on base of two-dimensional structures having their stereocenters specified using 'up/down' bonds. The descriptors are defined as number of interactions over 3 to 5 bonds, number of cis-substituents over 1 to 2 ringbonds and markers for axial substituents. This approach of deriving descriptors for steric interactions has successfully extended the HOSE coding scheme and has been implemented into a neural network; both methods allow for high-quality prediction of 13 C-NMR chemical shift values. The second algorithm is an isomer generating program named GENERAL, which efficiently supports the structure elucidation process by calculating all mathematically possible structures to a given molecular formula. The resulting list of structures is exhaustive and free of redundancy. Besides the basic input information - like the molecular formula and the specification of structural fragments, constraints can be defined to restrict the number of resulting structures. The most valuable information is provided by state-of-the-art 2D-NMR experiments and can be easily incorporated into

  1. Population genomics of Fusarium graminearum reveals signatures of divergent evolution within a major cereal pathogen

    Science.gov (United States)

    The cereal pathogen Fusarium graminearum is the primary cause of Fusarium head blight (FHB) and a significant threat to food safety and crop production. To elucidate population structure and identify genomic targets of selection within major FHB pathogen populations in North America we sequenced the...

  2. Tannin structural elucidation and quantitative ³¹P NMR analysis. 1. Model compounds.

    Science.gov (United States)

    Melone, Federica; Saladino, Raffaele; Lange, Heiko; Crestini, Claudia

    2013-10-02

    Tannins and flavonoids are secondary metabolites of plants that display a wide array of biological activities. This peculiarity is related to the inhibition of extracellular enzymes that occurs through the complexation of peptides by tannins. Not only the nature of these interactions, but more fundamentally also the structure of these heterogeneous polyphenolic molecules are not completely clear. This first paper describes the development of a new analytical method for the structural characterization of tannins on the basis of tannin model compounds employing an in situ labeling of all labile H groups (aliphatic OH, phenolic OH, and carboxylic acids) with a phosphorus reagent. The ³¹P NMR analysis of ³¹P-labeled samples allowed the unprecedented quantitative and qualitative structural characterization of hydrolyzable tannins, proanthocyanidins, and catechin tannin model compounds, forming the foundations for the quantitative structural elucidation of a variety of actual tannin samples described in part 2 of this series.

  3. The Crystal Structure of the Malaria Pigment Hemozoin as Elucidated by X-ray Powder Diffraction

    DEFF Research Database (Denmark)

    Straasø, Tine

    survival. Successful inhibition of hemozoin crystallization will lead to parasitic death and thus break the cycle. The aim of this thesis is to elucidate the structure of hemozoin by means of X-ray diffraction techniques. Knowledge of the structure will help facilitate intelligent drug design in the future....... As part of the project an all-in-vacuum powder diffractometer was developed, which provides data with a minimum background level and an improved signal-to-noise ratio. Moreover, the diffractometer is designed with the particular purpose of decreasing the number of parameters to be fitted. Installation...

  4. Breeding experiments and genome-wide association analysis elucidate two genetically different forms of non-syndromic congenital cleft lip and jaw in Vorderwald × Montbéliarde cattle.

    Science.gov (United States)

    Reinartz, S; Distl, O

    2017-10-01

    Non-syndromic congenital cleft lip and jaw (CLJ) is a condition reported in Vorderwald × Montbéliarde cattle. The objective of the present study was to perform a genome-wide association study (GWAS) for 10 CLJ-affected and 50 unaffected Vorderwald × Montbéliarde cattle using the bovine Illumina high density bead chip to identify loci for this condition. Phenotypic classification of CLJ was based on a detailed recording of orofacial structures using computed tomography. A breeding experiment among CLJ-affected Vorderwald × Montbéliarde cattle and CLJ-affected Vorderwald × Montbéliarde cattle with unaffected Holsteins confirmed recessive inheritance and different loci for bilateral or left-sided versus right-sided CLJ. The GWAS for the five cases with right-sided CLJ gave a genome-wide signal on bovine chromosome (BTA) 29 at 16 Mb. For the four left-sided and one bilateral CLJ case, a genome-wide significant association was identified on BTA4 at 32 Mb. Two different loci are very likely to be involved in CLJ in Vorderwald × Montbéliarde cattle because experimental matings among affected cows and bulls with different types of CLJ did not result in CLJ-affected progeny, and in addition, two different loci were also found through GWAS and mapped on two different bovine chromosomes. Validation in 346 Vorderwald × Montbéliarde cattle for the highly associated SNPs on BTA4 and 29 gave ratios of 33/346 (0.095, BTA4) and 6/346 (0.017, BTA29) homozygous mutant genotypes. Further studies should elucidate the responsible mutations underlying the different types of CLJ in Vorderwald × Montbéliarde cattle. © 2017 Stichting International Foundation for Animal Genetics.

  5. Isolation and Structure Elucidation of Three New Dolastanes from the Brown Alga Dilophus spiralis

    Directory of Open Access Journals (Sweden)

    Vassilios Roussis

    2013-04-01

    Full Text Available Three new dolastane diterpenes (1–3 and five previously reported perhydroazulenes were isolated from the organic extracts of the brown alga Dilophus spiralis. The structure elucidation and the assignment of the relative configurations of the isolated natural products were based on extensive analyses of their spectroscopic data, whereas the absolute configuration of metabolite 2 was determined through its chemical conversion to a previously isolated compound of known configuration.

  6. Elucidating the Molecular Basis and Regulation of Chromium (VI) Reduction by Shewanella oneidensis MR-1 Using Biochemical, Genomic, and Proteomic Approaches

    Energy Technology Data Exchange (ETDEWEB)

    Hettich, Robert L.

    2006-10-30

    Although microbial metal reduction has been investigated intensively from physiological and biochemical perspectives, little is known about the genetic basis and regulatory mechanisms underlying the ability of certain bacteria to transform, detoxify, or immobilize a wide array of heavy metals contaminating DOE-relevant environments. The major goal of this work is to elucidate the molecular components comprising the chromium(VI) response pathway, with an emphasis on components involved in Cr(VI) detoxification and the enzyme complex catalyzing the terminal step in Cr(VI) reduction by Shewanella oneidensis MR-1. We have identified and characterized (in the case of DNA-binding response regulator [SO2426] and a putative azoreductase [SO3585]) the genes and gene products involved in the molecular response of MR-1 to chromium(VI) stress using whole-genome sequence information for MR-1 and recently developed proteomic technology, in particular liquid chromatographymass spectrometry (LC-MS), in conjunction with conventional protein purification and characterization techniques. The proteome datasets were integrated with information from whole-genome expression arrays for S. oneidensis MR-1 (as illustrated in Figure 1). The genes and their encoded products identified in this study are of value in understanding metal reduction and bacterial resistance to metal toxicity and in developing effective metal immobilization strategies.

  7. Isolation and structural elucidation of secondary metabolites of plants of the families asteraceae and urticaceae

    International Nuclear Information System (INIS)

    Villagra Quesada, E.

    2002-01-01

    A phytochemistry study of plant's species of the Asteraceae and Urticaceae family is proposed in order to isolate and to elucidate the structure of active principles; due to the fact that several studies have found that some of these families have compounds with anti-inflammatory activity, mainly lactonas sesquiterpenicas . The phytochemistry study was carried out through the application of chromatography techniques, for the separation and purification of the compounds. Includes chromatography of column, fine and liquid layer of high resolution. On the other hand, spectroscopic techniques were used for the elucidation, mainly of nuclear magnetic resonance (RMN) as much of one as of two dimensions. In this way, it was possible to isolate 14 compounds in Decachaeta thieleana and 10 in Phenax mexicanus, from which 6 correspond compounds of innovative structure. The comparison of the results obtained in Decachaeta thieleana (with previous studies) evidences that specimens, orphologically identical (the same species, but different locations), possess totally different compounds. This suggests that the studied specimens do not correspond to the same species. However, the determination of such a cause not only evade the objectives of this work but also the area of study of Chemistry [es

  8. Three-dimensional Structure of a Viral Genome-delivery Portal Vertex

    Energy Technology Data Exchange (ETDEWEB)

    A Olia; P Prevelige Jr.; J Johnson; G Cingolani

    2011-12-31

    DNA viruses such as bacteriophages and herpesviruses deliver their genome into and out of the capsid through large proteinaceous assemblies, known as portal proteins. Here, we report two snapshots of the dodecameric portal protein of bacteriophage P22. The 3.25-{angstrom}-resolution structure of the portal-protein core bound to 12 copies of gene product 4 (gp4) reveals a {approx}1.1-MDa assembly formed by 24 proteins. Unexpectedly, a lower-resolution structure of the full-length portal protein unveils the unique topology of the C-terminal domain, which forms a {approx}200-{angstrom}-long {alpha}-helical barrel. This domain inserts deeply into the virion and is highly conserved in the Podoviridae family. We propose that the barrel domain facilitates genome spooling onto the interior surface of the capsid during genome packaging and, in analogy to a rifle barrel, increases the accuracy of genome ejection into the host cell.

  9. Isolation and Structural Elucidation of Chondrosterins F–H from the Marine Fungus Chondrostereum sp.

    Directory of Open Access Journals (Sweden)

    Wen-Jian Lan

    2013-02-01

    Full Text Available The marine fungus Chondrostereum sp. was collected from a soft coral of the species Sarcophyton tortuosum from the South China Sea. Three new compounds, chondrosterins F–H (1, 4 and 5, together with three known compounds, incarnal (2, arthrosporone (3, and (2E-decene-4,6,8-triyn-1-ol (6, were isolated. Their structures were elucidated primarily based on NMR and MS data. Incarnal (2 exhibited potent cytotoxic activity against various cancer cell lines.

  10. Comparative Genomics Reveals High Genomic Diversity in the Genus Photobacterium

    DEFF Research Database (Denmark)

    Machado, Henrique; Gram, Lone

    2017-01-01

    was widespread and abundant in the genus, suggesting a role in genomic evolution. The high genetic variability and indications of genetic exchange make it difficult to elucidate genome evolutionary paths and raise the awareness of the roles of foreign DNA in the genomic evolution of environmental organisms.......Vibrionaceae is a large marine bacterial family, which can constitute up to 50% of the prokaryotic population in marine waters. Photobacterium is the second largest genus in the family and we used comparative genomics on 35 strains representing 16 of the 28 species described so far, to understand...... the genomic diversity present in the Photobacterium genus. Such understanding is important for ecophysiology studies of the genus. We used whole genome sequences to evaluate phylogenetic relationships using several analyses (16S rRNA, MLSA, fur, amino-acid usage, ANI), which allowed us to identify two...

  11. New families of human regulatory RNA structures identified by comparative analysis of vertebrate genomes

    DEFF Research Database (Denmark)

    Parker, Brian John; Moltke, Ida; Roth, Adam

    2011-01-01

    a comparative method, EvoFam, for genome-wide identification of families of regulatory RNA structures, based on primary sequence and secondary structure similarity. We apply EvoFam to a 41-way genomic vertebrate alignment. Genome-wide, we identify 220 human, high-confidence families outside protein...

  12. Complete mitochondrial genomes elucidate phylogenetic relationships of the deep-sea octocoral families Coralliidae and Paragorgiidae

    Science.gov (United States)

    Figueroa, Diego F.; Baco, Amy R.

    2014-01-01

    In the past decade, molecular phylogenetic analyses of octocorals have shown that the current morphological taxonomic classification of these organisms needs to be revised. The latest phylogenetic analyses show that most octocorals can be divided into three main clades. One of these clades contains the families Coralliidae and Paragorgiidae. These families share several taxonomically important characters and it has been suggested that they may not be monophyletic; with the possibility of the Coralliidae being a derived branch of the Paragorgiidae. Uncertainty exists not only in the relationship of these two families, but also in the classification of the two genera that make up the Coralliidae, Corallium and Paracorallium. Molecular analyses suggest that the genus Corallium is paraphyletic, and it can be divided into two main clades, with the Paracorallium as members of one of these clades. In this study we sequenced the whole mitochondrial genome of five species of Paragorgia and of five species of Corallium to use in a phylogenetic analysis to achieve two main objectives; the first to elucidate the phylogenetic relationship between the Paragorgiidae and Coralliidae and the second to determine whether the genera Corallium and Paracorallium are monophyletic. Our results show that other members of the Coralliidae share the two novel mitochondrial gene arrangements found in a previous study in Corallium konojoi and Paracorallium japonicum; and that the Corallium konojoi arrangement is also found in the Paragorgiidae. Our phylogenetic reconstruction based on all the protein coding genes and ribosomal RNAs of the mitochondrial genome suggest that the Coralliidae are not a derived branch of the Paragorgiidae, but rather a monophyletic sister branch to the Paragorgiidae. While our manuscript was in review a study was published using morphological data and several fragments from mitochondrial genes to redefine the taxonomy of the Coralliidae. Paracorallium was subsumed

  13. Elucidation of hepatitis C virus transmission and early diversification by single genome sequencing.

    Science.gov (United States)

    Li, Hui; Stoddard, Mark B; Wang, Shuyi; Blair, Lily M; Giorgi, Elena E; Parrish, Erica H; Learn, Gerald H; Hraber, Peter; Goepfert, Paul A; Saag, Michael S; Denny, Thomas N; Haynes, Barton F; Hahn, Beatrice H; Ribeiro, Ruy M; Perelson, Alan S; Korber, Bette T; Bhattacharya, Tanmoy; Shaw, George M

    2012-01-01

    A precise molecular identification of transmitted hepatitis C virus (HCV) genomes could illuminate key aspects of transmission biology, immunopathogenesis and natural history. We used single genome sequencing of 2,922 half or quarter genomes from plasma viral RNA to identify transmitted/founder (T/F) viruses in 17 subjects with acute community-acquired HCV infection. Sequences from 13 of 17 acute subjects, but none of 14 chronic controls, exhibited one or more discrete low diversity viral lineages. Sequences within each lineage generally revealed a star-like phylogeny of mutations that coalesced to unambiguous T/F viral genomes. Numbers of transmitted viruses leading to productive clinical infection were estimated to range from 1 to 37 or more (median = 4). Four acutely infected subjects showed a distinctly different pattern of virus diversity that deviated from a star-like phylogeny. In these cases, empirical analysis and mathematical modeling suggested high multiplicity virus transmission from individuals who themselves were acutely infected or had experienced a virus population bottleneck due to antiviral drug therapy. These results provide new quantitative and qualitative insights into HCV transmission, revealing for the first time virus-host interactions that successful vaccines or treatment interventions will need to overcome. Our findings further suggest a novel experimental strategy for identifying full-length T/F genomes for proteome-wide analyses of HCV biology and adaptation to antiviral drug or immune pressures.

  14. Genomic structure, expression and association study of the porcine FSD2.

    Science.gov (United States)

    Lim, Kyu-Sang; Lee, Kyung-Tai; Lee, Si-Woo; Chai, Han-Ha; Jang, Gulwon; Hong, Ki-Chang; Kim, Tae-Hun

    2016-09-01

    The fibronectin type III and SPRY domain containing 2 (FSD2) on porcine chromosome 7 is considered a candidate gene for pork quality, since its two domains, which were present in fibronectin and ryanodine receptor. The fibronectin type III and SPRY domains were first identified in fibronectin and ryanodine receptor, respectively, which are candidate genes for meat quality. The aim of this study was to elucidate the genomic structure of FSD2 and functions of single nucleotide polymorphisms (SNPs) within FSD2 that are related to meat quality in pigs. Using a bacterial artificial chromosome clone sequence, we revealed that porcine FSD2 consisted of 13 exons encoding 750 amino acids. In addition, FSD2 was expressed in heart, longissimus dorsi muscle, psoas muscle, and tendon among 23 kinds of porcine tissues tested. A total of ten SNPs, including four missense mutations, were identified in the exonic region of FSD2, and two major haplotypes were obtained based on the SNP genotypes of 633 Berkshire pigs. Both haplotypes were associated significantly with intramuscular fat content (IMF, P meat color, affecting yellowness (P = 0.002). These haplotype effects were further supported by the alteration of putative protein structures with amino acid substitutions. Taken together, our results suggest that FSD2 haplotypes are involved in regulating meat quality including IMF, MP, and meat color in pigs, and may be used as meaningful molecular makers to identify pigs with preferable pork quality.

  15. Tree decomposition based fast search of RNA structures including pseudoknots in genomes.

    Science.gov (United States)

    Song, Yinglei; Liu, Chunmei; Malmberg, Russell; Pan, Fangfang; Cai, Liming

    2005-01-01

    Searching genomes for RNA secondary structure with computational methods has become an important approach to the annotation of non-coding RNAs. However, due to the lack of efficient algorithms for accurate RNA structure-sequence alignment, computer programs capable of fast and effectively searching genomes for RNA secondary structures have not been available. In this paper, a novel RNA structure profiling model is introduced based on the notion of a conformational graph to specify the consensus structure of an RNA family. Tree decomposition yields a small tree width t for such conformation graphs (e.g., t = 2 for stem loops and only a slight increase for pseudo-knots). Within this modelling framework, the optimal alignment of a sequence to the structure model corresponds to finding a maximum valued isomorphic subgraph and consequently can be accomplished through dynamic programming on the tree decomposition of the conformational graph in time O(k(t)N(2)), where k is a small parameter; and N is the size of the projiled RNA structure. Experiments show that the application of the alignment algorithm to search in genomes yields the same search accuracy as methods based on a Covariance model with a significant reduction in computation time. In particular; very accurate searches of tmRNAs in bacteria genomes and of telomerase RNAs in yeast genomes can be accomplished in days, as opposed to months required by other methods. The tree decomposition based searching tool is free upon request and can be downloaded at our site h t t p ://w.uga.edu/RNA-informatics/software/index.php.

  16. Isolation and Structural Elucidation of an Unknown compound from Murraya alternans (Kurz)Swingle

    International Nuclear Information System (INIS)

    Mya Aye; Hla Myoe Min; Sein Htun

    2002-02-01

    A new taxon of a species, Murraya alternans (Kurz) Swingle (Myanmar name, Naganaing) the series of Murraya belonging to the family Rutaceae had been recognized by Peter G. Waterman in 1986. However, this species has not been undertaken in botanical, medical, and chemical aspects. In this paper, scientific study on this taxon was chemically carried out for the first time. One of the unknown compounds was isolated from this species by column and high performance liquid chromatographic methods. It's partial structure could also be elucidated by spectral analysis such as IR, MS, H NMR(400MHz), C NMR (100MHz) spectrometry respectively. (author)

  17. A sequence-based survey of the complex structural organization of tumor genomes

    Energy Technology Data Exchange (ETDEWEB)

    Collins, Colin; Raphael, Benjamin J.; Volik, Stanislav; Yu, Peng; Wu, Chunxiao; Huang, Guiqing; Linardopoulou, Elena V.; Trask, Barbara J.; Waldman, Frederic; Costello, Joseph; Pienta, Kenneth J.; Mills, Gordon B.; Bajsarowicz, Krystyna; Kobayashi, Yasuko; Sridharan, Shivaranjani; Paris, Pamela; Tao, Quanzhou; Aerni, Sarah J.; Brown, Raymond P.; Bashir, Ali; Gray, Joe W.; Cheng, Jan-Fang; de Jong, Pieter; Nefedov, Mikhail; Ried, Thomas; Padilla-Nash, Hesed M.; Collins, Colin C.

    2008-04-03

    The genomes of many epithelial tumors exhibit extensive chromosomal rearrangements. All classes of genome rearrangements can be identified using End Sequencing Profiling (ESP), which relies on paired-end sequencing of cloned tumor genomes. In this study, brain, breast, ovary and prostate tumors along with three breast cancer cell lines were surveyed with ESP yielding the largest available collection of sequence-ready tumor genome breakpoints and providing evidence that some rearrangements may be recurrent. Sequencing and fluorescence in situ hybridization (FISH) confirmed translocations and complex tumor genome structures that include coamplification and packaging of disparate genomic loci with associated molecular heterogeneity. Comparison of the tumor genomes suggests recurrent rearrangements. Some are likely to be novel structural polymorphisms, whereas others may be bona fide somatic rearrangements. A recurrent fusion transcript in breast tumors and a constitutional fusion transcript resulting from a segmental duplication were identified. Analysis of end sequences for single nucleotide polymorphisms (SNPs) revealed candidate somatic mutations and an elevated rate of novel SNPs in an ovarian tumor. These results suggest that the genomes of many epithelial tumors may be far more dynamic and complex than previously appreciated and that genomic fusions including fusion transcripts and proteins may be common, possibly yielding tumor-specific biomarkers and therapeutic targets.

  18. Structured Matrix Completion with Applications to Genomic Data Integration.

    Science.gov (United States)

    Cai, Tianxi; Cai, T Tony; Zhang, Anru

    2016-01-01

    Matrix completion has attracted significant recent attention in many fields including statistics, applied mathematics and electrical engineering. Current literature on matrix completion focuses primarily on independent sampling models under which the individual observed entries are sampled independently. Motivated by applications in genomic data integration, we propose a new framework of structured matrix completion (SMC) to treat structured missingness by design. Specifically, our proposed method aims at efficient matrix recovery when a subset of the rows and columns of an approximately low-rank matrix are observed. We provide theoretical justification for the proposed SMC method and derive lower bound for the estimation errors, which together establish the optimal rate of recovery over certain classes of approximately low-rank matrices. Simulation studies show that the method performs well in finite sample under a variety of configurations. The method is applied to integrate several ovarian cancer genomic studies with different extent of genomic measurements, which enables us to construct more accurate prediction rules for ovarian cancer survival.

  19. APPLICATION OF A C-13 NMR TOPOLOGICAL MODEL TO THE STRUCTURE ELUCIDATION OF ORGANIC COMPOUNDS

    Institute of Scientific and Technical Information of China (English)

    袁身刚; 彭琛; 郑崇直

    1992-01-01

    This paper presents an approach which can elucidate automatically the structures of simple organic compounds from their C-13 NMR spectral data by using a computer. Based on a substructure/C-13 NMR chemical shift topological correlation model, the approach deduces the candidate substructures and the constraints for the substructure assembling from the molecular formula and C-13 NMR spectral data. Then, candidate structures are generated under these constraints by assembling the candidate substructures in a partial superposition manner. Candidate substructures or structures are evaluated once they are generated in order to eliminate those conflicting with the original data as early as possible. The evaluation of a (sub)structure is mainly carried out by simulating its C-13 NMR (sub) spectrum, which is again based on the model, and comparing the simulated spectrum with the original data.

  20. Comparative genomics of the white-rot fungi, Phanerochaete carnosa and P. chrysosporium, to elucidate the genetic basis of the distinct wood types they colonize

    Energy Technology Data Exchange (ETDEWEB)

    Suzuki, Hitoshi; MacDonald, Jacqueline; Syed, Khajamohiddin; Salamov, Asaf; Hori, Chiaki; Aerts, Andrea; Henrissat, Bernard; Wiebenga, Ad; vanKuyk, Patricia A.; Barry, Kerrie; Lindquist, Erika; LaButti, Kurt; Lapidus, Alla; Lucas, Susan; Coutinho, Pedro; Gong, Yunchen; Samejima, Masahiro; Mahadevan, Radhakrishnan; Abou-Zaid, Mamdouh; de Vries, Ronald P.; Igarashi, Kiyohiko; Yadav, Jagit S.; Grigoriev, Igor V.; Master, Emma R.

    2012-02-17

    Background Softwood is the predominant form of land plant biomass in the Northern hemisphere, and is among the most recalcitrant biomass resources to bioprocess technologies. The white rot fungus, Phanerochaete carnosa, has been isolated almost exclusively from softwoods, while most other known white-rot species, including Phanerochaete chrysosporium, were mainly isolated from hardwoods. Accordingly, it is anticipated that P. carnosa encodes a distinct set of enzymes and proteins that promote softwood decomposition. To elucidate the genetic basis of softwood bioconversion by a white-rot fungus, the present study reports the P. carnosa genome sequence and its comparative analysis with the previously reported P. chrysosporium genome. Results P. carnosa encodes a complete set of lignocellulose-active enzymes. Comparative genomic analysis revealed that P. carnosa is enriched with genes encoding manganese peroxidase, and that the most divergent glycoside hydrolase families were predicted to encode hemicellulases and glycoprotein degrading enzymes. Most remarkably, P. carnosa possesses one of the largest P450 contingents (266 P450s) among the sequenced and annotated wood-rotting basidiomycetes, nearly double that of P. chrysosporium. Along with metabolic pathway modeling, comparative growth studies on model compounds and chemical analyses of decomposed wood components showed greater tolerance of P. carnosa to various substrates including coniferous heartwood. Conclusions The P. carnosa genome is enriched with genes that encode P450 monooxygenases that can participate in extractives degradation, and manganese peroxidases involved in lignin degradation. The significant expansion of P450s in P. carnosa, along with differences in carbohydrate- and lignin-degrading enzymes, could be correlated to the utilization of heartwood and sapwood preparations from both coniferous and hardwood species.

  1. Megabase replication domains along the human genome: relation to chromatin structure and genome organisation.

    Science.gov (United States)

    Audit, Benjamin; Zaghloul, Lamia; Baker, Antoine; Arneodo, Alain; Chen, Chun-Long; d'Aubenton-Carafa, Yves; Thermes, Claude

    2013-01-01

    In higher eukaryotes, the absence of specific sequence motifs, marking the origins of replication has been a serious hindrance to the understanding of (i) the mechanisms that regulate the spatio-temporal replication program, and (ii) the links between origins activation, chromatin structure and transcription. In this chapter, we review the partitioning of the human genome into megabased-size replication domains delineated as N-shaped motifs in the strand compositional asymmetry profiles. They collectively span 28.3% of the genome and are bordered by more than 1,000 putative replication origins. We recapitulate the comparison of this partition of the human genome with high-resolution experimental data that confirms that replication domain borders are likely to be preferential replication initiation zones in the germline. In addition, we highlight the specific distribution of experimental and numerical chromatin marks along replication domains. Domain borders correspond to particular open chromatin regions, possibly encoded in the DNA sequence, and around which replication and transcription are highly coordinated. These regions also present a high evolutionary breakpoint density, suggesting that susceptibility to breakage might be linked to local open chromatin fiber state. Altogether, this chapter presents a compartmentalization of the human genome into replication domains that are landmarks of the human genome organization and are likely to play a key role in genome dynamics during evolution and in pathological situations.

  2. Isolation and structure elucidation of the nucleoside antibiotic strepturidin from Streptomyces albus DSM 40763.

    Science.gov (United States)

    Pesic, Alexander; Steinhaus, Britta; Kemper, Sebastian; Nachtigall, Jonny; Kutzner, Hans Jürgen; Höfle, Gerhard; Süssmuth, Roderich D

    2014-06-01

    The antibiotic strepturidin (1) was isolated from the microorganism Streptomyces albus DSM 40763, and its structure elucidated by spectroscopic methods and chemical degradation studies. The determination of the relative and absolute stereocenters was partially achieved using chiral GC/EI-MS analysis and microderivatization by acetal ring formation and subsequent 2D-NMR analysis of key (1)H,(1)H-NOESY NMR correlations and extraction of (1)H,(13)C coupling constants from (1)H,(13)C-HMBC NMR spectra. Based on these results, a biosynthesis model was proposed.

  3. Structure elucidation of the new citharoxazole from the Mediterranean deep-sea sponge Latrunculia (Biannulata) citharistae.

    Science.gov (United States)

    Genta-Jouve, Grégory; Francezon, Nellie; Puissant, Alexandre; Auberger, Patrick; Vacelet, Jean; Pérez, Thierry; Fontana, Angelo; Mourabit, Ali Al; Thomas, Olivier P

    2011-08-01

    Citharoxazole (1), a new batzelline derivative featuring a benzoxazole moiety, was isolated from the Mediterranean deep-sea sponge Latrunculia (Biannulata) citharistae Vacelet, 1969, together with the known batzelline C (2). This is the first chemical study of a Mediterranean Latrunculia species and the benzoxazole moiety is unprecedented for this family of marine natural products. The structure was mainly elucidated by the interpretation of NMR spectra and especially HMBC correlations. Copyright © 2011 John Wiley & Sons, Ltd.

  4. Structure elucidation of metabolite x17299 by interpretation of mass spectrometric data.

    Science.gov (United States)

    Zhang, Qibo; Ford, Lisa A; Evans, Anne M; Toal, Douglas R

    2017-01-01

    A major bottleneck in metabolomic studies is metabolite identification from accurate mass spectrometric data. Metabolite x17299 was identified in plasma as an unknown in a metabolomic study using a compound-centric approach where the associated ion features of the compound were used to determine the true molecular mass. The aim of this work is to elucidate the chemical structure of x17299, a new compound by de novo interpretation of mass spectrometric data. An Orbitrap Elite mass spectrometer was used for acquisition of mass spectra up to MS 4 at high resolution. Synthetic standards of N,N,N -trimethyl-l-alanyl-l-proline betaine (l,l-TMAP), a diastereomer, and an enantiomer were chemically prepared. The planar structure of x17299 was successfully proposed by de novo mechanistic interpretation of mass spectrometric data without any laborious purification and nuclear magnetic resonance spectroscopic analysis. The proposed structure was verified by deuterium exchanged mass spectrometric analysis and confirmed by comparison to a synthetic standard. Relative configuration of x17299 was determined by direct chromatographic comparison to a pair of synthetic diastereomers. Absolute configuration was assigned after derivatization of x17299 with a chiral auxiliary group followed by its chromatographic comparison to a pair of synthetic standards. The chemical structure of metabolite x17299 was determined to be l,l-TMAP.

  5. Isolation and structure elucidation of an interaction product of aminotadalafil found in an illegal health food product.

    Science.gov (United States)

    Häberli, Adrian; Girard, Philippe; Low, Min-Yong; Ge, Xiaowei

    2010-09-21

    An interaction product of aminotadalafil was isolated from an illegal health food product. The structure of the interaction product was elucidated by means of IR, NMR, and mass spectroscopy. The hitherto unknown compound was characterized as condensation product of aminotadalafil and hydroxymethylfuraldehyde and is probably the result of a drug-excipient incompatibility. Copyright 2010. Published by Elsevier B.V.

  6. Structured RNAs in the ENCODE selected regions of the human genome

    DEFF Research Database (Denmark)

    Washietl, Stefan; Pedersen, Jakob Skou; Korbel, Jan O

    2007-01-01

    Functional RNA structures play an important role both in the context of noncoding RNA transcripts as well as regulatory elements in mRNAs. Here we present a computational study to detect functional RNA structures within the ENCODE regions of the human genome. Since structural RNAs in general lack...... with the GENCODE annotation points to functional RNAs in all genomic contexts, with a slightly increased density in 3'-UTRs. While we estimate a significant false discovery rate of approximately 50%-70% many of the predictions can be further substantiated by additional criteria: 248 loci are predicted by both RNAz...

  7. Wild emmer genome architecture and diversity elucidate wheat evolution and domestication

    Science.gov (United States)

    Wheat (Triticum spp.) is one of the founder crops that likely drove the Neolithic transition to sedentary agrarian societies in the Fertile Crescent over 10,000 years ago. Identifying genetic modifications underlying wheat's domestication requires knowledge of the genome of its allo-tetraploid proge...

  8. High throughput platforms for structural genomics of integral membrane proteins.

    Science.gov (United States)

    Mancia, Filippo; Love, James

    2011-08-01

    Structural genomics approaches on integral membrane proteins have been postulated for over a decade, yet specific efforts are lagging years behind their soluble counterparts. Indeed, high throughput methodologies for production and characterization of prokaryotic integral membrane proteins are only now emerging, while large-scale efforts for eukaryotic ones are still in their infancy. Presented here is a review of recent literature on actively ongoing structural genomics of membrane protein initiatives, with a focus on those aimed at implementing interesting techniques aimed at increasing our rate of success for this class of macromolecules. Copyright © 2011 Elsevier Ltd. All rights reserved.

  9. G2S: A web-service for annotating genomic variants on 3D protein structures.

    Science.gov (United States)

    Wang, Juexin; Sheridan, Robert; Sumer, S Onur; Schultz, Nikolaus; Xu, Dong; Gao, Jianjiong

    2018-01-27

    Accurately mapping and annotating genomic locations on 3D protein structures is a key step in structure-based analysis of genomic variants detected by recent large-scale sequencing efforts. There are several mapping resources currently available, but none of them provides a web API (Application Programming Interface) that support programmatic access. We present G2S, a real-time web API that provides automated mapping of genomic variants on 3D protein structures. G2S can align genomic locations of variants, protein locations, or protein sequences to protein structures and retrieve the mapped residues from structures. G2S API uses REST-inspired design conception and it can be used by various clients such as web browsers, command terminals, programming languages and other bioinformatics tools for bringing 3D structures into genomic variant analysis. The webserver and source codes are freely available at https://g2s.genomenexus.org. g2s@genomenexus.org. Supplementary data are available at Bioinformatics online. © The Author (2018). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com

  10. SeMPI: a genome-based secondary metabolite prediction and identification web server.

    Science.gov (United States)

    Zierep, Paul F; Padilla, Natàlia; Yonchev, Dimitar G; Telukunta, Kiran K; Klementz, Dennis; Günther, Stefan

    2017-07-03

    The secondary metabolism of bacteria, fungi and plants yields a vast number of bioactive substances. The constantly increasing amount of published genomic data provides the opportunity for an efficient identification of gene clusters by genome mining. Conversely, for many natural products with resolved structures, the encoding gene clusters have not been identified yet. Even though genome mining tools have become significantly more efficient in the identification of biosynthetic gene clusters, structural elucidation of the actual secondary metabolite is still challenging, especially due to as yet unpredictable post-modifications. Here, we introduce SeMPI, a web server providing a prediction and identification pipeline for natural products synthesized by polyketide synthases of type I modular. In order to limit the possible structures of PKS products and to include putative tailoring reactions, a structural comparison with annotated natural products was introduced. Furthermore, a benchmark was designed based on 40 gene clusters with annotated PKS products. The web server of the pipeline (SeMPI) is freely available at: http://www.pharmaceutical-bioinformatics.de/sempi. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  11. RNA 3D modules in genome-wide predictions of RNA 2D structure

    DEFF Research Database (Denmark)

    Theis, Corinna; Zirbel, Craig L; Zu Siederdissen, Christian Höner

    2015-01-01

    . These modules can, for example, occur inside structural elements which in RNA 2D predictions appear as internal loops. Hence one question is if the use of such RNA 3D information can improve the prediction accuracy of RNA secondary structure at a genome-wide level. Here, we use RNAz in combination with 3D......Recent experimental and computational progress has revealed a large potential for RNA structure in the genome. This has been driven by computational strategies that exploit multiple genomes of related organisms to identify common sequences and secondary structures. However, these computational...... approaches have two main challenges: they are computationally expensive and they have a relatively high false discovery rate (FDR). Simultaneously, RNA 3D structure analysis has revealed modules composed of non-canonical base pairs which occur in non-homologous positions, apparently by independent evolution...

  12. Rumen microbial genomics

    International Nuclear Information System (INIS)

    Morrison, M.; Nelson, K.E.

    2005-01-01

    Improving microbial degradation of plant cell wall polysaccharides remains one of the highest priority goals for all livestock enterprises, including the cattle herds and draught animals of developing countries. The North American Consortium for Genomics of Fibrolytic Ruminal Bacteria was created to promote the sequencing and comparative analysis of rumen microbial genomes, offering the potential to fully assess the genetic potential in a functional and comparative fashion. It has been found that the Fibrobacter succinogenes genome encodes many more endoglucanases and cellodextrinases than previously isolated, and several new processive endoglucanases have been identified by genome and proteomic analysis of Ruminococcus albus, in addition to a variety of strategies for its adhesion to fibre. The ramifications of acquiring genome sequence data for rumen microorganisms are profound, including the potential to elucidate and overcome the biochemical, ecological or physiological processes that are rate limiting for ruminal fibre degradation. (author)

  13. pH-Controlled Oxidation of an Aromatic Ketone: Structural Elucidation of the Products of Two Green Chemical Reactions

    Science.gov (United States)

    Ballard, C. Eric

    2010-01-01

    A laboratory experiment emphasizing the structural elucidation of organic compounds has been developed as a discovery exercise. The "unknown" compounds are the products of the pH-controlled oxidation of 4'-methoxyacetophenone with bleach. The chemoselectivity of this reaction is highly dependent on the pH of the reaction media: under basic…

  14. Population Structure Analysis of Bull Genomes of European and Western Ancestry

    DEFF Research Database (Denmark)

    Chung, Neo Christopher; Szyda, Joanna; Frąszczak, Magdalena

    2017-01-01

    Since domestication, population bottlenecks, breed formation, and selective breeding have radically shaped the genealogy and genetics of Bos taurus. In turn, characterization of population structure among diverse bull (males of Bos taurus) genomes enables detailed assessment of genetic resources...... and origins. By analyzing 432 unrelated bull genomes from 13 breeds and 16 countries, we demonstrate genetic diversity and structural complexity among the European/Western cattle population. Importantly, we relaxed a strong assumption of discrete or admixed population, by adapting latent variable models...... harboring largest genetic differentiation suggest positive selection underlying population structure. We carried out gene set analysis using SNP annotations to identify enriched functional categories such as energy-related processes and multiple development stages. Our population structure analysis of bull...

  15. Structure Elucidation of Unknown Metabolites in Metabolomics by Combined NMR and MS/MS Prediction.

    Science.gov (United States)

    Boiteau, Rene M; Hoyt, David W; Nicora, Carrie D; Kinmonth-Schultz, Hannah A; Ward, Joy K; Bingol, Kerem

    2018-01-17

    We introduce a cheminformatics approach that combines highly selective and orthogonal structure elucidation parameters; accurate mass, MS/MS (MS²), and NMR into a single analysis platform to accurately identify unknown metabolites in untargeted studies. The approach starts with an unknown LC-MS feature, and then combines the experimental MS/MS and NMR information of the unknown to effectively filter out the false positive candidate structures based on their predicted MS/MS and NMR spectra. We demonstrate the approach on a model mixture, and then we identify an uncatalogued secondary metabolite in Arabidopsis thaliana . The NMR/MS² approach is well suited to the discovery of new metabolites in plant extracts, microbes, soils, dissolved organic matter, food extracts, biofuels, and biomedical samples, facilitating the identification of metabolites that are not present in experimental NMR and MS metabolomics databases.

  16. Genome sequencing reveals complex secondary metabolome in themarine actinomycete Salinispora tropica

    Energy Technology Data Exchange (ETDEWEB)

    Udwary, Daniel W.; Zeigler, Lisa; Asolkar, Ratnakar; Singan,Vasanth; Lapidus, Alla; Fenical, William; Jensen, Paul R.; Moore, BradleyS.

    2007-05-01

    Recent fermentation studies have identified actinomycetes ofthe marine-dwelling genus Salinispora as prolific natural productproducers. To further evaluate their biosynthetic potential, we analyzedall identifiable secondary natural product gene clusters from therecently sequenced 5,184,724 bp S. tropica CNB-440 circular genome. Ouranalysis shows that biosynthetic potential meets or exceeds that shown byprevious Streptomyces genome sequences as well as other naturalproduct-producing actinomycetes. The S. tropica genome features ninepolyketide synthase systems of every known formally classified family,non-ribosomal peptide synthetases and several hybrid clusters. While afew clusters appear to encode molecules previously identified inStreptomyces species,the majority of the 15 biosynthetic loci are novel.Specific chemical information about putative and observed natural productmolecules is presented and discussed. In addition, our bioinformaticanalysis was critical for the structure elucidation of the novelpolyenemacrolactam salinilactam A. This study demonstrates the potentialfor genomic analysis to complement and strengthen traditional naturalproduct isolation studies and firmly establishes the genus Salinispora asa rich source of novel drug-like molecules.

  17. Full-length RNA structure prediction of the HIV-1 genome reveals a conserved core domain

    DEFF Research Database (Denmark)

    Sükösd, Zsuzsanna; Andersen, Ebbe Sloth; Seemann, Ernst Stefan

    2015-01-01

    of the HIV-1 genome is highly variable in most regions, with a limited number of stable and conserved RNA secondary structures. Most interesting, a set of long distance interactions form a core organizing structure (COS) that organize the genome into three major structural domains. Despite overlapping...

  18. A structural model of the genome packaging process in a membrane-containing double stranded DNA virus.

    Directory of Open Access Journals (Sweden)

    Chuan Hong

    2014-12-01

    Full Text Available Two crucial steps in the virus life cycle are genome encapsidation to form an infective virion and genome exit to infect the next host cell. In most icosahedral double-stranded (ds DNA viruses, the viral genome enters and exits the capsid through a unique vertex. Internal membrane-containing viruses possess additional complexity as the genome must be translocated through the viral membrane bilayer. Here, we report the structure of the genome packaging complex with a membrane conduit essential for viral genome encapsidation in the tailless icosahedral membrane-containing bacteriophage PRD1. We utilize single particle electron cryo-microscopy (cryo-EM and symmetry-free image reconstruction to determine structures of PRD1 virion, procapsid, and packaging deficient mutant particles. At the unique vertex of PRD1, the packaging complex replaces the regular 5-fold structure and crosses the lipid bilayer. These structures reveal that the packaging ATPase P9 and the packaging efficiency factor P6 form a dodecameric portal complex external to the membrane moiety, surrounded by ten major capsid protein P3 trimers. The viral transmembrane density at the special vertex is assigned to be a hexamer of heterodimer of proteins P20 and P22. The hexamer functions as a membrane conduit for the DNA and as a nucleating site for the unique vertex assembly. Our structures show a conformational alteration in the lipid membrane after the P9 and P6 are recruited to the virion. The P8-genome complex is then packaged into the procapsid through the unique vertex while the genome terminal protein P8 functions as a valve that closes the channel once the genome is inside. Comparing mature virion, procapsid, and mutant particle structures led us to propose an assembly pathway for the genome packaging apparatus in the PRD1 virion.

  19. A structural model of the genome packaging process in a membrane-containing double stranded DNA virus.

    Science.gov (United States)

    Hong, Chuan; Oksanen, Hanna M; Liu, Xiangan; Jakana, Joanita; Bamford, Dennis H; Chiu, Wah

    2014-12-01

    Two crucial steps in the virus life cycle are genome encapsidation to form an infective virion and genome exit to infect the next host cell. In most icosahedral double-stranded (ds) DNA viruses, the viral genome enters and exits the capsid through a unique vertex. Internal membrane-containing viruses possess additional complexity as the genome must be translocated through the viral membrane bilayer. Here, we report the structure of the genome packaging complex with a membrane conduit essential for viral genome encapsidation in the tailless icosahedral membrane-containing bacteriophage PRD1. We utilize single particle electron cryo-microscopy (cryo-EM) and symmetry-free image reconstruction to determine structures of PRD1 virion, procapsid, and packaging deficient mutant particles. At the unique vertex of PRD1, the packaging complex replaces the regular 5-fold structure and crosses the lipid bilayer. These structures reveal that the packaging ATPase P9 and the packaging efficiency factor P6 form a dodecameric portal complex external to the membrane moiety, surrounded by ten major capsid protein P3 trimers. The viral transmembrane density at the special vertex is assigned to be a hexamer of heterodimer of proteins P20 and P22. The hexamer functions as a membrane conduit for the DNA and as a nucleating site for the unique vertex assembly. Our structures show a conformational alteration in the lipid membrane after the P9 and P6 are recruited to the virion. The P8-genome complex is then packaged into the procapsid through the unique vertex while the genome terminal protein P8 functions as a valve that closes the channel once the genome is inside. Comparing mature virion, procapsid, and mutant particle structures led us to propose an assembly pathway for the genome packaging apparatus in the PRD1 virion.

  20. Large-scale trends in the evolution of gene structures within 11 animal genomes.

    Directory of Open Access Journals (Sweden)

    Mark Yandell

    2006-03-01

    Full Text Available We have used the annotations of six animal genomes (Homo sapiens, Mus musculus, Ciona intestinalis, Drosophila melanogaster, Anopheles gambiae, and Caenorhabditis elegans together with the sequences of five unannotated Drosophila genomes to survey changes in protein sequence and gene structure over a variety of timescales--from the less than 5 million years since the divergence of D. simulans and D. melanogaster to the more than 500 million years that have elapsed since the Cambrian explosion. To do so, we have developed a new open-source software library called CGL (for "Comparative Genomics Library". Our results demonstrate that change in intron-exon structure is gradual, clock-like, and largely independent of coding-sequence evolution. This means that genome annotations can be used in new ways to inform, corroborate, and test conclusions drawn from comparative genomics analyses that are based upon protein and nucleotide sequence similarities.

  1. In vivo genome-wide profiling of RNA secondary structure reveals novel regulatory features.

    Science.gov (United States)

    Ding, Yiliang; Tang, Yin; Kwok, Chun Kit; Zhang, Yu; Bevilacqua, Philip C; Assmann, Sarah M

    2014-01-30

    RNA structure has critical roles in processes ranging from ligand sensing to the regulation of translation, polyadenylation and splicing. However, a lack of genome-wide in vivo RNA structural data has limited our understanding of how RNA structure regulates gene expression in living cells. Here we present a high-throughput, genome-wide in vivo RNA structure probing method, structure-seq, in which dimethyl sulphate methylation of unprotected adenines and cytosines is identified by next-generation sequencing. Application of this method to Arabidopsis thaliana seedlings yielded the first in vivo genome-wide RNA structure map at nucleotide resolution for any organism, with quantitative structural information across more than 10,000 transcripts. Our analysis reveals a three-nucleotide periodic repeat pattern in the structure of coding regions, as well as a less-structured region immediately upstream of the start codon, and shows that these features are strongly correlated with translation efficiency. We also find patterns of strong and weak secondary structure at sites of alternative polyadenylation, as well as strong secondary structure at 5' splice sites that correlates with unspliced events. Notably, in vivo structures of messenger RNAs annotated for stress responses are poorly predicted in silico, whereas mRNA structures of genes related to cell function maintenance are well predicted. Global comparison of several structural features between these two categories shows that the mRNAs associated with stress responses tend to have more single-strandedness, longer maximal loop length and higher free energy per nucleotide, features that may allow these RNAs to undergo conformational changes in response to environmental conditions. Structure-seq allows the RNA structurome and its biological roles to be interrogated on a genome-wide scale and should be applicable to any organism.

  2. Complete Mitochondrial Genome of the Medicinal Mushroom Ganoderma lucidum

    Science.gov (United States)

    Chen, Haimei; Chen, Xiangdong; Lan, Jin; Liu, Chang

    2013-01-01

    Ganoderma lucidum is one of the well-known medicinal basidiomycetes worldwide. The mitochondrion, referred to as the second genome, is an organelle found in most eukaryotic cells and participates in critical cellular functions. Elucidating the structure and function of this genome is important to understand completely the genetic contents of G. lucidum. In this study, we assembled the mitochondrial genome of G. lucidum and analyzed the differential expressions of its encoded genes across three developmental stages. The mitochondrial genome is a typical circular DNA molecule of 60,630 bp with a GC content of 26.67%. Genome annotation identified genes that encode 15 conserved proteins, 27 tRNAs, small and large rRNAs, four homing endonucleases, and two hypothetical proteins. Except for genes encoding trnW and two hypothetical proteins, all genes were located on the positive strand. For the repeat structure analysis, eight forward, two inverted, and three tandem repeats were detected. A pair of fragments with a total length around 5.5 kb was found in both the nuclear and mitochondrial genomes, which suggests the possible transfer of DNA sequences between two genomes. RNA-Seq data for samples derived from three stages, namely, mycelia, primordia, and fruiting bodies, were mapped to the mitochondrial genome and qualified. The protein-coding genes were expressed higher in mycelia or primordial stages compared with those in the fruiting bodies. The rRNA abundances were significantly higher in all three stages. Two regions were transcribed but did not contain any identified protein or tRNA genes. Furthermore, three RNA-editing sites were detected. Genome synteny analysis showed that significant genome rearrangements occurred in the mitochondrial genomes. This study provides valuable information on the gene contents of the mitochondrial genome and their differential expressions at various developmental stages of G. lucidum. The results contribute to the understanding of the

  3. The Impact of Structural Genomics: Expectations and Outcomes

    Energy Technology Data Exchange (ETDEWEB)

    Chandonia, John-Marc; Brenner, Steven E.

    2005-12-21

    Structural Genomics (SG) projects aim to expand our structural knowledge of biological macromolecules, while lowering the average costs of structure determination. We quantitatively analyzed the novelty, cost, and impact of structures solved by SG centers, and contrast these results with traditional structural biology. The first structure from a protein family is particularly important to reveal the fold and ancient relationships to other proteins. In the last year, approximately half of such structures were solved at a SG center rather than in a traditional laboratory. Furthermore, the cost of solving a structure at the most efficient U.S. center has now dropped to one-quarter the estimated cost of solving a structure by traditional methods. However, top structural biology laboratories are much more efficient than the average, and comparable to SG centers despite working on very challenging structures. Moreover, traditional structural biology papers are cited significantly more often, suggesting greater current impact.

  4. Exploring the role of genome and structural ions in preventing viral capsid collapse during dehydration

    Science.gov (United States)

    Martín-González, Natalia; Guérin Darvas, Sofía M.; Durana, Aritz; Marti, Gerardo A.; Guérin, Diego M. A.; de Pablo, Pedro J.

    2018-03-01

    Even though viruses evolve mainly in liquid milieu, their horizontal transmission routes often include episodes of dry environment. Along their life cycle, some insect viruses, such as viruses from the Dicistroviridae family, withstand dehydrated conditions with presently unknown consequences to their structural stability. Here, we use atomic force microscopy to monitor the structural changes of viral particles of Triatoma virus (TrV) after desiccation. Our results demonstrate that TrV capsids preserve their genome inside, conserving their height after exposure to dehydrating conditions, which is in stark contrast with other viruses that expel their genome when desiccated. Moreover, empty capsids (without genome) resulted in collapsed particles after desiccation. We also explored the role of structural ions in the dehydration process of the virions (capsid containing genome) by chelating the accessible cations from the external solvent milieu. We observed that ion suppression helps to keep the virus height upon desiccation. Our results show that under drying conditions, the genome of TrV prevents the capsid from collapsing during dehydration, while the structural ions are responsible for promoting solvent exchange through the virion wall.

  5. SINEs, evolution and genome structure in the opossum.

    Science.gov (United States)

    Gu, Wanjun; Ray, David A; Walker, Jerilyn A; Barnes, Erin W; Gentles, Andrew J; Samollow, Paul B; Jurka, Jerzy; Batzer, Mark A; Pollock, David D

    2007-07-01

    Short INterspersed Elements (SINEs) are non-autonomous retrotransposons, usually between 100 and 500 base pairs (bp) in length, which are ubiquitous components of eukaryotic genomes. Their activity, distribution, and evolution can be highly informative on genomic structure and evolutionary processes. To determine recent activity, we amplified more than one hundred SINE1 loci in a panel of 43 M. domestica individuals derived from five diverse geographic locations. The SINE1 family has expanded recently enough that many loci were polymorphic, and the SINE1 insertion-based genetic distances among populations reflected geographic distance. Genome-wide comparisons of SINE1 densities and GC content revealed that high SINE1 density is associated with high GC content in a few long and many short spans. Young SINE1s, whether fixed or polymorphic, showed an unbiased GC content preference for insertion, indicating that the GC preference accumulates over long time periods, possibly in periodic bursts. SINE1 evolution is thus broadly similar to human Alu evolution, although it has an independent origin. High GC content adjacent to SINE1s is strongly correlated with bias towards higher AT to GC substitutions and lower GC to AT substitutions. This is consistent with biased gene conversion, and also indicates that like chickens, but unlike eutherian mammals, GC content heterogeneity (isochore structure) is reinforced by substitution processes in the M. domestica genome. Nevertheless, both high and low GC content regions are apparently headed towards lower GC content equilibria, possibly due to a relative shift to lower recombination rates in the recent Monodelphis ancestral lineage. Like eutherians, metatherian (marsupial) mammals have evolved high CpG substitution rates, but this is apparently a convergence in process rather than a shared ancestral state.

  6. Elucidation of the structure-property relationship of p-type organic semiconductors through rapid library construction via a one-pot, Suzuki-Miyaura coupling reaction.

    Science.gov (United States)

    Fuse, Shinichiro; Matsumura, Keisuke; Wakamiya, Atsushi; Masui, Hisashi; Tanaka, Hiroshi; Yoshikawa, Susumu; Takahashi, Takashi

    2014-09-08

    The elucidation of the structure-property relationship is an important issue in the development of organic electronics. Combinatorial synthesis and the evaluation of systematically modified compounds is a powerful tool in the work of elucidating structure-property relationships. In this manuscript, D-π-A structure, 32 p-type organic semiconductors were rapidly synthesized via a one-pot, Suzuki-Miyaura coupling with subsequent Knoevenagel condensation. Evaluation of the solubility and photovoltaic properties of the prepared compounds revealed that the measured solubility was strongly correlated with the solubility parameter (SP), as reported by Fedors. In addition, the SPs were correlated with the Jsc of thin-film organic solar cells prepared using synthesized compounds. Among the evaluated photovoltaic properties of the solar cells, Jsc and Voc had strong correlations with the photoconversion efficiency (PCE).

  7. Genome projects and the functional-genomic era.

    Science.gov (United States)

    Sauer, Sascha; Konthur, Zoltán; Lehrach, Hans

    2005-12-01

    The problems we face today in public health as a result of the -- fortunately -- increasing age of people and the requirements of developing countries create an urgent need for new and innovative approaches in medicine and in agronomics. Genomic and functional genomic approaches have a great potential to at least partially solve these problems in the future. Important progress has been made by procedures to decode genomic information of humans, but also of other key organisms. The basic comprehension of genomic information (and its transfer) should now give us the possibility to pursue the next important step in life science eventually leading to a basic understanding of biological information flow; the elucidation of the function of all genes and correlative products encoded in the genome, as well as the discovery of their interactions in a molecular context and the response to environmental factors. As a result of the sequencing projects, we are now able to ask important questions about sequence variation and can start to comprehensively study the function of expressed genes on different levels such as RNA, protein or the cell in a systematic context including underlying networks. In this article we review and comment on current trends in large-scale systematic biological research. A particular emphasis is put on technology developments that can provide means to accomplish the tasks of future lines of functional genomics.

  8. Systematic determination of the mosaic structure of bacterial genomes: species backbone versus strain-specific loops

    Directory of Open Access Journals (Sweden)

    Gendrault-Jacquemard A

    2005-07-01

    Full Text Available Abstract Background Public databases now contain multitude of complete bacterial genomes, including several genomes of the same species. The available data offers new opportunities to address questions about bacterial genome evolution, a task that requires reliable fine comparison data of closely related genomes. Recent analyses have shown, using pairwise whole genome alignments, that it is possible to segment bacterial genomes into a common conserved backbone and strain-specific sequences called loops. Results Here, we generalize this approach and propose a strategy that allows systematic and non-biased genome segmentation based on multiple genome alignments. Segmentation analyses, as applied to 13 different bacterial species, confirmed the feasibility of our approach to discern the 'mosaic' organization of bacterial genomes. Segmentation results are available through a Web interface permitting functional analysis, extraction and visualization of the backbone/loops structure of documented genomes. To illustrate the potential of this approach, we performed a precise analysis of the mosaic organization of three E. coli strains and functional characterization of the loops. Conclusion The segmentation results including the backbone/loops structure of 13 bacterial species genomes are new and available for use by the scientific community at the URL: http://genome.jouy.inra.fr/mosaic.

  9. Functional Genomics Approaches to Studying Symbioses between Legumes and Nitrogen-Fixing Rhizobia.

    Science.gov (United States)

    Lardi, Martina; Pessi, Gabriella

    2018-05-18

    Biological nitrogen fixation gives legumes a pronounced growth advantage in nitrogen-deprived soils and is of considerable ecological and economic interest. In exchange for reduced atmospheric nitrogen, typically given to the plant in the form of amides or ureides, the legume provides nitrogen-fixing rhizobia with nutrients and highly specialised root structures called nodules. To elucidate the molecular basis underlying physiological adaptations on a genome-wide scale, functional genomics approaches, such as transcriptomics, proteomics, and metabolomics, have been used. This review presents an overview of the different functional genomics approaches that have been performed on rhizobial symbiosis, with a focus on studies investigating the molecular mechanisms used by the bacterial partner to interact with the legume. While rhizobia belonging to the alpha-proteobacterial group (alpha-rhizobia) have been well studied, few studies to date have investigated this process in beta-proteobacteria (beta-rhizobia).

  10. LATERAL GENE TRANSFER AND THE HISTORY OF BACTERIAL GENOMES

    Energy Technology Data Exchange (ETDEWEB)

    Howard Ochman

    2006-02-22

    The aims of this research were to elucidate the role and extent of lateral transfer in the differentiation of bacterial strains and species, and to assess the impact of gene transfer on the evolution of bacterial genomes. The ultimate goal of the project is to examine the dynamics of a core set of protein-coding genes (i.e., those that are distributed universally among Bacteria) by developing conserved primers that would allow their amplification and sequencing in any bacterial taxa. In addition, we adopted a bioinformatic approach to elucidate the extent of lateral gene transfer in sequenced genome.

  11. Genome sequences of Listeria monocytogenes strains with resistance to arsenic

    Science.gov (United States)

    Listeria monocytogenes frequently exhibits resistance to arsenic. We report here the draft genome sequences of eight genetically diverse arsenic-resistant L. monocytogenes strains from human listeriosis and food-associated environments. Availability of these genomes would help to elucidate the role ...

  12. High-throughput SHAPE analysis reveals structures in HIV-1 genomic RNA strongly conserved across distinct biological states.

    Directory of Open Access Journals (Sweden)

    Kevin A Wilkinson

    2008-04-01

    Full Text Available Replication and pathogenesis of the human immunodeficiency virus (HIV is tightly linked to the structure of its RNA genome, but genome structure in infectious virions is poorly understood. We invent high-throughput SHAPE (selective 2'-hydroxyl acylation analyzed by primer extension technology, which uses many of the same tools as DNA sequencing, to quantify RNA backbone flexibility at single-nucleotide resolution and from which robust structural information can be immediately derived. We analyze the structure of HIV-1 genomic RNA in four biologically instructive states, including the authentic viral genome inside native particles. Remarkably, given the large number of plausible local structures, the first 10% of the HIV-1 genome exists in a single, predominant conformation in all four states. We also discover that noncoding regions functioning in a regulatory role have significantly lower (p-value < 0.0001 SHAPE reactivities, and hence more structure, than do viral coding regions that function as the template for protein synthesis. By directly monitoring protein binding inside virions, we identify the RNA recognition motif for the viral nucleocapsid protein. Seven structurally homologous binding sites occur in a well-defined domain in the genome, consistent with a role in directing specific packaging of genomic RNA into nascent virions. In addition, we identify two distinct motifs that are targets for the duplex destabilizing activity of this same protein. The nucleocapsid protein destabilizes local HIV-1 RNA structure in ways likely to facilitate initial movement both of the retroviral reverse transcriptase from its tRNA primer and of the ribosome in coding regions. Each of the three nucleocapsid interaction motifs falls in a specific genome domain, indicating that local protein interactions can be organized by the long-range architecture of an RNA. High-throughput SHAPE reveals a comprehensive view of HIV-1 RNA genome structure, and further

  13. Comparative Genomics Reveals High Genomic Diversity in the Genus Photobacterium.

    Science.gov (United States)

    Machado, Henrique; Gram, Lone

    2017-01-01

    Vibrionaceae is a large marine bacterial family, which can constitute up to 50% of the prokaryotic population in marine waters. Photobacterium is the second largest genus in the family and we used comparative genomics on 35 strains representing 16 of the 28 species described so far, to understand the genomic diversity present in the Photobacterium genus. Such understanding is important for ecophysiology studies of the genus. We used whole genome sequences to evaluate phylogenetic relationships using several analyses (16S rRNA, MLSA, fur , amino-acid usage, ANI), which allowed us to identify two misidentified strains. Genome analyses also revealed occurrence of higher and lower GC content clades, correlating with phylogenetic clusters. Pan- and core-genome analysis revealed the conservation of 25% of the genome throughout the genus, with a large and open pan-genome. The major source of genomic diversity could be traced to the smaller chromosome and plasmids. Several of the physiological traits studied in the genus did not correlate with phylogenetic data. Since horizontal gene transfer (HGT) is often suggested as a source of genetic diversity and a potential driver of genomic evolution in bacterial species, we looked into evidence of such in Photobacterium genomes. Genomic islands were the source of genomic differences between strains of the same species. Also, we found transposase genes and CRISPR arrays that suggest multiple encounters with foreign DNA. Presence of genomic exchange traits was widespread and abundant in the genus, suggesting a role in genomic evolution. The high genetic variability and indications of genetic exchange make it difficult to elucidate genome evolutionary paths and raise the awareness of the roles of foreign DNA in the genomic evolution of environmental organisms.

  14. Recognizing genes and other components of genomic structure

    Energy Technology Data Exchange (ETDEWEB)

    Burks, C. (Los Alamos National Lab., NM (USA)); Myers, E. (Arizona Univ., Tucson, AZ (USA). Dept. of Computer Science); Stormo, G.D. (Colorado Univ., Boulder, CO (USA). Dept. of Molecular, Cellular and Developmental Biology)

    1991-01-01

    The Aspen Center for Physics (ACP) sponsored a three-week workshop, with 26 scientists participating, from 28 May to 15 June, 1990. The workshop, entitled Recognizing Genes and Other Components of Genomic Structure, focussed on discussion of current needs and future strategies for developing the ability to identify and predict the presence of complex functional units on sequenced, but otherwise uncharacterized, genomic DNA. We addressed the need for computationally-based, automatic tools for synthesizing available data about individual consensus sequences and local compositional patterns into the composite objects (e.g., genes) that are -- as composite entities -- the true object of interest when scanning DNA sequences. The workshop was structured to promote sustained informal contact and exchange of expertise between molecular biologists, computer scientists, and mathematicians. No participant stayed for less than one week, and most attended for two or three weeks. Computers, software, and databases were available for use as electronic blackboards'' and as the basis for collaborative exploration of ideas being discussed and developed at the workshop. 23 refs., 2 tabs.

  15. Natural Product Biosynthetic Diversity and Comparative Genomics of the Cyanobacteria.

    Science.gov (United States)

    Dittmann, Elke; Gugger, Muriel; Sivonen, Kaarina; Fewer, David P

    2015-10-01

    Cyanobacteria are an ancient lineage of slow-growing photosynthetic bacteria and a prolific source of natural products with intricate chemical structures and potent biological activities. The bulk of these natural products are known from just a handful of genera. Recent efforts have elucidated the mechanisms underpinning the biosynthesis of a diverse array of natural products from cyanobacteria. Many of the biosynthetic mechanisms are unique to cyanobacteria or rarely described from other organisms. Advances in genome sequence technology have precipitated a deluge of genome sequences for cyanobacteria. This makes it possible to link known natural products to biosynthetic gene clusters but also accelerates the discovery of new natural products through genome mining. These studies demonstrate that cyanobacteria encode a huge variety of cryptic gene clusters for the production of natural products, and the known chemical diversity is likely to be just a fraction of the true biosynthetic capabilities of this fascinating and ancient group of organisms. Copyright © 2015. Published by Elsevier Ltd.

  16. Development of Computational Tools for Metabolic Model Curation, Flux Elucidation and Strain Design

    Energy Technology Data Exchange (ETDEWEB)

    Maranas, Costas D

    2012-05-21

    An overarching goal of the Department of Energy mission is the efficient deployment and engineering of microbial and plant systems to enable biomass conversion in pursuit of high energy density liquid biofuels. This has spurred the pace at which new organisms are sequenced and annotated. This torrent of genomic information has opened the door to understanding metabolism in not just skeletal pathways and a handful of microorganisms but for truly genome-scale reconstructions derived for hundreds of microbes and plants. Understanding and redirecting metabolism is crucial because metabolic fluxes are unique descriptors of cellular physiology that directly assess the current cellular state and quantify the effect of genetic engineering interventions. At the same time, however, trying to keep pace with the rate of genomic data generation has ushered in a number of modeling and computational challenges related to (i) the automated assembly, testing and correction of genome-scale metabolic models, (ii) metabolic flux elucidation using labeled isotopes, and (iii) comprehensive identification of engineering interventions leading to the desired metabolism redirection.

  17. Genome-wide survey of repetitive DNA elements in the button mushroom Agaricus bisporus

    NARCIS (Netherlands)

    Foulongne-Oriol, M.; Murat, C.; Castanera, R.; Ramírez, L.; Sonnenberg, A.S.M.

    2013-01-01

    Repetitive DNA elements are ubiquitous constituents of eukaryotic genomes. The biological roles of these repetitive elements, supposed to impact genome organization and evolution, are not completely elucidated yet. The availability of whole genome sequence offers the opportunity to draw a picture of

  18. Spectral entropy criteria for structural segmentation in genomic DNA sequences

    International Nuclear Information System (INIS)

    Chechetkin, V.R.; Lobzin, V.V.

    2004-01-01

    The spectral entropy is calculated with Fourier structure factors and characterizes the level of structural ordering in a sequence of symbols. It may efficiently be applied to the assessment and reconstruction of the modular structure in genomic DNA sequences. We present the relevant spectral entropy criteria for the local and non-local structural segmentation in DNA sequences. The results are illustrated with the model examples and analysis of intervening exon-intron segments in the protein-coding regions

  19. Isolation and structural elucidation of tiamulin metabolites formed in liver microsomes of pigs.

    Science.gov (United States)

    Lykkeberg, Anne Kruse; Cornett, Claus; Halling-Sørensen, Bent; Hansen, Steen Honoré

    2006-09-18

    Although the antimicrobial tiamulin is extensively metabolized in pigs, the metabolism is not well investigated. In this work the NADPH dependent metabolism of tiamulin in liver microsomes from pigs has been studied. The tiamulin metabolites formed in the incubations were analysed using LC-MS, and three major metabolites were isolated using solid phase extraction and preparative HPLC. The final structure elucidations were performed by tandem mass spectrometry and (1)H and (13)C NMR. The structures of the metabolites were found to be 2beta-hydroxy-tiamulin, 8alpha-hydroxy-tiamulin and N-deethyl-tiamulin. In addition, the LC-MS chromatograms revealed two other minor metabolites. From their chromatography and from MS(2) analysis the structures were estimated to be 2beta-hydroxy-N-deethyl-tiamulin and 8alpha-hydroxy-N-deethyl-tiamulin, but the structures were not confirmed by NMR. In these studies approximately 20% of tiamulin was deethylated, 10% was hydroxylated in the 2beta-position and 7% was hydroxylated in the 8alpha-position. About 40% of tiamulin was metabolized during the incubation conditions used. The protein precipitation in the incubations was performed using perchloric acid, and the preparative purification was performed under alkaline conditions. Therefore, the stability of the metabolites under these conditions was studied. The metabolites were found to be stable in the acid solution, but under alkaline conditions, particularly at room temperature, the stability of especially 8alpha-hydroxy-tiamulin was considerably reduced (40% loss after 1 week).

  20. Whole genome PCR scanning reveals the syntenic genome structure of toxigenic Vibrio cholerae strains in the O1/O139 population.

    Directory of Open Access Journals (Sweden)

    Bo Pang

    Full Text Available Vibrio cholerae is commonly found in estuarine water systems. Toxigenic O1 and O139 V. cholerae strains have caused cholera epidemics and pandemics, whereas the nontoxigenic strains within these serogroups only occasionally lead to disease. To understand the differences in the genome and clonality between the toxigenic and nontoxigenic strains of V. cholerae serogroups O1 and O139, we employed a whole genome PCR scanning (WGPScanning method, an rrn operon-mediated fragment rearrangement analysis and comparative genomic hybridization (CGH to analyze the genome structure of different strains. WGPScanning in conjunction with CGH revealed that the genomic contents of the toxigenic strains were conservative, except for a few indels located mainly in mobile elements. Minor nucleotide variation in orthologous genes appeared to be the major difference between the toxigenic strains. rrn operon-mediated rearrangements were infrequent in El Tor toxigenic strains tested using I-CeuI digested pulsed-field gel electrophoresis (PFGE analysis and PCR analysis based on flanking sequence of rrn operons. Using these methods, we found that the genomic structures of toxigenic El Tor and O139 strains were syntenic. The nontoxigenic strains exhibited more extensive sequence variations, but toxin coregulated pilus positive (TCP+ strains had a similar structure. TCP+ nontoxigenic strains could be subdivided into multiple lineages according to the TCP type, suggesting the existence of complex intermediates in the evolution of toxigenic strains. The data indicate that toxigenic O1 El Tor and O139 strains were derived from a single lineage of intermediates from complex clones in the environment. The nontoxigenic strains with non-El Tor type TCP may yet evolve into new epidemic clones after attaining toxigenic attributes.

  1. Genome-wide identification of structural variants in genes encoding drug targets

    DEFF Research Database (Denmark)

    Rasmussen, Henrik Berg; Dahmcke, Christina Mackeprang

    2012-01-01

    The objective of the present study was to identify structural variants of drug target-encoding genes on a genome-wide scale. We also aimed at identifying drugs that are potentially amenable for individualization of treatments based on knowledge about structural variation in the genes encoding...

  2. Building blocks for automated elucidation of metabolites: natural product-likeness for candidate ranking.

    Science.gov (United States)

    Jayaseelan, Kalai Vanii; Steinbeck, Christoph

    2014-07-05

    In metabolomics experiments, spectral fingerprints of metabolites with no known structural identity are detected routinely. Computer-assisted structure elucidation (CASE) has been used to determine the structural identities of unknown compounds. It is generally accepted that a single 1D NMR spectrum or mass spectrum is usually not sufficient to establish the identity of a hitherto unknown compound. When a suite of spectra from 1D and 2D NMR experiments supplemented with a molecular formula are available, the successful elucidation of the chemical structure for candidates with up to 30 heavy atoms has been reported previously by one of the authors. In high-throughput metabolomics, usually 1D NMR or mass spectrometry experiments alone are conducted for rapid analysis of samples. This method subsequently requires that the spectral patterns are analyzed automatically to quickly identify known and unknown structures. In this study, we investigated whether additional existing knowledge, such as the fact that the unknown compound is a natural product, can be used to improve the ranking of the correct structure in the result list after the structure elucidation process. To identify unknowns using as little spectroscopic information as possible, we implemented an evolutionary algorithm-based CASE mechanism to elucidate candidates in a fully automated fashion, with input of the molecular formula and 13C NMR spectrum of the isolated compound. We also tested how filters like natural product-likeness, a measure that calculates the similarity of the compounds to known natural product space, might enhance the performance and quality of the structure elucidation. The evolutionary algorithm is implemented within the SENECA package for CASE reported previously, and is available for free download under artistic license at http://sourceforge.net/projects/seneca/. The natural product-likeness calculator is incorporated as a plugin within SENECA and is available as a GUI client and

  3. Structural elucidation of the polysaccharide moiety of a glycopeptide (GLPCW-II) from Ganoderma lucidum fruiting bodies.

    Science.gov (United States)

    Ye, LiBin; Zhang, JingSong; Ye, XiJun; Tang, QingJiu; Liu, YanFang; Gong, ChunYu; Du, XiuJui; Pan, YingJie

    2008-03-17

    A water-soluble glycopeptide (GLPCW-II) was isolated from the fruiting bodies of Ganoderma lucidum by DEAE-Sepharose Fast-Flow and Sephacryl S-300 High Resolution Chromatography. The glycopeptide had a molecular weight of 1.2x10(4)Da (determined by HPLC), and consisted of approximately 90% carbohydrate and approximately 8% protein as determined using the phenol-sulfuric acid method and the BCA protein assay reagent kit, respectively. The polysaccharide moiety was composed mainly of D-Glc, L-Fuc, and D-Gal in the ratio of 1.00:1.09:4.09. To facilitate structure-activity studies, the structure of the GLPCW-II polysaccharide moiety was elucidated using 1H and 13C NMR spectroscopy including COSY, TOCSY, HMBC, HSQC, and ROESY, combined with GC-MS of methylated derivatives, and shown to consist of repeating units with the following structure: [Formula: see text].

  4. Pathophysiology of MDS: genomic aberrations.

    Science.gov (United States)

    Ichikawa, Motoshi

    2016-01-01

    Myelodysplastic syndromes (MDS) are characterized by clonal proliferation of hematopoietic stem/progenitor cells and their apoptosis, and show a propensity to progress to acute myelogenous leukemia (AML). Although MDS are recognized as neoplastic diseases caused by genomic aberrations of hematopoietic cells, the details of the genetic abnormalities underlying disease development have not as yet been fully elucidated due to difficulties in analyzing chromosomal abnormalities. Recent advances in comprehensive analyses of disease genomes including whole-genome sequencing technologies have revealed the genomic abnormalities in MDS. Surprisingly, gene mutations were found in approximately 80-90% of cases with MDS, and the novel mutations discovered with these technologies included previously unknown, MDS-specific, mutations such as those of the genes in the RNA-splicing machinery. It is anticipated that these recent studies will shed new light on the pathophysiology of MDS due to genomic aberrations.

  5. Terminal structures of West Nile virus genomic RNA and their interactions with viral NS5 protein

    International Nuclear Information System (INIS)

    Dong Hongping; Zhang Bo; Shi Peiyong

    2008-01-01

    Genome cyclization is essential for flavivirus replication. We used RNases to probe the structures formed by the 5'-terminal 190 nucleotides and the 3'-terminal 111 nucleotides of the West Nile virus (WNV) genomic RNA. When analyzed individually, the two RNAs adopt stem-loop structures as predicted by the thermodynamic-folding program. However, when mixed together, the two RNAs form a duplex that is mediated through base-pairings of two sets of RNA elements (5'CS/3'CSI and 5'UAR/3'UAR). Formation of the RNA duplex facilitates a conformational change that leaves the 3'-terminal nucleotides of the genome (position - 8 to - 16) to be single-stranded. Viral NS5 binds specifically to the 5'-terminal stem-loop (SL1) of the genomic RNA. The 5'SL1 RNA structure is essential for WNV replication. The study has provided further evidence to suggest that flavivirus genome cyclization and NS5/5'SL1 RNA interaction facilitate NS5 binding to the 3' end of the genome for the initiation of viral minus-strand RNA synthesis

  6. Elucidating Functional Aspects of P-type ATPases

    DEFF Research Database (Denmark)

    Autzen, Henriette Elisabeth

    2015-01-01

    and helped enlighten how thapsigargin, a potent inhibitor of SERCA1a, depends on a water mediated hydrogen bond network when bound to SERCA1a. Furthermore, molecular dynamics (MD) simulations of the same P-type ATPase were used to assess a long-standing question whether cholesterol affects SERCA1a through...... similar to that of the wild type (WT) protein. The discrepancy between the newly determined crystal structure of LpCopA and the functional manifestations of the missense mutation in human CopA, could indicate that LpCopA is insufficient in structurally elucidating the effect of disease-causing mutations...... in the human CopA proteins. MD simulations, which combine coarse-grained (CG) and atomistic procedures, were set up in order to elucidate mechanistic implications exerted by the lipid bilayer on LpCopA. The MD simulations of LpCopA corroborated previous and new in vivo activity data and showed...

  7. Justice and the Human Genome Project

    Energy Technology Data Exchange (ETDEWEB)

    Murphy, T.F.; Lappe, M. (eds.)

    1992-01-01

    Most of the essays gathered in this volume were first presented at a conference, Justice and the Human Genome, in Chicago in early November, 1991. The goal of the, conference was to consider questions of justice as they are and will be raised by the Human Genome Project. To achieve its goal of identifying and elucidating the challenges of justice inherent in genomic research and its social applications the conference drew together in one forum members from academia, medicine, and industry with interests divergent as rate-setting for insurance, the care of newborns, and the history of ethics. The essays in this volume address a number of theoretical and practical concerns relative to the meaning of genomic research.

  8. Justice and the Human Genome Project

    Energy Technology Data Exchange (ETDEWEB)

    Murphy, T.F.; Lappe, M. [eds.

    1992-12-31

    Most of the essays gathered in this volume were first presented at a conference, Justice and the Human Genome, in Chicago in early November, 1991. The goal of the, conference was to consider questions of justice as they are and will be raised by the Human Genome Project. To achieve its goal of identifying and elucidating the challenges of justice inherent in genomic research and its social applications the conference drew together in one forum members from academia, medicine, and industry with interests divergent as rate-setting for insurance, the care of newborns, and the history of ethics. The essays in this volume address a number of theoretical and practical concerns relative to the meaning of genomic research.

  9. Plastid genome structure and loss of photosynthetic ability in the parasitic genus Cuscuta.

    Science.gov (United States)

    Revill, Meredith J W; Stanley, Susan; Hibberd, Julian M

    2005-09-01

    The genus Cuscuta (dodder) is composed of parasitic plants, some species of which appear to be losing the ability to photosynthesize. A molecular phylogeny was constructed using 15 species of Cuscuta in order to assess whether changes in photosynthetic ability and alterations in structure of the plastid genome relate to phylogenetic position within the genus. The molecular phylogeny provides evidence for four major clades within Cuscuta. Although DNA blot analysis showed that Cuscuta species have smaller plastid genomes than tobacco, and that plastome size varied significantly even within one Cuscuta clade, dot blot analysis indicated that the dodders possess homologous sequence to 101 genes from the tobacco plastome. Evidence is provided for significant rates of DNA transfer from plastid to nucleus in Cuscuta. Size and structure of Cuscuta plastid genomes, as well as photosynthetic ability, appear to vary independently of position within the phylogeny, thus supporting the hypothesis that within Cuscuta photosynthetic ability and organization of the plastid genome are changing in an unco-ordinated manner.

  10. A comparative study of the DG-OMEGA (DG Omega), DGII, and GAT method for the structure elucidation of a methylene-acetal linked thymine dinucleotide

    NARCIS (Netherlands)

    van Kampen, A. H. C.; Beckers, M. L. M.; Buydens, L. M. C.

    1997-01-01

    This research continues the investigation of the properties of the recently developed structure elucidation method DG-OMEGA (DG Omega). Towards this end it was applied for the structure determination of a methylene-acetal linked thymine dinucleotide. The performance of DG Omega was compared to the

  11. Dihydronaflavonols from the leaves of Derris urucu (Leguminosae): structural elucidation and DPPH radical-scavenging activity

    Energy Technology Data Exchange (ETDEWEB)

    Lobo, Livia T.; Silva, Geilson A. da; Ferreira, Malisson; Silva, Milton N. da; Santos, Alberdan S.; Arruda, Alberto C.; Guilhon, Gisele M.S.P.; Santos, Lourivaldo S.; Arruda, Mara Silvia P. [Universidade Federal do Para (UFPA), Belem, PA (Brazil). Inst. de Ciencias Exatas e Naturais. Programa de Pos-Graduacao em Quimica], e-mail: mspa@ufpa.br; Borges, Rosilvaldo dos Santos [Universidade Federal do Para (UFPA), Belem, PA (Brazil). Fac. de Farmacia. Inst. de Ciencias

    2009-07-01

    Derris urucu is an Amazonian plant with insecticide and ichthyotoxic properties. Studies with this species show the presence of flavonoids, mainly rotenoids, as well as stilbenes. The ethanol extract of the leaves of Derris urucu (Leguminosae) afforded three new dihydroflavonols named urucuol A (1), B (2) and C (3), and the dihydroflavonol isotirumalin (4). Their structures were elucidated by extensive analysis of 1D and 2D NMR, UV and IR spectra and MS data and comparison with literature data. The isolated compounds (1-4) were evaluated for DPPH radical scavenging activity and showed a relatively lower antioxidant ability compared to the commercial antioxidant trans-resveratrol. (author)

  12. cDNA structure, genomic organization and expression patterns of ...

    African Journals Online (AJOL)

    Visfatin was a newly identified adipocytokine, which was involved in various physiologic and pathologic processes of organisms. The cDNA structure, genomic organization and expression patterns of silver Prussian carp visfatin were described in this report. The silver Prussian carp visfatin cDNA cloned from the liver was ...

  13. Complete plastid genomes from Ophioglossum californicum, Psilotum nudum, and Equisetum hyemale reveal an ancestral land plant genome structure and resolve the position of Equisetales among monilophytes

    Directory of Open Access Journals (Sweden)

    Grewe Felix

    2013-01-01

    Full Text Available Abstract Background Plastid genome structure and content is remarkably conserved in land plants. This widespread conservation has facilitated taxon-rich phylogenetic analyses that have resolved organismal relationships among many land plant groups. However, the relationships among major fern lineages, especially the placement of Equisetales, remain enigmatic. Results In order to understand the evolution of plastid genomes and to establish phylogenetic relationships among ferns, we sequenced the plastid genomes from three early diverging species: Equisetum hyemale (Equisetales, Ophioglossum californicum (Ophioglossales, and Psilotum nudum (Psilotales. A comparison of fern plastid genomes showed that some lineages have retained inverted repeat (IR boundaries originating from the common ancestor of land plants, while other lineages have experienced multiple IR changes including expansions and inversions. Genome content has remained stable throughout ferns, except for a few lineage-specific losses of genes and introns. Notably, the losses of the rps16 gene and the rps12i346 intron are shared among Psilotales, Ophioglossales, and Equisetales, while the gain of a mitochondrial atp1 intron is shared between Marattiales and Polypodiopsida. These genomic structural changes support the placement of Equisetales as sister to Ophioglossales + Psilotales and Marattiales as sister to Polypodiopsida. This result is augmented by some molecular phylogenetic analyses that recover the same relationships, whereas others suggest a relationship between Equisetales and Polypodiopsida. Conclusions Although molecular analyses were inconsistent with respect to the position of Marattiales and Equisetales, several genomic structural changes have for the first time provided a clear placement of these lineages within the ferns. These results further demonstrate the power of using rare genomic structural changes in cases where molecular data fail to provide strong phylogenetic

  14. Structure and genome organization of AFV2, a novel archaeal lipothrixvirus with unusual terminal and core structures

    DEFF Research Database (Denmark)

    Häring, Monika; Vestergaard, Gisle Alberg; Brügger, Kim

    2005-01-01

    A novel filamentous virus, AFV2, from the hyperthermophilic archaeal genus Acidianus shows structural similarity to lipothrixviruses but differs from them in its unusual terminal and core structures. The double-stranded DNA genome contains 31,787 bp and carries eight open reading frames homologous...

  15. Comparative Annotation of Viral Genomes with Non-Conserved Gene Structure

    DEFF Research Database (Denmark)

    de Groot, Saskia; Mailund, Thomas; Hein, Jotun

    2007-01-01

    Motivation: Detecting genes in viral genomes is a complex task. Due to the biological necessity of them being constrained in length, RNA viruses in particular tend to code in overlapping reading frames. Since one amino acid is encoded by a triplet of nucleic acids, up to three genes may be coded...... allows for coding in unidirectional nested and overlapping reading frames, to annotate two homologous aligned viral genomes. Our method does not insist on conserved gene structure between the two sequences, thus making it applicable for the pairwise comparison of more distantly related sequences. Results...... and HIV2, as well as of two different Hepatitis Viruses, attaining results of ~87% sensitivity and ~98.5% specificity. We subsequently incorporate prior knowledge by "knowing" the gene structure of one sequence and annotating the other conditional on it. Boosting accuracy close to perfect we demonstrate...

  16. Deep transcriptome sequencing provides new insights into the structural and functional organization of the wheat genome.

    Science.gov (United States)

    Pingault, Lise; Choulet, Frédéric; Alberti, Adriana; Glover, Natasha; Wincker, Patrick; Feuillet, Catherine; Paux, Etienne

    2015-02-10

    Because of its size, allohexaploid nature, and high repeat content, the bread wheat genome is a good model to study the impact of the genome structure on gene organization, function, and regulation. However, because of the lack of a reference genome sequence, such studies have long been hampered and our knowledge of the wheat gene space is still limited. The access to the reference sequence of the wheat chromosome 3B provided us with an opportunity to study the wheat transcriptome and its relationships to genome and gene structure at a level that has never been reached before. By combining this sequence with RNA-seq data, we construct a fine transcriptome map of the chromosome 3B. More than 8,800 transcription sites are identified, that are distributed throughout the entire chromosome. Expression level, expression breadth, alternative splicing as well as several structural features of genes, including transcript length, number of exons, and cumulative intron length are investigated. Our analysis reveals a non-monotonic relationship between gene expression and structure and leads to the hypothesis that gene structure is determined by its function, whereas gene expression is subject to energetic cost. Moreover, we observe a recombination-based partitioning at the gene structure and function level. Our analysis provides new insights into the relationships between gene and genome structure and function. It reveals mechanisms conserved with other plant species as well as superimposed evolutionary forces that shaped the wheat gene space, likely participating in wheat adaptation.

  17. Imaging mass spectrometry and genome mining reveal highly antifungal virulence factor of mushroom soft rot pathogen.

    Science.gov (United States)

    Graupner, Katharina; Scherlach, Kirstin; Bretschneider, Tom; Lackner, Gerald; Roth, Martin; Gross, Harald; Hertweck, Christian

    2012-12-21

    Caught in the act: imaging mass spectrometry of a button mushroom infected with the soft rot pathogen Janthinobacterium agaricidamnosum in conjunction with genome mining revealed jagaricin as a highly antifungal virulence factor that is not produced under standard cultivation conditions. The structure of jagaricin was rigorously elucidated by a combination of physicochemical analyses, chemical derivatization, and bioinformatics. Copyright © 2012 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  18. Genomes in Turmoil: Frugality Drives Microbial Community Structure in Extremely Acidic Environments

    Science.gov (United States)

    Holmes, D. S.

    2016-12-01

    Extremely acidic environments (To gain insight into these issues, we have conducted deep bioinformatic analyses, including metabolic reconstruction of key assimilatory pathways, phylogenomics and network scrutiny of >160 genomes of acidophiles, including representatives from Archaea, Bacteria and Eukarya and at least ten metagenomes of acidic environments [Cardenas JP, et al. pp 179-197 in Acidophiles, eds R. Quatrini and D. B. Johnson, Caister Academic Press, UK (2016)]. Results yielded valuable insights into cellular processes, including carbon and nitrogen management and energy production, linking biogeochemical processes to organismal physiology. They also provided insight into the evolutionary forces that shape the genomic structure of members of acidophile communities. Niche partitioning can explain diversity patterns in rapidly changing acidic environments such as bioleaching heaps. However, in spatially and temporally homogeneous acidic environments genome flux appears to provide deeper insight into the composition and evolution of acidic consortia. Acidophiles have undergone genome streamlining by gene loss promoting mutual coexistence of species that exploit complementarity use of scarce resources consistent with the Black Queen hypothesis [Morris JJ et al. mBio 3: e00036-12 (2012)]. Acidophiles also have a large pool of accessory genes (the microbial super-genome) that can be accessed by horizontal gene transfer. This further promotes dependency relationships as drivers of community structure and the evolution of keystone species. Acknowledgements: Fondecyt 1130683; Basal CCTE PFB16

  19. Global MLST of Salmonella Typhi Revisited in Post-Genomic Era: Genetic conservation, Population Structure and Comparative genomics of rare sequence types

    Directory of Open Access Journals (Sweden)

    Kien-Pong eYap

    2016-03-01

    Full Text Available Typhoid fever, caused by Salmonella enterica serovar Typhi, remains an important public health burden in Southeast Asia and other endemic countries. Various genotyping methods have been applied to study the genetic variations of this human-restricted pathogen. Multilocus Sequence Typing (MLST is one of the widely accepted methods, and recently, there is a growing interest in the re-application of MLST in the post-genomic era. In this study, we provide the global MLST distribution of S. Typhi utilizing both publicly available 1,826 S. Typhi genome sequences in addition to performing conventional MLST on S. Typhi strains isolated from various endemic regions spanning over a century. Our global MLST analysis confirms the predominance of two sequence types (ST1 and ST2 co-existing in the endemic regions. Interestingly, S. Typhi strains with ST8 are currently confined within the African continent. Comparative genomic analyses of ST8 and other rare STs with genomes of ST1/ST2 revealed unique mutations in important virulence genes such as flhB, sipC and tviD that may explain the variations that differentiate between seemingly successful (widespread and unsuccessful (poor dissemination S. Typhi populations. Large scale whole-genome phylogeny demonstrated evidence of phylogeographical structuring and showed that ST8 may have diverged from the earlier ancestral population of ST1 and ST2, which later lost some of its fitness advantages, leading to poor worldwide dissemination. In response to the unprecedented increase in genomic data, this study demonstrates and highlights the utility of large-scale genome-based MLST as a quick and effective approach to narrow the scope of in-depth comparative genomic analysis and consequently provide new insights into the fine scale of pathogen evolution and population structure.

  20. Integrating sequencing technologies in personal genomics: optimal low cost reconstruction of structural variants.

    Directory of Open Access Journals (Sweden)

    Jiang Du

    2009-07-01

    Full Text Available The goal of human genome re-sequencing is obtaining an accurate assembly of an individual's genome. Recently, there has been great excitement in the development of many technologies for this (e.g. medium and short read sequencing from companies such as 454 and SOLiD, and high-density oligo-arrays from Affymetrix and NimbelGen, with even more expected to appear. The costs and sensitivities of these technologies differ considerably from each other. As an important goal of personal genomics is to reduce the cost of re-sequencing to an affordable point, it is worthwhile to consider optimally integrating technologies. Here, we build a simulation toolbox that will help us optimally combine different technologies for genome re-sequencing, especially in reconstructing large structural variants (SVs. SV reconstruction is considered the most challenging step in human genome re-sequencing. (It is sometimes even harder than de novo assembly of small genomes because of the duplications and repetitive sequences in the human genome. To this end, we formulate canonical problems that are representative of issues in reconstruction and are of small enough scale to be computationally tractable and simulatable. Using semi-realistic simulations, we show how we can combine different technologies to optimally solve the assembly at low cost. With mapability maps, our simulations efficiently handle the inhomogeneous repeat-containing structure of the human genome and the computational complexity of practical assembly algorithms. They quantitatively show how combining different read lengths is more cost-effective than using one length, how an optimal mixed sequencing strategy for reconstructing large novel SVs usually also gives accurate detection of SNPs/indels, how paired-end reads can improve reconstruction efficiency, and how adding in arrays is more efficient than just sequencing for disentangling some complex SVs. Our strategy should facilitate the sequencing of

  1. Structural elucidation and identification of a new derivative of phenethylamine using quadrupole time-of-flight mass spectrometry.

    Science.gov (United States)

    Sekuła, Karolina; Zuba, Dariusz

    2013-09-30

    In recent years, the phenomenon of uncontrolled distribution of new psychoactive substances that were marketed without prior toxicological studies has been observed. Because many designer drugs are related in chemical structure, the potential for misidentifying them is an important problem. It is therefore essential to develop an analytical procedure for unequivocal elucidation of the structures of these compounds. The issue has been discussed in the context of 25I-NBMD [2-(4-iodo-2,5-dimethoxyphenyl)-N-[(2,3-methylenedioxyphenyl)methyl]ethanamine], a psychoactive substance first discovered on the drug market in 2012. The substance was extracted from blotter papers with methanol. Separation was achieved via liquid chromatography. Analysis was conducted by electrospray ionization quadrupole time-of-flight mass spectrometry (ESI-QTOFMS). Identification of the psychoactive component was supported by electron impact gas chromatography/mass spectrometry (GC/EI-MS). The high accuracy of the LC/ESI-QTOFMS method allowed the molecular mass of the investigated substance (M(exp) = 441.0438 Da; mass error, ∆m = 0.2 ppm) and the formulae of ions formed during fragmentation to be determined. The main ions were recorded at m/z = 135.0440, 290.9876 and 305.9981. Structures of the obtained ions were elucidated in the tandem mass spectrometry (MS/MS) experiments by comparing them to mass spectra of previously detected derivatives of phenethylamine. The performed study indicated the potential for using LC/QTOFMS method to identify new designer drugs. This technique can be used supplementary to standard GC/MS. Prior knowledge of the fragmentation mechanisms of phenethylamines allowed to predict the mass spectra of the novel substance--25I-NBMD. Copyright © 2013 John Wiley & Sons, Ltd.

  2. Structural and sequence diversity of the transposon Galileo in the Drosophila willistoni genome.

    Science.gov (United States)

    Gonçalves, Juliana W; Valiati, Victor Hugo; Delprat, Alejandra; Valente, Vera L S; Ruiz, Alfredo

    2014-09-13

    Galileo is one of three members of the P superfamily of DNA transposons. It was originally discovered in Drosophila buzzatii, in which three segregating chromosomal inversions were shown to have been generated by ectopic recombination between Galileo copies. Subsequently, Galileo was identified in six of 12 sequenced Drosophila genomes, indicating its widespread distribution within this genus. Galileo is strikingly abundant in Drosophila willistoni, a neotropical species that is highly polymorphic for chromosomal inversions, suggesting a role for this transposon in the evolution of its genome. We carried out a detailed characterization of all Galileo copies present in the D. willistoni genome. A total of 191 copies, including 133 with two terminal inverted repeats (TIRs), were classified according to structure in six groups. The TIRs exhibited remarkable variation in their length and structure compared to the most complete copy. Three copies showed extended TIRs due to internal tandem repeats, the insertion of other transposable elements (TEs), or the incorporation of non-TIR sequences into the TIRs. Phylogenetic analyses of the transposase (TPase)-encoding and TIR segments yielded two divergent clades, which we termed Galileo subfamilies V and W. Target-site duplications (TSDs) in D. willistoni Galileo copies were 7- or 8-bp in length, with the consensus sequence GTATTAC. Analysis of the region around the TSDs revealed a target site motif (TSM) with a 15-bp palindrome that may give rise to a stem-loop secondary structure. There is a remarkable abundance and diversity of Galileo copies in the D. willistoni genome, although no functional copies were found. The TIRs in particular have a dynamic structure and extend in different ways, but their ends (required for transposition) are more conserved than the rest of the element. The D. willistoni genome harbors two Galileo subfamilies (V and W) that diverged ~9 million years ago and may have descended from an ancestral

  3. Chromatin structure and dynamics in hot environments: architectural proteins and DNA topoisomerases of thermophilic archaea.

    Science.gov (United States)

    Visone, Valeria; Vettone, Antonella; Serpe, Mario; Valenti, Anna; Perugino, Giuseppe; Rossi, Mosè; Ciaramella, Maria

    2014-09-25

    In all organisms of the three living domains (Bacteria, Archaea, Eucarya) chromosome-associated proteins play a key role in genome functional organization. They not only compact and shape the genome structure, but also regulate its dynamics, which is essential to allow complex genome functions. Elucidation of chromatin composition and regulation is a critical issue in biology, because of the intimate connection of chromatin with all the essential information processes (transcription, replication, recombination, and repair). Chromatin proteins include architectural proteins and DNA topoisomerases, which regulate genome structure and remodelling at two hierarchical levels. This review is focussed on architectural proteins and topoisomerases from hyperthermophilic Archaea. In these organisms, which live at high environmental temperature (>80 °C <113 °C), chromatin proteins and modulation of the DNA secondary structure are concerned with the problem of DNA stabilization against heat denaturation while maintaining its metabolic activity.

  4. Chromatin Structure and Dynamics in Hot Environments: Architectural Proteins and DNA Topoisomerases of Thermophilic Archaea

    Directory of Open Access Journals (Sweden)

    Valeria Visone

    2014-09-01

    Full Text Available In all organisms of the three living domains (Bacteria, Archaea, Eucarya chromosome-associated proteins play a key role in genome functional organization. They not only compact and shape the genome structure, but also regulate its dynamics, which is essential to allow complex genome functions. Elucidation of chromatin composition and regulation is a critical issue in biology, because of the intimate connection of chromatin with all the essential information processes (transcription, replication, recombination, and repair. Chromatin proteins include architectural proteins and DNA topoisomerases, which regulate genome structure and remodelling at two hierarchical levels. This review is focussed on architectural proteins and topoisomerases from hyperthermophilic Archaea. In these organisms, which live at high environmental temperature (>80 °C <113 °C, chromatin proteins and modulation of the DNA secondary structure are concerned with the problem of DNA stabilization against heat denaturation while maintaining its metabolic activity.

  5. Genetic variation of temperature-regulated curd induction in cauliflower: elucidation of floral transition by genome-wide association mapping and gene expression analysis

    Science.gov (United States)

    Matschegewski, Claudia; Zetzsche, Holger; Hasan, Yaser; Leibeguth, Lena; Briggs, William; Ordon, Frank; Uptmoor, Ralf

    2015-01-01

    Cauliflower (Brassica oleracea var. botrytis) is a vernalization-responsive crop. High ambient temperatures delay harvest time. The elucidation of the genetic regulation of floral transition is highly interesting for a precise harvest scheduling and to ensure stable market supply. This study aims at genetic dissection of temperature-dependent curd induction in cauliflower by genome-wide association studies and gene expression analysis. To assess temperature-dependent curd induction, two greenhouse trials under distinct temperature regimes were conducted on a diversity panel consisting of 111 cauliflower commercial parent lines, genotyped with 14,385 SNPs. Broad phenotypic variation and high heritability (0.93) were observed for temperature-related curd induction within the cauliflower population. GWA mapping identified a total of 18 QTL localized on chromosomes O1, O2, O3, O4, O6, O8, and O9 for curding time under two distinct temperature regimes. Among those, several QTL are localized within regions of promising candidate flowering genes. Inferring population structure and genetic relatedness among the diversity set assigned three main genetic clusters. Linkage disequilibrium (LD) patterns estimated global LD extent of r2 = 0.06 and a maximum physical distance of 400 kb for genetic linkage. Transcriptional profiling of flowering genes FLOWERING LOCUS C (BoFLC) and VERNALIZATION 2 (BoVRN2) was performed, showing increased expression levels of BoVRN2 in genotypes with faster curding. However, functional relevance of BoVRN2 and BoFLC2 could not consistently be supported, which probably suggests to act facultative and/or might evidence for BoVRN2/BoFLC-independent mechanisms in temperature-regulated floral transition in cauliflower. Genetic insights in temperature-regulated curd induction can underpin genetically informed phenology models and benefit molecular breeding strategies toward the development of thermo-tolerant cultivars. PMID:26442034

  6. New families of human regulatory RNA structures identified by comparative analysis of vertebrate genomes.

    Science.gov (United States)

    Parker, Brian J; Moltke, Ida; Roth, Adam; Washietl, Stefan; Wen, Jiayu; Kellis, Manolis; Breaker, Ronald; Pedersen, Jakob Skou

    2011-11-01

    Regulatory RNA structures are often members of families with multiple paralogous instances across the genome. Family members share functional and structural properties, which allow them to be studied as a whole, facilitating both bioinformatic and experimental characterization. We have developed a comparative method, EvoFam, for genome-wide identification of families of regulatory RNA structures, based on primary sequence and secondary structure similarity. We apply EvoFam to a 41-way genomic vertebrate alignment. Genome-wide, we identify 220 human, high-confidence families outside protein-coding regions comprising 725 individual structures, including 48 families with known structural RNA elements. Known families identified include both noncoding RNAs, e.g., miRNAs and the recently identified MALAT1/MEN β lincRNA family; and cis-regulatory structures, e.g., iron-responsive elements. We also identify tens of new families supported by strong evolutionary evidence and other statistical evidence, such as GO term enrichments. For some of these, detailed analysis has led to the formulation of specific functional hypotheses. Examples include two hypothesized auto-regulatory feedback mechanisms: one involving six long hairpins in the 3'-UTR of MAT2A, a key metabolic gene that produces the primary human methyl donor S-adenosylmethionine; the other involving a tRNA-like structure in the intron of the tRNA maturation gene POP1. We experimentally validate the predicted MAT2A structures. Finally, we identify potential new regulatory networks, including large families of short hairpins enriched in immunity-related genes, e.g., TNF, FOS, and CTLA4, which include known transcript destabilizing elements. Our findings exemplify the diversity of post-transcriptional regulation and provide a resource for further characterization of new regulatory mechanisms and families of noncoding RNAs.

  7. Considerations in the identification of functional RNA structural elements in genomic alignments

    Directory of Open Access Journals (Sweden)

    Blencowe Benjamin J

    2007-01-01

    Full Text Available Abstract Background Accurate identification of novel, functional noncoding (nc RNA features in genome sequence has proven more difficult than for exons. Current algorithms identify and score potential RNA secondary structures on the basis of thermodynamic stability, conservation, and/or covariance in sequence alignments. Neither the algorithms nor the information gained from the individual inputs have been independently assessed. Furthermore, due to issues in modelling background signal, it has been difficult to gauge the precision of these algorithms on a genomic scale, in which even a seemingly small false-positive rate can result in a vast excess of false discoveries. Results We developed a shuffling algorithm, shuffle-pair.pl, that simultaneously preserves dinucleotide frequency, gaps, and local conservation in pairwise sequence alignments. We used shuffle-pair.pl to assess precision and recall of six ncRNA search tools (MSARI, QRNA, ddbRNA, RNAz, Evofold, and several variants of simple thermodynamic stability on a test set of 3046 alignments of known ncRNAs. Relative to mononucleotide shuffling, preservation of dinucleotide content in shuffling the alignments resulted in a drastic increase in estimated false-positive detection rates for ncRNA elements, precluding evaluation of higher order alignments, which cannot not be adequately shuffled maintaining both dinucleotides and alignment structure. On pairwise alignments, none of the covariance-based tools performed markedly better than thermodynamic scoring alone. Although the high false-positive rates call into question the veracity of any individual predicted secondary structural element in our analysis, we nevertheless identified intriguing global trends in human genome alignments. The distribution of ncRNA prediction scores in 75-base windows overlapping UTRs, introns, and intergenic regions analyzed using both thermodynamic stability and EvoFold (which has no thermodynamic component was

  8. Isolation and Elucidation of 15-Acetylguanacone from Soursop ...

    African Journals Online (AJOL)

    ADOWIE PERE

    002/C. The structure of compound 200/A2 was elucidated using 1H-NMR spectroscopy, infrared spectroscopy .... Molecular Docking Experiments, Protein preparation ... ligand conformations in the binding sites through the .... Molecular dynamics simulation of diacetyl-guancone. Bioorgnic and Medicinal Chemistry 15: 4369-.

  9. Dynamical structure analysis of crystalline-state reaction and elucidation of chemical reactivity in crystalline environment

    International Nuclear Information System (INIS)

    Ohashi, Yuji

    2010-01-01

    It was found that a chiral alkyl group bonded to the cobalt atom in a cobalt complex crystal was racemized with retention of the single crystal form on exposure to visible light. Such reactions, which are called crystalline-state reactions, have been found in a variety of cobalt complex crystals. The concept of reaction cavity was introduced to explain the reaction rate quantitatively and the chirality of the photo-product. The new diffractometers and detectors were made for rapid data collection. The reaction mechanism was also elucidated using neutron diffraction analysis. The unstable reaction intermediates were analyzed using cryo-trapping method. The excited-state structures were obtained at the equilibrium state between ground and excited states. (author)

  10. De novo prediction of human chromosome structures: Epigenetic marking patterns encode genome architecture.

    Science.gov (United States)

    Di Pierro, Michele; Cheng, Ryan R; Lieberman Aiden, Erez; Wolynes, Peter G; Onuchic, José N

    2017-11-14

    Inside the cell nucleus, genomes fold into organized structures that are characteristic of cell type. Here, we show that this chromatin architecture can be predicted de novo using epigenetic data derived from chromatin immunoprecipitation-sequencing (ChIP-Seq). We exploit the idea that chromosomes encode a 1D sequence of chromatin structural types. Interactions between these chromatin types determine the 3D structural ensemble of chromosomes through a process similar to phase separation. First, a neural network is used to infer the relation between the epigenetic marks present at a locus, as assayed by ChIP-Seq, and the genomic compartment in which those loci reside, as measured by DNA-DNA proximity ligation (Hi-C). Next, types inferred from this neural network are used as an input to an energy landscape model for chromatin organization [Minimal Chromatin Model (MiChroM)] to generate an ensemble of 3D chromosome conformations at a resolution of 50 kilobases (kb). After training the model, dubbed Maximum Entropy Genomic Annotation from Biomarkers Associated to Structural Ensembles (MEGABASE), on odd-numbered chromosomes, we predict the sequences of chromatin types and the subsequent 3D conformational ensembles for the even chromosomes. We validate these structural ensembles by using ChIP-Seq tracks alone to predict Hi-C maps, as well as distances measured using 3D fluorescence in situ hybridization (FISH) experiments. Both sets of experiments support the hypothesis of phase separation being the driving process behind compartmentalization. These findings strongly suggest that epigenetic marking patterns encode sufficient information to determine the global architecture of chromosomes and that de novo structure prediction for whole genomes may be increasingly possible. Copyright © 2017 the Author(s). Published by PNAS.

  11. Genome-wide association studies in Africans and African Americans: Expanding the Framework of the Genomics of Human Traits and Disease

    Science.gov (United States)

    Peprah, Emmanuel; Xu, Huichun; Tekola-Ayele, Fasil; Royal, Charmaine D.

    2014-01-01

    Genomic research is one of the tools for elucidating the pathogenesis of diseases of global health relevance, and paving the research dimension to clinical and public health translation. Recent advances in genomic research and technologies have increased our understanding of human diseases, genes associated with these disorders, and the relevant mechanisms. Genome-wide association studies (GWAS) have proliferated since the first studies were published several years ago, and have become an important tool in helping researchers comprehend human variation and the role genetic variants play in disease. However, the need to expand the diversity of populations in GWAS has become increasingly apparent as new knowledge is gained about genetic variation. Inclusion of diverse populations in genomic studies is critical to a more complete understanding of human variation and elucidation of the underpinnings of complex diseases. In this review, we summarize the available data on GWAS in recent-African ancestry populations within the western hemisphere (i.e. African Americans and peoples of the Caribbean) and continental African populations. Furthermore, we highlight ways in which genomic studies in populations of recent African ancestry have led to advances in the areas of malaria, HIV, prostate cancer, and other diseases. Finally, we discuss the advantages of conducting GWAS in recent African ancestry populations in the context of addressing existing and emerging global health conditions. PMID:25427668

  12. Genome-wide association studies in Africans and African Americans: expanding the framework of the genomics of human traits and disease.

    Science.gov (United States)

    Peprah, Emmanuel; Xu, Huichun; Tekola-Ayele, Fasil; Royal, Charmaine D

    2015-01-01

    Genomic research is one of the tools for elucidating the pathogenesis of diseases of global health relevance and paving the research dimension to clinical and public health translation. Recent advances in genomic research and technologies have increased our understanding of human diseases, genes associated with these disorders, and the relevant mechanisms. Genome-wide association studies (GWAS) have proliferated since the first studies were published several years ago and have become an important tool in helping researchers comprehend human variation and the role genetic variants play in disease. However, the need to expand the diversity of populations in GWAS has become increasingly apparent as new knowledge is gained about genetic variation. Inclusion of diverse populations in genomic studies is critical to a more complete understanding of human variation and elucidation of the underpinnings of complex diseases. In this review, we summarize the available data on GWAS in recent African ancestry populations within the western hemisphere (i.e. African Americans and peoples of the Caribbean) and continental African populations. Furthermore, we highlight ways in which genomic studies in populations of recent African ancestry have led to advances in the areas of malaria, HIV, prostate cancer, and other diseases. Finally, we discuss the advantages of conducting GWAS in recent African ancestry populations in the context of addressing existing and emerging global health conditions.

  13. Deeper insight into the structure of the anaerobic digestion microbial community; the biogas microbiome database is expanded with 157 new genomes

    DEFF Research Database (Denmark)

    Treu, Laura; Kougias, Panagiotis; Campanaro, Stefano

    2016-01-01

    strategy resulted in the highest, up to now, extraction of microbial genomes involved in biogas producing systems. From the 236 extracted genome bins, it was remarkably found that the vast majority of them could only be characterized at high taxonomic levels. This result confirms that the biogas microbiome......This research aimed to better characterize the biogas microbiome by means of high throughput metagenomic sequencing and to elucidate the core microbial consortium existing in biogas reactors independently from the operational conditions. Assembly of shotgun reads followed by an established binning...... is comprised by a consortium of unknown species. A comparative analysis between the genome bins of the current study and those extracted from a previous metagenomic assembly demonstrated a similar phylogenetic distribution of the main taxa. Finally, this analysis led to the identification of a subset of common...

  14. Genome sequencing elucidates Sardinian genetic architecture and augments association analyses for lipid and blood inflammatory markers

    Science.gov (United States)

    Zoledziewska, Magdalena; Mulas, Antonella; Pistis, Giorgio; Steri, Maristella; Danjou, Fabrice; Kwong, Alan; Ortega del Vecchyo, Vicente Diego; Chiang, Charleston W. K.; Bragg-Gresham, Jennifer; Pitzalis, Maristella; Nagaraja, Ramaiah; Tarrier, Brendan; Brennan, Christine; Uzzau, Sergio; Fuchsberger, Christian; Atzeni, Rossano; Reinier, Frederic; Berutti, Riccardo; Huang, Jie; Timpson, Nicholas J; Toniolo, Daniela; Gasparini, Paolo; Malerba, Giovanni; Dedoussis, George; Zeggini, Eleftheria; Soranzo, Nicole; Jones, Chris; Lyons, Robert; Angius, Andrea; Kang, Hyun M.; Novembre, John; Sanna, Serena; Schlessinger, David; Cucca, Francesco; Abecasis, Gonçalo R

    2015-01-01

    We report ~17.6M genetic variants from whole-genome sequencing of 2,120 Sardinians; 22% are absent from prior sequencing-based compilations and enriched for predicted functional consequence. Furthermore, ~76K variants common in our sample (frequency >5%) are rare elsewhere (Genomes Project). We assessed the impact of these variants on circulating lipid levels and five inflammatory biomarkers. Fourteen signals, including two major new loci, were observed for lipid levels, and 19, including two novel loci, for inflammatory markers. New associations would be missed in analyses based on 1000 Genomes data, underlining the advantages of large-scale sequencing in this founder population. PMID:26366554

  15. Cytotoxic 1,3-Thiazole and 1,2,4-Thiadiazole Alkaloids from Penicillium oxalicum: Structural Elucidation and Total Synthesis

    Directory of Open Access Journals (Sweden)

    Zheng Yang

    2016-02-01

    Full Text Available Two new thiazole and thiadiazole alkaloids, penicilliumthiamine A and B (2 and 3, were isolated from the culture broth of Penicillium oxalicum, a fungus found in Acrida cinerea. Their structures were elucidated mainly by spectroscopic analysis, total synthesis and X-ray crystallographic analysis. Biological evaluations indicated that compound 1, 3a and 3 exhibit potent cytotoxicity against different cancer cell lines through inhibiting the phosphorylation of AKT/PKB (Ser 473, one of important cancer drugs target.

  16. Grass genomes

    OpenAIRE

    Bennetzen, Jeffrey L.; SanMiguel, Phillip; Chen, Mingsheng; Tikhonov, Alexander; Francki, Michael; Avramova, Zoya

    1998-01-01

    For the most part, studies of grass genome structure have been limited to the generation of whole-genome genetic maps or the fine structure and sequence analysis of single genes or gene clusters. We have investigated large contiguous segments of the genomes of maize, sorghum, and rice, primarily focusing on intergenic spaces. Our data indicate that much (>50%) of the maize genome is composed of interspersed repetitive DNAs, primarily nested retrotransposons that in...

  17. Local chromatin structure of heterochromatin regulates repeated DNA stability, nucleolus structure, and genome integrity

    Energy Technology Data Exchange (ETDEWEB)

    Peng, Jamy C. [Univ. of California, Berkeley, CA (United States)

    2007-01-01

    Heterochromatin constitutes a significant portion of the genome in higher eukaryotes; approximately 30% in Drosophila and human. Heterochromatin contains a high repeat DNA content and a low density of protein-encoding genes. In contrast, euchromatin is composed mostly of unique sequences and contains the majority of single-copy genes. Genetic and cytological studies demonstrated that heterochromatin exhibits regulatory roles in chromosome organization, centromere function and telomere protection. As an epigenetically regulated structure, heterochromatin formation is not defined by any DNA sequence consensus. Heterochromatin is characterized by its association with nucleosomes containing methylated-lysine 9 of histone H3 (H3K9me), heterochromatin protein 1 (HP1) that binds H3K9me, and Su(var)3-9, which methylates H3K9 and binds HP1. Heterochromatin formation and functions are influenced by HP1, Su(var)3-9, and the RNA interference (RNAi) pathway. My thesis project investigates how heterochromatin formation and function impact nuclear architecture, repeated DNA organization, and genome stability in Drosophila melanogaster. H3K9me-based chromatin reduces extrachromosomal DNA formation; most likely by restricting the access of repair machineries to repeated DNAs. Reducing extrachromosomal ribosomal DNA stabilizes rDNA repeats and the nucleolus structure. H3K9me-based chromatin also inhibits DNA damage in heterochromatin. Cells with compromised heterochromatin structure, due to Su(var)3-9 or dcr-2 (a component of the RNAi pathway) mutations, display severe DNA damage in heterochromatin compared to wild type. In these mutant cells, accumulated DNA damage leads to chromosomal defects such as translocations, defective DNA repair response, and activation of the G2-M DNA repair and mitotic checkpoints that ensure cellular and animal viability. My thesis research suggests that DNA replication, repair, and recombination mechanisms in heterochromatin differ from those in

  18. Whole-Genome Analysis of a Novel Fish Reovirus (MsReV Discloses Aquareovirus Genomic Structure Relationship with Host in Saline Environments

    Directory of Open Access Journals (Sweden)

    Zhong-Yuan Chen

    2015-08-01

    Full Text Available Aquareoviruses are serious pathogens of aquatic animals. Here, genome characterization and functional gene analysis of a novel aquareovirus, largemouth bass Micropterus salmoides reovirus (MsReV, was described. It comprises 11 dsRNA segments (S1–S11 covering 24,024 bp, and encodes 12 putative proteins including the inclusion forming-related protein NS87 and the fusion-associated small transmembrane (FAST protein NS22. The function of NS22 was confirmed by expression in fish cells. Subsequently, MsReV was compared with two representative aquareoviruses, saltwater fish turbot Scophthalmus maximus reovirus (SMReV and freshwater fish grass carp reovirus strain 109 (GCReV-109. MsReV NS87 and NS22 genes have the same structure and function with those of SMReV, whereas GCReV-109 is either missing the coiled-coil region in NS79 or the gene-encoding NS22. Significant similarities are also revealed among equivalent genome segments between MsReV and SMReV, but a difference is found between MsReV and GCReV-109. Furthermore, phylogenetic analysis showed that 13 aquareoviruses could be divided into freshwater and saline environments subgroups, and MsReV was closely related to SMReV in saline environments. Consequently, these viruses from hosts in saline environments have more genomic structural similarities than the viruses from hosts in freshwater. This is the first study of the relationships between aquareovirus genomic structure and their host environments.

  19. Complete Chloroplast Genome of the Wollemi Pine (Wollemia nobilis): Structure and Evolution.

    Science.gov (United States)

    Yap, Jia-Yee S; Rohner, Thore; Greenfield, Abigail; Van Der Merwe, Marlien; McPherson, Hannah; Glenn, Wendy; Kornfeld, Geoff; Marendy, Elessa; Pan, Annie Y H; Wilton, Alan; Wilkins, Marc R; Rossetto, Maurizio; Delaney, Sven K

    2015-01-01

    The Wollemi pine (Wollemia nobilis) is a rare Southern conifer with striking morphological similarity to fossil pines. A small population of W. nobilis was discovered in 1994 in a remote canyon system in the Wollemi National Park (near Sydney, Australia). This population contains fewer than 100 individuals and is critically endangered. Previous genetic studies of the Wollemi pine have investigated its evolutionary relationship with other pines in the family Araucariaceae, and have suggested that the Wollemi pine genome contains little or no variation. However, these studies were performed prior to the widespread use of genome sequencing, and their conclusions were based on a limited fraction of the Wollemi pine genome. In this study, we address this problem by determining the entire sequence of the W. nobilis chloroplast genome. A detailed analysis of the structure of the genome is presented, and the evolution of the genome is inferred by comparison with the chloroplast sequences of other members of the Araucariaceae and the related family Podocarpaceae. Pairwise alignments of whole genome sequences, and the presence of unique pseudogenes, gene duplications and insertions in W. nobilis and Araucariaceae, indicate that the W. nobilis chloroplast genome is most similar to that of its sister taxon Agathis. However, the W. nobilis genome contains an unusually high number of repetitive sequences, and these could be used in future studies to investigate and conserve any remnant genetic diversity in the Wollemi pine.

  20. DNA is structured as a linear "jigsaw puzzle" in the genomes of Arabidopsis, rice, and budding yeast.

    Science.gov (United States)

    Liu, Yun-Hua; Zhang, Meiping; Wu, Chengcang; Huang, James J; Zhang, Hong-Bin

    2014-01-01

    Knowledge of how a genome is structured and organized from its constituent elements is crucial to understanding its biology and evolution. Here, we report the genome structuring and organization pattern as revealed by systems analysis of the sequences of three model species, Arabidopsis, rice and yeast, at the whole-genome and chromosome levels. We found that all fundamental function elements (FFE) constituting the genomes, including genes (GEN), DNA transposable elements (DTE), retrotransposable elements (RTE), simple sequence repeats (SSR), and (or) low complexity repeats (LCR), are structured in a nonrandom and correlative manner, thus leading to a hypothesis that the DNA of the species is structured as a linear "jigsaw puzzle". Furthermore, we showed that different FFE differ in their importance in the formation and evolution of the DNA jigsaw puzzle structure between species. DTE and RTE play more important roles than GEN, LCR, and SSR in Arabidopsis, whereas GEN and RTE play more important roles than LCR, SSR, and DTE in rice. The genes having multiple recognized functions play more important roles than those having single functions. These results provide useful knowledge necessary for better understanding genome biology and evolution of the species and for effective molecular breeding of rice.

  1. Comparative Genome Structure, Secondary Metabolite, and Effector Coding Capacity across Cochliobolus Pathogens

    Energy Technology Data Exchange (ETDEWEB)

    Condon, Bradford J.; Leng, Yueqiang; Wu, Dongliang; Bushley, Kathryn E.; Ohm, Robin A.; Otillar, Robert; Martin, Joel; Schackwitz, Wendy; Grimwood, Jane; MohdZainudin, NurAinlzzati; Xue, Chunsheng; Wang, Rui; Manning, Viola A.; Dhillon, Braham; Tu, Zheng Jin; Steffenson, Brian J.; Salamov, Asaf; Sun, Hui; Lowry, Steve; LaButti, Kurt; Han, James; Copeland, Alex; Lindquist, Erika; Barry, Kerrie; Schmutz, Jeremy; Baker, Scott E.; Ciuffetti, Lynda M.; Grigoriev, Igor V.; Zhong, Shaobin; Turgeon, B. Gillian

    2013-01-24

    The genomes of five Cochliobolus heterostrophus strains, two Cochliobolus sativus strains, three additional Cochliobolus species (Cochliobolus victoriae, Cochliobolus carbonum, Cochliobolus miyabeanus), and closely related Setosphaeria turcica were sequenced at the Joint Genome Institute (JGI). The datasets were used to identify SNPs between strains and species, unique genomic regions, core secondary metabolism genes, and small secreted protein (SSP) candidate effector encoding genes with a view towards pinpointing structural elements and gene content associated with specificity of these closely related fungi to different cereal hosts. Whole-genome alignment shows that three to five of each genome differs between strains of the same species, while a quarter of each genome differs between species. On average, SNP counts among field isolates of the same C. heterostrophus species are more than 25 higher than those between inbred lines and 50 lower than SNPs between Cochliobolus species. The suites of nonribosomal peptide synthetase (NRPS), polyketide synthase (PKS), and SSP encoding genes are astoundingly diverse among species but remarkably conserved among isolates of the same species, whether inbred or field strains, except for defining examples that map to unique genomic regions. Functional analysis of several strain-unique PKSs and NRPSs reveal a strong correlation with a role in virulence.

  2. Structural and functional analysis of the finished genome of the recently isolated toxic Anabaena sp. WA102.

    Science.gov (United States)

    Brown, Nathan M; Mueller, Ryan S; Shepardson, Jonathan W; Landry, Zachary C; Morré, Jeffrey T; Maier, Claudia S; Hardy, F Joan; Dreher, Theo W

    2016-06-13

    Very few closed genomes of the cyanobacteria that commonly produce toxic blooms in lakes and reservoirs are available, limiting our understanding of the properties of these organisms. A new anatoxin-a-producing member of the Nostocaceae, Anabaena sp. WA102, was isolated from a freshwater lake in Washington State, USA, in 2013 and maintained in non-axenic culture. The Anabaena sp. WA102 5.7 Mbp genome assembly has been closed with long-read, single-molecule sequencing and separately a draft genome assembly has been produced with short-read sequencing technology. The closed and draft genome assemblies are compared, showing a correlation between long repeats in the genome and the many gaps in the short-read assembly. Anabaena sp. WA102 encodes anatoxin-a biosynthetic genes, as does its close relative Anabaena sp. AL93 (also introduced in this study). These strains are distinguished by differences in the genes for light-harvesting phycobilins, with Anabaena sp. AL93 possessing a phycoerythrocyanin operon. Biologically relevant structural variants in the Anabaena sp. WA102 genome were detected only by long-read sequencing: a tandem triplication of the anaBCD promoter region in the anatoxin-a synthase gene cluster (not triplicated in Anabaena sp. AL93) and a 5-kbp deletion variant present in two-thirds of the population. The genome has a large number of mobile elements (160). Strikingly, there was no synteny with the genome of its nearest fully assembled relative, Anabaena sp. 90. Structural and functional genome analyses indicate that Anabaena sp. WA102 has a flexible genome. Genome closure, which can be readily achieved with long-read sequencing, reveals large scale (e.g., gene order) and local structural features that should be considered in understanding genome evolution and function.

  3. SL1 revisited: functional analysis of the structure and conformation of HIV-1 genome RNA.

    Science.gov (United States)

    Sakuragi, Sayuri; Yokoyama, Masaru; Shioda, Tatsuo; Sato, Hironori; Sakuragi, Jun-Ichi

    2016-11-11

    The dimer initiation site/dimer linkage sequence (DIS/DLS) region of HIV is located on the 5' end of the viral genome and suggested to form complex secondary/tertiary structures. Within this structure, stem-loop 1 (SL1) is believed to be most important and an essential key to dimerization, since the sequence and predicted secondary structure of SL1 are highly stable and conserved among various virus subtypes. In particular, a six-base palindromic sequence is always present at the hairpin loop of SL1 and the formation of kissing-loop structure at this position between the two strands of genomic RNA is suggested to trigger dimerization. Although the higher-order structure model of SL1 is well accepted and perhaps even undoubted lately, there could be stillroom for consideration to depict the functional SL1 structure while in vivo (in virion or cell). In this study, we performed several analyses to identify the nucleotides and/or basepairing within SL1 which are necessary for HIV-1 genome dimerization, encapsidation, recombination and infectivity. We unexpectedly found that some nucleotides that are believed to contribute the formation of the stem do not impact dimerization or infectivity. On the other hand, we found that one G-C basepair involved in stem formation may serve as an alternative dimer interactive site. We also report on our further investigation of the roles of the palindromic sequences on viral replication. Collectively, we aim to assemble a more-comprehensive functional map of SL1 on the HIV-1 viral life cycle. We discovered several possibilities for a novel structure of SL1 in HIV-1 DLS. The newly proposed structure model suggested that the hairpin loop of SL1 appeared larger, and genome dimerization process might consist of more complicated mechanism than previously understood. Further investigations would be still required to fully understand the genome packaging and dimerization of HIV.

  4. MED: a new non-supervised gene prediction algorithm for bacterial and archaeal genomes

    Directory of Open Access Journals (Sweden)

    Yang Yi-Fan

    2007-03-01

    Full Text Available Abstract Background Despite a remarkable success in the computational prediction of genes in Bacteria and Archaea, a lack of comprehensive understanding of prokaryotic gene structures prevents from further elucidation of differences among genomes. It continues to be interesting to develop new ab initio algorithms which not only accurately predict genes, but also facilitate comparative studies of prokaryotic genomes. Results This paper describes a new prokaryotic genefinding algorithm based on a comprehensive statistical model of protein coding Open Reading Frames (ORFs and Translation Initiation Sites (TISs. The former is based on a linguistic "Entropy Density Profile" (EDP model of coding DNA sequence and the latter comprises several relevant features related to the translation initiation. They are combined to form a so-called Multivariate Entropy Distance (MED algorithm, MED 2.0, that incorporates several strategies in the iterative program. The iterations enable us to develop a non-supervised learning process and to obtain a set of genome-specific parameters for the gene structure, before making the prediction of genes. Conclusion Results of extensive tests show that MED 2.0 achieves a competitive high performance in the gene prediction for both 5' and 3' end matches, compared to the current best prokaryotic gene finders. The advantage of the MED 2.0 is particularly evident for GC-rich genomes and archaeal genomes. Furthermore, the genome-specific parameters given by MED 2.0 match with the current understanding of prokaryotic genomes and may serve as tools for comparative genomic studies. In particular, MED 2.0 is shown to reveal divergent translation initiation mechanisms in archaeal genomes while making a more accurate prediction of TISs compared to the existing gene finders and the current GenBank annotation.

  5. Genomic view of bipolar disorder revealed by whole genome sequencing in a genetic isolate.

    Directory of Open Access Journals (Sweden)

    Benjamin Georgi

    2014-03-01

    Full Text Available Bipolar disorder is a common, heritable mental illness characterized by recurrent episodes of mania and depression. Despite considerable effort to elucidate the genetic underpinnings of bipolar disorder, causative genetic risk factors remain elusive. We conducted a comprehensive genomic analysis of bipolar disorder in a large Old Order Amish pedigree. Microsatellite genotypes and high-density SNP-array genotypes of 388 family members were combined with whole genome sequence data for 50 of these subjects, comprising 18 parent-child trios. This study design permitted evaluation of candidate variants within the context of haplotype structure by resolving the phase in sequenced parent-child trios and by imputation of variants into multiple unsequenced siblings. Non-parametric and parametric linkage analysis of the entire pedigree as well as on smaller clusters of families identified several nominally significant linkage peaks, each of which included dozens of predicted deleterious variants. Close inspection of exonic and regulatory variants in genes under the linkage peaks using family-based association tests revealed additional credible candidate genes for functional studies and further replication in population-based cohorts. However, despite the in-depth genomic characterization of this unique, large and multigenerational pedigree from a genetic isolate, there was no convergence of evidence implicating a particular set of risk loci or common pathways. The striking haplotype and locus heterogeneity we observed has profound implications for the design of studies of bipolar and other related disorders.

  6. Genomic View of Bipolar Disorder Revealed by Whole Genome Sequencing in a Genetic Isolate

    Science.gov (United States)

    Georgi, Benjamin; Craig, David; Kember, Rachel L.; Liu, Wencheng; Lindquist, Ingrid; Nasser, Sara; Brown, Christopher; Egeland, Janice A.; Paul, Steven M.; Bućan, Maja

    2014-01-01

    Bipolar disorder is a common, heritable mental illness characterized by recurrent episodes of mania and depression. Despite considerable effort to elucidate the genetic underpinnings of bipolar disorder, causative genetic risk factors remain elusive. We conducted a comprehensive genomic analysis of bipolar disorder in a large Old Order Amish pedigree. Microsatellite genotypes and high-density SNP-array genotypes of 388 family members were combined with whole genome sequence data for 50 of these subjects, comprising 18 parent-child trios. This study design permitted evaluation of candidate variants within the context of haplotype structure by resolving the phase in sequenced parent-child trios and by imputation of variants into multiple unsequenced siblings. Non-parametric and parametric linkage analysis of the entire pedigree as well as on smaller clusters of families identified several nominally significant linkage peaks, each of which included dozens of predicted deleterious variants. Close inspection of exonic and regulatory variants in genes under the linkage peaks using family-based association tests revealed additional credible candidate genes for functional studies and further replication in population-based cohorts. However, despite the in-depth genomic characterization of this unique, large and multigenerational pedigree from a genetic isolate, there was no convergence of evidence implicating a particular set of risk loci or common pathways. The striking haplotype and locus heterogeneity we observed has profound implications for the design of studies of bipolar and other related disorders. PMID:24625924

  7. RNA structural constraints in the evolution of the influenza A virus genome NP segment

    NARCIS (Netherlands)

    A.P. Gultyaev (Alexander); A. Tsyganov-Bodounov (Anton); M.I. Spronken (Monique); S. Van Der Kooij (Sander); R.A.M. Fouchier (Ron); R.C.L. Olsthoorn (René)

    2014-01-01

    textabstractConserved RNA secondary structures were predicted in the nucleoprotein (NP) segment of the influenza A virus genome using comparative sequence and structure analysis. A number of structural elements exhibiting nucleotide covariations were identified over the whole segment length,

  8. Structure elucidation of a flavonoid glycoside from the roots of Clerodendrum serratum (L. Moon, Lamiaceae

    Directory of Open Access Journals (Sweden)

    S. S. Bhujbal

    2010-12-01

    Full Text Available Apigenin-7-glucoside, C21H20O10 (7-(β-D-glucopyranosyloxy-5-hydroxy-2-(4-hydroxyphenyl-4H-1-benzopyran-4-one, was first time isolated from the roots of Clerodendrum serratum (L. Moon, Lamiaceae. Structure elucidation of the compound was carried out by ¹H NMR and FAB-MS studies.Apigenin-7-glucosídeo, C21H20O10 (7-(β-D-glucopiranosiloxi-5-hidroxi-2-(4-hidroxifenil-4H-1-benzopiran-4-ona, foi isolado pela primeira vez das raízes de Clerodendrum serratum (L. Moon, Lamiaceae. A elucidação estrutural da susbtância foi feita através de estudos de ¹H NMR e FAB-MS.

  9. 16th Carbonyl Metabolism Meeting: from enzymology to genomics

    Directory of Open Access Journals (Sweden)

    Maser Edmund

    2012-12-01

    Full Text Available Abstract The 16th International Meeting on the Enzymology and Molecular Biology of Carbonyl Metabolism, Castle of Ploen (Schleswig-Holstein, Germany, July 10–15, 2012, covered all aspects of NAD(P-dependent oxido-reductases that are involved in the general metabolism of xenobiotic and physiological carbonyl compounds. Starting 30 years ago with enzyme purification, structure elucidation and enzyme kinetics, the Carbonyl Society members have meanwhile established internationally recognized enzyme nomenclature systems and now consider aspects of enzyme genomics and enzyme evolution along with their roles in diseases. The 16th international meeting included lectures from international speakers from all over the world.

  10. GenRGenS: Software for Generating Random Genomic Sequences and Structures

    OpenAIRE

    Ponty , Yann; Termier , Michel; Denise , Alain

    2006-01-01

    International audience; GenRGenS is a software tool dedicated to randomly generating genomic sequences and structures. It handles several classes of models useful for sequence analysis, such as Markov chains, hidden Markov models, weighted context-free grammars, regular expressions and PROSITE expressions. GenRGenS is the only program that can handle weighted context-free grammars, thus allowing the user to model and to generate structured objects (such as RNA secondary structures) of any giv...

  11. Leishmania naiffi and Leishmania guyanensis reference genomes highlight genome structure and gene evolution in the Viannia subgenus.

    Science.gov (United States)

    Coughlan, Simone; Taylor, Ali Shirley; Feane, Eoghan; Sanders, Mandy; Schonian, Gabriele; Cotton, James A; Downing, Tim

    2018-04-01

    The unicellular protozoan parasite Leishmania causes the neglected tropical disease leishmaniasis, affecting 12 million people in 98 countries. In South America, where the Viannia subgenus predominates, so far only L. ( Viannia ) braziliensis and L. ( V. ) panamensis have been sequenced, assembled and annotated as reference genomes. Addressing this deficit in molecular information can inform species typing, epidemiological monitoring and clinical treatment. Here, L. ( V. ) naiffi and L. ( V. ) guyanensis genomic DNA was sequenced to assemble these two genomes as draft references from short sequence reads. The methods used were tested using short sequence reads for L. braziliensis M2904 against its published reference as a comparison. This assembly and annotation pipeline identified 70 additional genes not annotated on the original M2904 reference. Phylogenetic and evolutionary comparisons of L. guyanensis and L. naiffi with 10 other Viannia genomes revealed four traits common to all Viannia : aneuploidy, 22 orthologous groups of genes absent in other Leishmania subgenera, elevated TATE transposon copies and a high NADH-dependent fumarate reductase gene copy number. Within the Viannia , there were limited structural changes in genome architecture specific to individual species: a 45 Kb amplification on chromosome 34 was present in all bar L. lainsoni , L. naiffi had a higher copy number of the virulence factor leishmanolysin, and laboratory isolate L. shawi M8408 had a possible minichromosome derived from the 3' end of chromosome 34 . This combination of genome assembly, phylogenetics and comparative analysis across an extended panel of diverse Viannia has uncovered new insights into the origin and evolution of this subgenus and can help improve diagnostics for leishmaniasis surveillance.

  12. Genomics of Preterm Birth

    Science.gov (United States)

    Swaggart, Kayleigh A.; Pavlicev, Mihaela; Muglia, Louis J.

    2015-01-01

    The molecular mechanisms controlling human birth timing at term, or resulting in preterm birth, have been the focus of considerable investigation, but limited insights have been gained over the past 50 years. In part, these processes have remained elusive because of divergence in reproductive strategies and physiology shown by model organisms, making extrapolation to humans uncertain. Here, we summarize the evolution of progesterone signaling and variation in pregnancy maintenance and termination. We use this comparative physiology to support the hypothesis that selective pressure on genomic loci involved in the timing of parturition have shaped human birth timing, and that these loci can be identified with comparative genomic strategies. Previous limitations imposed by divergence of mechanisms provide an important new opportunity to elucidate fundamental pathways of parturition control through increasing availability of sequenced genomes and associated reproductive physiology characteristics across diverse organisms. PMID:25646385

  13. The genome and structural proteome of an ocean siphovirus: a new window into the cyanobacterial 'mobilome'.

    Science.gov (United States)

    Sullivan, Matthew B; Krastins, Bryan; Hughes, Jennifer L; Kelly, Libusha; Chase, Michael; Sarracino, David; Chisholm, Sallie W

    2009-11-01

    Prochlorococcus, an abundant phototroph in the oceans, are infected by members of three families of viruses: myo-, podo- and siphoviruses. Genomes of myo- and podoviruses isolated on Prochlorococcus contain DNA replication machinery and virion structural genes homologous to those from coliphages T4 and T7 respectively. They also contain a suite of genes of cyanobacterial origin, most notably photosynthesis genes, which are expressed during infection and appear integral to the evolutionary trajectory of both host and phage. Here we present the first genome of a cyanobacterial siphovirus, P-SS2, which was isolated from Atlantic slope waters using a Prochlorococcus host (MIT9313). The P-SS2 genome is larger than, and considerably divergent from, previously sequenced siphoviruses. It appears most closely related to lambdoid siphoviruses, with which it shares 13 functional homologues. The approximately 108 kb P-SS2 genome encodes 131 predicted proteins and notably lacks photosynthesis genes which have consistently been found in other marine cyanophage, but does contain 14 other cyanobacterial homologues. While only six structural proteins were identified from the genome sequence, 35 proteins were detected experimentally; these mapped onto capsid and tail structural modules in the genome. P-SS2 is potentially capable of integration into its host as inferred from bioinformatically identified genetic machinery int, bet, exo and a 53 bp attachment site. The host attachment site appears to be a genomic island that is tied to insertion sequence (IS) activity that could facilitate mobility of a gene involved in the nitrogen-stress response. The homologous region and a secondary IS-element hot-spot in Synechococcus RS9917 are further evidence of IS-mediated genome evolution coincident with a probable relic prophage integration event. This siphovirus genome provides a glimpse into the biology of a deep-photic zone phage as well as the ocean cyanobacterial prophage and IS element

  14. Evidence of pervasive biologically functional secondary structures within the genomes of eukaryotic single-stranded DNA viruses.

    Science.gov (United States)

    Muhire, Brejnev Muhizi; Golden, Michael; Murrell, Ben; Lefeuvre, Pierre; Lett, Jean-Michel; Gray, Alistair; Poon, Art Y F; Ngandu, Nobubelo Kwanele; Semegni, Yves; Tanov, Emil Pavlov; Monjane, Adérito Luis; Harkins, Gordon William; Varsani, Arvind; Shepherd, Dionne Natalie; Martin, Darren Patrick

    2014-02-01

    Single-stranded DNA (ssDNA) viruses have genomes that are potentially capable of forming complex secondary structures through Watson-Crick base pairing between their constituent nucleotides. A few of the structural elements formed by such base pairings are, in fact, known to have important functions during the replication of many ssDNA viruses. Unknown, however, are (i) whether numerous additional ssDNA virus genomic structural elements predicted to exist by computational DNA folding methods actually exist and (ii) whether those structures that do exist have any biological relevance. We therefore computationally inferred lists of the most evolutionarily conserved structures within a diverse selection of animal- and plant-infecting ssDNA viruses drawn from the families Circoviridae, Anelloviridae, Parvoviridae, Nanoviridae, and Geminiviridae and analyzed these for evidence of natural selection favoring the maintenance of these structures. While we find evidence that is consistent with purifying selection being stronger at nucleotide sites that are predicted to be base paired than at sites predicted to be unpaired, we also find strong associations between sites that are predicted to pair with one another and site pairs that are apparently coevolving in a complementary fashion. Collectively, these results indicate that natural selection actively preserves much of the pervasive secondary structure that is evident within eukaryote-infecting ssDNA virus genomes and, therefore, that much of this structure is biologically functional. Lastly, we provide examples of various highly conserved but completely uncharacterized structural elements that likely have important functions within some of the ssDNA virus genomes analyzed here.

  15. Discovery of global genomic re-organization based on comparison of two newly sequenced rice mitochondrial genomes with cytoplasmic male sterility-related genes

    Directory of Open Access Journals (Sweden)

    Yamada Mari

    2010-03-01

    Full Text Available Abstract Background Plant mitochondrial genomes are known for their complexity, and there is abundant evidence demonstrating that this organelle is important for plant sexual reproduction. Cytoplasmic male sterility (CMS is a phenomenon caused by incompatibility between the nucleus and mitochondria that has been discovered in various plant species. As the exact sequence of steps leading to CMS has not yet been revealed, efforts should be made to elucidate the factors underlying the mechanism of this important trait for crop breeding. Results Two CMS mitochondrial genomes, LD-CMS, derived from Oryza sativa L. ssp. indica (434,735 bp, and CW-CMS, derived from Oryza rufipogon Griff. (559,045 bp, were newly sequenced in this study. Compared to the previously sequenced Nipponbare (Oryza sativa L. ssp. japonica mitochondrial genome, the presence of 54 out of 56 protein-encoding genes (including pseudo-genes, 22 tRNA genes (including pseudo-tRNAs, and three rRNA genes was conserved. Two other genes were not present in the CW-CMS mitochondrial genome, and one of them was present as part of the newly identified chimeric ORF, CW-orf307. At least 12 genomic recombination events were predicted between the LD-CMS mitochondrial genome and Nipponbare, and 15 between the CW-CMS genome and Nipponbare, and novel genetic structures were formed by these genomic rearrangements in the two CMS lines. At least one of the genomic rearrangements was completely unique to each CMS line and not present in 69 rice cultivars or 9 accessions of O. rufipogon. Conclusion Our results demonstrate novel mitochondrial genomic rearrangements that are unique in CMS cytoplasm, and one of the genes that is unique in the CW mitochondrial genome, CW-orf307, appeared to be the candidate most likely responsible for the CW-CMS event. Genomic rearrangements were dynamic in the CMS lines in comparison with those of rice cultivars, suggesting that 'death' and possible 'birth' processes of the

  16. Rhizomes of Eremostachys laciniata: Isolation and Structure Elucidation of Chemical Constituents and a Clinical Trial on Inflammatory Diseases

    Directory of Open Access Journals (Sweden)

    Abbas Delazar

    2013-08-01

    Full Text Available Purpose: The purpose of this study was the isolation and structure elucidation of chemical compounds from the rhizomes of Eremostachys laciniata (L Bunge (EL, an Iranian traditional medicinal herb with a thick root and pale purple or white flowers as well as the clinical studies on the therapeutic efficacy and safety of topical application of the EL extract in the management of some inflammatory conditions, e.g., arthritis, rheumatoid arthritis and septic arthritis (Riter’s syndrome. Methods: The structures of the isolated compounds were elucidated unequivocally on the basis of one and two dimensional NMR, UV and HR-FABMS spectroscopic data analyses. A single-blinded randomized clinical trial was carried out with the extract of the rhizomes of E. laciniata (EL to determine the efficacy and safety of the traditional uses of EL compared to that of piroxicam in treatment of inflammatory diseases, e.g., osteoarthritis, rheumatoid arthritis and Reiter’s syndrome. Results: Eleven iridoid glycosides, two phenylethanoids and two phytosterols were isolated and identified for the first time from the rhizomes of EL. After 14 days of treatment with the EL and piroxicam ointments, all groups showed significant improvements compared to the control groups. EL (5% ointment induced better initial therapeutic response than piroxicam (5% onitment. Conclusion: This clinical trial established that EL was suitable for topical applications as a safe and effective complementary therapy for inflammatory diseases.

  17. The Use of Functional Genomics in Conjunction with Metabolomics for Mycobacterium tuberculosis Research

    Science.gov (United States)

    Swanepoel, Conrad C.

    2014-01-01

    Tuberculosis (TB), caused by Mycobacterium tuberculosis, is a fatal infectious disease, resulting in 1.4 million deaths globally per annum. Over the past three decades, genomic studies have been conducted in an attempt to elucidate the functionality of the genome of the pathogen. However, many aspects of this complex genome remain largely unexplored, as approaches like genomics, proteomics, and transcriptomics have failed to characterize them successfully. In turn, metabolomics, which is relatively new to the “omics” revolution, has shown great potential for investigating biological systems or their modifications. Furthermore, when these data are interpreted in combination with previously acquired genomics, proteomics and transcriptomics data, using what is termed a systems biology approach, a more holistic understanding of these systems can be achieved. In this review we discuss how metabolomics has contributed so far to characterizing TB, with emphasis on the resulting improved elucidation of M. tuberculosis in terms of (1) metabolism, (2) growth and replication, (3) pathogenicity, and (4) drug resistance, from the perspective of systems biology. PMID:24771957

  18. The Use of Functional Genomics in Conjunction with Metabolomics for Mycobacterium tuberculosis Research

    Directory of Open Access Journals (Sweden)

    Conrad C. Swanepoel

    2014-01-01

    Full Text Available Tuberculosis (TB, caused by Mycobacterium tuberculosis, is a fatal infectious disease, resulting in 1.4 million deaths globally per annum. Over the past three decades, genomic studies have been conducted in an attempt to elucidate the functionality of the genome of the pathogen. However, many aspects of this complex genome remain largely unexplored, as approaches like genomics, proteomics, and transcriptomics have failed to characterize them successfully. In turn, metabolomics, which is relatively new to the “omics” revolution, has shown great potential for investigating biological systems or their modifications. Furthermore, when these data are interpreted in combination with previously acquired genomics, proteomics and transcriptomics data, using what is termed a systems biology approach, a more holistic understanding of these systems can be achieved. In this review we discuss how metabolomics has contributed so far to characterizing TB, with emphasis on the resulting improved elucidation of M. tuberculosis in terms of (1 metabolism, (2 growth and replication, (3 pathogenicity, and (4 drug resistance, from the perspective of systems biology.

  19. Correlation exploration of metabolic and genomic diversity in rice

    Directory of Open Access Journals (Sweden)

    Shinozaki Kazuo

    2009-12-01

    Full Text Available Abstract Background It is essential to elucidate the relationship between metabolic and genomic diversity to understand the genetic regulatory networks associated with the changing metabolo-phenotype among natural variation and/or populations. Recent innovations in metabolomics technologies allow us to grasp the comprehensive features of the metabolome. Metabolite quantitative trait analysis is a key approach for the identification of genetic loci involved in metabolite variation using segregated populations. Although several attempts have been made to find correlative relationships between genetic and metabolic diversity among natural populations in various organisms, it is still unclear whether it is possible to discover such correlations between each metabolite and the polymorphisms found at each chromosomal location. To assess the correlative relationship between the metabolic and genomic diversity found in rice accessions, we compared the distance matrices for these two "omics" patterns in the rice accessions. Results We selected 18 accessions from the world rice collection based on their population structure. To determine the genomic diversity of the rice genome, we genotyped 128 restriction fragment length polymorphism (RFLP markers to calculate the genetic distance among the accessions. To identify the variations in the metabolic fingerprint, a soluble extract from the seed grain of each accession was analyzed with one dimensional 1H-nuclear magnetic resonance (NMR. We found no correlation between global metabolic diversity and the phylogenetic relationships among the rice accessions (rs = 0.14 by analyzing the distance matrices (calculated from the pattern of the metabolic fingerprint in the 4.29- to 0.71-ppm 1H chemical shift and the genetic distance on the basis of the RFLP markers. However, local correlation analysis between the distance matrices (derived from each 0.04-ppm integral region of the 1H chemical shift against genetic

  20. Use of Dried Blood Spots to Elucidate Full-Length Transmitted/Founder HIV-1 Genomes

    Directory of Open Access Journals (Sweden)

    Jesus F. Salazar-Gonzalez

    2016-07-01

    Full Text Available Background: Identification of HIV-1 genomes responsible for establishing clinical infection in newly infected individuals is fundamental to prevention and pathogenesis research. Processing, storage, and transportation of the clinical samples required to perform these virologic assays in resource-limited settings requires challenging venipuncture and cold chain logistics. Here, we validate the use of dried-blood spots (DBS as a simple and convenient alternative to collecting and storing frozen plasma. Methods: We performed parallel nucleic acid extraction, single genome amplification (SGA, next generation sequencing (NGS, and phylogenetic analyses on plasma and DBS. Results: We demonstrated the capacity to extract viral RNA from DBS and perform SGA to infer the complete nucleotide sequence of the transmitted/founder (TF HIV-1 envelope gene and full-length genome in two acutely infected individuals. Using both SGA and NGS methodologies, we showed that sequences generated from DBS and plasma display comparable phylogenetic patterns in both acute and chronic infection. SGA was successful on samples with a range of plasma viremia, including samples as low as 1,700 copies/ml and an estimated ~50 viral copies per blood spot. Further, we demonstrated reproducible efficiency in gp160 env sequencing in DBS stored at ambient temperature for up to three weeks or at -20ºC for up to five months. Conclusions: These findings support the use of DBS as a practical and cost-effective alternative to frozen plasma for clinical trials and translational research conducted in resource-limited settings.

  1. Classification, Naming and Evolutionary History of Glycosyltransferases from Sequenced Green and Red Algal Genomes

    DEFF Research Database (Denmark)

    Ulvskov, Peter; Paiva, Dionisio Soares; Domozych, David

    2013-01-01

    . In order to elucidate possible evolutionary links between the three advanced lineages in Archaeplastida, a genomic analysis was initiated. Fully sequenced genomes from the Rhodophyta and Virideplantae and the well-defined CAZy database on glycosyltransferases were included in the analysis. The number...

  2. Deeper insight into the structure of the anaerobic digestion microbial community; the biogas microbiome database is expanded with 157 new genomes.

    Science.gov (United States)

    Treu, Laura; Kougias, Panagiotis G; Campanaro, Stefano; Bassani, Ilaria; Angelidaki, Irini

    2016-09-01

    This research aimed to better characterize the biogas microbiome by means of high throughput metagenomic sequencing and to elucidate the core microbial consortium existing in biogas reactors independently from the operational conditions. Assembly of shotgun reads followed by an established binning strategy resulted in the highest, up to now, extraction of microbial genomes involved in biogas producing systems. From the 236 extracted genome bins, it was remarkably found that the vast majority of them could only be characterized at high taxonomic levels. This result confirms that the biogas microbiome is comprised by a consortium of unknown species. A comparative analysis between the genome bins of the current study and those extracted from a previous metagenomic assembly demonstrated a similar phylogenetic distribution of the main taxa. Finally, this analysis led to the identification of a subset of common microbes that could be considered as the core essential group in biogas production. Copyright © 2016 Elsevier Ltd. All rights reserved.

  3. Structural constraints in the packaging of bluetongue virus genomic segments.

    Science.gov (United States)

    Burkhardt, Christiane; Sung, Po-Yu; Celma, Cristina C; Roy, Polly

    2014-10-01

    The mechanism used by bluetongue virus (BTV) to ensure the sorting and packaging of its 10 genomic segments is still poorly understood. In this study, we investigated the packaging constraints for two BTV genomic segments from two different serotypes. Segment 4 (S4) of BTV serotype 9 was mutated sequentially and packaging of mutant ssRNAs was investigated by two newly developed RNA packaging assay systems, one in vivo and the other in vitro. Modelling of the mutated ssRNA followed by biochemical data analysis suggested that a conformational motif formed by interaction of the 5' and 3' ends of the molecule was necessary and sufficient for packaging. A similar structural signal was also identified in S8 of BTV serotype 1. Furthermore, the same conformational analysis of secondary structures for positive-sense ssRNAs was used to generate a chimeric segment that maintained the putative packaging motif but contained unrelated internal sequences. This chimeric segment was packaged successfully, confirming that the motif identified directs the correct packaging of the segment. © 2014 The Authors.

  4. Vitamin D and the brain: Genomic and non-genomic actions.

    Science.gov (United States)

    Cui, Xiaoying; Gooch, Helen; Petty, Alice; McGrath, John J; Eyles, Darryl

    2017-09-15

    1,25(OH) 2 D 3 (vitamin D) is well-recognized as a neurosteroid that modulates multiple brain functions. A growing body of evidence indicates that vitamin D plays a pivotal role in brain development, neurotransmission, neuroprotection and immunomodulation. However, the precise molecular mechanisms by which vitamin D exerts these functions in the brain are still unclear. Vitamin D signalling occurs via the vitamin D receptor (VDR), a zinc-finger protein in the nuclear receptor superfamily. Like other nuclear steroids, vitamin D has both genomic and non-genomic actions. The transcriptional activity of vitamin D occurs via the nuclear VDR. Its faster, non-genomic actions can occur when the VDR is distributed outside the nucleus. The VDR is present in the developing and adult brain where it mediates the effects of vitamin D on brain development and function. The purpose of this review is to summarise the in vitro and in vivo work that has been conducted to characterise the genomic and non-genomic actions of vitamin D in the brain. Additionally we link these processes to functional neurochemical and behavioural outcomes. Elucidation of the precise molecular mechanisms underpinning vitamin D signalling in the brain may prove useful in understanding the role this steroid plays in brain ontogeny and function. Copyright © 2017 Elsevier B.V. All rights reserved.

  5. Target Selection and Deselection at the Berkeley StructuralGenomics Center

    Energy Technology Data Exchange (ETDEWEB)

    Chandonia, John-Marc; Kim, Sung-Hou; Brenner, Steven E.

    2005-03-22

    At the Berkeley Structural Genomics Center (BSGC), our goalis to obtain a near-complete structural complement of proteins in theminimal organisms Mycoplasma genitalium and M. pneumoniae, two closelyrelated pathogens. Current targets for structure determination have beenselected in six major stages, starting with those predicted to be mosttractable to high throughput study and likely to yield new structuralinformation. We report on the process used to select these proteins, aswell as our target deselection procedure. Target deselection reducesexperimental effort by eliminating targets similar to those recentlysolved by the structural biology community or other centers. We measurethe impact of the 69 structures solved at the BSGC as of July 2004 onstructure prediction coverage of the M. pneumoniae and M. genitaliumproteomes. The number of Mycoplasma proteins for which thefold couldfirst be reliably assigned based on structures solved at the BSGC (24 M.pneumoniae and 21 M. genitalium) is approximately 25 percent of the totalresulting from work at all structural genomics centers and the worldwidestructural biology community (94 M. pneumoniae and 86M. genitalium)during the same period. As the number of structures contributed by theBSGC during that period is less than 1 percent of the total worldwideoutput, the benefits of a focused target selection strategy are apparent.If the structures of all current targets were solved, the percentage ofM. pneumoniae proteins for which folds could be reliably assigned wouldincrease from approximately 57 percent (391 of 687) at present to around80 percent (550 of 687), and the percentage of the proteome that could beaccurately modeled would increase from around 37 percent (254 of 687) toabout 64 percent (438 of 687). In M. genitalium, the percentage of theproteome that could be structurally annotated based on structures of ourremaining targets would rise from 72 percent (348 of 486) to around 76percent (371 of 486), with the

  6. Structural elucidation of nanocrystalline biomaterials

    Energy Technology Data Exchange (ETDEWEB)

    Maltsev, S.

    2008-10-23

    Bone diseases, such as osteoporosis and osteoarthritis, are the second most prevalent health problem worldwide. In Germany approximately 5 millions people are affected by arthritis. Investigating biomineralization processes and bone molecular structure is of key importance for developing new drugs for preventing and healing bone diseases. Nuclear magnetic resonance (NMR) was the primary technique used due to its advantages in characterising poorly ordered and disordered materials. Compared to all the diffraction techniques that widely applied in structural investigations, the usefulness of NMR is independent of long range molecular order. This makes NMR an outstanding technique for studies of complex/amorphous materials. Conventional NMR experiments (single pulse, spin-echo, cross polarization (CP), etc.) as well as their modifications and high-end techniques (2D HETCOR, REDOR, etc.) were used in this work. Combining the contributions from different techniques enhances the information content of the investigations and can increase the precision of the overall conclusions. Also XRD, TEM and FTIR were applied to different extent in order to get a general idea of nanocrystalline hydroxyapatite crystallite structure. Results: - A new approach named 'Solid-state NMR spectroscopy using the lost I spin magnetization in polarization transfer experiments' has been developed for measuring the transferred I spin magnetization from abundant nuclei, which is normally lost when detecting the S spin magnetization. - A detailed investigation of nanocrystalline hydroxyapatite core was made to prove that proton environment of the phosphates units and phosphorus environment of hydroxyl units are the same as in highly crystalline hydroxyapatite sample. - Using XRD it was found that the surface of the hydroxyapatite nanocrystals is not completely disordered, as it was suggested before, but resembles the hydroxyapatite structure with HPO{sub 4}{sup 2-} (and some CO{sub 3}{sup

  7. Structural elucidation of nanocrystalline biomaterials

    Energy Technology Data Exchange (ETDEWEB)

    Maltsev, S

    2008-10-23

    Bone diseases, such as osteoporosis and osteoarthritis, are the second most prevalent health problem worldwide. In Germany approximately 5 millions people are affected by arthritis. Investigating biomineralization processes and bone molecular structure is of key importance for developing new drugs for preventing and healing bone diseases. Nuclear magnetic resonance (NMR) was the primary technique used due to its advantages in characterising poorly ordered and disordered materials. Compared to all the diffraction techniques that widely applied in structural investigations, the usefulness of NMR is independent of long range molecular order. This makes NMR an outstanding technique for studies of complex/amorphous materials. Conventional NMR experiments (single pulse, spin-echo, cross polarization (CP), etc.) as well as their modifications and high-end techniques (2D HETCOR, REDOR, etc.) were used in this work. Combining the contributions from different techniques enhances the information content of the investigations and can increase the precision of the overall conclusions. Also XRD, TEM and FTIR were applied to different extent in order to get a general idea of nanocrystalline hydroxyapatite crystallite structure. Results: - A new approach named 'Solid-state NMR spectroscopy using the lost I spin magnetization in polarization transfer experiments' has been developed for measuring the transferred I spin magnetization from abundant nuclei, which is normally lost when detecting the S spin magnetization. - A detailed investigation of nanocrystalline hydroxyapatite core was made to prove that proton environment of the phosphates units and phosphorus environment of hydroxyl units are the same as in highly crystalline hydroxyapatite sample. - Using XRD it was found that the surface of the hydroxyapatite nanocrystals is not completely disordered, as it was suggested before, but resembles the hydroxyapatite structure with HPO{sub 4}{sup 2-} (and some CO{sub 3}{sup 2

  8. Macromolecular structure determination in the post-genome era

    CERN Document Server

    Kuhn, P

    2001-01-01

    Recent advances in genetics, molecular biology and crystallographic instrumentation and methodology have led to a revolution in the field of Structural Molecular Biology (SMB). These combined advances have paved the way to a more complete and detailed understanding of the biological macromolecules that make up an organism, both in terms of their individual functions and also the interactions between them. In this paper we describe a large-scale, genomic approach to the three-dimensional structure determination of macromolecules and their complexes, using high-throughput methodology to streamline all aspects of the process. This task requires the development of automated high-intensity synchrotron beam lines for X-ray diffraction data collection from single crystal samples. Furthermore, these beam lines must be operated within a sophisticated software and hardware environment, which is capable of delivering a completely automated structure determination pipeline. The SMB resource at SSRL is developing a system...

  9. Elucidation of structural isomers from the homogeneous rhodium-catalyzed isomerization of vegetable oils.

    Science.gov (United States)

    Andjelkovic, Dejan D; Min, Byungrok; Ahn, Dong; Larock, Richard C

    2006-12-13

    The structural isomers formed by the homogeneous rhodium-catalyzed isomerization of several vegetable oils have been elucidated. A detailed study of the isomerization of the model compound methyl linoleate has been performed to correlate the distribution of conjugated isomers, the reaction kinetics, and the mechanism of the reaction. It has been shown that [RhCl(C8H8)2]2 is a highly efficient and selective isomerization catalyst for the production of highly conjugated vegetable oils with a high conjugated linoleic acid (CLA) content, which is highly desirable in the food industry. The combined fraction of the two major CLA isomers [(9Z,11E)-CLA and (10E,12Z)-CLA] in the overall CLA mixture is in the range from 76.2% to 93.4%. The high efficiency and selectivity of this isomerization method along with the straightforward purification process render this approach highly promising for the preparation of conjugated oils and CLA. Proposed improvements in catalyst recovery and reusability will only make this method more appealing to the food, paint, coating, and polymer industries in the future.

  10. Characterizing genomic alterations in cancer by complementary functional associations.

    Science.gov (United States)

    Kim, Jong Wook; Botvinnik, Olga B; Abudayyeh, Omar; Birger, Chet; Rosenbluh, Joseph; Shrestha, Yashaswi; Abazeed, Mohamed E; Hammerman, Peter S; DiCara, Daniel; Konieczkowski, David J; Johannessen, Cory M; Liberzon, Arthur; Alizad-Rahvar, Amir Reza; Alexe, Gabriela; Aguirre, Andrew; Ghandi, Mahmoud; Greulich, Heidi; Vazquez, Francisca; Weir, Barbara A; Van Allen, Eliezer M; Tsherniak, Aviad; Shao, Diane D; Zack, Travis I; Noble, Michael; Getz, Gad; Beroukhim, Rameen; Garraway, Levi A; Ardakani, Masoud; Romualdi, Chiara; Sales, Gabriele; Barbie, David A; Boehm, Jesse S; Hahn, William C; Mesirov, Jill P; Tamayo, Pablo

    2016-05-01

    Systematic efforts to sequence the cancer genome have identified large numbers of mutations and copy number alterations in human cancers. However, elucidating the functional consequences of these variants, and their interactions to drive or maintain oncogenic states, remains a challenge in cancer research. We developed REVEALER, a computational method that identifies combinations of mutually exclusive genomic alterations correlated with functional phenotypes, such as the activation or gene dependency of oncogenic pathways or sensitivity to a drug treatment. We used REVEALER to uncover complementary genomic alterations associated with the transcriptional activation of β-catenin and NRF2, MEK-inhibitor sensitivity, and KRAS dependency. REVEALER successfully identified both known and new associations, demonstrating the power of combining functional profiles with extensive characterization of genomic alterations in cancer genomes.

  11. Next generation transcriptomics and genomics elucidate biological complexity of microglia in health and disease

    NARCIS (Netherlands)

    Wes, Paul D; Holtman, Inge R; Boddeke, Erik W G M; Möller, Thomas; Eggen, Bart J L

    2015-01-01

    Genome-wide expression profiling technology has resulted in detailed transcriptome data for a wide range of tissues, conditions and diseases. In neuroscience, expression datasets were mostly generated using whole brain tissue samples, resulting in data from a mixture of cell types, including glial

  12. A genome wide survey of SNP variation reveals the genetic structure of sheep breeds.

    Directory of Open Access Journals (Sweden)

    James W Kijas

    Full Text Available The genetic structure of sheep reflects their domestication and subsequent formation into discrete breeds. Understanding genetic structure is essential for achieving genetic improvement through genome-wide association studies, genomic selection and the dissection of quantitative traits. After identifying the first genome-wide set of SNP for sheep, we report on levels of genetic variability both within and between a diverse sample of ovine populations. Then, using cluster analysis and the partitioning of genetic variation, we demonstrate sheep are characterised by weak phylogeographic structure, overlapping genetic similarity and generally low differentiation which is consistent with their short evolutionary history. The degree of population substructure was, however, sufficient to cluster individuals based on geographic origin and known breed history. Specifically, African and Asian populations clustered separately from breeds of European origin sampled from Australia, New Zealand, Europe and North America. Furthermore, we demonstrate the presence of stratification within some, but not all, ovine breeds. The results emphasize that careful documentation of genetic structure will be an essential prerequisite when mapping the genetic basis of complex traits. Furthermore, the identification of a subset of SNP able to assign individuals into broad groupings demonstrates even a small panel of markers may be suitable for applications such as traceability.

  13. Combining Functional and Structural Genomics to Sample the Essential Burkholderia Structome

    Science.gov (United States)

    Baugh, Loren; Gallagher, Larry A.; Patrapuvich, Rapatbhorn; Clifton, Matthew C.; Gardberg, Anna S.; Edwards, Thomas E.; Armour, Brianna; Begley, Darren W.; Dieterich, Shellie H.; Dranow, David M.; Abendroth, Jan; Fairman, James W.; Fox, David; Staker, Bart L.; Phan, Isabelle; Gillespie, Angela; Choi, Ryan; Nakazawa-Hewitt, Steve; Nguyen, Mary Trang; Napuli, Alberto; Barrett, Lynn; Buchko, Garry W.; Stacy, Robin; Myler, Peter J.; Stewart, Lance J.; Manoil, Colin; Van Voorhis, Wesley C.

    2013-01-01

    Background The genus Burkholderia includes pathogenic gram-negative bacteria that cause melioidosis, glanders, and pulmonary infections of patients with cancer and cystic fibrosis. Drug resistance has made development of new antimicrobials critical. Many approaches to discovering new antimicrobials, such as structure-based drug design and whole cell phenotypic screens followed by lead refinement, require high-resolution structures of proteins essential to the parasite. Methodology/Principal Findings We experimentally identified 406 putative essential genes in B. thailandensis, a low-virulence species phylogenetically similar to B. pseudomallei, the causative agent of melioidosis, using saturation-level transposon mutagenesis and next-generation sequencing (Tn-seq). We selected 315 protein products of these genes based on structure-determination criteria, such as excluding very large and/or integral membrane proteins, and entered them into the Seattle Structural Genomics Center for Infection Disease (SSGCID) structure determination pipeline. To maximize structural coverage of these targets, we applied an “ortholog rescue” strategy for those producing insoluble or difficult to crystallize proteins, resulting in the addition of 387 orthologs (or paralogs) from seven other Burkholderia species into the SSGCID pipeline. This structural genomics approach yielded structures from 31 putative essential targets from B. thailandensis, and 25 orthologs from other Burkholderia species, yielding an overall structural coverage for 49 of the 406 essential gene families, with a total of 88 depositions into the Protein Data Bank. Of these, 25 proteins have properties of a potential antimicrobial drug target i.e., no close human homolog, part of an essential metabolic pathway, and a deep binding pocket. We describe the structures of several potential drug targets in detail. Conclusions/Significance This collection of structures, solubility and experimental essentiality data

  14. Combining functional and structural genomics to sample the essential Burkholderia structome.

    Directory of Open Access Journals (Sweden)

    Loren Baugh

    Full Text Available The genus Burkholderia includes pathogenic gram-negative bacteria that cause melioidosis, glanders, and pulmonary infections of patients with cancer and cystic fibrosis. Drug resistance has made development of new antimicrobials critical. Many approaches to discovering new antimicrobials, such as structure-based drug design and whole cell phenotypic screens followed by lead refinement, require high-resolution structures of proteins essential to the parasite.We experimentally identified 406 putative essential genes in B. thailandensis, a low-virulence species phylogenetically similar to B. pseudomallei, the causative agent of melioidosis, using saturation-level transposon mutagenesis and next-generation sequencing (Tn-seq. We selected 315 protein products of these genes based on structure-determination criteria, such as excluding very large and/or integral membrane proteins, and entered them into the Seattle Structural Genomics Center for Infection Disease (SSGCID structure determination pipeline. To maximize structural coverage of these targets, we applied an "ortholog rescue" strategy for those producing insoluble or difficult to crystallize proteins, resulting in the addition of 387 orthologs (or paralogs from seven other Burkholderia species into the SSGCID pipeline. This structural genomics approach yielded structures from 31 putative essential targets from B. thailandensis, and 25 orthologs from other Burkholderia species, yielding an overall structural coverage for 49 of the 406 essential gene families, with a total of 88 depositions into the Protein Data Bank. Of these, 25 proteins have properties of a potential antimicrobial drug target i.e., no close human homolog, part of an essential metabolic pathway, and a deep binding pocket. We describe the structures of several potential drug targets in detail.This collection of structures, solubility and experimental essentiality data provides a resource for development of drugs against

  15. Combining functional and structural genomics to sample the essential Burkholderia structome.

    Science.gov (United States)

    Baugh, Loren; Gallagher, Larry A; Patrapuvich, Rapatbhorn; Clifton, Matthew C; Gardberg, Anna S; Edwards, Thomas E; Armour, Brianna; Begley, Darren W; Dieterich, Shellie H; Dranow, David M; Abendroth, Jan; Fairman, James W; Fox, David; Staker, Bart L; Phan, Isabelle; Gillespie, Angela; Choi, Ryan; Nakazawa-Hewitt, Steve; Nguyen, Mary Trang; Napuli, Alberto; Barrett, Lynn; Buchko, Garry W; Stacy, Robin; Myler, Peter J; Stewart, Lance J; Manoil, Colin; Van Voorhis, Wesley C

    2013-01-01

    The genus Burkholderia includes pathogenic gram-negative bacteria that cause melioidosis, glanders, and pulmonary infections of patients with cancer and cystic fibrosis. Drug resistance has made development of new antimicrobials critical. Many approaches to discovering new antimicrobials, such as structure-based drug design and whole cell phenotypic screens followed by lead refinement, require high-resolution structures of proteins essential to the parasite. We experimentally identified 406 putative essential genes in B. thailandensis, a low-virulence species phylogenetically similar to B. pseudomallei, the causative agent of melioidosis, using saturation-level transposon mutagenesis and next-generation sequencing (Tn-seq). We selected 315 protein products of these genes based on structure-determination criteria, such as excluding very large and/or integral membrane proteins, and entered them into the Seattle Structural Genomics Center for Infection Disease (SSGCID) structure determination pipeline. To maximize structural coverage of these targets, we applied an "ortholog rescue" strategy for those producing insoluble or difficult to crystallize proteins, resulting in the addition of 387 orthologs (or paralogs) from seven other Burkholderia species into the SSGCID pipeline. This structural genomics approach yielded structures from 31 putative essential targets from B. thailandensis, and 25 orthologs from other Burkholderia species, yielding an overall structural coverage for 49 of the 406 essential gene families, with a total of 88 depositions into the Protein Data Bank. Of these, 25 proteins have properties of a potential antimicrobial drug target i.e., no close human homolog, part of an essential metabolic pathway, and a deep binding pocket. We describe the structures of several potential drug targets in detail. This collection of structures, solubility and experimental essentiality data provides a resource for development of drugs against infections and diseases

  16. The genome and structural proteome of an ocean siphovirus: a new window into the cyanobacterial ‘mobilome’

    Science.gov (United States)

    Sullivan, Matthew B; Krastins, Bryan; Hughes, Jennifer L; Kelly, Libusha; Chase, Michael; Sarracino, David; Chisholm, Sallie W

    2009-01-01

    Prochlorococcus, an abundant phototroph in the oceans, are infected by members of three families of viruses: myo-, podo- and siphoviruses. Genomes of myo- and podoviruses isolated on Prochlorococcus contain DNA replication machinery and virion structural genes homologous to those from coliphages T4 and T7 respectively. They also contain a suite of genes of cyanobacterial origin, most notably photosynthesis genes, which are expressed during infection and appear integral to the evolutionary trajectory of both host and phage. Here we present the first genome of a cyanobacterial siphovirus, P-SS2, which was isolated from Atlantic slope waters using a Prochlorococcus host (MIT9313). The P-SS2 genome is larger than, and considerably divergent from, previously sequenced siphoviruses. It appears most closely related to lambdoid siphoviruses, with which it shares 13 functional homologues. The ∼108 kb P-SS2 genome encodes 131 predicted proteins and notably lacks photosynthesis genes which have consistently been found in other marine cyanophage, but does contain 14 other cyanobacterial homologues. While only six structural proteins were identified from the genome sequence, 35 proteins were detected experimentally; these mapped onto capsid and tail structural modules in the genome. P-SS2 is potentially capable of integration into its host as inferred from bioinformatically identified genetic machinery int, bet, exo and a 53 bp attachment site. The host attachment site appears to be a genomic island that is tied to insertion sequence (IS) activity that could facilitate mobility of a gene involved in the nitrogen-stress response. The homologous region and a secondary IS-element hot-spot in Synechococcus RS9917 are further evidence of IS-mediated genome evolution coincident with a probable relic prophage integration event. This siphovirus genome provides a glimpse into the biology of a deep-photic zone phage as well as the ocean cyanobacterial prophage and IS element

  17. Genome chaos: survival strategy during crisis.

    Science.gov (United States)

    Liu, Guo; Stevens, Joshua B; Horne, Steven D; Abdallah, Batoul Y; Ye, Karen J; Bremer, Steven W; Ye, Christine J; Chen, David J; Heng, Henry H

    2014-01-01

    Genome chaos, a process of complex, rapid genome re-organization, results in the formation of chaotic genomes, which is followed by the potential to establish stable genomes. It was initially detected through cytogenetic analyses, and recently confirmed by whole-genome sequencing efforts which identified multiple subtypes including "chromothripsis", "chromoplexy", "chromoanasynthesis", and "chromoanagenesis". Although genome chaos occurs commonly in tumors, both the mechanism and detailed aspects of the process are unknown due to the inability of observing its evolution over time in clinical samples. Here, an experimental system to monitor the evolutionary process of genome chaos was developed to elucidate its mechanisms. Genome chaos occurs following exposure to chemotherapeutics with different mechanisms, which act collectively as stressors. Characterization of the karyotype and its dynamic changes prior to, during, and after induction of genome chaos demonstrates that chromosome fragmentation (C-Frag) occurs just prior to chaotic genome formation. Chaotic genomes seem to form by random rejoining of chromosomal fragments, in part through non-homologous end joining (NHEJ). Stress induced genome chaos results in increased karyotypic heterogeneity. Such increased evolutionary potential is demonstrated by the identification of increased transcriptome dynamics associated with high levels of karyotypic variance. In contrast to impacting on a limited number of cancer genes, re-organized genomes lead to new system dynamics essential for cancer evolution. Genome chaos acts as a mechanism of rapid, adaptive, genome-based evolution that plays an essential role in promoting rapid macroevolution of new genome-defined systems during crisis, which may explain some unwanted consequences of cancer treatment.

  18. Glycogenomics as a mass spectrometry-guided genome-mining method for microbial glycosylated molecules.

    Science.gov (United States)

    Kersten, Roland D; Ziemert, Nadine; Gonzalez, David J; Duggan, Brendan M; Nizet, Victor; Dorrestein, Pieter C; Moore, Bradley S

    2013-11-19

    Glycosyl groups are an essential mediator of molecular interactions in cells and on cellular surfaces. There are very few methods that directly relate sugar-containing molecules to their biosynthetic machineries. Here, we introduce glycogenomics as an experiment-guided genome-mining approach for fast characterization of glycosylated natural products (GNPs) and their biosynthetic pathways from genome-sequenced microbes by targeting glycosyl groups in microbial metabolomes. Microbial GNPs consist of aglycone and glycosyl structure groups in which the sugar unit(s) are often critical for the GNP's bioactivity, e.g., by promoting binding to a target biomolecule. GNPs are a structurally diverse class of molecules with important pharmaceutical and agrochemical applications. Herein, O- and N-glycosyl groups are characterized in their sugar monomers by tandem mass spectrometry (MS) and matched to corresponding glycosylation genes in secondary metabolic pathways by a MS-glycogenetic code. The associated aglycone biosynthetic genes of the GNP genotype then classify the natural product to further guide structure elucidation. We highlight the glycogenomic strategy by the characterization of several bioactive glycosylated molecules and their gene clusters, including the anticancer agent cinerubin B from Streptomyces sp. SPB74 and an antibiotic, arenimycin B, from Salinispora arenicola CNB-527.

  19. Gene order data from a model amphibian (Ambystoma: new perspectives on vertebrate genome structure and evolution

    Directory of Open Access Journals (Sweden)

    Voss S Randal

    2006-08-01

    Full Text Available Abstract Background Because amphibians arise from a branch of the vertebrate evolutionary tree that is juxtaposed between fishes and amniotes, they provide important comparative perspective for reconstructing character changes that have occurred during vertebrate evolution. Here, we report the first comparative study of vertebrate genome structure that includes a representative amphibian. We used 491 transcribed sequences from a salamander (Ambystoma genetic map and whole genome assemblies for human, mouse, rat, dog, chicken, zebrafish, and the freshwater pufferfish Tetraodon nigroviridis to compare gene orders and rearrangement rates. Results Ambystoma has experienced a rate of genome rearrangement that is substantially lower than mammalian species but similar to that of chicken and fish. Overall, we found greater conservation of genome structure between Ambystoma and tetrapod vertebrates, nevertheless, 57% of Ambystoma-fish orthologs are found in conserved syntenies of four or more genes. Comparisons between Ambystoma and amniotes reveal extensive conservation of segmental homology for 57% of the presumptive Ambystoma-amniote orthologs. Conclusion Our analyses suggest relatively constant interchromosomal rearrangement rates from the euteleost ancestor to the origin of mammals and illustrate the utility of amphibian mapping data in establishing ancestral amniote and tetrapod gene orders. Comparisons between Ambystoma and amniotes reveal some of the key events that have structured the human genome since diversification of the ancestral amniote lineage.

  20. Primary structure of the human follistatin precursor and its genomic organization

    International Nuclear Information System (INIS)

    Shimasaki, Shunichi; Koga, Makoto; Esch, F.

    1988-01-01

    Follistatin is a single-chain gonadal protein that specifically inhibits follicle-stimulating hormone release. By use of the recently characterized porcine follistatin cDNA as a probe to screen a human testis cDNA library and a genomic library, the structure of the complete human follistatin precursor as well as its genomic organization have been determined. Three of eight cDNA clones that were sequenced predicted a precursor with 344 amino acids, whereas the remaining five cDNA clones encoded a 317 amino acid precursor, resulting from alternative splicing of the precursor mRNA. Mature follistatins contain four contiguous domains that are encoded by precisely separated exons; three of the domains are highly similar to each other, as well as to human epidermal growth factor and human pancreatic secretory trypsin inhibitor. The genomic organization of the human follistatin is similar to that of the human epidermal growth factor gene and thus supports the notion of exon shuffling during evolution

  1. Genomic Evolution of Saccharomyces cerevisiae under Chinese Rice Wine Fermentation

    Science.gov (United States)

    Li, Yudong; Zhang, Weiping; Zheng, Daoqiong; Zhou, Zhan; Yu, Wenwen; Zhang, Lei; Feng, Lifang; Liang, Xinle; Guan, Wenjun; Zhou, Jingwen; Chen, Jian; Lin, Zhenguo

    2014-01-01

    Rice wine fermentation represents a unique environment for the evolution of the budding yeast, Saccharomyces cerevisiae. To understand how the selection pressure shaped the yeast genome and gene regulation, we determined the genome sequence and transcriptome of a S. cerevisiae strain YHJ7 isolated from Chinese rice wine (Huangjiu), a popular traditional alcoholic beverage in China. By comparing the genome of YHJ7 to the lab strain S288c, a Japanese sake strain K7, and a Chinese industrial bioethanol strain YJSH1, we identified many genomic sequence and structural variations in YHJ7, which are mainly located in subtelomeric regions, suggesting that these regions play an important role in genomic evolution between strains. In addition, our comparative transcriptome analysis between YHJ7 and S288c revealed a set of differentially expressed genes, including those involved in glucose transport (e.g., HXT2, HXT7) and oxidoredutase activity (e.g., AAD10, ADH7). Interestingly, many of these genomic and transcriptional variations are directly or indirectly associated with the adaptation of YHJ7 strain to its specific niches. Our molecular evolution analysis suggested that Japanese sake strains (K7/UC5) were derived from Chinese rice wine strains (YHJ7) at least approximately 2,300 years ago, providing the first molecular evidence elucidating the origin of Japanese sake strains. Our results depict interesting insights regarding the evolution of yeast during rice wine fermentation, and provided a valuable resource for genetic engineering to improve industrial wine-making strains. PMID:25212861

  2. Improvisation in evolution of genes and genomes: whose structure is it anyway?

    Science.gov (United States)

    Shakhnovich, Boris E; Shakhnovich, Eugene I

    2008-06-01

    Significant progress has been made in recent years in a variety of seemingly unrelated fields such as sequencing, protein structure prediction, and high-throughput transcriptomics and metabolomics. At the same time, new microscopic models have been developed that made it possible to analyze the evolution of genes and genomes from first principles. The results from these efforts enable, for the first time, a comprehensive insight into the evolution of complex systems and organisms on all scales--from sequences to organisms and populations. Every newly sequenced genome uncovers new genes, families, and folds. Where do these new genes come from? How do gene duplication and subsequent divergence of sequence and structure affect the fitness of the organism? What role does regulation play in the evolution of proteins and folds? Emerging synergism between data and modeling provides first robust answers to these questions.

  3. Secure web book to store structural genomics research data.

    Science.gov (United States)

    Manjasetty, Babu A; Höppner, Klaus; Mueller, Uwe; Heinemann, Udo

    2003-01-01

    Recently established collaborative structural genomics programs aim at significantly accelerating the crystal structure analysis of proteins. These large-scale projects require efficient data management systems to ensure seamless collaboration between different groups of scientists working towards the same goal. Within the Berlin-based Protein Structure Factory, the synchrotron X-ray data collection and the subsequent crystal structure analysis tasks are located at BESSY, a third-generation synchrotron source. To organize file-based communication and data transfer at the BESSY site of the Protein Structure Factory, we have developed the web-based BCLIMS, the BESSY Crystallography Laboratory Information Management System. BCLIMS is a relational data management system which is powered by MySQL as the database engine and Apache HTTP as the web server. The database interface routines are written in Python programing language. The software is freely available to academic users. Here we describe the storage, retrieval and manipulation of laboratory information, mainly pertaining to the synchrotron X-ray diffraction experiments and the subsequent protein structure analysis, using BCLIMS.

  4. Modeling structure of G protein-coupled receptors in huan genome

    KAUST Repository

    Zhang, Yang

    2016-01-26

    G protein-coupled receptors (or GPCRs) are integral transmembrane proteins responsible to various cellular signal transductions. Human GPCR proteins are encoded by 5% of human genes but account for the targets of 40% of the FDA approved drugs. Due to difficulties in crystallization, experimental structure determination remains extremely difficult for human GPCRs, which have been a major barrier in modern structure-based drug discovery. We proposed a new hybrid protocol, GPCR-I-TASSER, to construct GPCR structure models by integrating experimental mutagenesis data with ab initio transmembrane-helix assembly simulations, assisted by the predicted transmembrane-helix interaction networks. The method was tested in recent community-wide GPCRDock experiments and constructed models with a root mean square deviation 1.26 Å for Dopamine-3 and 2.08 Å for Chemokine-4 receptors in the transmembrane domain regions, which were significantly closer to the native than the best templates available in the PDB. GPCR-I-TASSER has been applied to model all 1,026 putative GPCRs in the human genome, where 923 are found to have correct folds based on the confidence score analysis and mutagenesis data comparison. The successfully modeled GPCRs contain many pharmaceutically important families that do not have previously solved structures, including Trace amine, Prostanoids, Releasing hormones, Melanocortins, Vasopressin and Neuropeptide Y receptors. All the human GPCR models have been made publicly available through the GPCR-HGmod database at http://zhanglab.ccmb.med.umich.edu/GPCR-HGmod/ The results demonstrate new progress on genome-wide structure modeling of transmembrane proteins which should bring useful impact on the effort of GPCR-targeted drug discovery.

  5. Genomic structural variation contributes to phenotypic change of industrial bioethanol yeast Saccharomyces cerevisiae.

    Science.gov (United States)

    Zhang, Ke; Zhang, Li-Jie; Fang, Ya-Hong; Jin, Xin-Na; Qi, Lei; Wu, Xue-Chang; Zheng, Dao-Qiong

    2016-03-01

    Genomic structural variation (GSV) is a ubiquitous phenomenon observed in the genomes of Saccharomyces cerevisiae strains with different genetic backgrounds; however, the physiological and phenotypic effects of GSV are not well understood. Here, we first revealed the genetic characteristics of a widely used industrial S. cerevisiae strain, ZTW1, by whole genome sequencing. ZTW1 was identified as an aneuploidy strain and a large-scale GSV was observed in the ZTW1 genome compared with the genome of a diploid strain YJS329. These GSV events led to copy number variations (CNVs) in many chromosomal segments as well as one whole chromosome in the ZTW1 genome. Changes in the DNA dosage of certain functional genes directly affected their expression levels and the resultant ZTW1 phenotypes. Moreover, CNVs of large chromosomal regions triggered an aneuploidy stress in ZTW1. This stress decreased the proliferation ability and tolerance of ZTW1 to various stresses, while aneuploidy response stress may also provide some benefits to the fermentation performance of the yeast, including increased fermentation rates and decreased byproduct generation. This work reveals genomic characters of the bioethanol S. cerevisiae strain ZTW1 and suggests that GSV is an important kind of mutation that changes the traits of industrial S. cerevisiae strains. © FEMS 2016. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  6. Genomic Footprints of Selective Sweeps from Metabolic Resistance to Pyrethroids in African Malaria Vectors Are Driven by Scale up of Insecticide-Based Vector Control.

    Science.gov (United States)

    Barnes, Kayla G; Weedall, Gareth D; Ndula, Miranda; Irving, Helen; Mzihalowa, Themba; Hemingway, Janet; Wondji, Charles S

    2017-02-01

    Insecticide resistance in mosquito populations threatens recent successes in malaria prevention. Elucidating patterns of genetic structure in malaria vectors to predict the speed and direction of the spread of resistance is essential to get ahead of the 'resistance curve' and to avert a public health catastrophe. Here, applying a combination of microsatellite analysis, whole genome sequencing and targeted sequencing of a resistance locus, we elucidated the continent-wide population structure of a major African malaria vector, Anopheles funestus. We identified a major selective sweep in a genomic region controlling cytochrome P450-based metabolic resistance conferring high resistance to pyrethroids. This selective sweep occurred since 2002, likely as a direct consequence of scaled up vector control as revealed by whole genome and fine-scale sequencing of pre- and post-intervention populations. Fine-scaled analysis of the pyrethroid resistance locus revealed that a resistance-associated allele of the cytochrome P450 monooxygenase CYP6P9a has swept through southern Africa to near fixation, in contrast to high polymorphism levels before interventions, conferring high levels of pyrethroid resistance linked to control failure. Population structure analysis revealed a barrier to gene flow between southern Africa and other areas, which may prevent or slow the spread of the southern mechanism of pyrethroid resistance to other regions. By identifying a genetic signature of pyrethroid-based interventions, we have demonstrated the intense selective pressure that control interventions exert on mosquito populations. If this level of selection and spread of resistance continues unabated, our ability to control malaria with current interventions will be compromised.

  7. Genomic Footprints of Selective Sweeps from Metabolic Resistance to Pyrethroids in African Malaria Vectors Are Driven by Scale up of Insecticide-Based Vector Control.

    Directory of Open Access Journals (Sweden)

    Kayla G Barnes

    2017-02-01

    Full Text Available Insecticide resistance in mosquito populations threatens recent successes in malaria prevention. Elucidating patterns of genetic structure in malaria vectors to predict the speed and direction of the spread of resistance is essential to get ahead of the 'resistance curve' and to avert a public health catastrophe. Here, applying a combination of microsatellite analysis, whole genome sequencing and targeted sequencing of a resistance locus, we elucidated the continent-wide population structure of a major African malaria vector, Anopheles funestus. We identified a major selective sweep in a genomic region controlling cytochrome P450-based metabolic resistance conferring high resistance to pyrethroids. This selective sweep occurred since 2002, likely as a direct consequence of scaled up vector control as revealed by whole genome and fine-scale sequencing of pre- and post-intervention populations. Fine-scaled analysis of the pyrethroid resistance locus revealed that a resistance-associated allele of the cytochrome P450 monooxygenase CYP6P9a has swept through southern Africa to near fixation, in contrast to high polymorphism levels before interventions, conferring high levels of pyrethroid resistance linked to control failure. Population structure analysis revealed a barrier to gene flow between southern Africa and other areas, which may prevent or slow the spread of the southern mechanism of pyrethroid resistance to other regions. By identifying a genetic signature of pyrethroid-based interventions, we have demonstrated the intense selective pressure that control interventions exert on mosquito populations. If this level of selection and spread of resistance continues unabated, our ability to control malaria with current interventions will be compromised.

  8. Population Structure and Genomic Breed Composition in an Angus-Brahman Crossbred Cattle Population.

    Science.gov (United States)

    Gobena, Mesfin; Elzo, Mauricio A; Mateescu, Raluca G

    2018-01-01

    Crossbreeding is a common strategy used in tropical and subtropical regions to enhance beef production, and having accurate knowledge of breed composition is essential for the success of a crossbreeding program. Although pedigree records have been traditionally used to obtain the breed composition of crossbred cattle, the accuracy of pedigree-based breed composition can be reduced by inaccurate and/or incomplete records and Mendelian sampling. Breed composition estimation from genomic data has multiple advantages including higher accuracy without being affected by missing, incomplete, or inaccurate records and the ability to be used as independent authentication of breed in breed-labeled beef products. The present study was conducted with 676 Angus-Brahman crossbred cattle with genotype and pedigree information to evaluate the feasibility and accuracy of using genomic data to determine breed composition. We used genomic data in parametric and non-parametric methods to detect population structure due to differences in breed composition while accounting for the confounding effect of close familial relationships. By applying principal component analysis (PCA) and the maximum likelihood method of ADMIXTURE to genomic data, it was possible to successfully characterize population structure resulting from heterogeneous breed ancestry, while accounting for close familial relationships. PCA results offered additional insight into the different hierarchies of genetic variation structuring. The first principal component was strongly correlated with Angus-Brahman proportions, and the second represented variation within animals that have a relatively more extended Brangus lineage-indicating the presence of a distinct pattern of genetic variation in these cattle. Although there was strong agreement between breed proportions estimated from pedigree and genetic information, there were significant discrepancies between these two methods for certain animals. This was most likely due

  9. Population Structure and Genomic Breed Composition in an Angus–Brahman Crossbred Cattle Population

    Directory of Open Access Journals (Sweden)

    Mesfin Gobena

    2018-03-01

    Full Text Available Crossbreeding is a common strategy used in tropical and subtropical regions to enhance beef production, and having accurate knowledge of breed composition is essential for the success of a crossbreeding program. Although pedigree records have been traditionally used to obtain the breed composition of crossbred cattle, the accuracy of pedigree-based breed composition can be reduced by inaccurate and/or incomplete records and Mendelian sampling. Breed composition estimation from genomic data has multiple advantages including higher accuracy without being affected by missing, incomplete, or inaccurate records and the ability to be used as independent authentication of breed in breed-labeled beef products. The present study was conducted with 676 Angus–Brahman crossbred cattle with genotype and pedigree information to evaluate the feasibility and accuracy of using genomic data to determine breed composition. We used genomic data in parametric and non-parametric methods to detect population structure due to differences in breed composition while accounting for the confounding effect of close familial relationships. By applying principal component analysis (PCA and the maximum likelihood method of ADMIXTURE to genomic data, it was possible to successfully characterize population structure resulting from heterogeneous breed ancestry, while accounting for close familial relationships. PCA results offered additional insight into the different hierarchies of genetic variation structuring. The first principal component was strongly correlated with Angus–Brahman proportions, and the second represented variation within animals that have a relatively more extended Brangus lineage—indicating the presence of a distinct pattern of genetic variation in these cattle. Although there was strong agreement between breed proportions estimated from pedigree and genetic information, there were significant discrepancies between these two methods for certain animals

  10. High-throughput crystal-optimization strategies in the South Paris Yeast Structural Genomics Project: one size fits all?

    Science.gov (United States)

    Leulliot, Nicolas; Trésaugues, Lionel; Bremang, Michael; Sorel, Isabelle; Ulryck, Nathalie; Graille, Marc; Aboulfath, Ilham; Poupon, Anne; Liger, Dominique; Quevillon-Cheruel, Sophie; Janin, Joël; van Tilbeurgh, Herman

    2005-06-01

    Crystallization has long been regarded as one of the major bottlenecks in high-throughput structural determination by X-ray crystallography. Structural genomics projects have addressed this issue by using robots to set up automated crystal screens using nanodrop technology. This has moved the bottleneck from obtaining the first crystal hit to obtaining diffraction-quality crystals, as crystal optimization is a notoriously slow process that is difficult to automatize. This article describes the high-throughput optimization strategies used in the Yeast Structural Genomics project, with selected successful examples.

  11. Mass spectrometry for the elucidation of the subtle molecular structure of biodegradable polymers and their degradation products.

    Science.gov (United States)

    Kowalczuk, Marek; Adamus, Grażyna

    2016-01-01

    Contemporary reports by Polish authors on the application of mass spectrometric methods for the elucidation of the subtle molecular structure of biodegradable polymers and their degradation products will be presented. Special emphasis will be given to natural aliphatic (co)polyesters (PHA) and their synthetic analogues, formed through anionic ring-opening polymerization (ROP) of β-substituted β-lactones. Moreover, the application of MS techniques for the evaluation of the structure of biodegradable polymers obtained in ionic and coordination polymerization of cyclic ethers and esters as well as products of step-growth polymerization, in which bifunctional or multifunctional monomers react to form oligomers and eventually long chain polymers, will be discussed. Furthermore, the application of modern MS techniques for the assessment of polymer degradation products, frequently bearing characteristic end groups that can be revealed and differentiated by MS, will be discussed within the context of specific degradation pathways. Finally, recent Polish accomplishments in the area of mass spectrometry will be outlined. © 2015 Wiley Periodicals, Inc.

  12. Insights into the genome structure and copy-number variation of Eimeria tenella

    Directory of Open Access Journals (Sweden)

    Lim Lik-Sin

    2012-08-01

    method to improve the assembly of the genome of E. tenella from shotgun data, and to help reveal its overall structure. A preliminary assessment of copy-number variation (extra or missing copies of genomic segments between strains of E. tenella was also carried out. The emerging picture is of a very unusual genome architecture displaying inter-strain copy-number variation. We suggest that these features may be related to the known ability of this parasite to rapidly develop drug resistance.

  13. Exploring the possibilities and limitations of a nanomaterials genome.

    Science.gov (United States)

    Qian, Chenxi; Siler, Todd; Ozin, Geoffrey A

    2015-01-07

    What are we going to do with the cornucopia of nanomaterials appearing in the open and patent literature, every day? Imagine the benefits of an intelligent and convenient means of categorizing, organizing, sifting, sorting, connecting, and utilizing this information in scientifically and technologically innovative ways by building a Nanomaterials Genome founded upon an all-purpose Periodic Table of Nanomaterials. In this Concept article, inspired by work on the Human Genome project, which began in 1989 together with motivation from the recent emergence of the Materials Genome project initiated in 2011 and the Nanoinformatics Roadmap 2020 instigated in 2010, we envision the development of a Nanomaterials Genome (NMG) database with the most advanced data-mining tools that leverage inference engines to help connect and interpret patterns of nanomaterials information. It will be equipped with state-of-the-art visualization techniques that rapidly organize and picture, categorize and interrelate the inherited behavior of complex nanomatter from the information programmed in its constituent nanomaterials building blocks. A Nanomaterials Genome Initiative (NMGI) of the type imagined herein has the potential to serve the global nanoscience community with an opportunity to speed up the development continuum of nanomaterials through the innovation process steps of discovery, structure determination and property optimization, functionality elucidation, system design and integration, certification and manufacturing to deployment in technologies that apply these versatile nanomaterials in environmentally responsible ways. The possibilities and limitations of this concept are critically evaluated in this article. © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  14. The eukaryotic genome is structurally and functionally more like a social insect colony than a book.

    Science.gov (United States)

    Qiu, Guo-Hua; Yang, Xiaoyan; Zheng, Xintian; Huang, Cuiqin

    2017-11-01

    Traditionally, the genome has been described as the 'book of life'. However, the metaphor of a book may not reflect the dynamic nature of the structure and function of the genome. In the eukaryotic genome, the number of centrally located protein-coding sequences is relatively constant across species, but the amount of noncoding DNA increases considerably with the increase of organismal evolutional complexity. Therefore, it has been hypothesized that the abundant peripheral noncoding DNA protects the genome and the central protein-coding sequences in the eukaryotic genome. Upon comparison with the habitation, sociality and defense mechanisms of a social insect colony, it is found that the genome is similar to a social insect colony in various aspects. A social insect colony may thus be a better metaphor than a book to describe the spatial organization and physical functions of the genome. The potential implications of the metaphor are also discussed.

  15. Exploiting the Complementarity between Dereplication and Computer-Assisted Structure Elucidation for the Chemical Profiling of Natural Cosmetic Ingredients: Tephrosia purpurea as a Case Study.

    Science.gov (United States)

    Hubert, Jane; Chollet, Sébastien; Purson, Sylvain; Reynaud, Romain; Harakat, Dominique; Martinez, Agathe; Nuzillard, Jean-Marc; Renault, Jean-Hugues

    2015-07-24

    The aqueous-ethanolic extract of Tephrosia purpurea seeds is currently exploited in the cosmetic industry as a natural ingredient of skin lotions. The aim of this study was to chemically characterize this ingredient by combining centrifugal partition extraction (CPE) as a fractionation tool with two complementary identification approaches involving dereplication and computer-assisted structure elucidation. Following two rapid fractionations of the crude extract (2 g), seven major compounds namely, caffeic acid, quercetin-3-O-rutinoside, ethyl galactoside, ciceritol, stachyose, saccharose, and citric acid, were unambiguously identified within the CPE-generated simplified mixtures by a recently developed (13)C NMR-based dereplication method. The structures of four additional compounds, patuletin-3-O-rutinoside, kaempferol-3-O-rutinoside, guaiacylglycerol 8-vanillic acid ether, and 2-methyl-2-glucopyranosyloxypropanoic acid, were automatically elucidated by using the Logic for Structure Determination program based on the interpretation of 2D NMR (HSQC, HMBC, and COSY) connectivity data. As more than 80% of the crude extract mass was characterized without need for tedious and labor-intensive multistep purification procedures, the identification tools involved in this work constitute a promising strategy for an efficient and time-saving chemical profiling of natural extracts.

  16. Elucidation of IL-1/TGF-beta interactions in mouse chondrocyte cell line by genome-wide gene expression

    DEFF Research Database (Denmark)

    Takahashi, N; Rieneck, K; van der Kraan, P M

    2005-01-01

    To elucidate the antagonism between interleukin-1 (IL-1) and transforming growth factor-beta (TGF-beta) at the gene expression level, as IL-1 and TGF-beta are postulated to be critical mediators of cartilage degeneration/protection in rheumatic diseases....

  17. Diverse circovirus-like genome architectures revealed by environmental metagenomics.

    Science.gov (United States)

    Rosario, Karyna; Duffy, Siobain; Breitbart, Mya

    2009-10-01

    Single-stranded DNA (ssDNA) viruses with circular genomes are the smallest viruses known to infect eukaryotes. The present study identified 10 novel genomes similar to ssDNA circoviruses through data-mining of public viral metagenomes. The metagenomic libraries included samples from reclaimed water and three different marine environments (Chesapeake Bay, British Columbia coastal waters and Sargasso Sea). All the genomes have similarities to the replication (Rep) protein of circoviruses; however, only half have genomic features consistent with known circoviruses. Some of the genomes exhibit a mixture of genomic features associated with different families of ssDNA viruses (i.e. circoviruses, geminiviruses and parvoviruses). Unique genome architectures and phylogenetic analysis of the Rep protein suggest that these viruses belong to novel genera and/or families. Investigating the complex community of ssDNA viruses in the environment can lead to the discovery of divergent species and help elucidate evolutionary links between ssDNA viruses.

  18. Structural genomics in endocrinology

    NARCIS (Netherlands)

    Smit, J. W.; Romijn, J. A.

    2001-01-01

    Traditionally, endocrine research evolved from the phenotypical characterisation of endocrine disorders to the identification of underlying molecular pathophysiology. This approach has been, and still is, extremely successful. The introduction of genomics and proteomics has resulted in a reversal of

  19. Cloning, production, and purification of proteins for a medium-scale structural genomics project.

    Science.gov (United States)

    Quevillon-Cheruel, Sophie; Collinet, Bruno; Trésaugues, Lionel; Minard, Philippe; Henckes, Gilles; Aufrère, Robert; Blondeau, Karine; Zhou, Cong-Zhao; Liger, Dominique; Bettache, Nabila; Poupon, Anne; Aboulfath, Ilham; Leulliot, Nicolas; Janin, Joël; van Tilbeurgh, Herman

    2007-01-01

    The South-Paris Yeast Structural Genomics Pilot Project (http://www.genomics.eu.org) aims at systematically expressing, purifying, and determining the three-dimensional structures of Saccharomyces cerevisiae proteins. We have already cloned 240 yeast open reading frames in the Escherichia coli pET system. Eighty-two percent of the targets can be expressed in E. coli, and 61% yield soluble protein. We have currently purified 58 proteins. Twelve X-ray structures have been solved, six are in progress, and six other proteins gave crystals. In this chapter, we present the general experimental flowchart applied for this project. One of the main difficulties encountered in this pilot project was the low solubility of a great number of target proteins. We have developed parallel strategies to recover these proteins from inclusion bodies, including refolding, coexpression with chaperones, and an in vitro expression system. A limited proteolysis protocol, developed to localize flexible regions in proteins that could hinder crystallization, is also described.

  20. Opening plenary speaker: Human genomics, precision medicine, and advancing human health.

    Science.gov (United States)

    Green, Eric D

    2016-08-01

    Starting with the launch of the Human Genome Project in 1990, the past quarter-century has brought spectacular achievements in genomics that dramatically empower the study of human biology and disease. The human genomics enterprise is now in the midst of an important transition, as the growing foundation of genomic knowledge is being used by researchers and clinicians to tackle increasingly complex problems in biomedicine. Of particular prominence is the use of revolutionary new DNA sequencing technologies for generating prodigious amounts of DNA sequence data to elucidate the complexities of genome structure, function, and evolution, as well as to unravel the genomic bases of rare and common diseases. Together, these developments are ushering in the era of genomic medicine. Augmenting the advances in human genomics have been innovations in technologies for measuring environmental and lifestyle information, electronic health records, and data science; together, these provide opportunities of unprecedented scale and scope for investigating the underpinnings of health and disease. To capitalize on these opportunities, U.S. President Barack Obama recently announced a major new research endeavor - the U.S. Precision Medicine Initiative. This bold effort will be framed around several key aims, which include accelerating the use of genomically informed approaches to cancer care, making important policy and regulatory changes, and establishing a large research cohort of >1 million volunteers to facilitate precision medicine research. The latter will include making the partnership with all participants a centerpiece feature in the cohort's design and development. The Precision Medicine Initiative represents a broad-based research program that will allow new approaches for individualized medical care to be rigorously tested, so as to establish a new evidence base for advancing clinical practice and, eventually, human health.

  1. Use of γ-ray-induced mutations in the genome era in rice

    International Nuclear Information System (INIS)

    Kusaba, Makoto

    2007-01-01

    Ionizing radiation has been used for inducing mutations and improving crops since the discovery by STADLER (1928) that X-rays could induce mutations in barley. At the end of 2004, the whole genome sequence of rice was determined (INTERNATIONAL RICE GENOME SEQUENCING PROJECT, 2005). What can γ-ray-induced mutations contribute now that this has been achieved? One answer could be the elucidation of the functions of the numerous genes revealed by the complete sequence of the rice genome. This includes identification of mutants through reverse genetics and the isolation of genes containing mutations through forward genetics using molecular markers and sequence information. Another answer could be mutation breeding using reverse genetics. But first we must know what kind of DNA lesions are caused by γ-rays. In this article, I describe the production of DNA lesions, and then discuss how γ-ray-induced mutations can contribute to the elucidation of gene function and to mutation breeding. (author)

  2. Morphology, genome sequence, and structural proteome of type phage P335 from Lactococcus lactis

    DEFF Research Database (Denmark)

    Labrie, Simon J.; Josephsen, Jytte; Neve, Horst

    2008-01-01

    for a shorter tail and a different collar/whisker structure. Its 33,613-bp double-stranded DNA genome had 50 open reading frames. Putative functions were assigned to 29 of them. Unlike other sequenced genomes from lactococcal phages belonging to this species, P335 did not have a lysogeny module. However, it did...... genome. The genetic diversity of the P335 species indicates that they are exceptional models for studying the modular theory of phage evolution....

  3. Structure elucidation and immunomodulatory activity of a beta glucan from the fruiting bodies of Ganoderma sinense.

    Directory of Open Access Journals (Sweden)

    Xiao-Qiang Han

    Full Text Available A polysaccharide named GSP-2 with a molecular size of 32 kDa was isolated from the fruiting bodies of Ganoderma sinense. Its structure was well elucidated, by a combined utilization of chemical and spectroscopic techniques, to be a β-glucan with a backbone of (1→4- and (1→6-Glcp, bearing terminal- and (1→3-Glcp side-chains at O-3 position of (1→6-Glcp. Immunological assay exhibited that GSP-2 significantly induced the proliferation of BALB/c mice splenocytes with target on only B cells, and enhanced the production of several cytokines in human peripheral blood mononuclear cells and derived dendritic cells. Besides, the fluorescent labeled GSP-2 was phagocytosed by the RAW 264.7 cells and induced the nitric oxide secretion from the cells.

  4. Structural genomics: keeping up with expanding knowledge of the protein universe

    Science.gov (United States)

    Grabowski, Marek; Joachimiak, Andrzej; Otwinowski, Zbyszek; Minor, Wladek

    2010-01-01

    Structural characterization of the protein universe is the main mission of Structural Genomics (SG) programs. However, progress in gene sequencing technology, set in motion in the 1990s, has resulted in rapid expansion of protein sequence space — a twelvefold increase in the past seven years. For the SG field, this creates new challenges and necessitates a reassessment of its strategies. Nevertheless, despite the growth of sequence space, at present nearly half of the content of the Swiss-Prot database and over 40% of Pfam protein families can be structurally modeled based on structures determined so far, with SG projects making an increasingly significant contribution. The SG contribution of new Pfam structures nearly doubled from 27.2% in 2003 to 51.6% in 2006. PMID:17587562

  5. Structural genomics: keeping up with expanding knowledge of the protein universe.

    Science.gov (United States)

    Grabowski, Marek; Joachimiak, Andrzej; Otwinowski, Zbyszek; Minor, Wladek

    2007-06-01

    Structural characterization of the protein universe is the main mission of Structural Genomics (SG) programs. However, progress in gene sequencing technology, set in motion in the 1990s, has resulted in rapid expansion of protein sequence space--a twelvefold increase in the past seven years. For the SG field, this creates new challenges and necessitates a re-assessment of its strategies. Nevertheless, despite the growth of sequence space, at present nearly half of the content of the Swiss-Prot database and over 40% of Pfam protein families can be structurally modeled based on structures determined so far, with SG projects making an increasingly significant contribution. The SG contribution of new Pfam structures nearly doubled from 27.2% in 2003 to 51.6% in 2006.

  6. Macromolecular structure determination in the post-genome era

    International Nuclear Information System (INIS)

    Kuhn, P.; Soltis, S.M.

    2001-01-01

    Recent advances in genetics, molecular biology and crystallographic instrumentation and methodology have led to a revolution in the field of Structural Molecular Biology (SMB). These combined advances have paved the way to a more complete and detailed understanding of the biological macromolecules that make up an organism, both in terms of their individual functions and also the interactions between them. In this paper we describe a large-scale, genomic approach to the three-dimensional structure determination of macromolecules and their complexes, using high-throughput methodology to streamline all aspects of the process. This task requires the development of automated high-intensity synchrotron beam lines for X-ray diffraction data collection from single crystal samples. Furthermore, these beam lines must be operated within a sophisticated software and hardware environment, which is capable of delivering a completely automated structure determination pipeline. The SMB resource at SSRL is developing a system for the structure determination steps of this process, starting with the initial characterization of the frozen sample, followed by data collection, data reduction, phase determination, and model building. This paper focuses on the data collection elements of this high-throughput system

  7. Protein structure similarity clustering (PSSC) and natural product structure as inspiration sources for drug development and chemical genomics

    NARCIS (Netherlands)

    Dekker, Frank J; Koch, Marcus A; Waldmann, Herbert; Dekker, Frans

    Finding small molecules that modulate protein function is of primary importance in drug development and in the emerging field of chemical genomics. To facilitate the identification of such molecules, we developed a novel strategy making use of structural conservatism found in protein domain

  8. Long-Range Order and Fractality in the Structure and Organization of Eukaryotic Genomes

    Science.gov (United States)

    Polychronopoulos, Dimitris; Tsiagkas, Giannis; Athanasopoulou, Labrini; Sellis, Diamantis; Almirantis, Yannis

    2014-12-01

    The late Professor J.S. Nicolis always emphasized, both in his writings and in presentations and discussions with students and friends, the relevance of a dynamical systems approach to biology. In particular, viewing the genome as a "biological text" captures the dynamical character of both the evolution and function of the organisms in the form of correlations indicating the presence of a long-range order. This genomic structure can be expressed in forms reminiscent of natural languages and several temporal and spatial traces l by the functioning of dynamical systems: Zipf laws, self-similarity and fractality. Here we review several works of our group and recent unpublished results, focusing on the chromosomal distribution of biologically active genomic components: Genes and protein-coding segments, CpG islands, transposable elements belonging to all major classes and several types of conserved non-coding genomic elements. We report the systematic appearance of power-laws in the size distribution of the distances between elements belonging to each of these types of functional genomic elements. Moreover, fractality is also found in several cases, using box-counting and entropic scaling.We present here, for the first time in a unified way, an aggregative model of the genomic dynamics which can explain the observed patterns on the grounds of known phenomena accompanying genome evolution. Our results comply with recent findings about a "fractal globule" geometry of chromatin in the eukaryotic nucleus.

  9. 3-Dimensional atomic scale structure of the ionic liquid-graphite interface elucidated by AM-AFM and quantum chemical simulations

    Science.gov (United States)

    Page, Alister J.; Elbourne, Aaron; Stefanovic, Ryan; Addicoat, Matthew A.; Warr, Gregory G.; Voïtchovsky, Kislon; Atkin, Rob

    2014-06-01

    In situ amplitude modulated atomic force microscopy (AM-AFM) and quantum chemical simulations are used to resolve the structure of the highly ordered pyrolytic graphite (HOPG)-bulk propylammonium nitrate (PAN) interface with resolution comparable with that achieved for frozen ionic liquid (IL) monolayers using STM. This is the first time that (a) molecular resolution images of bulk IL-solid interfaces have been achieved, (b) the lateral structure of the IL graphite interface has been imaged for any IL, (c) AM-AFM has elucidated molecular level structure immersed in a viscous liquid and (d) it has been demonstrated that the IL structure at solid surfaces is a consequence of both thermodynamic and kinetic effects. The lateral structure of the PAN-graphite interface is highly ordered and consists of remarkably well-defined domains of a rhomboidal superstructure composed of propylammonium cations preferentially aligned along two of the three directions in the underlying graphite lattice. The nanostructure is primarily determined by the cation. Van der Waals interactions between the propylammonium chains and the surface mean that the cation is enriched in the surface layer, and is much less mobile than the anion. The presence of a heterogeneous lateral structure at an ionic liquid-solid interface has wide ranging ramifications for ionic liquid applications, including lubrication, capacitive charge storage and electrodeposition.In situ amplitude modulated atomic force microscopy (AM-AFM) and quantum chemical simulations are used to resolve the structure of the highly ordered pyrolytic graphite (HOPG)-bulk propylammonium nitrate (PAN) interface with resolution comparable with that achieved for frozen ionic liquid (IL) monolayers using STM. This is the first time that (a) molecular resolution images of bulk IL-solid interfaces have been achieved, (b) the lateral structure of the IL graphite interface has been imaged for any IL, (c) AM-AFM has elucidated molecular level

  10. Rapid detection of structural variation in a human genome using nanochannel-based genome mapping technology

    DEFF Research Database (Denmark)

    Cao, Hongzhi; Hastie, Alex R.; Cao, Dandan

    2014-01-01

    mutations; however, none of the current detection methods are comprehensive, and currently available methodologies are incapable of providing sufficient resolution and unambiguous information across complex regions in the human genome. To address these challenges, we applied a high-throughput, cost......-effective genome mapping technology to comprehensively discover genome-wide SVs and characterize complex regions of the YH genome using long single molecules (>150 kb) in a global fashion. RESULTS: Utilizing nanochannel-based genome mapping technology, we obtained 708 insertions/deletions and 17 inversions larger...... fosmid data. Of the remaining 270 SVs, 260 are insertions and 213 overlap known SVs in the Database of Genomic Variants. Overall, 609 out of 666 (90%) variants were supported by experimental orthogonal methods or historical evidence in public databases. At the same time, genome mapping also provides...

  11. GeneViTo: Visualizing gene-product functional and structural features in genomic datasets

    Directory of Open Access Journals (Sweden)

    Promponas Vasilis J

    2003-10-01

    Full Text Available Abstract Background The availability of increasing amounts of sequence data from completely sequenced genomes boosts the development of new computational methods for automated genome annotation and comparative genomics. Therefore, there is a need for tools that facilitate the visualization of raw data and results produced by bioinformatics analysis, providing new means for interactive genome exploration. Visual inspection can be used as a basis to assess the quality of various analysis algorithms and to aid in-depth genomic studies. Results GeneViTo is a JAVA-based computer application that serves as a workbench for genome-wide analysis through visual interaction. The application deals with various experimental information concerning both DNA and protein sequences (derived from public sequence databases or proprietary data sources and meta-data obtained by various prediction algorithms, classification schemes or user-defined features. Interaction with a Graphical User Interface (GUI allows easy extraction of genomic and proteomic data referring to the sequence itself, sequence features, or general structural and functional features. Emphasis is laid on the potential comparison between annotation and prediction data in order to offer a supplement to the provided information, especially in cases of "poor" annotation, or an evaluation of available predictions. Moreover, desired information can be output in high quality JPEG image files for further elaboration and scientific use. A compilation of properly formatted GeneViTo input data for demonstration is available to interested readers for two completely sequenced prokaryotes, Chlamydia trachomatis and Methanococcus jannaschii. Conclusions GeneViTo offers an inspectional view of genomic functional elements, concerning data stemming both from database annotation and analysis tools for an overall analysis of existing genomes. The application is compatible with Linux or Windows ME-2000-XP operating

  12. Refining the structure and content of clinical genomic reports.

    Science.gov (United States)

    Dorschner, Michael O; Amendola, Laura M; Shirts, Brian H; Kiedrowski, Lesli; Salama, Joseph; Gordon, Adam S; Fullerton, Stephanie M; Tarczy-Hornoch, Peter; Byers, Peter H; Jarvik, Gail P

    2014-03-01

    To effectively articulate the results of exome and genome sequencing we refined the structure and content of molecular test reports. To communicate results of a randomized control trial aimed at the evaluation of exome sequencing for clinical medicine, we developed a structured narrative report. With feedback from genetics and non-genetics professionals, we developed separate indication-specific and incidental findings reports. Standard test report elements were supplemented with research study-specific language, which highlighted the limitations of exome sequencing and provided detailed, structured results, and interpretations. The report format we developed to communicate research results can easily be transformed for clinical use by removal of research-specific statements and disclaimers. The development of clinical reports for exome sequencing has shown that accurate and open communication between the clinician and laboratory is ideally an ongoing process to address the increasing complexity of molecular genetic testing. © 2014 Wiley Periodicals, Inc.

  13. Molecular fossils in modern genomes provide physiological and geochemical insights to the ancient earth (Invited)

    Science.gov (United States)

    Dupont, C.; Caetano-Anolles, G.

    2010-12-01

    The genomes of extant organisms are ultimately derived from ancient life, thus theoretically contain insight to ancient physiology, ecology, and environments. In particular, metalloenzymes may be particularly insightful. The fundamental chemistry of trace elements dictates the molecular speciation and reactivity both within cells and the environment at large. Using protein structure and comparative genomics, we elucidate several major influences this chemistry has had upon biology. All of life exhibits the same proteome size-dependent scaling for the number of metal-binding proteins within a proteome. This fundamental evolutionary constant shows that the selection of one element occurs at the exclusion of another, with the eschewal of Fe for Zn and Ca being a defining feature of eukaryotic pro- teomes. Early life lacked both the structures required to control intracellular metal concentrations and the metal-binding proteins that catalyze electron transport and redox transformations. The development of protein structures for metal homeostasis coincided with the emergence of metal-specific structures, which predomi- nantly bound metals abundant in the Archean ocean. Potentially, this promoted the diversification of emerging lineages of Archaea and Bacteria through the establishment of biogeochemical cycles. In contrast, structures binding Cu and Zn evolved much later, pro- viding further evidence that environmental availability influenced the selection of the elements. The late evolving Zn-binding proteins are fundamental to eukaryotic cellular biology, and Zn bioavailabil- ity may have been a limiting factor in eukaryotic evolution. The results presented here provide an evolutionary timeline based on genomic characteristics, and key hypotheses can be tested by alternative geochemical methods.

  14. Recombination-dependent replication and gene conversion homogenize repeat sequences and diversify plastid genome structure.

    Science.gov (United States)

    Ruhlman, Tracey A; Zhang, Jin; Blazier, John C; Sabir, Jamal S M; Jansen, Robert K

    2017-04-01

    There is a misinterpretation in the literature regarding the variable orientation of the small single copy region of plastid genomes (plastomes). The common phenomenon of small and large single copy inversion, hypothesized to occur through intramolecular recombination between inverted repeats (IR) in a circular, single unit-genome, in fact, more likely occurs through recombination-dependent replication (RDR) of linear plastome templates. If RDR can be primed through both intra- and intermolecular recombination, then this mechanism could not only create inversion isomers of so-called single copy regions, but also an array of alternative sequence arrangements. We used Illumina paired-end and PacBio single-molecule real-time (SMRT) sequences to characterize repeat structure in the plastome of Monsonia emarginata (Geraniaceae). We used OrgConv and inspected nucleotide alignments to infer ancestral nucleotides and identify gene conversion among repeats and mapped long (>1 kb) SMRT reads against the unit-genome assembly to identify alternative sequence arrangements. Although M. emarginata lacks the canonical IR, we found that large repeats (>1 kilobase; kb) represent ∼22% of the plastome nucleotide content. Among the largest repeats (>2 kb), we identified GC-biased gene conversion and mapping filtered, long SMRT reads to the M. emarginata unit-genome assembly revealed alternative, substoichiometric sequence arrangements. We offer a model based on RDR and gene conversion between long repeated sequences in the M. emarginata plastome and provide support that both intra-and intermolecular recombination between large repeats, particularly in repeat-rich plastomes, varies unit-genome structure while homogenizing the nucleotide sequence of repeats. © 2017 Botanical Society of America.

  15. Genomic evolution of Saccharomyces cerevisiae under Chinese rice wine fermentation.

    Science.gov (United States)

    Li, Yudong; Zhang, Weiping; Zheng, Daoqiong; Zhou, Zhan; Yu, Wenwen; Zhang, Lei; Feng, Lifang; Liang, Xinle; Guan, Wenjun; Zhou, Jingwen; Chen, Jian; Lin, Zhenguo

    2014-09-10

    Rice wine fermentation represents a unique environment for the evolution of the budding yeast, Saccharomyces cerevisiae. To understand how the selection pressure shaped the yeast genome and gene regulation, we determined the genome sequence and transcriptome of a S. cerevisiae strain YHJ7 isolated from Chinese rice wine (Huangjiu), a popular traditional alcoholic beverage in China. By comparing the genome of YHJ7 to the lab strain S288c, a Japanese sake strain K7, and a Chinese industrial bioethanol strain YJSH1, we identified many genomic sequence and structural variations in YHJ7, which are mainly located in subtelomeric regions, suggesting that these regions play an important role in genomic evolution between strains. In addition, our comparative transcriptome analysis between YHJ7 and S288c revealed a set of differentially expressed genes, including those involved in glucose transport (e.g., HXT2, HXT7) and oxidoredutase activity (e.g., AAD10, ADH7). Interestingly, many of these genomic and transcriptional variations are directly or indirectly associated with the adaptation of YHJ7 strain to its specific niches. Our molecular evolution analysis suggested that Japanese sake strains (K7/UC5) were derived from Chinese rice wine strains (YHJ7) at least approximately 2,300 years ago, providing the first molecular evidence elucidating the origin of Japanese sake strains. Our results depict interesting insights regarding the evolution of yeast during rice wine fermentation, and provided a valuable resource for genetic engineering to improve industrial wine-making strains. © The Author(s) 2014. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  16. The nuclear genome of Rhazya stricta and the evolution of alkaloid diversity in a medically relevant clade of Apocynaceae.

    Science.gov (United States)

    Sabir, Jamal S M; Jansen, Robert K; Arasappan, Dhivya; Calderon, Virginie; Noutahi, Emmanuel; Zheng, Chunfang; Park, Seongjun; Sabir, Meshaal J; Baeshen, Mohammed N; Hajrah, Nahid H; Khiyami, Mohammad A; Baeshen, Nabih A; Obaid, Abdullah Y; Al-Malki, Abdulrahman L; Sankoff, David; El-Mabrouk, Nadia; Ruhlman, Tracey A

    2016-09-22

    Alkaloid accumulation in plants is activated in response to stress, is limited in distribution and specific alkaloid repertoires are variable across taxa. Rauvolfioideae (Apocynaceae, Gentianales) represents a major center of structural expansion in the monoterpenoid indole alkaloids (MIAs) yielding thousands of unique molecules including highly valuable chemotherapeutics. The paucity of genome-level data for Apocynaceae precludes a deeper understanding of MIA pathway evolution hindering the elucidation of remaining pathway enzymes and the improvement of MIA availability in planta or in vitro. We sequenced the nuclear genome of Rhazya stricta (Apocynaceae, Rauvolfioideae) and present this high quality assembly in comparison with that of coffee (Rubiaceae, Coffea canephora, Gentianales) and others to investigate the evolution of genome-scale features. The annotated Rhazya genome was used to develop the community resource, RhaCyc, a metabolic pathway database. Gene family trees were constructed to identify homologs of MIA pathway genes and to examine their evolutionary history. We found that, unlike Coffea, the Rhazya lineage has experienced many structural rearrangements. Gene tree analyses suggest recent, lineage-specific expansion and diversification among homologs encoding MIA pathway genes in Gentianales and provide candidate sequences with the potential to close gaps in characterized pathways and support prospecting for new MIA production avenues.

  17. Salmonella strains isolated from Galápagos iguanas show spatial structuring of serovar and genomic diversity.

    Science.gov (United States)

    Lankau, Emily W; Cruz Bedon, Lenin; Mackie, Roderick I

    2012-01-01

    It is thought that dispersal limitation primarily structures host-associated bacterial populations because host distributions inherently limit transmission opportunities. However, enteric bacteria may disperse great distances during food-borne outbreaks. It is unclear if such rapid long-distance dispersal events happen regularly in natural systems or if these events represent an anthropogenic exception. We characterized Salmonella enterica isolates from the feces of free-living Galápagos land and marine iguanas from five sites on four islands using serotyping and genomic fingerprinting. Each site hosted unique and nearly exclusive serovar assemblages. Genomic fingerprint analysis offered a more complex model of S. enterica biogeography, with evidence of both unique strain pools and of spatial population structuring along a geographic gradient. These findings suggest that even relatively generalist enteric bacteria may be strongly dispersal limited in a natural system with strong barriers, such as oceanic divides. Yet, these differing results seen on two typing methods also suggests that genomic variation is less dispersal limited, allowing for different ecological processes to shape biogeographical patterns of the core and flexible portions of this bacterial species' genome.

  18. Structure Elucidation and Immunomodulatory Activity of A Beta Glucan from the Fruiting Bodies of Ganoderma sinense

    Science.gov (United States)

    Yue, Rui-Qi; Dong, Cai-Xia; Chan, Chung-Lap; Ko, Chun-Hay; Cheung, Wing-Shing; Luo, Ke-Wang; Dai, Hui; Wong, Chun-Kwok; Leung, Ping-Chung; Han, Quan-Bin

    2014-01-01

    A polysaccharide named GSP-2 with a molecular size of 32 kDa was isolated from the fruiting bodies of Ganoderma sinense. Its structure was well elucidated, by a combined utilization of chemical and spectroscopic techniques, to be a β-glucan with a backbone of (1→4)– and (1→6)–Glcp, bearing terminal- and (1→3)–Glcp side-chains at O-3 position of (1→6)–Glcp. Immunological assay exhibited that GSP-2 significantly induced the proliferation of BALB/c mice splenocytes with target on only B cells, and enhanced the production of several cytokines in human peripheral blood mononuclear cells and derived dendritic cells. Besides, the fluorescent labeled GSP-2 was phagocytosed by the RAW 264.7 cells and induced the nitric oxide secretion from the cells. PMID:25014571

  19. Structure elucidation of the diagnostic product ion at m/z 97 derived from androst-4-en-3-one-based steroids by ESI-CID and IRMPD spectroscopy

    NARCIS (Netherlands)

    Thevis, M.; Beuck, S.; Hoeppner, S.; Thomas, A.; Held, J.; Schaefer, M.; Oomens, J.; Schaenzer, W.

    2012-01-01

    Structure elucidation of steroids by mass spectrometry has been of great importance to various analytical arenas and numerous studies were conducted to provide evidence for the composition and origin of (tandem) mass spectrometry-derived product ions used to characterize and identify steroidal

  20. Reconstruction of putative DNA virus from endogenous rice tungro bacilliform virus-like sequences in the rice genome: implications for integration and evolution

    Directory of Open Access Journals (Sweden)

    Kishima Yuji

    2004-10-01

    Full Text Available Abstract Background Plant genomes contain various kinds of repetitive sequences such as transposable elements, microsatellites, tandem repeats and virus-like sequences. Most of them, with the exception of virus-like sequences, do not allow us to trace their origins nor to follow the process of their integration into the host genome. Recent discoveries of virus-like sequences in plant genomes led us to set the objective of elucidating the origin of the repetitive sequences. Endogenous rice tungro bacilliform virus (RTBV-like sequences (ERTBVs have been found throughout the rice genome. Here, we reconstructed putative virus structures from RTBV-like sequences in the rice genome and characterized to understand evolutionary implication, integration manner and involvements of endogenous virus segments in the corresponding disease response. Results We have collected ERTBVs from the rice genomes. They contain rearranged structures and no intact ORFs. The identified ERTBV segments were shown to be phylogenetically divided into three clusters. For each phylogenetic cluster, we were able to make a consensus alignment for a circular virus-like structure carrying two complete ORFs. Comparisons of DNA and amino acid sequences suggested the closely relationship between ERTBV and RTBV. The Oryza AA-genome species vary in the ERTBV copy number. The species carrying low-copy-number of ERTBV segments have been reported to be extremely susceptible to RTBV. The DNA methylation state of the ERTBV sequences was correlated with their copy number in the genome. Conclusions These ERTBV segments are unlikely to have functional potential as a virus. However, these sequences facilitate to establish putative virus that provided information underlying virus integration and evolutionary relationship with existing virus. Comparison of ERTBV among the Oryza AA-genome species allowed us to speculate a possible role of endogenous virus segments against its related disease.

  1. Gene design, cloning and protein-expression methods for high-value targets at the Seattle Structural Genomics Center for Infectious Disease

    International Nuclear Information System (INIS)

    Raymond, Amy; Haffner, Taryn; Ng, Nathan; Lorimer, Don; Staker, Bart; Stewart, Lance

    2011-01-01

    An overview of one salvage strategy for high-value SSGCID targets is given. Any structural genomics endeavor, particularly ambitious ones such as the NIAID-funded Seattle Structural Genomics Center for Infectious Disease (SSGCID) and Center for Structural Genomics of Infectious Disease (CSGID), face technical challenges at all points of the production pipeline. One salvage strategy employed by SSGCID is combined gene engineering and structure-guided construct design to overcome challenges at the levels of protein expression and protein crystallization. Multiple constructs of each target are cloned in parallel using Polymerase Incomplete Primer Extension cloning and small-scale expressions of these are rapidly analyzed by capillary electrophoresis. Using the methods reported here, which have proven particularly useful for high-value targets, otherwise intractable targets can be resolved

  2. OrthoVenn: a web server for genome wide comparison and annotation of orthologous clusters across multiple species

    Science.gov (United States)

    Genome wide analysis of orthologous clusters is an important component of comparative genomics studies. Identifying the overlap among orthologous clusters can enable us to elucidate the function and evolution of proteins across multiple species. Here, we report a web platform named OrthoVenn that i...

  3. Pre-genomic, genomic and post-genomic study of microbial communities involved in bioenergy.

    Science.gov (United States)

    Rittmann, Bruce E; Krajmalnik-Brown, Rosa; Halden, Rolf U

    2008-08-01

    Microorganisms can produce renewable energy in large quantities and without damaging the environment or disrupting food supply. The microbial communities must be robust and self-stabilizing, and their essential syntrophies must be managed. Pre-genomic, genomic and post-genomic tools can provide crucial information about the structure and function of these microbial communities. Applying these tools will help accelerate the rate at which microbial bioenergy processes move from intriguing science to real-world practice.

  4. Interpretation of Genomic Data Questions and Answers

    Science.gov (United States)

    Simon, Richard

    2008-01-01

    Using a question and answer format we describe important aspects of using genomic technologies in cancer research. The main challenges are not managing the mass of data, but rather the design, analysis and accurate reporting of studies that result in increased biological knowledge and medical utility. Many analysis issues address the use of expression microarrays but are also applicable to other whole genome assays. Microarray based clinical investigations have generated both unrealistic hyperbole and excessive skepticism. Genomic technologies are tremendously powerful and will play instrumental roles in elucidating the mechanisms of oncogenesis and in devlopingan era of predictive medicine in which treatments are tailored to individual tumors. Achieving these goals involves challenges in re-thinking many paradigms for the conduct of basic and clinical cancer research and for the organization of interdisciplinary collaboration. PMID:18582627

  5. Chapter 27 -- Breast Cancer Genomics, Section VI, Pathology and Biological Markers of Invasive Breast Cancer

    Energy Technology Data Exchange (ETDEWEB)

    Spellman, Paul T.; Heiser, Laura; Gray, Joe W.

    2009-06-18

    Breast cancer is predominantly a disease of the genome with cancers arising and progressing through accumulation of aberrations that alter the genome - by changing DNA sequence, copy number, and structure in ways that that contribute to diverse aspects of cancer pathophysiology. Classic examples of genomic events that contribute to breast cancer pathophysiology include inherited mutations in BRCA1, BRCA2, TP53, and CHK2 that contribute to the initiation of breast cancer, amplification of ERBB2 (formerly HER2) and mutations of elements of the PI3-kinase pathway that activate aspects of epidermal growth factor receptor (EGFR) signaling and deletion of CDKN2A/B that contributes to cell cycle deregulation and genome instability. It is now apparent that accumulation of these aberrations is a time-dependent process that accelerates with age. Although American women living to an age of 85 have a 1 in 8 chance of developing breast cancer, the incidence of cancer in women younger than 30 years is uncommon. This is consistent with a multistep cancer progression model whereby mutation and selection drive the tumor's development, analogous to traditional Darwinian evolution. In the case of cancer, the driving events are changes in sequence, copy number, and structure of DNA and alterations in chromatin structure or other epigenetic marks. Our understanding of the genetic, genomic, and epigenomic events that influence the development and progression of breast cancer is increasing at a remarkable rate through application of powerful analysis tools that enable genome-wide analysis of DNA sequence and structure, copy number, allelic loss, and epigenomic modification. Application of these techniques to elucidation of the nature and timing of these events is enriching our understanding of mechanisms that increase breast cancer susceptibility, enable tumor initiation and progression to metastatic disease, and determine therapeutic response or resistance. These studies also

  6. Isolation and structure elucidation of avocado seed (Persea americana) lipid derivatives that inhibit Clostridium sporogenes endospore germination.

    Science.gov (United States)

    Rodríguez-Sánchez, Dariana Graciela; Pacheco, Adriana; García-Cruz, María Isabel; Gutiérrez-Uribe, Janet Alejandra; Benavides-Lozano, Jorge Alejandro; Hernández-Brenes, Carmen

    2013-07-31

    Avocado fruit extracts are known to exhibit antimicrobial properties. However, the effects on bacterial endospores and the identity of antimicrobial compounds have not been fully elucidated. In this study, avocado seed extracts were tested against Clostridium sporogenes vegetative cells and active endospores. Bioassay-guided purification of a crude extract based on inhibitory properties linked antimicrobial action to six lipid derivatives from the family of acetogenin compounds. Two new structures and four compounds known to exist in nature were identified as responsible for the activity. Structurally, most potent molecules shared features of an acetyl moiety and a trans-enone group. All extracts produced inhibition zones on vegetative cells and active endospores. Minimum inhibitory concentrations (MIC) of isolated molecules ranged from 7.8 to 15.6 μg/mL, and bactericidal effects were observed for an enriched fraction at 19.5 μg/mL. Identified molecules showed potential as natural alternatives to additives and antibiotics used by the food and pharmaceutical industries to inhibit Gram-positive spore-forming bacteria.

  7. Genomic interrogation of mechanism(s) underlying cellular responses to toxicants

    International Nuclear Information System (INIS)

    Amin, Rupesh P.; Hamadeh, Hisham K.; Bushel, Pierre R.; Bennett, Lee; Afshari, Cynthia A.; Paules, Richard S.

    2002-01-01

    Assessment of the impact of xenobiotic exposure on human health and disease progression is complex. Knowledge of mode(s) of action, including mechanism(s) contributing to toxicity and disease progression, is valuable for evaluating compounds. Toxicogenomics, the subdiscipline which merges genomics with toxicology, holds the promise to contributing significantly toward the goal of elucidating mechanism(s) by studying genome-wide effects of xenobiotics. Global gene expression profiling, revolutionized by microarray technology and a crucial aspect of a toxicogenomic study, allows measuring transcriptional modulation of thousands of genes following exposure to a xenobiotic. We use our results from previous studies on compounds representing two different classes of xenobiotics (barbiturate and peroxisome proliferator) to discuss the application of computational approaches for analyzing microarray data to elucidate mechanism(s) underlying cellular responses to toxicants. In particular, our laboratory demonstrated that chemical-specific patterns of gene expression can be revealed using cDNA microarrays. Transcript profiling provides discrimination between classes of toxicants, as well as, genome-wide insight into mechanism(s) of toxicity and disease progression. Ultimately, the expectation is that novel approaches for predicting xenobiotic toxicity in humans will emerge from such information

  8. Draft genome sequence of a multidrug-resistant Chryseobacterium indologenes isolate from Malaysia

    Directory of Open Access Journals (Sweden)

    Choo Yee Yu

    2016-03-01

    Full Text Available Chryseobacterium indologenes is an emerging pathogen which poses a threat in clinical healthcare setting due to its multidrug-resistant phenotype and its common association with nosocomial infections. Here, we report the draft genome of a multidrug-resistant C. indologenes CI_885 isolated in 2014 from Malaysia. The 908,704-kb genome harbors a repertoire of putative antibiotic resistance determinants which may elucidate the molecular basis and underlying mechanisms of its resistant to various classes of antibiotics. The genome sequence has been deposited in DDBJ/EMBL/GenBank under the accession number LJOD00000000. Keywords: Chryseobacterium indologenes, Genome, Multi-drug resistant, blaIND, Next generation sequencing

  9. Mudskipper genomes provide insights into the terrestrial adaptation of amphibious fishes

    DEFF Research Database (Denmark)

    You, Xinxin; Bian, Chao; Zan, Qijie

    2014-01-01

    Mudskippers are amphibious fishes that have developed morphological and physiological adaptations to match their unique lifestyles. Here we perform whole-genome sequencing of four representative mudskippers to elucidate the molecular mechanisms underlying these adaptations. We discover an expansi...

  10. Structural elucidation of the hormonal inhibition mechanism of the bile acid cholate on human carbonic anhydrase II

    Energy Technology Data Exchange (ETDEWEB)

    Boone, Christopher D. [University of Florida, PO Box 100267, Gainesville, FL 32610 (United States); Tu, Chingkuang [University of Florida, PO Box 100245, Gainesville, FL 32610 (United States); McKenna, Robert, E-mail: rmckenna@ufl.edu [University of Florida, PO Box 100267, Gainesville, FL 32610 (United States)

    2014-06-01

    The structure of human carbonic anhydrase II in complex with cholate has been determined to 1.54 Å resolution. Elucidation of the novel inhibition mechanism of cholate will aid in the development of a nonsulfur-containing, isoform-specific therapeutic agent. The carbonic anhydrases (CAs) are a family of mostly zinc metalloenzymes that catalyze the reversible hydration/dehydration of CO{sub 2} into bicarbonate and a proton. Human isoform CA II (HCA II) is abundant in the surface epithelial cells of the gastric mucosa, where it serves an important role in cytoprotection through bicarbonate secretion. Physiological inhibition of HCA II via the bile acids contributes to mucosal injury in ulcerogenic conditions. This study details the weak biophysical interactions associated with the binding of a primary bile acid, cholate, to HCA II. The X-ray crystallographic structure determined to 1.54 Å resolution revealed that cholate does not make any direct hydrogen-bond interactions with HCA II, but instead reconfigures the well ordered water network within the active site to promote indirect binding to the enzyme. Structural knowledge of the binding interactions of this nonsulfur-containing inhibitor with HCA II could provide the template design for high-affinity, isoform-specific therapeutic agents for a variety of diseases/pathological states, including cancer, glaucoma, epilepsy and osteoporosis.

  11. The Genomic Code: Genome Evolution and Potential Applications

    KAUST Repository

    Bernardi, Giorgio

    2016-01-25

    The genome of metazoans is organized according to a genomic code which comprises three laws: 1) Compositional correlations hold between contiguous coding and non-coding sequences, as well as among the three codon positions of protein-coding genes; these correlations are the consequence of the fact that the genomes under consideration consist of fairly homogeneous, long (≥200Kb) sequences, the isochores; 2) Although isochores are defined on the basis of purely compositional properties, GC levels of isochores are correlated with all tested structural and functional properties of the genome; 3) GC levels of isochores are correlated with chromosome architecture from interphase to metaphase; in the case of interphase the correlation concerns isochores and the three-dimensional “topological associated domains” (TADs); in the case of mitotic chromosomes, the correlation concerns isochores and chromosomal bands. Finally, the genomic code is the fourth and last pillar of molecular biology, the first three pillars being 1) the double helix structure of DNA; 2) the regulation of gene expression in prokaryotes; and 3) the genetic code.

  12. Defining functional DNA elements in the human genome

    Science.gov (United States)

    Kellis, Manolis; Wold, Barbara; Snyder, Michael P.; Bernstein, Bradley E.; Kundaje, Anshul; Marinov, Georgi K.; Ward, Lucas D.; Birney, Ewan; Crawford, Gregory E.; Dekker, Job; Dunham, Ian; Elnitski, Laura L.; Farnham, Peggy J.; Feingold, Elise A.; Gerstein, Mark; Giddings, Morgan C.; Gilbert, David M.; Gingeras, Thomas R.; Green, Eric D.; Guigo, Roderic; Hubbard, Tim; Kent, Jim; Lieb, Jason D.; Myers, Richard M.; Pazin, Michael J.; Ren, Bing; Stamatoyannopoulos, John A.; Weng, Zhiping; White, Kevin P.; Hardison, Ross C.

    2014-01-01

    With the completion of the human genome sequence, attention turned to identifying and annotating its functional DNA elements. As a complement to genetic and comparative genomics approaches, the Encyclopedia of DNA Elements Project was launched to contribute maps of RNA transcripts, transcriptional regulator binding sites, and chromatin states in many cell types. The resulting genome-wide data reveal sites of biochemical activity with high positional resolution and cell type specificity that facilitate studies of gene regulation and interpretation of noncoding variants associated with human disease. However, the biochemically active regions cover a much larger fraction of the genome than do evolutionarily conserved regions, raising the question of whether nonconserved but biochemically active regions are truly functional. Here, we review the strengths and limitations of biochemical, evolutionary, and genetic approaches for defining functional DNA segments, potential sources for the observed differences in estimated genomic coverage, and the biological implications of these discrepancies. We also analyze the relationship between signal intensity, genomic coverage, and evolutionary conservation. Our results reinforce the principle that each approach provides complementary information and that we need to use combinations of all three to elucidate genome function in human biology and disease. PMID:24753594

  13. Metabolic and genomic analysis elucidates strain-level variation in Microbacterium spp. isolated from chromate contaminated sediment

    Data.gov (United States)

    U.S. Environmental Protection Agency — The data is in the form of genomic sequences deposited in a public database, growth curves, and bioinformatic analysis of sequences. This dataset is associated with...

  14. The Complete Chloroplast Genome Sequences of Six Rehmannia Species

    Directory of Open Access Journals (Sweden)

    Shuyun Zeng

    2017-03-01

    Full Text Available Rehmannia is a non-parasitic genus in Orobanchaceae including six species mainly distributed in central and north China. Its phylogenetic position and infrageneric relationships remain uncertain due to potential hybridization and polyploidization. In this study, we sequenced and compared the complete chloroplast genomes of six Rehmannia species using Illumina sequencing technology to elucidate the interspecific variations. Rehmannia plastomes exhibited typical quadripartite and circular structures with good synteny of gene order. The complete genomes ranged from 153,622 bp to 154,055 bp in length, including 133 genes encoding 88 proteins, 37 tRNAs, and 8 rRNAs. Three genes (rpoA, rpoC2, accD have potentially experienced positive selection. Plastome size variation of Rehmannia was mainly ascribed to the expansion and contraction of the border regions between the inverted repeat (IR region and the single-copy (SC regions. Despite of the conserved structure in Rehmannia plastomes, sequence variations provide useful phylogenetic information. Phylogenetic trees of 23 Lamiales species reconstructed with the complete plastomes suggested that Rehmannia was monophyletic and sister to the clade of Lindenbergia and the parasitic taxa in Orobanchaceae. The interspecific relationships within Rehmannia were completely different with the previous studies. In future, population phylogenomic works based on plastomes are urgently needed to clarify the evolutionary history of Rehmannia.

  15. Structural and functional insights of β-glucosidases identified from the genome of Aspergillus fumigatus

    Science.gov (United States)

    Dodda, Subba Reddy; Aich, Aparajita; Sarkar, Nibedita; Jain, Piyush; Jain, Sneha; Mondal, Sudipa; Aikat, Kaustav; Mukhopadhyay, Sudit S.

    2018-03-01

    Thermostable glucose tolerant β-glucosidase from Aspergillus species has attracted worldwide interest for their potentiality in industrial applications and bioethanol production. A strain of Aspergillus fumigatus (AfNITDGPKA3) identified by our laboratory from straw retting ground showed higher cellulase activity, specifically the β-glucosidase activity, compared to other contemporary strains. Though A. fumigatus has been known for high cellulase activity, detailed identification and characterization of the cellulase genes from their genome is yet to be done. In this work we have been analyzed the cellulase genes from the genome sequence database of Aspergillus fumigatus (Af293). Genome analysis suggests two cellobiohydrolase, eleven endoglucanase and seventeen β-glucosidase genes present. β-Glucosidase genes belong to either Glycohydro1 (GH1 or Bgl1) or Glycohydro3 (GH3 or Bgl3) family. The sequence similarity suggests that Bgl1 and Bgl3 of A. fumagatus are phylogenetically close to those of A. fisheri and A. oryzae. The modelled structure of the Bgl1 predicts the (β/α)8 barrel type structure with deep and narrow active site, whereas, Bgl3 shows the (α/β)8 barrel and (α/β)6 sandwich structure with shallow and open active site. Docking results suggest that amino acids Glu544, Glu466, Trp408,Trp567,Tyr44,Tyr222,Tyr770,Asp844,Asp537,Asn212,Asn217 of Bgl3 and Asp224,Asn242,Glu440, Glu445, Tyr367, Tyr365,Thr994,Trp435,Trp446 of Bgl1 are involved in the hydrolysis. Binding affinity analyses suggest that Bgl3 and Bgl1 enzymes are more active on the substrates like 4-methylumbelliferyl glycoside (MUG) and p-nitrophenyl-β-D-1, 4-glucopyranoside (pNPG) than on cellobiose. Further docking with glucose suggests that Bgl1 is more glucose tolerant than Bgl3. Analysis of the Aspergillus fumigatus genome may help to identify a β-glucosidase enzyme with better property and the structural information may help to develop an engineered recombinant enzyme.

  16. Extensive structural variations between mitochondrial genomes of CMS and normal peppers (Capsicum annuum L.) revealed by complete nucleotide sequencing.

    Science.gov (United States)

    Jo, Yeong Deuk; Choi, Yoomi; Kim, Dong-Hwan; Kim, Byung-Dong; Kang, Byoung-Cheorl

    2014-07-04

    Cytoplasmic male sterility (CMS) is an inability to produce functional pollen that is caused by mutation of the mitochondrial genome. Comparative analyses of mitochondrial genomes of lines with and without CMS in several species have revealed structural differences between genomes, including extensive rearrangements caused by recombination. However, the mitochondrial genome structure and the DNA rearrangements that may be related to CMS have not been characterized in Capsicum spp. We obtained the complete mitochondrial genome sequences of the pepper CMS line FS4401 (507,452 bp) and the fertile line Jeju (511,530 bp). Comparative analysis between mitochondrial genomes of peppers and tobacco that are included in Solanaceae revealed extensive DNA rearrangements and poor conservation in non-coding DNA. In comparison between pepper lines, FS4401 and Jeju mitochondrial DNAs contained the same complement of protein coding genes except for one additional copy of an atp6 gene (ψatp6-2) in FS4401. In terms of genome structure, we found eighteen syntenic blocks in the two mitochondrial genomes, which have been rearranged in each genome. By contrast, sequences between syntenic blocks, which were specific to each line, accounted for 30,380 and 17,847 bp in FS4401 and Jeju, respectively. The previously-reported CMS candidate genes, orf507 and ψatp6-2, were located on the edges of the largest sequence segments that were specific to FS4401. In this region, large number of small sequence segments which were absent or found on different locations in Jeju mitochondrial genome were combined together. The incorporation of repeats and overlapping of connected sequence segments by a few nucleotides implied that extensive rearrangements by homologous recombination might be involved in evolution of this region. Further analysis using mtDNA pairs from other plant species revealed common features of DNA regions around CMS-associated genes. Although large portion of sequence context was

  17. Identification of genomic indels and structural variations using split reads

    Directory of Open Access Journals (Sweden)

    Urban Alexander E

    2011-07-01

    Full Text Available Abstract Background Recent studies have demonstrated the genetic significance of insertions, deletions, and other more complex structural variants (SVs in the human population. With the development of the next-generation sequencing technologies, high-throughput surveys of SVs on the whole-genome level have become possible. Here we present split-read identification, calibrated (SRiC, a sequence-based method for SV detection. Results We start by mapping each read to the reference genome in standard fashion using gapped alignment. Then to identify SVs, we score each of the many initial mappings with an assessment strategy designed to take into account both sequencing and alignment errors (e.g. scoring more highly events gapped in the center of a read. All current SV calling methods have multilevel biases in their identifications due to both experimental and computational limitations (e.g. calling more deletions than insertions. A key aspect of our approach is that we calibrate all our calls against synthetic data sets generated from simulations of high-throughput sequencing (with realistic error models. This allows us to calculate sensitivity and the positive predictive value under different parameter-value scenarios and for different classes of events (e.g. long deletions vs. short insertions. We run our calculations on representative data from the 1000 Genomes Project. Coupling the observed numbers of events on chromosome 1 with the calibrations gleaned from the simulations (for different length events allows us to construct a relatively unbiased estimate for the total number of SVs in the human genome across a wide range of length scales. We estimate in particular that an individual genome contains ~670,000 indels/SVs. Conclusions Compared with the existing read-depth and read-pair approaches for SV identification, our method can pinpoint the exact breakpoints of SV events, reveal the actual sequence content of insertions, and cover the whole

  18. Comparative Genomic Hybridization Analysis of Yersinia enterocolitica and Yersinia pseudotuberculosis Identifies Genetic Traits to Elucidate Their Different Ecologies

    Directory of Open Access Journals (Sweden)

    Kaisa Jaakkola

    2015-01-01

    Full Text Available Enteropathogenic Yersinia enterocolitica and Yersinia pseudotuberculosis are both etiological agents for intestinal infection known as yersiniosis, but their epidemiology and ecology bear many differences. Swine are the only known reservoir for Y. enterocolitica 4/O:3 strains, which are the most common cause of human disease, while Y. pseudotuberculosis has been isolated from a variety of sources, including vegetables and wild animals. Infections caused by Y. enterocolitica mainly originate from swine, but fresh produce has been the source for widespread Y. pseudotuberculosis outbreaks within recent decades. A comparative genomic hybridization analysis with a DNA microarray based on three Yersinia enterocolitica and four Yersinia pseudotuberculosis genomes was conducted to shed light on the genomic differences between enteropathogenic Yersinia. The hybridization results identified Y. pseudotuberculosis strains to carry operons linked with the uptake and utilization of substances not found in living animal tissues but present in soil, plants, and rotting flesh. Y. pseudotuberculosis also harbors a selection of type VI secretion systems targeting other bacteria and eukaryotic cells. These genetic traits are not found in Y. enterocolitica, and it appears that while Y. pseudotuberculosis has many tools beneficial for survival in varied environments, the Y. enterocolitica genome is more streamlined and adapted to their preferred animal reservoir.

  19. Draft Genome Sequence of the Human-Pathogenic Fungus Scedosporium boydii

    OpenAIRE

    Duvaux, Ludovic; Shiller, Jason; Vandeputte, Patrick; Dug? de Bernonville, Thomas; Thornton, Christopher; Papon, Nicolas; Le Cam, Bruno; Bouchara, Jean-Philippe; Gastebois, Amandine

    2017-01-01

    ABSTRACT The opportunistic fungal pathogen Scedosporium boydii is the most common Scedosporium species in French patients with cystic fibrosis. Here we present the first genome report for S.?boydii, providing a resource which may enable the elucidation of the pathogenic mechanisms in this species.

  20. Translating the cancer genome: Going beyond p values

    Energy Technology Data Exchange (ETDEWEB)

    Chin, Lynda; Chin, Lynda; Gray, Joe W.

    2008-04-03

    Cancer cells are endowed with diverse biological capabilities driven by myriad inherited and somatic genetic and epigenetic aberrations that commandeer key cancer-relevant pathways. Efforts to elucidate these aberrations began with Boveri's hypothesis of aberrant mitoses causing cancer and continue today with a suite of powerful high-resolution technologies that enable detailed catalogues of genomic aberrations and epigenomic modifications. Tomorrow will likely bring the complete atlas of reversible and irreversible alteration in individual cancers. The challenge now is to discern causal molecular abnormalities from genomic and epigenomic 'noise', to understand how the ensemble of these aberrations collaborate to drive cancer pathophysiology. Here, we highlight lessons learned from now classical examples of successful translation of genomic discoveries into clinical practice, lessons that may be used to guide and accelerate translation of emerging genomic insights into practical clinical endpoints that can impact on practice of cancer medicine.

  1. Genomic and chromatin signals underlying transcription start-site selection

    DEFF Research Database (Denmark)

    Valen, Eivind; Sandelin, Albin Gustav

    2011-01-01

    A central question in cellular biology is how the cell regulates transcription and discerns when and where to initiate it. Locating transcription start sites (TSSs), the signals that specify them, and ultimately elucidating the mechanisms of regulated initiation has therefore been a recurrent theme....... In recent years substantial progress has been made towards this goal, spurred by the possibility of applying genome-wide, sequencing-based analysis. We now have a large collection of high-resolution datasets identifying locations of TSSs, protein-DNA interactions, and chromatin features over whole genomes...

  2. Salmonella Strains Isolated from Galápagos Iguanas Show Spatial Structuring of Serovar and Genomic Diversity

    Science.gov (United States)

    Lankau, Emily W.; Cruz Bedon, Lenin; Mackie, Roderick I.

    2012-01-01

    It is thought that dispersal limitation primarily structures host-associated bacterial populations because host distributions inherently limit transmission opportunities. However, enteric bacteria may disperse great distances during food-borne outbreaks. It is unclear if such rapid long-distance dispersal events happen regularly in natural systems or if these events represent an anthropogenic exception. We characterized Salmonella enterica isolates from the feces of free-living Galápagos land and marine iguanas from five sites on four islands using serotyping and genomic fingerprinting. Each site hosted unique and nearly exclusive serovar assemblages. Genomic fingerprint analysis offered a more complex model of S. enterica biogeography, with evidence of both unique strain pools and of spatial population structuring along a geographic gradient. These findings suggest that even relatively generalist enteric bacteria may be strongly dispersal limited in a natural system with strong barriers, such as oceanic divides. Yet, these differing results seen on two typing methods also suggests that genomic variation is less dispersal limited, allowing for different ecological processes to shape biogeographical patterns of the core and flexible portions of this bacterial species' genome. PMID:22615968

  3. Salmonella strains isolated from Galápagos iguanas show spatial structuring of serovar and genomic diversity.

    Directory of Open Access Journals (Sweden)

    Emily W Lankau

    Full Text Available It is thought that dispersal limitation primarily structures host-associated bacterial populations because host distributions inherently limit transmission opportunities. However, enteric bacteria may disperse great distances during food-borne outbreaks. It is unclear if such rapid long-distance dispersal events happen regularly in natural systems or if these events represent an anthropogenic exception. We characterized Salmonella enterica isolates from the feces of free-living Galápagos land and marine iguanas from five sites on four islands using serotyping and genomic fingerprinting. Each site hosted unique and nearly exclusive serovar assemblages. Genomic fingerprint analysis offered a more complex model of S. enterica biogeography, with evidence of both unique strain pools and of spatial population structuring along a geographic gradient. These findings suggest that even relatively generalist enteric bacteria may be strongly dispersal limited in a natural system with strong barriers, such as oceanic divides. Yet, these differing results seen on two typing methods also suggests that genomic variation is less dispersal limited, allowing for different ecological processes to shape biogeographical patterns of the core and flexible portions of this bacterial species' genome.

  4. Genome-wide characterization of centromeric satellites from multiple mammalian genomes.

    Science.gov (United States)

    Alkan, Can; Cardone, Maria Francesca; Catacchio, Claudia Rita; Antonacci, Francesca; O'Brien, Stephen J; Ryder, Oliver A; Purgato, Stefania; Zoli, Monica; Della Valle, Giuliano; Eichler, Evan E; Ventura, Mario

    2011-01-01

    Despite its importance in cell biology and evolution, the centromere has remained the final frontier in genome assembly and annotation due to its complex repeat structure. However, isolation and characterization of the centromeric repeats from newly sequenced species are necessary for a complete understanding of genome evolution and function. In recent years, various genomes have been sequenced, but the characterization of the corresponding centromeric DNA has lagged behind. Here, we present a computational method (RepeatNet) to systematically identify higher-order repeat structures from unassembled whole-genome shotgun sequence and test whether these sequence elements correspond to functional centromeric sequences. We analyzed genome datasets from six species of mammals representing the diversity of the mammalian lineage, namely, horse, dog, elephant, armadillo, opossum, and platypus. We define candidate monomer satellite repeats and demonstrate centromeric localization for five of the six genomes. Our analysis revealed the greatest diversity of centromeric sequences in horse and dog in contrast to elephant and armadillo, which showed high-centromeric sequence homogeneity. We could not isolate centromeric sequences within the platypus genome, suggesting that centromeres in platypus are not enriched in satellite DNA. Our method can be applied to the characterization of thousands of other vertebrate genomes anticipated for sequencing in the near future, providing an important tool for annotation of centromeres.

  5. Structure Elucidation of A New Coumarin Type Compound, Alternamin, Isolated from the Plant (Murraya Alternans (Kurz) Swingle) Having The Antidote Activity on Snake Venoms

    International Nuclear Information System (INIS)

    Hla Myo Min; Mya Aye

    2004-06-01

    The structure of a new coumarin type compound isolated from the entitled species was elucidated by the full spectral analysis consisting of FTIR, 1H NMR, DQF COSY, 13C NMR, DEPT, EIMS (HR-EIMS), HMQC and HMBC. The antidote activities of the fresh juice and the ethanolic extract of the plant, and the isolated compound alternamin were also determined

  6. Draft Genome Sequence of Marine Sponge Symbiont Pseudoalteromonas luteoviolacea IPB1, Isolated from Hilo, Hawaii.

    Science.gov (United States)

    Sakai-Kawada, Francis E; Yakym, Christopher J; Helmkampf, Martin; Hagiwara, Kehau; Ip, Courtney G; Antonio, Brandi J; Armstrong, Ellie; Ulloa, Wesley J; Awaya, Jonathan D

    2016-09-22

    We report here the 6.0-Mb draft genome assembly of Pseudoalteromonas luteoviolacea strain IPB1 that was isolated from the Hawaiian marine sponge Iotrochota protea Genome mining complemented with bioassay studies will elucidate secondary metabolite biosynthetic pathways and will help explain the ecological interaction between host sponge and microorganism. Copyright © 2016 Sakai-Kawada et al.

  7. Genomic Sequence of Saccharomyces cerevisiae BAW-6, a Yeast Strain Optimal for Brewing Barley Shochu.

    Science.gov (United States)

    Kajiwara, Yasuhiro; Mori, Kazuki; Tashiro, Kosuke; Higuchi, Yujiro; Takegawa, Kaoru; Takashita, Hideharu

    2018-04-05

    Here, we report the draft genome sequence of Saccharomyces cerevisiae strain BAW-6, which is used for the production of barley shochu, a traditional Japanese spirit. This genomic information can be used to elucidate the genetic basis underlying the high alcohol production capacity and citric acid tolerance of shochu yeast. Copyright © 2018 Kajiwara et al.

  8. Symbiodinium genomes reveal adaptive evolution of functions related to symbiosis

    KAUST Repository

    Liu, Huanle; Stephens, Timothy G.; Gonzá lez-Pech, Raú l; Beltran, Victor H.; Lapeyre, Bruno; Bongaerts, Pim; Cooke, Ira; Bourne, David G.; Forê t, Sylvain; Miller, David John; van Oppen, Madeleine J. H.; Voolstra, Christian R.; Ragan, Mark A.; Chan, Cheong Xin

    2017-01-01

    Symbiosis between dinoflagellates of the genus Symbiodinium and reef-building corals forms the trophic foundation of the world's coral reef ecosystems. Here we present the first draft genome of Symbiodinium goreaui (Clade C, type C1: 1.03 Gbp), one of the most ubiquitous endosymbionts associated with corals, and an improved draft genome of Symbiodinium kawagutii (Clade F, strain CS-156: 1.05 Gbp), previously sequenced as strain CCMP2468, to further elucidate genomic signatures of this symbiosis. Comparative analysis of four available Symbiodinium genomes against other dinoflagellate genomes led to the identification of 2460 nuclear gene families that show evidence of positive selection, including genes involved in photosynthesis, transmembrane ion transport, synthesis and modification of amino acids and glycoproteins, and stress response. Further, we identified extensive sets of genes for meiosis and response to light stress. These draft genomes provide a foundational resource for advancing our understanding Symbiodinium biology and the coral-algal symbiosis.

  9. Symbiodinium genomes reveal adaptive evolution of functions related to symbiosis

    KAUST Repository

    Liu, Huanle

    2017-10-06

    Symbiosis between dinoflagellates of the genus Symbiodinium and reef-building corals forms the trophic foundation of the world\\'s coral reef ecosystems. Here we present the first draft genome of Symbiodinium goreaui (Clade C, type C1: 1.03 Gbp), one of the most ubiquitous endosymbionts associated with corals, and an improved draft genome of Symbiodinium kawagutii (Clade F, strain CS-156: 1.05 Gbp), previously sequenced as strain CCMP2468, to further elucidate genomic signatures of this symbiosis. Comparative analysis of four available Symbiodinium genomes against other dinoflagellate genomes led to the identification of 2460 nuclear gene families that show evidence of positive selection, including genes involved in photosynthesis, transmembrane ion transport, synthesis and modification of amino acids and glycoproteins, and stress response. Further, we identified extensive sets of genes for meiosis and response to light stress. These draft genomes provide a foundational resource for advancing our understanding Symbiodinium biology and the coral-algal symbiosis.

  10. The First Complete Chloroplast Genome Sequences in Actinidiaceae: Genome Structure and Comparative Analysis.

    Science.gov (United States)

    Yao, Xiaohong; Tang, Ping; Li, Zuozhou; Li, Dawei; Liu, Yifei; Huang, Hongwen

    2015-01-01

    Actinidia chinensis is an important economic plant belonging to the basal lineage of the asterids. Availability of a complete Actinidia chloroplast genome sequence is crucial to understanding phylogenetic relationships among major lineages of angiosperms and facilitates kiwifruit genetic improvement. We report here the complete nucleotide sequences of the chloroplast genomes for Actinidia chinensis and A. chinensis var deliciosa obtained through de novo assembly of Illumina paired-end reads produced by total DNA sequencing. The total genome size ranges from 155,446 to 157,557 bp, with an inverted repeat (IR) of 24,013 to 24,391 bp, a large single copy region (LSC) of 87,984 to 88,337 bp and a small single copy region (SSC) of 20,332 to 20,336 bp. The genome encodes 113 different genes, including 79 unique protein-coding genes, 30 tRNA genes and 4 ribosomal RNA genes, with 16 duplicated in the inverted repeats, and a tRNA gene (trnfM-CAU) duplicated once in the LSC region. Comparisons of IR boundaries among four asterid species showed that IR/LSC borders were extended into the 5' portion of the psbA gene and IR contraction occurred in Actinidia. The clap gene has been lost from the chloroplast genome in Actinidia, and may have been transferred to the nucleus during chloroplast evolution. Twenty-seven polymorphic simple sequence repeat (SSR) loci were identified in the Actinidia chloroplast genome. Maximum parsimony analyses of a 72-gene, 16 taxa angiosperm dataset strongly support the placement of Actinidiaceae in Ericales within the basal asterids.

  11. Cryo-electron Microscopy Reconstruction and Stability Studies of the Wild Type and the R432A Variant of Adeno-associated Virus Type 2 Reveal that Capsid Structural Stability Is a Major Factor in Genome Packaging.

    Science.gov (United States)

    Drouin, Lauren M; Lins, Bridget; Janssen, Maria; Bennett, Antonette; Chipman, Paul; McKenna, Robert; Chen, Weijun; Muzyczka, Nicholas; Cardone, Giovanni; Baker, Timothy S; Agbandje-McKenna, Mavis

    2016-10-01

    The adeno-associated viruses (AAV) are promising therapeutic gene delivery vectors and better understanding of their capsid assembly and genome packaging mechanism is needed for improved vector production. Empty AAV capsids assemble in the nucleus prior to genome packaging by virally encoded Rep proteins. To elucidate the capsid determinants of this process, structural differences between wild-type (wt) AAV2 and a packaging deficient variant, AAV2-R432A, were examined using cryo-electron microscopy and three-dimensional image reconstruction both at an ∼5.0-Å resolution (medium) and also at 3.8- and 3.7-Å resolutions (high), respectively. The high resolution structures showed that removal of the arginine side chain in AAV2-R432A eliminated hydrogen bonding interactions, resulting in altered intramolecular and intermolecular interactions propagated from under the 3-fold axis toward the 5-fold channel. Consistent with these observations, differential scanning calorimetry showed an ∼10°C decrease in thermal stability for AAV2-R432A compared to wt-AAV2. In addition, the medium resolution structures revealed differences in the juxtaposition of the less ordered, N-terminal region of their capsid proteins, VP1/2/3. A structural rearrangement in AAV2-R432A repositioned the βA strand region under the icosahedral 2-fold axis rather than antiparallel to the βB strand, eliminating many intramolecular interactions. Thus, a single amino acid substitution can significantly alter the AAV capsid integrity to the extent of reducing its stability and possibly rendering it unable to tolerate the stress of genome packaging. Furthermore, the data show that the 2-, 3-, and 5-fold regions of the capsid contributed to producing the packaging defect and highlight a tight connection between the entire capsid in maintaining packaging efficiency. The mechanism of AAV genome packaging is still poorly understood, particularly with respect to the capsid determinants of the required capsid

  12. The structures of bovine herpesvirus 1 virion and concatemeric DNA: implications for cleavage and packaging of herpesvirus genomes

    International Nuclear Information System (INIS)

    Schynts, Frederic; McVoy, Michael A.; Meurens, Francois; Detry, Bruno; Epstein, Alberto L.; Thiry, Etienne

    2003-01-01

    Herpesvirus genomes are often characterized by the presence of direct and inverted repeats that delineate their grouping into six structural classes. Class D genomes consist of a long (L) segment and a short (S) segment. The latter is flanked by large inverted repeats. DNA replication produces concatemers of head-to-tail linked genomes that are cleaved into unit genomes during the process of packaging DNA into capsids. Packaged class D genomes are an equimolar mixture of two isomers in which S is in either of two orientations, presumably a consequence of homologous recombination between the inverted repeats. The L segment remains predominantly fixed in a prototype (P) orientation; however, low levels of genomes having inverted L (I L ) segments have been reported for some class D herpesviruses. Inefficient formation of class D I L genomes has been attributed to infrequent L segment inversion, but recent detection of frequent inverted L segments in equine herpesvirus 1 concatemers [Virology 229 (1997) 415-420] suggests that the defect may be at the level of cleavage and packaging rather than inversion. In this study, the structures of virion and concatemeric DNA of another class D herpesvirus, bovine herpesvirus 1, were determined. Virion DNA contained low levels of I L genomes, whereas concatemeric DNA contained significant amounts of L segments in both P and I L orientations. However, concatemeric termini exhibited a preponderance of L termini derived from P isomers which was comparable to the preponderance of P genomes found in virion DNA. Thus, the defect in formation of I L genomes appears to lie at the level of concatemer cleavage. These results have important implications for the mechanisms by which herpesvirus DNA cleavage and packaging occur

  13. Biochemical research elucidating metabolic pathways in Pneumocystis*

    Directory of Open Access Journals (Sweden)

    Kaneshiro E.S.

    2010-12-01

    Full Text Available Advances in sequencing the Pneumocystis carinii genome have helped identify potential metabolic pathways operative in the organism. Also, data from characterizing the biochemical and physiological nature of these organisms now allow elucidation of metabolic pathways as well as pose new challenges and questions that require additional experiments. These experiments are being performed despite the difficulty in doing experiments directly on this pathogen that has yet to be subcultured indefinitely and produce mass numbers of cells in vitro. This article reviews biochemical approaches that have provided insights into several Pneumocystis metabolic pathways. It focuses on 1 S-adenosyl-L-methionine (AdoMet; SAM, which is a ubiquitous participant in numerous cellular reactions; 2 sterols: focusing on oxidosqualene cyclase that forms lanosterol in P. carinii; SAM:sterol C-24 methyltransferase that adds methyl groups at the C-24 position of the sterol side chain; and sterol 14α-demethylase that removes a methyl group at the C-14 position of the sterol nucleus; and 3 synthesis of ubiquinone homologs, which play a pivotal role in mitochondrial inner membrane and other cellular membrane electron transport.

  14. Structural basis of genomic RNA (gRNA) dimerization and packaging determinants of mouse mammary tumor virus (MMTV).

    Science.gov (United States)

    Aktar, Suriya J; Vivet-Boudou, Valérie; Ali, Lizna M; Jabeen, Ayesha; Kalloush, Rawan M; Richer, Delphine; Mustafa, Farah; Marquet, Roland; Rizvi, Tahir A

    2014-11-14

    One of the hallmarks of retroviral life cycle is the efficient and specific packaging of two copies of retroviral gRNA in the form of a non-covalent RNA dimer by the assembling virions. It is becoming increasingly clear that the process of dimerization is closely linked with gRNA packaging, and in some retroviruses, the latter depends on the former. Earlier mutational analysis of the 5' end of the MMTV genome indicated that MMTV gRNA packaging determinants comprise sequences both within the 5' untranslated region (5' UTR) and the beginning of gag. The RNA secondary structure of MMTV gRNA packaging sequences was elucidated employing selective 2'hydroxyl acylation analyzed by primer extension (SHAPE). SHAPE analyses revealed the presence of a U5/Gag long-range interaction (U5/Gag LRI), not predicted by minimum free-energy structure predictions that potentially stabilizes the global structure of this region. Structure conservation along with base-pair covariations between different strains of MMTV further supported the SHAPE-validated model. The 5' region of the MMTV gRNA contains multiple palindromic (pal) sequences that could initiate intermolecular interaction during RNA dimerization. In vitro RNA dimerization, SHAPE analysis, and structure prediction approaches on a series of pal mutants revealed that MMTV RNA utilizes a palindromic point of contact to initiate intermolecular interactions between two gRNAs, leading to dimerization. This contact point resides within pal II (5' CGGCCG 3') at the 5' UTR and contains a canonical "GC" dyad and therefore likely constitutes the MMTV RNA dimerization initiation site (DIS). Further analyses of these pal mutants employing in vivo genetic approaches indicate that pal II, as well as pal sequences located in the primer binding site (PBS) are both required for efficient MMTV gRNA packaging. Employing structural prediction, biochemical, and genetic approaches, we show that pal II functions as a primary point of contact between

  15. SCHEMA computational design of virus capsid chimeras: calibrating how genome packaging, protection, and transduction correlate with calculated structural disruption.

    Science.gov (United States)

    Ho, Michelle L; Adler, Benjamin A; Torre, Michael L; Silberg, Jonathan J; Suh, Junghae

    2013-12-20

    Adeno-associated virus (AAV) recombination can result in chimeric capsid protein subunits whose ability to assemble into an oligomeric capsid, package a genome, and transduce cells depends on the inheritance of sequence from different AAV parents. To develop quantitative design principles for guiding site-directed recombination of AAV capsids, we have examined how capsid structural perturbations predicted by the SCHEMA algorithm correlate with experimental measurements of disruption in seventeen chimeric capsid proteins. In our small chimera population, created by recombining AAV serotypes 2 and 4, we found that protection of viral genomes and cellular transduction were inversely related to calculated disruption of the capsid structure. Interestingly, however, we did not observe a correlation between genome packaging and calculated structural disruption; a majority of the chimeric capsid proteins formed at least partially assembled capsids and more than half packaged genomes, including those with the highest SCHEMA disruption. These results suggest that the sequence space accessed by recombination of divergent AAV serotypes is rich in capsid chimeras that assemble into 60-mer capsids and package viral genomes. Overall, the SCHEMA algorithm may be useful for delineating quantitative design principles to guide the creation of libraries enriched in genome-protecting virus nanoparticles that can effectively transduce cells. Such improvements to the virus design process may help advance not only gene therapy applications but also other bionanotechnologies dependent upon the development of viruses with new sequences and functions.

  16. Discovery, genotyping and characterization of structural variation and novel sequence at single nucleotide resolution from de novo genome assemblies on a population scale

    DEFF Research Database (Denmark)

    Liu, Siyang; Huang, Shujia; Rao, Junhua

    2015-01-01

    present a novel approach implemented in a single software package, AsmVar, to discover, genotype and characterize different forms of structural variation and novel sequence from population-scale de novo genome assemblies up to nucleotide resolution. Application of AsmVar to several human de novo genome......) as well as large deletions. However, these approaches consistently display a substantial bias against the recovery of complex structural variants and novel sequence in individual genomes and do not provide interpretation information such as the annotation of ancestral state and formation mechanism. We...... assemblies captures a wide spectrum of structural variants and novel sequences present in the human population in high sensitivity and specificity. Our method provides a direct solution for investigating structural variants and novel sequences from de novo genome assemblies, facilitating the construction...

  17. Structural Elucidation of Z- and E- Isomers of 5-Alkyl-4-ethoxycarbonyl-5-(4`-chlorophenyl-3-oxa-4-pentenoic Acids

    Directory of Open Access Journals (Sweden)

    H. M. F. Madkour

    2000-05-01

    Full Text Available Z- and E-isomers of 5-alkyl-4-ethoxycarbonyl-5-(4`-chlorophenyl-3-oxa-4-pentenoic acids were prepared via the condensation of p-chloroacetophenone and/or pchloropropiophenone with diethyl-2,2`-oxydiacetate in the presence of sodium hydride as a basic catalyst. The Z-isomers of 2a and 2b were found to be predominant. The behaviour of the corresponding anhydrides towards the action of hydrazine, phenylhydrazine, primary aromatic amines, hydrocarbons and ethanolysis has also been investigated. The structures and configurations of the products have been elucidated by chemical and spectroscopic means.

  18. Advanced Whole-Genome Sequencing and Analysis of Fetal Genomes from Amniotic Fluid.

    Science.gov (United States)

    Mao, Qing; Chin, Robert; Xie, Weiwei; Deng, Yuqing; Zhang, Wenwei; Xu, Huixin; Zhang, Rebecca Yu; Shi, Quan; Peters, Erin E; Gulbahce, Natali; Li, Zhenyu; Chen, Fang; Drmanac, Radoje; Peters, Brock A

    2018-04-01

    Amniocentesis is a common procedure, the primary purpose of which is to collect cells from the fetus to allow testing for abnormal chromosomes, altered chromosomal copy number, or a small number of genes that have small single- to multibase defects. Here we demonstrate the feasibility of generating an accurate whole-genome sequence of a fetus from either the cellular or cell-free DNA (cfDNA) of an amniotic sample. cfDNA and DNA isolated from the cell pellet of 31 amniocenteses were sequenced to approximately 50× genome coverage by use of the Complete Genomics nanoarray platform. In a subset of the samples, long fragment read libraries were generated from DNA isolated from cells and sequenced to approximately 100× genome coverage. Concordance of variant calls between the 2 DNA sources and with parental libraries was >96%. Two fetal genomes were found to harbor potentially detrimental variants in chromodomain helicase DNA binding protein 8 ( CHD8 ) and LDL receptor-related protein 1 ( LRP1 ), variations of which have been associated with autism spectrum disorder and keratosis pilaris atrophicans, respectively. We also discovered drug sensitivities and carrier information of fetuses for a variety of diseases. We were able to elucidate the complete genome sequence of 31 fetuses from amniotic fluid and demonstrate that the cfDNA or DNA from the cell pellet can be analyzed with little difference in quality. We believe that current technologies could analyze this material in a highly accurate and complete manner and that analyses like these should be considered for addition to current amniocentesis procedures. © 2018 American Association for Clinical Chemistry.

  19. Unusual genome complexity in Lactobacillus salivarius JCM1046.

    Science.gov (United States)

    Raftis, Emma J; Forde, Brian M; Claesson, Marcus J; O'Toole, Paul W

    2014-09-08

    Lactobacillus salivarius strains are increasingly being exploited for their probiotic properties in humans and animals. Dissemination of antibiotic resistance genes among species with food or probiotic-association is undesirable and is often mediated by plasmids or integrative and conjugative elements. L. salivarius strains typically have multireplicon genomes including circular megaplasmids that encode strain-specific traits for intestinal survival and probiotic activity. Linear plasmids are less common in lactobacilli and show a very limited distribution in L. salivarius. Here we present experimental evidence that supports an unusually complex multireplicon genome structure in the porcine isolate L. salivarius JCM1046. JCM1046 harbours a 1.83 Mb chromosome, and four plasmids which constitute 20% of the genome. In addition to the known 219 kb repA-type megaplasmid pMP1046A, we identified and experimentally validated the topology of three additional replicons, the circular pMP1046B (129 kb), a linear plasmid pLMP1046 (101 kb) and pCTN1046 (33 kb) harbouring a conjugative transposon. pMP1046B harbours both plasmid-associated replication genes and paralogues of chromosomally encoded housekeeping and information-processing related genes, thus qualifying it as a putative chromid. pLMP1046 shares limited sequence homology or gene synteny with other L. salivarius plasmids, and its putative replication-associated protein is homologous to the RepA/E proteins found in the large circular megaplasmids of L. salivarius. Plasmid pCTN1046 harbours a single copy of an integrated conjugative transposon (Tn6224) which appears to be functionally intact and includes the tetracycline resistance gene tetM. Experimental validation of sequence assemblies and plasmid topology resolved the complex genome architecture of L. salivarius JCM1046. A high-coverage draft genome sequence would not have elucidated the genome complexity in this strain. Given the expanding use of L. salivarius

  20. Genetic diversity and structure of elite cotton germplasm (Gossypium hirsutum L.) using genome-wide SNP data.

    Science.gov (United States)

    Ai, XianTao; Liang, YaJun; Wang, JunDuo; Zheng, JuYun; Gong, ZhaoLong; Guo, JiangPing; Li, XueYuan; Qu, YanYing

    2017-10-01

    Cotton (Gossypium spp.) is the most important natural textile fiber crop, and Gossypium hirsutum L. is responsible for 90% of the annual cotton crop in the world. Information on cotton genetic diversity and population structure is essential for new breeding lines. In this study, we analyzed population structure and genetic diversity of 288 elite Gossypium hirsutum cultivar accessions collected from around the world, and especially from China, using genome-wide single nucleotide polymorphisms (SNP) markers. The average polymorphsim information content (PIC) was 0.25, indicating a relatively low degree of genetic diversity. Population structure analysis revealed extensive admixture and identified three subgroups. Phylogenetic analysis supported the subgroups identified by STRUCTURE. The results from both population structure and phylogenetic analysis were, for the most part, in agreement with pedigree information. Analysis of molecular variance revealed a larger amount of variation was due to diversity within the groups. Establishment of genetic diversity and population structure from this study could be useful for genetic and genomic analysis and systematic utilization of the standing genetic variation in upland cotton.

  1. Draft genome sequence of Sclerospora graminicola, the pearl millet downy mildew pathogen

    Directory of Open Access Journals (Sweden)

    S. Chandra Nayaka

    2017-12-01

    Full Text Available Sclerospora graminicola pathogen is the most important biotic production constraints of pearl millet in India, Africa and other parts of the world. We report a de novo whole genome assembly and analysis of pathotype 1, one of the most virulent pathotypes of S. graminicola from India. The draft genome assembly contained 299,901,251 bp with 65,404 genes. This study may help understand the evolutionary pattern of pathogen and aid elucidation of effector evolution for devising effective durable resistance breeding strategies in pearl millet. Keywords: Sclerospora graminicola, Pathotype 1, Pearl millet, Downy mildew, Whole genome sequence

  2. Genome Structural Diversity among 31 Bordetella pertussis Isolates from Two Recent U.S. Whooping Cough Statewide Epidemics.

    Science.gov (United States)

    Bowden, Katherine E; Weigand, Michael R; Peng, Yanhui; Cassiday, Pamela K; Sammons, Scott; Knipe, Kristen; Rowe, Lori A; Loparev, Vladimir; Sheth, Mili; Weening, Keeley; Tondella, M Lucia; Williams, Margaret M

    2016-01-01

    During 2010 and 2012, California and Vermont, respectively, experienced statewide epidemics of pertussis with differences seen in the demographic affected, case clinical presentation, and molecular epidemiology of the circulating strains. To overcome limitations of the current molecular typing methods for pertussis, we utilized whole-genome sequencing to gain a broader understanding of how current circulating strains are causing large epidemics. Through the use of combined next-generation sequencing technologies, this study compared de novo, single-contig genome assemblies from 31 out of 33 Bordetella pertussis isolates collected during two separate pertussis statewide epidemics and 2 resequenced vaccine strains. Final genome architecture assemblies were verified with whole-genome optical mapping. Sixteen distinct genome rearrangement profiles were observed in epidemic isolate genomes, all of which were distinct from the genome structures of the two resequenced vaccine strains. These rearrangements appear to be mediated by repetitive sequence elements, such as high-copy-number mobile genetic elements and rRNA operons. Additionally, novel and previously identified single nucleotide polymorphisms were detected in 10 virulence-related genes in the epidemic isolates. Whole-genome variation analysis identified state-specific variants, and coding regions bearing nonsynonymous mutations were classified into functional annotated orthologous groups. Comprehensive studies on whole genomes are needed to understand the resurgence of pertussis and develop novel tools to better characterize the molecular epidemiology of evolving B. pertussis populations. IMPORTANCE Pertussis, or whooping cough, is the most poorly controlled vaccine-preventable bacterial disease in the United States, which has experienced a resurgence for more than a decade. Once viewed as a monomorphic pathogen, B. pertussis strains circulating during epidemics exhibit diversity visible on a genome structural

  3. Multiscale modeling of three-dimensional genome

    Science.gov (United States)

    Zhang, Bin; Wolynes, Peter

    The genome, the blueprint of life, contains nearly all the information needed to build and maintain an entire organism. A comprehensive understanding of the genome is of paramount interest to human health and will advance progress in many areas, including life sciences, medicine, and biotechnology. The overarching goal of my research is to understand the structure-dynamics-function relationships of the human genome. In this talk, I will be presenting our efforts in moving towards that goal, with a particular emphasis on studying the three-dimensional organization, the structure of the genome with multi-scale approaches. Specifically, I will discuss the reconstruction of genome structures at both interphase and metaphase by making use of data from chromosome conformation capture experiments. Computationally modeling of chromatin fiber at atomistic level from first principles will also be presented as our effort for studying the genome structure from bottom up.

  4. Syntheses, structural elucidation, thermal properties, theoretical quantum chemical studies (DFT and biological studies of barbituric–hydrazone complexes

    Directory of Open Access Journals (Sweden)

    Amina A. Soayed

    2015-03-01

    Full Text Available Condensation of barbituric acid with hydrazine hydrate yielded barbiturichydrazone (L which was characterized using IR, 1H NMR and mass spectra. The Co(II, Ni(II and Cu(II complexes derived from this ligand have been synthesized and structurally characterized by elemental analyses, spectroscopic methods (IR, UV–Vis and ESR and thermal analyses (TGA, DTG and DTA and the structures were further elucidated using quantum chemical density functional theory. Complexes of L were found to have the ML.nH2O stoichiometry with either tetrahedral or octahedral geometry. The ESR data showed the Cu(II complex to be in a tetragonal geometry. Theoretical investigation of the electronic structure of metal complexes at the TD-DFT/B3LYP level of theory has been carried out and discussed. The fundamental vibrational wavenumbers were calculated and a good agreement between observed and scaled calculated wavenumbers was achieved. Thermal studies were performed to deduce the stabilities of the ligand and complexes. Thermodynamic parameters, such as the order of reactions (n, activation energy ΔE∗, enthalpy of reaction ΔH∗ and entropy ΔS∗ were calculated from DTA curves using Horowitz–Metzger method. The ligand L and its complexes have been screened for their antifungal and antibacterial activities and were found to possess better biological activities compared to those of unsubstituted barbituric acid complexes.

  5. Genome-wide analysis of Dongxiang wild rice (Oryza rufipogon Griff.) to investigate lost/acquired genes during rice domestication.

    Science.gov (United States)

    Zhang, Fantao; Xu, Tao; Mao, Linyong; Yan, Shuangyong; Chen, Xiwen; Wu, Zhenfeng; Chen, Rui; Luo, Xiangdong; Xie, Jiankun; Gao, Shan

    2016-04-26

    It is widely accepted that cultivated rice (Oryza sativa L.) was domesticated from common wild rice (Oryza rufipogon Griff.). Compared to other studies which concentrate on rice origin, this study is to genetically elucidate the substantially phenotypic and physiological changes from wild rice to cultivated rice at the whole genome level. Instead of comparing two assembled genomes, this study directly compared the Dongxiang wild rice (DXWR) Illumina sequencing reads with the Nipponbare (O. sativa) complete genome without assembly of the DXWR genome. Based on the results from the comparative genomics analysis, structural variations (SVs) between DXWR and Nipponbare were determined to locate deleted genes which could have been acquired by Nipponbare during rice domestication. To overcome the limit of the SV detection, the DXWR transcriptome was also sequenced and compared with the Nipponbare transcriptome to discover the genes which could have been lost in DXWR during domestication. Both 1591 Nipponbare-acquired genes and 206 DXWR-lost transcripts were further analyzed using annotations from multiple sources. The NGS data are available in the NCBI SRA database with ID SRP070627. These results help better understanding the domestication from wild rice to cultivated rice at the whole genome level and provide a genomic data resource for rice genetic research or breeding. One finding confirmed transposable elements contribute greatly to the genome evolution from wild rice to cultivated rice. Another finding suggested the photophosphorylation and oxidative phosphorylation system in cultivated rice could have adapted to environmental changes simultaneously during domestication.

  6. Training set optimization under population structure in genomic selection.

    Science.gov (United States)

    Isidro, Julio; Jannink, Jean-Luc; Akdemir, Deniz; Poland, Jesse; Heslot, Nicolas; Sorrells, Mark E

    2015-01-01

    Population structure must be evaluated before optimization of the training set population. Maximizing the phenotypic variance captured by the training set is important for optimal performance. The optimization of the training set (TRS) in genomic selection has received much interest in both animal and plant breeding, because it is critical to the accuracy of the prediction models. In this study, five different TRS sampling algorithms, stratified sampling, mean of the coefficient of determination (CDmean), mean of predictor error variance (PEVmean), stratified CDmean (StratCDmean) and random sampling, were evaluated for prediction accuracy in the presence of different levels of population structure. In the presence of population structure, the most phenotypic variation captured by a sampling method in the TRS is desirable. The wheat dataset showed mild population structure, and CDmean and stratified CDmean methods showed the highest accuracies for all the traits except for test weight and heading date. The rice dataset had strong population structure and the approach based on stratified sampling showed the highest accuracies for all traits. In general, CDmean minimized the relationship between genotypes in the TRS, maximizing the relationship between TRS and the test set. This makes it suitable as an optimization criterion for long-term selection. Our results indicated that the best selection criterion used to optimize the TRS seems to depend on the interaction of trait architecture and population structure.

  7. Unraveling Mycobacterium tuberculosis genomic diversity and evolution in Lisbon, Portugal, a highly drug resistant setting

    KAUST Repository

    Perdigão, João

    2014-11-18

    Background Multidrug- (MDR) and extensively drug resistant (XDR) tuberculosis (TB) presents a challenge to disease control and elimination goals. In Lisbon, Portugal, specific and successful XDR-TB strains have been found in circulation for almost two decades. Results In the present study we have genotyped and sequenced the genomes of 56 Mycobacterium tuberculosis isolates recovered mostly from Lisbon. The genotyping data revealed three major clusters associated with MDR-TB, two of which are associated with XDR-TB. Whilst the genomic data contributed to elucidate the phylogenetic positioning of circulating MDR-TB strains, showing a high predominance of a single SNP cluster group 5. Furthermore, a genome-wide phylogeny analysis from these strains, together with 19 publicly available genomes of Mycobacterium tuberculosis clinical isolates, revealed two major clades responsible for M/XDR-TB in the region: Lisboa3 and Q1 (LAM). The data presented by this study yielded insights on microevolution and identification of novel compensatory mutations associated with rifampicin resistance in rpoB and rpoC. The screening for other structural variations revealed putative clade-defining variants. One deletion in PPE41, found among Lisboa3 isolates, is proposed to contribute to immune evasion and as a selective advantage. Insertion sequence (IS) mapping has also demonstrated the role of IS6110 as a major driver in mycobacterial evolution by affecting gene integrity and regulation. Conclusions Globally, this study contributes with novel genome-wide phylogenetic data and has led to the identification of new genomic variants that support the notion of a growing genomic diversity facing both setting and host adaptation.

  8. Genome packaging in viruses

    OpenAIRE

    Sun, Siyang; Rao, Venigalla B.; Rossmann, Michael G.

    2010-01-01

    Genome packaging is a fundamental process in a viral life cycle. Many viruses assemble preformed capsids into which the genomic material is subsequently packaged. These viruses use a packaging motor protein that is driven by the hydrolysis of ATP to condense the nucleic acids into a confined space. How these motor proteins package viral genomes had been poorly understood until recently, when a few X-ray crystal structures and cryo-electron microscopy structures became available. Here we discu...

  9. Chemical and radiation mutagenesis: Induction and detection by whole genome sequencing

    Science.gov (United States)

    Brachypodium distachyon has emerged as an effective model system to address fundamental questions in grass biology. With its small sequenced genome, short generation time and rapidly expanding array of genetic tools B. distachyon is an ideal system to elucidate the molecular basis of important trai...

  10. Draft Genome Sequence of Lactobacillus delbrueckii subsp. bulgaricus LBB.B5

    NARCIS (Netherlands)

    Urshev, Z.; Hajo, K.; Lenoci, L.; Bron, P.A.; Dijkstra, A.; Alkema, W.; Wels, M.; Siezen, R.J.; Minkova, S.; Hijum, S.A. van

    2016-01-01

    Lactobacillus delbrueckii subsp. bulgaricus LBB.B5 originates from homemade Bulgarian yogurt and was selected for its ability to form a strong association with Streptococcus thermophilus The genome sequence will facilitate elucidating the genetic background behind the contribution of LBB.B5 to the

  11. Identification of copy number variants defining genomic differences among major human groups.

    Directory of Open Access Journals (Sweden)

    Lluís Armengol

    Full Text Available BACKGROUND: Understanding the genetic contribution to phenotype variation of human groups is necessary to elucidate differences in disease predisposition and response to pharmaceutical treatments in different human populations. METHODOLOGY/PRINCIPAL FINDINGS: We have investigated the genome-wide profile of structural variation on pooled samples from the three populations studied in the HapMap project by comparative genome hybridization (CGH in different array platforms. We have identified and experimentally validated 33 genomic loci that show significant copy number differences from one population to the other. Interestingly, we found an enrichment of genes related to environment adaptation (immune response, lipid metabolism and extracellular space within these regions and the study of expression data revealed that more than half of the copy number variants (CNVs translate into gene-expression differences among populations, suggesting that they could have functional consequences. In addition, the identification of single nucleotide polymorphisms (SNPs that are in linkage disequilibrium with the copy number alleles allowed us to detect evidences of population differentiation and recent selection at the nucleotide variation level. CONCLUSIONS: Overall, our results provide a comprehensive view of relevant copy number changes that might play a role in phenotypic differences among major human populations, and generate a list of interesting candidates for future studies.

  12. Discovery of the leinamycin family of natural products by mining actinobacterial genomes.

    Science.gov (United States)

    Pan, Guohui; Xu, Zhengren; Guo, Zhikai; Hindra; Ma, Ming; Yang, Dong; Zhou, Hao; Gansemans, Yannick; Zhu, Xiangcheng; Huang, Yong; Zhao, Li-Xing; Jiang, Yi; Cheng, Jinhua; Van Nieuwerburgh, Filip; Suh, Joo-Won; Duan, Yanwen; Shen, Ben

    2017-12-26

    Nature's ability to generate diverse natural products from simple building blocks has inspired combinatorial biosynthesis. The knowledge-based approach to combinatorial biosynthesis has allowed the production of designer analogs by rational metabolic pathway engineering. While successful, structural alterations are limited, with designer analogs often produced in compromised titers. The discovery-based approach to combinatorial biosynthesis complements the knowledge-based approach by exploring the vast combinatorial biosynthesis repertoire found in Nature. Here we showcase the discovery-based approach to combinatorial biosynthesis by targeting the domain of unknown function and cysteine lyase domain (DUF-SH) didomain, specific for sulfur incorporation from the leinamycin (LNM) biosynthetic machinery, to discover the LNM family of natural products. By mining bacterial genomes from public databases and the actinomycetes strain collection at The Scripps Research Institute, we discovered 49 potential producers that could be grouped into 18 distinct clades based on phylogenetic analysis of the DUF-SH didomains. Further analysis of the representative genomes from each of the clades identified 28 lnm -type gene clusters. Structural diversities encoded by the LNM-type biosynthetic machineries were predicted based on bioinformatics and confirmed by in vitro characterization of selected adenylation proteins and isolation and structural elucidation of the guangnanmycins and weishanmycins. These findings demonstrate the power of the discovery-based approach to combinatorial biosynthesis for natural product discovery and structural diversity and highlight Nature's rich biosynthetic repertoire. Comparative analysis of the LNM-type biosynthetic machineries provides outstanding opportunities to dissect Nature's biosynthetic strategies and apply these findings to combinatorial biosynthesis for natural product discovery and structural diversity.

  13. Comparative Genomics of Non-TNL Disease Resistance Genes from Six Plant Species.

    Science.gov (United States)

    Nepal, Madhav P; Andersen, Ethan J; Neupane, Surendra; Benson, Benjamin V

    2017-09-30

    Disease resistance genes (R genes), as part of the plant defense system, have coevolved with corresponding pathogen molecules. The main objectives of this project were to identify non-Toll interleukin receptor, nucleotide-binding site, leucine-rich repeat (nTNL) genes and elucidate their evolutionary divergence across six plant genomes. Using reference sequences from Arabidopsis , we investigated nTNL orthologs in the genomes of common bean, Medicago , soybean, poplar, and rice. We used Hidden Markov Models for sequence identification, performed model-based phylogenetic analyses, visualized chromosomal positioning, inferred gene clustering, and assessed gene expression profiles. We analyzed 908 nTNL R genes in the genomes of the six plant species, and classified them into 12 subgroups based on the presence of coiled-coil (CC), nucleotide binding site (NBS), leucine rich repeat (LRR), resistance to Powdery mildew 8 (RPW8), and BED type zinc finger domains. Traditionally classified CC-NBS-LRR (CNL) genes were nested into four clades (CNL A-D) often with abundant, well-supported homogeneous subclades of Type-II R genes. CNL-D members were absent in rice, indicating a unique R gene retention pattern in the rice genome. Genomes from Arabidopsis , common bean, poplar and soybean had one chromosome without any CNL R genes. Medicago and Arabidopsis had the highest and lowest number of gene clusters, respectively. Gene expression analyses suggested unique patterns of expression for each of the CNL clades. Differential gene expression patterns of the nTNL genes were often found to correlate with number of introns and GC content, suggesting structural and functional divergence.

  14. Genome and Proteome Analysis of Rhodococcus erythropolis MI2: Elucidation of the 4,4´-Dithiodibutyric Acid Catabolism

    Science.gov (United States)

    Khairy, Heba; Meinert, Christina; Wübbeler, Jan Hendrik; Poehlein, Anja; Daniel, Rolf; Voigt, Birgit; Riedel, Katharina; Steinbüchel, Alexander

    2016-01-01

    Rhodococcus erythropolis MI2 has the extraordinary ability to utilize the xenobiotic 4,4´-dithiodibutyric acid (DTDB). Cleavage of DTDB by the disulfide-reductase Nox, which is the only verified enzyme involved in DTDB-degradation, raised 4-mercaptobutyric acid (4MB). 4MB could act as building block of a novel polythioester with unknown properties. To completely unravel the catabolism of DTDB, the genome of R. erythropolis MI2 was sequenced, and subsequently the proteome was analyzed. The draft genome sequence consists of approximately 7.2 Mbp with an overall G+C content of 62.25% and 6,859 predicted protein-encoding genes. The genome of strain MI2 is composed of three replicons: one chromosome and two megaplasmids with sizes of 6.45, 0.4 and 0.35 Mbp, respectively. When cells of strain MI2 were cultivated with DTDB as sole carbon source and compared to cells grown with succinate, several interesting proteins with significantly higher expression levels were identified using 2D-PAGE and MALDI-TOF mass spectrometry. A putative luciferase-like monooxygenase-class F420-dependent oxidoreductase (RERY_05640), which is encoded by one of the 126 monooxygenase-encoding genes of the MI2-genome, showed a 3-fold increased expression level. This monooxygenase could oxidize the intermediate 4MB into 4-oxo-4-sulfanylbutyric acid. Next, a desulfurization step, which forms succinic acid and volatile hydrogen sulfide, is proposed. One gene coding for a putative desulfhydrase (RERY_06500) was identified in the genome of strain MI2. However, the gene product was not recognized in the proteome analyses. But, a significant expression level with a ratio of up to 7.3 was determined for a putative sulfide:quinone oxidoreductase (RERY_02710), which could also be involved in the abstraction of the sulfur group. As response to the toxicity of the intermediates, several stress response proteins were strongly expressed, including a superoxide dismutase (RERY_05600) and an osmotically induced

  15. Genome and Proteome Analysis of Rhodococcus erythropolis MI2: Elucidation of the 4,4´-Dithiodibutyric Acid Catabolism.

    Directory of Open Access Journals (Sweden)

    Heba Khairy

    Full Text Available Rhodococcus erythropolis MI2 has the extraordinary ability to utilize the xenobiotic 4,4´-dithiodibutyric acid (DTDB. Cleavage of DTDB by the disulfide-reductase Nox, which is the only verified enzyme involved in DTDB-degradation, raised 4-mercaptobutyric acid (4MB. 4MB could act as building block of a novel polythioester with unknown properties. To completely unravel the catabolism of DTDB, the genome of R. erythropolis MI2 was sequenced, and subsequently the proteome was analyzed. The draft genome sequence consists of approximately 7.2 Mbp with an overall G+C content of 62.25% and 6,859 predicted protein-encoding genes. The genome of strain MI2 is composed of three replicons: one chromosome and two megaplasmids with sizes of 6.45, 0.4 and 0.35 Mbp, respectively. When cells of strain MI2 were cultivated with DTDB as sole carbon source and compared to cells grown with succinate, several interesting proteins with significantly higher expression levels were identified using 2D-PAGE and MALDI-TOF mass spectrometry. A putative luciferase-like monooxygenase-class F420-dependent oxidoreductase (RERY_05640, which is encoded by one of the 126 monooxygenase-encoding genes of the MI2-genome, showed a 3-fold increased expression level. This monooxygenase could oxidize the intermediate 4MB into 4-oxo-4-sulfanylbutyric acid. Next, a desulfurization step, which forms succinic acid and volatile hydrogen sulfide, is proposed. One gene coding for a putative desulfhydrase (RERY_06500 was identified in the genome of strain MI2. However, the gene product was not recognized in the proteome analyses. But, a significant expression level with a ratio of up to 7.3 was determined for a putative sulfide:quinone oxidoreductase (RERY_02710, which could also be involved in the abstraction of the sulfur group. As response to the toxicity of the intermediates, several stress response proteins were strongly expressed, including a superoxide dismutase (RERY_05600 and an

  16. Discovery of new enzymes and metabolic pathways using structure and genome context

    Science.gov (United States)

    Zhao, Suwen; Kumar, Ritesh; Sakai, Ayano; Vetting, Matthew W.; Wood, B. McKay; Brown, Shoshana; Bonanno, Jeffery B.; Hillerich, Brandan S.; Seidel, Ronald D.; Babbitt, Patricia C.; Almo, Steven C.; Sweedler, Jonathan V.; Gerlt, John A.; Cronan, John E.; Jacobson, Matthew P.

    2014-01-01

    Assigning valid functions to proteins identified in genome projects is challenging, with over-prediction and database annotation errors major concerns1. We, and others2, are developing computation-guided strategies for functional discovery using “metabolite docking” to experimentally derived3 or homology-based4 three-dimensional structures. Bacterial metabolic pathways often are encoded by “genome neighborhoods” (gene clusters and/or operons), which can provide important clues for functional assignment. We recently demonstrated the synergy of docking and pathway context by “predicting” the intermediates in the glycolytic pathway in E. coli5. Metabolite docking to multiple binding proteins/enzymes in the same pathway increases the reliability of in silico predictions of substrate specificities because the pathway intermediates are structurally similar. We report that structure-guided approaches for predicting the substrate specificities of several enzymes encoded by a bacterial gene cluster allowed i) the correct prediction of the in vitro activity of a structurally characterized enzyme of unknown function (PDB 2PMQ), 2-epimerization of trans-4-hydroxy-L-proline betaine (tHyp-B) and cis-4-hydroxy-D-proline betaine (cHyp-B), and ii) the correct identification of the catabolic pathway in which Hyp-B 2-epimerase participates. The substrate-liganded pose predicted by virtual library screening (docking) was confirmed experimentally. The enzymatic activities in the predicted pathway were confirmed by in vitro assays and genetic analyses; the intermediates were identified by metabolomics; and repression of the genes encoding the pathway by high salt was established by transcriptomics, confirming the osmolyte role of tHyp-B. This study establishes the utility of structure-guide functional predictions to enable the discovery of new metabolic pathways. PMID:24056934

  17. Reference-quality genome sequence of Aegilops tauschii, the source of wheat D genome, shows that recombination shapes genome structure and evolution

    Science.gov (United States)

    Aegilops tauschii is the diploid progenitor of the D genome of hexaploid wheat and an important genetic resource for wheat. A reference-quality sequence for the Ae. tauschii genome was produced with a combination of ordered-clone sequencing, whole-genome shotgun sequencing, and BioNano optical geno...

  18. Transcriptional Regulation During Zygotic Genome Activation in Zebrafish and Other Anamniote Embryos.

    Science.gov (United States)

    Wragg, J; Müller, F

    2016-01-01

    Embryo development commences with the fusion of two terminally differentiated haploid gametes into the totipotent fertilized egg, which through a series of major cellular and molecular transitions generate a pluripotent cell mass. The activation of the zygotic genome occurs during the so-called maternal to zygotic transition and prepares the embryo for zygotic takeover from maternal factors, in the control of the development of cellular lineages during differentiation. Recent advances in next generation sequencing technologies have allowed the dissection of the genomic and epigenomic processes mediating this transition. These processes include reorganization of the chromatin structure to a transcriptionally permissive state, changes in composition and function of structural and regulatory DNA-binding proteins, and changeover of the transcriptome as it is overhauled from that deposited by the mother in the oocyte to a zygotically transcribed complement. Zygotic genome activation in zebrafish occurs 10 cell cycles after fertilization and provides an ideal experimental platform for elucidating the temporal sequence and dynamics of establishment of a transcriptionally active chromatin state and helps in identifying the determinants of transcription activation at polymerase II transcribed gene promoters. The relatively large number of pluripotent cells generated by the fast cell divisions before zygotic transcription provides sufficient biomass for next generation sequencing technology approaches to establish the temporal dynamics of events and suggest causative relationship between them. However, genomic and genetic technologies need to be improved further to capture the earliest events in development, where cell number is a limiting factor. These technologies need to be complemented with precise, inducible genetic interference studies using the latest genome editing tools to reveal the function of candidate determinants and to confirm the predictions made by classic

  19. Elucidation of the electrochromic mechanism of nanostructured iron oxides films

    Energy Technology Data Exchange (ETDEWEB)

    Garcia-Lobato, M.A.; Martinez, Arturo I.; Castro-Roman, M. [Center for Research and Advanced Studies of the National Polytechnic Institute, Cinvestav Campus Saltillo, Carr. Saltillo-Monterrey Km. 13, Ramos Arizpe, Coah. 25900 (Mexico); Perry, Dale L. [Mail Stop 70A1150, Lawrence Berkeley National Laboratory, University of California, Berkeley, CA 94720 (United States); Zarate, R.A. [Departamento de Fisica, Facultad de Ciencias, Universidad Catolica del Norte, Casilla 1280, Antofagasta (Chile); Escobar-Alarcon, L. (Departamento de Fisica, Instituto Nacional de Investigaciones Nucleares, A.P. 18-1027, 11801 Mexico)

    2011-02-15

    Nanostructured hematite thin films were electrochemically cycled in an aqueous solution of LiOH. Through optical, structural, morphological, and magnetic measurements, the coloration mechanism of electrochromic iron oxide thin films was elucidated. The conditions for double or single electrochromic behavior are given in this work. During the electrochemical cycling, it was found that topotactic transformations of hexagonal crystal structures are favored; i.e. {alpha}-Fe{sub 2}O{sub 3} to Fe(OH){sub 2} and subsequently to {delta}-FeOOH. These topotactic redox reactions are responsible for color changes of iron oxide films. (author)

  20. Molecular footprints of domestication and improvement in soybean revealed by whole genome re-sequencing

    DEFF Research Database (Denmark)

    Li, Ying-hui; Zhao, Shan-cen; Ma, Jian-xin

    2013-01-01

    and genetic improvement were identified.CONCLUSIONS:Given the uniqueness of the soybean germplasm sequenced, this study drew a clear picture of human-mediated evolution of the soybean genomes. The genomic resources and information provided by this study would also facilitate the discovery of genes......BACKGROUND:Artificial selection played an important role in the origin of modern Glycine max cultivars from the wild soybean Glycine soja. To elucidate the consequences of artificial selection accompanying the domestication and modern improvement of soybean, 25 new and 30 published whole-genome re...

  1. Phylogenetic distribution of large-scale genome patchiness

    Directory of Open Access Journals (Sweden)

    Hackenberg Michael

    2008-04-01

    Full Text Available Abstract Background The phylogenetic distribution of large-scale genome structure (i.e. mosaic compositional patchiness has been explored mainly by analytical ultracentrifugation of bulk DNA. However, with the availability of large, good-quality chromosome sequences, and the recently developed computational methods to directly analyze patchiness on the genome sequence, an evolutionary comparative analysis can be carried out at the sequence level. Results The local variations in the scaling exponent of the Detrended Fluctuation Analysis are used here to analyze large-scale genome structure and directly uncover the characteristic scales present in genome sequences. Furthermore, through shuffling experiments of selected genome regions, computationally-identified, isochore-like regions were identified as the biological source for the uncovered large-scale genome structure. The phylogenetic distribution of short- and large-scale patchiness was determined in the best-sequenced genome assemblies from eleven eukaryotic genomes: mammals (Homo sapiens, Pan troglodytes, Mus musculus, Rattus norvegicus, and Canis familiaris, birds (Gallus gallus, fishes (Danio rerio, invertebrates (Drosophila melanogaster and Caenorhabditis elegans, plants (Arabidopsis thaliana and yeasts (Saccharomyces cerevisiae. We found large-scale patchiness of genome structure, associated with in silico determined, isochore-like regions, throughout this wide phylogenetic range. Conclusion Large-scale genome structure is detected by directly analyzing DNA sequences in a wide range of eukaryotic chromosome sequences, from human to yeast. In all these genomes, large-scale patchiness can be associated with the isochore-like regions, as directly detected in silico at the sequence level.

  2. A comparative study of nemertean complete mitochondrial genomes, including two new ones for Nectonemertes cf. mirabilis and Zygeupolia rubens, may elucidate the fundamental pattern for the phylum Nemertea

    Directory of Open Access Journals (Sweden)

    Chen Hai-Xia

    2012-04-01

    Full Text Available Abstract Background The mitochondrial genome is important for studying genome evolution as well as reconstructing the phylogeny of organisms. Complete mitochondrial genome sequences have been reported for more than 2200 metazoans, mainly vertebrates and arthropods. To date, from a total of about 1275 described nemertean species, only three complete and two partial mitochondrial DNA sequences from nemerteans have been published. Here, we report the entire mitochondrial genomes for two more nemertean species: Nectonemertes cf. mirabilis and Zygeupolia rubens. Results The sizes of the entire mitochondrial genomes are 15365 bp for N. cf. mirabilis and 15513 bp for Z. rubens. Each circular genome contains 37 genes and an AT-rich non-coding region, and overall nucleotide composition is AT-rich. In both species, there is significant strand asymmetry in the distribution of nucleotides, with the coding strand being richer in T than A and in G than C. The AT-rich non-coding regions of the two genomes have some repeat sequences and stem-loop structures, both of which may be associated with the initiation of replication or transcription. The 22 tRNAs show variable substitution patterns in nemerteans, with higher sequence conservation in genes located on the H strand. Gene arrangement of N. cf. mirabilis is identical to that of Paranemertes cf. peregrina, both of which are Hoplonemertea, while that of Z. rubens is the same as in Lineus viridis, both of which are Heteronemertea. Comparison of the gene arrangements and phylogenomic analysis based on concatenated nucleotide sequences of the 12 mitochondrial protein-coding genes revealed that species with closer relationships share more identical gene blocks. Conclusion The two new mitochondrial genomes share many features, including gene contents, with other known nemertean mitochondrial genomes. The tRNA families display a composite substitution pathway. Gene order comparison to the proposed ground pattern of

  3. LC-MS analysis in the e-beam and gamma radiolysis of metoprolol tartrate in aqueous solution: Structure elucidation and formation mechanism of radiolytic products

    Energy Technology Data Exchange (ETDEWEB)

    Slegers, Catherine [Unite d' Analyse Chimique et Physico-chimique des Medicaments, Universite Catholique de Louvain, CHAM 72.30, Avenue E. Mounier, 72, B-1200, Brussels (Belgium)]. E-mail: catherine.slegers@skynet.be; Maquille, Aubert [Unite d' Analyse Chimique et Physico-chimique des Medicaments, Universite Catholique de Louvain, CHAM 72.30, Avenue E. Mounier, 72, B-1200, Brussels (Belgium); Deridder, Veronique [Unite d' Analyse Chimique et Physico-chimique des Medicaments, Universite Catholique de Louvain, CHAM 72.30, Avenue E. Mounier, 72, B-1200, Brussels (Belgium); Sonveaux, Etienne [Unite de Chimie Pharmaceutique et de Radiopharmacie, Universite Catholique de Louvain, Brussels (Belgium); Habib Jiwan, Jean-Louis [Laboratoire de Spectrometrie de Masse, Universite Catholique de Louvain, Louvain-La-Neuve (Belgium); Tilquin, Bernard [Unite d' Analyse Chimique et Physico-chimique des Medicaments, Universite Catholique de Louvain, CHAM 72.30, Avenue E. Mounier, 72, B-1200, Brussels (Belgium)

    2006-09-15

    E-beam and gamma products from the radiolysis of aqueous solutions of ({+-})-metoprolol tartrate, saturated in nitrogen, are analyzed by HPLC with on-line mass and UV detectors. The structures of 10 radiolytic products common to e-beam and gamma irradiations are elucidated by comparing their fragmentation pattern to that of ({+-})-metoprolol. Two of the radiolytic products are also metabolites. Different routes for the formation of the radiolytic products are proposed.

  4. Conserved microstructure of the Brassica B Genome of Brassica nigra in relation to homologous regions of Arabidopsis thaliana, B. rapa and B. oleracea

    Science.gov (United States)

    2013-01-01

    Background The Brassica B genome is known to carry several important traits, yet there has been limited analyses of its underlying genome structure, especially in comparison to the closely related A and C genomes. A bacterial artificial chromosome (BAC) library of Brassica nigra was developed and screened with 17 genes from a 222 kb region of A. thaliana that had been well characterised in both the Brassica A and C genomes. Results Fingerprinting of 483 apparently non-redundant clones defined physical contigs for the corresponding regions in B. nigra. The target region is duplicated in A. thaliana and six homologous contigs were found in B. nigra resulting from the whole genome triplication event shared by the Brassiceae tribe. BACs representative of each region were sequenced to elucidate the level of microscale rearrangements across the Brassica species divide. Conclusions Although the B genome species separated from the A/C lineage some 6 Mya, comparisons between the three paleopolyploid Brassica genomes revealed extensive conservation of gene content and sequence identity. The level of fractionation or gene loss varied across genomes and genomic regions; however, the greatest loss of genes was observed to be common to all three genomes. One large-scale chromosomal rearrangement differentiated the B genome suggesting such events could contribute to the lack of recombination observed between B genome species and those of the closely related A/C lineage. PMID:23586706

  5. Electrochemical Synthesis, Structure Elucidation and Antibacterial Evaluation of 9a-aza-9a-chloro-9a-homoerythromycin A

    Directory of Open Access Journals (Sweden)

    Zoran Mandic

    2014-09-01

    Full Text Available Electrochemical synthesis, structure elucidation and antibacterial evaluation of 9a-aza-9a-chloro-9a-homoerythromycin A were carried out. It was found that the anodic oxidation of 9a-aza-9a-homoerythromycin A via electrogenerated reactive chlorine species leads to the chlorination of lactam nitrogen in high yield provided the pH of the reaction mixture is maintained above 3. NOESY spectra reveal the existence of the mixture of two conformational families in the solution, the "folding-out" conformer being slightly more abundant comparing to 9a-Aza-9a-homoerythromycin A. The chlorine substitution of lactam hydrogen resulted in improved antimicrobial potency against Sreptococcus pyogenes PSCB0542, Moraxella catarrhalis ATCC 25238, Haemophilus  influenzae ATCC 49247 and Enteroccocus faecalis ATCC 29212.

  6. DFT and EXAFS, mutually enhancing tools for structure elucidation of Tc complexes

    International Nuclear Information System (INIS)

    Breynaert, E.; Bruggeman, C.; Maes, A.

    2005-01-01

    Full text of publication follows: The redox-sensitive fission product technetium-99 (Tc) is of great interest in nuclear waste disposal studies because of its potential for contaminating the geosphere due to its very long half-life (2.13 x 10 5 year) and high mobility under oxidising conditions, where technetium forms pertechnetate (TcO 4 - ). Under suitable reducing conditions, e.g. in the presence of an iron(II) containing solid phase which can act as an electron donor, the solubility can be limited by the reduction of pertechnetate followed by the formation of a surface precipitate [1]. However, by association with mobile humic substances (HS) or other associating/complexing species, the solubility of reduced Tc species may be drastically enhanced [2]. The elucidation of the identity and geometrical structure of the species causing this enhanced solubility often remains a difficult issue. Extended X-Ray Absorption Fluorescence Spectroscopy/ X-Ray Absorption Near Edge Spectroscopy (EXAFS/XANES) analysis can be very helpful to determine the molecular surroundings of reduced Tc organic complexes of Tc(IV) colloid-HS-associations. The interpretation of the EXAFS spectra is however not a straightforward process due to the possibly strong influence of multiple scattering paths on the spectra. Contrary to single scattering analysis which can be easily performed on the experimental EXAFS spectra, multiple scattering analysis can only be done, based on good approximations of the possible geometrical structures of the species under consideration. It has been shown that the Density Functional Theory (DFT) can be used to perform geometry optimizations of technetium species and therefore can be utilized to predict the structure of technetium complexes [3]. The success and computational effort needed for such structure predictions are however highly dependant on the quality of the initial structure from which the geometry optimization is started. Single scattering analysis of

  7. Biosynthesis of Akaeolide and Lorneic Acids and Annotation of Type I Polyketide Synthase Gene Clusters in the Genome of Streptomyces sp. NPS554

    Directory of Open Access Journals (Sweden)

    Tao Zhou

    2015-01-01

    Full Text Available The incorporation pattern of biosynthetic precursors into two structurally unique polyketides, akaeolide and lorneic acid A, was elucidated by feeding experiments with 13C-labeled precursors. In addition, the draft genome sequence of the producer, Streptomyces sp. NPS554, was performed and the biosynthetic gene clusters for these polyketides were identified. The putative gene clusters contain all the polyketide synthase (PKS domains necessary for assembly of the carbon skeletons. Combined with the 13C-labeling results, gene function prediction enabled us to propose biosynthetic pathways involving unusual carbon-carbon bond formation reactions. Genome analysis also indicated the presence of at least ten orphan type I PKS gene clusters that might be responsible for the production of new polyketides.

  8. Striking structural dynamism and nucleotide sequence variation of the transposon Galileo in the genome of Drosophila mojavensis.

    Science.gov (United States)

    Marzo, Mar; Bello, Xabier; Puig, Marta; Maside, Xulio; Ruiz, Alfredo

    2013-02-04

    Galileo is a transposable element responsible for the generation of three chromosomal inversions in natural populations of Drosophila buzzatii. Although the most characteristic feature of Galileo is the long internally-repetitive terminal inverted repeats (TIRs), which resemble the Drosophila Foldback element, its transposase-coding sequence has led to its classification as a member of the P-element superfamily (Class II, subclass 1, TIR order). Furthermore, Galileo has a wide distribution in the genus Drosophila, since it has been found in 6 of the 12 Drosophila sequenced genomes. Among these species, D. mojavensis, the one closest to D. buzzatii, presented the highest diversity in sequence and structure of Galileo elements. In the present work, we carried out a thorough search and annotation of all the Galileo copies present in the D. mojavensis sequenced genome. In our set of 170 Galileo copies we have detected 5 Galileo subfamilies (C, D, E, F, and X) with different structures ranging from nearly complete, to only 2 TIR or solo TIR copies. Finally, we have explored the structural and length variation of the Galileo copies that point out the relatively frequent rearrangements within and between Galileo elements. Different mechanisms responsible for these rearrangements are discussed. Although Galileo is a transposable element with an ancient history in the D. mojavensis genome, our data indicate a recent transpositional activity. Furthermore, the dynamism in sequence and structure, mainly affecting the TIRs, suggests an active exchange of sequences among the copies. This exchange could lead to new subfamilies of the transposon, which could be crucial for the long-term survival of the element in the genome.

  9. Detection of G-Quadruplex Structures Formed by G-Rich Sequences from Rice Genome and Transcriptome Using Combined Probes.

    Science.gov (United States)

    Chang, Tianjun; Li, Weiguo; Ding, Zhan; Cheng, Shaofei; Liang, Kun; Liu, Xiangjun; Bing, Tao; Shangguan, Dihua

    2017-08-01

    Putative G-quadruplex (G4) forming sequences (PQS) are highly prevalent in the genome and transcriptome of various organisms and are considered as potential regulation elements in many biological processes by forming G4 structures. The formation of G4 structures highly depends on the sequences and the environment. In most cases, it is difficult to predict G4 formation by PQS, especially PQS containing G2 tracts. Therefore, the experimental identification of G4 formation is essential in the study of G4-related biological functions. Herein, we report a rapid and simple method for the detection of G4 structures by using a pair of complementary reporters, hemin and BMSP. This method was applied to detect G4 structures formed by PQS (DNA and RNA) searched in the genome and transcriptome of Oryza sativa. Unlike most of the reported G4 probes that only recognize part of G4 structures, the proposed method based on combined probes positively responded to almost all G4 conformations, including parallel, antiparallel, and mixed/hybrid G4, but did not respond to non-G4 sequences. This method shows potential for high-throughput identification of G4 structures in genome and transcriptome. Furthermore, BMSP was observed to drive some PQS to form more stable G4 structures or induce the G4 formation of some PQS that cannot form G4 in normal physiological conditions, which may provide a powerful molecular tool for gene regulation.

  10. Multi-scale coding of genomic information: From DNA sequence to genome structure and function

    International Nuclear Information System (INIS)

    Arneodo, Alain; Vaillant, Cedric; Audit, Benjamin; Argoul, Francoise; D'Aubenton-Carafa, Yves; Thermes, Claude

    2011-01-01

    Understanding how chromatin is spatially and dynamically organized in the nucleus of eukaryotic cells and how this affects genome functions is one of the main challenges of cell biology. Since the different orders of packaging in the hierarchical organization of DNA condition the accessibility of DNA sequence elements to trans-acting factors that control the transcription and replication processes, there is actually a wealth of structural and dynamical information to learn in the primary DNA sequence. In this review, we show that when using concepts, methodologies, numerical and experimental techniques coming from statistical mechanics and nonlinear physics combined with wavelet-based multi-scale signal processing, we are able to decipher the multi-scale sequence encoding of chromatin condensation-decondensation mechanisms that play a fundamental role in regulating many molecular processes involved in nuclear functions.

  11. The pomegranate (Punica granatum L.) genome and the genomics of punicalagin biosynthesis.

    Science.gov (United States)

    Qin, Gaihua; Xu, Chunyan; Ming, Ray; Tang, Haibao; Guyot, Romain; Kramer, Elena M; Hu, Yudong; Yi, Xingkai; Qi, Yongjie; Xu, Xiangyang; Gao, Zhenghui; Pan, Haifa; Jian, Jianbo; Tian, Yinping; Yue, Zhen; Xu, Yiliu

    2017-09-01

    Pomegranate (Punica granatum L.) is a perennial fruit crop grown since ancient times that has been planted worldwide and is known for its functional metabolites, particularly punicalagins. We have sequenced and assembled the pomegranate genome with 328 Mb anchored into nine pseudo-chromosomes and annotated 29 229 gene models. A Myrtales lineage-specific whole-genome duplication event was detected that occurred in the common ancestor before the divergence of pomegranate and Eucalyptus. Repetitive sequences accounted for 46.1% of the assembled genome. We found that the integument development gene INNER NO OUTER (INO) was under positive selection and potentially contributed to the development of the fleshy outer layer of the seed coat, an edible part of pomegranate fruit. The genes encoding the enzymes for synthesis and degradation of lignin, hemicelluloses and cellulose were also differentially expressed between soft- and hard-seeded varieties, reflecting differences in their accumulation in cultivars differing in seed hardness. Candidate genes for punicalagin biosynthesis were identified and their expression patterns indicated that gallic acid synthesis in tissues could follow different biochemical pathways. The genome sequence of pomegranate provides a valuable resource for the dissection of many biological and biochemical traits and also provides important insights for the acceleration of breeding. Elucidation of the biochemical pathway(s) involved in punicalagin biosynthesis could assist breeding efforts to increase production of this bioactive compound. © 2017 The Authors The Plant Journal © 2017 John Wiley & Sons Ltd.

  12. DHX9 helicase is involved in preventing genomic instability induced by alternatively structured DNA in human cells.

    Science.gov (United States)

    Jain, Aklank; Bacolla, Albino; Del Mundo, Imee M; Zhao, Junhua; Wang, Guliang; Vasquez, Karen M

    2013-12-01

    Sequences that have the capacity to adopt alternative (i.e. non-B) DNA structures in the human genome have been implicated in stimulating genomic instability. Previously, we found that a naturally occurring intra-molecular triplex (H-DNA) caused genetic instability in mammals largely in the form of DNA double-strand breaks. Thus, it is of interest to determine the mechanism(s) involved in processing H-DNA. Recently, we demonstrated that human DHX9 helicase preferentially unwinds inter-molecular triplex DNA in vitro. Herein, we used a mutation-reporter system containing H-DNA to examine the relevance of DHX9 activity on naturally occurring H-DNA structures in human cells. We found that H-DNA significantly increased mutagenesis in small-interfering siRNA-treated, DHX9-depleted cells, affecting mostly deletions. Moreover, DHX9 associated with H-DNA in the context of supercoiled plasmids. To further investigate the role of DHX9 in the recognition/processing of H-DNA, we performed binding assays in vitro and chromatin immunoprecipitation assays in U2OS cells. DHX9 recognized H-DNA, as evidenced by its binding to the H-DNA structure and enrichment at the H-DNA region compared with a control region in human cells. These composite data implicate DHX9 in processing H-DNA structures in vivo and support its role in the overall maintenance of genomic stability at sites of alternatively structured DNA.

  13. Structure and mechanism of the ATPase that powers viral genome packaging.

    Science.gov (United States)

    Hilbert, Brendan J; Hayes, Janelle A; Stone, Nicholas P; Duffy, Caroline M; Sankaran, Banumathi; Kelch, Brian A

    2015-07-21

    Many viruses package their genomes into procapsids using an ATPase machine that is among the most powerful known biological motors. However, how this motor couples ATP hydrolysis to DNA translocation is still unknown. Here, we introduce a model system with unique properties for studying motor structure and mechanism. We describe crystal structures of the packaging motor ATPase domain that exhibit nucleotide-dependent conformational changes involving a large rotation of an entire subdomain. We also identify the arginine finger residue that catalyzes ATP hydrolysis in a neighboring motor subunit, illustrating that previous models for motor structure need revision. Our findings allow us to derive a structural model for the motor ring, which we validate using small-angle X-ray scattering and comparisons with previously published data. We illustrate the model's predictive power by identifying the motor's DNA-binding and assembly motifs. Finally, we integrate our results to propose a mechanistic model for DNA translocation by this molecular machine.

  14. Effects of aneuploidy on genome structure, expression, and interphase organization in Arabidopsis thaliana.

    Directory of Open Access Journals (Sweden)

    Bruno Huettel

    2008-10-01

    Full Text Available Aneuploidy refers to losses and/or gains of individual chromosomes from the normal chromosome set. The resulting gene dosage imbalance has a noticeable affect on the phenotype, as illustrated by aneuploid syndromes, including Down syndrome in humans, and by human solid tumor cells, which are highly aneuploid. Although the phenotypic manifestations of aneuploidy are usually apparent, information about the underlying alterations in structure, expression, and interphase organization of unbalanced chromosome sets is still sparse. Plants generally tolerate aneuploidy better than animals, and, through colchicine treatment and breeding strategies, it is possible to obtain inbred sibling plants with different numbers of chromosomes. This possibility, combined with the genetic and genomics tools available for Arabidopsis thaliana, provides a powerful means to assess systematically the molecular and cytological consequences of aberrant numbers of specific chromosomes. Here, we report on the generation of Arabidopsis plants in which chromosome 5 is present in triplicate. We compare the global transcript profiles of normal diploids and chromosome 5 trisomics, and assess genome integrity using array comparative genome hybridization. We use live cell imaging to determine the interphase 3D arrangement of transgene-encoded fluorescent tags on chromosome 5 in trisomic and triploid plants. The results indicate that trisomy 5 disrupts gene expression throughout the genome and supports the production and/or retention of truncated copies of chromosome 5. Although trisomy 5 does not grossly distort the interphase arrangement of fluorescent-tagged sites on chromosome 5, it may somewhat enhance associations between transgene alleles. Our analysis reveals the complex genomic changes that can occur in aneuploids and underscores the importance of using multiple experimental approaches to investigate how chromosome numerical changes condition abnormal phenotypes and

  15. Coordination of genomic structure and transcription by the main bacterial nucleoid-associated protein HU

    Science.gov (United States)

    Berger, Michael; Farcas, Anca; Geertz, Marcel; Zhelyazkova, Petya; Brix, Klaudia; Travers, Andrew; Muskhelishvili, Georgi

    2010-01-01

    The histone-like protein HU is a highly abundant DNA architectural protein that is involved in compacting the DNA of the bacterial nucleoid and in regulating the main DNA transactions, including gene transcription. However, the coordination of the genomic structure and function by HU is poorly understood. Here, we address this question by comparing transcript patterns and spatial distributions of RNA polymerase in Escherichia coli wild-type and hupA/B mutant cells. We demonstrate that, in mutant cells, upregulated genes are preferentially clustered in a large chromosomal domain comprising the ribosomal RNA operons organized on both sides of OriC. Furthermore, we show that, in parallel to this transcription asymmetry, mutant cells are also impaired in forming the transcription foci—spatially confined aggregations of RNA polymerase molecules transcribing strong ribosomal RNA operons. Our data thus implicate HU in coordinating the global genomic structure and function by regulating the spatial distribution of RNA polymerase in the nucleoid. PMID:20010798

  16. LC-MS analysis in the e-beam and gamma radiolysis of metoprolol tartrate in aqueous solution: Structure elucidation and formation mechanism of radiolytic products

    International Nuclear Information System (INIS)

    Slegers, Catherine; Maquille, Aubert; Deridder, Veronique; Sonveaux, Etienne; Habib Jiwan, Jean-Louis; Tilquin, Bernard

    2006-01-01

    E-beam and gamma products from the radiolysis of aqueous solutions of (±)-metoprolol tartrate, saturated in nitrogen, are analyzed by HPLC with on-line mass and UV detectors. The structures of 10 radiolytic products common to e-beam and gamma irradiations are elucidated by comparing their fragmentation pattern to that of (±)-metoprolol. Two of the radiolytic products are also metabolites. Different routes for the formation of the radiolytic products are proposed

  17. Genomic analyses of the Chlamydia trachomatis core genome show an association between chromosomal genome, plasmid type and disease

    NARCIS (Netherlands)

    Versteeg, Bart; Bruisten, Sylvia M.; Pannekoek, Yvonne; Jolley, Keith A.; Maiden, Martin C. J.; van der Ende, Arie; Harrison, Odile B.

    2018-01-01

    Background: Chlamydia trachomatis (Ct) plasmid has been shown to encode genes essential for infection. We evaluated the population structure of Ct using whole-genome sequence data (WGS). In particular, the relationship between the Ct genome, plasmid and disease was investigated. Results: WGS data

  18. Body maps on the human genome.

    Science.gov (United States)

    Cherniak, Christopher; Rodriguez-Esteban, Raul

    2013-12-20

    Chromosomes have territories, or preferred locales, in the cell nucleus. When these sites are taken into account, some large-scale structure of the human genome emerges. The synoptic picture is that genes highly expressed in particular topologically compact tissues are not randomly distributed on the genome. Rather, such tissue-specific genes tend to map somatotopically onto the complete chromosome set. They seem to form a "genome homunculus": a multi-dimensional, genome-wide body representation extending across chromosome territories of the entire spermcell nucleus. The antero-posterior axis of the body significantly corresponds to the head-tail axis of the nucleus, and the dorso-ventral body axis to the central-peripheral nucleus axis. This large-scale genomic structure includes thousands of genes. One rationale for a homuncular genome structure would be to minimize connection costs in genetic networks. Somatotopic maps in cerebral cortex have been reported for over a century.

  19. Tracing common origins of Genomic Islands in prokaryotes based on genome signature analyses.

    Science.gov (United States)

    van Passel, Mark Wj

    2011-09-01

    Horizontal gene transfer constitutes a powerful and innovative force in evolution, but often little is known about the actual origins of transferred genes. Sequence alignments are generally of limited use in tracking the original donor, since still only a small fraction of the total genetic diversity is thought to be uncovered. Alternatively, approaches based on similarities in the genome specific relative oligonucleotide frequencies do not require alignments. Even though the exact origins of horizontally transferred genes may still not be established using these compositional analyses, it does suggest that compositionally very similar regions are likely to have had a common origin. These analyses have shown that up to a third of large acquired gene clusters that reside in the same genome are compositionally very similar, indicative of a shared origin. This brings us closer to uncovering the original donors of horizontally transferred genes, and could help in elucidating possible regulatory interactions between previously unlinked sequences.

  20. Genome-scale characterization of RNA tertiary structures and their functional impact by RNA solvent accessibility prediction.

    Science.gov (United States)

    Yang, Yuedong; Li, Xiaomei; Zhao, Huiying; Zhan, Jian; Wang, Jihua; Zhou, Yaoqi

    2017-01-01

    As most RNA structures are elusive to structure determination, obtaining solvent accessible surface areas (ASAs) of nucleotides in an RNA structure is an important first step to characterize potential functional sites and core structural regions. Here, we developed RNAsnap, the first machine-learning method trained on protein-bound RNA structures for solvent accessibility prediction. Built on sequence profiles from multiple sequence alignment (RNAsnap-prof), the method provided robust prediction in fivefold cross-validation and an independent test (Pearson correlation coefficients, r, between predicted and actual ASA values are 0.66 and 0.63, respectively). Application of the method to 6178 mRNAs revealed its positive correlation to mRNA accessibility by dimethyl sulphate (DMS) experimentally measured in vivo (r = 0.37) but not in vitro (r = 0.07), despite the lack of training on mRNAs and the fact that DMS accessibility is only an approximation to solvent accessibility. We further found strong association across coding and noncoding regions between predicted solvent accessibility of the mutation site of a single nucleotide variant (SNV) and the frequency of that variant in the population for 2.2 million SNVs obtained in the 1000 Genomes Project. Moreover, mapping solvent accessibility of RNAs to the human genome indicated that introns, 5' cap of 5' and 3' cap of 3' untranslated regions, are more solvent accessible, consistent with their respective functional roles. These results support conformational selections as the mechanism for the formation of RNA-protein complexes and highlight the utility of genome-scale characterization of RNA tertiary structures by RNAsnap. The server and its stand-alone downloadable version are available at http://sparks-lab.org. © 2016 Yang et al.; Published by Cold Spring Harbor Laboratory Press for the RNA Society.

  1. Multiple independent structural dynamic events in the evolution of snake mitochondrial genomes.

    Science.gov (United States)

    Qian, Lifu; Wang, Hui; Yan, Jie; Pan, Tao; Jiang, Shanqun; Rao, Dingqi; Zhang, Baowei

    2018-05-10

    Mitochondrial DNA sequences have long been used in phylogenetic studies. However, little attention has been paid to the changes in gene arrangement patterns in the snake's mitogenome. Here, we analyzed the complete mitogenome sequences and structures of 65 snake species from 14 families and examined their structural patterns, organization and evolution. Our purpose was to further investigate the evolutionary implications and possible rearrangement mechanisms of the mitogenome within snakes. In total, eleven types of mitochondrial gene arrangement patterns were detected (Type I, II, III, III-A, III-B, III-B1, III-C, III-D, III-E, III-F, III-G), with mitochondrial genome rearrangements being a major trend in snakes, especially in Alethinophidia. In snake mitogenomes, the rearrangements mainly involved three processes, gene loss, translocation and duplication. Within Scolecophidia, the O L was lost several times in Typhlopidae and Leptotyphlopidae, but persisted as a plesiomorphy in the Alethinophidia. Duplication of the control region and translocation of the tRNA Leu gene are two visible features in Alethinophidian mitochondrial genomes. Independently and stochastically, the duplication of pseudo-Pro (P*) emerged in seven different lineages of unequal size in three families, indicating that the presence of P* was a polytopic event in the mitogenome. The WANCY tRNA gene cluster and the control regions and their adjacent segments were hotspots for mitogenome rearrangement. Maintenance of duplicate control regions may be the source for snake mitogenome structural diversity.

  2. The genome of Paenibacillus sabinae T27 provides insight into evolution, organization and functional elucidation of nif and nif-like genes

    OpenAIRE

    Li, Xinxin; Deng, Zhiping; Liu, Zhanzhi; Yan, Yongliang; Wang, Tianshu; Xie, Jianbo; Lin, Min; Cheng, Qi; Chen, Sanfeng

    2014-01-01

    Background Most biological nitrogen fixation is catalyzed by the molybdenum nitrogenase. This enzyme is a complex which contains the MoFe protein encoded by nifDK and the Fe protein encoded by nifH. In addition to nifHDK, nifHDK-like genes were found in some Archaea and Firmicutes, but their function is unclear. Results We sequenced the genome of Paenibacillus sabinae T27. A total of 4,793 open reading frames were predicted from its 5.27 Mb genome. The genome of P. sabinae T27 contains fiftee...

  3. Mayotlide: synthetic approaches and structural elucidation

    OpenAIRE

    Herraiz Cobo, Jesús

    2017-01-01

    [eng] Mayotlide is a marine peptide isolated by PharmaMar S.A. from Spongia sp.. The sequence of the aminoacids were achieved by MS-MS spectrometry, where two of them were tryptophans. The NMR spectroscopy revealed the presence of N1-C3a bond between the tryptophans, which means that one of them was cyclized. On the first structure proposal, the aminoacids were forming two macrocyclic rings: on the ring B, all the aminoacids of molecule were tied by amide bond, remarking the presence of the ...

  4. Mayotlide: synthetic approaches and structural elucidation

    OpenAIRE

    Herraiz Cobo, Jesús

    2017-01-01

    Mayotlide is a marine peptide isolated by PharmaMar S.A. from Spongia sp.. The sequence of the aminoacids were achieved by MS-MS spectrometry, where two of them were tryptophans. The NMR spectroscopy revealed the presence of N1-C3a bond between the tryptophans, which means that one of them was cyclized. On the first structure proposal, the aminoacids were forming two macrocyclic rings: on the ring B, all the aminoacids of molecule were tied by amide bond, remarking the presence of the cycliz...

  5. Predicting Tissue-Specific Enhancers in the Human Genome

    Energy Technology Data Exchange (ETDEWEB)

    Pennacchio, Len A.; Loots, Gabriela G.; Nobrega, Marcelo A.; Ovcharenko, Ivan

    2006-07-01

    Determining how transcriptional regulatory signals areencoded in vertebrate genomes is essential for understanding the originsof multi-cellular complexity; yet the genetic code of vertebrate generegulation remains poorly understood. In an attempt to elucidate thiscode, we synergistically combined genome-wide gene expression profiling,vertebrate genome comparisons, and transcription factor binding siteanalysis to define sequence signatures characteristic of candidatetissue-specific enhancers in the human genome. We applied this strategyto microarray-based gene expression profiles from 79 human tissues andidentified 7,187 candidate enhancers that defined their flanking geneexpression, the majority of which were located outside of knownpromoters. We cross-validated this method for its ability to de novopredict tissue-specific gene expression and confirmed its reliability in57 of the 79 available human tissues, with an average precision inenhancer recognition ranging from 32 percent to 63 percent, and asensitivity of 47 percent. We used the sequence signatures identified bythis approach to assign tissue-specific predictions to ~;328,000human-mouse conserved noncoding elements in the human genome. Byoverlapping these genome-wide predictions with a large in vivo dataset ofenhancers validated in transgenic mice, we confirmed our results with a28 percent sensitivity and 50 percent precision. These results indicatethe power of combining complementary genomic datasets as an initialcomputational foray into the global view of tissue-specific generegulation in vertebrates.

  6. Chloroplast genomes of Arabidopsis halleri ssp. gemmifera and Arabidopsis lyrata ssp. petraea: Structures and comparative analysis.

    Science.gov (United States)

    Asaf, Sajjad; Khan, Abdul Latif; Khan, Muhammad Aaqil; Waqas, Muhammad; Kang, Sang-Mo; Yun, Byung-Wook; Lee, In-Jung

    2017-08-08

    We investigated the complete chloroplast (cp) genomes of non-model Arabidopsis halleri ssp. gemmifera and Arabidopsis lyrata ssp. petraea using Illumina paired-end sequencing to understand their genetic organization and structure. Detailed bioinformatics analysis revealed genome sizes of both subspecies ranging between 154.4~154.5 kbp, with a large single-copy region (84,197~84,158 bp), a small single-copy region (17,738~17,813 bp) and pair of inverted repeats (IRa/IRb; 26,264~26,259 bp). Both cp genomes encode 130 genes, including 85 protein-coding genes, eight ribosomal RNA genes and 37 transfer RNA genes. Whole cp genome comparison of A. halleri ssp. gemmifera and A. lyrata ssp. petraea, along with ten other Arabidopsis species, showed an overall high degree of sequence similarity, with divergence among some intergenic spacers. The location and distribution of repeat sequences were determined, and sequence divergences of shared genes were calculated among related species. Comparative phylogenetic analysis of the entire genomic data set and 70 shared genes between both cp genomes confirmed the previous phylogeny and generated phylogenetic trees with the same topologies. The sister species of A. halleri ssp. gemmifera is A. umezawana, whereas the closest relative of A. lyrata spp. petraea is A. arenicola.

  7. GenColors-based comparative genome databases for small eukaryotic genomes.

    Science.gov (United States)

    Felder, Marius; Romualdi, Alessandro; Petzold, Andreas; Platzer, Matthias; Sühnel, Jürgen; Glöckner, Gernot

    2013-01-01

    Many sequence data repositories can give a quick and easily accessible overview on genomes and their annotations. Less widespread is the possibility to compare related genomes with each other in a common database environment. We have previously described the GenColors database system (http://gencolors.fli-leibniz.de) and its applications to a number of bacterial genomes such as Borrelia, Legionella, Leptospira and Treponema. This system has an emphasis on genome comparison. It combines data from related genomes and provides the user with an extensive set of visualization and analysis tools. Eukaryote genomes are normally larger than prokaryote genomes and thus pose additional challenges for such a system. We have, therefore, adapted GenColors to also handle larger datasets of small eukaryotic genomes and to display eukaryotic gene structures. Further recent developments include whole genome views, genome list options and, for bacterial genome browsers, the display of horizontal gene transfer predictions. Two new GenColors-based databases for two fungal species (http://fgb.fli-leibniz.de) and for four social amoebas (http://sacgb.fli-leibniz.de) were set up. Both new resources open up a single entry point for related genomes for the amoebozoa and fungal research communities and other interested users. Comparative genomics approaches are greatly facilitated by these resources.

  8. Structural elucidation of the DFG-Asp in and DFG-Asp out states of TAM kinases and insight into the selectivity of their inhibitors.

    Science.gov (United States)

    Messoussi, Abdellah; Peyronnet, Lucile; Feneyrolles, Clémence; Chevé, Gwénaël; Bougrin, Khalid; Yasri, Aziz

    2014-10-10

    Structural elucidation of the active (DFG-Asp in) and inactive (DFG-Asp out) states of the TAM family of receptor tyrosine kinases is required for future development of TAM inhibitors as drugs. Herein we report a computational study on each of the three TAM members Tyro-3, Axl and Mer. DFG-Asp in and DFG-Asp out homology models of each one were built based on the X-ray structure of c-Met kinase, an enzyme with a closely related sequence. Structural validation and in silico screening enabled identification of critical amino acids for ligand binding within the active site of each DFG-Asp in and DFG-Asp out model. The position and nature of amino acids that differ among Tyro-3, Axl and Mer, and the potential role of these residues in the design of selective TAM ligands, are discussed.

  9. Observing copepods through a genomic lens

    Directory of Open Access Journals (Sweden)

    Johnson Stewart C

    2011-09-01

    Full Text Available Abstract Background Copepods outnumber every other multicellular animal group. They are critical components of the world's freshwater and marine ecosystems, sensitive indicators of local and global climate change, key ecosystem service providers, parasites and predators of economically important aquatic animals and potential vectors of waterborne disease. Copepods sustain the world fisheries that nourish and support human populations. Although genomic tools have transformed many areas of biological and biomedical research, their power to elucidate aspects of the biology, behavior and ecology of copepods has only recently begun to be exploited. Discussion The extraordinary biological and ecological diversity of the subclass Copepoda provides both unique advantages for addressing key problems in aquatic systems and formidable challenges for developing a focused genomics strategy. This article provides an overview of genomic studies of copepods and discusses strategies for using genomics tools to address key questions at levels extending from individuals to ecosystems. Genomics can, for instance, help to decipher patterns of genome evolution such as those that occur during transitions from free living to symbiotic and parasitic lifestyles and can assist in the identification of genetic mechanisms and accompanying physiological changes associated with adaptation to new or physiologically challenging environments. The adaptive significance of the diversity in genome size and unique mechanisms of genome reorganization during development could similarly be explored. Genome-wide and EST studies of parasitic copepods of salmon and large EST studies of selected free-living copepods have demonstrated the potential utility of modern genomics approaches for the study of copepods and have generated resources such as EST libraries, shotgun genome sequences, BAC libraries, genome maps and inbred lines that will be invaluable in assisting further efforts to

  10. Observing copepods through a genomic lens

    Science.gov (United States)

    2011-01-01

    Background Copepods outnumber every other multicellular animal group. They are critical components of the world's freshwater and marine ecosystems, sensitive indicators of local and global climate change, key ecosystem service providers, parasites and predators of economically important aquatic animals and potential vectors of waterborne disease. Copepods sustain the world fisheries that nourish and support human populations. Although genomic tools have transformed many areas of biological and biomedical research, their power to elucidate aspects of the biology, behavior and ecology of copepods has only recently begun to be exploited. Discussion The extraordinary biological and ecological diversity of the subclass Copepoda provides both unique advantages for addressing key problems in aquatic systems and formidable challenges for developing a focused genomics strategy. This article provides an overview of genomic studies of copepods and discusses strategies for using genomics tools to address key questions at levels extending from individuals to ecosystems. Genomics can, for instance, help to decipher patterns of genome evolution such as those that occur during transitions from free living to symbiotic and parasitic lifestyles and can assist in the identification of genetic mechanisms and accompanying physiological changes associated with adaptation to new or physiologically challenging environments. The adaptive significance of the diversity in genome size and unique mechanisms of genome reorganization during development could similarly be explored. Genome-wide and EST studies of parasitic copepods of salmon and large EST studies of selected free-living copepods have demonstrated the potential utility of modern genomics approaches for the study of copepods and have generated resources such as EST libraries, shotgun genome sequences, BAC libraries, genome maps and inbred lines that will be invaluable in assisting further efforts to provide genomics tools for

  11. Genomic DNA sequences from mastodon and woolly mammoth reveal deep speciation of forest and savanna elephants.

    Directory of Open Access Journals (Sweden)

    Nadin Rohland

    2010-12-01

    Full Text Available To elucidate the history of living and extinct elephantids, we generated 39,763 bp of aligned nuclear DNA sequence across 375 loci for African savanna elephant, African forest elephant, Asian elephant, the extinct American mastodon, and the woolly mammoth. Our data establish that the Asian elephant is the closest living relative of the extinct mammoth in the nuclear genome, extending previous findings from mitochondrial DNA analyses. We also find that savanna and forest elephants, which some have argued are the same species, are as or more divergent in the nuclear genome as mammoths and Asian elephants, which are considered to be distinct genera, thus resolving a long-standing debate about the appropriate taxonomic classification of the African elephants. Finally, we document a much larger effective population size in forest elephants compared with the other elephantid taxa, likely reflecting species differences in ancient geographic structure and range and differences in life history traits such as variance in male reproductive success.

  12. Genome Sequence of Talaromyces atroroseus, Which Produces Red Colorants for the Food Industry

    DEFF Research Database (Denmark)

    Thrane, Ulf; Rasmussen, Kasper Bøwig; Petersen, Bent

    2017-01-01

    Talaromyces atroroseus is a known producer of Monascus colorants suitable for the food industry. Furthermore, genetic tools have been established that facilitate elucidation and engineering of its biosynthetic pathways. Here, we report the draft genome of a potential fungal cell factory, T...

  13. Exploring Protein Function Using the Saccharomyces Genome Database.

    Science.gov (United States)

    Wong, Edith D

    2017-01-01

    Elucidating the function of individual proteins will help to create a comprehensive picture of cell biology, as well as shed light on human disease mechanisms, possible treatments, and cures. Due to its compact genome, and extensive history of experimentation and annotation, the budding yeast Saccharomyces cerevisiae is an ideal model organism in which to determine protein function. This information can then be leveraged to infer functions of human homologs. Despite the large amount of research and biological data about S. cerevisiae, many proteins' functions remain unknown. Here, we explore ways to use the Saccharomyces Genome Database (SGD; http://www.yeastgenome.org ) to predict the function of proteins and gain insight into their roles in various cellular processes.

  14. De Novo Discovery of Structured ncRNA Motifs in Genomic Sequences

    DEFF Research Database (Denmark)

    Ruzzo, Walter L; Gorodkin, Jan

    2014-01-01

    De novo discovery of "motifs" capturing the commonalities among related noncoding ncRNA structured RNAs is among the most difficult problems in computational biology. This chapter outlines the challenges presented by this problem, together with some approaches towards solving them, with an emphas...... on an approach based on the CMfinder CMfinder program as a case study. Applications to genomic screens for novel de novo structured ncRNA ncRNA s, including structured RNA elements in untranslated portions of protein-coding genes, are presented.......De novo discovery of "motifs" capturing the commonalities among related noncoding ncRNA structured RNAs is among the most difficult problems in computational biology. This chapter outlines the challenges presented by this problem, together with some approaches towards solving them, with an emphasis...

  15. CRISPR/Cas9-mediated genome engineering of CHO cell factories: application and perspectives

    DEFF Research Database (Denmark)

    Lee, Jae Seong; Grav, Lise Marie; Lewis, Nathan E.

    2015-01-01

    repeat (CRISPR)/CRISPR-associated protein 9 (Cas9) system enables rapid,easy and efficient engineering of mammalian genomes. It has a wide range of applications frommodification of individual genes to genome-wide screening or regulation of genes. Facile genomeediting using CRISPR/Cas9 empowers...... researchers in the CHO community to elucidate the mechanisticbasis behind high level production of proteins and product quality attributes of interest. Inthis review, we describe the basis of CRISPR/Cas9-mediated genome editing and its applicationfor development of next generation CHO cell factories while...... highlighting both future perspectivesand challenges. As one of the main drivers for the CHO systems biology era, genome engineeringwith CRISPR/Cas9 will pave the way for rational design of CHO cell factories....

  16. Genome U-Plot: a whole genome visualization.

    Science.gov (United States)

    Gaitatzes, Athanasios; Johnson, Sarah H; Smadbeck, James B; Vasmatzis, George

    2018-05-15

    The ability to produce and analyze whole genome sequencing (WGS) data from samples with structural variations (SV) generated the need to visualize such abnormalities in simplified plots. Conventional two-dimensional representations of WGS data frequently use either circular or linear layouts. There are several diverse advantages regarding both these representations, but their major disadvantage is that they do not use the two-dimensional space very efficiently. We propose a layout, termed the Genome U-Plot, which spreads the chromosomes on a two-dimensional surface and essentially quadruples the spatial resolution. We present the Genome U-Plot for producing clear and intuitive graphs that allows researchers to generate novel insights and hypotheses by visualizing SVs such as deletions, amplifications, and chromoanagenesis events. The main features of the Genome U-Plot are its layered layout, its high spatial resolution and its improved aesthetic qualities. We compare conventional visualization schemas with the Genome U-Plot using visualization metrics such as number of line crossings and crossing angle resolution measures. Based on our metrics, we improve the readability of the resulting graph by at least 2-fold, making apparent important features and making it easy to identify important genomic changes. A whole genome visualization tool with high spatial resolution and improved aesthetic qualities. An implementation and documentation of the Genome U-Plot is publicly available at https://github.com/gaitat/GenomeUPlot. vasmatzis.george@mayo.edu. Supplementary data are available at Bioinformatics online.

  17. Core genome conservation of Staphylococcus haemolyticus limits sequence based population structure analysis.

    Science.gov (United States)

    Cavanagh, Jorunn Pauline; Klingenberg, Claus; Hanssen, Anne-Merethe; Fredheim, Elizabeth Aarag; Francois, Patrice; Schrenzel, Jacques; Flægstad, Trond; Sollid, Johanna Ericson

    2012-06-01

    The notoriously multi-resistant Staphylococcus haemolyticus is an emerging pathogen causing serious infections in immunocompromised patients. Defining the population structure is important to detect outbreaks and spread of antimicrobial resistant clones. Currently, the standard typing technique is pulsed-field gel electrophoresis (PFGE). In this study we describe novel molecular typing schemes for S. haemolyticus using multi locus sequence typing (MLST) and multi locus variable number of tandem repeats (VNTR) analysis. Seven housekeeping genes (MLST) and five VNTR loci (MLVF) were selected for the novel typing schemes. A panel of 45 human and veterinary S. haemolyticus isolates was investigated. The collection had diverse PFGE patterns (38 PFGE types) and was sampled over a 20 year-period from eight countries. MLST resolved 17 sequence types (Simpsons index of diversity [SID]=0.877) and MLVF resolved 14 repeat types (SID=0.831). We found a low sequence diversity. Phylogenetic analysis clustered the isolates in three (MLST) and one (MLVF) clonal complexes, respectively. Taken together, neither the MLST nor the MLVF scheme was suitable to resolve the population structure of this S. haemolyticus collection. Future MLVF and MLST schemes will benefit from addition of more variable core genome sequences identified by comparing different fully sequenced S. haemolyticus genomes. Copyright © 2012 Elsevier B.V. All rights reserved.

  18. Single-Cell (Meta-Genomics of a Dimorphic Candidatus Thiomargarita nelsonii Reveals Genomic Plasticity

    Directory of Open Access Journals (Sweden)

    Beverly E. Flood

    2016-05-01

    Full Text Available The genus Thiomargarita includes the world’s largest bacteria. But as uncultured organisms, their physiology, metabolism, and basis for their gigantism are not well understood. Thus a genomics approach, applied to a single Candidatus Thiomargarita nelsonii cell was employed to explore the genetic potential of one of these enigmatic giant bacteria. The Thiomargarita cell was obtained from an assemblage of budding Ca. T. nelsonii attached to a provannid gastropod shell from Hydrate Ridge, a methane seep offshore of Oregon, USA. Here we present a manually curated genome of Bud S10 resulting from a hybrid assembly of long Pacific Biosciences and short Illumina sequencing reads. With respect to inorganic carbon fixation and sulfur oxidation pathways, the Ca. T. nelsonii Hydrate Ridge Bud S10 genome was similar to marine sister taxa within the family Beggiatoaceae. However, the Bud S10 genome contains genes suggestive of the genetic potential for lithotrophic growth on arsenite and perhaps hydrogen. The genome also revealed that Bud S10 likely respires nitrate via two pathways: a complete denitrification pathway and a dissimilatory nitrate reduction to ammonia pathway. Both pathways have been predicted, but not previously fully elucidated, in the genomes of other large, vacuolated, sulfur-oxidizing bacteria.Surprisingly, the genome also had a high number of unusual features for a bacterium to include the largest number of metacaspases and introns ever reported in a bacterium. Also present, are a large number of other mobile genetic elements, such as insertion sequence transposable elements and miniature inverted-repeat transposable elements (MITEs. In some cases, mobile genetic elements disrupted key genes in metabolic pathways. For example, a MITE interrupts hupL, which encodes the large subunit of the hydrogenase in hydrogen oxidation. Moreover, we detected a group I intron in one of the most critical genes in the sulfur oxidation pathway, dsr

  19. Who ate whom? Adaptive Helicobacter genomic changes that accompanied a host jump from early humans to large felines.

    Directory of Open Access Journals (Sweden)

    Mark Eppinger

    2006-07-01

    Full Text Available Helicobacter pylori infection of humans is so old that its population genetic structure reflects that of ancient human migrations. A closely related species, Helicobacter acinonychis, is specific for large felines, including cheetahs, lions, and tigers, whereas hosts more closely related to humans harbor more distantly related Helicobacter species. This observation suggests a jump between host species. But who ate whom and when did it happen? In order to resolve this question, we determined the genomic sequence of H. acinonychis strain Sheeba and compared it to genomes from H. pylori. The conserved core genes between the genomes are so similar that the host jump probably occurred within the last 200,000 (range 50,000-400,000 years. However, the Sheeba genome also possesses unique features that indicate the direction of the host jump, namely from early humans to cats. Sheeba possesses an unusually large number of highly fragmented genes, many encoding outer membrane proteins, which may have been destroyed in order to bypass deleterious responses from the feline host immune system. In addition, the few Sheeba-specific genes that were found include a cluster of genes encoding sialylation of the bacterial cell surface carbohydrates, which were imported by horizontal genetic exchange and might also help to evade host immune defenses. These results provide a genomic basis for elucidating molecular events that allow bacteria to adapt to novel animal hosts.

  20. Applied Genetics and Genomics in Alfalfa Breeding

    Directory of Open Access Journals (Sweden)

    E. Charles Brummer

    2012-03-01

    Full Text Available Alfalfa (Medicago sativa L., a perennial and outcrossing species, is a widely planted forage legume for hay, pasture and silage throughout the world. Currently, alfalfa breeding relies on recurrent phenotypic selection, but alternatives incorporating molecular marker assisted breeding could enhance genetic gain per unit time and per unit cost, and accelerate alfalfa improvement. Many major quantitative trait loci (QTL related to agronomic traits have been identified by family-based QTL mapping, but in relatively large genomic regions. Candidate genes elucidated from model species have helped to identify some potential causal loci in alfalfa mapping and breeding population for specific traits. Recently, high throughput sequencing technologies, coupled with advanced bioinformatics tools, have been used to identify large numbers of single nucleotide polymorphisms (SNP in alfalfa, which are being developed into markers. These markers will facilitate fine mapping of quantitative traits and genome wide association mapping of agronomic traits and further advanced breeding strategies for alfalfa, such as marker-assisted selection and genomic selection. Based on ideas from the literature, we suggest several ways to improve selection in alfalfa including (1 diversity selection and paternity testing, (2 introgression of QTL and (3 genomic selection.

  1. Mycobacterium tuberculosis whole genome sequencing and protein structure modelling provides insights into anti-tuberculosis drug resistance

    KAUST Repository

    Phelan, Jody

    2016-03-23

    Background Combating the spread of drug resistant tuberculosis is a global health priority. Whole genome association studies are being applied to identify genetic determinants of resistance to anti-tuberculosis drugs. Protein structure and interaction modelling are used to understand the functional effects of putative mutations and provide insight into the molecular mechanisms leading to resistance. Methods To investigate the potential utility of these approaches, we analysed the genomes of 144 Mycobacterium tuberculosis clinical isolates from The Special Programme for Research and Training in Tropical Diseases (TDR) collection sourced from 20 countries in four continents. A genome-wide approach was applied to 127 isolates to identify polymorphisms associated with minimum inhibitory concentrations for first-line anti-tuberculosis drugs. In addition, the effect of identified candidate mutations on protein stability and interactions was assessed quantitatively with well-established computational methods. Results The analysis revealed that mutations in the genes rpoB (rifampicin), katG (isoniazid), inhA-promoter (isoniazid), rpsL (streptomycin) and embB (ethambutol) were responsible for the majority of resistance observed. A subset of the mutations identified in rpoB and katG were predicted to affect protein stability. Further, a strong direct correlation was observed between the minimum inhibitory concentration values and the distance of the mutated residues in the three-dimensional structures of rpoB and katG to their respective drugs binding sites. Conclusions Using the TDR resource, we demonstrate the usefulness of whole genome association and convergent evolution approaches to detect known and potentially novel mutations associated with drug resistance. Further, protein structural modelling could provide a means of predicting the impact of polymorphisms on drug efficacy in the absence of phenotypic data. These approaches could ultimately lead to novel resistance

  2. Evidence-based gene models for structural and functional annotations of the oil palm genome.

    Science.gov (United States)

    Chan, Kuang-Lim; Tatarinova, Tatiana V; Rosli, Rozana; Amiruddin, Nadzirah; Azizi, Norazah; Halim, Mohd Amin Ab; Sanusi, Nik Shazana Nik Mohd; Jayanthi, Nagappan; Ponomarenko, Petr; Triska, Martin; Solovyev, Victor; Firdaus-Raih, Mohd; Sambanthamurthi, Ravigadevi; Murphy, Denis; Low, Eng-Ti Leslie

    2017-09-08

    Oil palm is an important source of edible oil. The importance of the crop, as well as its long breeding cycle (10-12 years) has led to the sequencing of its genome in 2013 to pave the way for genomics-guided breeding. Nevertheless, the first set of gene predictions, although useful, had many fragmented genes. Classification and characterization of genes associated with traits of interest, such as those for fatty acid biosynthesis and disease resistance, were also limited. Lipid-, especially fatty acid (FA)-related genes are of particular interest for the oil palm as they specify oil yields and quality. This paper presents the characterization of the oil palm genome using different gene prediction methods and comparative genomics analysis, identification of FA biosynthesis and disease resistance genes, and the development of an annotation database and bioinformatics tools. Using two independent gene-prediction pipelines, Fgenesh++ and Seqping, 26,059 oil palm genes with transcriptome and RefSeq support were identified from the oil palm genome. These coding regions of the genome have a characteristic broad distribution of GC 3 (fraction of cytosine and guanine in the third position of a codon) with over half the GC 3 -rich genes (GC 3  ≥ 0.75286) being intronless. In comparison, only one-seventh of the oil palm genes identified are intronless. Using comparative genomics analysis, characterization of conserved domains and active sites, and expression analysis, 42 key genes involved in FA biosynthesis in oil palm were identified. For three of them, namely EgFABF, EgFABH and EgFAD3, segmental duplication events were detected. Our analysis also identified 210 candidate resistance genes in six classes, grouped by their protein domain structures. We present an accurate and comprehensive annotation of the oil palm genome, focusing on analysis of important categories of genes (GC 3 -rich and intronless), as well as those associated with important functions, such as FA

  3. Whole-genome sequence, SNP chips and pedigree structure: building demographic profiles in domestic dog breeds to optimize genetic-trait mapping

    Science.gov (United States)

    Dreger, Dayna L.; Rimbault, Maud; Davis, Brian W.; Bhatnagar, Adrienne; Parker, Heidi G.

    2016-01-01

    ABSTRACT In the decade following publication of the draft genome sequence of the domestic dog, extraordinary advances with application to several fields have been credited to the canine genetic system. Taking advantage of closed breeding populations and the subsequent selection for aesthetic and behavioral characteristics, researchers have leveraged the dog as an effective natural model for the study of complex traits, such as disease susceptibility, behavior and morphology, generating unique contributions to human health and biology. When designing genetic studies using purebred dogs, it is essential to consider the unique demography of each population, including estimation of effective population size and timing of population bottlenecks. The analytical design approach for genome-wide association studies (GWAS) and analysis of whole-genome sequence (WGS) experiments are inextricable from demographic data. We have performed a comprehensive study of genomic homozygosity, using high-depth WGS data for 90 individuals, and Illumina HD SNP data from 800 individuals representing 80 breeds. These data were coupled with extensive pedigree data analyses for 11 breeds that, together, allowed us to compute breed structure, demography, and molecular measures of genome diversity. Our comparative analyses characterize the extent, formation and implication of breed-specific diversity as it relates to population structure. These data demonstrate the relationship between breed-specific genome dynamics and population architecture, and provide important considerations influencing the technological and cohort design of association and other genomic studies. PMID:27874836

  4. gb4gv: a genome browser for geminivirus

    Directory of Open Access Journals (Sweden)

    Eric S. Ho

    2017-04-01

    Full Text Available Background Geminiviruses (family Geminiviridae are prevalent plant viruses that imperil agriculture globally, causing serious damage to the livelihood of farmers, particularly in developing countries. The virus evolves rapidly, attributing to its single-stranded genome propensity, resulting in worldwide circulation of diverse and viable genomes. Genomics is a prominent approach taken by researchers in elucidating the infectious mechanism of the virus. Currently, the NCBI Viral Genome website is a popular repository of viral genomes that conveniently provides researchers a centralized data source of genomic information. However, unlike the genome of living organisms, viral genomes most often maintain peculiar characteristics that fit into no single genome architecture. By imposing a unified annotation scheme on the myriad of viral genomes may downplay their hallmark features. For example, the viron of begomoviruses prevailing in America encapsulates two similar-sized circular DNA components and both are required for systemic infection of plants. However, the bipartite components are kept separately in NCBI as individual genomes with no explicit association in linking them. Thus, our goal is to build a comprehensive Geminivirus genomics database, namely gb4gv, that not only preserves genomic characteristics of the virus, but also supplements biologically relevant annotations that help to interrogate this virus, for example, the targeted host, putative iterons, siRNA targets, etc. Methods We have employed manual and automatic methods to curate 508 genomes from four major genera of Geminiviridae, and 161 associated satellites obtained from NCBI RefSeq and PubMed databases. Results These data are available for free access without registration from our website. Besides genomic content, our website provides visualization capability inherited from UCSC Genome Browser. Discussion With the genomic information readily accessible, we hope that our database

  5. Rapid screening and structural elucidation of a novel sibutramine analogue in a weight loss supplement: 11-desisobutyl-11-benzylsibutramine.

    Science.gov (United States)

    Mans, Daniel J; Gucinski, Ashley C; Dunn, Jamie D; Gryniewicz-Ruzicka, Connie M; Mecker-Pogue, Laura C; Kao, Jeff L-F; Ge, Xia

    2013-09-01

    A novel analogue of sibutramine, 11-desisobutyl-11-benzylsibutramine, has been discovered. During routine ion mobility spectrometry (IMS) screening of a weight loss supplement collected at an US FDA import operation facility an unknown peak was observed. Further analysis of the supplement by liquid chromatography-mass spectrometry (LC-MS) and high resolution mass spectrometry revealed an unknown peak with a relative retention time of 1.04 with respect to sibutramine and a predicted formula of C20H24NCl. In order to elucidate the analogue's structure, it was isolated from the supplement and characterized by tandem mass spectrometry and nuclear magnetic resonance (NMR), which revealed the analogue possessed a benzyl moiety at the 11 position in place of the isobutyl group associated with sibutramine. Copyright © 2013. Published by Elsevier B.V.

  6. A high-quality human reference panel reveals the complexity and distribution of genomic structural variants

    NARCIS (Netherlands)

    Hehir-Kwa, J.Y.; Marschall, T.; Kloosterman, W.P.; Francioli, L.C.; Baaijens, J.A.; Dijkstra, L.J.; Abdellaoui, A.; Koval, V.; Thung, D.T.; Wardenaar, R.; Renkens, I.; Coe, B.P.; Deelen, P.; de Ligt, J.; Lameijer, E.W.; Dijk, F.; Hormozdiari, F.; Uitterlinden, A.G.; van Duijn, C.M.; Eichler, E.E.; Bakker, P.I.W.; Swertz, M.A.; Wijmenga, C.; van Ommen, G.J.B; Slagboom, P.E.; Boomsma, D.I.; Schönhuth, A.; Ye, K.; Guryev, V.

    2016-01-01

    Structural variation (SV) represents a major source of differences between individual human genomes and has been linked to disease phenotypes. However, the majority of studies provide neither a global view of the full spectrum of these variants nor integrate them into reference panels of genetic

  7. Structural Elucidation of the DFG-Asp in and DFG-Asp out States of TAM Kinases and Insight into the Selectivity of Their Inhibitors

    Directory of Open Access Journals (Sweden)

    Abdellah Messoussi

    2014-10-01

    Full Text Available Structural elucidation of the active (DFG-Asp in and inactive (DFG-Asp out states of the TAM family of receptor tyrosine kinases is required for future development of TAM inhibitors as drugs. Herein we report a computational study on each of the three TAM members Tyro-3, Axl and Mer. DFG-Asp in and DFG-Asp out homology models of each one were built based on the X-ray structure of c-Met kinase, an enzyme with a closely related sequence. Structural validation and in silico screening enabled identification of critical amino acids for ligand binding within the active site of each DFG-Asp in and DFG-Asp out model. The position and nature of amino acids that differ among Tyro-3, Axl and Mer, and the potential role of these residues in the design of selective TAM ligands, are discussed.

  8. Elucidating the Small Regulatory RNA Repertoire of the Sea Anemone Anemonia viridis Based on Whole Genome and Small RNA Sequencing.

    Science.gov (United States)

    Urbarova, Ilona; Patel, Hardip; Forêt, Sylvain; Karlsen, Bård Ove; Jørgensen, Tor Erik; Hall-Spencer, Jason M; Johansen, Steinar D

    2018-02-01

    Cnidarians harbor a variety of small regulatory RNAs that include microRNAs (miRNAs) and PIWI-interacting RNAs (piRNAs), but detailed information is limited. Here, we report the identification and expression of novel miRNAs and putative piRNAs, as well as their genomic loci, in the symbiotic sea anemone Anemonia viridis. We generated a draft assembly of the A. viridis genome with putative size of 313 Mb that appeared to be composed of about 36% repeats, including known transposable elements. We detected approximately equal fractions of DNA transposons and retrotransposons. Deep sequencing of small RNA libraries constructed from A. viridis adults sampled at a natural CO2 gradient off Vulcano Island, Italy, identified 70 distinct miRNAs. Eight were homologous to previously reported miRNAs in cnidarians, whereas 62 appeared novel. Nine miRNAs were recognized as differentially expressed along the natural seawater pH gradient. We found a highly abundant and diverse population of piRNAs, with a substantial fraction showing ping-pong signatures. We identified nearly 22% putative piRNAs potentially targeting transposable elements within the A. viridis genome. The A. viridis genome appeared similar in size to that of other hexacorals with a very high divergence of transposable elements resembling that of the sea anemone genus Exaiptasia. The genome encodes and expresses a high number of small regulatory RNAs, which include novel miRNAs and piRNAs. Differentially expressed small RNAs along the seawater pH gradient indicated regulatory gene responses to environmental stressors. © The Author(s) 2018. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  9. How American Nurses Association Code of Ethics informs genetic/genomic nursing.

    Science.gov (United States)

    Tluczek, Audrey; Twal, Marie E; Beamer, Laura Curr; Burton, Candace W; Darmofal, Leslie; Kracun, Mary; Zanni, Karen L; Turner, Martha

    2018-01-01

    Members of the Ethics and Public Policy Committee of the International Society of Nurses in Genetics prepared this article to assist nurses in interpreting the American Nurses Association (2015) Code of Ethics for Nurses with Interpretive Statements (Code) within the context of genetics/genomics. The Code explicates the nursing profession's norms and responsibilities in managing ethical issues. The nearly ubiquitous application of genetic/genomic technologies in healthcare poses unique ethical challenges for nursing. Therefore, authors conducted literature searches that drew from various professional resources to elucidate implications of the code in genetic/genomic nursing practice, education, research, and public policy. We contend that the revised Code coupled with the application of genomic technologies to healthcare creates moral obligations for nurses to continually refresh their knowledge and capacities to translate genetic/genomic research into evidence-based practice, assure the ethical conduct of scientific inquiry, and continually develop or revise national/international guidelines that protect the rights of individuals and populations within the context of genetics/genomics. Thus, nurses have an ethical responsibility to remain knowledgeable about advances in genetics/genomics and incorporate emergent evidence into their work.

  10. High-performance liquid chromatography on-line coupled to high-field NMR and mass spectrometry for structure elucidation of constituents of Hypericum perforatum L

    DEFF Research Database (Denmark)

    Hansen, S. H.; Jensen, A. G.; Cornett, Claus

    1999-01-01

    The on-line separation and structure elucidation of naphthodianthrones, flavonoids, and other constituents of an extract from Hypericum perforatum L, using high performance liquid chromatography (HPLC) coupled on-line with ultraviolet-visible, nuclear magnetic resonance (NMR), and mass spectrometry...... (MS) is described. A conventional reversed-phase HPLC system using ammonium acetate as the buffer substance in the eluent tvas used, and proton NMR spectra were obtained on a 500 MHz NMR instrument. The MS and MS/MS analyses were performed using negative electrospray ionization, In the present study...

  11. Integration of Structural Dynamics and Molecular Evolution via Protein Interaction Networks: A New Era in Genomic Medicine

    Science.gov (United States)

    Kumar, Avishek; Butler, Brandon M.; Kumar, Sudhir; Ozkan, S. Banu

    2016-01-01

    Summary Sequencing technologies are revealing many new non-synonymous single nucleotide variants (nsSNVs) in each personal exome. To assess their functional impacts, comparative genomics is frequently employed to predict if they are benign or not. However, evolutionary analysis alone is insufficient, because it misdiagnoses many disease-associated nsSNVs, such as those at positions involved in protein interfaces, and because evolutionary predictions do not provide mechanistic insights into functional change or loss. Structural analyses can aid in overcoming both of these problems by incorporating conformational dynamics and allostery in nSNV diagnosis. Finally, protein-protein interaction networks using systems-level methodologies shed light onto disease etiology and pathogenesis. Bridging these network approaches with structurally resolved protein interactions and dynamics will advance genomic medicine. PMID:26684487

  12. The three-dimensional genome organization of Drosophila melanogaster through data integration.

    Science.gov (United States)

    Li, Qingjiao; Tjong, Harianto; Li, Xiao; Gong, Ke; Zhou, Xianghong Jasmine; Chiolo, Irene; Alber, Frank

    2017-07-31

    Genome structures are dynamic and non-randomly organized in the nucleus of higher eukaryotes. To maximize the accuracy and coverage of three-dimensional genome structural models, it is important to integrate all available sources of experimental information about a genome's organization. It remains a major challenge to integrate such data from various complementary experimental methods. Here, we present an approach for data integration to determine a population of complete three-dimensional genome structures that are statistically consistent with data from both genome-wide chromosome conformation capture (Hi-C) and lamina-DamID experiments. Our structures resolve the genome at the resolution of topological domains, and reproduce simultaneously both sets of experimental data. Importantly, this data deconvolution framework allows for structural heterogeneity between cells, and hence accounts for the expected plasticity of genome structures. As a case study we choose Drosophila melanogaster embryonic cells, for which both data types are available. Our three-dimensional genome structures have strong predictive power for structural features not directly visible in the initial data sets, and reproduce experimental hallmarks of the D. melanogaster genome organization from independent and our own imaging experiments. Also they reveal a number of new insights about genome organization and its functional relevance, including the preferred locations of heterochromatic satellites of different chromosomes, and observations about homologous pairing that cannot be directly observed in the original Hi-C or lamina-DamID data. Our approach allows systematic integration of Hi-C and lamina-DamID data for complete three-dimensional genome structure calculation, while also explicitly considering genome structural variability.

  13. Universal internucleotide statistics in full genomes: a footprint of the DNA structure and packaging?

    Directory of Open Access Journals (Sweden)

    Mikhail I Bogachev

    Full Text Available Uncovering the fundamental laws that govern the complex DNA structural organization remains challenging and is largely based upon reconstructions from the primary nucleotide sequences. Here we investigate the distributions of the internucleotide intervals and their persistence properties in complete genomes of various organisms from Archaea and Bacteria to H. Sapiens aiming to reveal the manifestation of the universal DNA architecture. We find that in all considered organisms the internucleotide interval distributions exhibit the same [Formula: see text]-exponential form. While in prokaryotes a single [Formula: see text]-exponential function makes the best fit, in eukaryotes the PDF contains additionally a second [Formula: see text]-exponential, which in the human genome makes a perfect approximation over nearly 10 decades. We suggest that this functional form is a footprint of the heterogeneous DNA structure, where the first [Formula: see text]-exponential reflects the universal helical pitch that appears both in pro- and eukaryotic DNA, while the second [Formula: see text]-exponential is a specific marker of the large-scale eukaryotic DNA organization.

  14. Global organization of a positive-strand RNA virus genome.

    Directory of Open Access Journals (Sweden)

    Baodong Wu

    Full Text Available The genomes of plus-strand RNA viruses contain many regulatory sequences and structures that direct different viral processes. The traditional view of these RNA elements are as local structures present in non-coding regions. However, this view is changing due to the discovery of regulatory elements in coding regions and functional long-range intra-genomic base pairing interactions. The ∼4.8 kb long RNA genome of the tombusvirus tomato bushy stunt virus (TBSV contains these types of structural features, including six different functional long-distance interactions. We hypothesized that to achieve these multiple interactions this viral genome must utilize a large-scale organizational strategy and, accordingly, we sought to assess the global conformation of the entire TBSV genome. Atomic force micrographs of the genome indicated a mostly condensed structure composed of interconnected protrusions extending from a central hub. This configuration was consistent with the genomic secondary structure model generated using high-throughput selective 2'-hydroxyl acylation analysed by primer extension (i.e. SHAPE, which predicted different sized RNA domains originating from a central region. Known RNA elements were identified in both domain and inter-domain regions, and novel structural features were predicted and functionally confirmed. Interestingly, only two of the six long-range interactions known to form were present in the structural model. However, for those interactions that did not form, complementary partner sequences were positioned relatively close to each other in the structure, suggesting that the secondary structure level of viral genome structure could provide a basic scaffold for the formation of different long-range interactions. The higher-order structural model for the TBSV RNA genome provides a snapshot of the complex framework that allows multiple functional components to operate in concert within a confined context.

  15. A quantitative account of genomic island acquisitions in prokaryotes

    Directory of Open Access Journals (Sweden)

    Roos Tom E

    2011-08-01

    Full Text Available Abstract Background Microbial genomes do not merely evolve through the slow accumulation of mutations, but also, and often more dramatically, by taking up new DNA in a process called horizontal gene transfer. These innovation leaps in the acquisition of new traits can take place via the introgression of single genes, but also through the acquisition of large gene clusters, which are termed Genomic Islands. Since only a small proportion of all the DNA diversity has been sequenced, it can be hard to find the appropriate donors for acquired genes via sequence alignments from databases. In contrast, relative oligonucleotide frequencies represent a remarkably stable genomic signature in prokaryotes, which facilitates compositional comparisons as an alignment-free alternative for phylogenetic relatedness. In this project, we test whether Genomic Islands identified in individual bacterial genomes have a similar genomic signature, in terms of relative dinucleotide frequencies, and can therefore be expected to originate from a common donor species. Results When multiple Genomic Islands are present within a single genome, we find that up to 28% of these are compositionally very similar to each other, indicative of frequent recurring acquisitions from the same donor to the same acceptor. Conclusions This represents the first quantitative assessment of common directional transfer events in prokaryotic evolutionary history. We suggest that many of the resident Genomic Islands per prokaryotic genome originated from the same source, which may have implications with respect to their regulatory interactions, and for the elucidation of the common origins of these acquired gene clusters.

  16. RosettaTMH: a method for membrane protein structure elucidation combining EPR distance restraints with assembly of transmembrane helices

    Directory of Open Access Journals (Sweden)

    Andrew Leaver-Fay

    2015-12-01

    Full Text Available Membrane proteins make up approximately one third of all proteins, and they play key roles in a plethora of physiological processes. However, membrane proteins make up less than 2% of experimentally determined structures, despite significant advances in structure determination methods, such as X-ray crystallography, nuclear magnetic resonance spectroscopy, and cryo-electron microscopy. One potential alternative means of structure elucidation is to combine computational methods with experimental EPR data. In 2011, Hirst and others introduced RosettaEPR and demonstrated that this approach could be successfully applied to fold soluble proteins. Furthermore, few computational methods for de novo folding of integral membrane proteins have been presented. In this work, we present RosettaTMH, a novel algorithm for structure prediction of helical membrane proteins. A benchmark set of 34 proteins, in which the proteins ranged in size from 91 to 565 residues, was used to compare RosettaTMH to Rosetta’s two existing membrane protein folding protocols: the published RosettaMembrane folding protocol (“MembraneAbinitio” and folding from an extended chain (“ExtendedChain”. When EPR distance restraints are used, RosettaTMH+EPR outperforms ExtendedChain+EPR for 11 proteins, including the largest six proteins tested. RosettaTMH+EPR is capable of achieving native-like folds for 30 of 34 proteins tested, including receptors and transporters. For example, the average RMSD100SSE relative to the crystal structure for rhodopsin was 6.1 ± 0.4 Å and 6.5 ± 0.6 Å for the 449-residue nitric oxide reductase subunit B, where the standard deviation reflects variance in RMSD100SSE values across ten different EPR distance restraint sets. The addition of RosettaTMH and RosettaTMH+EPR to the Rosetta family of de novo folding methods broadens the scope of helical membrane proteins that can be accurately modeled with this software suite.

  17. Genome Dynamics of Escherichia coli during Antibiotic Treatment: Transfer, Loss, and Persistence of Genetic Elements In situ of the Infant Gut

    DEFF Research Database (Denmark)

    Porse, Andreas; Gumpert, Heidi; Kubicek-Sutherland, Jessica Z.

    2017-01-01

    made to elucidate the genome dynamics of E. coli in its native settings. Here, we follow the genome dynamics of co-existing E. coli lineages in situ of the infant gut during the first year of life. One E. coli lineage causes a urinary tract infection (UTI) and experiences several alterations of its...

  18. Organizational heterogeneity of vertebrate genomes.

    Science.gov (United States)

    Frenkel, Svetlana; Kirzhner, Valery; Korol, Abraham

    2012-01-01

    Genomes of higher eukaryotes are mosaics of segments with various structural, functional, and evolutionary properties. The availability of whole-genome sequences allows the investigation of their structure as "texts" using different statistical and computational methods. One such method, referred to as Compositional Spectra (CS) analysis, is based on scoring the occurrences of fixed-length oligonucleotides (k-mers) in the target DNA sequence. CS analysis allows generating species- or region-specific characteristics of the genome, regardless of their length and the presence of coding DNA. In this study, we consider the heterogeneity of vertebrate genomes as a joint effect of regional variation in sequence organization superimposed on the differences in nucleotide composition. We estimated compositional and organizational heterogeneity of genome and chromosome sequences separately and found that both heterogeneity types vary widely among genomes as well as among chromosomes in all investigated taxonomic groups. The high correspondence of heterogeneity scores obtained on three genome fractions, coding, repetitive, and the remaining part of the noncoding DNA (the genome dark matter--GDM) allows the assumption that CS-heterogeneity may have functional relevance to genome regulation. Of special interest for such interpretation is the fact that natural GDM sequences display the highest deviation from the corresponding reshuffled sequences.

  19. Organizational heterogeneity of vertebrate genomes.

    Directory of Open Access Journals (Sweden)

    Svetlana Frenkel

    Full Text Available Genomes of higher eukaryotes are mosaics of segments with various structural, functional, and evolutionary properties. The availability of whole-genome sequences allows the investigation of their structure as "texts" using different statistical and computational methods. One such method, referred to as Compositional Spectra (CS analysis, is based on scoring the occurrences of fixed-length oligonucleotides (k-mers in the target DNA sequence. CS analysis allows generating species- or region-specific characteristics of the genome, regardless of their length and the presence of coding DNA. In this study, we consider the heterogeneity of vertebrate genomes as a joint effect of regional variation in sequence organization superimposed on the differences in nucleotide composition. We estimated compositional and organizational heterogeneity of genome and chromosome sequences separately and found that both heterogeneity types vary widely among genomes as well as among chromosomes in all investigated taxonomic groups. The high correspondence of heterogeneity scores obtained on three genome fractions, coding, repetitive, and the remaining part of the noncoding DNA (the genome dark matter--GDM allows the assumption that CS-heterogeneity may have functional relevance to genome regulation. Of special interest for such interpretation is the fact that natural GDM sequences display the highest deviation from the corresponding reshuffled sequences.

  20. Comparative Pan-Genome Analysis of Piscirickettsia salmonis Reveals Genomic Divergences within Genogroups

    Directory of Open Access Journals (Sweden)

    Guillermo Nourdin-Galindo

    2017-10-01

    Full Text Available Piscirickettsia salmonis is the etiological agent of salmonid rickettsial septicemia, a disease that seriously affects the salmonid industry. Despite efforts to genomically characterize P. salmonis, functional information on the life cycle, pathogenesis mechanisms, diagnosis, treatment, and control of this fish pathogen remain lacking. To address this knowledge gap, the present study conducted an in silico pan-genome analysis of 19 P. salmonis strains from distinct geographic locations and genogroups. Results revealed an expected open pan-genome of 3,463 genes and a core-genome of 1,732 genes. Two marked genogroups were identified, as confirmed by phylogenetic and phylogenomic relationships to the LF-89 and EM-90 reference strains, as well as by assessments of genomic structures. Different structural configurations were found for the six identified copies of the ribosomal operon in the P. salmonis genome, indicating translocation throughout the genetic material. Chromosomal divergences in genomic localization and quantity of genetic cassettes were also found for the Dot/Icm type IVB secretion system. To determine divergences between core-genomes, additional pan-genome descriptions were compiled for the so-termed LF and EM genogroups. Open pan-genomes composed of 2,924 and 2,778 genes and core-genomes composed of 2,170 and 2,228 genes were respectively found for the LF and EM genogroups. The core-genomes were functionally annotated using the Gene Ontology, KEGG, and Virulence Factor databases, revealing the presence of several shared groups of genes related to basic function of intracellular survival and bacterial pathogenesis. Additionally, the specific pan-genomes for the LF and EM genogroups were defined, resulting in the identification of 148 and 273 exclusive proteins, respectively. Notably, specific virulence factors linked to adherence, colonization, invasion factors, and endotoxins were established. The obtained data suggest that these

  1. Genome-wide identification, characterization, and expression profile of aquaporin gene family in flax (Linum usitatissimum).

    Science.gov (United States)

    Shivaraj, S M; Deshmukh, Rupesh K; Rai, Rhitu; Bélanger, Richard; Agrawal, Pawan K; Dash, Prasanta K

    2017-04-27

    Membrane intrinsic proteins (MIPs) form transmembrane channels and facilitate transport of myriad substrates across the cell membrane in many organisms. Majority of plant MIPs have water transporting ability and are commonly referred as aquaporins (AQPs). In the present study, we identified aquaporin coding genes in flax by genome-wide analysis, their structure, function and expression pattern by pan-genome exploration. Cross-genera phylogenetic analysis with known aquaporins from rice, arabidopsis, and poplar showed five subgroups of flax aquaporins representing 16 plasma membrane intrinsic proteins (PIPs), 17 tonoplast intrinsic proteins (TIPs), 13 NOD26-like intrinsic proteins (NIPs), 2 small basic intrinsic proteins (SIPs), and 3 uncharacterized intrinsic proteins (XIPs). Amongst aquaporins, PIPs contained hydrophilic aromatic arginine (ar/R) selective filter but TIP, NIP, SIP and XIP subfamilies mostly contained hydrophobic ar/R selective filter. Analysis of RNA-seq and microarray data revealed high expression of PIPs in multiple tissues, low expression of NIPs, and seed specific expression of TIP3 in flax. Exploration of aquaporin homologs in three closely related Linum species bienne, grandiflorum and leonii revealed presence of 49, 39 and 19 AQPs, respectively. The genome-wide identification of aquaporins, first in flax, provides insight to elucidate their physiological and developmental roles in flax.

  2. Software for computing and annotating genomic ranges.

    Science.gov (United States)

    Lawrence, Michael; Huber, Wolfgang; Pagès, Hervé; Aboyoun, Patrick; Carlson, Marc; Gentleman, Robert; Morgan, Martin T; Carey, Vincent J

    2013-01-01

    We describe Bioconductor infrastructure for representing and computing on annotated genomic ranges and integrating genomic data with the statistical computing features of R and its extensions. At the core of the infrastructure are three packages: IRanges, GenomicRanges, and GenomicFeatures. These packages provide scalable data structures for representing annotated ranges on the genome, with special support for transcript structures, read alignments and coverage vectors. Computational facilities include efficient algorithms for overlap and nearest neighbor detection, coverage calculation and other range operations. This infrastructure directly supports more than 80 other Bioconductor packages, including those for sequence analysis, differential expression analysis and visualization.

  3. Defining the diverse spectrum of inversions, complex structural variation, and chromothripsis in the morbid human genome

    NARCIS (Netherlands)

    Collins, Ryan L; Brand, Harrison; Redin, Claire E.; Hanscom, Carrie; Antolik, Caroline; Stone, Matthew R; Glessner, Joseph T.; Mason, Tamara; Pregno, Giulia; Dorrani, Naghmeh; Mandrile, Giorgia; Giachino, Daniela; Perrin, Danielle; Walsh, Cole; Cipicchio, Michelle; Costello, Maura; Stortchevoi, Alexei; An, Joon Yong; Currall, Benjamin B; Seabra, Catarina M; Ragavendran, Ashok; Margolin, Lauren; Martinez-Agosto, Julian A.; Lucente, Diane; Levy, Brynn; Sanders, Jan-Stephan; Wapner, Ronald J.; Quintero-Rivera, Fabiola; Kloosterman, Wigard; Talkowski, Michael E.

    2017-01-01

    Background: Structural variation (SV) influences genome organization and contributes to human disease. However, the complete mutational spectrum of SV has not been routinely captured in disease association studies. Results: We sequenced 689 participants with autism spectrum disorder (ASD) and other

  4. The role of parasite-driven selection in shaping landscape genomic structure in red grouse (Lagopus lagopus scotica).

    Science.gov (United States)

    Wenzel, Marius A; Douglas, Alex; James, Marianne C; Redpath, Steve M; Piertney, Stuart B

    2016-01-01

    Landscape genomics promises to provide novel insights into how neutral and adaptive processes shape genome-wide variation within and among populations. However, there has been little emphasis on examining whether individual-based phenotype-genotype relationships derived from approaches such as genome-wide association (GWAS) manifest themselves as a population-level signature of selection in a landscape context. The two may prove irreconcilable as individual-level patterns become diluted by high levels of gene flow and complex phenotypic or environmental heterogeneity. We illustrate this issue with a case study that examines the role of the highly prevalent gastrointestinal nematode Trichostrongylus tenuis in shaping genomic signatures of selection in red grouse (Lagopus lagopus scotica). Individual-level GWAS involving 384 SNPs has previously identified five SNPs that explain variation in T. tenuis burden. Here, we examine whether these same SNPs display population-level relationships between T. tenuis burden and genetic structure across a small-scale landscape of 21 sites with heterogeneous parasite pressure. Moreover, we identify adaptive SNPs showing signatures of directional selection using F(ST) outlier analysis and relate population- and individual-level patterns of multilocus neutral and adaptive genetic structure to T. tenuis burden. The five candidate SNPs for parasite-driven selection were neither associated with T. tenuis burden on a population level, nor under directional selection. Similarly, there was no evidence of parasite-driven selection in SNPs identified as candidates for directional selection. We discuss these results in the context of red grouse ecology and highlight the broader consequences for the utility of landscape genomics approaches for identifying signatures of selection. © 2015 John Wiley & Sons Ltd.

  5. Structure elucidation and toxicity analyses of the radiolytic products of aflatoxin B{sub 1} in methanol-water solution

    Energy Technology Data Exchange (ETDEWEB)

    Wang, Feng [Institute of Agro-food Science and Technology of Chinese Academy of Agricultural Sciences, 2nd Yuanmingyuan West Road, Hai Dian District, Beijing 100193 (China); Key Opening Laboratory of Agricultural Products Processing and Quality Control, Ministry of Agriculture, 2nd Yuanmingyuan West Road, Hai Dian District, Beijing 100193 (China); Graduate School of Chinese Academy of Agricultural Sciences, 12th Zhongguancun South Road, Hai Dian District, Beijing 100081 (China); Xie, Fang [Institute of Agro-food Science and Technology of Chinese Academy of Agricultural Sciences, 2nd Yuanmingyuan West Road, Hai Dian District, Beijing 100193 (China); Key Opening Laboratory of Agricultural Products Processing and Quality Control, Ministry of Agriculture, 2nd Yuanmingyuan West Road, Hai Dian District, Beijing 100193 (China); Xue, Xiaofeng [Bee Research Institute of Chinese Academy of Agricultural Sciences, 1st Xiangshan North Ditch, Hai Dian District, Beijing 100093 (China); Wang, Zhidong; Fan, Bei [Institute of Agro-food Science and Technology of Chinese Academy of Agricultural Sciences, 2nd Yuanmingyuan West Road, Hai Dian District, Beijing 100193 (China); Key Opening Laboratory of Agricultural Products Processing and Quality Control, Ministry of Agriculture, 2nd Yuanmingyuan West Road, Hai Dian District, Beijing 100193 (China); Ha, Yiming, E-mail: wxfay2011@hotmail.com [Institute of Agro-food Science and Technology of Chinese Academy of Agricultural Sciences, 2nd Yuanmingyuan West Road, Hai Dian District, Beijing 100193 (China); Key Opening Laboratory of Agricultural Products Processing and Quality Control, Ministry of Agriculture, 2nd Yuanmingyuan West Road, Hai Dian District, Beijing 100193 (China)

    2011-09-15

    Highlights: {yields} Radiolytic products of aflatoxin B{sub 1} were produced under gamma irradiation. {yields} Seven key radiolytic products were structure-elucidated. {yields} Free-radical species in radiolytic solution resulted in the formation of products. {yields} Based on the structure-activity relationship analysis, the toxicity of radiolytic products was significantly reduced compared with that of AFB{sub 1}. {yields} The addition reaction on furan ring double bond was the reason for the reduced toxicity. - Abstract: The identification of the radiolytic products of mycotoxins is a key issue in the feasibility study of gamma ray radiation detoxification. Methanol-water solution (60:40, v/v) spiked with aflatoxin B{sub 1} (AFB{sub 1}; 20 mg L{sup -1}) was irradiated with Co{sup 60} gamma ray to generate radiolytic products. Liquid chromatography-quadruple time-of-flight mass spectrometry was applied to identify the radiolytic products of AFB{sub 1}. Accurate mass and proposed molecular formulas with a high-matching property of more than 20 radiolytic products were obtained. Seven key radiolytic products were proposed based on the molecular formulas and tandem mass spectrometry spectra. The analyses of toxicity and formation pathways were proposed based on the structure of the radiolytic products. The addition reaction caused by the free-radical species in the methanol-water solution resulted in the formation of most radiolytic products. Based on the structure-activity relationship analysis, the toxicity of radiolytic products was significantly reduced compared with that of AFB{sub 1} because of the addition reaction that occurred on the double bond in the terminal furan ring. For this reason, gamma irradiation is deemed an effective tool for the detoxification of AFB{sub 1}.

  6. Comparative genome analysis of non-toxigenic non-O1 versus toxigenic O1 Vibrio cholerae

    OpenAIRE

    Mukherjee, Munmun; Kakarla, Prathusha; Kumar, Sanath; Gonzalez, Esmeralda; Floyd, Jared T.; Inupakutika, Madhuri; Devireddy, Amith Reddy; Tirrell, Selena R.; Bruns, Merissa; He, Guixin; Lindquist, Ingrid E.; Sundararajan, Anitha; Schilkey, Faye D.; Mudge, Joann; Varela, Manuel F.

    2014-01-01

    Pathogenic strains of Vibrio cholerae are responsible for endemic and pandemic outbreaks of the disease cholera. The complete toxigenic mechanisms underlying virulence in Vibrio strains are poorly understood. The hypothesis of this work was that virulent versus non-virulent strains of V. cholerae harbor distinctive genomic elements that encode virulence. The purpose of this study was to elucidate genomic differences between the O1 serotypes and non-O1 V. cholerae PS15, a non-toxigenic strain,...

  7. Genome-wide identification and structure-function studies of proteases and protease inhibitors in Cicer arietinum (chickpea).

    Science.gov (United States)

    Sharma, Ranu; Suresh, C G

    2015-01-01

    Proteases are a family of enzymes present in almost all living organisms. In plants they are involved in many biological processes requiring stress response in situations such as water deficiency, pathogen attack, maintaining protein content of the cell, programmed cell death, senescence, reproduction and many more. Similarly, protease inhibitors (PIs) are involved in various important functions like suppression of invasion by pathogenic nematodes, inhibition of spores-germination and mycelium growth of Alternaria alternata and response to wounding and fungal attack. As much as we know, no genome-wide study of proteases together with proteinaceous PIs is reported in any of the sequenced genomes till now. Phylogenetic studies and domain analysis of proteases were carried out to understand the molecular evolution as well as gene and protein features. Structural analysis was carried out to explore the binding mode and affinity of PIs for cognate proteases and prolyl oligopeptidase protease with inhibitor ligand. In the study reported here, a significant number of proteases and PIs were identified in chickpea genome. The gene expression profiles of proteases and PIs in five different plant tissues revealed a differential expression pattern in more than one plant tissue. Molecular dynamics studies revealed the formation of stable complex owing to increased number of protein-ligand and inter and intramolecular protein-protein hydrogen bonds. The genome-wide identification, characterization, evolutionary understanding, gene expression, and structural analysis of proteases and PIs provide a framework for future analysis when defining their roles in stress response and developing a more stress tolerant variety of chickpea. Copyright © 2014 Elsevier Ltd. All rights reserved.

  8. DMS-MaPseq for genome-wide or targeted RNA structure probing in vivo.

    Science.gov (United States)

    Zubradt, Meghan; Gupta, Paromita; Persad, Sitara; Lambowitz, Alan M; Weissman, Jonathan S; Rouskin, Silvi

    2017-01-01

    Coupling of structure-specific in vivo chemical modification to next-generation sequencing is transforming RNA secondary structure studies in living cells. The dominant strategy for detecting in vivo chemical modifications uses reverse transcriptase truncation products, which introduce biases and necessitate population-average assessments of RNA structure. Here we present dimethyl sulfate (DMS) mutational profiling with sequencing (DMS-MaPseq), which encodes DMS modifications as mismatches using a thermostable group II intron reverse transcriptase. DMS-MaPseq yields a high signal-to-noise ratio, can report multiple structural features per molecule, and allows both genome-wide studies and focused in vivo investigations of even low-abundance RNAs. We apply DMS-MaPseq for the first analysis of RNA structure within an animal tissue and to identify a functional structure involved in noncanonical translation initiation. Additionally, we use DMS-MaPseq to compare the in vivo structure of pre-mRNAs with their mature isoforms. These applications illustrate DMS-MaPseq's capacity to dramatically expand in vivo analysis of RNA structure.

  9. The complete mitochondrial genome and its remarkable secondary structure for a stonefly Acroneuria hainana Wu (Insecta: Plecoptera, Perlidae).

    Science.gov (United States)

    Huang, Mingchao; Wang, Yuyu; Liu, Xingyue; Li, Weihai; Kang, Zehui; Wang, Kai; Li, Xuankun; Yang, Ding

    2015-02-15

    The Plecoptera (stoneflies) is a hemimetabolous order of insects, whose larvae are usually used as indicators for fresh water biomonitoring. Herein, we describe the complete mitochondrial (mt) genome of a stonefly species, namely Acroneuria hainana Wu belonging to the family Perlidae. This mt genome contains 13 PCGs, 22 tRNA-coding genes and 2 rRNA-coding genes that are conserved in most insect mt genomes, and it also has the identical gene order with the insect ancestral gene order. However, there are three special initiation codons of ND1, ND5 and COI in PCGs: TTG, GTG and CGA, coding for L, V and R, respectively. Additionally, the 899-bp control region, with 73.30% A+T content, has two long repeated sequences which are found at the 3'-end closing to the tRNA(Ile) gene. Both of them can be folded into a stem-loop structure, whose adjacent upstream and downstream sequences can be also folded into stem-loop structures. It is presumed that the four special structures in series could be associated with the D-loop replication. It might be able to adjust the replication speed of two replicate directions. Copyright © 2014 Elsevier B.V. All rights reserved.

  10. Cas9 versus Cas12a/Cpf1: Structure-function comparisons and implications for genome editing.

    Science.gov (United States)

    Swarts, Daan C; Jinek, Martin

    2018-05-22

    Cas9 and Cas12a are multidomain CRISPR-associated nucleases that can be programmed with a guide RNA to bind and cleave complementary DNA targets. The guide RNA sequence can be varied, making these effector enzymes versatile tools for genome editing and gene regulation applications. While Cas9 is currently the best-characterized and most widely used nuclease for such purposes, Cas12a (previously named Cpf1) has recently emerged as an alternative for Cas9. Cas9 and Cas12a have distinct evolutionary origins and exhibit different structural architectures, resulting in distinct molecular mechanisms. Here we compare the structural and mechanistic features that distinguish Cas9 and Cas12a, and describe how these features modulate their activity. We discuss implications for genome editing, and how they may influence the choice of Cas9 or Cas12a for specific applications. Finally, we review recent studies in which Cas12a has been utilized as a genome editing tool. This article is categorized under: RNA Interactions with Proteins and Other Molecules > Protein-RNA Interactions: Functional Implications Regulatory RNAs/RNAi/Riboswitches > Biogenesis of Effector Small RNAs RNA Interactions with Proteins and Other Molecules > RNA-Protein Complexes. © 2018 Wiley Periodicals, Inc.

  11. Functional annotation by sequence-weighted structure alignments: statistical analysis and case studies from the Protein 3000 structural genomics project in Japan.

    Science.gov (United States)

    Standley, Daron M; Toh, Hiroyuki; Nakamura, Haruki

    2008-09-01

    A method to functionally annotate structural genomics targets, based on a novel structural alignment scoring function, is proposed. In the proposed score, position-specific scoring matrices are used to weight structurally aligned residue pairs to highlight evolutionarily conserved motifs. The functional form of the score is first optimized for discriminating domains belonging to the same Pfam family from domains belonging to different families but the same CATH or SCOP superfamily. In the optimization stage, we consider four standard weighting functions as well as our own, the "maximum substitution probability," and combinations of these functions. The optimized score achieves an area of 0.87 under the receiver-operating characteristic curve with respect to identifying Pfam families within a sequence-unique benchmark set of domain pairs. Confidence measures are then derived from the benchmark distribution of true-positive scores. The alignment method is next applied to the task of functionally annotating 230 query proteins released to the public as part of the Protein 3000 structural genomics project in Japan. Of these queries, 78 were found to align to templates with the same Pfam family as the query or had sequence identities > or = 30%. Another 49 queries were found to match more distantly related templates. Within this group, the template predicted by our method to be the closest functional relative was often not the most structurally similar. Several nontrivial cases are discussed in detail. Finally, 103 queries matched templates at the fold level, but not the family or superfamily level, and remain functionally uncharacterized. 2008 Wiley-Liss, Inc.

  12. Integration of structural dynamics and molecular evolution via protein interaction networks: a new era in genomic medicine.

    Science.gov (United States)

    Kumar, Avishek; Butler, Brandon M; Kumar, Sudhir; Ozkan, S Banu

    2015-12-01

    Sequencing technologies are revealing many new non-synonymous single nucleotide variants (nsSNVs) in each personal exome. To assess their functional impacts, comparative genomics is frequently employed to predict if they are benign or not. However, evolutionary analysis alone is insufficient, because it misdiagnoses many disease-associated nsSNVs, such as those at positions involved in protein interfaces, and because evolutionary predictions do not provide mechanistic insights into functional change or loss. Structural analyses can aid in overcoming both of these problems by incorporating conformational dynamics and allostery in nSNV diagnosis. Finally, protein-protein interaction networks using systems-level methodologies shed light onto disease etiology and pathogenesis. Bridging these network approaches with structurally resolved protein interactions and dynamics will advance genomic medicine. Copyright © 2015 Elsevier Ltd. All rights reserved.

  13. Identification of balanced chromosomal rearrangements previously unknown among participants in the 1000 Genomes Project: implications for interpretation of structural variation in genomes and the future of clinical cytogenetics.

    Science.gov (United States)

    Dong, Zirui; Wang, Huilin; Chen, Haixiao; Jiang, Hui; Yuan, Jianying; Yang, Zhenjun; Wang, Wen-Jing; Xu, Fengping; Guo, Xiaosen; Cao, Ye; Zhu, Zhenzhen; Geng, Chunyu; Cheung, Wan Chee; Kwok, Yvonne K; Yang, Huanming; Leung, Tak Yeung; Morton, Cynthia C; Cheung, Sau Wai; Choy, Kwong Wai

    2017-11-02

    PurposeRecent studies demonstrate that whole-genome sequencing enables detection of cryptic rearrangements in apparently balanced chromosomal rearrangements (also known as balanced chromosomal abnormalities, BCAs) previously identified by conventional cytogenetic methods. We aimed to assess our analytical tool for detecting BCAs in the 1000 Genomes Project without knowing which bands were affected.MethodsThe 1000 Genomes Project provides an unprecedented integrated map of structural variants in phenotypically normal subjects, but there is no information on potential inclusion of subjects with apparent BCAs akin to those traditionally detected in diagnostic cytogenetics laboratories. We applied our analytical tool to 1,166 genomes from the 1000 Genomes Project with sufficient physical coverage (8.25-fold).ResultsWith this approach, we detected four reciprocal balanced translocations and four inversions, ranging in size from 57.9 kb to 13.3 Mb, all of which were confirmed by cytogenetic methods and polymerase chain reaction studies. One of these DNAs has a subtle translocation that is not readily identified by chromosome analysis because of the similarity of the banding patterns and size of exchanged segments, and another results in disruption of all transcripts of an OMIM gene.ConclusionOur study demonstrates the extension of utilizing low-pass whole-genome sequencing for unbiased detection of BCAs including translocations and inversions previously unknown in the 1000 Genomes Project.GENETICS in MEDICINE advance online publication, 2 November 2017; doi:10.1038/gim.2017.170.

  14. Elucidating the vacuum structure of the Aoki phase

    International Nuclear Information System (INIS)

    Azcoiti, Vicente; Di Carlo, Giuseppe; Follana, Eduardo; Vaquero, Alejandro

    2013-01-01

    In this paper, we discuss the vacuum structure of QCD with two flavors of Wilson fermions, inside the Aoki phase. We provide numerical evidence, coming from HMC simulations in 4 4 , 6 4 and 8 4 lattices, supporting a vacuum structure for this model at strong coupling more complex than the one assumed in the standard wisdom, with new vacua where the expectation value of iψ ¯ γ 5 ψ can take non-zero values, and which can not be connected with the Aoki vacua by parity–flavor symmetry transformations

  15. Mitochondrial Genome Sequences and Structures Aid in the Resolution of Piroplasmida phylogeny

    Science.gov (United States)

    Marr, Henry S.; Tarigo, Jaime L.; Cohn, Leah A.; Bird, David M.; Scholl, Elizabeth H.; Levy, Michael G.; Wiegmann, Brian M.; Birkenheuer, Adam J.

    2016-01-01

    The taxonomy of the order Piroplasmida, which includes a number of clinically and economically relevant organisms, is a hotly debated topic amongst parasitologists. Three genera (Babesia, Theileria, and Cytauxzoon) are recognized based on parasite life cycle characteristics, but molecular phylogenetic analyses of 18S sequences have suggested the presence of five or more distinct Piroplasmida lineages. Despite these important advancements, a few studies have been unable to define the taxonomic relationships of some organisms (e.g. C. felis and T. equi) with respect to other Piroplasmida. Additional evidence from mitochondrial genome sequences and synteny should aid in the inference of Piroplasmida phylogeny and resolution of taxonomic uncertainties. In this study, we have amplified, sequenced, and annotated seven previously uncharacterized mitochondrial genomes (Babesia canis, Babesia vogeli, Babesia rossi, Babesia sp. Coco, Babesia conradae, Babesia microti-like sp., and Cytauxzoon felis) and identified additional ribosomal fragments in ten previously characterized mitochondrial genomes. Phylogenetic analysis of concatenated mitochondrial and 18S sequences as well as cox1 amino acid sequence identified five distinct Piroplasmida groups, each of which possesses a unique mitochondrial genome structure. Specifically, our results confirm the existence of four previously identified clades (B. microti group, Babesia sensu stricto, Theileria equi, and a Babesia sensu latu group that includes B. conradae) while supporting the integration of Theileria and Cytauxzoon species into a single fifth taxon. Although known biological characteristics of Piroplasmida corroborate the proposed phylogeny, more investigation into parasite life cycles is warranted to further understand the evolution of the Piroplasmida. Our results provide an evolutionary framework for comparative biology of these important animal and human pathogens and help focus renewed efforts toward understanding the

  16. Mitochondrial Genome Sequences and Structures Aid in the Resolution of Piroplasmida phylogeny.

    Directory of Open Access Journals (Sweden)

    Megan E Schreeg

    Full Text Available The taxonomy of the order Piroplasmida, which includes a number of clinically and economically relevant organisms, is a hotly debated topic amongst parasitologists. Three genera (Babesia, Theileria, and Cytauxzoon are recognized based on parasite life cycle characteristics, but molecular phylogenetic analyses of 18S sequences have suggested the presence of five or more distinct Piroplasmida lineages. Despite these important advancements, a few studies have been unable to define the taxonomic relationships of some organisms (e.g. C. felis and T. equi with respect to other Piroplasmida. Additional evidence from mitochondrial genome sequences and synteny should aid in the inference of Piroplasmida phylogeny and resolution of taxonomic uncertainties. In this study, we have amplified, sequenced, and annotated seven previously uncharacterized mitochondrial genomes (Babesia canis, Babesia vogeli, Babesia rossi, Babesia sp. Coco, Babesia conradae, Babesia microti-like sp., and Cytauxzoon felis and identified additional ribosomal fragments in ten previously characterized mitochondrial genomes. Phylogenetic analysis of concatenated mitochondrial and 18S sequences as well as cox1 amino acid sequence identified five distinct Piroplasmida groups, each of which possesses a unique mitochondrial genome structure. Specifically, our results confirm the existence of four previously identified clades (B. microti group, Babesia sensu stricto, Theileria equi, and a Babesia sensu latu group that includes B. conradae while supporting the integration of Theileria and Cytauxzoon species into a single fifth taxon. Although known biological characteristics of Piroplasmida corroborate the proposed phylogeny, more investigation into parasite life cycles is warranted to further understand the evolution of the Piroplasmida. Our results provide an evolutionary framework for comparative biology of these important animal and human pathogens and help focus renewed efforts toward

  17. Common structural and epigenetic changes in the genome of castration-resistant prostate cancer.

    Science.gov (United States)

    Friedlander, Terence W; Roy, Ritu; Tomlins, Scott A; Ngo, Vy T; Kobayashi, Yasuko; Azameera, Aruna; Rubin, Mark A; Pienta, Kenneth J; Chinnaiyan, Arul; Ittmann, Michael M; Ryan, Charles J; Paris, Pamela L

    2012-02-01

    Progression of primary prostate cancer to castration-resistant prostate cancer (CRPC) is associated with numerous genetic and epigenetic alterations that are thought to promote survival at metastatic sites. In this study, we investigated gene copy number and CpG methylation status in CRPC to gain insight into specific pathophysiologic pathways that are active in this advanced form of prostate cancer. Our analysis defined and validated 495 genes exhibiting significant differences in CRPC in gene copy number, including gains in androgen receptor (AR) and losses of PTEN and retinoblastoma 1 (RB1). Significant copy number differences existed between tumors with or without AR gene amplification, including a common loss of AR repressors in AR-unamplified tumors. Simultaneous gene methylation and allelic deletion occurred frequently in RB1 and HSD17B2, the latter of which is involved in testosterone metabolism. Lastly, genomic DNA from most CRPC was hypermethylated compared with benign prostate tissue. Our findings establish a comprehensive methylation signature that couples epigenomic and structural analyses, thereby offering insights into the genomic alterations in CRPC that are associated with a circumvention of hormonal therapy. Genes identified in this integrated genomic study point to new drug targets in CRPC, an incurable disease state which remains the chief therapeutic challenge. ©2012 AACR.

  18. Phylogenomic Analysis and Dynamic Evolution of Chloroplast Genomes in Salicaceae

    Directory of Open Access Journals (Sweden)

    Yuan Huang

    2017-06-01

    Full Text Available Chloroplast genomes of plants are highly conserved in both gene order and gene content. Analysis of the whole chloroplast genome is known to provide much more informative DNA sites and thus generates high resolution for plant phylogenies. Here, we report the complete chloroplast genomes of three Salix species in family Salicaceae. Phylogeny of Salicaceae inferred from complete chloroplast genomes is generally consistent with previous studies but resolved with higher statistical support. Incongruences of phylogeny, however, are observed in genus Populus, which most likely results from homoplasy. By comparing three Salix chloroplast genomes with the published chloroplast genomes of other Salicaceae species, we demonstrate that the synteny and length of chloroplast genomes in Salicaceae are highly conserved but experienced dynamic evolution among species. We identify seven positively selected chloroplast genes in Salicaceae, which might be related to the adaptive evolution of Salicaceae species. Comparative chloroplast genome analysis within the family also indicates that some chloroplast genes are lost or became pseudogenes, infer that the chloroplast genes horizontally transferred to the nucleus genome. Based on the complete nucleus genome sequences from two Salicaceae species, we remarkably identify that the entire chloroplast genome is indeed transferred and integrated to the nucleus genome in the individual of the reference genome of P. trichocarpa at least once. This observation, along with presence of the large nuclear plastid DNA (NUPTs and NUPTs-containing multiple chloroplast genes in their original order in the chloroplast genome, favors the DNA-mediated hypothesis of organelle to nucleus DNA transfer. Overall, the phylogenomic analysis using chloroplast complete genomes clearly elucidates the phylogeny of Salicaceae. The identification of positively selected chloroplast genes and dynamic chloroplast-to-nucleus gene transfers in

  19. Software for computing and annotating genomic ranges.

    Directory of Open Access Journals (Sweden)

    Michael Lawrence

    Full Text Available We describe Bioconductor infrastructure for representing and computing on annotated genomic ranges and integrating genomic data with the statistical computing features of R and its extensions. At the core of the infrastructure are three packages: IRanges, GenomicRanges, and GenomicFeatures. These packages provide scalable data structures for representing annotated ranges on the genome, with special support for transcript structures, read alignments and coverage vectors. Computational facilities include efficient algorithms for overlap and nearest neighbor detection, coverage calculation and other range operations. This infrastructure directly supports more than 80 other Bioconductor packages, including those for sequence analysis, differential expression analysis and visualization.

  20. Population structure and genomic inbreeding in nine Swiss dairy cattle populations.

    Science.gov (United States)

    Signer-Hasler, Heidi; Burren, Alexander; Neuditschko, Markus; Frischknecht, Mirjam; Garrick, Dorian; Stricker, Christian; Gredler, Birgit; Bapst, Beat; Flury, Christine

    2017-11-07

    Domestication, breed formation and intensive selection have resulted in divergent cattle breeds that likely exhibit their own genomic signatures. In this study, we used genotypes from 27,612 autosomal single nucleotide polymorphisms to characterize population structure based on 9214 sires representing nine Swiss dairy cattle populations: Brown Swiss (BS), Braunvieh (BV), Original Braunvieh (OB), Holstein (HO), Red Holstein (RH), Swiss Fleckvieh (SF), Simmental (SI), Eringer (ER) and Evolèner (EV). Genomic inbreeding (F ROH ) and signatures of selection were determined by calculating runs of homozygosity (ROH). The results build the basis for a better understanding of the genetic development of Swiss dairy cattle populations and highlight differences between the original populations (i.e. OB, SI, ER and EV) and those that have become more popular in Switzerland as currently reflected by their larger populations (i.e. BS, BV, HO, RH and SF). The levels of genetic diversity were highest and lowest in the SF and BS breeds, respectively. Based on F ST values, we conclude that, among all pairwise comparisons, BS and HO (0.156) differ more than the other pairs of populations. The original Swiss cattle populations OB, SI, ER, and EV are clearly genetically separated from the Swiss cattle populations that are now more common and represented by larger numbers of cows. Mean levels of F ROH ranged from 0.027 (ER) to 0.091 (BS). Three of the original Swiss cattle populations, ER (F ROH : 0.027), OB (F ROH : 0.029), and SI (F ROH : 0.039), showed low levels of genomic inbreeding, whereas it was much higher in EV (F ROH : 0.074). Private signatures of selection for the original Swiss cattle populations are reported for BTA4, 5, 11 and 26. The low levels of genomic inbreeding observed in the original Swiss cattle populations ER, OB and SI compared to the other breeds are explained by a lesser use of artificial insemination and greater use of natural service. Natural service

  1. Insular Celtic population structure and genomic footprints of migration.

    Directory of Open Access Journals (Sweden)

    Ross P Byrne

    2018-01-01

    Full Text Available Previous studies of the genetic landscape of Ireland have suggested homogeneity, with population substructure undetectable using single-marker methods. Here we have harnessed the haplotype-based method fineSTRUCTURE in an Irish genome-wide SNP dataset, identifying 23 discrete genetic clusters which segregate with geographical provenance. Cluster diversity is pronounced in the west of Ireland but reduced in the east where older structure has been eroded by historical migrations. Accordingly, when populations from the neighbouring island of Britain are included, a west-east cline of Celtic-British ancestry is revealed along with a particularly striking correlation between haplotypes and geography across both islands. A strong relationship is revealed between subsets of Northern Irish and Scottish populations, where discordant genetic and geographic affinities reflect major migrations in recent centuries. Additionally, Irish genetic proximity of all Scottish samples likely reflects older strata of communication across the narrowest inter-island crossing. Using GLOBETROTTER we detected Irish admixture signals from Britain and Europe and estimated dates for events consistent with the historical migrations of the Norse-Vikings, the Anglo-Normans and the British Plantations. The influence of the former is greater than previously estimated from Y chromosome haplotypes. In all, we paint a new picture of the genetic landscape of Ireland, revealing structure which should be considered in the design of studies examining rare genetic variation and its association with traits.

  2. Genome survey of pistachio (Pistacia vera L.) by next generation sequencing: Development of novel SSR markers and genetic diversity in Pistacia species.

    Science.gov (United States)

    Ziya Motalebipour, Elmira; Kafkas, Salih; Khodaeiaminjan, Mortaza; Çoban, Nergiz; Gözel, Hatice

    2016-12-07

    Pistachio (Pistacia vera L.) is one of the most important nut crops in the world. There are about 11 wild species in the genus Pistacia, and they have importance as rootstock seed sources for cultivated P. vera and forest trees. Published information on the pistachio genome is limited. Therefore, a genome survey is necessary to obtain knowledge on the genome structure of pistachio by next generation sequencing. Simple sequence repeat (SSR) markers are useful tools for germplasm characterization, genetic diversity analysis, and genetic linkage mapping, and may help to elucidate genetic relationships among pistachio cultivars and species. To explore the genome structure of pistachio, a genome survey was performed using the Illumina platform at approximately 40× coverage depth in the P. vera cv. Siirt. The K-mer analysis indicated that pistachio has a genome that is about 600 Mb in size and is highly heterozygous. The assembly of 26.77 Gb Illumina data produced 27,069 scaffolds at N50 = 3.4 kb with a total of 513.5 Mb. A total of 59,280 SSR motifs were detected with a frequency of 8.67 kb. A total of 206 SSRs were used to characterize 24 P. vera cultivars and 20 wild Pistacia genotypes (four genotypes from each five wild Pistacia species) belonging to P. atlantica, P. integerrima, P. chinenesis, P. terebinthus, and P. lentiscus genotypes. Overall 135 SSR loci amplified in all 44 cultivars and genotypes, 41 were polymorphic in six Pistacia species. The novel SSR loci developed from cultivated pistachio were highly transferable to wild Pistacia species. The results from a genome survey of pistachio suggest that the genome size of pistachio is about 600 Mb with a high heterozygosity rate. This information will help to design whole genome sequencing strategies for pistachio. The newly developed novel polymorphic SSRs in this study may help germplasm characterization, genetic diversity, and genetic linkage mapping studies in the genus Pistacia.

  3. Experimental Induction of Genome Chaos.

    Science.gov (United States)

    Ye, Christine J; Liu, Guo; Heng, Henry H

    2018-01-01

    Genome chaos, or karyotype chaos, represents a powerful survival strategy for somatic cells under high levels of stress/selection. Since the genome context, not the gene content, encodes the genomic blueprint of the cell, stress-induced rapid and massive reorganization of genome topology functions as a very important mechanism for genome (karyotype) evolution. In recent years, the phenomenon of genome chaos has been confirmed by various sequencing efforts, and many different terms have been coined to describe different subtypes of the chaotic genome including "chromothripsis," "chromoplexy," and "structural mutations." To advance this exciting field, we need an effective experimental system to induce and characterize the karyotype reorganization process. In this chapter, an experimental protocol to induce chaotic genomes is described, following a brief discussion of the mechanism and implication of genome chaos in cancer evolution.

  4. Assessment of Genetic Heterogeneity in Structured Plant Populations Using Multivariate Whole-Genome Regression Models.

    Science.gov (United States)

    Lehermeier, Christina; Schön, Chris-Carolin; de Los Campos, Gustavo

    2015-09-01

    Plant breeding populations exhibit varying levels of structure and admixture; these features are likely to induce heterogeneity of marker effects across subpopulations. Traditionally, structure has been dealt with as a potential confounder, and various methods exist to "correct" for population stratification. However, these methods induce a mean correction that does not account for heterogeneity of marker effects. The animal breeding literature offers a few recent studies that consider modeling genetic heterogeneity in multibreed data, using multivariate models. However, these methods have received little attention in plant breeding where population structure can have different forms. In this article we address the problem of analyzing data from heterogeneous plant breeding populations, using three approaches: (a) a model that ignores population structure [A-genome-based best linear unbiased prediction (A-GBLUP)], (b) a stratified (i.e., within-group) analysis (W-GBLUP), and (c) a multivariate approach that uses multigroup data and accounts for heterogeneity (MG-GBLUP). The performance of the three models was assessed on three different data sets: a diversity panel of rice (Oryza sativa), a maize (Zea mays L.) half-sib panel, and a wheat (Triticum aestivum L.) data set that originated from plant breeding programs. The estimated genomic correlations between subpopulations varied from null to moderate, depending on the genetic distance between subpopulations and traits. Our assessment of prediction accuracy features cases where ignoring population structure leads to a parsimonious more powerful model as well as others where the multivariate and stratified approaches have higher predictive power. In general, the multivariate approach appeared slightly more robust than either the A- or the W-GBLUP. Copyright © 2015 by the Genetics Society of America.

  5. Genomic diversity of drug-resistant Mycobacterium tuberculosis isolates in Lisbon Portugal: Towards tuberculosis genomic epidemiology

    Directory of Open Access Journals (Sweden)

    João Perdigão

    2015-01-01

    Full Text Available Multidrug- (MDR and extensively drug-resistant (XDR tuberculosis (TB present a challenge to disease control and elimination goals. Lisbon, Portugal, has a high TB incidencerate and unusual and successful XDR-TB strains that have been found in circulation foralmost two decades. For the last 20 years, a continued circulation of two phylogenetic clades, Lisboa3 and Q1, which are highly associated with MDR and XDR, have been observed. In recent years, these strains have been well characterized regarding the molecular basis of drug resistance and have been inclusively subjected to whole genome sequencing (WGS. Researchers have been studying the genomic diversity of strains circulating in Lisbon and its genomic determinants through cutting-edge next generation sequencing. An enormous amount of whole genome sequence data are now available for the most prevalent and clinically relevant strains circulating in Lisbon. It is the persistence, prevalence and rapid evolution towards drug resistance that has prompted researchers to investigate the properties of these strains at the genomic level and in the future at a global transcriptomic level. Seventy Mycobacterium tuberculosis (MTB isolates, mostly recovered in Lisbon, were genotyped by 24-loci Mycobacterial Interspersed Repetitive Unit – Variable Number of Tandem Repeats (MIRU-VNTR and the genomes sequenced using a next generation sequencing platform – Illumina HiSeq 2000. The genotyping data revealed three major clusters associated with MDR-TB (Lisboa3-A, Lisboa3-B and Q1, two of which are associated with XDR-TB (Lisboa3-B and Q1, whilst the genomic data contributed to elucidating the phylogenetic positioning of circulating MDR-TB strains, showing a high predominance of a single SNP cluster group 5. Furthermore, a genome-wide phylogeny analysis from these strains, together with 19 publicly available genomes of MTB clinical isolates, revealed two major clades responsible for MDR/XDR-TB in the region

  6. Genomic diversity of drug-resistant Mycobacterium tuberculosis isolates in Lisbon Portugal: Towards tuberculosis genomic epidemiology

    KAUST Repository

    Perdigã o, Joã o; Silva, Hugo; Machado, Diana; Macedo, Rita; Maltez, Fernando; Silva, Carla; Jordao, Luisa; Couto, Isabel; Mallard, Kim; Coll, Francesc; Hill-Cawthorne, Grant A.; McNerney, Ruth; Pain, Arnab; Clark, Taane G.; Viveiros, Miguel; Portugal, Isabel

    2015-01-01

    Multidrug- (MDR) and extensively drug-resistant (XDR) tuberculosis (TB) present a challenge to disease control and elimination goals. Lisbon, Portugal, has a high TB incidence rate and unusual and successful XDR-TB strains that have been found in circulation for almost two decades. For the last 20. years, a continued circulation of two phylogenetic clades, Lisboa3 and Q1, which are highly associated with MDR and XDR, have been observed. In recent years, these strains have been well characterized regarding the molecular basis of drug resistance and have been inclusively subjected to whole genome sequencing (WGS). Researchers have been studying the genomic diversity of strains circulating in Lisbon and its genomic determinants through cutting-edge next generation sequencing. An enormous amount of whole genome sequence data are now available for the most prevalent and clinically relevant strains circulating in Lisbon.It is the persistence, prevalence and rapid evolution towards drug resistance that has prompted researchers to investigate the properties of these strains at the genomic level and in the future at a global transcriptomic level. Seventy Mycobacterium tuberculosis (MTB) isolates, mostly recovered in Lisbon, were genotyped by 24-. loci Mycobacterial Interspersed Repetitive Unit - Variable Number of Tandem Repeats (MIRU-VNTR) and the genomes sequenced using a next generation sequencing platform - Illumina HiSeq 2000.The genotyping data revealed three major clusters associated with MDR-TB (Lisboa3-A, Lisboa3-B and Q1), two of which are associated with XDR-TB (Lisboa3-B and Q1), whilst the genomic data contributed to elucidating the phylogenetic positioning of circulating MDR-TB strains, showing a high predominance of a single SNP cluster group 5. Furthermore, a genome-wide phylogeny analysis from these strains, together with 19 publicly available genomes of MTB clinical isolates, revealed two major clades responsible for MDR/XDR-TB in the region: Lisboa3 and Q

  7. Genomic diversity of drug-resistant Mycobacterium tuberculosis isolates in Lisbon Portugal: Towards tuberculosis genomic epidemiology

    KAUST Repository

    Perdigão, João

    2015-03-01

    Multidrug- (MDR) and extensively drug-resistant (XDR) tuberculosis (TB) present a challenge to disease control and elimination goals. Lisbon, Portugal, has a high TB incidence rate and unusual and successful XDR-TB strains that have been found in circulation for almost two decades. For the last 20. years, a continued circulation of two phylogenetic clades, Lisboa3 and Q1, which are highly associated with MDR and XDR, have been observed. In recent years, these strains have been well characterized regarding the molecular basis of drug resistance and have been inclusively subjected to whole genome sequencing (WGS). Researchers have been studying the genomic diversity of strains circulating in Lisbon and its genomic determinants through cutting-edge next generation sequencing. An enormous amount of whole genome sequence data are now available for the most prevalent and clinically relevant strains circulating in Lisbon.It is the persistence, prevalence and rapid evolution towards drug resistance that has prompted researchers to investigate the properties of these strains at the genomic level and in the future at a global transcriptomic level. Seventy Mycobacterium tuberculosis (MTB) isolates, mostly recovered in Lisbon, were genotyped by 24-. loci Mycobacterial Interspersed Repetitive Unit - Variable Number of Tandem Repeats (MIRU-VNTR) and the genomes sequenced using a next generation sequencing platform - Illumina HiSeq 2000.The genotyping data revealed three major clusters associated with MDR-TB (Lisboa3-A, Lisboa3-B and Q1), two of which are associated with XDR-TB (Lisboa3-B and Q1), whilst the genomic data contributed to elucidating the phylogenetic positioning of circulating MDR-TB strains, showing a high predominance of a single SNP cluster group 5. Furthermore, a genome-wide phylogeny analysis from these strains, together with 19 publicly available genomes of MTB clinical isolates, revealed two major clades responsible for MDR/XDR-TB in the region: Lisboa3 and Q

  8. Genomic diversity within the haloalkaliphilic genus Thioalkalivibrio.

    Science.gov (United States)

    Ahn, Anne-Catherine; Meier-Kolthoff, Jan P; Overmars, Lex; Richter, Michael; Woyke, Tanja; Sorokin, Dimitry Y; Muyzer, Gerard

    2017-01-01

    Thioalkalivibrio is a genus of obligate chemolithoautotrophic haloalkaliphilic sulfur-oxidizing bacteria. Their habitat are soda lakes which are dual extreme environments with a pH range from 9.5 to 11 and salt concentrations up to saturation. More than 100 strains of this genus have been isolated from various soda lakes all over the world, but only ten species have been effectively described yet. Therefore, the assignment of the remaining strains to either existing or novel species is important and will further elucidate their genomic diversity as well as give a better general understanding of this genus. Recently, the genomes of 76 Thioalkalivibrio strains were sequenced. On these, we applied different methods including (i) 16S rRNA gene sequence analysis, (ii) Multilocus Sequence Analysis (MLSA) based on eight housekeeping genes, (iii) Average Nucleotide Identity based on BLAST (ANIb) and MUMmer (ANIm), (iv) Tetranucleotide frequency correlation coefficients (TETRA), (v) digital DNA:DNA hybridization (dDDH) as well as (vi) nucleotide- and amino acid-based Genome BLAST Distance Phylogeny (GBDP) analyses. We detected a high genomic diversity by revealing 15 new "genomic" species and 16 new "genomic" subspecies in addition to the ten already described species. Phylogenetic and phylogenomic analyses showed that the genus is not monophyletic, because four strains were clearly separated from the other Thioalkalivibrio by type strains from other genera. Therefore, it is recommended to classify the latter group as a novel genus. The biogeographic distribution of Thioalkalivibrio suggested that the different "genomic" species can be classified as candidate disjunct or candidate endemic species. This study is a detailed genome-based classification and identification of members within the genus Thioalkalivibrio. However, future phenotypical and chemotaxonomical studies will be needed for a full species description of this genus.

  9. Population genomic analysis of ancient and modern genomes yields new insights into the genetic ancestry of the Tyrolean Iceman and the genetic structure of Europe.

    Directory of Open Access Journals (Sweden)

    Martin Sikora

    2014-05-01

    Full Text Available Genome sequencing of the 5,300-year-old mummy of the Tyrolean Iceman, found in 1991 on a glacier near the border of Italy and Austria, has yielded new insights into his origin and relationship to modern European populations. A key finding of that study was an apparent recent common ancestry with individuals from Sardinia, based largely on the Y chromosome haplogroup and common autosomal SNP variation. Here, we compiled and analyzed genomic datasets from both modern and ancient Europeans, including genome sequence data from over 400 Sardinians and two ancient Thracians from Bulgaria, to investigate this result in greater detail and determine its implications for the genetic structure of Neolithic Europe. Using whole-genome sequencing data, we confirm that the Iceman is, indeed, most closely related to Sardinians. Furthermore, we show that this relationship extends to other individuals from cultural contexts associated with the spread of agriculture during the Neolithic transition, in contrast to individuals from a hunter-gatherer context. We hypothesize that this genetic affinity of ancient samples from different parts of Europe with Sardinians represents a common genetic component that was geographically widespread across Europe during the Neolithic, likely related to migrations and population expansions associated with the spread of agriculture.

  10. Comparative Reannotation of 21 Aspergillus Genomes

    Energy Technology Data Exchange (ETDEWEB)

    Salamov, Asaf; Riley, Robert; Kuo, Alan; Grigoriev, Igor

    2013-03-08

    We used comparative gene modeling to reannotate 21 Aspergillus genomes. Initial automatic annotation of individual genomes may contain some errors of different nature, e.g. missing genes, incorrect exon-intron structures, 'chimeras', which fuse 2 or more real genes or alternatively splitting some real genes into 2 or more models. The main premise behind the comparative modeling approach is that for closely related genomes most orthologous families have the same conserved gene structure. The algorithm maps all gene models predicted in each individual Aspergillus genome to the other genomes and, for each locus, selects from potentially many competing models, the one which most closely resembles the orthologous genes from other genomes. This procedure is iterated until no further change in gene models is observed. For Aspergillus genomes we predicted in total 4503 new gene models ( ~;;2percent per genome), supported by comparative analysis, additionally correcting ~;;18percent of old gene models. This resulted in a total of 4065 more genes with annotated PFAM domains (~;;3percent increase per genome). Analysis of a few genomes with EST/transcriptomics data shows that the new annotation sets also have a higher number of EST-supported splice sites at exon-intron boundaries.

  11. In situ optical sequencing and structure analysis of a trinucleotide repeat genome region by localization microscopy after specific COMBO-FISH nano-probing

    Science.gov (United States)

    Stuhlmüller, M.; Schwarz-Finsterle, J.; Fey, E.; Lux, J.; Bach, M.; Cremer, C.; Hinderhofer, K.; Hausmann, M.; Hildenbrand, G.

    2015-10-01

    Trinucleotide repeat expansions (like (CGG)n) of chromatin in the genome of cell nuclei can cause neurological disorders such as for example the Fragile-X syndrome. Until now the mechanisms are not clearly understood as to how these expansions develop during cell proliferation. Therefore in situ investigations of chromatin structures on the nanoscale are required to better understand supra-molecular mechanisms on the single cell level. By super-resolution localization microscopy (Spectral Position Determination Microscopy; SPDM) in combination with nano-probing using COMBO-FISH (COMBinatorial Oligonucleotide FISH), novel insights into the nano-architecture of the genome will become possible. The native spatial structure of trinucleotide repeat expansion genome regions was analysed and optical sequencing of repetitive units was performed within 3D-conserved nuclei using SPDM after COMBO-FISH. We analysed a (CGG)n-expansion region inside the 5' untranslated region of the FMR1 gene. The number of CGG repeats for a full mutation causing the Fragile-X syndrome was found and also verified by Southern blot. The FMR1 promotor region was similarly condensed like a centromeric region whereas the arrangement of the probes labelling the expansion region seemed to indicate a loop-like nano-structure. These results for the first time demonstrate that in situ chromatin structure measurements on the nanoscale are feasible. Due to further methodological progress it will become possible to estimate the state of trinucleotide repeat mutations in detail and to determine the associated chromatin strand structural changes on the single cell level. In general, the application of the described approach to any genome region will lead to new insights into genome nano-architecture and open new avenues for understanding mechanisms and their relevance in the development of heredity diseases.

  12. Analysis of Genome-Wide Copy Number Variations in Chinese Indigenous and Western Pig Breeds by 60 K SNP Genotyping Arrays

    Science.gov (United States)

    Sun, Yaqi; Wang, Hongyang; Wang, Chao; Yu, Shaobo; Liu, Jing; Zhang, Yu; Fan, Bin; Li, Kui; Liu, Bang

    2014-01-01

    Copy number variations (CNVs) represent a substantial source of structural variants in mammals and contribute to both normal phenotypic variability and disease susceptibility. Although low-resolution CNV maps are produced in many domestic animals, and several reports have been published about the CNVs of porcine genome, the differences between Chinese and western pigs still remain to be elucidated. In this study, we used Porcine SNP60 BeadChip and PennCNV algorithm to perform a genome-wide CNV detection in 302 individuals from six Chinese indigenous breeds (Tongcheng, Laiwu, Luchuan, Bama, Wuzhishan and Ningxiang pigs), three western breeds (Yorkshire, Landrace and Duroc) and one hybrid (Tongcheng×Duroc). A total of 348 CNV Regions (CNVRs) across genome were identified, covering 150.49 Mb of the pig genome or 6.14% of the autosomal genome sequence. In these CNVRs, 213 CNVRs were found to exist only in the six Chinese indigenous breeds, and 60 CNVRs only in the three western breeds. The characters of CNVs in four Chinese normal size breeds (Luchuan, Tongcheng and Laiwu pigs) and two minipig breeds (Bama and Wuzhishan pigs) were also analyzed in this study. Functional annotation suggested that these CNVRs possess a great variety of molecular function and may play important roles in phenotypic and production traits between Chinese and western breeds. Our results are important complementary to the CNV map in pig genome, which provide new information about the diversity of Chinese and western pig breeds, and facilitate further research on porcine genome CNVs. PMID:25198154

  13. Comparative genomics reveals insights into avian genome evolution and adaptation

    Science.gov (United States)

    Zhang, Guojie; Li, Cai; Li, Qiye; Li, Bo; Larkin, Denis M.; Lee, Chul; Storz, Jay F.; Antunes, Agostinho; Greenwold, Matthew J.; Meredith, Robert W.; Ödeen, Anders; Cui, Jie; Zhou, Qi; Xu, Luohao; Pan, Hailin; Wang, Zongji; Jin, Lijun; Zhang, Pei; Hu, Haofu; Yang, Wei; Hu, Jiang; Xiao, Jin; Yang, Zhikai; Liu, Yang; Xie, Qiaolin; Yu, Hao; Lian, Jinmin; Wen, Ping; Zhang, Fang; Li, Hui; Zeng, Yongli; Xiong, Zijun; Liu, Shiping; Zhou, Long; Huang, Zhiyong; An, Na; Wang, Jie; Zheng, Qiumei; Xiong, Yingqi; Wang, Guangbiao; Wang, Bo; Wang, Jingjing; Fan, Yu; da Fonseca, Rute R.; Alfaro-Núñez, Alonzo; Schubert, Mikkel; Orlando, Ludovic; Mourier, Tobias; Howard, Jason T.; Ganapathy, Ganeshkumar; Pfenning, Andreas; Whitney, Osceola; Rivas, Miriam V.; Hara, Erina; Smith, Julia; Farré, Marta; Narayan, Jitendra; Slavov, Gancho; Romanov, Michael N; Borges, Rui; Machado, João Paulo; Khan, Imran; Springer, Mark S.; Gatesy, John; Hoffmann, Federico G.; Opazo, Juan C.; Håstad, Olle; Sawyer, Roger H.; Kim, Heebal; Kim, Kyu-Won; Kim, Hyeon Jeong; Cho, Seoae; Li, Ning; Huang, Yinhua; Bruford, Michael W.; Zhan, Xiangjiang; Dixon, Andrew; Bertelsen, Mads F.; Derryberry, Elizabeth; Warren, Wesley; Wilson, Richard K; Li, Shengbin; Ray, David A.; Green, Richard E.; O’Brien, Stephen J.; Griffin, Darren; Johnson, Warren E.; Haussler, David; Ryder, Oliver A.; Willerslev, Eske; Graves, Gary R.; Alström, Per; Fjeldså, Jon; Mindell, David P.; Edwards, Scott V.; Braun, Edward L.; Rahbek, Carsten; Burt, David W.; Houde, Peter; Zhang, Yong; Yang, Huanming; Wang, Jian; Jarvis, Erich D.; Gilbert, M. Thomas P.; Wang, Jun

    2015-01-01

    Birds are the most species-rich class of tetrapod vertebrates and have wide relevance across many research fields. We explored bird macroevolution using full genomes from 48 avian species representing all major extant clades. The avian genome is principally characterized by its constrained size, which predominantly arose because of lineage-specific erosion of repetitive elements, large segmental deletions, and gene loss. Avian genomes furthermore show a remarkably high degree of evolutionary stasis at the levels of nucleotide sequence, gene synteny, and chromosomal structure. Despite this pattern of conservation, we detected many non-neutral evolutionary changes in protein-coding genes and noncoding regions. These analyses reveal that pan-avian genomic diversity covaries with adaptations to different lifestyles and convergent evolution of traits. PMID:25504712

  14. Draft Genome Sequence of Lactobacillus delbrueckii subsp. bulgaricus LBB.B5.

    Science.gov (United States)

    Urshev, Zoltan; Hajo, Karima; Lenoci, Leonardo; Bron, Peter A; Dijkstra, Annereinou; Alkema, Wynand; Wels, Michiel; Siezen, Roland J; Minkova, Svetlana; van Hijum, Sacha A F T

    2016-10-06

    Lactobacillus delbrueckii subsp. bulgaricus LBB.B5 originates from homemade Bulgarian yogurt and was selected for its ability to form a strong association with Streptococcus thermophilus The genome sequence will facilitate elucidating the genetic background behind the contribution of LBB.B5 to the taste and aroma of yogurt and its exceptional protocooperation with S. thermophilus. Copyright © 2016 Urshev et al.

  15. Radiotaxons and reliability of a genome

    International Nuclear Information System (INIS)

    Korogodin, V.I.

    1982-01-01

    Radiosensitivity of cells (D 0 ) is considered with regard to the structural organization of the genome. The following terms are introduced: ''karyotaxon'', organisms with identical structural organization of the genome, and ''specific genome stability'' K=D 0 C, where C is the quantity of DNA in the cell nucleus; K is the amount of energy (eV) the sorption of which in DNA is necessary and sufficient for one elementary damage to occur. It was shown that Ksub(i)=const. within every karyotaxon ''i''. K 1 =100 eV for viruses, and K 4 =61000 eV for the highest level of genome organization (diploid eukaryotes including man). Potential mechanisms of increasing Ksub(i) with increasing level of genome organization and the role of this factor in evolution are discussed [ru

  16. OryzaGenome: Genome Diversity Database of Wild Oryza Species

    KAUST Repository

    Ohyanagi, Hajime

    2015-11-18

    The species in the genus Oryza, encompassing nine genome types and 23 species, are a rich genetic resource and may have applications in deeper genomic analyses aiming to understand the evolution of plant genomes. With the advancement of next-generation sequencing (NGS) technology, a flood of Oryza species reference genomes and genomic variation information has become available in recent years. This genomic information, combined with the comprehensive phenotypic information that we are accumulating in our Oryzabase, can serve as an excellent genotype-phenotype association resource for analyzing rice functional and structural evolution, and the associated diversity of the Oryza genus. Here we integrate our previous and future phenotypic/habitat information and newly determined genotype information into a united repository, named OryzaGenome, providing the variant information with hyperlinks to Oryzabase. The current version of OryzaGenome includes genotype information of 446 O. rufipogon accessions derived by imputation and of 17 accessions derived by imputation-free deep sequencing. Two variant viewers are implemented: SNP Viewer as a conventional genome browser interface and Variant Table as a textbased browser for precise inspection of each variant one by one. Portable VCF (variant call format) file or tabdelimited file download is also available. Following these SNP (single nucleotide polymorphism) data, reference pseudomolecules/ scaffolds/contigs and genome-wide variation information for almost all of the closely and distantly related wild Oryza species from the NIG Wild Rice Collection will be available in future releases. All of the resources can be accessed through http://viewer.shigen.info/oryzagenome/.

  17. Complete Chloroplast Genomes of Papaver rhoeas and Papaver orientale: Molecular Structures, Comparative Analysis, and Phylogenetic Analysis

    Directory of Open Access Journals (Sweden)

    Jianguo Zhou

    2018-02-01

    Full Text Available Papaver rhoeas L. and P. orientale L., which belong to the family Papaveraceae, are used as ornamental and medicinal plants. The chloroplast genome has been used for molecular markers, evolutionary biology, and barcoding identification. In this study, the complete chloroplast genome sequences of P. rhoeas and P. orientale are reported. Results show that the complete chloroplast genomes of P. rhoeas and P. orientale have typical quadripartite structures, which are comprised of circular 152,905 and 152,799-bp-long molecules, respectively. A total of 130 genes were identified in each genome, including 85 protein-coding genes, 37 tRNA genes, and 8 rRNA genes. Sequence divergence analysis of four species from Papaveraceae indicated that the most divergent regions are found in the non-coding spacers with minimal differences among three Papaver species. These differences include the ycf1 gene and intergenic regions, such as rpoB-trnC, trnD-trnT, petA-psbJ, psbE-petL, and ccsA-ndhD. These regions are hypervariable regions, which can be used as specific DNA barcodes. This finding suggested that the chloroplast genome could be used as a powerful tool to resolve the phylogenetic positions and relationships of Papaveraceae. These results offer valuable information for future research in the identification of Papaver species and will benefit further investigations of these species.

  18. Elucidation of Mechanisms of Ceftazidime Resistance among Clinical Isolates of Pseudomonas aeruginosa by Using Genomic Data.

    Science.gov (United States)

    Kos, Veronica N; McLaughlin, Robert E; Gardner, Humphrey A

    2016-06-01

    Ceftazidime is one of the few cephalosporins with activity against Pseudomonas aeruginosa Using whole-genome comparative analysis, we set out to determine the prevalent mechanism(s) of resistance to ceftazidime (CAZ) using a set of 181 clinical isolates. These isolates represented various multilocus sequence types that consisted of both ceftazidime-susceptible and -resistant populations. A presumptive resistance mechanism against ceftazidime was identified in 88% of the nonsusceptible isolates using this approach. Copyright © 2016, American Society for Microbiology. All Rights Reserved.

  19. A physical map for the Amborella trichopoda genome sheds light on the evolution of angiosperm genome structure

    OpenAIRE

    Zuccolo, Andrea; Bowers, John E; Estill, James C; Xiong, Zhiyong; Luo, Meizhong; Sebastian, Aswathy; Goicoechea, Jos? Luis; Collura, Kristi; Yu, Yeisoo; Jiao, Yuannian; Duarte, Jill; Tang, Haibao; Ayyampalayam, Saravanaraj; Rounsley, Steve; Kudrna, Dave

    2011-01-01

    Background Recent phylogenetic analyses have identified Amborella trichopoda, an understory tree species endemic to the forests of New Caledonia, as sister to a clade including all other known flowering plant species. The Amborella genome is a unique reference for understanding the evolution of angiosperm genomes because it can serve as an outgroup to root comparative analyses. A physical map, BAC end sequences and sample shotgun sequences provide a first view of the 870 Mbp Amborella genome....

  20. The genomes and comparative genomics of Lactobacillus delbrueckii phages.

    Science.gov (United States)

    Riipinen, Katja-Anneli; Forsman, Päivi; Alatossava, Tapani

    2011-07-01

    Lactobacillus delbrueckii phages are a great source of genetic diversity. Here, the genome sequences of Lb. delbrueckii phages LL-Ku, c5 and JCL1032 were analyzed in detail, and the genetic diversity of Lb. delbrueckii phages belonging to different taxonomic groups was explored. The lytic isometric group b phages LL-Ku (31,080 bp) and c5 (31,841 bp) showed a minimum nucleotide sequence identity of 90% over about three-fourths of their genomes. The genomic locations of their lysis modules were unique, and the genomes featured several putative overlapping transcription units of genes. LL-Ku and c5 virions displayed peptidoglycan hydrolytic activity associated with a ~36-kDa protein similar in size to the endolysin. Unexpectedly, the 49,433-bp genome of the prolate phage JCL1032 (temperate, group c) revealed a conserved gene order within its structural genes. Lb. delbrueckii phages representing groups a (a phage LL-H), b and c possessed only limited protein sequence homology. Genomic comparison of LL-Ku and c5 suggested that diversification of Lb. delbrueckii phages is mainly due to insertions, deletions and recombination. For the first time, the complete genome sequences of group b and c Lb. delbrueckii phages are reported.

  1. Genomics and the human genome project: implications for psychiatry

    OpenAIRE

    Kelsoe, J R

    2004-01-01

    In the past decade the Human Genome Project has made extraordinary strides in understanding of fundamental human genetics. The complete human genetic sequence has been determined, and the chromosomal location of almost all human genes identified. Presently, a large international consortium, the HapMap Project, is working to identify a large portion of genetic variation in different human populations and the structure and relationship of these variants to each other. The Human Genome Project h...

  2. Musa sebagai Model Genom

    Directory of Open Access Journals (Sweden)

    RITA MEGIA

    2005-12-01

    Full Text Available During the meeting in Arlington, USA in 2001, the scientists grouped in PROMUSA agreed with the launching of the Global Musa Genomics Consortium. The Consortium aims to apply genomics technologies to the improvement of this important crop. These genome projects put banana as the third model species after Arabidopsis and rice that will be analyzed and sequenced. Comparing to Arabidopsis and rice, banana genome provides a unique and powerful insight into structural and in functional genomics that could not be found in those two species. This paper discussed these subjects-including the importance of banana as the fourth main food in the world, the evolution and biodiversity of this genetic resource and its parasite.

  3. New Genome Similarity Measures based on Conserved Gene Adjacencies.

    Science.gov (United States)

    Doerr, Daniel; Kowada, Luis Antonio B; Araujo, Eloi; Deshpande, Shachi; Dantas, Simone; Moret, Bernard M E; Stoye, Jens

    2017-06-01

    Many important questions in molecular biology, evolution, and biomedicine can be addressed by comparative genomic approaches. One of the basic tasks when comparing genomes is the definition of measures of similarity (or dissimilarity) between two genomes, for example, to elucidate the phylogenetic relationships between species. The power of different genome comparison methods varies with the underlying formal model of a genome. The simplest models impose the strong restriction that each genome under study must contain the same genes, each in exactly one copy. More realistic models allow several copies of a gene in a genome. One speaks of gene families, and comparative genomic methods that allow this kind of input are called gene family-based. The most powerful-but also most complex-models avoid this preprocessing of the input data and instead integrate the family assignment within the comparative analysis. Such methods are called gene family-free. In this article, we study an intermediate approach between family-based and family-free genomic similarity measures. Introducing this simpler model, called gene connections, we focus on the combinatorial aspects of gene family-free genome comparison. While in most cases, the computational costs to the general family-free case are the same, we also find an instance where the gene connections model has lower complexity. Within the gene connections model, we define three variants of genomic similarity measures that have different expression powers. We give polynomial-time algorithms for two of them, while we show NP-hardness for the third, most powerful one. We also generalize the measures and algorithms to make them more robust against recent local disruptions in gene order. Our theoretical findings are supported by experimental results, proving the applicability and performance of our newly defined similarity measures.

  4. Inferring network structure in non-normal and mixed discrete-continuous genomic data.

    Science.gov (United States)

    Bhadra, Anindya; Rao, Arvind; Baladandayuthapani, Veerabhadran

    2018-03-01

    Inferring dependence structure through undirected graphs is crucial for uncovering the major modes of multivariate interaction among high-dimensional genomic markers that are potentially associated with cancer. Traditionally, conditional independence has been studied using sparse Gaussian graphical models for continuous data and sparse Ising models for discrete data. However, there are two clear situations when these approaches are inadequate. The first occurs when the data are continuous but display non-normal marginal behavior such as heavy tails or skewness, rendering an assumption of normality inappropriate. The second occurs when a part of the data is ordinal or discrete (e.g., presence or absence of a mutation) and the other part is continuous (e.g., expression levels of genes or proteins). In this case, the existing Bayesian approaches typically employ a latent variable framework for the discrete part that precludes inferring conditional independence among the data that are actually observed. The current article overcomes these two challenges in a unified framework using Gaussian scale mixtures. Our framework is able to handle continuous data that are not normal and data that are of mixed continuous and discrete nature, while still being able to infer a sparse conditional sign independence structure among the observed data. Extensive performance comparison in simulations with alternative techniques and an analysis of a real cancer genomics data set demonstrate the effectiveness of the proposed approach. © 2017, The International Biometric Society.

  5. Genome-wide transcription analyses in rice using tiling microarrays

    DEFF Research Database (Denmark)

    Li, Lei; Wang, Xiangfeng; Stolc, Viktor

    2006-01-01

    . We report here a full-genome transcription analysis of the indica rice subspecies using high-density oligonucleotide tiling microarrays. Our results provided expression data support for the existence of 35,970 (81.9%) annotated gene models and identified 5,464 unique transcribed intergenic regions...... that share similar compositional properties with the annotated exons and have significant homology to other plant proteins. Elucidating and mapping of all transcribed regions revealed an association between global transcription and cytological chromosome features, and an overall similarity of transcriptional......Sequencing and computational annotation revealed several features, including high gene numbers, unusual composition of the predicted genes and a large number of genes lacking homology to known genes, that distinguish the rice (Oryza sativa) genome from that of other fully sequenced model species...

  6. Genome-wide analysis of the AP2/ERF family in Musa species reveals divergence and neofunctionalisation during evolution.

    Science.gov (United States)

    Lakhwani, Deepika; Pandey, Ashutosh; Dhar, Yogeshwar Vikram; Bag, Sumit Kumar; Trivedi, Prabodh Kumar; Asif, Mehar Hasan

    2016-01-06

    AP2/ERF domain containing transcription factor super family is one of the important regulators in the plant kingdom. The involvement of AP2/ERF family members has been elucidated in various processes associated with plant growth, development as well as in response to hormones, biotic and abiotic stresses. In this study, we carried out genome-wide analysis to identify members of AP2/ERF family in Musa acuminata (A genome) and Musa balbisiana (B genome) and changes leading to neofunctionalisation of genes. Analysis identified 265 and 318 AP2/ERF encoding genes in M. acuminata and M. balbisiana respectively which were further classified into ERF, DREB, AP2, RAV and Soloist groups. Comparative analysis indicated that AP2/ERF family has undergone duplication, loss and divergence during evolution and speciation of the Musa A and B genomes. We identified nine genes which are up-regulated during fruit ripening and might be components of the regulatory machinery operating during ethylene-dependent ripening in banana. Tissue-specific expression analysis of the genes suggests that different regulatory mechanisms might be involved in peel and pulp ripening process through recruiting specific ERFs in these tissues. Analysis also suggests that MaRAV-6 and MaERF026 have structurally diverged from their M. balbisiana counterparts and have attained new functions during ripening.

  7. Comparative Genomics of the Ubiquitous, Hydrocarbon-degrading Genus Marinobacter

    Science.gov (United States)

    Singer, E.; Webb, E.; Edwards, K. J.

    2012-12-01

    The genus Marinobacter is amongst the most ubiquitous in the global oceans and strains have been isolated from a wide variety of marine environments, including offshore oil-well heads, coastal thermal springs, Antarctic sea water, saline soils and associations with diatoms and dinoflagellates. Many strains have been recognized to be important hydrocarbon degraders in various marine habitats presenting sometimes extreme pH or salinity conditions. Analysis of the genome of M. aquaeolei revealed enormous adaptation versatility with an assortment of strategies for carbon and energy acquisition, sensation, and defense. In an effort to elucidate the ecological and biogeochemical significance of the Marinobacters, seven Marinobacter strains from diverse environments were included in a comparative genomics study. Genomes were screened for metabolic and adaptation potential to elucidate the strategies responsible for the omnipresence of the Marinobacter genus and their remedial action potential in hydrocarbon-polluted waters. The core genome predominantly encodes for key genes involved in hydrocarbon degradation, biofilm-relevant processes, including utilization of external DNA, halotolerance, as well as defense mechanisms against heavy metals, antibiotics, and toxins. All Marinobacter strains were observed to degrade a wide spectrum of hydrocarbon species, including aliphatic, polycyclic aromatic as well as acyclic isoprenoid compounds. Various genes predicted to facilitate hydrocarbon degradation, e.g. alkane 1-monooxygenase, appear to have originated from lateral gene transfer as they are located on gene clusters of 10-20% lower GC-content compared to genome averages and are flanked by transposases. Top ortholog hits are found in other hydrocarbon degrading organisms, e.g. Alcanivorax borkumensis. Strategies for hydrocarbon uptake encoded by various Marinobacter strains include cell surface hydrophobicity adaptation via capsular polysaccharide biosynthesis and attachment

  8. Defining the diverse spectrum of inversions, complex structural variation, and chromothripsis in the morbid human genome.

    Science.gov (United States)

    Collins, Ryan L; Brand, Harrison; Redin, Claire E; Hanscom, Carrie; Antolik, Caroline; Stone, Matthew R; Glessner, Joseph T; Mason, Tamara; Pregno, Giulia; Dorrani, Naghmeh; Mandrile, Giorgia; Giachino, Daniela; Perrin, Danielle; Walsh, Cole; Cipicchio, Michelle; Costello, Maura; Stortchevoi, Alexei; An, Joon-Yong; Currall, Benjamin B; Seabra, Catarina M; Ragavendran, Ashok; Margolin, Lauren; Martinez-Agosto, Julian A; Lucente, Diane; Levy, Brynn; Sanders, Stephan J; Wapner, Ronald J; Quintero-Rivera, Fabiola; Kloosterman, Wigard; Talkowski, Michael E

    2017-03-06

    Structural variation (SV) influences genome organization and contributes to human disease. However, the complete mutational spectrum of SV has not been routinely captured in disease association studies. We sequenced 689 participants with autism spectrum disorder (ASD) and other developmental abnormalities to construct a genome-wide map of large SV. Using long-insert jumping libraries at 105X mean physical coverage and linked-read whole-genome sequencing from 10X Genomics, we document seven major SV classes at ~5 kb SV resolution. Our results encompass 11,735 distinct large SV sites, 38.1% of which are novel and 16.8% of which are balanced or complex. We characterize 16 recurrent subclasses of complex SV (cxSV), revealing that: (1) cxSV are larger and rarer than canonical SV; (2) each genome harbors 14 large cxSV on average; (3) 84.4% of large cxSVs involve inversion; and (4) most large cxSV (93.8%) have not been delineated in previous studies. Rare SVs are more likely to disrupt coding and regulatory non-coding loci, particularly when truncating constrained and disease-associated genes. We also identify multiple cases of catastrophic chromosomal rearrangements known as chromoanagenesis, including somatic chromoanasynthesis, and extreme balanced germline chromothripsis events involving up to 65 breakpoints and 60.6 Mb across four chromosomes, further defining rare categories of extreme cxSV. These data provide a foundational map of large SV in the morbid human genome and demonstrate a previously underappreciated abundance and diversity of cxSV that should be considered in genomic studies of human disease.

  9. SV2: accurate structural variation genotyping and de novo mutation detection from whole genomes.

    Science.gov (United States)

    Antaki, Danny; Brandler, William M; Sebat, Jonathan

    2018-05-15

    Structural variation (SV) detection from short-read whole genome sequencing is error prone, presenting significant challenges for population or family-based studies of disease. Here, we describe SV2, a machine-learning algorithm for genotyping deletions and duplications from paired-end sequencing data. SV2 can rapidly integrate variant calls from multiple structural variant discovery algorithms into a unified call set with high genotyping accuracy and capability to detect de novo mutations. SV2 is freely available on GitHub (https://github.com/dantaki/SV2). jsebat@ucsd.edu. Supplementary data are available at Bioinformatics online.

  10. Genome evolution during progression to breast cancer

    KAUST Repository

    Newburger, D. E.; Kashef-Haghighi, D.; Weng, Z.; Salari, R.; Sweeney, R. T.; Brunner, A. L.; Zhu, S. X.; Guo, X.; Varma, S.; Troxell, M. L.; West, R. B.; Batzoglou, S.; Sidow, A.

    2013-01-01

    Cancer evolution involves cycles of genomic damage, epigenetic deregulation, and increased cellular proliferation that eventually culminate in the carcinoma phenotype. Early neoplasias, which are often found concurrently with carcinomas and are histologically distinguishable from normal breast tissue, are less advanced in phenotype than carcinomas and are thought to represent precursor stages. To elucidate their role in cancer evolution we performed comparative whole-genome sequencing of early neoplasias, matched normal tissue, and carcinomas from six patients, for a total of 31 samples. By using somatic mutations as lineage markers we built trees that relate the tissue samples within each patient. On the basis of these lineage trees we inferred the order, timing, and rates of genomic events. In four out of six cases, an early neoplasia and the carcinoma share a mutated common ancestor with recurring aneuploidies, and in all six cases evolution accelerated in the carcinoma lineage. Transition spectra of somatic mutations are stable and consistent across cases, suggesting that accumulation of somatic mutations is a result of increased ancestral cell division rather than specific mutational mechanisms. In contrast to highly advanced tumors that are the focus of much of the current cancer genome sequencing, neither the early neoplasia genomes nor the carcinomas are enriched with potentially functional somatic point mutations. Aneuploidies that occur in common ancestors of neoplastic and tumor cells are the earliest events that affect a large number of genes and may predispose breast tissue to eventual development of invasive carcinoma.

  11. Genome evolution during progression to breast cancer

    KAUST Repository

    Newburger, D. E.

    2013-04-08

    Cancer evolution involves cycles of genomic damage, epigenetic deregulation, and increased cellular proliferation that eventually culminate in the carcinoma phenotype. Early neoplasias, which are often found concurrently with carcinomas and are histologically distinguishable from normal breast tissue, are less advanced in phenotype than carcinomas and are thought to represent precursor stages. To elucidate their role in cancer evolution we performed comparative whole-genome sequencing of early neoplasias, matched normal tissue, and carcinomas from six patients, for a total of 31 samples. By using somatic mutations as lineage markers we built trees that relate the tissue samples within each patient. On the basis of these lineage trees we inferred the order, timing, and rates of genomic events. In four out of six cases, an early neoplasia and the carcinoma share a mutated common ancestor with recurring aneuploidies, and in all six cases evolution accelerated in the carcinoma lineage. Transition spectra of somatic mutations are stable and consistent across cases, suggesting that accumulation of somatic mutations is a result of increased ancestral cell division rather than specific mutational mechanisms. In contrast to highly advanced tumors that are the focus of much of the current cancer genome sequencing, neither the early neoplasia genomes nor the carcinomas are enriched with potentially functional somatic point mutations. Aneuploidies that occur in common ancestors of neoplastic and tumor cells are the earliest events that affect a large number of genes and may predispose breast tissue to eventual development of invasive carcinoma.

  12. A Web-Based Comparative Genomics Tutorial for Investigating Microbial Genomes

    Directory of Open Access Journals (Sweden)

    Michael Strong

    2009-12-01

    Full Text Available As the number of completely sequenced microbial genomes continues to rise at an impressive rate, it is important to prepare students with the skills necessary to investigate microorganisms at the genomic level. As a part of the core curriculum for first-year graduate students in the biological sciences, we have implemented a web-based tutorial to introduce students to the fields of comparative and functional genomics. The tutorial focuses on recent computational methods for identifying functionally linked genes and proteins on a genome-wide scale and was used to introduce students to the Rosetta Stone, Phylogenetic Profile, conserved Gene Neighbor, and Operon computational methods. Students learned to use a number of publicly available web servers and databases to identify functionally linked genes in the Escherichia coli genome, with emphasis on genome organization and operon structure. The overall effectiveness of the tutorial was assessed based on student evaluations and homework assignments. The tutorial is available to other educators at http://www.doe-mbi.ucla.edu/~strong/m253.php.

  13. GAAP: Genome-organization-framework-Assisted Assembly Pipeline for prokaryotic genomes.

    Science.gov (United States)

    Yuan, Lina; Yu, Yang; Zhu, Yanmin; Li, Yulai; Li, Changqing; Li, Rujiao; Ma, Qin; Siu, Gilman Kit-Hang; Yu, Jun; Jiang, Taijiao; Xiao, Jingfa; Kang, Yu

    2017-01-25

    Next-generation sequencing (NGS) technologies have greatly promoted the genomic study of prokaryotes. However, highly fragmented assemblies due to short reads from NGS are still a limiting factor in gaining insights into the genome biology. Reference-assisted tools are promising in genome assembly, but tend to result in false assembly when the assigned reference has extensive rearrangements. Herein, we present GAAP, a genome assembly pipeline for scaffolding based on core-gene-defined Genome Organizational Framework (cGOF) described in our previous study. Instead of assigning references, we use the multiple-reference-derived cGOFs as indexes to assist in order and orientation of the scaffolds and build a skeleton structure, and then use read pairs to extend scaffolds, called local scaffolding, and distinguish between true and chimeric adjacencies in the scaffolds. In our performance tests using both empirical and simulated data of 15 genomes in six species with diverse genome size, complexity, and all three categories of cGOFs, GAAP outcompetes or achieves comparable results when compared to three other reference-assisted programs, AlignGraph, Ragout and MeDuSa. GAAP uses both cGOF and pair-end reads to create assemblies in genomic scale, and performs better than the currently available reference-assisted assembly tools as it recovers more assemblies and makes fewer false locations, especially for species with extensive rearranged genomes. Our method is a promising solution for reconstruction of genome sequence from short reads of NGS.

  14. Multi-Probe Investigation of Proteomic Structure of Pathogens

    International Nuclear Information System (INIS)

    Malkin, A J; Plomp, M; Leighton, T J; Vogelstein, B; Wheeler, K E

    2008-01-01

    Complete genome sequences are available for understanding biotransformation, environmental resistance and pathogenesis of microbial, cellular and pathogen systems. The present technological and scientific challenges are to unravel the relationships between the organization and function of protein complexes at cell, microbial and pathogens surfaces, to understand how these complexes evolve during the bacterial, cellular and pathogen life cycles, and how they respond to environmental changes, chemical stimulants and therapeutics. In particular, elucidating the molecular structure and architecture of human pathogen surfaces is essential to understanding mechanisms of pathogenesis, immune response, physicochemical interactions, environmental resistance and development of countermeasures against bioterrorist agents. The objective of this project was to investigate the architecture, proteomic structure, and function of bacterial spores through a combination of high-resolution in vitro atomic force microscopy (AFM) and AFM-based immunolabeling with threat-specific antibodies. Particular attention in this project was focused on spore forming Bacillus species including the Sterne vaccine strain of Bacillus anthracis and the spore forming near-neighbor of Clostridium botulinum, C. novyi-NT. Bacillus species, including B. anthracis, the causative agent of inhalation anthrax are laboratory models for elucidating spore structure/function. Even though the complete genome sequence is available for B. subtilis, cereus, anthracis and other species, the determination and composition of spore structure/function is not understood. Prof. B. Vogelstein and colleagues at the John Hopkins University have recently developed a breakthrough bacteriolytic therapy for cancer treatment (1). They discovered that intravenously injected Clostridium novyi-NT spores germinate exclusively within the avascular regions of tumors in mice and destroy advanced cancerous lesions. The bacteria were also

  15. Draft genome analysis of Dietzia sp. 111N12-1, isolated from the South China Sea with bioremediation activity.

    Science.gov (United States)

    Yang, Shanjun; Yu, Mingjia; Chen, Jianming

    Dietzia sp. 111N12-1, isolated from the seawater of South China Sea, shows strong petroleum hydrocarbons degradation activity. Here, we report the draft sequence of approximately 3.7-Mbp genome of this strain. To the best of our knowledge, this is the first genome sequence of Dietzia strain isolated from the sea. The genome sequence may provide fundamental molecular information on elucidating the metabolic pathway of hydrocarbons degradation in this strain. Copyright © 2017 Sociedade Brasileira de Microbiologia. Published by Elsevier Editora Ltda. All rights reserved.

  16. The population genomics of begomoviruses: global scale population structure and gene flow

    Directory of Open Access Journals (Sweden)

    Prasanna HC

    2010-09-01

    Full Text Available Abstract Background The rapidly growing availability of diverse full genome sequences from across the world is increasing the feasibility of studying the large-scale population processes that underly observable pattern of virus diversity. In particular, characterizing the genetic structure of virus populations could potentially reveal much about how factors such as geographical distributions, host ranges and gene flow between populations combine to produce the discontinuous patterns of genetic diversity that we perceive as distinct virus species. Among the richest and most diverse full genome datasets that are available is that for the dicotyledonous plant infecting genus, Begomovirus, in the Family Geminiviridae. The begomoviruses all share the same whitefly vector, are highly recombinogenic and are distributed throughout tropical and subtropical regions where they seriously threaten the food security of the world's poorest people. Results We focus here on using a model-based population genetic approach to identify the genetically distinct sub-populations within the global begomovirus meta-population. We demonstrate the existence of at least seven major sub-populations that can further be sub-divided into as many as thirty four significantly differentiated and genetically cohesive minor sub-populations. Using the population structure framework revealed in the present study, we further explored the extent of gene flow and recombination between genetic populations. Conclusions Although geographical barriers are apparently the most significant underlying cause of the seven major population sub-divisions, within the framework of these sub-divisions, we explore patterns of gene flow to reveal that both host range differences and genetic barriers to recombination have probably been major contributors to the minor population sub-divisions that we have identified. We believe that the global Begomovirus population structure revealed here could

  17. Analyses of charophyte chloroplast genomes help characterize the ancestral chloroplast genome of land plants.

    Science.gov (United States)

    Civaň, Peter; Foster, Peter G; Embley, Martin T; Séneca, Ana; Cox, Cymon J

    2014-04-01

    Despite the significance of the relationships between embryophytes and their charophyte algal ancestors in deciphering the origin and evolutionary success of land plants, few chloroplast genomes of the charophyte algae have been reconstructed to date. Here, we present new data for three chloroplast genomes of the freshwater charophytes Klebsormidium flaccidum (Klebsormidiophyceae), Mesotaenium endlicherianum (Zygnematophyceae), and Roya anglica (Zygnematophyceae). The chloroplast genome of Klebsormidium has a quadripartite organization with exceptionally large inverted repeat (IR) regions and, uniquely among streptophytes, has lost the rrn5 and rrn4.5 genes from the ribosomal RNA (rRNA) gene cluster operon. The chloroplast genome of Roya differs from other zygnematophycean chloroplasts, including the newly sequenced Mesotaenium, by having a quadripartite structure that is typical of other streptophytes. On the basis of the improbability of the novel gain of IR regions, we infer that the quadripartite structure has likely been lost independently in at least three zygnematophycean lineages, although the absence of the usual rRNA operonic synteny in the IR regions of Roya may indicate their de novo origin. Significantly, all zygnematophycean chloroplast genomes have undergone substantial genomic rearrangement, which may be the result of ancient retroelement activity evidenced by the presence of integrase-like and reverse transcriptase-like elements in the Roya chloroplast genome. Our results corroborate the close phylogenetic relationship between Zygnematophyceae and land plants and identify 89 protein-coding genes and 22 introns present in the chloroplast genome at the time of the evolutionary transition of plants to land, all of which can be found in the chloroplast genomes of extant charophytes.

  18. A genomic overview of the population structure of Salmonella.

    Science.gov (United States)

    Alikhan, Nabil-Fareed; Zhou, Zhemin; Sergeant, Martin J; Achtman, Mark

    2018-04-01

    For many decades, Salmonella enterica has been subdivided by serological properties into serovars or further subdivided for epidemiological tracing by a variety of diagnostic tests with higher resolution. Recently, it has been proposed that so-called eBurst groups (eBGs) based on the alleles of seven housekeeping genes (legacy multilocus sequence typing [MLST]) corresponded to natural populations and could replace serotyping. However, this approach lacks the resolution needed for epidemiological tracing and the existence of natural populations had not been independently validated by independent criteria. Here, we describe EnteroBase, a web-based platform that assembles draft genomes from Illumina short reads in the public domain or that are uploaded by users. EnteroBase implements legacy MLST as well as ribosomal gene MLST (rMLST), core genome MLST (cgMLST), and whole genome MLST (wgMLST) and currently contains over 100,000 assembled genomes from Salmonella. It also provides graphical tools for visual interrogation of these genotypes and those based on core single nucleotide polymorphisms (SNPs). eBGs based on legacy MLST are largely consistent with eBGs based on rMLST, thus demonstrating that these correspond to natural populations. rMLST also facilitated the selection of representative genotypes for SNP analyses of the entire breadth of diversity within Salmonella. In contrast, cgMLST provides the resolution needed for epidemiological investigations. These observations show that genomic genotyping, with the assistance of EnteroBase, can be applied at all levels of diversity within the Salmonella genus.

  19. Isolation, structural elucidation, MS profiling, and evaluation of triglyceride accumulation inhibitory effects of benzophenone C-glucosides from leaves of Mangifera indica L.

    Science.gov (United States)

    Zhang, Yi; Han, Lifeng; Ge, Dandan; Liu, Xuefeng; Liu, Erwei; Wu, Chunhua; Gao, Xiumei; Wang, Tao

    2013-02-27

    Seventy percent ethanol-water extract from the leaves of Mangifera indica L. (Anacardiaceae) was found to show an inhibitory effect on triglyceride (TG) accumulation in 3T3-L1 cells. From the active fraction, six new benzophenone C-glucosides, foliamangiferosides A(3) (1), A(4) (2), C(4) (3), C(5) (4), C(6) (5), and C(7) (6) together with 11 known benzophenone C-glucosides (7-17) were obtained. In this paper, isolation, structure elucidation (1-6), and MS fragment cleavage pathways of all 17 isolates were studied. 1-6 showed inhibitory effects on TG and free fatty acid accumulation in 3T3-L1 cells at 10 μM.

  20. The W22 genome: a foundation for maize functional genomics and transposon biology

    Science.gov (United States)

    The maize W22 inbred has served as a platform for maize genetics since the mid twentieth century. To streamline maize genome analyses, we have sequenced and de novo assembled a W22 reference genome using small-read sequencing technologies. We show that significant structural heterogeneity exists in ...

  1. The wolf reference genome sequence (Canis lupus lupus) and its implications for Canis spp. population genomics

    DEFF Research Database (Denmark)

    Gopalakrishnan, Shyam; Samaniego Castruita, Jose Alfredo; Sinding, Mikkel Holger Strander

    2017-01-01

    Background An increasing number of studies are addressing the evolutionary genomics of dog domestication, principally through resequencing dog, wolf and related canid genomes. There is, however, only one de novo assembled canid genome currently available against which to map such data - that of a......Background An increasing number of studies are addressing the evolutionary genomics of dog domestication, principally through resequencing dog, wolf and related canid genomes. There is, however, only one de novo assembled canid genome currently available against which to map such data...... that regardless of the reference genome choice, most evolutionary genomic analyses yield qualitatively similar results, including those exploring the structure between the wolves and dogs using admixture and principal component analysis. However, we do observe differences in the genomic coverage of re-mapped...

  2. Structure of a short-chain dehydrogenase/reductase (SDR) within a genomic island from a clinical strain of Acinetobacter baumannii

    Energy Technology Data Exchange (ETDEWEB)

    Shah, Bhumika S., E-mail: bhumika.shah@mq.edu.au; Tetu, Sasha G. [Macquarie University, Research Park Drive, Sydney, NSW 2109 (Australia); Harrop, Stephen J. [University of New South Wales, Sydney, NSW 2052 (Australia); Paulsen, Ian T.; Mabbutt, Bridget C. [Macquarie University, Research Park Drive, Sydney, NSW 2109 (Australia)

    2014-09-25

    The structure of a short-chain dehydrogenase encoded within genomic islands of A. baumannii strains has been solved to 2.4 Å resolution. This classical SDR incorporates a flexible helical subdomain. The NADP-binding site and catalytic side chains are identified. Over 15% of the genome of an Australian clinical isolate of Acinetobacter baumannii occurs within genomic islands. An uncharacterized protein encoded within one island feature common to this and other International Clone II strains has been studied by X-ray crystallography. The 2.4 Å resolution structure of SDR-WM99c reveals it to be a new member of the classical short-chain dehydrogenase/reductase (SDR) superfamily. The enzyme contains a nucleotide-binding domain and, like many other SDRs, is tetrameric in form. The active site contains a catalytic tetrad (Asn117, Ser146, Tyr159 and Lys163) and water molecules occupying the presumed NADP cofactor-binding pocket. An adjacent cleft is capped by a relatively mobile helical subdomain, which is well positioned to control substrate access.

  3. Structure of a short-chain dehydrogenase/reductase (SDR) within a genomic island from a clinical strain of Acinetobacter baumannii

    International Nuclear Information System (INIS)

    Shah, Bhumika S.; Tetu, Sasha G.; Harrop, Stephen J.; Paulsen, Ian T.; Mabbutt, Bridget C.

    2014-01-01

    The structure of a short-chain dehydrogenase encoded within genomic islands of A. baumannii strains has been solved to 2.4 Å resolution. This classical SDR incorporates a flexible helical subdomain. The NADP-binding site and catalytic side chains are identified. Over 15% of the genome of an Australian clinical isolate of Acinetobacter baumannii occurs within genomic islands. An uncharacterized protein encoded within one island feature common to this and other International Clone II strains has been studied by X-ray crystallography. The 2.4 Å resolution structure of SDR-WM99c reveals it to be a new member of the classical short-chain dehydrogenase/reductase (SDR) superfamily. The enzyme contains a nucleotide-binding domain and, like many other SDRs, is tetrameric in form. The active site contains a catalytic tetrad (Asn117, Ser146, Tyr159 and Lys163) and water molecules occupying the presumed NADP cofactor-binding pocket. An adjacent cleft is capped by a relatively mobile helical subdomain, which is well positioned to control substrate access

  4. Genome-wide association analyses identify new risk variants and the genetic architecture of amyotrophic lateral sclerosis

    NARCIS (Netherlands)

    van Rheenen, Wouter; Shatunov, Aleksey; Dekker, Annelot M; McLaughlin, Russell L; Diekstra, Frank P; Pulit, Sara L; van der Spek, Rick A A; Võsa, Urmo; de Jong, Simone; Robinson, Matthew R; Yang, Jian; Fogh, Isabella; van Doormaal, Perry Tc; Tazelaar, Gijs H P; Koppers, Max; Blokhuis, Anna M; Sproviero, William; Jones, Ashley R; Kenna, Kevin P; van Eijk, Kristel R; Harschnitz, Oliver; Schellevis, Raymond D; Brands, William J; Medic, Jelena; Menelaou, Androniki; Vajda, Alice; Ticozzi, Nicola; Lin, Kuang; Rogelj, Boris; Vrabec, Katarina; Ravnik-Glavač, Metka; Koritnik, Blaž; Zidar, Janez; Leonardis, Lea; Grošelj, Leja Dolenc; Millecamps, Stéphanie; Salachas, François; Meininger, Vincent; de Carvalho, Mamede; Pinto, Susana; Mora, Jesus S; Rojas-García, Ricardo; Polak, Meraida; Chandran, Siddharthan; Colville, Shuna; Swingler, Robert; Morrison, Karen E; Shaw, Pamela J; Hardy, John; Orrell, Richard W; Pittman, Alan; Sidle, Katie; Fratta, Pietro; Malaspina, Andrea; Topp, Simon; Petri, Susanne; Abdulla, Susanne; Drepper, Carsten; Sendtner, Michael; Meyer, Thomas; Ophoff, Roel A.; Staats, Kim A; Wiedau-Pazos, Martina; Lomen-Hoerth, Catherine; Van Deerlin, Vivianna M; Trojanowski, John Q; Elman, Lauren; McCluskey, Leo; Basak, A Nazli; Tunca, Ceren; Hamzeiy, Hamid; Parman, Yesim; Meitinger, Thomas; Lichtner, Peter; Radivojkov-Blagojevic, Milena; Andres, Christian R; Maurel, Cindy; Bensimon, Gilbert; Landwehrmeyer, Bernhard; Brice, Alexis; Payan, Christine A M; Saker-Delye, Safaa; Dürr, Alexandra; Wood, Nicholas W; Tittmann, Lukas; Lieb, Wolfgang; Franke, Andre; Rietschel, Marcella; Cichon, Sven; Nöthen, Markus M; Amouyel, Philippe; Tzourio, Christophe; Dartigues, Jean-François; Uitterlinden, Andre G; Rivadeneira, Fernando; Estrada, Karol; Hofman, Albert; Curtis, Charles; Blauw, Hylke M; van der Kooi, Anneke J; de Visser, Marianne; Goris, An; Weber, Markus; Shaw, Christopher E; Smith, Bradley N; Pansarasa, Orietta; Cereda, Cristina; Del Bo, Roberto; Comi, Giacomo P; D'Alfonso, Sandra; Bertolin, Cinzia; Sorarù, Gianni; Mazzini, Letizia; Pensato, Viviana; Gellera, Cinzia; Tiloca, Cinzia; Ratti, Antonia; Calvo, Andrea; Moglia, Cristina; Brunetti, Maura; Arcuti, Simona; Capozzo, Rosa; Zecca, Chiara; Lunetta, Christian; Penco, Silvana; Riva, Nilo; Padovani, Alessandro; Filosto, Massimiliano; Muller, Bernard; Stuit, Robbert Jan; Blair, Ian; Zhang, Katharine; McCann, Emily P; Fifita, Jennifer A; Nicholson, Garth A; Rowe, Dominic B; Pamphlett, Roger; Kiernan, Matthew C; Grosskreutz, Julian; Witte, Otto W; Ringer, Thomas; Prell, Tino; Stubendorff, Beatrice; Kurth, Ingo; Hübner, Christian A; Leigh, P Nigel; Casale, Federico; Chio, Adriano; Beghi, Ettore; Pupillo, Elisabetta; Tortelli, Rosanna; Logroscino, Giancarlo; Powell, John; Ludolph, Albert C; Weishaupt, Jochen H; Robberecht, Wim; Van Damme, Philip; Franke, Lude; Pers, Tune H; Brown, Robert H; Glass, Jonathan D; Landers, John E; Hardiman, Orla; Andersen, Peter M; Corcia, Philippe; Vourc'h, Patrick; Silani, Vincenzo; Wray, Naomi R; Visscher, Peter M; de Bakker, Paul I W; van Es, Michael A; Pasterkamp, R Jeroen; Lewis, Cathryn M; Breen, Gerome; Al-Chalabi, Ammar; van den Berg, Leonard H; Veldink, Jan H

    To elucidate the genetic architecture of amyotrophic lateral sclerosis (ALS) and find associated loci, we assembled a custom imputation reference panel from whole-genome-sequenced patients with ALS and matched controls (n = 1,861). Through imputation and mixed-model association analysis in 12,577

  5. Genome-wide association analyses identify new risk variants and the genetic architecture of amyotrophic lateral sclerosis

    NARCIS (Netherlands)

    van Rheenen, Wouter; Shatunov, Aleksey; Dekker, Annelot M.; McLaughlin, Russell L.; Diekstra, Frank P.; Pulit, Sara L.; van der Spek, Rick A. A.; Võsa, Urmo; de Jong, Simone; Robinson, Matthew R.; Yang, Jian; Fogh, Isabella; van Doormaal, Perry Tc; Tazelaar, Gijs H. P.; Koppers, Max; Blokhuis, Anna M.; Sproviero, William; Jones, Ashley R.; Kenna, Kevin P.; van Eijk, Kristel R.; Harschnitz, Oliver; Schellevis, Raymond D.; Brands, William J.; Medic, Jelena; Menelaou, Androniki; Vajda, Alice; Ticozzi, Nicola; Lin, Kuang; Rogelj, Boris; Vrabec, Katarina; Ravnik-Glavač, Metka; Koritnik, Blaž; Zidar, Janez; Leonardis, Lea; Grošelj, Leja Dolenc; Millecamps, Stéphanie; Salachas, François; Meininger, Vincent; de Carvalho, Mamede; Pinto, Susana; Mora, Jesus S.; Rojas-García, Ricardo; Polak, Meraida; Chandran, Siddharthan; Colville, Shuna; Swingler, Robert; Morrison, Karen E.; Shaw, Pamela J.; Hardy, John; Orrell, Richard W.; Pittman, Alan; Sidle, Katie; Fratta, Pietro; Malaspina, Andrea; Topp, Simon; Petri, Susanne; Abdulla, Susanne; Drepper, Carsten; Sendtner, Michael; Meyer, Thomas; Ophoff, Roel A.; Staats, Kim A.; Wiedau-Pazos, Martina; Lomen-Hoerth, Catherine; van Deerlin, Vivianna M.; Trojanowski, John Q.; Elman, Lauren; McCluskey, Leo; Basak, A. Nazli; Tunca, Ceren; Hamzeiy, Hamid; Parman, Yesim; Meitinger, Thomas; Lichtner, Peter; Radivojkov-Blagojevic, Milena; Andres, Christian R.; Maurel, Cindy; Bensimon, Gilbert; Landwehrmeyer, Bernhard; Brice, Alexis; Payan, Christine A. M.; Saker-Delye, Safaa; Dürr, Alexandra; Wood, Nicholas W.; Tittmann, Lukas; Lieb, Wolfgang; Franke, Andre; Rietschel, Marcella; Cichon, Sven; Nöthen, Markus M.; Amouyel, Philippe; Tzourio, Christophe; Dartigues, Jean-François; Uitterlinden, Andre G.; Rivadeneira, Fernando; Estrada, Karol; Hofman, Albert; Curtis, Charles; Blauw, Hylke M.; van der Kooi, Anneke J.; de Visser, Marianne; Goris, An; Weber, Markus; Shaw, Christopher E.; Smith, Bradley N.; Pansarasa, Orietta; Cereda, Cristina; del Bo, Roberto; Comi, Giacomo P.; D'alfonso, Sandra; Bertolin, Cinzia; Sorarù, Gianni; Mazzini, Letizia; Pensato, Viviana; Gellera, Cinzia; Tiloca, Cinzia; Ratti, Antonia; Calvo, Andrea; Moglia, Cristina; Brunetti, Maura; Arcuti, Simona; Capozzo, Rosa; Zecca, Chiara; Lunetta, Christian; Penco, Silvana; Riva, Nilo; Padovani, Alessandro; Filosto, Massimiliano; Muller, Bernard; Stuit, Robbert Jan; Blair, Ian; Zhang, Katharine; McCann, Emily P.; Fifita, Jennifer A.; Nicholson, Garth A.; Rowe, Dominic B.; Pamphlett, Roger; Kiernan, Matthew C.; Grosskreutz, Julian; Witte, Otto W.; Ringer, Thomas; Prell, Tino; Stubendorff, Beatrice; Kurth, Ingo; Hübner, Christian A.; Leigh, P. Nigel; Casale, Federico; Chio, Adriano; Beghi, Ettore; Pupillo, Elisabetta; Tortelli, Rosanna; Logroscino, Giancarlo; Powell, John; Ludolph, Albert C.; Weishaupt, Jochen H.; Robberecht, Wim; van Damme, Philip; Franke, Lude; Pers, Tune H.; Brown, Robert H.; Glass, Jonathan D.; Landers, John E.; Hardiman, Orla; Andersen, Peter M.; Corcia, Philippe; Vourc'h, Patrick; Silani, Vincenzo; Wray, Naomi R.; Visscher, Peter M.; de Bakker, Paul I. W.; van Es, Michael A.; Pasterkamp, R. Jeroen; Lewis, Cathryn M.; Breen, Gerome; Al-Chalabi, Ammar; van den Berg, Leonard H.; Veldink, Jan H.

    2016-01-01

    To elucidate the genetic architecture of amyotrophic lateral sclerosis (ALS) and find associated loci, we assembled a custom imputation reference panel from whole-genome-sequenced patients with ALS and matched controls (n = 1,861). Through imputation and mixed-model association analysis in 12,577

  6. Universal Internucleotide Statistics in Full Genomes: A Footprint of the DNA Structure and Packaging?

    OpenAIRE

    Bogachev, Mikhail I.; Kayumov, Airat R.; Bunde, Armin

    2014-01-01

    Uncovering the fundamental laws that govern the complex DNA structural organization remains challenging and is largely based upon reconstructions from the primary nucleotide sequences. Here we investigate the distributions of the internucleotide intervals and their persistence properties in complete genomes of various organisms from Archaea and Bacteria to H. Sapiens aiming to reveal the manifestation of the universal DNA architecture. We find that in all considered organisms the internucleot...

  7. Structural analysis of a set of proteins resulting from a bacterial genomics project.

    Science.gov (United States)

    Badger, J; Sauder, J M; Adams, J M; Antonysamy, S; Bain, K; Bergseid, M G; Buchanan, S G; Buchanan, M D; Batiyenko, Y; Christopher, J A; Emtage, S; Eroshkina, A; Feil, I; Furlong, E B; Gajiwala, K S; Gao, X; He, D; Hendle, J; Huber, A; Hoda, K; Kearins, P; Kissinger, C; Laubert, B; Lewis, H A; Lin, J; Loomis, K; Lorimer, D; Louie, G; Maletic, M; Marsh, C D; Miller, I; Molinari, J; Muller-Dieckmann, H J; Newman, J M; Noland, B W; Pagarigan, B; Park, F; Peat, T S; Post, K W; Radojicic, S; Ramos, A; Romero, R; Rutter, M E; Sanderson, W E; Schwinn, K D; Tresser, J; Winhoven, J; Wright, T A; Wu, L; Xu, J; Harris, T J R

    2005-09-01

    The targets of the Structural GenomiX (SGX) bacterial genomics project were proteins conserved in multiple prokaryotic organisms with no obvious sequence homolog in the Protein Data Bank of known structures. The outcome of this work was 80 structures, covering 60 unique sequences and 49 different genes. Experimental phase determination from proteins incorporating Se-Met was carried out for 45 structures with most of the remainder solved by molecular replacement using members of the experimentally phased set as search models. An automated tool was developed to deposit these structures in the Protein Data Bank, along with the associated X-ray diffraction data (including refined experimental phases) and experimentally confirmed sequences. BLAST comparisons of the SGX structures with structures that had appeared in the Protein Data Bank over the intervening 3.5 years since the SGX target list had been compiled identified homologs for 49 of the 60 unique sequences represented by the SGX structures. This result indicates that, for bacterial structures that are relatively easy to express, purify, and crystallize, the structural coverage of gene space is proceeding rapidly. More distant sequence-structure relationships between the SGX and PDB structures were investigated using PDB-BLAST and Combinatorial Extension (CE). Only one structure, SufD, has a truly unique topology compared to all folds in the PDB. Copyright 2005 Wiley-Liss, Inc.

  8. An original SERPINA3 gene cluster: Elucidation of genomic organization and gene expression in the Bos taurus 21q24 region

    Directory of Open Access Journals (Sweden)

    Ouali Ahmed

    2008-04-01

    Full Text Available Abstract Background The superfamily of serine proteinase inhibitors (serpins is involved in numerous fundamental biological processes as inflammation, blood coagulation and apoptosis. Our interest is focused on the SERPINA3 sub-family. The major human plasma protease inhibitor, α1-antichymotrypsin, encoded by the SERPINA3 gene, is homologous to genes organized in clusters in several mammalian species. However, although there is a similar genic organization with a high degree of sequence conservation, the reactive-centre-loop domains, which are responsible for the protease specificity, show significant divergences. Results We provide additional information by analyzing the situation of SERPINA3 in the bovine genome. A cluster of eight genes and one pseudogene sharing a high degree of identity and the same structural organization was characterized. Bovine SERPINA3 genes were localized by radiation hybrid mapping on 21q24 and only spanned over 235 Kilobases. For all these genes, we propose a new nomenclature from SERPINA3-1 to SERPINA3-8. They share approximately 70% of identity with the human SERPINA3 homologue. In the cluster, we described an original sub-group of six members with an unexpected high degree of conservation for the reactive-centre-loop domain, suggesting a similar peptidase inhibitory pattern. Preliminary expression analyses of these bovSERPINA3s showed different tissue-specific patterns and diverse states of glycosylation and phosphorylation. Finally, in the context of phylogenetic analyses, we improved our knowledge on mammalian SERPINAs evolution. Conclusion Our experimental results update data of the bovine genome sequencing, substantially increase the bovSERPINA3 sub-family and enrich the phylogenetic tree of serpins. We provide new opportunities for future investigations to approach the biological functions of this unusual subset of serine proteinase inhibitors.

  9. Genome engineering in Vibrio cholerae

    DEFF Research Database (Denmark)

    Val, Marie-Eve; Skovgaard, Ole; Ducos-Galand, Magaly

    2012-01-01

    Although bacteria with multipartite genomes are prevalent, our knowledge of the mechanisms maintaining their genome is very limited, and much remains to be learned about the structural and functional interrelationships of multiple chromosomes. Owing to its bi-chromosomal genome architecture and its....... This difficulty was surmounted using a unique and powerful strategy based on massive rearrangement of prokaryotic genomes. We developed a site-specific recombination-based engineering tool, which allows targeted, oriented, and reciprocal DNA exchanges. Using this genetic tool, we obtained a panel of V. cholerae...

  10. Effect of primary particle size on spray formation, morphology and internal structure of alumina granules and elucidation of flowability and compaction behaviour

    Directory of Open Access Journals (Sweden)

    Pandu Ramavath

    2014-06-01

    Full Text Available Three different alumina powders with varying particle sizes were subjected to spray drying under identical conditions and effect of particle size on heat transfer efficiency and mechanism of formation of granules was elucidated. Morphology, internal structure and size distribution of granules were studied and evaluated with respect to their flow behaviour. In order to estimate the elastic interaction of granules, the granules were subjected to compaction under progressive loading followed by periodic unloading. Compaction curves were plotted and compressibility factor was estimated and correlated with predicted and measured green density values.

  11. Structure elucidation and in vitro cytotoxicity of ochratoxin α amide, a new degradation product of ochratoxin A.

    Science.gov (United States)

    Bittner, Andrea; Cramer, Benedikt; Harrer, Henning; Humpf, Hans-Ulrich

    2015-05-01

    The mycotoxin ochratoxin A is a secondary metabolite occurring in a wide range of commodities. During the exposure of ochratoxin A to white and blue light, a cleavage between the carbon atom C-14 and the nitrogen atom was described. As a reaction product, the new compound ochratoxin α amide has been proposed based on mass spectrometry (MS) experiments. In the following study, we observed that this compound is also formed at high temperatures such as used for example during coffee roasting and therefore represents a further thermal ochratoxin A degradation product. To confirm the structure of ochratoxin α amide, the compound was prepared in large scale and complete structure elucidation via nuclear magnetic resonance (NMR) and MS was performed. Additionally, first studies on the toxicity of ochratoxin α amide were performed using immortalized human kidney epithelial (IHKE) cells, a cell line known to be sensitive against ochratoxin A with an IC50 value of 0.5 μM. Using this system, ochratoxin α amide revealed no cytotoxicity up to concentrations of 50 μM. Thus, these results propose that the thermal degradation of ochratoxin A to ochratoxin α amide might be a detoxification process. Finally, we present a sample preparation and a HPLC-tandem mass spectrometry (HPLC-MS/MS) method for the analysis of ochratoxin α amide in extrudates and checked its formation during the extrusion of artificially contaminated wheat grits at 150 and 180 °C, whereas no ochratoxin α amide was detectable under these conditions.

  12. A genomic overview of the population structure of Salmonella.

    Directory of Open Access Journals (Sweden)

    Nabil-Fareed Alikhan

    2018-04-01

    Full Text Available For many decades, Salmonella enterica has been subdivided by serological properties into serovars or further subdivided for epidemiological tracing by a variety of diagnostic tests with higher resolution. Recently, it has been proposed that so-called eBurst groups (eBGs based on the alleles of seven housekeeping genes (legacy multilocus sequence typing [MLST] corresponded to natural populations and could replace serotyping. However, this approach lacks the resolution needed for epidemiological tracing and the existence of natural populations had not been independently validated by independent criteria. Here, we describe EnteroBase, a web-based platform that assembles draft genomes from Illumina short reads in the public domain or that are uploaded by users. EnteroBase implements legacy MLST as well as ribosomal gene MLST (rMLST, core genome MLST (cgMLST, and whole genome MLST (wgMLST and currently contains over 100,000 assembled genomes from Salmonella. It also provides graphical tools for visual interrogation of these genotypes and those based on core single nucleotide polymorphisms (SNPs. eBGs based on legacy MLST are largely consistent with eBGs based on rMLST, thus demonstrating that these correspond to natural populations. rMLST also facilitated the selection of representative genotypes for SNP analyses of the entire breadth of diversity within Salmonella. In contrast, cgMLST provides the resolution needed for epidemiological investigations. These observations show that genomic genotyping, with the assistance of EnteroBase, can be applied at all levels of diversity within the Salmonella genus.

  13. Population Genomics of Infectious and Integrated Wolbachia pipientis Genomes in Drosophila ananassae

    Science.gov (United States)

    Choi, Jae Young; Bubnell, Jaclyn E.; Aquadro, Charles F.

    2015-01-01

    Coevolution between Drosophila and its endosymbiont Wolbachia pipientis has many intriguing aspects. For example, Drosophila ananassae hosts two forms of W. pipientis genomes: One being the infectious bacterial genome and the other integrated into the host nuclear genome. Here, we characterize the infectious and integrated genomes of W. pipientis infecting D. ananassae (wAna), by genome sequencing 15 strains of D. ananassae that have either the infectious or integrated wAna genomes. Results indicate evolutionarily stable maternal transmission for the infectious wAna genome suggesting a relatively long-term coevolution with its host. In contrast, the integrated wAna genome showed pseudogene-like characteristics accumulating many variants that are predicted to have deleterious effects if present in an infectious bacterial genome. Phylogenomic analysis of sequence variation together with genotyping by polymerase chain reaction of large structural variations indicated several wAna variants among the eight infectious wAna genomes. In contrast, only a single wAna variant was found among the seven integrated wAna genomes examined in lines from Africa, south Asia, and south Pacific islands suggesting that the integration occurred once from a single infectious wAna genome and then spread geographically. Further analysis revealed that for all D. ananassae we examined with the integrated wAna genomes, the majority of the integrated wAna genomic regions is represented in at least two copies suggesting a double integration or single integration followed by an integrated genome duplication. The possible evolutionary mechanism underlying the widespread geographical presence of the duplicate integration of the wAna genome is an intriguing question remaining to be answered. PMID:26254486

  14. BIGSdb: Scalable analysis of bacterial genome variation at the population level

    Directory of Open Access Journals (Sweden)

    Maiden Martin CJ

    2010-12-01

    represents a freely available resource that will assist the broader community in the elucidation of the structure and function of bacteria by means of a population genomics approach.

  15. Population genomic structure and adaptation in the zoonotic malaria parasite Plasmodium knowlesi

    KAUST Repository

    Assefa, Samuel

    2015-10-06

    Malaria cases caused by the zoonotic parasite Plasmodium knowlesi are being increasingly reported throughout Southeast Asia and in travelers returning from the region. To test for evidence of signatures of selection or unusual population structure in this parasite, we surveyed genome sequence diversity in 48 clinical isolates recently sampled from Malaysian Borneo and in five lines maintained in laboratory rhesus macaques after isolation in the 1960s from Peninsular Malaysia and the Philippines. Overall genomewide nucleotide diversity (π = 6.03 × 10) was much higher than has been seen in worldwide samples of either of the major endemic malaria parasite species Plasmodium falciparum and Plasmodium vivax. A remarkable substructure is revealed within P. knowlesi, consisting of two major sympatric clusters of the clinical isolates and a third cluster comprising the laboratory isolates. There was deep differentiation between the two clusters of clinical isolates [mean genomewide fixation index (F) = 0.21, with 9,293 SNPs having fixed differences of F = 1.0]. This differentiation showed marked heterogeneity across the genome, with mean F values of different chromosomes ranging from 0.08 to 0.34 and with further significant variation across regions within several chromosomes. Analysis of the largest cluster (cluster 1, 38 isolates) indicated long-term population growth, with negatively skewed allele frequency distributions (genomewide average Tajima\\'s D = -1.35). Against this background there was evidence of balancing selection on particular genes, including the circumsporozoite protein (csp) gene, which had the top Tajima\\'s D value (1.57), and scans of haplotype homozygosity implicate several genomic regions as being under recent positive selection.

  16. Population genomic structure and adaptation in the zoonotic malaria parasite Plasmodium knowlesi

    KAUST Repository

    Assefa, Samuel; Lim, Caeul; Preston, Mark D.; Duffy, Craig W.; Nair, Mridul; Adroub, Sabir; Kadir, Khamisah A.; Goldberg, Jonathan M.; Neafsey, Daniel E.; Divis, Paul; Clark, Taane G.; Duraisingh, Manoj T.; Conway, David J.; Pain, Arnab; Singh, Balbir

    2015-01-01

    Malaria cases caused by the zoonotic parasite Plasmodium knowlesi are being increasingly reported throughout Southeast Asia and in travelers returning from the region. To test for evidence of signatures of selection or unusual population structure in this parasite, we surveyed genome sequence diversity in 48 clinical isolates recently sampled from Malaysian Borneo and in five lines maintained in laboratory rhesus macaques after isolation in the 1960s from Peninsular Malaysia and the Philippines. Overall genomewide nucleotide diversity (π = 6.03 × 10) was much higher than has been seen in worldwide samples of either of the major endemic malaria parasite species Plasmodium falciparum and Plasmodium vivax. A remarkable substructure is revealed within P. knowlesi, consisting of two major sympatric clusters of the clinical isolates and a third cluster comprising the laboratory isolates. There was deep differentiation between the two clusters of clinical isolates [mean genomewide fixation index (F) = 0.21, with 9,293 SNPs having fixed differences of F = 1.0]. This differentiation showed marked heterogeneity across the genome, with mean F values of different chromosomes ranging from 0.08 to 0.34 and with further significant variation across regions within several chromosomes. Analysis of the largest cluster (cluster 1, 38 isolates) indicated long-term population growth, with negatively skewed allele frequency distributions (genomewide average Tajima's D = -1.35). Against this background there was evidence of balancing selection on particular genes, including the circumsporozoite protein (csp) gene, which had the top Tajima's D value (1.57), and scans of haplotype homozygosity implicate several genomic regions as being under recent positive selection.

  17. Genomics of elite sporting performance: what little we know and necessary advances.

    Science.gov (United States)

    Pitsiladis, Yannis; Wang, Guan; Wolfarth, Bernd; Scott, Robert; Fuku, Noriyuki; Mikami, Eri; He, Zihong; Fiuza-Luces, Carmen; Eynon, Nir; Lucia, Alejandro

    2013-06-01

    Numerous reports of genetic associations with performance-related phenotypes have been published over the past three decades but there has been limited progress in discovering and characterising the genetic contribution to elite/world-class performance, mainly owing to few coordinated research efforts involving major funding initiatives/consortia and the use primarily of the candidate gene analysis approach. It is timely that exercise genomics research has moved into a new era utilising well-phenotyped, large cohorts and genome-wide technologies--approaches that have begun to elucidate the genetic basis of other complex traits/diseases. This review summarises the most recent and significant findings from sports genetics and explores future trends and possibilities.

  18. Research for genetic instability of human genome

    International Nuclear Information System (INIS)

    Hori, T.; Takahashi, E.; Tsuji, H.; Yamauchi, M.; Murata, M.

    1992-01-01

    In the present review paper, the potential relevance of chromosomal fragile sites to carcinogenesis and mutagenesis is discussed based on our own and other's studies. Recent evidence indicate that fragile sites may act as predisposition factors involved in chromosomal instability of the human genome and that the sites may be preferential targets for various DNA damaging agents including ionizing radiation. It is also demonstrated that some critical genomic rearrangements at the fragile sites may contribute towards oncogenesis and that individuals carrying heritable form of fragile site may be at the risk. Although clinical significance of autosomal fragile sites has been a matter of discussion, a fragile site of the X chromosome is known to be associated with an X-linked genetic diseases, called fragile X syndrome. Molecular events leading to the fragile X syndrome have recently been elucidated. The fragile X genotype can be characterized by an increased amount of p(CCG)n repeat DNA sequence in the FMR-1 gene and the repeated sequences are shown to be unstable in both meiosis and mitosis. These repeats might exhibit higher mutation rate than is generally seen in the human genome. Further studies on the fragile sites in molecular biology and radiation biology will yield relevant data to the molecular mechanisms of genetic instability of the human genome as well as to better assessment of genetic effect of ionizing radiation. (author)

  19. The Non-Genomic Actions of Vitamin D

    Directory of Open Access Journals (Sweden)

    Charles S Hii

    2016-03-01

    Full Text Available Since its discovery in 1920, a great deal of effort has gone into investigating the physiological actions of vitamin D and the impact its deficiency has on human health. Despite this intense interest, there is still disagreement on what constitutes the lower boundary of adequacy and on the Recommended Dietary Allowance. There has also been a major push to elucidate the biochemistry of vitamin D, its metabolic pathways and the mechanisms that mediate its action. Originally thought to act by altering the expression of target genes, it was realized in the mid-1980s that some of the actions of vitamin D were too rapid to be accounted for by changes at the genomic level. These rapid non-genomic actions have attracted as much interest as the genomic actions and they have spawned additional questions in an already busy field. This mini-review attempts to summarise the in vitro and in vivo work that has been conducted to characterise the rapid non-genomic actions, the mechanisms that give rise to these properties and the roles that these play in the overall action of vitamin D at the cellular level. Understanding the effects of vitamin D at the cellular level should enable the design of elegant human studies to extract the full potential of vitamin D to benefit human health.

  20. Comparative genome analyses reveal distinct structure in the saltwater crocodile MHC.

    Directory of Open Access Journals (Sweden)

    Weerachai Jaratlerdsiri

    Full Text Available The major histocompatibility complex (MHC is a dynamic genome region with an essential role in the adaptive immunity of vertebrates, especially antigen presentation. The MHC is generally divided into subregions (classes I, II and III containing genes of similar function across species, but with different gene number and organisation. Crocodylia (crocodilians are widely distributed and represent an evolutionary distinct group among higher vertebrates, but the genomic organisation of MHC within this lineage has been largely unexplored. Here, we studied the MHC region of the saltwater crocodile (Crocodylus porosus and compared it with that of other taxa. We characterised genomic clusters encompassing MHC class I and class II genes in the saltwater crocodile based on sequencing of bacterial artificial chromosomes. Six gene clusters spanning ∼452 kb were identified to contain nine MHC class I genes, six MHC class II genes, three TAP genes, and a TRIM gene. These MHC class I and class II genes were in separate scaffold regions and were greater in length (2-6 times longer than their counterparts in well-studied fowl B loci, suggesting that the compaction of avian MHC occurred after the crocodilian-avian split. Comparative analyses between the saltwater crocodile MHC and that from the alligator and gharial showed large syntenic areas (>80% identity with similar gene order. Comparisons with other vertebrates showed that the saltwater crocodile had MHC class I genes located along with TAP, consistent with birds studied. Linkage between MHC class I and TRIM39 observed in the saltwater crocodile resembled MHC in eutherians compared, but absent in avian MHC, suggesting that the saltwater crocodile MHC appears to have gene organisation intermediate between these two lineages. These observations suggest that the structure of the saltwater crocodile MHC, and other crocodilians, can help determine the MHC that was present in the ancestors of archosaurs.

  1. Genome-Wide Mapping of Structural Variations Reveals a Copy Number Variant That Determines Reproductive Morphology in Cucumber

    NARCIS (Netherlands)

    Zhang, Z.; Mao, L.; Chen, Junshi; Bu, F.; Li, G.; Sun, J.; Li, S.; Sun, H.; Jiao, C.; Blakely, R.; Pan, J.; Cai, R.; Luo, R.; Peer, Van de Y.; Jacobsen, E.; Fei, Z.; Huang, S.

    2015-01-01

    Structural variations (SVs) represent a major source of genetic diversity. However, the functional impact and formation mechanisms of SVs in plant genomes remain largely unexplored. Here, we report a nucleotide-resolution SV map of cucumber (Cucumis sativas) that comprises 26,788 SVs based on deep

  2. ProteinSplit: splitting of multi-domain proteins using prediction of ordered and disordered regions in protein sequences for virtual structural genomics

    International Nuclear Information System (INIS)

    Wyrwicz, Lucjan S; Koczyk, Grzegorz; Rychlewski, Leszek; Plewczynski, Dariusz

    2007-01-01

    The annotation of protein folds within newly sequenced genomes is the main target for semi-automated protein structure prediction (virtual structural genomics). A large number of automated methods have been developed recently with very good results in the case of single-domain proteins. Unfortunately, most of these automated methods often fail to properly predict the distant homology between a given multi-domain protein query and structural templates. Therefore a multi-domain protein should be split into domains in order to overcome this limitation. ProteinSplit is designed to identify protein domain boundaries using a novel algorithm that predicts disordered regions in protein sequences. The software utilizes various sequence characteristics to assess the local propensity of a protein to be disordered or ordered in terms of local structure stability. These disordered parts of a protein are likely to create interdomain spacers. Because of its speed and portability, the method was successfully applied to several genome-wide fold annotation experiments. The user can run an automated analysis of sets of proteins or perform semi-automated multiple user projects (saving the results on the server). Additionally the sequences of predicted domains can be sent to the Bioinfo.PL Protein Structure Prediction Meta-Server for further protein three-dimensional structure and function prediction. The program is freely accessible as a web service at http://lucjan.bioinfo.pl/proteinsplit together with detailed benchmark results on the critical assessment of a fully automated structure prediction (CAFASP) set of sequences. The source code of the local version of protein domain boundary prediction is available upon request from the authors

  3. Chromatin dynamics in genome stability

    DEFF Research Database (Denmark)

    Nair, Nidhi; Shoaib, Muhammad; Sørensen, Claus Storgaard

    2017-01-01

    Genomic DNA is compacted into chromatin through packaging with histone and non-histone proteins. Importantly, DNA accessibility is dynamically regulated to ensure genome stability. This is exemplified in the response to DNA damage where chromatin relaxation near genomic lesions serves to promote...... access of relevant enzymes to specific DNA regions for signaling and repair. Furthermore, recent data highlight genome maintenance roles of chromatin through the regulation of endogenous DNA-templated processes including transcription and replication. Here, we review research that shows the importance...... of chromatin structure regulation in maintaining genome integrity by multiple mechanisms including facilitating DNA repair and directly suppressing endogenous DNA damage....

  4. Comparisons of Copy Number, Genomic Structure, and Conserved Motifs for α-Amylase Genes from Barley, Rice, and Wheat

    Directory of Open Access Journals (Sweden)

    Qisen Zhang

    2017-10-01

    Full Text Available Barley is an important crop for the production of malt and beer. However, crops such as rice and wheat are rarely used for malting. α-amylase is the key enzyme that degrades starch during malting. In this study, we compared the genomic properties, gene copies, and conserved promoter motifs of α-amylase genes in barley, rice, and wheat. In all three crops, α-amylase consists of four subfamilies designated amy1, amy2, amy3, and amy4. In wheat and barley, members of amy1 and amy2 genes are localized on chromosomes 6 and 7, respectively. In rice, members of amy1 genes are found on chromosomes 1 and 2, and amy2 genes on chromosome 6. The barley genome has six amy1 members and three amy2 members. The wheat B genome contains four amy1 members and three amy2 members, while the rice genome has three amy1 members and one amy2 member. The B genome has mostly amy1 and amy2 members among the three wheat genomes. Amy1 promoters from all three crop genomes contain a GA-responsive complex consisting of a GA-responsive element (CAATAAA, pyrimidine box (CCTTTT and TATCCAT/C box. This study has shown that amy1 and amy2 from both wheat and barley have similar genomic properties, including exon/intron structures and GA-responsive elements on promoters, but these differ in rice. Like barley, wheat should have sufficient amy activity to degrade starch completely during malting. Other factors, such as high protein with haze issues and the lack of husk causing Lauting difficulty, may limit the use of wheat for brewing.

  5. NSD1 mutations generate a genome-wide DNA methylation signature.

    LENUS (Irish Health Repository)

    Choufani, S

    2015-12-22

    Sotos syndrome (SS) represents an important human model system for the study of epigenetic regulation; it is an overgrowth\\/intellectual disability syndrome caused by mutations in a histone methyltransferase, NSD1. As layered epigenetic modifications are often interdependent, we propose that pathogenic NSD1 mutations have a genome-wide impact on the most stable epigenetic mark, DNA methylation (DNAm). By interrogating DNAm in SS patients, we identify a genome-wide, highly significant NSD1(+\\/-)-specific signature that differentiates pathogenic NSD1 mutations from controls, benign NSD1 variants and the clinically overlapping Weaver syndrome. Validation studies of independent cohorts of SS and controls assigned 100% of these samples correctly. This highly specific and sensitive NSD1(+\\/-) signature encompasses genes that function in cellular morphogenesis and neuronal differentiation, reflecting cardinal features of the SS phenotype. The identification of SS-specific genome-wide DNAm alterations will facilitate both the elucidation of the molecular pathophysiology of SS and the development of improved diagnostic testing.

  6. Use of Whole-Genus Genome Sequence Data To Develop a Multilocus Sequence Typing Tool That Accurately Identifies Yersinia Isolates to the Species and Subspecies Levels

    Science.gov (United States)

    Hall, Miquette; Chattaway, Marie A.; Reuter, Sandra; Savin, Cyril; Strauch, Eckhard; Carniel, Elisabeth; Connor, Thomas; Van Damme, Inge; Rajakaruna, Lakshani; Rajendram, Dunstan; Jenkins, Claire; Thomson, Nicholas R.

    2014-01-01

    The genus Yersinia is a large and diverse bacterial genus consisting of human-pathogenic species, a fish-pathogenic species, and a large number of environmental species. Recently, the phylogenetic and population structure of the entire genus was elucidated through the genome sequence data of 241 strains encompassing every known species in the genus. Here we report the mining of this enormous data set to create a multilocus sequence typing-based scheme that can identify Yersinia strains to the species level to a level of resolution equal to that for whole-genome sequencing. Our assay is designed to be able to accurately subtype the important human-pathogenic species Yersinia enterocolitica to whole-genome resolution levels. We also report the validation of the scheme on 386 strains from reference laboratory collections across Europe. We propose that the scheme is an important molecular typing system to allow accurate and reproducible identification of Yersinia isolates to the species level, a process often inconsistent in nonspecialist laboratories. Additionally, our assay is the most phylogenetically informative typing scheme available for Y. enterocolitica. PMID:25339391

  7. Biochemical and structural insights into xylan utilization by the thermophilic bacterium Caldanaerobius polysaccharolyticus.

    Science.gov (United States)

    Han, Yejun; Agarwal, Vinayak; Dodd, Dylan; Kim, Jason; Bae, Brian; Mackie, Roderick I; Nair, Satish K; Cann, Isaac K O

    2012-10-12

    Hemicellulose is the next most abundant plant cell wall component after cellulose. The abundance of hemicellulose such as xylan suggests that their hydrolysis and conversion to biofuels can improve the economics of bioenergy production. In an effort to understand xylan hydrolysis at high temperatures, we sequenced the genome of the thermophilic bacterium Caldanaerobius polysaccharolyticus. Analysis of the partial genome sequence revealed a gene cluster that contained both hydrolytic enzymes and also enzymes key to the pentose-phosphate pathway. The hydrolytic enzymes in the gene cluster were demonstrated to convert products from a large endoxylanase (Xyn10A) predicted to anchor to the surface of the bacterium. We further use structural and calorimetric studies to demonstrate that the end products of Xyn10A hydrolysis of xylan are recognized and bound by XBP1, a putative solute-binding protein, likely for transport into the cell. The XBP1 protein showed preference for xylo-oligosaccharides as follows: xylotriose > xylobiose > xylotetraose. To elucidate the structural basis for the oligosaccharide preference, we solved the co-crystal structure of XBP1 complexed with xylotriose to a 1.8-Å resolution. Analysis of the biochemical data in the context of the co-crystal structure reveals the molecular underpinnings of oligosaccharide length specificity.

  8. GPCR-I-TASSER: A Hybrid Approach to G Protein-Coupled Receptor Structure Modeling and the Application to the Human Genome.

    Science.gov (United States)

    Zhang, Jian; Yang, Jianyi; Jang, Richard; Zhang, Yang

    2015-08-04

    Experimental structure determination remains difficult for G protein-coupled receptors (GPCRs). We propose a new hybrid protocol to construct GPCR structure models that integrates experimental mutagenesis data with ab initio transmembrane (TM) helix assembly simulations. The method was tested on 24 known GPCRs where the ab initio TM-helix assembly procedure constructed the correct fold for 20 cases. When combined with weak homology and sparse mutagenesis restraints, the method generated correct folds for all the tested cases with an average Cα root-mean-square deviation 2.4 Å in the TM regions. The new hybrid protocol was applied to model all 1,026 GPCRs in the human genome, where 923 have a high confidence score and are expected to have correct folds; these contain many pharmaceutically important families with no previously solved structures, including Trace amine, Prostanoids, Releasing hormones, Melanocortins, Vasopressin, and Neuropeptide Y receptors. The results demonstrate new progress on genome-wide structure modeling of TM proteins. Copyright © 2015 Elsevier Ltd. All rights reserved.

  9. Comparing Mycobacterium tuberculosis genomes using genome topology networks.

    Science.gov (United States)

    Jiang, Jianping; Gu, Jianlei; Zhang, Liang; Zhang, Chenyi; Deng, Xiao; Dou, Tonghai; Zhao, Guoping; Zhou, Yan

    2015-02-14

    Over the last decade, emerging research methods, such as comparative genomic analysis and phylogenetic study, have yielded new insights into genotypes and phenotypes of closely related bacterial strains. Several findings have revealed that genomic structural variations (SVs), including gene gain/loss, gene duplication and genome rearrangement, can lead to different phenotypes among strains, and an investigation of genes affected by SVs may extend our knowledge of the relationships between SVs and phenotypes in microbes, especially in pathogenic bacteria. In this work, we introduce a 'Genome Topology Network' (GTN) method based on gene homology and gene locations to analyze genomic SVs and perform phylogenetic analysis. Furthermore, the concept of 'unfixed ortholog' has been proposed, whose members are affected by SVs in genome topology among close species. To improve the precision of 'unfixed ortholog' recognition, a strategy to detect annotation differences and complete gene annotation was applied. To assess the GTN method, a set of thirteen complete M. tuberculosis genomes was analyzed as a case study. GTNs with two different gene homology-assigning methods were built, the Clusters of Orthologous Groups (COG) method and the orthoMCL clustering method, and two phylogenetic trees were constructed accordingly, which may provide additional insights into whole genome-based phylogenetic analysis. We obtained 24 unfixable COG groups, of which most members were related to immunogenicity and drug resistance, such as PPE-repeat proteins (COG5651) and transcriptional regulator TetR gene family members (COG1309). The GTN method has been implemented in PERL and released on our website. The tool can be downloaded from http://homepage.fudan.edu.cn/zhouyan/gtn/ , and allows re-annotating the 'lost' genes among closely related genomes, analyzing genes affected by SVs, and performing phylogenetic analysis. With this tool, many immunogenic-related and drug resistance-related genes

  10. Genomic Insights into Growth and Its Disorders: An Update

    Science.gov (United States)

    de Bruin, Christiaan; Dauber, Andrew

    2016-01-01

    Purpose of review To provide an update of the most striking new developments in the field of growth genetics over the past 12 months Recent findings A number of large genome-wide association studies have identified new genetic loci and pathways associated to human growth and adult height as well as related traits and comorbidities. New genetic etiologies of primordial dwarfism and several short stature syndromes have been elucidated. Moreover, a breakthrough finding of Xq26 microduplications as a cause of pituitary gigantism was made. Several new developments in imprinted growth-related genes (including the first human mutation in IGF-II) and novel insights into the epigenetic regulation of growth have been reported. Summary Genomic investigations continue to provide new insights into the genetic basis of human growth as well as its disorders. PMID:26702851

  11. Genomic rearrangement in radiation-induced murine myeloid leukemia

    International Nuclear Information System (INIS)

    Ishihara, Hiroshi

    1994-01-01

    After whole body irradiation of 3Gy X ray to C3H/He male mice, acute myeloid leukemia is induced at an incidence of 20 to 30% within 2 years. We have studied the mechanism of occurrence of this radiation-induced murine myeloid leukemia. Detection and isolation of genomic structural aberration which may be accumulated accompanied with leukemogenesis are helpful in analyzing the complicated molecular process from radiation damage to leukemogenesis. So, our research work was done in three phases. First, structures of previously characterized oncogenes and cytokine-related genes were analyzed, and abnormal structures of fms(protooncogene encoding M-CSF receptor gene)-related and myc-related genes were found in several leukemia cells. Additionally, genomic structural aberration of IL-3 gene was observed in some leukemia cells, so that construction of genomic libraries and cloning of the abnormal IL-3 genomic DNAs were performed to characterize the structure. Secondly, because the breakage of chromosome 2 that is frequently observed in myeloid leukemia locates in proximal position of IL-1 gene cluster in some cases, the copy number of IL-1 gene was determined and the gene was cloned. Lastly, the abnormal genome of leukemia cell was cloned by in-gel competence reassociation method. We discussed these findings and evaluated the analysis of the molecular process of leukemogenesis using these cloned genomic fragments. (author)

  12. Long-term response to genomic selection: effects of estimation method and reference population structure for different genetic architectures.

    Science.gov (United States)

    Bastiaansen, John W M; Coster, Albart; Calus, Mario P L; van Arendonk, Johan A M; Bovenhuis, Henk

    2012-01-24

    Genomic selection has become an important tool in the genetic improvement of animals and plants. The objective of this study was to investigate the impacts of breeding value estimation method, reference population structure, and trait genetic architecture, on long-term response to genomic selection without updating marker effects. Three methods were used to estimate genomic breeding values: a BLUP method with relationships estimated from genome-wide markers (GBLUP), a Bayesian method, and a partial least squares regression method (PLSR). A shallow (individuals from one generation) or deep reference population (individuals from five generations) was used with each method. The effects of the different selection approaches were compared under four different genetic architectures for the trait under selection. Selection was based on one of the three genomic breeding values, on pedigree BLUP breeding values, or performed at random. Selection continued for ten generations. Differences in long-term selection response were small. For a genetic architecture with a very small number of three to four quantitative trait loci (QTL), the Bayesian method achieved a response that was 0.05 to 0.1 genetic standard deviation higher than other methods in generation 10. For genetic architectures with approximately 30 to 300 QTL, PLSR (shallow reference) or GBLUP (deep reference) had an average advantage of 0.2 genetic standard deviation over the Bayesian method in generation 10. GBLUP resulted in 0.6% and 0.9% less inbreeding than PLSR and BM and on average a one third smaller reduction of genetic variance. Responses in early generations were greater with the shallow reference population while long-term response was not affected by reference population structure. The ranking of estimation methods was different with than without selection. Under selection, applying GBLUP led to lower inbreeding and a smaller reduction of genetic variance while a similar response to selection was

  13. Telomeres and viruses: common themes of genome maintenance

    Science.gov (United States)

    Deng, Zhong; Wang, Zhuo; Lieberman, Paul M.

    2012-01-01

    Genome maintenance mechanisms actively suppress genetic instability associated with cancer and aging. Some viruses provoke genetic instability by subverting the host’s control of genome maintenance. Viruses have their own specialized strategies for genome maintenance, which can mimic and modify host cell processes. Here, we review some of the common features of genome maintenance utilized by viruses and host chromosomes, with a particular focus on terminal repeat (TR) elements. The TRs of cellular chromosomes, better known as telomeres, have well-established roles in cellular chromosome stability. Cellular telomeres are themselves maintained by viral-like mechanisms, including self-propagation by reverse transcription, recombination, and retrotransposition. Viral TR elements, like cellular telomeres, are essential for viral genome stability and propagation. We review the structure and function of viral repeat elements and discuss how they may share telomere-like structures and genome protection functions. We consider how viral infections modulate telomere regulatory factors for viral repurposing and can alter normal host telomere structure and chromosome stability. Understanding the common strategies of viral and cellular genome maintenance may provide new insights into viral–host interactions and the mechanisms driving genetic instability in cancer. PMID:23293769

  14. Complete chloroplast genome sequence of Elodea canadensis and comparative analyses with other monocot plastid genomes.

    Science.gov (United States)

    Huotari, Tea; Korpelainen, Helena

    2012-10-15

    Elodea canadensis is an aquatic angiosperm native to North America. It has attracted great attention due to its invasive nature when transported to new areas in its non-native range. We have determined the complete nucleotide sequence of the chloroplast (cp) genome of Elodea. Taxonomically Elodea is a basal monocot, and only few monocot cp genomes representing early lineages of monocots have been sequenced so far. The genome is a circular double-stranded DNA molecule 156,700 bp in length, and has a typical structure with large (LSC 86,194 bp) and small (SSC 17,810 bp) single-copy regions separated by a pair of inverted repeats (IRs 26,348 bp each). The Elodea cp genome contains 113 unique genes and 16 duplicated genes in the IR regions. A comparative analysis showed that the gene order and organization of the Elodea cp genome is almost identical to that of Amborella trichopoda, a basal angiosperm. The structure of IRs in Elodea is unique among monocot species with the whole cp genome sequenced. In Elodea and another monocot Lemna minor the borders between IRs and LSC are located upstream of rps 19 gene and downstream of trnH-GUG gene, while in most monocots, IR has extended to include both trnH and rps 19 genes. A phylogenetic analysis conducted using Bayesian method, based on the DNA sequences of 81 chloroplast genes from 17 monocot taxa provided support for the placement of Elodea together with Lemna as a basal monocot and the next diverging lineage of monocots after Acorales. In comparison with other monocots, the Elodea cp genome has gone through only few rearrangements or gene losses. IR of Elodea has a unique structure among the monocot species studied so far as its structure is similar to that of a basal angiosperm Amborella. This result together with phylogenetic analyses supports the placement of Elodea as a basal monocot to the next diverging lineage of monocots after Acorales. So far, only few cp genomes representing early lineages of monocots have been

  15. Calculation of 3D genome structures for comparison of chromosome conformation capture experiments with microscopy: An evaluation of single-cell Hi-C protocols.

    Science.gov (United States)

    Lando, David; Stevens, Tim J; Basu, Srinjan; Laue, Ernest D

    2018-01-01

    Single-cell chromosome conformation capture approaches are revealing the extent of cell-to-cell variability in the organization and packaging of genomes. These single-cell methods, unlike their multi-cell counterparts, allow straightforward computation of realistic chromosome conformations that may be compared and combined with other, independent, techniques to study 3D structure. Here we discuss how single-cell Hi-C and subsequent 3D genome structure determination allows comparison with data from microscopy. We then carry out a systematic evaluation of recently published single-cell Hi-C datasets to establish a computational approach for the evaluation of single-cell Hi-C protocols. We show that the calculation of genome structures provides a useful tool for assessing the quality of single-cell Hi-C data because it requires a self-consistent network of interactions, relating to the underlying 3D conformation, with few errors, as well as sufficient longer-range cis- and trans-chromosomal contacts.

  16. Elucidation of the biosynthesis of meroterpenoid yanuthone D in Aspergillus Niger

    DEFF Research Database (Denmark)

    Holm, Dorte Koefoed; Petersen, Lene Maj; Klitgaard, Andreas

    2012-01-01

    We have elucidated the mode of biosynthesis of the meroterpenoid compound Yanuthone D in Aspergillus niger. We have successfully deleted all cluster genes, and identified a number of intermediates. Structures of the intermediates were solved using a combined approach comprising classical 1D- and 2D......-NMR and tandem mass spectrometry (MS/MS). In this study we have confirmed that Yanuthone D is of meroterpenoid origin, and we have identified an unexpected precursor, which has not before been reported for Aspergillus niger....

  17. Draft Genome Sequence of Bacillus sp. FMQ74, a Dairy-contaminating Isolate from Raw Milk

    DEFF Research Database (Denmark)

    Okshevsky, Mira Ursula; Regina, Viduthalai R.; Marshall, Ian

    2017-01-01

    Representatives of the genus Bacillus are common milk contaminants that cause spoilage and flavor alterations of dairy products. Bacillus sp. FMQ74 was isolated from raw milk on a Danish dairy farm. To elucidate the genomic basis of this strain’s survival in the dairy industry, a high-quality draft...

  18. Genome sequence of the thermophile Bacillus coagulans Hammer, the type strain of the species.

    Science.gov (United States)

    Su, Fei; Tao, Fei; Tang, Hongzhi; Xu, Ping

    2012-11-01

    Here we announce a 3.0-Mb assembly of the Bacillus coagulans Hammer strain, which is the type strain of the species within the genus Bacillus. Genomic analyses based on the sequence may provide insights into the phylogeny of the species and help to elucidate characteristics of the poorly studied strains of Bacillus coagulans.

  19. Genome Sequence of the Thermophile Bacillus coagulans Hammer, the Type Strain of the Species

    OpenAIRE

    Su, Fei; Tao, Fei; Tang, Hongzhi; Xu, Ping

    2012-01-01

    Here we announce a 3.0-Mb assembly of the Bacillus coagulans Hammer strain, which is the type strain of the species within the genus Bacillus. Genomic analyses based on the sequence may provide insights into the phylogeny of the species and help to elucidate characteristics of the poorly studied strains of Bacillus coagulans.

  20. Comparative genome analysis of Bacillus cereus group genomes withBacillus subtilis

    Energy Technology Data Exchange (ETDEWEB)

    Anderson, Iain; Sorokin, Alexei; Kapatral, Vinayak; Reznik, Gary; Bhattacharya, Anamitra; Mikhailova, Natalia; Burd, Henry; Joukov, Victor; Kaznadzey, Denis; Walunas, Theresa; D' Souza, Mark; Larsen, Niels; Pusch,Gordon; Liolios, Konstantinos; Grechkin, Yuri; Lapidus, Alla; Goltsman,Eugene; Chu, Lien; Fonstein, Michael; Ehrlich, S. Dusko; Overbeek, Ross; Kyrpides, Nikos; Ivanova, Natalia

    2005-09-14

    Genome features of the Bacillus cereus group genomes (representative strains of Bacillus cereus, Bacillus anthracis and Bacillus thuringiensis sub spp israelensis) were analyzed and compared with the Bacillus subtilis genome. A core set of 1,381 protein families among the four Bacillus genomes, with an additional set of 933 families common to the B. cereus group, was identified. Differences in signal transduction pathways, membrane transporters, cell surface structures, cell wall, and S-layer proteins suggesting differences in their phenotype were identified. The B. cereus group has signal transduction systems including a tyrosine kinase related to two-component system histidine kinases from B. subtilis. A model for regulation of the stress responsive sigma factor sigmaB in the B. cereus group different from the well studied regulation in B. subtilis has been proposed. Despite a high degree of chromosomal synteny among these genomes, significant differences in cell wall and spore coat proteins that contribute to the survival and adaptation in specific hosts has been identified.

  1. Functional annotation of the genome unravels probiotic potential of Bacillus coagulans HS243.

    Science.gov (United States)

    Kapse, N G; Engineer, A S; Gowdaman, V; Wagh, S; Dhakephalkar, P K

    2018-05-30

    Spore forming Bacillus species are widely used as probiotics for human dietary supplements and in animal feeds. However, information on genetic basis of their probiotic action is obscure. Therefore, the present investigation was undertaken to elucidate probiotic traits of B. coagulans HS243 through its genome analysis. Genome mining revealed the presence of an arsenal of marker genes attributed to genuine probiotic traits. In silico analysis of HS243 genome revealed the presence of multi subunit ATPases, ADI pathway genes, chologlycine hydrolase, adhesion proteins for surviving and colonizing harsh gastric transit. HS243 genome harbored vitamin and essential amino acid biosynthetic genes, suggesting the use of HS243 as a nutrient supplement. Bacteriocin producing genes highlighted the disease preventing potential of HS243. Thus, this work established that HS243 possessed the genetic repertoire required for surviving harsh gastric transit and conferring health benefits to the host which were further validated by wet lab evidences. Copyright © 2018. Published by Elsevier Inc.

  2. Supplementary Material for: Mycobacterium tuberculosis whole genome sequencing and protein structure modelling provides insights into anti-tuberculosis drug resistance

    KAUST Repository

    Phelan, Jody; Coll, Francesc; McNerney, Ruth; Ascher, David; Pires, Douglas; Furnham, Nick; Coeck, Nele; Hill-Cawthorne, Grant; Nair, Mridul; Mallard, Kim; Ramsay, Andrew; Campino, Susana; Hibberd, Martin; Pain, Arnab; Rigouts, Leen; Clark, Taane

    2016-01-01

    Abstract Background Combating the spread of drug resistant tuberculosis is a global health priority. Whole genome association studies are being applied to identify genetic determinants of resistance to anti-tuberculosis drugs. Protein structure

  3. 3D modeling of genome macroorganization on the basis of its structural changes after action of radiation

    International Nuclear Information System (INIS)

    Aleksandrov, I.D.; Aleksandrova, M.V.; Zaikin, N.S.; Koren'kov, V.V.; Pervushova, O.V.; Stepanenko, V.A.

    2006-01-01

    At present, after 120 years of the theoretical and experimental works, the issue of the genome macroarchitecture as the highest level of interphase chromosome organization in somatic cell nuclei remains still unresolved. The problem of the spatial arrangement of interphase chromosomes in haploid germ cells has never even been studied. A 3D simulation of packaging of the entire second chromosome in Drosophila mature sperms has been performed by using mathematical approaches and visualization methods to present macromolecular structure data. As genetic markers for simulation, frequency and location of the second inversion breakpoints for 72 structural υg mutants induced by ionizing radiation were used supposing that both ends of each inversion are topologically brought together forming loop of appropriate size. For the account of a degree of spatial affinity and visualization of chromosomal loops modern 3D-modeling methods with application of splines, libraries OpenGL, language Delphi, program Gmax were used. According to the model proposed, the entire second chromosome within mature sperm nuclei seems to be packaged in the form of a megarosette-loop structure which may be a basic principle of organization of the genome macro-architecture in animal haploid germ cells

  4. Ribosomal DNA sequence heterogeneity reflects intraspecies phylogenies and predicts genome structure in two contrasting yeast species.

    Science.gov (United States)

    West, Claire; James, Stephen A; Davey, Robert P; Dicks, Jo; Roberts, Ian N

    2014-07-01

    The ribosomal RNA encapsulates a wealth of evolutionary information, including genetic variation that can be used to discriminate between organisms at a wide range of taxonomic levels. For example, the prokaryotic 16S rDNA sequence is very widely used both in phylogenetic studies and as a marker in metagenomic surveys and the internal transcribed spacer region, frequently used in plant phylogenetics, is now recognized as a fungal DNA barcode. However, this widespread use does not escape criticism, principally due to issues such as difficulties in classification of paralogous versus orthologous rDNA units and intragenomic variation, both of which may be significant barriers to accurate phylogenetic inference. We recently analyzed data sets from the Saccharomyces Genome Resequencing Project, characterizing rDNA sequence variation within multiple strains of the baker's yeast Saccharomyces cerevisiae and its nearest wild relative Saccharomyces paradoxus in unprecedented detail. Notably, both species possess single locus rDNA systems. Here, we use these new variation datasets to assess whether a more detailed characterization of the rDNA locus can alleviate the second of these phylogenetic issues, sequence heterogeneity, while controlling for the first. We demonstrate that a strong phylogenetic signal exists within both datasets and illustrate how they can be used, with existing methodology, to estimate intraspecies phylogenies of yeast strains consistent with those derived from whole-genome approaches. We also describe the use of partial Single Nucleotide Polymorphisms, a type of sequence variation found only in repetitive genomic regions, in identifying key evolutionary features such as genome hybridization events and show their consistency with whole-genome Structure analyses. We conclude that our approach can transform rDNA sequence heterogeneity from a problem to a useful source of evolutionary information, enabling the estimation of highly accurate phylogenies of

  5. Identification and classification of conserved RNA secondary structures in the human genome

    DEFF Research Database (Denmark)

    Pedersen, Jakob Skou; Bejerano, Gill; Siepel, Adam

    2006-01-01

    The discoveries of microRNAs and riboswitches, among others, have shown functional RNAs to be biologically more important and genomically more prevalent than previously anticipated. We have developed a general comparative genomics method based on phylogenetic stochastic context-free grammars...... for identifying functional RNAs encoded in the human genome and used it to survey an eight-way genome-wide alignment of the human, chimpanzee, mouse, rat, dog, chicken, zebra-fish, and puffer-fish genomes for deeply conserved functional RNAs. At a loose threshold for acceptance, this search resulted in a set......, the results nevertheless provide evidence for many new human functional RNAs and present specific predictions to facilitate their further characterization....

  6. Application of Genomic Technologies to the Breeding of Trees.

    Science.gov (United States)

    Badenes, Maria L; Fernández I Martí, Angel; Ríos, Gabino; Rubio-Cabetas, María J

    2016-01-01

    The recent introduction of next generation sequencing (NGS) technologies represents a major revolution in providing new tools for identifying the genes and/or genomic intervals controlling important traits for selection in breeding programs. In perennial fruit trees with long generation times and large sizes of adult plants, the impact of these techniques is even more important. High-throughput DNA sequencing technologies have provided complete annotated sequences in many important tree species. Most of the high-throughput genotyping platforms described are being used for studies of genetic diversity and population structure. Dissection of complex traits became possible through the availability of genome sequences along with phenotypic variation data, which allow to elucidate the causative genetic differences that give rise to observed phenotypic variation. Association mapping facilitates the association between genetic markers and phenotype in unstructured and complex populations, identifying molecular markers for assisted selection and breeding. Also, genomic data provide in silico identification and characterization of genes and gene families related to important traits, enabling new tools for molecular marker assisted selection in tree breeding. Deep sequencing of transcriptomes is also a powerful tool for the analysis of precise expression levels of each gene in a sample. It consists in quantifying short cDNA reads, obtained by NGS technologies, in order to compare the entire transcriptomes between genotypes and environmental conditions. The miRNAs are non-coding short RNAs involved in the regulation of different physiological processes, which can be identified by high-throughput sequencing of RNA libraries obtained by reverse transcription of purified short RNAs, and by in silico comparison with known miRNAs from other species. All together, NGS techniques and their applications have increased the resources for plant breeding in tree species, closing the

  7. Deep whole-genome sequencing of 90 Han Chinese genomes.

    Science.gov (United States)

    Lan, Tianming; Lin, Haoxiang; Zhu, Wenjuan; Laurent, Tellier Christian Asker Melchior; Yang, Mengcheng; Liu, Xin; Wang, Jun; Wang, Jian; Yang, Huanming; Xu, Xun; Guo, Xiaosen

    2017-09-01

    Next-generation sequencing provides a high-resolution insight into human genetic information. However, the focus of previous studies has primarily been on low-coverage data due to the high cost of sequencing. Although the 1000 Genomes Project and the Haplotype Reference Consortium have both provided powerful reference panels for imputation, low-frequency and novel variants remain difficult to discover and call with accuracy on the basis of low-coverage data. Deep sequencing provides an optimal solution for the problem of these low-frequency and novel variants. Although whole-exome sequencing is also a viable choice for exome regions, it cannot account for noncoding regions, sometimes resulting in the absence of important, causal variants. For Han Chinese populations, the majority of variants have been discovered based upon low-coverage data from the 1000 Genomes Project. However, high-coverage, whole-genome sequencing data are limited for any population, and a large amount of low-frequency, population-specific variants remain uncharacterized. We have performed whole-genome sequencing at a high depth (∼×80) of 90 unrelated individuals of Chinese ancestry, collected from the 1000 Genomes Project samples, including 45 Northern Han Chinese and 45 Southern Han Chinese samples. Eighty-three of these 90 have been sequenced by the 1000 Genomes Project. We have identified 12 568 804 single nucleotide polymorphisms, 2 074 210 short InDels, and 26 142 structural variations from these 90 samples. Compared to the Han Chinese data from the 1000 Genomes Project, we have found 7 000 629 novel variants with low frequency (defined as minor allele frequency genome. Compared to the 1000 Genomes Project, these Han Chinese deep sequencing data enhance the characterization of a large number of low-frequency, novel variants. This will be a valuable resource for promoting Chinese genetics research and medical development. Additionally, it will provide a valuable supplement to the 1000

  8. Genome Scan for Selection in Structured Layer Chicken Populations Exploiting Linkage Disequilibrium Information.

    Directory of Open Access Journals (Sweden)

    Mahmood Gholami

    Full Text Available An increasing interest is being placed in the detection of genes, or genomic regions, that have been targeted by selection because identifying signatures of selection can lead to a better understanding of genotype-phenotype relationships. A common strategy for the detection of selection signatures is to compare samples from distinct populations and to search for genomic regions with outstanding genetic differentiation. The aim of this study was to detect selective signatures in layer chicken populations using a recently proposed approach, hapFLK, which exploits linkage disequilibrium information while accounting appropriately for the hierarchical structure of populations. We performed the analysis on 70 individuals from three commercial layer breeds (White Leghorn, White Rock and Rhode Island Red, genotyped for approximately 1 million SNPs. We found a total of 41 and 107 regions with outstanding differentiation or similarity using hapFLK and its single SNP counterpart FLK respectively. Annotation of selection signature regions revealed various genes and QTL corresponding to productions traits, for which layer breeds were selected. A number of the detected genes were associated with growth and carcass traits, including IGF-1R, AGRP and STAT5B. We also annotated an interesting gene associated with the dark brown feather color mutational phenotype in chickens (SOX10. We compared FST, FLK and hapFLK and demonstrated that exploiting linkage disequilibrium information and accounting for hierarchical population structure decreased the false detection rate.

  9. Identification of the elementary structural units of the DNA damage response.

    Science.gov (United States)

    Natale, Francesco; Rapp, Alexander; Yu, Wei; Maiser, Andreas; Harz, Hartmann; Scholl, Annina; Grulich, Stephan; Anton, Tobias; Hörl, David; Chen, Wei; Durante, Marco; Taucher-Scholz, Gisela; Leonhardt, Heinrich; Cardoso, M Cristina

    2017-06-12

    Histone H2AX phosphorylation is an early signalling event triggered by DNA double-strand breaks (DSBs). To elucidate the elementary units of phospho-H2AX-labelled chromatin, we integrate super-resolution microscopy of phospho-H2AX during DNA repair in human cells with genome-wide sequencing analyses. Here we identify phospho-H2AX chromatin domains in the nanometre range with median length of ∼75 kb. Correlation analysis with over 60 genomic features shows a time-dependent euchromatin-to-heterochromatin repair trend. After X-ray or CRISPR-Cas9-mediated DSBs, phospho-H2AX-labelled heterochromatin exhibits DNA decondensation while retaining heterochromatic histone marks, indicating that chromatin structural and molecular determinants are uncoupled during repair. The phospho-H2AX nano-domains arrange into higher-order clustered structures of discontinuously phosphorylated chromatin, flanked by CTCF. CTCF knockdown impairs spreading of the phosphorylation throughout the 3D-looped nano-domains. Co-staining of phospho-H2AX with phospho-Ku70 and TUNEL reveals that clusters rather than nano-foci represent single DSBs. Hence, each chromatin loop is a nano-focus, whose clusters correspond to previously known phospho-H2AX foci.

  10. Nationwide Genomic Study in Denmark Reveals Remarkable Population Homogeneity.

    Science.gov (United States)

    Athanasiadis, Georgios; Cheng, Jade Y; Vilhjálmsson, Bjarni J; Jørgensen, Frank G; Als, Thomas D; Le Hellard, Stephanie; Espeseth, Thomas; Sullivan, Patrick F; Hultman, Christina M; Kjærgaard, Peter C; Schierup, Mikkel H; Mailund, Thomas

    2016-10-01

    Denmark has played a substantial role in the history of Northern Europe. Through a nationwide scientific outreach initiative, we collected genetic and anthropometrical data from ∼800 high school students and used them to elucidate the genetic makeup of the Danish population, as well as to assess polygenic predictions of phenotypic traits in adolescents. We observed remarkable homogeneity across different geographic regions, although we could still detect weak signals of genetic structure reflecting the history of the country. Denmark presented genomic affinity with primarily neighboring countries with overall resemblance of decreasing weight from Britain, Sweden, Norway, Germany, and France. A Polish admixture signal was detected in Zealand and Funen, and our date estimates coincided with historical evidence of Wend settlements in the south of Denmark. We also observed considerably diverse demographic histories among Scandinavian countries, with Denmark having the smallest current effective population size compared to Norway and Sweden. Finally, we found that polygenic prediction of self-reported adolescent height in the population was remarkably accurate (R 2 = 0.639 ± 0.015). The high homogeneity of the Danish population could render population structure a lesser concern for the upcoming large-scale gene-mapping studies in the country. Copyright © 2016 by the Genetics Society of America.

  11. Collaborative Genomics Study Advances Precision Oncology

    Science.gov (United States)

    A collaborative study conducted by two Office of Cancer Genomics (OCG) initiatives highlights the importance of integrating structural and functional genomics programs to improve cancer therapies, and more specifically, contribute to precision oncology treatments for children.

  12. Deriving a Mutation Index of Carcinogenicity Using Protein Structure and Protein Interfaces

    Science.gov (United States)

    Hakas, Jarle; Pearl, Frances; Zvelebil, Marketa

    2014-01-01

    With the advent of Next Generation Sequencing the identification of mutations in the genomes of healthy and diseased tissues has become commonplace. While much progress has been made to elucidate the aetiology of disease processes in cancer, the contributions to disease that many individual mutations make remain to be characterised and their downstream consequences on cancer phenotypes remain to be understood. Missense mutations commonly occur in cancers and their consequences remain challenging to predict. However, this knowledge is becoming more vital, for both assessing disease progression and for stratifying drug treatment regimes. Coupled with structural data, comprehensive genomic databases of mutations such as the 1000 Genomes project and COSMIC give an opportunity to investigate general principles of how cancer mutations disrupt proteins and their interactions at the molecular and network level. We describe a comprehensive comparison of cancer and neutral missense mutations; by combining features derived from structural and interface properties we have developed a carcinogenicity predictor, InCa (Index of Carcinogenicity). Upon comparison with other methods, we observe that InCa can predict mutations that might not be detected by other methods. We also discuss general limitations shared by all predictors that attempt to predict driver mutations and discuss how this could impact high-throughput predictions. A web interface to a server implementation is publicly available at http://inca.icr.ac.uk/. PMID:24454733

  13. Asparagine-linked oligosaccharides on lutropin, follitropin, and thyrotropin: structural elucidation of the sulfated and sialylated oligosaccharides on bovine, ovine, and human pituitary glycoprotein hormones

    International Nuclear Information System (INIS)

    Green, E.D.; Baenziger, J.U.

    1988-01-01

    The authors have elucidated the structures of the anionic asparagine-linked oligosaccharides present on the glycoprotein hormones lutropin (luteinizing hormone), follitropin (follicle-stimulating hormone), and thyrotropin (thyroid-stimulating hormone). Purified hormones, isolated from bovine, ovine, and human pituitaries, were digested with N-glycanase, and the released oligosaccharides were reduced with NaB[ 3 H] 4 . The 3 H-labeled oligosaccharides from each hormone were then fractionated by anion-exchange high performance liquid chromatography (HPLC) into populations differing in the number of sulfate and/or sialic acid moieties. The sulfated, sialylated, and sulfated/sialylated structures, which together comprised 67-90% of the asparagine-linked oligosaccharides on the pituitary glycoprotein hormones, were highly heterogeneous and displayed hormone- as well as animal species-specific features. A previously uncharacterized dibranched oligosaccharide, bearing one residue each of sulfate and sialic acid, was found on all of the hormones except bovine lutropin. In this study, they describe the purification and detailed structural characterizations of the sulfated, sialylated, and sulfated/sialylated oligosaccharides found on lutropin, follitropin, and thyrotropin from several animal species

  14. Secondary Analysis and Integration of Existing Data to Elucidate the Genetic Architecture of Cancer Risk and Related Outcomes, R21 | Informatics Technology for Cancer Research (ITCR)

    Science.gov (United States)

    This funding opportunity announcement (FOA) encourages applications that propose to conduct secondary data analysis and integration of existing datasets and database resources, with the ultimate aim to elucidate the genetic architecture of cancer risk and related outcomes. The goal of this initiative is to address key scientific questions relevant to cancer epidemiology by supporting the analysis of existing genetic or genomic datasets, possibly in combination with environmental, outcomes, behavioral, lifestyle, and molecular profiles data.

  15. Secondary Analysis and Integration of Existing Data to Elucidate the Genetic Architecture of Cancer Risk and Related Outcomes, R01 | Informatics Technology for Cancer Research (ITCR)

    Science.gov (United States)

    This funding opportunity announcement (FOA) encourages applications that propose to conduct secondary data analysis and integration of existing datasets and database resources, with the ultimate aim to elucidate the genetic architecture of cancer risk and related outcomes. The goal of this initiative is to address key scientific questions relevant to cancer epidemiology by supporting the analysis of existing genetic or genomic datasets, possibly in combination with environmental, outcomes, behavioral, lifestyle, and molecular profiles data.

  16. Genome sequencing and analysis of the first complete genome of Lactobacillus kunkeei strain MP2, an Apis mellifera gut isolate

    Directory of Open Access Journals (Sweden)

    Freddy Asenjo

    2016-04-01

    Full Text Available Background. The honey bee (Apis mellifera is the most important pollinator in agriculture worldwide. However, the number of honey bees has fallen significantly since 2006, becoming a huge ecological problem nowadays. The principal cause is CCD, or Colony Collapse Disorder, characterized by the seemingly spontaneous abandonment of hives by their workers. One of the characteristics of CCD in honey bees is the alteration of the bacterial communities in their gastrointestinal tract, mainly due to the decrease of Firmicutes populations, such as the Lactobacilli. At this time, the causes of these alterations remain unknown. We recently isolated a strain of Lactobacillus kunkeei (L. kunkeei strain MP2 from the gut of Chilean honey bees. L. kunkeei, is one of the most commonly isolated bacterium from the honey bee gut and is highly versatile in different ecological niches. In this study, we aimed to elucidate in detail, the L. kunkeei genetic background and perform a comparative genome analysis with other Lactobacillus species. Methods. L. kunkeei MP2 was originally isolated from the guts of Chilean A. mellifera individuals. Genome sequencing was done using Pacific Biosciences single-molecule real-time sequencing technology. De novo assembly was performed using Celera assembler. The genome was annotated using Prokka, and functional information was added using the EggNOG 3.1 database. In addition, genomic islands were predicted using IslandViewer, and pro-phage sequences using PHAST. Comparisons between L. kunkeei MP2 with other L. kunkeei, and Lactobacillus strains were done using Roary. Results. The complete genome of L. kunkeei MP2 comprises one circular chromosome of 1,614,522 nt. with a GC content of 36,9%. Pangenome analysis with 16 L. kunkeei strains, identified 113 unique genes, most of them related to phage insertions. A large and unique region of L. kunkeei MP2 genome contains several genes that encode for phage structural protein and

  17. Supplementary Material for: Mycobacterium tuberculosis whole genome sequencing and protein structure modelling provides insights into anti-tuberculosis drug resistance

    KAUST Repository

    Phelan, Jody

    2016-01-01

    Abstract Background Combating the spread of drug resistant tuberculosis is a global health priority. Whole genome association studies are being applied to identify genetic determinants of resistance to anti-tuberculosis drugs. Protein structure and interaction modelling are used to understand the functional effects of putative mutations and provide insight into the molecular mechanisms leading to resistance. Methods To investigate the potential utility of these approaches, we analysed the genomes of 144 Mycobacterium tuberculosis clinical isolates from The Special Programme for Research and Training in Tropical Diseases (TDR) collection sourced from 20 countries in four continents. A genome-wide approach was applied to 127 isolates to identify polymorphisms associated with minimum inhibitory concentrations for first-line anti-tuberculosis drugs. In addition, the effect of identified candidate mutations on protein stability and interactions was assessed quantitatively with well-established computational methods. Results The analysis revealed that mutations in the genes rpoB (rifampicin), katG (isoniazid), inhA-promoter (isoniazid), rpsL (streptomycin) and embB (ethambutol) were responsible for the majority of resistance observed. A subset of the mutations identified in rpoB and katG were predicted to affect protein stability. Further, a strong direct correlation was observed between the minimum inhibitory concentration values and the distance of the mutated residues in the three-dimensional structures of rpoB and katG to their respective drugs binding sites. Conclusions Using the TDR resource, we demonstrate the usefulness of whole genome association and convergent evolution approaches to detect known and potentially novel mutations associated with drug resistance. Further, protein structural modelling could provide a means of predicting the impact of polymorphisms on drug efficacy in the absence of phenotypic data. These approaches could ultimately lead to novel

  18. The subclonal structure and genomic evolution of oral squamous cell carcinoma revealed by ultra-deep sequencing

    DEFF Research Database (Denmark)

    Tabatabaeifar, Siavosh; Thomassen, Mads; Larsen, Martin J

    2017-01-01

    Recent studies suggest that head and neck squamous cell carcinomas are very heterogeneous between patients; however the subclonal structure remains unexplored mainly due to studies using only a single biopsy per patient. To deconvolutethe clonal structure and describe the genomic cancer evolution......, we applied whole-exome sequencing combined with ultra-deep targeted sequencing on oral squamous cell carcinomas (OSCC). From each patient, a set of biopsies was sampled from distinct geographical sites in primary tumor and lymph node metastasis.We demonstrate that the included OSCCs show a high...

  19. Human-specific HERV-K insertion causes genomic variations in the human genome.

    Directory of Open Access Journals (Sweden)

    Wonseok Shin

    Full Text Available Human endogenous retroviruses (HERV sequences account for about 8% of the human genome. Through comparative genomics and literature mining, we identified a total of 29 human-specific HERV-K insertions. We characterized them focusing on their structure and flanking sequence. The results showed that four of the human-specific HERV-K insertions deleted human genomic sequences via non-classical insertion mechanisms. Interestingly, two of the human-specific HERV-K insertion loci contained two HERV-K internals and three LTR elements, a pattern which could be explained by LTR-LTR ectopic recombination or template switching. In addition, we conducted a polymorphic test and observed that twelve out of the 29 elements are polymorphic in the human population. In conclusion, human-specific HERV-K elements have inserted into human genome since the divergence of human and chimpanzee, causing human genomic changes. Thus, we believe that human-specific HERV-K activity has contributed to the genomic divergence between humans and chimpanzees, as well as within the human population.

  20. Genome-wide analysis of WRKY gene family in the sesame genome and identification of the WRKY genes involved in responses to abiotic stresses.

    Science.gov (United States)

    Li, Donghua; Liu, Pan; Yu, Jingyin; Wang, Linhai; Dossa, Komivi; Zhang, Yanxin; Zhou, Rong; Wei, Xin; Zhang, Xiurong

    2017-09-11

    Sesame (Sesamum indicum L.) is one of the world's most important oil crops. However, it is susceptible to abiotic stresses in general, and to waterlogging and drought stresses in particular. The molecular mechanisms of abiotic stress tolerance in sesame have not yet been elucidated. The WRKY domain transcription factors play significant roles in plant growth, development, and responses to stresses. However, little is known about the number, location, structure, molecular phylogenetics, and expression of the WRKY genes in sesame. We performed a comprehensive study of the WRKY gene family in sesame and identified 71 SiWRKYs. In total, 65 of these genes were mapped to 15 linkage groups within the sesame genome. A phylogenetic analysis was performed using a related species (Arabidopsis thaliana) to investigate the evolution of the sesame WRKY genes. Tissue expression profiles of the WRKY genes demonstrated that six SiWRKY genes were highly expressed in all organs, suggesting that these genes may be important for plant growth and organ development in sesame. Analysis of the SiWRKY gene expression patterns revealed that 33 and 26 SiWRKYs respond strongly to waterlogging and drought stresses, respectively. Changes in the expression of 12 SiWRKY genes were observed at different times after the waterlogging and drought treatments had begun, demonstrating that sesame gene expression patterns vary in response to abiotic stresses. In this study, we analyzed the WRKY family of transcription factors encoded by the sesame genome. Insight was gained into the classification, evolution, and function of the SiWRKY genes, revealing their putative roles in a variety of tissues. Responses to abiotic stresses in different sesame cultivars were also investigated. The results of our study provide a better understanding of the structures and functions of sesame WRKY genes and suggest that manipulating these WRKYs could enhance resistance to waterlogging and drought.