WorldWideScience

Sample records for progressivemauve multiple genome

  1. Design of multiple sequence alignment algorithms on parallel, distributed memory supercomputers.

    Science.gov (United States)

    Church, Philip C; Goscinski, Andrzej; Holt, Kathryn; Inouye, Michael; Ghoting, Amol; Makarychev, Konstantin; Reumann, Matthias

    2011-01-01

    The challenge of comparing two or more genomes that have undergone recombination and substantial amounts of segmental loss and gain has recently been addressed for small numbers of genomes. However, datasets of hundreds of genomes are now common and their sizes will only increase in the future. Multiple sequence alignment of hundreds of genomes remains an intractable problem due to quadratic increases in compute time and memory footprint. To date, most alignment algorithms are designed for commodity clusters without parallelism. Hence, we propose the design of a multiple sequence alignment algorithm on massively parallel, distributed memory supercomputers to enable research into comparative genomics on large data sets. Following the methodology of the sequential progressiveMauve algorithm, we design data structures including sequences and sorted k-mer lists on the IBM Blue Gene/P supercomputer (BG/P). Preliminary results show that we can reduce the memory footprint so that we can potentially align over 250 bacterial genomes on a single BG/P compute node. We verify our results on a dataset of E. coli, Shigella and S. pneumoniae genomes. Our implementation returns results matching those of the original algorithm but in 1/2 the time and with 1/4 the memory footprint for scaffold building. In this study, we have laid the basis for multiple sequence alignment of large-scale datasets on a massively parallel, distributed memory supercomputer, thus enabling comparison of hundreds instead of a few genome sequences within reasonable time.
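
    The sorted k-mer lists mentioned in this abstract can be illustrated with a small sketch. It is not the BG/P implementation: the function names are hypothetical, plain contiguous k-mers stand in for whatever seed scheme the authors use, and only k-mers that are unique in every sequence are kept as candidate match seeds.

```python
from typing import Dict, List, Tuple


def sorted_kmer_list(seq: str, k: int) -> List[Tuple[str, int]]:
    """Return (k-mer, start position) pairs sorted lexicographically by k-mer."""
    kmers = [(seq[i:i + k], i) for i in range(len(seq) - k + 1)]
    kmers.sort()
    return kmers


def shared_unique_kmers(seqs: List[str], k: int) -> Dict[str, List[int]]:
    """Find k-mers that occur exactly once in every sequence.

    Returns a mapping k-mer -> list of start positions, one per input sequence.
    """
    per_seq = []
    for seq in seqs:
        positions: Dict[str, List[int]] = {}
        for kmer, pos in sorted_kmer_list(seq, k):
            positions.setdefault(kmer, []).append(pos)
        # Keep only k-mers that are unique within this sequence.
        per_seq.append({km: p[0] for km, p in positions.items() if len(p) == 1})
    common = set(per_seq[0])
    for table in per_seq[1:]:
        common &= set(table)
    return {km: [table[km] for table in per_seq] for km in sorted(common)}
```

    Calling shared_unique_kmers(["ACGTACC", "TTACGTA"], 4) reports the two 4-mers present once in both toy sequences together with their positions in each.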

  2. Multiple models for Rosaceae genomics.

    Science.gov (United States)

    Shulaev, Vladimir; Korban, Schuyler S; Sosinski, Bryon; Abbott, Albert G; Aldwinckle, Herb S; Folta, Kevin M; Iezzoni, Amy; Main, Dorrie; Arús, Pere; Dandekar, Abhaya M; Lewers, Kim; Brown, Susan K; Davis, Thomas M; Gardiner, Susan E; Potter, Daniel; Veilleux, Richard E

    2008-07-01

    The plant family Rosaceae consists of over 100 genera and 3,000 species that include many important fruit, nut, ornamental, and wood crops. Members of this family provide high-value nutritional foods and contribute desirable aesthetic and industrial products. Most rosaceous crops have been enhanced by human intervention through sexual hybridization, asexual propagation, and genetic improvement since ancient times, 4,000 to 5,000 B.C. Modern breeding programs have contributed to the selection and release of numerous cultivars having significant economic impact on the U.S. and world markets. In recent years, the Rosaceae community, both in the United States and internationally, has benefited from newfound organization and collaboration that have hastened progress in developing genetic and genomic resources for representative crops such as apple (Malus spp.), peach (Prunus spp.), and strawberry (Fragaria spp.). These resources, including expressed sequence tags, bacterial artificial chromosome libraries, physical and genetic maps, and molecular markers, combined with genetic transformation protocols and bioinformatics tools, have rendered various rosaceous crops highly amenable to comparative and functional genomics studies. This report serves as a synopsis of the resources and initiatives of the Rosaceae community, recent developments in Rosaceae genomics, and plans to apply newly accumulated knowledge and resources toward breeding and crop improvement.

  3. Simultaneous gene finding in multiple genomes.

    Science.gov (United States)

    König, Stefanie; Romoth, Lars W; Gerischer, Lizzy; Stanke, Mario

    2016-11-15

    As the tree of life is populated with sequenced genomes ever more densely, the new challenge is the accurate and consistent annotation of entire clades of genomes. We address this problem with a new approach to comparative gene finding that takes a multiple genome alignment of closely related species and simultaneously predicts the location and structure of protein-coding genes in all input genomes, thereby exploiting negative selection and sequence conservation. The model prefers potential gene structures in the different genomes that are in agreement with each other, or, if not, where the exon gains and losses are plausible given the species tree. We formulate the multi-species gene finding problem as a binary labeling problem on a graph. The resulting optimization problem is NP hard, but can be efficiently approximated using a subgradient-based dual decomposition approach. The proposed method was tested on whole-genome alignments of 12 vertebrate and 12 Drosophila species. The accuracy was evaluated for human, mouse and Drosophila melanogaster and compared to competing methods. Results suggest that our method is well-suited for annotation of (a large number of) genomes of closely related species within a clade, in particular, when RNA-Seq data are available for many of the genomes. The transfer of existing annotations from one genome to another via the genome alignment is more accurate than previous approaches that are based on protein-spliced alignments, when the genomes are at close to medium distances. The method is implemented in C++ as part of Augustus and available open source at http://bioinf.uni-greifswald.de/augustus/. Contact: stefaniekoenig@ymail.com or mario.stanke@uni-greifswald.de. Supplementary information: Supplementary data are available at Bioinformatics online.

  4. Whole genome phylogenies for multiple Drosophila species

    Directory of Open Access Journals (Sweden)

    Seetharam Arun

    2012-12-01

    Background: Reconstructing the evolutionary history of organisms using traditional phylogenetic methods may suffer from inaccurate sequence alignment. An alternative approach, particularly effective when whole genome sequences are available, is to employ methods that don’t use explicit sequence alignments. We extend a novel phylogenetic method based on Singular Value Decomposition (SVD) to reconstruct the phylogeny of 12 sequenced Drosophila species. SVD analysis provides accurate comparisons for a high fraction of sequences within whole genomes without the prior identification of orthologs or homologous sites. With this method all protein sequences are converted to peptide frequency vectors within a matrix that is decomposed to provide simplified vector representations for each protein of the genome in a reduced dimensional space. These vectors are summed together to provide a vector representation for each species, and the angle between these vectors provides distance measures that are used to construct species trees. Results: An unfiltered whole genome analysis (193,622 predicted proteins) strongly supports the currently accepted phylogeny for 12 Drosophila species at higher dimensions except for the generally accepted but difficult to discern sister relationship between D. erecta and D. yakuba. Also, in accordance with previous studies, many sequences appear to support alternative phylogenies. In this case, we observed grouping of D. erecta with D. sechellia when approximately 55% to 95% of the proteins were removed using a filter based on projection values or by reducing resolution by using fewer dimensions. Similar results were obtained when just the melanogaster subgroup was analyzed. Conclusions: These results indicate that using our novel phylogenetic method, it is possible to consult and interpret all predicted protein sequences within multiple whole genomes to produce accurate phylogenetic estimations of relatedness between
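
    The vector steps described in this abstract (peptide frequency vectors, summation per species, angle as distance) can be sketched as follows. This is a simplified illustration under stated assumptions: the SVD dimension-reduction step is omitted entirely, the peptide length n = 3 is an arbitrary choice, every species is assumed to have at least one peptide, and all function names are hypothetical.

```python
import math
from collections import Counter
from typing import List


def peptide_counts(protein: str, n: int = 3) -> Counter:
    """Count overlapping n-residue peptides in one protein sequence."""
    return Counter(protein[i:i + n] for i in range(len(protein) - n + 1))


def species_vector(proteins: List[str], n: int = 3) -> Counter:
    """Sum the peptide-count vectors of all proteins of one species."""
    total: Counter = Counter()
    for protein in proteins:
        total.update(peptide_counts(protein, n))
    return total


def angular_distance(u: Counter, v: Counter) -> float:
    """Angle in radians between two sparse count vectors (both must be non-empty)."""
    dot = sum(count * v[pep] for pep, count in u.items())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    cosine = dot / (norm_u * norm_v)
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, cosine)))
```

    A matrix of pairwise angular_distance values between species vectors could then be passed to any distance-based tree-building method; in the published method the vectors would first be projected into the reduced SVD space.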

  5. Multiple Whole Genome Alignments Without a Reference Organism

    Energy Technology Data Exchange (ETDEWEB)

    Dubchak, Inna; Poliakov, Alexander; Kislyuk, Andrey; Brudno, Michael

    2009-01-16

    Multiple sequence alignments have become one of the most commonly used resources in genomics research. Most algorithms for multiple alignment of whole genomes rely either on a reference genome, against which all of the other sequences are laid out, or require a one-to-one mapping between the nucleotides of the genomes, preventing the alignment of recently duplicated regions. Both approaches have drawbacks for whole-genome comparisons. In this paper we present a novel symmetric alignment algorithm. The resulting alignments not only represent all of the genomes equally well, but also include all relevant duplications that occurred since the divergence from the last common ancestor. Our algorithm, implemented as a part of the VISTA Genome Pipeline (VGP), was used to align seven vertebrate and six Drosophila genomes. The resulting whole-genome alignments demonstrate a higher sensitivity and specificity than the pairwise alignments previously available through the VGP and have higher exon alignment accuracy than comparable public whole-genome alignments. Of the multiple alignment methods tested, ours performed the best at aligning genes from multigene families, perhaps the most challenging test for whole-genome alignments. Our whole-genome multiple alignments are available through the VISTA Browser at http://genome.lbl.gov/vista/index.shtml.

  6. Multiple Genome Sequences of Lactobacillus plantarum Strains

    OpenAIRE

    Kafka, Thomas A.; Geissler, Andreas J.; Vogel, Rudi F.

    2017-01-01

    We report here the genome sequences of four Lactobacillus plantarum strains which vary in surface hydrophobicity. Bioinformatic analysis, using additional genomes of Lactobacillus plantarum strains, revealed a possible correlation between the cell wall teichoic acid type and cell surface hydrophobicity and provides the basis for subsequent analyses.

  7. Multiple Models for Rosaceae Genomics [OA]

    Science.gov (United States)

    Shulaev, Vladimir; Korban, Schuyler S.; Sosinski, Bryon; Abbott, Albert G.; Aldwinckle, Herb S.; Folta, Kevin M.; Iezzoni, Amy; Main, Dorrie; Arús, Pere; Dandekar, Abhaya M.; Lewers, Kim; Brown, Susan K.; Davis, Thomas M.; Gardiner, Susan E.; Potter, Daniel; Veilleux, Richard E.

    2008-01-01

    The plant family Rosaceae consists of over 100 genera and 3,000 species that include many important fruit, nut, ornamental, and wood crops. Members of this family provide high-value nutritional foods and contribute desirable aesthetic and industrial products. Most rosaceous crops have been enhanced by human intervention through sexual hybridization, asexual propagation, and genetic improvement since ancient times, 4,000 to 5,000 B.C. Modern breeding programs have contributed to the selection and release of numerous cultivars having significant economic impact on the U.S. and world markets. In recent years, the Rosaceae community, both in the United States and internationally, has benefited from newfound organization and collaboration that have hastened progress in developing genetic and genomic resources for representative crops such as apple (Malus spp.), peach (Prunus spp.), and strawberry (Fragaria spp.). These resources, including expressed sequence tags, bacterial artificial chromosome libraries, physical and genetic maps, and molecular markers, combined with genetic transformation protocols and bioinformatics tools, have rendered various rosaceous crops highly amenable to comparative and functional genomics studies. This report serves as a synopsis of the resources and initiatives of the Rosaceae community, recent developments in Rosaceae genomics, and plans to apply newly accumulated knowledge and resources toward breeding and crop improvement. PMID:18487361

  8. PSAT: A web tool to compare genomic neighborhoods of multiple prokaryotic genomes

    Directory of Open Access Journals (Sweden)

    Wasnick Michael

    2008-03-01

    Background: The conservation of gene order among prokaryotic genomes can provide valuable insight into gene function, protein interactions, or events by which genomes have evolved. Although some tools are available for visualizing and comparing the order of genes between genomes of study, few support an efficient and organized analysis between large numbers of genomes. The Prokaryotic Sequence homology Analysis Tool (PSAT) is a web tool for comparing gene neighborhoods among multiple prokaryotic genomes. Results: PSAT utilizes a database that is preloaded with gene annotation, BLAST hit results, and gene-clustering scores designed to help identify regions of conserved gene order. Researchers use the PSAT web interface to find a gene of interest in a reference genome and efficiently retrieve the sequence homologs found in other bacterial genomes. The tool generates a graphic of the genomic neighborhood surrounding the selected gene and the corresponding regions for its homologs in each comparison genome. Homologs in each region are color coded to assist users with analyzing gene order among various genomes. In contrast to common comparative analysis methods that filter sequence homolog data based on alignment score cutoffs, PSAT leverages gene context information for homologs, including those with weak alignment scores, enabling a more sensitive analysis. Features for constraining or ordering results are designed to help researchers browse results from large numbers of comparison genomes in an organized manner. PSAT has been demonstrated to be useful for helping to identify gene orthologs and potential functional gene clusters, and detecting genome modifications that may result in loss of function. Conclusion: PSAT allows researchers to investigate the order of genes within local genomic neighborhoods of multiple genomes. A PSAT web server for public use is available for performing analyses on a growing set of reference genomes through any
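
    The neighborhood comparison described in this abstract can be sketched in a few lines. This is an illustration only, not PSAT's code or database schema: genomes are assumed to be ordered lists of gene identifiers, the homolog table is supplied by hand rather than derived from BLAST hits, and all names are hypothetical.

```python
from typing import Dict, List


def neighborhood(genome: List[str], gene: str, flank: int = 2) -> List[str]:
    """Return the gene plus up to `flank` genes on each side, in genome order."""
    i = genome.index(gene)
    return genome[max(0, i - flank): i + flank + 1]


def labeled_neighborhoods(
    reference: List[str],
    gene: str,
    others: Dict[str, List[str]],
    homologs: Dict[str, Dict[str, str]],
    flank: int = 2,
) -> Dict[str, List[str]]:
    """Relabel each comparison neighborhood with reference gene ids.

    homologs[name] maps a reference gene id to its homolog in genome `name`.
    Genes without a known reference homolog are shown as "-".
    """
    result = {"reference": neighborhood(reference, gene, flank)}
    for name, genome in others.items():
        hit = homologs[name].get(gene)
        if hit is None:
            continue  # no homolog of the selected gene in this genome
        to_ref = {h: r for r, h in homologs[name].items()}  # invert the mapping
        result[name] = [to_ref.get(g, "-") for g in neighborhood(genome, hit, flank)]
    return result
```

    Rows that share the same labels in the same order indicate conserved gene order around the selected gene; the shared labels play the role of the color coding in the graphic.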

  9. Assembly, Annotation, and Analysis of Multiple Mycorrhizal Fungal Genomes

    Energy Technology Data Exchange (ETDEWEB)

    Mycorrhizal Genomics Initiative Consortium; Kuo, Alan; Grigoriev, Igor; Kohler, Annegret; Martin, Francis

    2013-03-08

    Mycorrhizal fungi play critical roles in host plant health, soil community structure and chemistry, and carbon and nutrient cycling, all areas of intense interest to the US Dept. of Energy (DOE) Joint Genome Institute (JGI). To this end we are building on our earlier sequencing of the Laccaria bicolor genome by partnering with INRA-Nancy and the mycorrhizal research community in the MGI to sequence and analyze dozens of mycorrhizal genomes of all Basidiomycota and Ascomycota orders and multiple ecological types (ericoid, orchid, and ectomycorrhizal). JGI has developed and deployed high-throughput sequencing techniques, and Assembly, RNASeq, and Annotation Pipelines. In 2012 alone we sequenced, assembled, and annotated 12 draft or improved genomes of mycorrhizae, and predicted ~232,831 genes and ~15,011 multigene families. All of this data is publicly available on JGI MycoCosm (http://jgi.doe.gov/fungi/), which provides access to both the genome data and tools with which to analyze the data. Preliminary comparisons of the current total of 14 public mycorrhizal genomes suggest that 1) short secreted proteins potentially involved in symbiosis are more enriched in some orders than in others amongst the mycorrhizal Agaricomycetes, 2) there are wide ranges of numbers of genes involved in certain functional categories, such as signal transduction and post-translational modification, and 3) novel gene families are specific to some ecological types.

  10. Genomic multiple sequence alignments: refinement using a genetic algorithm

    Directory of Open Access Journals (Sweden)

    Lefkowitz Elliot J

    2005-08-01

    Background: Genomic sequence data cannot be fully appreciated in isolation. Comparative genomics, the practice of comparing genomic sequences from different species, plays an increasingly important role in understanding the genotypic differences between species that result in phenotypic differences as well as in revealing patterns of evolutionary relationships. One of the major challenges in comparative genomics is producing a high-quality alignment between two or more related genomic sequences. In recent years, a number of tools have been developed for aligning large genomic sequences. Most utilize heuristic strategies to identify a series of strong sequence similarities, which are then used as anchors to align the regions between the anchor points. The resulting alignment is globally correct, but in many cases is suboptimal locally. We describe a new program, GenAlignRefine, which improves the overall quality of global multiple alignments by using a genetic algorithm to improve local regions of alignment. Regions of low quality are identified, realigned using the program T-Coffee, and then refined using a genetic algorithm. Because a better COFFEE (Consistency based Objective Function For alignmEnt Evaluation) score generally reflects greater alignment quality, the algorithm searches for an alignment that yields a better COFFEE score. To improve the intrinsic slowness of the genetic algorithm, GenAlignRefine was implemented as a parallel, cluster-based program. Results: We tested the GenAlignRefine algorithm by running it on a Linux cluster to refine sequences from a simulation, as well as refine a multiple alignment of 15 Orthopoxvirus genomic sequences approximately 260,000 nucleotides in length that initially had been aligned by Multi-LAGAN. It took approximately 150 minutes for a 40-processor Linux cluster to optimize some 200 fuzzy (poorly aligned) regions of the orthopoxvirus alignment. Overall sequence identity increased only

  11. Genome-wide association study identifies multiple susceptibility loci for multiple myeloma

    DEFF Research Database (Denmark)

    Mitchell, Jonathan S; Li, Ni; Weinhold, Niels

    2016-01-01

    Multiple myeloma (MM) is a plasma cell malignancy with a significant heritable basis. Genome-wide association studies have transformed our understanding of MM predisposition, but individual studies have had limited power to discover risk loci. Here we perform a meta-analysis of these GWAS, add a ...

  12. GenPlay Multi-Genome, a tool to compare and analyze multiple human genomes in a graphical interface.

    Science.gov (United States)

    Lajugie, Julien; Fourel, Nicolas; Bouhassira, Eric E

    2015-01-01

    Parallel visualization of multiple individual human genomes is a complex endeavor that is rapidly gaining importance with the increasing number of personal, phased and cancer genomes that are being generated. It requires the display of variants such as SNPs, indels and structural variants that are unique to specific genomes and the introduction of multiple overlapping gaps in the reference sequence. Here, we describe GenPlay Multi-Genome, an application specifically written to visualize and analyze multiple human genomes in parallel. GenPlay Multi-Genome is ideally suited for the comparison of allele-specific expression and functional genomic data obtained from multiple phased genomes in a graphical interface with access to multiple-track operation. It also allows the analysis of data that have been aligned to custom genomes rather than to a standard reference and can be used as a variant calling format file browser and as a tool to compare different genome assemblies, such as hg19 and hg38. GenPlay is available under the GNU public license (GPL-3) from http://genplay.einstein.yu.edu. The source code is available at https://github.com/JulienLajugie/GenPlay.

  13. Genome Context Viewer: visual exploration of multiple annotated genomes using microsynteny.

    Science.gov (United States)

    Cleary, Alan; Farmer, Andrew

    2018-05-01

    The Genome Context Viewer is a visual data-mining tool that allows users to search across multiple providers of genome data for regions with similarly annotated content that may be aligned and visualized at the level of their shared functional elements. By handling ordered sequences of gene family memberships as a unit of search and comparison, the user interface enables quick and intuitive assessment of the degree of gene content divergence and the presence of various types of structural events within syntenic contexts. Insights into functionally significant differences seen at this level of abstraction can then serve to direct the user to more detailed explorations of the underlying data in other interconnected, provider-specific tools. GCV is provided under the GNU General Public License version 3 (GPL-3.0). Source code is available at https://github.com/legumeinfo/lis_context_viewer. Contact: adf@ncgr.org. Supplementary data are available at Bioinformatics online.

  14. Murasaki: a fast, parallelizable algorithm to find anchors from multiple genomes.

    Directory of Open Access Journals (Sweden)

    Kris Popendorf

    BACKGROUND: With the number of available genome sequences increasing rapidly, the magnitude of sequence data required for multiple-genome analyses is a challenging problem. When large-scale rearrangements break the collinearity of gene orders among genomes, genome comparison algorithms must first identify sets of short well-conserved sequences present in each genome, termed anchors. Previously, anchor identification among multiple genomes has been achieved using pairwise alignment tools like BLASTZ through progressive alignment tools like TBA, but the computational requirements for sequence comparisons of multiple genomes quickly become a limiting factor as the number and scale of genomes grow. METHODOLOGY/PRINCIPAL FINDINGS: Our algorithm, named Murasaki, makes it possible to identify anchors within multiple large sequences on the scale of several hundred megabases in a few minutes using a single CPU. Two advanced features of Murasaki are (1) adaptive hash function generation, which enables efficient use of arbitrary mismatch patterns (spaced seeds) and therefore the comparison of multiple mammalian genomes in a practical amount of computation time, and (2) parallelizable execution that decreases the required wall-clock and CPU times. Murasaki can perform a sensitive anchoring of eight mammalian genomes (human, chimp, rhesus, orangutan, mouse, rat, dog, and cow) in 21 hours CPU time (42 minutes wall time). This is the first single-pass in-core anchoring of multiple mammalian genomes. We evaluated Murasaki by comparing it with the genome alignment programs BLASTZ and TBA. We show that Murasaki can anchor multiple genomes in near linear time, compared to the quadratic time requirements of BLASTZ and TBA, while improving overall accuracy. CONCLUSIONS/SIGNIFICANCE: Murasaki provides an open source platform to take advantage of long patterns, cluster computing, and novel hash algorithms to produce accurate anchors across multiple genomes with
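
    The idea of hashing with a spaced seed (a mismatch pattern) to find candidate anchors can be sketched as follows. This is not Murasaki's implementation: its adaptive hash function generation and parallel execution are not reproduced, a Python dict stands in for the hash table, the pattern "1101" is an arbitrary example, and all function names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def spaced_key(window: str, pattern: str) -> str:
    """Keep only the bases at '1' positions of the pattern; '0' positions may mismatch."""
    return "".join(base for base, flag in zip(window, pattern) if flag == "1")


def index_seeds(seq: str, pattern: str) -> Dict[str, List[int]]:
    """Hash every window of the sequence by its spaced-seed key."""
    table: Dict[str, List[int]] = defaultdict(list)
    for i in range(len(seq) - len(pattern) + 1):
        table[spaced_key(seq[i:i + len(pattern)], pattern)].append(i)
    return table


def candidate_anchors(seqs: List[str], pattern: str) -> List[Tuple[int, ...]]:
    """Return one start position per sequence for each key found in all sequences."""
    tables = [index_seeds(s, pattern) for s in seqs]
    shared = set(tables[0]).intersection(*tables[1:])
    anchors = []
    for key in sorted(shared):
        # Keep only keys that are unambiguous in every sequence, to keep the sketch small.
        if all(len(t[key]) == 1 for t in tables):
            anchors.append(tuple(t[key][0] for t in tables))
    return anchors
```

    With the pattern "1101", the windows ACGT and ACTT hash to the same key, so candidate_anchors(["ACGTAC", "ACTTAC"], "1101") pairs position 0 in both sequences despite the mismatch at the third base.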

  15. Prediction of Multiple-Trait and Multiple-Environment Genomic Data Using Recommender Systems

    Science.gov (United States)

    Montesinos-López, Osval A.; Montesinos-López, Abelardo; Crossa, José; Montesinos-López, José C.; Mota-Sanchez, David; Estrada-González, Fermín; Gillberg, Jussi; Singh, Ravi; Mondal, Suchismita; Juliana, Philomin

    2018-01-01

    In genomic-enabled prediction, the task of improving the accuracy of the prediction of lines in environments is difficult because the available information is generally sparse and usually has low correlations between traits. In current genomic selection, although researchers have a large amount of information and appropriate statistical models to process it, there is still limited computing efficiency to do so. Although some statistical models are usually mathematically elegant, many of them are also computationally inefficient, and they are impractical for many traits, lines, environments, and years because they need to sample from huge normal multivariate distributions. For these reasons, this study explores two recommender systems: item-based collaborative filtering (IBCF) and the matrix factorization algorithm (MF) in the context of multiple traits and multiple environments. The IBCF and MF methods were compared with two conventional methods on simulated and real data. Results of the simulated and real data sets show that the IBCF technique was slightly better in terms of prediction accuracy than the two conventional methods and the MF method when the correlation was moderately high. The IBCF technique is very attractive because it produces good predictions when there is high correlation between items (environment–trait combinations) and its implementation is computationally feasible, which can be useful for plant breeders who deal with very large data sets. PMID:29097376
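
    Item-based collaborative filtering as named in this abstract can be illustrated with a generic sketch. It is a textbook form of IBCF, not the authors' code: rows are assumed to be lines, columns are assumed to be items (environment-trait combinations), missing phenotypes are None, item similarity is plain cosine over co-observed rows, and any standardization the study applies is not reproduced. All names are hypothetical.

```python
import math
from typing import List, Optional

Row = List[Optional[float]]


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def item_similarity(matrix: List[Row], i: int, j: int) -> float:
    """Cosine similarity between columns i and j over rows where both are observed."""
    pairs = [(r[i], r[j]) for r in matrix if r[i] is not None and r[j] is not None]
    if not pairs:
        return 0.0
    return cosine([p[0] for p in pairs], [p[1] for p in pairs])


def predict(matrix: List[Row], row: int, item: int) -> Optional[float]:
    """Similarity-weighted average of the row's observed values on the other items."""
    num = den = 0.0
    for j, value in enumerate(matrix[row]):
        if j == item or value is None:
            continue
        w = item_similarity(matrix, item, j)
        num += w * value
        den += abs(w)
    return num / den if den else None
```

    The prediction for a missing cell depends only on item-to-item similarities and the line's own observed values, which is why the approach stays cheap when items are highly correlated and the data set is large.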

  16. Prediction of Multiple-Trait and Multiple-Environment Genomic Data Using Recommender Systems.

    Science.gov (United States)

    Montesinos-López, Osval A; Montesinos-López, Abelardo; Crossa, José; Montesinos-López, José C; Mota-Sanchez, David; Estrada-González, Fermín; Gillberg, Jussi; Singh, Ravi; Mondal, Suchismita; Juliana, Philomin

    2018-01-04

    In genomic-enabled prediction, the task of improving the accuracy of the prediction of lines in environments is difficult because the available information is generally sparse and usually has low correlations between traits. In current genomic selection, although researchers have a large amount of information and appropriate statistical models to process it, there is still limited computing efficiency to do so. Although some statistical models are usually mathematically elegant, many of them are also computationally inefficient, and they are impractical for many traits, lines, environments, and years because they need to sample from huge normal multivariate distributions. For these reasons, this study explores two recommender systems: item-based collaborative filtering (IBCF) and the matrix factorization algorithm (MF) in the context of multiple traits and multiple environments. The IBCF and MF methods were compared with two conventional methods on simulated and real data. Results of the simulated and real data sets show that the IBCF technique was slightly better in terms of prediction accuracy than the two conventional methods and the MF method when the correlation was moderately high. The IBCF technique is very attractive because it produces good predictions when there is high correlation between items (environment-trait combinations) and its implementation is computationally feasible, which can be useful for plant breeders who deal with very large data sets.

  17. Prediction of Multiple-Trait and Multiple-Environment Genomic Data Using Recommender Systems

    Directory of Open Access Journals (Sweden)

    Osval A. Montesinos-López

    2018-01-01

    In genomic-enabled prediction, the task of improving the accuracy of the prediction of lines in environments is difficult because the available information is generally sparse and usually has low correlations between traits. In current genomic selection, although researchers have a large amount of information and appropriate statistical models to process it, there is still limited computing efficiency to do so. Although some statistical models are usually mathematically elegant, many of them are also computationally inefficient, and they are impractical for many traits, lines, environments, and years because they need to sample from huge normal multivariate distributions. For these reasons, this study explores two recommender systems: item-based collaborative filtering (IBCF) and the matrix factorization algorithm (MF) in the context of multiple traits and multiple environments. The IBCF and MF methods were compared with two conventional methods on simulated and real data. Results of the simulated and real data sets show that the IBCF technique was slightly better in terms of prediction accuracy than the two conventional methods and the MF method when the correlation was moderately high. The IBCF technique is very attractive because it produces good predictions when there is high correlation between items (environment–trait combinations) and its implementation is computationally feasible, which can be useful for plant breeders who deal with very large data sets.

  18. Serendipitous discovery of Wolbachia genomes in multiple Drosophila species.

    Science.gov (United States)

    Salzberg, Steven L; Dunning Hotopp, Julie C; Delcher, Arthur L; Pop, Mihai; Smith, Douglas R; Eisen, Michael B; Nelson, William C

    2005-01-01

    The Trace Archive is a repository for the raw, unanalyzed data generated by large-scale genome sequencing projects. The existence of this data offers scientists the possibility of discovering additional genomic sequences beyond those originally sequenced. In particular, if the source DNA for a sequencing project came from a species that was colonized by another organism, then the project may yield substantial amounts of genomic DNA, including near-complete genomes, from the symbiotic or parasitic organism. By searching the publicly available repository of DNA sequencing trace data, we discovered three new species of the bacterial endosymbiont Wolbachia pipientis in three different species of fruit fly: Drosophila ananassae, D. simulans, and D. mojavensis. We extracted all sequences with partial matches to a previously sequenced Wolbachia strain and assembled those sequences using customized software. For one of the three new species, the data recovered were sufficient to produce an assembly that covers more than 95% of the genome; for a second species the data produce the equivalent of a 'light shotgun' sampling of the genome, covering an estimated 75-80% of the genome; and for the third species the data cover approximately 6-7% of the genome. The results of this study reveal an unexpected benefit of depositing raw data in a central genome sequence repository: new species can be discovered within this data. The differences between these three new Wolbachia genomes and the previously sequenced strain revealed numerous rearrangements and insertions within each lineage and hundreds of novel genes. The three new genomes, with annotation, have been deposited in GenBank.

  19. Multiple roles of genome-attached bacteriophage terminal proteins

    International Nuclear Information System (INIS)

    Redrejo-Rodríguez, Modesto; Salas, Margarita

    2014-01-01

    Protein-primed replication constitutes a generalized mechanism to initiate DNA or RNA synthesis in linear genomes, including viruses, gram-positive bacteria, linear plasmids and mobile elements. By this mechanism a specific amino acid primes replication and becomes covalently linked to the genome ends. Despite the fact that TPs lack sequence homology, they share a similar structural arrangement, with the priming residue in the C-terminal half of the protein and an accumulation of positively charged residues at the N-terminal end. In addition, various bacteriophage TPs have been shown to have DNA-binding capacity that targets TPs and their attached genomes to the host nucleoid. Furthermore, a number of bacteriophage TPs from different viral families and with diverse hosts also contain putative nuclear localization signals and localize in the eukaryotic nucleus, which could lead to the transport of the attached DNA. This suggests a possible role of bacteriophage TPs in prokaryote-to-eukaryote horizontal gene transfer. - Highlights: • Protein-primed genome replication constitutes a strategy to initiate DNA or RNA synthesis in linear genomes. • Bacteriophage terminal proteins (TPs) are covalently attached to viral genomes by their primary function priming DNA replication. • TPs are also DNA-binding proteins and target phage genomes to the host nucleoid. • TPs can also localize in the eukaryotic nucleus and may have a role in phage-mediated interkingdom gene transfer

  20. Multiple roles of genome-attached bacteriophage terminal proteins

    Energy Technology Data Exchange (ETDEWEB)

    Redrejo-Rodríguez, Modesto; Salas, Margarita, E-mail: msalas@cbm.csic.es

    2014-11-15

    Protein-primed replication constitutes a generalized mechanism to initiate DNA or RNA synthesis in linear genomes, including viruses, gram-positive bacteria, linear plasmids and mobile elements. By this mechanism a specific amino acid primes replication and becomes covalently linked to the genome ends. Despite the fact that TPs lack sequence homology, they share a similar structural arrangement, with the priming residue in the C-terminal half of the protein and an accumulation of positively charged residues at the N-terminal end. In addition, various bacteriophage TPs have been shown to have DNA-binding capacity that targets TPs and their attached genomes to the host nucleoid. Furthermore, a number of bacteriophage TPs from different viral families and with diverse hosts also contain putative nuclear localization signals and localize in the eukaryotic nucleus, which could lead to the transport of the attached DNA. This suggests a possible role of bacteriophage TPs in prokaryote-to-eukaryote horizontal gene transfer. - Highlights: • Protein-primed genome replication constitutes a strategy to initiate DNA or RNA synthesis in linear genomes. • Bacteriophage terminal proteins (TPs) are covalently attached to viral genomes by their primary function priming DNA replication. • TPs are also DNA-binding proteins and target phage genomes to the host nucleoid. • TPs can also localize in the eukaryotic nucleus and may have a role in phage-mediated interkingdom gene transfer.

  1. Multiple-trait genetic evaluation using genomic matrix

    African Journals Online (AJOL)

    Jane

    2011-07-06

    Jul 6, 2011 ... relationships was estimated through computer simulation and was compared with the accuracy of ... programs, detect animals with superior genetic and select ... genomic matrices in the mixed model equations of BLUP.

  2. Simultaneous Structural Variation Discovery in Multiple Paired-End Sequenced Genomes

    Science.gov (United States)

    Hormozdiari, Fereydoun; Hajirasouliha, Iman; McPherson, Andrew; Eichler, Evan E.; Sahinalp, S. Cenk

    Next generation sequencing technologies have been decreasing the costs and increasing the world-wide capacity for sequence production at an unprecedented rate, enabling the initiation of large scale projects aiming to sequence almost 2000 genomes [1]. Structural variation detection promises to be one of the key diagnostic tools for cancer and other diseases with genomic origin. In this paper, we study the problem of detecting structural variation events in two or more sequenced genomes through high throughput sequencing. We propose to move from the current model of (1) detecting genomic variations in single next generation sequenced (NGS) donor genomes independently, and (2) checking whether two or more donor genomes indeed agree or disagree on the variations (in this paper we name this framework Independent Structural Variation Discovery and Merging - ISV&M), to a new model in which we detect structural variation events among multiple genomes simultaneously.

  3. Scalable Open Science Approach for Mutation Calling of Tumor Exomes Using Multiple Genomic Pipelines

    NARCIS (Netherlands)

    Ellrott, Kyle; Bailey, Matthew H.; Saksena, Gordon; Covington, Kyle R.; Kandoth, Cyriac; Stewart, Chip; Hess, Julian; Ma, Singer; Chiotti, Kami E.; McLellan, Michael; Sofia, Heidi J.; Hutter, Carolyn M.; Getz, Gad; Wheeler, David A.; Ding, Li; Caesar-Johnson, Samantha J.; Demchok, John A.; Felau, Ina; Kasapi, Melpomeni; Ferguson, Martin L.; Hutter, Carolyn M.; Sofia, Heidi J.; Tarnuzzer, Roy; Wang, Zhining; Yang, Liming; Zenklusen, Jean C.; Zhang, Jiashan (Julia); Chudamani, Sudha; Liu, Jia; Lolla, Laxmi; Naresh, Rashi; Pihl, Todd; Sun, Qiang; Wan, Yunhu; Wu, Ye; Cho, Juok; DeFreitas, Timothy; Frazer, Scott; Gehlenborg, Nils; Getz, Gad; Heiman, David I.; Kim, Jaegil; Lawrence, Michael S.; Lin, Pei; Meier, Sam; Noble, Michael S.; Saksena, Gordon; Voet, Doug; Zhang, Hailei; Bernard, Brady; Chambwe, Nyasha; Dhankani, Varsha; Knijnenburg, Theo; Kramer, Roger; Leinonen, Kalle; Liu, Yuexin; Miller, Michael; Reynolds, Sheila; Shmulevich, Ilya; Thorsson, Vesteinn; Zhang, Wei; Akbani, Rehan; Broom, Bradley M.; Hegde, Apurva M.; Ju, Zhenlin; Kanchi, Rupa S.; Korkut, Anil; Li, Jun; Liang, Han; Ling, Shiyun; Liu, Wenbin; Lu, Yiling; Mills, Gordon B.; Ng, Kwok Shing; Rao, Arvind; Ryan, Michael; Wang, Jing; Weinstein, John N.; Zhang, Jiexin; Abeshouse, Adam; Armenia, Joshua; Chakravarty, Debyani; Chatila, Walid K.; de Bruijn, Ino; Gao, Jianjiong; Gross, Benjamin E.; Heins, Zachary J.; Kundra, Ritika; La, Konnor; Ladanyi, Marc; Luna, Augustin; Nissan, Moriah G.; Ochoa, Angelica; Phillips, Sarah M.; Reznik, Ed; Sanchez-Vega, Francisco; Sander, Chris; Schultz, Nikolaus; Sheridan, Robert; Sumer, S. Onur; Sun, Yichao; Taylor, Barry S.; Wang, Jioajiao; Zhang, Hongxin; Anur, Pavana; Peto, Myron; Spellman, Paul; Benz, Christopher; Stuart, Joshua M.; Wong, Christopher K.; Yau, Christina; Hayes, D. 
Neil; Wilkerson, Matthew D.; Ally, Adrian; Balasundaram, Miruna; Bowlby, Reanne; Brooks, Denise; Carlsen, Rebecca; Chuah, Eric; Dhalla, Noreen; Holt, Robert; Jones, Steven J.M.; Kasaian, Katayoon; Lee, Darlene; Ma, Yussanne; Marra, Marco A.; Mayo, Michael; Moore, Richard A.; Mungall, Andrew J.; Mungall, Karen; Robertson, A. Gordon; Sadeghi, Sara; Schein, Jacqueline E.; Sipahimalani, Payal; Tam, Angela; Thiessen, Nina; Tse, Kane; Wong, Tina; Berger, Ashton C.; Beroukhim, Rameen; Cherniack, Andrew D.; Cibulskis, Carrie; Gabriel, Stacey B.; Gao, Galen F.; Ha, Gavin; Meyerson, Matthew; Schumacher, Steven E.; Shih, Juliann; Kucherlapati, Melanie H.; Kucherlapati, Raju S.; Baylin, Stephen; Cope, Leslie; Danilova, Ludmila; Bootwalla, Moiz S.; Lai, Phillip H.; Maglinte, Dennis T.; Van Den Berg, David J.; Weisenberger, Daniel J.; Auman, J. Todd; Balu, Saianand; Bodenheimer, Tom; Fan, Cheng; Hoadley, Katherine A.; Hoyle, Alan P.; Jefferys, Stuart R.; Jones, Corbin D.; Meng, Shaowu; Mieczkowski, Piotr A.; Mose, Lisle E.; Perou, Amy H.; Perou, Charles M.; Roach, Jeffrey; Shi, Yan; Simons, Janae V.; Skelly, Tara; Soloway, Matthew G.; Tan, Donghui; Veluvolu, Umadevi; Fan, Huihui; Hinoue, Toshinori; Laird, Peter W.; Shen, Hui; Zhou, Wanding; Bellair, Michelle; Chang, Kyle; Covington, Kyle; Creighton, Chad J.; Dinh, Huyen; Doddapaneni, Harsha Vardhan; Donehower, Lawrence A.; Drummond, Jennifer; Gibbs, Richard A.; Glenn, Robert; Hale, Walker; Han, Yi; Hu, Jianhong; Korchina, Viktoriya; Lee, Sandra; Lewis, Lora; Li, Wei; Liu, Xiuping; Morgan, Margaret; Morton, Donna; Muzny, Donna; Santibanez, Jireh; Sheth, Margi; Shinbrot, Eve; Wang, Linghua; Wang, Min; Wheeler, David A.; Xi, Liu; Zhao, Fengmei; Hess, Julian; Appelbaum, Elizabeth L.; Bailey, Matthew; Cordes, Matthew G.; Ding, Li; Fronick, Catrina C.; Fulton, Lucinda A.; Fulton, Robert S.; Kandoth, Cyriac; Mardis, Elaine R.; McLellan, Michael D.; Miller, Christopher A.; Schmidt, Heather K.; Wilson, Richard K.; Crain, Daniel; Curley, 
Erin; Gardner, Johanna; Lau, Kevin; Mallery, David; Morris, Scott; Paulauskis, Joseph; Penny, Robert; Shelton, Candace; Shelton, Troy; Sherman, Mark; Thompson, Eric; Yena, Peggy; Bowen, Jay; Gastier-Foster, Julie M.; Gerken, Mark; Leraas, Kristen M.; Lichtenberg, Tara M.; Ramirez, Nilsa C.; Wise, Lisa; Zmuda, Erik; Corcoran, Niall; Costello, Tony; Hovens, Christopher; Carvalho, Andre L.; de Carvalho, Ana C.; Fregnani, José H.; Longatto-Filho, Adhemar; Reis, Rui M.; Scapulatempo-Neto, Cristovam; Silveira, Henrique C.S.; Vidal, Daniel O.; Burnette, Andrew; Eschbacher, Jennifer; Hermes, Beth; Noss, Ardene; Singh, Rosy; Anderson, Matthew L.; Castro, Patricia D.; Ittmann, Michael; Huntsman, David; Kohl, Bernard; Le, Xuan; Thorp, Richard; Andry, Chris; Duffy, Elizabeth R.; Lyadov, Vladimir; Paklina, Oxana; Setdikova, Galiya; Shabunin, Alexey; Tavobilov, Mikhail; McPherson, Christopher; Warnick, Ronald; Berkowitz, Ross; Cramer, Daniel; Feltmate, Colleen; Horowitz, Neil; Kibel, Adam; Muto, Michael; Raut, Chandrajit P.; Malykh, Andrei; Barnholtz-Sloan, Jill S.; Barrett, Wendi; Devine, Karen; Fulop, Jordonna; Ostrom, Quinn T.; Shimmel, Kristen; Wolinsky, Yingli; Sloan, Andrew E.; De Rose, Agostino; Giuliante, Felice; Goodman, Marc; Karlan, Beth Y.; Hagedorn, Curt H.; Eckman, John; Harr, Jodi; Myers, Jerome; Tucker, Kelinda; Zach, Leigh Anne; Deyarmin, Brenda; Hu, Hai; Kvecher, Leonid; Larson, Caroline; Mural, Richard J.; Somiari, Stella; Vicha, Ales; Zelinka, Tomas; Bennett, Joseph; Iacocca, Mary; Rabeno, Brenda; Swanson, Patricia; Latour, Mathieu; Lacombe, Louis; Têtu, Bernard; Bergeron, Alain; McGraw, Mary; Staugaitis, Susan M.; Chabot, John; Hibshoosh, Hanina; Sepulveda, Antonia; Su, Tao; Wang, Timothy; Potapova, Olga; Voronina, Olga; Desjardins, Laurence; Mariani, Odette; Roman-Roman, Sergio; Sastre, Xavier; Stern, Marc Henri; Cheng, Feixiong; Signoretti, Sabina; Berchuck, Andrew; Bigner, Darell; Lipp, Eric; Marks, Jeffrey; McCall, Shannon; McLendon, Roger; Secord, 
Angeles; Sharp, Alexis; Behera, Madhusmita; Brat, Daniel J.; Chen, Amy; Delman, Keith; Force, Seth; Khuri, Fadlo; Magliocca, Kelly; Maithel, Shishir; Olson, Jeffrey J.; Owonikoko, Taofeek; Pickens, Alan; Ramalingam, Suresh; Shin, Dong M.; Sica, Gabriel; Van Meir, Erwin G.; Zhang, Hongzheng; Eijckenboom, Wil; Gillis, Ad; Korpershoek, Esther; Looijenga, Leendert; Oosterhuis, Wolter; Stoop, Hans; van Kessel, Kim E.; Zwarthoff, Ellen C.; Calatozzolo, Chiara; Cuppini, Lucia; Cuzzubbo, Stefania; DiMeco, Francesco; Finocchiaro, Gaetano; Mattei, Luca; Perin, Alessandro; Pollo, Bianca; Chen, Chu; Houck, John; Lohavanichbutr, Pawadee; Hartmann, Arndt; Stoehr, Christine; Stoehr, Robert; Taubert, Helge; Wach, Sven; Wullich, Bernd; Kycler, Witold; Murawa, Dawid; Wiznerowicz, Maciej; Chung, Ki; Edenfield, W. Jeffrey; Martin, Julie; Baudin, Eric; Bubley, Glenn; Bueno, Raphael; De Rienzo, Assunta; Richards, William G.; Kalkanis, Steven; Mikkelsen, Tom; Noushmehr, Houtan; Scarpace, Lisa; Girard, Nicolas; Aymerich, Marta; Campo, Elias; Giné, Eva; Guillermo, Armando López; Van Bang, Nguyen; Hanh, Phan Thi; Phu, Bui Duc; Tang, Yufang; Colman, Howard; Evason, Kimberley; Dottino, Peter R.; Martignetti, John A.; Gabra, Hani; Juhl, Hartmut; Akeredolu, Teniola; Stepa, Serghei; Hoon, Dave; Ahn, Keunsoo; Kang, Koo Jeong; Beuschlein, Felix; Breggia, Anne; Birrer, Michael; Bell, Debra; Borad, Mitesh; Bryce, Alan H.; Castle, Erik; Chandan, Vishal; Cheville, John; Copland, John A.; Farnell, Michael; Flotte, Thomas; Giama, Nasra; Ho, Thai; Kendrick, Michael; Kocher, Jean Pierre; Kopp, Karla; Moser, Catherine; Nagorney, David; O'Brien, Daniel; O'Neill, Brian Patrick; Patel, Tushar; Petersen, Gloria; Que, Florencia; Rivera, Michael; Roberts, Lewis; Smallridge, Robert; Smyrk, Thomas; Stanton, Melissa; Thompson, R. 
Houston; Torbenson, Michael; Yang, Ju Dong; Zhang, Lizhi; Brimo, Fadi; Ajani, Jaffer A.; Angulo Gonzalez, Ana Maria; Behrens, Carmen; Bondaruk, Jolanta; Broaddus, Russell; Czerniak, Bogdan; Esmaeli, Bita; Fujimoto, Junya; Gershenwald, Jeffrey; Guo, Charles; Lazar, Alexander J.; Logothetis, Christopher; Meric-Bernstam, Funda; Moran, Cesar; Ramondetta, Lois; Rice, David; Sood, Anil; Tamboli, Pheroze; Thompson, Timothy; Troncoso, Patricia; Tsao, Anne; Wistuba, Ignacio; Carter, Candace; Haydu, Lauren; Hersey, Peter; Jakrot, Valerie; Kakavand, Hojabr; Kefford, Richard; Lee, Kenneth; Long, Georgina; Mann, Graham; Quinn, Michael; Saw, Robyn; Scolyer, Richard; Shannon, Kerwin; Spillane, Andrew; Stretch, Jonathan; Synott, Maria; Thompson, John; Wilmott, James; Al-Ahmadie, Hikmat; Chan, Timothy A.; Ghossein, Ronald; Gopalan, Anuradha; Levine, Douglas A.; Reuter, Victor; Singer, Samuel; Singh, Bhuvanesh; Tien, Nguyen Viet; Broudy, Thomas; Mirsaidi, Cyrus; Nair, Praveen; Drwiega, Paul; Miller, Judy; Smith, Jennifer; Zaren, Howard; Park, Joong Won; Hung, Nguyen Phi; Kebebew, Electron; Linehan, W. 
Marston; Metwalli, Adam R.; Pacak, Karel; Pinto, Peter A.; Schiffman, Mark; Schmidt, Laura S.; Vocke, Cathy D.; Wentzensen, Nicolas; Worrell, Robert; Yang, Hannah; Moncrieff, Marc; Goparaju, Chandra; Melamed, Jonathan; Pass, Harvey; Botnariuc, Natalia; Caraman, Irina; Cernat, Mircea; Chemencedji, Inga; Clipca, Adrian; Doruc, Serghei; Gorincioi, Ghenadie; Mura, Sergiu; Pirtac, Maria; Stancul, Irina; Tcaciuc, Diana; Albert, Monique; Alexopoulou, Iakovina; Arnaout, Angel; Bartlett, John; Engel, Jay; Gilbert, Sebastien; Parfitt, Jeremy; Sekhon, Harman; Thomas, George; Rassl, Doris M.; Rintoul, Robert C.; Bifulco, Carlo; Tamakawa, Raina; Urba, Walter; Hayward, Nicholas; Timmers, Henri; Antenucci, Anna; Facciolo, Francesco; Grazi, Gianluca; Marino, Mirella; Merola, Roberta; de Krijger, Ronald; Gimenez-Roqueplo, Anne Paule; Piché, Alain; Chevalier, Simone; McKercher, Ginette; Birsoy, Kivanc; Barnett, Gene; Brewer, Cathy; Farver, Carol; Naska, Theresa; Pennell, Nathan A.; Raymond, Daniel; Schilero, Cathy; Smolenski, Kathy; Williams, Felicia; Morrison, Carl; Borgia, Jeffrey A.; Liptay, Michael J.; Pool, Mark; Seder, Christopher W.; Junker, Kerstin; Omberg, Larsson; Dinkin, Mikhail; Manikhas, George; Alvaro, Domenico; Bragazzi, Maria Consiglia; Cardinale, Vincenzo; Carpino, Guido; Gaudio, Eugenio; Chesla, David; Cottingham, Sandra; Dubina, Michael; Moiseenko, Fedor; Dhanasekaran, Renumathy; Becker, Karl Friedrich; Janssen, Klaus Peter; Slotta-Huspenina, Julia; Abdel-Rahman, Mohamed H.; Aziz, Dina; Bell, Sue; Cebulla, Colleen M.; Davis, Amy; Duell, Rebecca; Elder, J. 
Bradley; Hilty, Joe; Kumar, Bahavna; Lang, James; Lehman, Norman L.; Mandt, Randy; Nguyen, Phuong; Pilarski, Robert; Rai, Karan; Schoenfield, Lynn; Senecal, Kelly; Wakely, Paul; Hansen, Paul; Lechan, Ronald; Powers, James; Tischler, Arthur; Grizzle, William E.; Sexton, Katherine C.; Kastl, Alison; Henderson, Joel; Porten, Sima; Waldmann, Jens; Fassnacht, Martin; Asa, Sylvia L.; Schadendorf, Dirk; Couce, Marta; Graefen, Markus; Huland, Hartwig; Sauter, Guido; Schlomm, Thorsten; Simon, Ronald; Tennstedt, Pierre; Olabode, Oluwole; Nelson, Mark; Bathe, Oliver; Carroll, Peter R.; Chan, June M.; Disaia, Philip; Glenn, Pat; Kelley, Robin K.; Landen, Charles N.; Phillips, Joanna; Prados, Michael; Simko, Jeffry; Smith-McCune, Karen; VandenBerg, Scott; Roggin, Kevin; Fehrenbach, Ashley; Kendler, Ady; Sifri, Suzanne; Steele, Ruth; Jimeno, Antonio; Carey, Francis; Forgie, Ian; Mannelli, Massimo; Carney, Michael; Hernandez, Brenda; Campos, Benito; Herold-Mende, Christel; Jungk, Christin; Unterberg, Andreas; von Deimling, Andreas; Bossler, Aaron; Galbraith, Joseph; Jacobus, Laura; Knudson, Michael; Knutson, Tina; Ma, Deqin; Milhem, Mohammed; Sigmund, Rita; Godwin, Andrew K.; Madan, Rashna; Rosenthal, Howard G.; Adebamowo, Clement; Adebamowo, Sally N.; Boussioutas, Alex; Beer, David; Giordano, Thomas; Mes-Masson, Anne Marie; Saad, Fred; Bocklage, Therese; Landrum, Lisa; Mannel, Robert; Moore, Kathleen; Moxley, Katherine; Postier, Russel; Walker, Joan; Zuna, Rosemary; Feldman, Michael; Valdivieso, Federico; Dhir, Rajiv; Luketich, James; Mora Pinero, Edna M.; Quintero-Aguilo, Mario; Carlotti, Carlos Gilberto; Dos Santos, Jose Sebastião; Kemp, Rafael; Sankarankuty, Ajith; Tirapelli, Daniela; Catto, James; Agnew, Kathy; Swisher, Elizabeth; Creaney, Jenette; Robinson, Bruce; Shelley, Carl Simon; Godwin, Eryn M.; Kendall, Sara; Shipman, Cassaundra; Bradford, Carol; Carey, Thomas; Haddad, Andrea; Moyer, Jeffey; Peterson, Lisa; Prince, Mark; Rozek, Laura; Wolf, Gregory; Bowman, Rayleen; 
Fong, Kwun M.; Yang, Ian; Korst, Robert; Rathmell, W. Kimryn; Fantacone-Campbell, J. Leigh; Hooke, Jeffrey A.; Kovatich, Albert J.; Shriver, Craig D.; DiPersio, John; Drake, Bettina; Govindan, Ramaswamy; Heath, Sharon; Ley, Timothy; Van Tine, Brian; Westervelt, Peter; Rubin, Mark A.; Lee, Jung Il; Aredes, Natália D.; Mariamidze, Armaz

    2018-01-01

    The Cancer Genome Atlas (TCGA) cancer genomics dataset includes over 10,000 tumor-normal exome pairs across 33 different cancer types, in total >400 TB of raw data files requiring analysis. Here we describe the Multi-Center Mutation Calling in Multiple Cancers project, our effort to generate a

  4. An evaluation of multiple annealing and looping based genome amplification using a synthetic bacterial community

    KAUST Repository

    Wang, Yong; Gao, Zhaoming; Xu, Ying; Li, Guangyu; He, Lisheng; Qian, Peiyuan

    2016-01-01

    -generation-sequencing technology. Using a synthetic bacterial community, the amplification efficiency of the Multiple Annealing and Looping Based Amplification Cycles (MALBAC) kit that is originally developed to amplify the single-cell genomic DNA of mammalian organisms

  5. Analysis of the genetic variation in Mycobacterium tuberculosis strains by multiple genome alignments

    Directory of Open Access Journals (Sweden)

    Morales Juan

    2008-11-01

    Background: The recent determination of the complete nucleotide sequence of several Mycobacterium tuberculosis (MTB) genomes allows the use of comparative genomics as a tool for dissecting the nature and consequence of genetic variability within this species. The multiple alignment of the genomes of clinical strains (CDC1551, F11, Haarlem and C), along with the genomes of laboratory strains (H37Rv and H37Ra), provides new insights on the mechanisms of adaptation of this bacterium to the human host. Findings: The genetic variation found in six M. tuberculosis strains does not involve significant genomic rearrangements. Most of the variation results from deletion and transposition events preferentially associated with insertion sequences and genes of the PE/PPE family but not with genes implicated in virulence. Using the Perl-based software islandsanalyser, which creates a representation of the genetic variation in the genome, we identified differences in the patterns of distribution and frequency of the polymorphisms across the genome. The identification of genes displaying strain-specific polymorphisms and the extrapolation of the number of strain-specific polymorphisms to an unlimited number of genomes indicate that the different strains contain a limited number of unique polymorphisms. Conclusion: The comparison of multiple genomes demonstrates that the M. tuberculosis genome is currently undergoing an active process of gene decay, analogous to the adaptation process of obligate bacterial symbionts. This observation opens new perspectives into the evolution and the understanding of the pathogenesis of this bacterium.

  6. Cinteny: flexible analysis and visualization of synteny and genome rearrangements in multiple organisms

    Directory of Open Access Journals (Sweden)

    Meller Jaroslaw

    2007-03-01

    Background: Identifying syntenic regions, i.e., blocks of genes or other markers with evolutionarily conserved order, and quantifying evolutionary relatedness between genomes in terms of chromosomal rearrangements is one of the central goals in comparative genomics. However, the analysis of synteny and the resulting assessment of genome rearrangements are sensitive to the choice of a number of arbitrary parameters that affect the detection of synteny blocks. In particular, the choice of a set of markers and the effect of different aggregation strategies, which enable coarse graining of synteny blocks and exclusion of micro-rearrangements, need to be assessed. Therefore, existing tools and resources that facilitate identification, visualization and analysis of synteny need to be further improved to provide a flexible platform for such analysis, especially in the context of multiple genomes. Results: We present a new tool, Cinteny, for fast identification and analysis of synteny with different sets of markers and various levels of coarse graining of syntenic blocks. Using the Hannenhalli-Pevzner approach and its extensions, Cinteny also enables interactive determination of evolutionary relationships between genomes in terms of the number of rearrangements (the reversal distance). In particular, Cinteny provides: (i) integration of synteny browsing with assessment of evolutionary distances for multiple genomes; (ii) flexibility to adjust the parameters and re-compute the results on-the-fly; (iii) ability to work with user provided data, such as orthologous genes, sequence tags or other conserved markers. In addition, Cinteny provides many annotated mammalian, invertebrate and fungal genomes that are pre-loaded and available for analysis at http://cinteny.cchmc.org. Conclusion: Cinteny allows one to automatically compare multiple genomes and perform sensitivity analysis for synteny block detection and for the subsequent computation of reversal distances.
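The reversal distance mentioned in this abstract can be illustrated with a much simpler quantity than the full Hannenhalli-Pevzner computation. The following is a minimal sketch, not Cinteny's implementation: it counts breakpoints in a signed permutation of synteny blocks and derives the standard lower bound on the number of reversals (one reversal can remove at most two breakpoints). The function names and the example permutation are hypothetical.

```python
from math import ceil


def count_breakpoints(perm):
    """Count breakpoints of a signed permutation of the blocks 1..n.

    The permutation is framed with 0 and n + 1. A breakpoint is any
    adjacent pair (a, b) in the framed sequence with b - a != 1.
    """
    n = len(perm)
    framed = [0] + list(perm) + [n + 1]
    return sum(1 for a, b in zip(framed, framed[1:]) if b - a != 1)


def reversal_distance_lower_bound(perm):
    """Lower bound only: a reversal removes at most two breakpoints."""
    return ceil(count_breakpoints(perm) / 2)


# Hypothetical genome in which blocks 2 and 3 were inverted together.
example = [1, -3, -2, 4]
print(count_breakpoints(example), reversal_distance_lower_bound(example))
```

For the example, a single reversal of the segment `-3, -2` restores the identity order, so the bound happens to be tight here; in general the true reversal distance can be larger than this bound.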

  7. Ascaris phylogeny based on multiple whole mtDNA genomes

    DEFF Research Database (Denmark)

    Nejsum, Peter; Hawash, Mohamed B F; Betson, Martha

    2016-01-01

    and C) of human and pig Ascaris based on partial cox1 sequences. In the present study, we selected major haplotypes from these different clusters to characterize their whole mitochondrial genomes for phylogenetic analysis. We also undertook coalescent simulations to investigate the evolutionary history...

  8. mpscan: Fast Localisation of Multiple Reads in Genomes

    Science.gov (United States)

    Rivals, Eric; Salmela, Leena; Kiiskinen, Petteri; Kalsi, Petri; Tarhio, Jorma

    With Next Generation Sequencers, sequence based transcriptomic or epigenomic assays yield millions of short sequence reads that need to be mapped back on a reference genome. The upcoming versions of these sequencers promise even higher sequencing capacities; this may turn the read mapping task into a bottleneck for which alternative pattern matching approaches must be explored. We present an algorithm and its implementation, called mpscan, which uses a sophisticated filtration scheme to match a set of patterns/reads exactly on a sequence. mpscan can search for millions of reads in a single pass through the genome without indexing its sequence. Moreover, we show that mpscan offers an optimal average time complexity, which is sublinear in the text length, meaning that it does not need to examine all sequence positions. Comparisons with BLAT-like tools and with six specialised read mapping programs (like bowtie or zoom) demonstrate that mpscan is also the fastest algorithm in practice for exact matching. Our accuracy and scalability comparisons reveal that some tools are inappropriate for read mapping. Moreover, we provide evidence suggesting that exact matching may be a valuable solution in some read mapping applications. As most read mapping programs rely in some way on exact matching procedures to perform approximate pattern mapping, the filtration scheme we tested may prove useful in the design of future algorithms. The absence of a genome index gives mpscan a low memory requirement and the flexibility to run on a desktop computer, and avoids time-consuming genome preprocessing.
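The task this abstract describes, locating a set of reads exactly in a genome in one pass without building a genome index, can be sketched as follows. This is a naive baseline under stated assumptions, not mpscan's filtration scheme: it assumes all reads have the same length, stores them in a hash set, and inspects every window of the genome, so it is linear rather than sublinear in the text length. The function name and the toy sequences are hypothetical.

```python
from collections import defaultdict


def find_exact_matches(genome, reads):
    """Return {read: [0-based start positions]} for equal-length reads.

    The reads are hashed once; the genome itself is never indexed.
    """
    if not reads:
        return {}
    length = len(reads[0])
    if any(len(read) != length for read in reads):
        raise ValueError("this sketch assumes all reads have the same length")
    wanted = set(reads)
    hits = defaultdict(list)
    # Single pass over the genome, one window per start position.
    for start in range(len(genome) - length + 1):
        window = genome[start:start + length]
        if window in wanted:
            hits[window].append(start)
    return {read: hits.get(read, []) for read in wanted}


matches = find_exact_matches("ACGTACGTTAGC", ["ACGT", "TAGC", "GGGG"])
print(matches["ACGT"], matches["TAGC"], matches["GGGG"])
```

Memory use depends on the read set rather than on the genome, which mirrors the index-free design the abstract emphasises; the speed advantage claimed for mpscan comes from its filtration step, which this sketch does not attempt to reproduce.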

  9. Genome-wide characterization of centromeric satellites from multiple mammalian genomes.

    Science.gov (United States)

    Alkan, Can; Cardone, Maria Francesca; Catacchio, Claudia Rita; Antonacci, Francesca; O'Brien, Stephen J; Ryder, Oliver A; Purgato, Stefania; Zoli, Monica; Della Valle, Giuliano; Eichler, Evan E; Ventura, Mario

    2011-01-01

    Despite its importance in cell biology and evolution, the centromere has remained the final frontier in genome assembly and annotation due to its complex repeat structure. However, isolation and characterization of the centromeric repeats from newly sequenced species are necessary for a complete understanding of genome evolution and function. In recent years, various genomes have been sequenced, but the characterization of the corresponding centromeric DNA has lagged behind. Here, we present a computational method (RepeatNet) to systematically identify higher-order repeat structures from unassembled whole-genome shotgun sequence and test whether these sequence elements correspond to functional centromeric sequences. We analyzed genome datasets from six species of mammals representing the diversity of the mammalian lineage, namely, horse, dog, elephant, armadillo, opossum, and platypus. We define candidate monomer satellite repeats and demonstrate centromeric localization for five of the six genomes. Our analysis revealed the greatest diversity of centromeric sequences in horse and dog in contrast to elephant and armadillo, which showed high-centromeric sequence homogeneity. We could not isolate centromeric sequences within the platypus genome, suggesting that centromeres in platypus are not enriched in satellite DNA. Our method can be applied to the characterization of thousands of other vertebrate genomes anticipated for sequencing in the near future, providing an important tool for annotation of centromeres.

  10. Genome analysis of multiple pathogenic isolates of Streptococcus agalactiae : Implications for the microbial "pan-genome"

    NARCIS (Netherlands)

    Tettelin, H; Masignani; Cieslewicz, MJ; Donati, C; Medini, D; Ward, NL; Angiuoli, SV; Crabtree, J; Jones, AL; Durkin, AS; DeBoy, RT; Davidsen, TM; Mora, M; Scarselli, M; Ros, IMY; Peterson, JD; Hauser, CR; Sundaram, JP; Nelson, WC; Madupu, R; Brinkac, LM; Dodson, RJ; Rosovitz, MJ; Sullivan, SA; Daugherty, SC; Haft, DH; Selengut, J; Gwinn, ML; Zhou, LW; Zafar, N; Khouri, H; Radune, D; Dimitrov, G; Watkins, K; O'Connor, KJB; Smith, S; Utterback, TR; White, O; Rubens, CE; Grandi, G; Madoff, LC; Kasper, DL; Telford, JL; Wessels, MR; Rappuoli, R; Fraser, CM

    2005-01-01

    The development of efficient and inexpensive genome sequencing methods has revolutionized the study of human bacterial pathogens and improved vaccine design. Unfortunately, the sequence of a single genome does not reflect how genetic variability drives pathogenesis within a bacterial species and

  11. Identification of Ohnolog Genes Originating from Whole Genome Duplication in Early Vertebrates, Based on Synteny Comparison across Multiple Genomes.

    Science.gov (United States)

    Singh, Param Priya; Arora, Jatin; Isambert, Hervé

    2015-07-01

    Whole genome duplications (WGD) have now been firmly established in all major eukaryotic kingdoms. In particular, all vertebrates descend from two rounds of WGDs, that occurred in their jawless ancestor some 500 MY ago. Paralogs retained from WGD, also coined 'ohnologs' after Susumu Ohno, have been shown to be typically associated with development, signaling and gene regulation. Ohnologs, which amount to about 20 to 35% of genes in the human genome, have also been shown to be prone to dominant deleterious mutations and frequently implicated in cancer and genetic diseases. Hence, identifying ohnologs is central to better understand the evolution of vertebrates and their susceptibility to genetic diseases. Early computational analyses to identify vertebrate ohnologs relied on content-based synteny comparisons between the human genome and a single invertebrate outgroup genome or within the human genome itself. These approaches are thus limited by lineage specific rearrangements in individual genomes. We report, in this study, the identification of vertebrate ohnologs based on the quantitative assessment and integration of synteny conservation between six amniote vertebrates and six invertebrate outgroups. Such a synteny comparison across multiple genomes is shown to enhance the statistical power of ohnolog identification in vertebrates compared to earlier approaches, by overcoming lineage specific genome rearrangements. Ohnolog gene families can be browsed and downloaded for three statistical confidence levels or recompiled for specific, user-defined, significance criteria at http://ohnologs.curie.fr/. In the light of the importance of WGD on the genetic makeup of vertebrates, our analysis provides a useful resource for researchers interested in gaining further insights on vertebrate evolution and genetic diseases.

  12. Multiple reference genomes and transcriptomes for Arabidopsis thaliana

    KAUST Repository

    Gan, Xiangchao

    2011-08-28

    Genetic differences between Arabidopsis thaliana accessions underlie the plant's extensive phenotypic variation, and until now these have been interpreted largely in the context of the annotated reference accession Col-0. Here we report the sequencing, assembly and annotation of the genomes of 18 natural A. thaliana accessions, and their transcriptomes. When assessed on the basis of the reference annotation, one-third of protein-coding genes are predicted to be disrupted in at least one accession. However, re-annotation of each genome revealed that alternative gene models often restore coding potential. Gene expression in seedlings differed for nearly half of expressed genes and was frequently associated with cis variants within 5 kilobases, as were intron retention alternative splicing events. Sequence and expression variation is most pronounced in genes that respond to the biotic environment. Our data further promote evolutionary and functional studies in A. thaliana, especially the MAGIC genetic reference population descended from these accessions. ©2011 Macmillan Publishers Limited. All rights reserved.

  13. Multiple reference genomes and transcriptomes for Arabidopsis thaliana

    KAUST Repository

    Gan, Xiangchao; Stegle, Oliver; Behr, Jonas; Steffen, Joshua G.; Drewe, Philipp; Hildebrand, Katie L.; Lyngsoe, Rune; Schultheiss, Sebastian J.; Osborne, Edward J.; Sreedharan, Vipin T.; Kahles, André; Bohnert, Regina; Jean, Géraldine; Derwent, Paul; Kersey, Paul; Belfield, Eric J.; Harberd, Nicholas P.; Kemen, Eric; Toomajian, Christopher; Kover, Paula X.; Clark, Richard M.; Rätsch, Gunnar; Mott, Richard

    2011-01-01

    Genetic differences between Arabidopsis thaliana accessions underlie the plant's extensive phenotypic variation, and until now these have been interpreted largely in the context of the annotated reference accession Col-0. Here we report the sequencing, assembly and annotation of the genomes of 18 natural A. thaliana accessions, and their transcriptomes. When assessed on the basis of the reference annotation, one-third of protein-coding genes are predicted to be disrupted in at least one accession. However, re-annotation of each genome revealed that alternative gene models often restore coding potential. Gene expression in seedlings differed for nearly half of expressed genes and was frequently associated with cis variants within 5 kilobases, as were intron retention alternative splicing events. Sequence and expression variation is most pronounced in genes that respond to the biotic environment. Our data further promote evolutionary and functional studies in A. thaliana, especially the MAGIC genetic reference population descended from these accessions. ©2011 Macmillan Publishers Limited. All rights reserved.

  14. Rapid genome reshaping by multiple-gene loss after whole-genome duplication in teleost fish suggested by mathematical modeling

    Science.gov (United States)

    Sato, Yukuto; Tsukamoto, Katsumi; Nishida, Mutsumi

    2015-01-01

    Whole-genome duplication (WGD) is believed to be a significant source of major evolutionary innovation. Redundant genes resulting from WGD are thought to be lost or acquire new functions. However, the rates of gene loss and thus temporal process of genome reshaping after WGD remain unclear. The WGD shared by all teleost fish, one-half of all jawed vertebrates, was more recent than the two ancient WGDs that occurred before the origin of jawed vertebrates, and thus lends itself to analysis of gene loss and genome reshaping. Using a newly developed orthology identification pipeline, we inferred the post–teleost-specific WGD evolutionary histories of 6,892 protein-coding genes from nine phylogenetically representative teleost genomes on a time-calibrated tree. We found that rapid gene loss did occur in the first 60 My, with a loss of more than 70–80% of duplicated genes, and produced similar genomic gene arrangements within teleosts in that relatively short time. Mathematical modeling suggests that rapid gene loss occurred mainly by events involving simultaneous loss of multiple genes. We found that the subsequent 250 My were characterized by slow and steady loss of individual genes. Our pipeline also identified about 1,100 shared single-copy genes that are inferred to have become singletons before the divergence of clupeocephalan teleosts. Therefore, our comparative genome analysis suggests that rapid gene loss just after the WGD reshaped teleost genomes before the major divergence, and provides a useful set of marker genes for future phylogenetic analysis. PMID:26578810

  15. Multiple-Trait Genomic Selection Methods Increase Genetic Value Prediction Accuracy

    Science.gov (United States)

    Jia, Yi; Jannink, Jean-Luc

    2012-01-01

    Genetic correlations between quantitative traits measured in many breeding programs are pervasive. These correlations indicate that measurements of one trait carry information on other traits. Current single-trait (univariate) genomic selection does not take advantage of this information. Multivariate genomic selection on multiple traits could accomplish this but has been little explored and tested in practical breeding programs. In this study, three multivariate linear models (i.e., GBLUP, BayesA, and BayesCπ) were presented and compared to univariate models using simulated and real quantitative traits controlled by different genetic architectures. We also extended BayesA with fixed hyperparameters to a full hierarchical model that estimated hyperparameters and BayesCπ to impute missing phenotypes. We found that optimal marker-effect variance priors depended on the genetic architecture of the trait so that estimating them was beneficial. We showed that the prediction accuracy for a low-heritability trait could be significantly increased by multivariate genomic selection when a correlated high-heritability trait was available. Further, multiple-trait genomic selection had higher prediction accuracy than single-trait genomic selection when phenotypes are not available on all individuals and traits. Additional factors affecting the performance of multiple-trait genomic selection were explored. PMID:23086217

  16. PSP: rapid identification of orthologous coding genes under positive selection across multiple closely related prokaryotic genomes.

    Science.gov (United States)

    Su, Fei; Ou, Hong-Yu; Tao, Fei; Tang, Hongzhi; Xu, Ping

    2013-12-27

    With genomic sequences of many closely related bacterial strains made available by deep sequencing, it is now possible to investigate trends in prokaryotic microevolution. Positive selection is a sub-process of microevolution, in which a particular mutation is favored, causing the allele frequency to continuously shift in one direction. Wide scanning of prokaryotic genomes has shown that positive selection at the molecular level is much more frequent than expected. Genes with significant positive selection may play key roles in bacterial adaptation to different environmental pressures. However, selection pressure analyses are computationally intensive and awkward to configure. Here we describe an open access web server, designated PSP (Positive Selection analysis for Prokaryotic genomes), for performing evolutionary analysis on orthologous coding genes, specially designed for rapid comparison of dozens of closely related prokaryotic genomes. Remarkably, PSP facilitates functional exploration at multiple levels through assignment and enrichment of KO, GO or COG terms. To illustrate this user-friendly tool, we analyzed Escherichia coli and Bacillus cereus genomes and found that several genes, which play key roles in human infection and antibiotic resistance, show significant evidence of positive selection. PSP is freely available to all users without any login requirement at: http://db-mml.sjtu.edu.cn/PSP/. PSP ultimately allows researchers to do genome-scale analysis for evolutionary selection across multiple prokaryotic genomes rapidly and easily, and identify the genes undergoing positive selection, which may play key roles in host-pathogen interactions and/or environmental adaptation.

  17. Genomic Physics. Multiple Laser Beam Treatment of Alzheimer's Disease

    Science.gov (United States)

    Stefan, V. Alexander

    2014-03-01

    The synapses affected by Alzheimer's disease can be rejuvenated by the multiple ultrashort wavelength laser beams.[2] The guiding lasers scan the whole area to detect the amyloid plaques based on the laser scattering technique. The scanning lasers pinpoint the areas with plaques and eliminate them. Laser interaction is highly efficient, because of the focusing capabilities and possibility for the identification of the damaging proteins by matching the protein oscillation eigen-frequency with laser frequency.[3] Supported by Nikola Tesla Labs, La Jolla, California, USA.

  18. HAL: a hierarchical format for storing and analyzing multiple genome alignments.

    Science.gov (United States)

    Hickey, Glenn; Paten, Benedict; Earl, Dent; Zerbino, Daniel; Haussler, David

    2013-05-15

    Large multiple genome alignments and inferred ancestral genomes are ideal resources for comparative studies of molecular evolution, and advances in sequencing and computing technology are making them increasingly obtainable. These structures can provide a rich understanding of the genetic relationships between all subsets of species they contain. Current formats for storing genomic alignments, such as XMFA and MAF, are all indexed or ordered using a single reference genome, however, which limits the information that can be queried with respect to other species and clades. This loss of information grows with the number of species under comparison, as well as their phylogenetic distance. We present HAL, a compressed, graph-based hierarchical alignment format for storing multiple genome alignments and ancestral reconstructions. HAL graphs are indexed on all genomes they contain. Furthermore, they are organized phylogenetically, which allows for modular and parallel access to arbitrary subclades without fragmentation because of rearrangements that have occurred in other lineages. HAL graphs can be created or read with a comprehensive C++ API. A set of tools is also provided to perform basic operations, such as importing and exporting data, identifying mutations and coordinate mapping (liftover). All documentation and source code for the HAL API and tools are freely available at http://github.com/glennhickey/hal. Contact: hickey@soe.ucsc.edu or haussler@soe.ucsc.edu. Supplementary data are available at Bioinformatics online.

  19. Imputation and quality control steps for combining multiple genome-wide datasets

    Directory of Open Access Journals (Sweden)

    Shefali S Verma

    2014-12-01

    The electronic MEdical Records and GEnomics (eMERGE) network brings together DNA biobanks linked to electronic health records (EHRs) from multiple institutions. Approximately 52,000 DNA samples from distinct individuals have been genotyped using genome-wide SNP arrays across the nine sites of the network. The eMERGE Coordinating Center and the Genomics Workgroup developed a pipeline to impute and merge genomic data across the different SNP arrays to maximize sample size and power to detect associations with a variety of clinical endpoints. The 1000 Genomes cosmopolitan reference panel was used for imputation. Imputation results were evaluated using the following metrics: accuracy of imputation, allelic R² (estimated correlation between the imputed and true genotypes), and the relationship between allelic R² and minor allele frequency. Computation time and memory resources required by two different software packages (BEAGLE and IMPUTE2) were also evaluated. A number of challenges were encountered due to the complexity of using two different imputation software packages, multiple ancestral populations, and many different genotyping platforms. We present lessons learned and describe the pipeline implemented here to impute and merge genomic data sets. The eMERGE imputed dataset will serve as a valuable resource for discovery, leveraging the clinical data that can be mined from the EHR.
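
    As a rough illustration of the allelic R² metric named in this abstract, the sketch below computes the squared Pearson correlation between true allele counts and imputed dosages at a single SNP, together with the minor allele frequency. It assumes the true genotypes are known (for example, deliberately masked genotypes); imputation packages such as BEAGLE instead estimate this quantity without the truth. The function names and the toy data are hypothetical and are not taken from the eMERGE pipeline.

```python
from math import fsum

def allelic_r2(true_genotypes, imputed_dosages):
    """Squared Pearson correlation between true allele counts (0, 1, 2)
    and imputed dosages (0.0 to 2.0) at one SNP."""
    n = len(true_genotypes)
    if n != len(imputed_dosages) or n < 2:
        raise ValueError("need two equal-length vectors with at least 2 samples")
    mean_t = fsum(true_genotypes) / n
    mean_d = fsum(imputed_dosages) / n
    cov = fsum((t - mean_t) * (d - mean_d)
               for t, d in zip(true_genotypes, imputed_dosages))
    var_t = fsum((t - mean_t) ** 2 for t in true_genotypes)
    var_d = fsum((d - mean_d) ** 2 for d in imputed_dosages)
    if var_t == 0 or var_d == 0:
        return float("nan")  # monomorphic site: correlation is undefined
    return cov * cov / (var_t * var_d)

def minor_allele_frequency(genotypes):
    """Frequency of the rarer allele, from per-sample allele counts (0, 1, 2)."""
    freq = fsum(genotypes) / (2 * len(genotypes))
    return min(freq, 1 - freq)

# Toy data (hypothetical): four samples at one SNP.
truth = [0, 1, 2, 1]
dosage = [0.1, 0.9, 1.8, 1.2]
print(allelic_r2(truth, dosage), minor_allele_frequency(truth))
```

    Computing both values per SNP is what makes it possible to examine how R² varies with minor allele frequency, the third metric the abstract lists.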

  20. Intervene: a tool for intersection and visualization of multiple gene or genomic region sets.

    Science.gov (United States)

    Khan, Aziz; Mathelier, Anthony

    2017-05-31

    A common task for scientists relies on comparing lists of genes or genomic regions derived from high-throughput sequencing experiments. While several tools exist to intersect and visualize sets of genes, similar tools dedicated to the visualization of genomic region sets are currently limited. To address this gap, we have developed the Intervene tool, which provides an easy and automated interface for the effective intersection and visualization of genomic region or list sets, thus facilitating their analysis and interpretation. Intervene contains three modules: venn to generate Venn diagrams of up to six sets, upset to generate UpSet plots of multiple sets, and pairwise to compute and visualize intersections of multiple sets as clustered heat maps. Intervene, and its interactive web ShinyApp companion, generate publication-quality figures for the interpretation of genomic region and list sets. Intervene and its web application companion provide an easy command line and an interactive web interface to compute intersections of multiple genomic and list sets. They have the capacity to plot intersections using easy-to-interpret visual approaches. Intervene is developed and designed to meet the needs of both computer scientists and biologists. The source code is freely available at https://bitbucket.org/CBGR/intervene , with the web application available at https://asntech.shinyapps.io/intervene .
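
    The kind of computation that underlies Venn diagrams, UpSet plots and pairwise heat maps can be sketched for plain gene lists as follows. This is not Intervene's own code or API: the function names are hypothetical, the Jaccard index is only one of several statistics a pairwise comparison might use, and genomic regions would need coordinate-overlap logic that is omitted here.

```python
from itertools import combinations

def exclusive_regions(named_sets):
    """Map each combination of set names to the items found in exactly
    those sets and no others (the cells of a Venn or UpSet diagram)."""
    regions = {}
    all_items = set().union(*named_sets.values())
    for item in all_items:
        members = frozenset(name for name, s in named_sets.items() if item in s)
        regions.setdefault(members, set()).add(item)
    return regions

def pairwise_jaccard(named_sets):
    """Jaccard similarity for every pair of sets, e.g. as heat-map input."""
    result = {}
    for a, b in combinations(sorted(named_sets), 2):
        union = named_sets[a] | named_sets[b]
        shared = named_sets[a] & named_sets[b]
        result[(a, b)] = len(shared) / len(union) if union else 0.0
    return result

# Hypothetical gene lists from three experiments.
gene_sets = {
    "A": {"g1", "g2", "g3"},
    "B": {"g2", "g3", "g4"},
    "C": {"g3", "g5"},
}
for members, items in exclusive_regions(gene_sets).items():
    print(sorted(members), sorted(items))
print(pairwise_jaccard(gene_sets))
```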

  1. Automated whole-genome multiple alignment of rat, mouse, and human

    Energy Technology Data Exchange (ETDEWEB)

    Brudno, Michael; Poliakov, Alexander; Salamov, Asaf; Cooper, Gregory M.; Sidow, Arend; Rubin, Edward M.; Solovyev, Victor; Batzoglou, Serafim; Dubchak, Inna

    2004-07-04

    We have built a whole genome multiple alignment of the three currently available mammalian genomes using a fully automated pipeline which combines the local/global approach of the Berkeley Genome Pipeline and the LAGAN program. The strategy is based on progressive alignment, and consists of two main steps: (1) alignment of the mouse and rat genomes; and (2) alignment of human to either the mouse-rat alignments from step 1, or the remaining unaligned mouse and rat sequences. The resulting alignments demonstrate high sensitivity, with 87% of all human gene-coding areas aligned in both mouse and rat. The specificity is also high: <7% of the rat contigs are aligned to multiple places in human and 97% of all alignments with human sequence > 100kb agree with a three-way synteny map built independently using predicted exons in the three genomes. At the nucleotide level <1% of the rat nucleotides are mapped to multiple places in the human sequence in the alignment; and 96.5% of human nucleotides within all alignments agree with the synteny map. The alignments are publicly available online, with visualization through the novel Multi-VISTA browser that we also present.

  2. Visual Comparison of Multiple Gene Expression Datasets in a Genomic Context

    Directory of Open Access Journals (Sweden)

    Borowski Krzysztof

    2008-06-01

    The need for novel methods of visualizing microarray data is growing. New perspectives are beneficial to finding patterns in expression data. The Bluejay genome browser provides an integrative way of visualizing gene expression datasets in a genomic context. We have now developed the functionality to display multiple microarray datasets simultaneously in Bluejay, in order to provide researchers with a comprehensive view of their datasets linked to a graphical representation of gene function. This will enable biologists to obtain valuable insights on expression patterns, by allowing them to analyze the expression values in relation to the gene locations as well as to compare expression profiles of related genomes or of different experiments for the same genome.

  3. Multiple hybrid de novo genome assembly of finger millet, an orphan allotetraploid crop.

    Science.gov (United States)

    Hatakeyama, Masaomi; Aluri, Sirisha; Balachadran, Mathi Thumilan; Sivarajan, Sajeevan Radha; Patrignani, Andrea; Grüter, Simon; Poveda, Lucy; Shimizu-Inatsugi, Rie; Baeten, John; Francoijs, Kees-Jan; Nataraja, Karaba N; Reddy, Yellodu A Nanja; Phadnis, Shamprasad; Ravikumar, Ramapura L; Schlapbach, Ralph; Sreeman, Sheshshayee M; Shimizu, Kentaro K

    2017-09-05

    Finger millet (Eleusine coracana (L.) Gaertn) is an important crop for food security because of its tolerance to drought, which is expected to be exacerbated by global climate changes. Nevertheless, it is often classified as an orphan/underutilized crop because of the paucity of scientific attention. Among several small millets, finger millet is considered as an excellent source of essential nutrient elements, such as iron and zinc; hence, it has potential as an alternate coarse cereal. However, high-quality genome sequence data of finger millet are currently not available. One of the major problems encountered in the genome assembly of this species was its polyploidy, which hampers genome assembly compared with a diploid genome. To overcome this problem, we sequenced its genome using diverse technologies with sufficient coverage and assembled it via a novel multiple hybrid assembly workflow that combines next-generation with single-molecule sequencing, followed by whole-genome optical mapping using the Bionano Irys® system. The total number of scaffolds was 1,897 with an N50 length >2.6 Mb and detection of 96% of the universal single-copy orthologs. The majority of the homeologs were assembled separately. This indicates that the proposed workflow is applicable to the assembly of other allotetraploid genomes. © The Author 2017. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.
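
    The N50 statistic quoted in this abstract can be computed from a list of scaffold lengths as in the minimal sketch below. It uses the common definition (the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least half of the assembly); the function name and the toy lengths are illustrative and unrelated to the finger millet data.

```python
def n50(lengths):
    """Return the N50: the largest length L such that sequences of length
    >= L together cover at least half of the total assembly length."""
    if not lengths:
        raise ValueError("no sequence lengths given")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length

# Toy scaffold lengths (hypothetical), total 32: the two longest cover half.
print(n50([2, 2, 2, 3, 3, 4, 8, 8]))  # → 8
```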

  4. M-GCAT: interactively and efficiently constructing large-scale multiple genome comparison frameworks in closely related species

    Directory of Open Access Journals (Sweden)

    Messeguer Xavier

    2006-10-01

    Background: Due to recent advances in whole genome shotgun sequencing and assembly technologies, the financial cost of decoding an organism's DNA has been drastically reduced, resulting in a recent explosion of genomic sequencing projects. This increase in related genomic data will allow for in-depth studies of evolution in closely related species through multiple whole genome comparisons. Results: To facilitate such comparisons, we present an interactive multiple genome comparison and alignment tool, M-GCAT, that can efficiently construct multiple genome comparison frameworks in closely related species. M-GCAT is able to compare and identify highly conserved regions in up to 20 closely related bacterial species in minutes on a standard computer, and as many as 90 (containing 75 cloned genomes from a set of 15 published enterobacterial genomes) in an hour. M-GCAT also incorporates a novel comparative genomics data visualization interface allowing the user to globally and locally examine and inspect the conserved regions and gene annotations. Conclusion: M-GCAT is an interactive comparative genomics tool well suited for quickly generating multiple genome comparison frameworks and alignments among closely related species. M-GCAT is freely available for download for academic and non-commercial use at: http://alggen.lsi.upc.es/recerca/align/mgcat/intro-mgcat.html.

  5. [Investigation of RNA viral genome amplification by multiple displacement amplification technique].

    Science.gov (United States)

    Pang, Zheng; Li, Jian-Dong; Li, Chuan; Liang, Mi-Fang; Li, De-Xin

    2013-06-01

    In order to facilitate the detection of newly emerging or rare viral infectious diseases, a negative-strand RNA virus (severe fever with thrombocytopenia syndrome bunyavirus) and a positive-strand RNA virus (dengue virus) were used to investigate nonspecific amplification of RNA viral genomes from clinical samples by the multiple displacement amplification technique. Series of 10-fold dilutions of purified viral RNA were used as analog samples with different pathogen loads. After a series of sequential reactions, single-strand cDNA, double-strand cDNA, and double-strand cDNA treated with ligation without or with supplemental RNA were generated. A Phi29 DNA polymerase-dependent isothermal amplification was then employed, and finally the target gene copies were detected by real-time PCR assays to evaluate the amplification efficiencies of the various methods. The results showed that the multiple displacement amplification effects on single-strand or double-strand cDNA templates were limited, while the fold increases for double-strand cDNA templates treated with ligation could be up to 6 × 10^3, and even 2 × 10^5 when supplemental RNA was present, and better results were obtained when viral RNA loads were lower. An RNA viral genome amplification system using the multiple displacement amplification technique was established in this study, and effective amplification of low-load RNA viral genomes was achieved, which could provide a tool to synthesize adequate viral genome for multiplex pathogen detection.

  6. Multiplicity of genome equivalents in the radiation-resistant bacterium Micrococcus radiodurans.

    Science.gov (United States)

    Hansen, M T

    1978-01-01

    The complexity of the genome of Micrococcus radiodurans was determined to be (2.0 ± 0.3) × 10^9 daltons by DNA renaturation kinetics. The number of genome equivalents of DNA per cell was calculated from the complexity and the content of DNA. A lower limit of four genome equivalents per cell was approached with decreasing growth rate. Thus, no haploid stage appeared to be realized in this organism. The replication time was estimated from the kinetics and amount of residual DNA synthesis after inhibiting initiation of new rounds of replication. From this, the redundancy of terminal genetic markers was calculated to vary with growth rate from four to approximately eight copies per cell. All genetic material, including the least abundant, is thus multiply represented in each cell. The potential significance of the maintenance in each cell of multiple gene copies is discussed in relation to the extreme radiation resistance of M. radiodurans. PMID:649572
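
    The calculation described here (genome equivalents per cell from genome complexity and DNA content) amounts to a unit conversion and a division, sketched below. The genome complexity is the value reported in the abstract; the DNA content per cell is a hypothetical number chosen only for illustration, since the abstract does not report one, and the function names are likewise assumptions.

```python
AVOGADRO = 6.02214076e23  # molecules per mole (exact SI value)

def genome_mass_grams(genome_daltons):
    """Mass of one genome copy in grams, from its molecular mass in daltons."""
    return genome_daltons / AVOGADRO

def genome_equivalents(dna_per_cell_grams, genome_daltons):
    """Number of genome copies accounted for by the DNA content of one cell."""
    return dna_per_cell_grams / genome_mass_grams(genome_daltons)

complexity = 2.0e9       # daltons, as reported in the abstract
dna_per_cell = 1.33e-14  # grams per cell: hypothetical, not from the abstract

print(genome_mass_grams(complexity))                 # about 3.3e-15 g per genome
print(genome_equivalents(dna_per_cell, complexity))  # roughly four copies
```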

  7. Multiple recent horizontal transfers of a large genomic region in cheese making fungi.

    Science.gov (United States)

    Cheeseman, Kevin; Ropars, Jeanne; Renault, Pierre; Dupont, Joëlle; Gouzy, Jérôme; Branca, Antoine; Abraham, Anne-Laure; Ceppi, Maurizio; Conseiller, Emmanuel; Debuchy, Robert; Malagnac, Fabienne; Goarin, Anne; Silar, Philippe; Lacoste, Sandrine; Sallet, Erika; Bensimon, Aaron; Giraud, Tatiana; Brygoo, Yves

    2014-01-01

    While the extent and impact of horizontal transfers in prokaryotes are widely acknowledged, their importance to the eukaryotic kingdom is unclear and thought by many to be anecdotal. Here we report multiple recent transfers of a huge genomic island between Penicillium spp. found in the food environment. Sequencing of the two leading filamentous fungi used in cheese making, P. roqueforti and P. camemberti, and comparison with the penicillin producer P. rubens reveals a 575 kb long genomic island in P. roqueforti--called Wallaby--present as identical fragments at non-homologous loci in P. camemberti and P. rubens. Wallaby is detected in Penicillium collections exclusively in strains from food environments. Wallaby encompasses about 250 predicted genes, some of which are probably involved in competition with microorganisms. The occurrence of multiple recent eukaryotic transfers in the food environment provides strong evidence for the importance of this understudied and probably underestimated phenomenon in eukaryotes.

  8. An evaluation of multiple annealing and looping based genome amplification using a synthetic bacterial community

    KAUST Repository

    Wang, Yong

    2016-02-23

    The low biomass in environmental samples is a major challenge for microbial metagenomic studies. Amplification of genomic DNA is frequently used to meet the minimum DNA input required for high-throughput next-generation sequencing. Using a synthetic bacterial community, the amplification efficiency of the Multiple Annealing and Looping Based Amplification Cycles (MALBAC) kit, which was originally developed to amplify single-cell genomic DNA from mammalian organisms, is examined. A DNA template of 10 pg per MALBAC reaction may generate enough DNA for Illumina sequencing. With 10 pg and 100 pg templates per reaction, the MALBAC kit shows stable and homogeneous amplification, as indicated by the highly consistent coverage of reads from the two amplified samples on contigs assembled from the original unamplified sample. Although the GenomePlex whole genome amplification kit can generate enough DNA from 100 pg of template per reaction, the minority species in the mixed bacterial community are not linearly amplified. For both kits, GC-rich regions of the genomic DNA are not efficiently amplified, as suggested by the low coverage of contigs with high GC content. The results support the high efficiency of the MALBAC kit for amplifying environmental microbial DNA samples, but also raise concerns about its application to bacterial species with high GC content.

  9. Natural selection affects multiple aspects of genetic variation at putatively neutral sites across the human genome.

    Science.gov (United States)

    Lohmueller, Kirk E; Albrechtsen, Anders; Li, Yingrui; Kim, Su Yeon; Korneliussen, Thorfinn; Vinckenbosch, Nicolas; Tian, Geng; Huerta-Sanchez, Emilia; Feder, Alison F; Grarup, Niels; Jørgensen, Torben; Jiang, Tao; Witte, Daniel R; Sandbæk, Annelli; Hellmann, Ines; Lauritzen, Torsten; Hansen, Torben; Pedersen, Oluf; Wang, Jun; Nielsen, Rasmus

    2011-10-01

    A major question in evolutionary biology is how natural selection has shaped patterns of genetic variation across the human genome. Previous work has documented a reduction in genetic diversity in regions of the genome with low recombination rates. However, it is unclear whether other summaries of genetic variation, like allele frequencies, are also correlated with recombination rate and whether these correlations can be explained solely by negative selection against deleterious mutations or whether positive selection acting on favorable alleles is also required. Here we attempt to address these questions by analyzing three different genome-wide resequencing datasets from European individuals. We document several significant correlations between different genomic features. In particular, we find that average minor allele frequency and diversity are reduced in regions of low recombination and that human diversity, human-chimp divergence, and average minor allele frequency are reduced near genes. Population genetic simulations show that either positive natural selection acting on favorable mutations or negative natural selection acting against deleterious mutations can explain these correlations. However, models with strong positive selection on nonsynonymous mutations and little negative selection predict a stronger negative correlation between neutral diversity and nonsynonymous divergence than observed in the actual data, supporting the importance of negative, rather than positive, selection throughout the genome. Further, we show that the widespread presence of weakly deleterious alleles, rather than a small number of strongly positively selected mutations, is responsible for the correlation between neutral genetic diversity and recombination rate. This work suggests that natural selection has affected multiple aspects of linked neutral variation throughout the human genome and that positive selection is not required to explain these observations.

  10. Partial replicas of uv-irradiated bacteriophage T4 genomes and their role in multiplicity reactivation

    International Nuclear Information System (INIS)

    Rayssiguier, C.; Kozinski, A.W.; Doermann, A.H.

    1980-01-01

    A physicochemical study was made of the replication and transmission of uv-irradiated T4 genomes. The data presented in this paper justify the following conclusions. (i) For both low and high multiplicity of infection there was abundant replication from uv-irradiated parental templates. It exceeded by far the efficiency predicted by the hypothesis that a single lethal hit completely prevents replication of the killed phage DNA: i.e., some dead phage particles must replicate parts of their DNA. (ii) Replication of the uv-irradiated DNA was repetitive as shown by density reversal experiments. (iii) Newly synthesized progeny DNA originating from uv-irradiated templates appeared as significantly shorter segments of the genomes than progeny DNA produced from non-uv-irradiated templates. A good correlation existed between the number of uv hits and the number of random cuts that would be needed to reduce replication fragments to the length observed. (iv) The contribution of uv-irradiated parental DNA among progeny phage in multiplicity reactivation was disposed in shorter subunits than was the DNA from unirradiated parental phage. It is important to emphasize that it was mainly in the form of replicative hybrid. These conclusions appear to justify excluding interparental recombination as a prerequisite for multiplicity reactivation. They lead directly to some form of partial replica hypothesis for multiplicity reactivation

  11. Nanoliter reactors improve multiple displacement amplification of genomes from single cells.

    Directory of Open Access Journals (Sweden)

    Yann Marcy

    2007-09-01

    Since only a small fraction of environmental bacteria are amenable to laboratory culture, there is great interest in genomic sequencing directly from single cells. Sufficient DNA for sequencing can be obtained from one cell by the Multiple Displacement Amplification (MDA) method, thereby eliminating the need to develop culture methods. Here we used a microfluidic device to isolate individual Escherichia coli and amplify genomic DNA by MDA in 60-nl reactions. Our results confirm a report that reduced MDA reaction volume lowers nonspecific synthesis that can result from contaminant DNA templates and unfavourable interaction between primers. The quality of the genome amplification was assessed by qPCR and compared favourably to single-cell amplifications performed in standard 50-µl volumes. Amplification bias was greatly reduced in nanoliter volumes, thereby providing a more even representation of all sequences. Single-cell amplicons from both microliter and nanoliter volumes provided high-quality sequence data by high-throughput pyrosequencing, thereby demonstrating a straightforward route to sequencing genomes from single cells.

  12. OrthoVenn: a web server for genome wide comparison and annotation of orthologous clusters across multiple species

    Science.gov (United States)

    Genome wide analysis of orthologous clusters is an important component of comparative genomics studies. Identifying the overlap among orthologous clusters can enable us to elucidate the function and evolution of proteins across multiple species. Here, we report a web platform named OrthoVenn that i...

  13. An evolvable oestrogen receptor activity sensor: development of a modular system for integrating multiple genes into the yeast genome

    NARCIS (Netherlands)

    Fox, J.E.; Bridgham, J.T.; Bovee, T.F.H.; Thornton, J.W.

    2007-01-01

    To study a gene interaction network, we developed a gene-targeting strategy that allows efficient and stable genomic integration of multiple genetic constructs at distinct target loci in the yeast genome. This gene-targeting strategy uses a modular plasmid with a recyclable selectable marker and a

  14. Analysis of Genome-Wide Association Studies with Multiple Outcomes Using Penalization

    Science.gov (United States)

    Liu, Jin; Huang, Jian; Ma, Shuangge

    2012-01-01

    Genome-wide association studies have been extensively conducted, searching for markers for biologically meaningful outcomes and phenotypes. Penalization methods have been adopted in the analysis of the joint effects of a large number of SNPs (single nucleotide polymorphisms) and marker identification. This study is partly motivated by the analysis of heterogeneous stock mice dataset, in which multiple correlated phenotypes and a large number of SNPs are available. Existing penalization methods designed to analyze a single response variable cannot accommodate the correlation among multiple response variables. With multiple response variables sharing the same set of markers, joint modeling is first employed to accommodate the correlation. The group Lasso approach is adopted to select markers associated with all the outcome variables. An efficient computational algorithm is developed. Simulation study and analysis of the heterogeneous stock mice dataset show that the proposed method can outperform existing penalization methods. PMID:23272092
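
    The abstract does not spell out its computational algorithm, so the following is only a sketch of the standard building block behind group Lasso selection: the group soft-thresholding (proximal) step. Each marker's coefficients across all outcomes form one group, and the whole group is either shrunk together or set exactly to zero, which is how a marker is selected or dropped for all outcomes at once. The function name and the numbers are hypothetical.

```python
from math import sqrt

def group_soft_threshold(coefs, lam):
    """Shrink one marker's coefficient vector (one entry per outcome) toward
    zero; the whole group becomes exactly zero when its L2 norm is <= lam."""
    norm = sqrt(sum(c * c for c in coefs))
    if norm <= lam:
        return [0.0] * len(coefs)
    scale = 1.0 - lam / norm
    return [scale * c for c in coefs]

# Hypothetical coefficients of two SNPs on two correlated phenotypes.
strong_snp = [3.0, 4.0]   # L2 norm 5.0: kept, but shrunk by a factor 0.8
weak_snp = [0.3, 0.4]     # L2 norm 0.5: removed for both phenotypes
print(group_soft_threshold(strong_snp, lam=1.0))
print(group_soft_threshold(weak_snp, lam=1.0))
```

    Because the threshold acts on the norm of the whole group, a marker with modest effects on several outcomes can survive where separate single-outcome penalties would discard it.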

  15. Multiple displacement amplification of whole genomic DNA from urediospores of Puccinia striiformis f. sp. tritici.

    Science.gov (United States)

    Zhang, R; Ma, Z H; Wu, B M

    2015-05-01

    Biotrophic fungi such as Puccinia striiformis f. sp. tritici cannot be cultured on nutrient media, so to obtain adequate quantities of DNA for molecular genetic analysis they are usually propagated on living hosts (wheat plants in the case of P. striiformis f. sp. tritici). The propagation process is time-, space- and labor-consuming and has been a bottleneck to molecular genetic analysis of this pathogen. In this study we evaluated multiple displacement amplification (MDA) of pathogen genomic DNA from urediospores as an alternative approach to traditional propagation of urediospores followed by DNA extraction. The quantities of pathogen genomic DNA in the products were further determined via real-time PCR with a pair of primers specific for the β-tubulin gene of P. striiformis f. sp. tritici. The amplified fragment length polymorphism (AFLP) fingerprints were also compared between the DNA products. The results demonstrated that adequate genomic DNA with fragment sizes larger than 23 kb could be amplified from 20 to 30 urediospores via the MDA method. The real-time PCR results suggested that although fresh urediospores collected from diseased leaves were the best, spores picked from diseased leaves stored for a prolonged period could also be used for amplification. AFLP fingerprints exhibited no significant differences between amplified DNA and DNA extracted with the CTAB method, suggesting that amplified DNA can represent the pathogen's genomic DNA very well. Therefore, MDA could be used to obtain genomic DNA from small precious samples (dozens of spores) for molecular genetic analysis of the wheat stripe rust pathogen and other fungi that are difficult to propagate.

  16. A Bayesian method and its variational approximation for prediction of genomic breeding values in multiple traits

    Directory of Open Access Journals (Sweden)

    Hayashi Takeshi

    2013-01-01

    Background: Genomic selection is an effective tool for animal and plant breeding, allowing effective individual selection without phenotypic records through the prediction of genomic breeding value (GBV). To date, genomic selection has focused on a single trait. However, actual breeding often targets multiple correlated traits, and, therefore, joint analysis taking into consideration the correlation between traits, which might result in more accurate GBV prediction than analyzing each trait separately, is suitable for multi-trait genomic selection. This would require an extension of the prediction model for single-trait GBV to multi-trait case. As the computational burden of multi-trait analysis is even higher than that of single-trait analysis, an effective computational method for constructing a multi-trait prediction model is also needed. Results: We described a Bayesian regression model incorporating variable selection for jointly predicting GBVs of multiple traits and devised both an MCMC iteration and variational approximation for Bayesian estimation of parameters in this multi-trait model. The proposed Bayesian procedures with MCMC iteration and variational approximation were referred to as MCBayes and varBayes, respectively. Using simulated datasets of SNP genotypes and phenotypes for three traits with high and low heritabilities, we compared the accuracy in predicting GBVs between multi-trait and single-trait analyses as well as between MCBayes and varBayes. The results showed that, compared to single-trait analysis, multi-trait analysis enabled much more accurate GBV prediction for low-heritability traits correlated with high-heritability traits, by utilizing the correlation structure between traits, while the prediction accuracy for uncorrelated low-heritability traits was comparable or less with multi-trait analysis in comparison with single-trait analysis depending on the setting for prior probability that a SNP has zero

  17. Digital Droplet Multiple Displacement Amplification (ddMDA) for Whole Genome Sequencing of Limited DNA Samples.

    Directory of Open Access Journals (Sweden)

    Minsoung Rhee

    Full Text Available Multiple displacement amplification (MDA) is a widely used technique for amplification of DNA from samples containing limited amounts of DNA (e.g., uncultivable microbes or clinical samples) before whole genome sequencing. Despite its advantages of high yield and fidelity, it suffers from high amplification bias and non-specific amplification when amplifying sub-nanogram amounts of template DNA. Here, we present a microfluidic digital droplet MDA (ddMDA) technique where partitioning of the template DNA into thousands of sub-nanoliter droplets, each containing a small number of DNA fragments, greatly reduces the competition among DNA fragments for primers and polymerase, thereby greatly reducing amplification bias. Consequently, the ddMDA approach enabled a more uniform coverage of amplification over the entire length of the genome, with significantly lower bias and non-specific amplification than conventional MDA. For a sample containing 0.1 pg/μL of E. coli DNA (equivalent of ~3/1000 of an E. coli genome per droplet), ddMDA achieves a 65-fold increase in coverage in de novo assembly, and more than 20-fold increase in specificity (percentage of reads mapping to E. coli) compared to the conventional tube MDA. ddMDA offers a powerful method useful for many applications including medical diagnostics, forensics, and environmental microbiology.

  18. Dynamic evolution of Geranium mitochondrial genomes through multiple horizontal and intracellular gene transfers.

    Science.gov (United States)

    Park, Seongjun; Grewe, Felix; Zhu, Andan; Ruhlman, Tracey A; Sabir, Jamal; Mower, Jeffrey P; Jansen, Robert K

    2015-10-01

    The exchange of genetic material between cellular organelles through intracellular gene transfer (IGT) or between species by horizontal gene transfer (HGT) has played an important role in plant mitochondrial genome evolution. The mitochondrial genomes of Geraniaceae display a number of unusual phenomena including highly accelerated rates of synonymous substitutions, extensive gene loss and reduction in RNA editing. Mitochondrial DNA sequences assembled for 17 species of Geranium revealed substantial reduction in gene and intron content relative to the ancestor of the Geranium lineage. Comparative analyses of nuclear transcriptome data suggest that a number of these sequences have been functionally relocated to the nucleus via IGT. Evidence for rampant HGT was detected in several Geranium species containing foreign organellar DNA from diverse eudicots, including many transfers from parasitic plants. One lineage has experienced multiple, independent HGT episodes, many of which occurred within the past 5.5 Myr. Both duplicative and recapture HGT were documented in Geranium lineages. The mitochondrial genome of Geranium brycei contains at least four independent HGT tracts that are absent in its nearest relative. Furthermore, G. brycei mitochondria carry two copies of the cox1 gene that differ in intron content, providing insight into contrasting hypotheses on cox1 intron evolution. © 2015 The Authors. New Phytologist © 2015 New Phytologist Trust.

  19. A "candidate-interactome" aggregate analysis of genome-wide association data in multiple sclerosis

    DEFF Research Database (Denmark)

    Mechelli, Rosella; Umeton, Renato; Policano, Claudia

    2013-01-01

    of genes whose products are known to physically interact with environmental factors that may be relevant for disease pathogenesis) analysis of genome-wide association data in multiple sclerosis. We looked for statistical enrichment of associations among interactomes that, at the current state of knowledge......, may be representative of gene-environment interactions of potential, uncertain or unlikely relevance for multiple sclerosis pathogenesis: Epstein-Barr virus, human immunodeficiency virus, hepatitis B virus, hepatitis C virus, cytomegalovirus, HHV8-Kaposi sarcoma, H1N1-influenza, JC virus, human innate...... immunity interactome for type I interferon, autoimmune regulator, vitamin D receptor, aryl hydrocarbon receptor and a panel of proteins targeted by 70 innate immune-modulating viral open reading frames from 30 viral species. Interactomes were either obtained from the literature or were manually curated...

  20. The Arabidopsis thaliana homolog of the helicase RTEL1 plays multiple roles in preserving genome stability.

    Science.gov (United States)

    Recker, Julia; Knoll, Alexander; Puchta, Holger

    2014-12-01

    In humans, mutations in the DNA helicase Regulator of Telomere Elongation Helicase1 (RTEL1) lead to Hoyeraal-Hreidarsson syndrome, a severe, multisystem disorder. Here, we demonstrate that the RTEL1 homolog in Arabidopsis thaliana plays multiple roles in preserving genome stability. RTEL1 suppresses homologous recombination in a pathway parallel to that of the DNA translocase FANCM. Cytological analyses of root meristems indicate that RTEL1 is involved in processing DNA replication intermediates independently from FANCM and the nuclease MUS81. Moreover, RTEL1 is involved in interstrand and intrastrand DNA cross-link repair independently from FANCM and (in intrastrand cross-link repair) parallel to MUS81. RTEL1 contributes to telomere homeostasis; the concurrent loss of RTEL1 and the telomerase TERT leads to rapid, severe telomere shortening, which occurs much more rapidly than it does in the single-mutant line tert, resulting in developmental arrest after four generations. The double mutant rtel1-1 recq4A-4 exhibits massive growth defects, indicating that this RecQ family helicase, which is also involved in the suppression of homologous recombination and the repair of DNA lesions, can partially replace RTEL1 in the processing of DNA intermediates. The requirement for RTEL1 in multiple pathways to preserve genome stability in plants can be explained by its putative role in the destabilization of DNA loop structures, such as D-loops and T-loops. © 2014 American Society of Plant Biologists. All rights reserved.

  1. Integration of Multiple Genomic and Phenotype Data to Infer Novel miRNA-Disease Associations.

    Science.gov (United States)

    Shi, Hongbo; Zhang, Guangde; Zhou, Meng; Cheng, Liang; Yang, Haixiu; Wang, Jing; Sun, Jie; Wang, Zhenzhen

    2016-01-01

    MicroRNAs (miRNAs) play an important role in the development and progression of human diseases. The identification of disease-associated miRNAs will be helpful for understanding the molecular mechanisms of diseases at the post-transcriptional level. Based on different types of genomic data sources, computational methods for miRNA-disease association prediction have been proposed. However, individual source of genomic data tends to be incomplete and noisy; therefore, the integration of various types of genomic data for inferring reliable miRNA-disease associations is urgently needed. In this study, we present a computational framework, CHNmiRD, for identifying miRNA-disease associations by integrating multiple genomic and phenotype data, including protein-protein interaction data, gene ontology data, experimentally verified miRNA-target relationships, disease phenotype information and known miRNA-disease connections. The performance of CHNmiRD was evaluated by experimentally verified miRNA-disease associations, which achieved an area under the ROC curve (AUC) of 0.834 for 5-fold cross-validation. In particular, CHNmiRD displayed excellent performance for diseases without any known related miRNAs. The results of case studies for three human diseases (glioblastoma, myocardial infarction and type 1 diabetes) showed that all of the top 10 ranked miRNAs having no known associations with these three diseases in existing miRNA-disease databases were directly or indirectly confirmed by our latest literature mining. All these results demonstrated the reliability and efficiency of CHNmiRD, and it is anticipated that CHNmiRD will serve as a powerful bioinformatics method for mining novel disease-related miRNAs and providing a new perspective into molecular mechanisms underlying human diseases at the post-transcriptional level. CHNmiRD is freely available at http://www.bio-bigdata.com/CHNmiRD.

  2. Model training across multiple breeding cycles significantly improves genomic prediction accuracy in rye (Secale cereale L.).

    Science.gov (United States)

    Auinger, Hans-Jürgen; Schönleben, Manfred; Lehermeier, Christina; Schmidt, Malthe; Korzun, Viktor; Geiger, Hartwig H; Piepho, Hans-Peter; Gordillo, Andres; Wilde, Peer; Bauer, Eva; Schön, Chris-Carolin

    2016-11-01

    Genomic prediction accuracy can be significantly increased by model calibration across multiple breeding cycles as long as selection cycles are connected by common ancestors. In hybrid rye breeding, application of genome-based prediction is expected to increase selection gain because of long selection cycles in population improvement and development of hybrid components. Essentially two prediction scenarios arise: (1) prediction of the genetic value of lines from the same breeding cycle in which model training is performed and (2) prediction of lines from subsequent cycles. It is the latter from which a reduction in cycle length and consequently the strongest impact on selection gain is expected. We empirically investigated genome-based prediction of grain yield, plant height and thousand kernel weight within and across four selection cycles of a hybrid rye breeding program. Prediction performance was assessed using genomic and pedigree-based best linear unbiased prediction (GBLUP and PBLUP). A total of 1040 S2 lines were genotyped with 16k SNPs and each year testcrosses of 260 S2 lines were phenotyped in seven or eight locations. The performance gap between GBLUP and PBLUP increased significantly for all traits when model calibration was performed on aggregated data from several cycles. Prediction accuracies obtained from cross-validation were in the order of 0.70 for all traits when data from all cycles (N_CS = 832) were used for model training and exceeded within-cycle accuracies in all cases. As long as selection cycles are connected by a sufficient number of common ancestors and prediction accuracy has not reached a plateau when increasing sample size, aggregating data from several preceding cycles is recommended for predicting genetic values in subsequent cycles despite decreasing relatedness over time.

  3. Integration of Multiple Genomic and Phenotype Data to Infer Novel miRNA-Disease Associations.

    Directory of Open Access Journals (Sweden)

    Hongbo Shi

    Full Text Available MicroRNAs (miRNAs) play an important role in the development and progression of human diseases. The identification of disease-associated miRNAs will be helpful for understanding the molecular mechanisms of diseases at the post-transcriptional level. Based on different types of genomic data sources, computational methods for miRNA-disease association prediction have been proposed. However, individual source of genomic data tends to be incomplete and noisy; therefore, the integration of various types of genomic data for inferring reliable miRNA-disease associations is urgently needed. In this study, we present a computational framework, CHNmiRD, for identifying miRNA-disease associations by integrating multiple genomic and phenotype data, including protein-protein interaction data, gene ontology data, experimentally verified miRNA-target relationships, disease phenotype information and known miRNA-disease connections. The performance of CHNmiRD was evaluated by experimentally verified miRNA-disease associations, which achieved an area under the ROC curve (AUC) of 0.834 for 5-fold cross-validation. In particular, CHNmiRD displayed excellent performance for diseases without any known related miRNAs. The results of case studies for three human diseases (glioblastoma, myocardial infarction and type 1 diabetes) showed that all of the top 10 ranked miRNAs having no known associations with these three diseases in existing miRNA-disease databases were directly or indirectly confirmed by our latest literature mining. All these results demonstrated the reliability and efficiency of CHNmiRD, and it is anticipated that CHNmiRD will serve as a powerful bioinformatics method for mining novel disease-related miRNAs and providing a new perspective into molecular mechanisms underlying human diseases at the post-transcriptional level. CHNmiRD is freely available at http://www.bio-bigdata.com/CHNmiRD.

  4. Origin of multiple periodicities in the Fourier power spectra of the Plasmodium falciparum genome

    Directory of Open Access Journals (Sweden)

    Nunes Miriam CS

    2011-12-01

    Full Text Available Abstract Background: Fourier transforms and their associated power spectra are used for detecting periodicities and protein-coding genes, and are generally regarded as a well-established technique. Many of the periodicities that have been found with this method are quite well understood, such as the periodicity of 3 nt, which is associated with codon usage. But what is the origin of the peculiar frequency multiples k/21 which were reported for a tiny section of chromosome 2 in P. falciparum? Are these present in other chromosomes and perhaps in related organisms? And how should we interpret fractional periodicities in genomes? Results: We applied the binary indicator power spectrum to all chromosomes of P. falciparum, and found that the frequency overtones k/21 are present only in non-coding sections. We did not find such frequency overtones in any other related genomes. Furthermore, the frequency overtones were identified as artifacts of the way the genome is encoded into a numerical sequence, that is, they are frequency aliases. When a different encoding of the sequence is chosen, the overtones do not appear. In view of these results, we revisited early applications of this technique to proteins where frequency overtones were reported. Conclusions: Some authors hinted recently at the possibility of mapping artifacts and frequency aliases in power spectra. However, in the case of P. falciparum the frequency aliases are particularly strong and can mask the 1/3 frequency which is used for gene detection. This shows that, despite this being a well-known technique with a long history of application in proteins, few researchers seem to be aware of the problems represented by frequency aliases.
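    The binary indicator power spectrum mentioned in this abstract can be illustrated with a minimal sketch. It assumes the common definition (one 0/1 indicator sequence per nucleotide, with the squared DFT magnitudes summed over the four nucleotides); the function name and the toy sequence are hypothetical and are not taken from the paper.

```python
import cmath


def indicator_power_spectrum(seq, alphabet="ACGT"):
    """Sum over nucleotides of |DFT(indicator)|^2 at each frequency index f (frequency f/N)."""
    n = len(seq)
    spectrum = [0.0] * n
    for base in alphabet:
        # 0/1 indicator sequence: 1 where the base occurs, 0 elsewhere
        indicator = [1.0 if s == base else 0.0 for s in seq]
        for f in range(n):
            coeff = sum(
                x * cmath.exp(-2j * cmath.pi * f * t / n)
                for t, x in enumerate(indicator)
            )
            spectrum[f] += abs(coeff) ** 2
    return spectrum


# Hypothetical toy sequence with an exact period of 3 nt (not real genomic data)
toy = "ATG" * 20
spectrum = indicator_power_spectrum(toy)
# Skip f = 0, which only reflects base composition
peak = max(range(1, len(toy) // 2 + 1), key=lambda f: spectrum[f])
print(f"peak at frequency {peak}/{len(toy)}")  # index 20 of 60, i.e. frequency 1/3
```

    The direct O(N^2) transform is used only to keep the sketch dependency-free; an FFT would be the usual choice for chromosome-length input. Replacing the 0/1 encoding with another numerical mapping changes which aliases appear, which is the effect the abstract describes.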

  5. Inactivating UBE2M impacts the DNA damage response and genome integrity involving multiple cullin ligases.

    Directory of Open Access Journals (Sweden)

    Scott Cukras

    Full Text Available Protein neddylation is involved in a wide variety of cellular processes. Here we show that the DNA damage response is perturbed in cells inactivated with an E2 Nedd8 conjugating enzyme UBE2M, measured by RAD51 foci formation kinetics and cell based DNA repair assays. UBE2M knockdown increases DNA breakages and cellular sensitivity to DNA damaging agents, further suggesting heightened genomic instability and defective DNA repair activity. Investigating the downstream Cullin targets of UBE2M revealed that silencing of Cullin 1, 2, and 4 ligases incurred significant DNA damage. In particular, UBE2M knockdown, or defective neddylation of Cullin 2, leads to a blockade in the G1 to S progression and is associated with delayed S-phase dependent DNA damage response. Cullin 4 inactivation leads to an aberrantly high DNA damage response that is associated with increased DNA breakages and sensitivity of cells to DNA damaging agents, suggesting a DNA repair defect is associated. siRNA interrogation of key Cullin substrates show that CDT1, p21, and Claspin are involved in elevated DNA damage in the UBE2M knockdown cells. Therefore, UBE2M is required to maintain genome integrity by activating multiple Cullin ligases throughout the cell cycle.

  6. Inactivating UBE2M impacts the DNA damage response and genome integrity involving multiple cullin ligases.

    Science.gov (United States)

    Cukras, Scott; Morffy, Nicholas; Ohn, Takbum; Kee, Younghoon

    2014-01-01

    Protein neddylation is involved in a wide variety of cellular processes. Here we show that the DNA damage response is perturbed in cells inactivated with an E2 Nedd8 conjugating enzyme UBE2M, measured by RAD51 foci formation kinetics and cell based DNA repair assays. UBE2M knockdown increases DNA breakages and cellular sensitivity to DNA damaging agents, further suggesting heightened genomic instability and defective DNA repair activity. Investigating the downstream Cullin targets of UBE2M revealed that silencing of Cullin 1, 2, and 4 ligases incurred significant DNA damage. In particular, UBE2M knockdown, or defective neddylation of Cullin 2, leads to a blockade in the G1 to S progression and is associated with delayed S-phase dependent DNA damage response. Cullin 4 inactivation leads to an aberrantly high DNA damage response that is associated with increased DNA breakages and sensitivity of cells to DNA damaging agents, suggesting a DNA repair defect is associated. siRNA interrogation of key Cullin substrates show that CDT1, p21, and Claspin are involved in elevated DNA damage in the UBE2M knockdown cells. Therefore, UBE2M is required to maintain genome integrity by activating multiple Cullin ligases throughout the cell cycle.

  7. webMGR: an online tool for the multiple genome rearrangement problem.

    Science.gov (United States)

    Lin, Chi Ho; Zhao, Hao; Lowcay, Sean Harry; Shahab, Atif; Bourque, Guillaume

    2010-02-01

    The algorithm MGR enables the reconstruction of rearrangement phylogenies based on gene or synteny block order in multiple genomes. Although MGR has been successfully applied to study the evolution of different sets of species, its utilization has been hampered by the prohibitive running time for some applications. In the current work, we have designed new heuristics that significantly speed up the tool without compromising its accuracy. Moreover, we have developed a web server (webMGR) that includes elaborate web output to facilitate navigation through the results. webMGR can be accessed via http://www.gis.a-star.edu.sg/~bourque. The source code of the improved standalone version of MGR is also freely available from the web site. Supplementary data are available at Bioinformatics online.

  8. On the representability of complete genomes by multiple competing finite-context (Markov) models.

    Directory of Open Access Journals (Sweden)

    Armando J Pinho

    Full Text Available A finite-context (Markov) model of order k yields the probability distribution of the next symbol in a sequence of symbols, given the recent past up to depth k. Markov modeling has long been applied to DNA sequences, for example to find gene-coding regions. With the first studies came the discovery that DNA sequences are non-stationary: distinct regions require distinct model orders. Since then, Markov and hidden Markov models have been extensively used to describe the gene structure of prokaryotes and eukaryotes. However, to our knowledge, a comprehensive study about the potential of Markov models to describe complete genomes is still lacking. We address this gap in this paper. Our approach relies on (i) multiple competing Markov models of different orders, (ii) careful programming techniques that allow orders as large as sixteen, (iii) adequate inverted repeat handling, and (iv) probability estimates suited to the wide range of context depths used. To measure how well a model fits the data at a particular position in the sequence we use the negative logarithm of the probability estimate at that position. The measure yields information profiles of the sequence, which are of independent interest. The average over the entire sequence, which amounts to the average number of bits per base needed to describe the sequence, is used as a global performance measure. Our main conclusion is that, from the probabilistic or information theoretic point of view and according to this performance measure, multiple competing Markov models explain entire genomes almost as well as or even better than state-of-the-art DNA compression methods, such as XM, which rely on very different statistical models. This is surprising, because Markov models are local (short-range), in contrast to the statistical models underlying other methods, which exploit the extensive data repetitions in DNA sequences and therefore have a non-local character.
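    The performance measure described above (average negative log-probability, in bits per base) can be sketched for a single adaptive order-k model. The additive estimator with parameter alpha, the function name and the toy sequence are assumptions made for illustration; the paper's estimators, inverted-repeat handling and per-region model competition are not reproduced here.

```python
import math
from collections import defaultdict


def average_bits_per_base(seq, order, alpha=1.0, alphabet="ACGT"):
    """Average of -log2 p(symbol | previous `order` symbols) under an adaptive model.

    Counts are updated only after each symbol is scored, so every probability
    depends on the past alone. Symbols outside `alphabet` are not supported.
    """
    if len(seq) <= order:
        raise ValueError("sequence must be longer than the model order")
    counts = defaultdict(lambda: dict.fromkeys(alphabet, 0))
    total_bits = 0.0
    for i in range(order, len(seq)):
        context, symbol = seq[i - order:i], seq[i]
        ctx = counts[context]
        # Additive (Laplace-style) estimate; alpha is an assumed smoothing parameter
        prob = (ctx[symbol] + alpha) / (sum(ctx.values()) + alpha * len(alphabet))
        total_bits += -math.log2(prob)
        ctx[symbol] += 1
    return total_bits / (len(seq) - order)


# Hypothetical toy sequence; a crude whole-sequence "competition" between orders
toy = "ACGT" * 50
bits_by_order = {k: average_bits_per_base(toy, k) for k in (0, 2, 4)}
best_order = min(bits_by_order, key=bits_by_order.get)
print(bits_by_order, best_order)
```

    Recording the per-position value of -log2(prob) instead of the running total would give the information profile mentioned in the abstract.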

  9. Syntenic block overlap multiplicities with a panel of reference genomes provide a signature of ancient polyploidization events.

    Science.gov (United States)

    Zheng, Chunfang; Santos Muñoz, Daniella; Albert, Victor A; Sankoff, David

    2015-01-01

    Following whole genome duplication (WGD), there is a compact distribution of gene similarities within the genome reflecting duplicate pairs of all the genes in the genome. With time, the distribution broadens and loses volume due to variable decay of duplicate gene similarity and to the process of duplicate gene loss. If there are two WGD, the older one becomes so reduced and broad that it merges with the tail of the distributions resulting from more recent events, and it becomes difficult to distinguish them. The goal of this paper is to advance statistical methods of identifying, or at least counting, the WGD events in the lineage of a given genome. For a set of 15 angiosperm genomes, we analyze all 15 × 14 = 210 ordered pairs of target genome versus reference genome, using SynMap to find syntenic blocks. We consider all sets of B ≥ 2 syntenic blocks in the target genome that overlap in the reference genome as evidence of WGD activity in the target, whether it be one event or several. We hypothesize that in fitting an exponential function to the tail of the empirical distribution f(B) of block multiplicities, the size of the exponent will reflect the amount of WGD in the history of the target genome. By amalgamating the results from all reference genomes, a range of values of SynMap parameters, and alternative cutoff points for the tail, we find a clear pattern whereby multiple-WGD core eudicots have the smallest (negative) exponents, followed by core eudicots with only the single "γ" triplication in their history, followed by a non-core eudicot with a single WGD, followed by the monocots, with a basal angiosperm, the WGD-free Amborella, having the largest exponent. The hypothesis that the exponent of the fit to the tail of the multiplicity distribution is a signature of the amount of WGD is verified, but there is also a clear complicating factor in the monocot clade, where a history of multiple WGD is not reflected in a small exponent.
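    The tail fit described above can be sketched as follows. The abstract does not state the fitting procedure, so ordinary least squares on log f(B) is an assumption, as are the function name, the cutoff default and the synthetic counts.

```python
import math


def fit_tail_exponent(multiplicity_counts, cutoff=2):
    """Fit f(B) ~ scale * exp(exponent * B) for B >= cutoff by least squares on log f(B).

    `multiplicity_counts` maps a block multiplicity B to its observed frequency.
    Returns (exponent, scale).
    """
    points = [
        (b, math.log(c))
        for b, c in multiplicity_counts.items()
        if b >= cutoff and c > 0
    ]
    if len(points) < 2:
        raise ValueError("need at least two tail points with positive counts")
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    exponent = sxy / sxx
    scale = math.exp(mean_y - exponent * mean_x)
    return exponent, scale


# Synthetic, noise-free counts generated from a known exponent (not data from the paper)
counts = {b: 1000.0 * math.exp(-0.5 * b) for b in range(2, 9)}
exponent, scale = fit_tail_exponent(counts)
print(round(exponent, 3), round(scale, 1))
```

    Varying `cutoff` corresponds to the alternative cutoff points for the tail that the abstract mentions.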

  10. Genome diversity and divergence in Drosophila mauritiana: multiple signatures of faster X evolution.

    Science.gov (United States)

    Garrigan, Daniel; Kingan, Sarah B; Geneva, Anthony J; Vedanayagam, Jeffrey P; Presgraves, Daven C

    2014-09-04

    Drosophila mauritiana is an Indian Ocean island endemic species that diverged from its two sister species, Drosophila simulans and Drosophila sechellia, approximately 240,000 years ago. Multiple forms of incomplete reproductive isolation have evolved among these species, including sexual, gametic, ecological, and intrinsic postzygotic barriers, with crosses among all three species conforming to Haldane's rule: F(1) hybrid males are sterile and F(1) hybrid females are fertile. Extensive genetic resources and the fertility of hybrid females have made D. mauritiana, in particular, an important model for speciation genetics. Analyses between D. mauritiana and both of its siblings have shown that the X chromosome makes a disproportionate contribution to hybrid male sterility. But why the X plays a special role in the evolution of hybrid sterility in these, and other, species remains an unsolved problem. To complement functional genetic analyses, we have investigated the population genomics of D. mauritiana, giving special attention to differences between the X and the autosomes. We present a de novo genome assembly of D. mauritiana annotated with RNAseq data and a whole-genome analysis of polymorphism and divergence from ten individuals. Our analyses show that, relative to the autosomes, the X chromosome has reduced nucleotide diversity but elevated nucleotide divergence; an excess of recurrent adaptive evolution at its protein-coding genes; an excess of recent, strong selective sweeps; and a large excess of satellite DNA. Interestingly, one of two centimorgan-scale selective sweeps on the D. mauritiana X chromosome spans a region containing two sex-ratio meiotic drive elements and a high concentration of satellite DNA. Furthermore, genes with roles in reproduction and chromosome biology are enriched among genes that have histories of recurrent adaptive protein evolution. Together, these genome-wide analyses suggest that genetic conflict and frequent positive natural

  11. Ion torrent personal genome machine sequencing for genomic typing of Neisseria meningitidis for rapid determination of multiple layers of typing information.

    Science.gov (United States)

    Vogel, Ulrich; Szczepanowski, Rafael; Claus, Heike; Jünemann, Sebastian; Prior, Karola; Harmsen, Dag

    2012-06-01

    Neisseria meningitidis causes invasive meningococcal disease in infants, toddlers, and adolescents worldwide. DNA sequence-based typing, including multilocus sequence typing, analysis of genetic determinants of antibiotic resistance, and sequence typing of vaccine antigens, has become the standard for molecular epidemiology of the organism. However, PCR of multiple targets and consecutive Sanger sequencing provide logistic constraints to reference laboratories. Taking advantage of the recent development of benchtop next-generation sequencers (NGSs) and of BIGSdb, a database accommodating and analyzing genome sequence data, we therefore explored the feasibility and accuracy of Ion Torrent Personal Genome Machine (PGM) sequencing for genomic typing of meningococci. Three strains from a previous meningococcus serogroup B community outbreak were selected to compare conventional typing results with data generated by semiconductor chip-based sequencing. In addition, sequencing of the meningococcal type strain MC58 provided information about the general performance of the technology. The PGM technology generated sequence information for all target genes addressed. The results were 100% concordant with conventional typing results, with no further editing being necessary. In addition, the amount of typing information, i.e., nucleotides and target genes analyzed, could be substantially increased by the combined use of genome sequencing and BIGSdb compared to conventional methods. In the near future, affordable and fast benchtop NGS machines like the PGM might enable reference laboratories to switch to genomic typing on a routine basis. This will reduce workloads and rapidly provide information for laboratory surveillance, outbreak investigation, assessment of vaccine preventability, and antibiotic resistance gene monitoring.

  12. Genomic screening for dissection of a complex disease: The multiple sclerosis phenotype

    Energy Technology Data Exchange (ETDEWEB)

    Haines, J.L.; Bazyk, A.; Gusella, J.F. [Massachusetts General Hospital, Boston, MA (United States)] [and others]

    1994-09-01

    Application of positional cloning to diseases with a complex etiology is fraught with problems. These include undefined modes of inheritance, heterogeneity, and epistasis. Although microsatellite markers now make genotyping the genome a straightforward task, no single analytical method is available to efficiently and accurately use these data for a complex disease. We have developed a multi-stage genomic screening strategy which uses a combination of non-parametric approaches (Affected Pedigree Member (APM) linkage analysis and robust sib pair analysis (SP)), and the parametric lod score approach (using four different genetic models). To warrant follow-up, a marker must have two or more of: a nominal P value of 0.05 or less on the non-parametric tests, or a lod score greater than 1.0 for any model. Two adjacent markers each fulfilling one criterion are also considered for follow-up. These criteria were determined both by simulation studies and our empirical experience in screening a large number of other disorders. We applied this approach to multiple sclerosis (MS), a complex neurological disorder with a strong but ill-defined genetic component. Analysis of the first 91 markers from our screen of 55 multiplex families found 5 markers which met the SP criteria, 13 markers which met the APM criteria, and 8 markers which met the lod score criteria. Five regions (on chromosomes 2, 4, 7, 14, and 19) met our overall criteria. However, no single method identified all of these regions, suggesting that each method is sensitive to various (unknown) influences. The chromosome 14 results were not supported by follow-up typing and analysis of markers in that region, but the chromosome 19 results remain well supported. Updated screening results will be presented.
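    The follow-up rule described above can be sketched as a small filter. This is one reading of the abstract, treating the APM P value, the sib pair P value and the best lod score as three separate criteria; the class, field names and marker values are hypothetical, and markers are assumed to be listed in map order on a single chromosome.

```python
from dataclasses import dataclass


@dataclass
class MarkerResult:
    name: str
    apm_p: float        # nominal P value, Affected Pedigree Member analysis
    sp_p: float         # nominal P value, sib pair analysis
    lod_scores: tuple   # one lod score per genetic model


def criteria_met(marker):
    """Count how many of the three assumed criteria a marker satisfies."""
    return sum([
        marker.apm_p <= 0.05,
        marker.sp_p <= 0.05,
        max(marker.lod_scores) > 1.0,
    ])


def flag_for_follow_up(markers):
    """Flag markers meeting >= 2 criteria, plus adjacent pairs each meeting >= 1."""
    flagged = {m.name for m in markers if criteria_met(m) >= 2}
    for left, right in zip(markers, markers[1:]):
        if criteria_met(left) >= 1 and criteria_met(right) >= 1:
            flagged.update((left.name, right.name))
    return [m.name for m in markers if m.name in flagged]


# Hypothetical results for five neighbouring markers (not data from the study)
markers = [
    MarkerResult("M1", 0.01, 0.03, (0.2, 0.1, 0.3, 0.0)),
    MarkerResult("M2", 0.50, 0.60, (0.1, 0.0, 0.2, 0.1)),
    MarkerResult("M3", 0.04, 0.50, (0.3, 0.2, 0.1, 0.0)),
    MarkerResult("M4", 0.30, 0.20, (1.2, 0.4, 0.6, 0.2)),
    MarkerResult("M5", 0.70, 0.80, (0.0, 0.1, 0.0, 0.2)),
]
print(flag_for_follow_up(markers))
```

    Here M1 qualifies on its own with two criteria, while M3 and M4 qualify together as adjacent markers that each meet one.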

  13. Effects of DNA mass on multiple displacement whole genome amplification and genotyping performance

    Directory of Open Access Journals (Sweden)

    Haque Kashif A

    2005-09-01

    Full Text Available Abstract Background: Whole genome amplification (WGA) promises to eliminate practical molecular genetic analysis limitations associated with genomic DNA (gDNA) quantity. We evaluated the performance of multiple displacement amplification (MDA) WGA using gDNA extracted from lymphoblastoid cell lines (N = 27) with a range of starting gDNA input of 1–200 ng into the WGA reaction. Yield and composition analysis of whole genome amplified DNA (wgaDNA) was performed using three DNA quantification methods (OD, PicoGreen® and RT-PCR). Two panels of N = 15 STR (using the AmpFlSTR® Identifiler® panel) and N = 49 SNP (TaqMan®) genotyping assays were performed on each gDNA and wgaDNA sample in duplicate. gDNA and wgaDNA masses of 1, 4 and 20 ng were used in the SNP assays to evaluate the effects of DNA mass on SNP genotyping assay performance. A total of N = 6,880 STR and N = 56,448 SNP genotype attempts provided adequate power to detect differences in STR and SNP genotyping performance between gDNA and wgaDNA, and among wgaDNA produced from a range of gDNA template inputs. Results: The proportion of double-stranded wgaDNA and human-specific PCR amplifiable wgaDNA increased with increased gDNA input into the WGA reaction. Increased amounts of gDNA input into the WGA reaction improved wgaDNA genotyping performance. Genotype completion or genotype concordance rates of wgaDNA produced from all gDNA input levels were observed to be reduced compared to gDNA, although the reduction was not always statistically significant. Reduced wgaDNA genotyping performance was primarily due to the increased variance of allelic amplification, resulting in loss of heterozygosity or increased undetermined genotypes. MDA WGA produces wgaDNA from no template control samples; such samples exhibited substantial false-positive genotyping rates. Conclusion: The amount of gDNA input into the MDA WGA reaction is a critical determinant of genotyping performance of wgaDNA. At least 10 ng of

  14. Genome-Wide Association Identifies Multiple Genomic Regions Associated with Susceptibility to and Control of Ovine Lentivirus

    Science.gov (United States)

    2012-10-17

    to varying degrees of dyspnea (respiratory distress), cachexia (body condition wasting), mastitis, arthritis, and/or encephalitis [5,6]. One of the...General Transcription Factor IIH, polypeptide 5), the gene order does not agree with other mammal genomes including cow, human, dog, and mouse, and it may

  15. Evaluation of multiple approaches to identify genome-wide polymorphisms in closely related genotypes of sweet cherry (Prunus avium L.)

    Directory of Open Access Journals (Sweden)

    Seanna Hewitt

    Full Text Available Identification of genetic polymorphisms and subsequent development of molecular markers is important for marker assisted breeding of superior cultivars of economically important species. Sweet cherry (Prunus avium L.) is an economically important non-climacteric tree fruit crop in the Rosaceae family and has undergone a genetic bottleneck due to breeding, resulting in limited genetic diversity in the germplasm that is utilized for breeding new cultivars. Therefore, it is critical to recognize the best platforms for identifying genome-wide polymorphisms that can help identify, and consequently preserve, the diversity in a genetically constrained species. For the identification of polymorphisms in five closely related genotypes of sweet cherry, a gel-based approach (TRAP), reduced representation sequencing (TRAPseq), a 6k cherry SNParray, and whole genome sequencing (WGS) approaches were evaluated. All platforms facilitated detection of polymorphisms among the genotypes with variable efficiency. In assessing multiple SNP detection platforms, this study has demonstrated that a combination of appropriate approaches is necessary for efficient polymorphism identification, especially between closely related cultivars of a species. The information generated in this study provides a valuable resource for future genetic and genomic studies in sweet cherry, and the insights gained from the evaluation of multiple approaches can be utilized for other closely related species with limited genetic diversity in the breeding germplasm. Keywords: Polymorphisms, Prunus avium, Next-generation sequencing, Target region amplification polymorphism (TRAP), Genetic diversity, SNParray, Reduced representation sequencing, Whole genome sequencing (WGS)

  16. Genomes

    National Research Council Canada - National Science Library

    Brown, T. A. (Terence A.)

    2002-01-01

    ... of genome expression and replication processes, and transcriptomics and proteomics. This text is richly illustrated with clear, easy-to-follow, full color diagrams, which are downloadable from the book's website...

  17. Identification of multiple sites suitable for insertion of foreign genes in herpes simplex virus genomes.

    Science.gov (United States)

    Morimoto, Tomomi; Arii, Jun; Akashi, Hiroomi; Kawaguchi, Yasushi

    2009-03-01

    Information on sites in HSV genomes at which foreign gene(s) can be inserted without disrupting viral genes or affecting properties of the parental virus is important for basic research on HSV and development of HSV-based vectors for human therapy. The intergenic region between HSV-1 UL3 and UL4 genes has been reported to satisfy the requirements for such an insertion site. The UL3 and UL4 genes are oriented toward the intergenic region and, therefore, insertion of a foreign gene(s) into the region between the UL3 and UL4 polyadenylation signals should not disrupt any viral genes or transcriptional units. HSV-1 and HSV-2 each have more than 10 additional regions structurally similar to the intergenic region between UL3 and UL4. In the studies reported here, it has been demonstrated that insertion of a reporter gene expression cassette into several of the HSV-1 and HSV-2 intergenic regions has no effect on viral growth in cell culture or virulence in mice, suggesting that these multiple intergenic regions may be suitable HSV sites for insertion of foreign genes.

  18. Whole-genome sequencing of multiple myeloma from diagnosis to plasma cell leukemia reveals genomic initiating events, evolution, and clonal tides.

    Science.gov (United States)

    Egan, Jan B; Shi, Chang-Xin; Tembe, Waibhav; Christoforides, Alexis; Kurdoglu, Ahmet; Sinari, Shripad; Middha, Sumit; Asmann, Yan; Schmidt, Jessica; Braggio, Esteban; Keats, Jonathan J; Fonseca, Rafael; Bergsagel, P Leif; Craig, David W; Carpten, John D; Stewart, A Keith

    2012-08-02

    The longitudinal evolution of a myeloma genome from diagnosis to plasma cell leukemia has not previously been reported. We used whole-genome sequencing (WGS) on 4 purified tumor samples and patient germline DNA drawn over a 5-year period in a t(4;14) multiple myeloma patient. Tumor samples were acquired at diagnosis, first relapse, second relapse, and end-stage secondary plasma cell leukemia (sPCL). In addition to the t(4;14), all tumor time points also shared 10 common single-nucleotide variants (SNVs) on WGS comprising shared initiating events. Interestingly, we observed genomic sequence variants that waxed and waned with time in progressive tumors, suggesting the presence of multiple independent, yet related, clones at diagnosis that rose and fell in dominance. Five newly acquired SNVs, including truncating mutations of RB1 and ZKSCAN3, were observed only in the final sPCL sample suggesting leukemic transformation events. This longitudinal WGS characterization of the natural history of a high-risk myeloma patient demonstrated tumor heterogeneity at diagnosis with shifting dominance of tumor clones over time and has also identified potential mutations contributing to myelomagenesis as well as transformation from myeloma to overt extramedullary disease such as sPCL.

  19. Genome sequences of rare, uncultured bacteria obtained by differential coverage binning of multiple metagenomes

    DEFF Research Database (Denmark)

    Albertsen, Mads; Hugenholtz, Philip; Skarshewski, Adam

    2013-01-01

    Reference genomes are required to understand the diverse roles of microorganisms in ecology, evolution, human and animal health, but most species remain uncultured. Here we present a sequence composition–independent approach to recover high-quality microbial genomes from deeply sequenced metageno...

  20. Population genetic analysis of shotgun assemblies of genomic sequences from multiple individuals

    DEFF Research Database (Denmark)

    Hellmann, Ines; Mang, Yuan; Gu, Zhiping

    2008-01-01

    We introduce a simple, broadly applicable method for obtaining estimates of nucleotide diversity from genomic shotgun sequencing data. The method takes into account the special nature of these data: random sampling of genomic segments from one or more individuals and a relatively high error rate...... for individual reads. Applying this method to data from the Celera human genome sequencing and SNP discovery project, we obtain estimates of nucleotide diversity in windows spanning the human genome and show that the diversity to divergence ratio is reduced in regions of low recombination. Furthermore, we show...
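
    For orientation, the quantity being estimated can be illustrated with the classical average pairwise nucleotide diversity (π) on a small set of aligned, error-free sequences. This is only a minimal sketch under that simplifying assumption: it does not model the random sampling of shotgun segments or the per-read error rate that the method in the abstract accounts for, and the function name `nucleotide_diversity` is hypothetical.

```python
from itertools import combinations


def nucleotide_diversity(aligned):
    """Average pairwise differences per site for equal-length sequences.

    Assumes an error-free alignment without gaps; this is the textbook
    definition of pi, not the shotgun-aware estimator from the abstract.
    """
    if len(aligned) < 2:
        raise ValueError("need at least two sequences")
    length = len(aligned[0])
    if length == 0 or any(len(s) != length for s in aligned):
        raise ValueError("sequences must be non-empty and of equal length")
    pairs = list(combinations(aligned, 2))
    # Count mismatching sites over every unordered pair of sequences.
    diffs = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return diffs / (len(pairs) * length)
```

    For three sequences of length 4 in which one sequence differs from the other two at a single site, two of the three pairs differ at one site, giving π = 2 / (3 × 4).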

  1. The CanOE strategy: integrating genomic and metabolic contexts across multiple prokaryote genomes to find candidate genes for orphan enzymes.

    Directory of Open Access Journals (Sweden)

    Adam Alexander Thil Smith

    2012-05-01

    Full Text Available Of all biochemically characterized metabolic reactions formalized by the IUBMB, over one out of four have yet to be associated with a nucleic or protein sequence, i.e. are sequence-orphan enzymatic activities. Few bioinformatics annotation tools are able to propose candidate genes for such activities by exploiting context-dependent rather than sequence-dependent data, and none are readily accessible and propose result integration across multiple genomes. Here, we present CanOE (Candidate genes for Orphan Enzymes), a four-step bioinformatics strategy that proposes ranked candidate genes for sequence-orphan enzymatic activities (or orphan enzymes for short). The first step locates "genomic metabolons", i.e. groups of co-localized genes coding proteins catalyzing reactions linked by shared metabolites, in one genome at a time. These metabolons can be particularly helpful for aiding bioanalysts to visualize relevant metabolic data. In the second step, they are used to generate candidate associations between un-annotated genes and gene-less reactions. The third step integrates these gene-reaction associations over several genomes using gene families, and summarizes the strength of family-reaction associations by several scores. In the final step, these scores are used to rank members of gene families which are proposed for metabolic reactions. These associations are of particular interest when the metabolic reaction is a sequence-orphan enzymatic activity. Our strategy found over 60,000 genomic metabolons in more than 1,000 prokaryote organisms from the MicroScope platform, generating candidate genes for many metabolic reactions, including more than 70 distinct orphan reactions. A computational validation of the approach is discussed. Finally, we present a case study on the anaerobic allantoin degradation pathway in Escherichia coli K-12.

  2. Multiple Evolutionary Selections Involved in Synonymous Codon Usages in the Streptococcus agalactiae Genome.

    Science.gov (United States)

    Ma, Yan-Ping; Ke, Hao; Liang, Zhi-Ling; Liu, Zhen-Xing; Hao, Le; Ma, Jiang-Yao; Li, Yu-Gu

    2016-02-24

    Streptococcus agalactiae is an important human and animal pathogen. To better understand the genetic features and evolution of S. agalactiae, multiple factors influencing synonymous codon usage patterns in S. agalactiae were analyzed in this study. Overall codon usage analysis showed that A- and U-ending codons were preferred in S. agalactiae functional genes, indicating that Adenine (A)/Thymine (T) compositional constraints might play an important role in the synonymous codon usage pattern. The relationship between GC3% and the effective number of codons (ENC) suggested that translational selection was the important factor for codon bias in the microorganism. Principal component analysis (PCA) showed that (i) mutational pressure was the most important factor in shaping codon usage of all open reading frames (ORFs) in the S. agalactiae genome; (ii) strand specific mutational bias was not capable of influencing the codon usage bias in the leading and lagging strands; and (iii) gene length was not the important factor in synonymous codon usage pattern in this organism. Additionally, the high correlation of the tRNA adaptation index (tAI) with the codon adaptation index (CAI) and the frequency of optimal codons (Fop) reinforced the role of natural selection for efficient translation in S. agalactiae. Comparison of synonymous codon usage pattern between S. agalactiae and susceptible hosts (human and tilapia) showed that synonymous codon usage of S. agalactiae was independent of the synonymous codon usage of susceptible hosts. The study of codon usage in S. agalactiae may provide evidence about the molecular evolution of the bacterium and a greater understanding of evolutionary relationships between S. agalactiae and its hosts.
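
    As a small illustration of one of the statistics named in this abstract, GC3% is the percentage of G or C bases at third codon positions. The sketch below is a minimal version under stated assumptions: it counts every complete codon in frame (some analyses exclude Met, Trp and stop codons), and the function name `gc3_percent` is hypothetical rather than taken from the study.

```python
def gc3_percent(cds):
    """Percentage of G or C at third codon positions of a coding sequence.

    Assumes the sequence starts in frame; a trailing partial codon is
    ignored. RNA input is accepted by mapping U to T.
    """
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3  # drop any trailing partial codon
    third = cds[2:usable:3]           # bases at codon positions 3, 6, 9, ...
    if not third:
        raise ValueError("sequence is shorter than one codon")
    return 100.0 * sum(base in "GC" for base in third) / len(third)
```

    For the four codons ATG GCC AAA TAA, the third-position bases are G, C, A and A, so GC3% is 50.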

  3. Identification of an Arabidopsis thaliana protein that binds to tomato mosaic virus genomic RNA and inhibits its multiplication

    International Nuclear Information System (INIS)

    Fujisaki, Koki; Ishikawa, Masayuki

    2008-01-01

    The genomic RNAs of positive-strand RNA viruses carry RNA elements that play positive, or in some cases, negative roles in virus multiplication by interacting with viral and cellular proteins. In this study, we purified Arabidopsis thaliana proteins that specifically bind to 5' or 3' terminal regions of tomato mosaic virus (ToMV) genomic RNA, which contain important regulatory elements for translation and RNA replication, and identified these proteins by mass spectrometry analyses. One of these host proteins, named BTR1, harbored three heterogeneous nuclear ribonucleoprotein K-homology RNA-binding domains and preferentially bound to RNA fragments that contained a sequence around the initiation codon of the 130K and 180K replication protein genes. The knockout and overexpression of BTR1 specifically enhanced and inhibited, respectively, ToMV multiplication in inoculated A. thaliana leaves, while such effect was hardly detectable in protoplasts. These results suggest that BTR1 negatively regulates the local spread of ToMV.

  4. EasyCloneMulti: A Set of Vectors for Simultaneous and Multiple Genomic Integrations in Saccharomyces cerevisiae

    DEFF Research Database (Denmark)

    Maury, Jerome; Germann, Susanne Manuela; Jacobsen, Simo Abdessamad

    2016-01-01

    Saccharomyces cerevisiae is widely used in the biotechnology industry for production of ethanol, recombinant proteins, food ingredients and other chemicals. In order to generate highly producing and stable strains, genome integration of genes encoding metabolic pathway enzymes is the preferred...... of integrative vectors, EasyCloneMulti, that enables multiple and simultaneous integration of genes in S. cerevisiae. By creating vector backbones that combine consensus sequences that aim at targeting subsets of Ty sequences and a quickly degrading selective marker, integrations at multiple genomic loci...... and a range of expression levels were obtained, as assessed with the green fluorescent protein (GFP) reporter system. The EasyCloneMulti vector set was applied to balance the expression of the rate-controlling step in the β-alanine pathway for biosynthesis of 3-hydroxypropionic acid (3HP). The best 3HP...

  5. GWIS: Genome-Wide Inferred Statistics for Functions of Multiple Phenotypes

    NARCIS (Netherlands)

    Nieuwboer, H.A.; Pool, R.; Dolan, C.V.; Boomsma, D.I.; Nivard, M.G.

    2016-01-01

    Here we present a method of genome-wide inferred study (GWIS) that provides an approximation of genome-wide association study (GWAS) summary statistics for a variable that is a function of phenotypes for which GWAS summary statistics, phenotypic means, and covariances are available. A GWIS can be

  6. Geographic isolates of Lymantria dispar multiple nucleopolyhedrovirus: Genome sequence analysis and pathogenicity against European and Asian gypsy moth strains.

    Science.gov (United States)

    Harrison, Robert L; Rowley, Daniel L; Keena, Melody A

    2016-06-01

    Isolates of the baculovirus species Lymantria dispar multiple nucleopolyhedrovirus have been formulated and applied to suppress outbreaks of the gypsy moth, L. dispar. To evaluate the genetic diversity in this species at the genomic level, the genomes of three isolates from Massachusetts, USA (LdMNPV-Ab-a624), Spain (LdMNPV-3054), and Japan (LdMNPV-3041) were sequenced and compared with four previously determined LdMNPV genome sequences. The LdMNPV genome sequences were collinear and contained the same homologous repeats (hrs) and clusters of baculovirus repeat orf (bro) gene family members in the same relative positions in their genomes, although sequence identities in these regions were low. Of 146 non-bro ORFs annotated in the genome of the representative isolate LdMNPV 5-6, 135 ORFs were found in every other LdMNPV genome, including the 37 core genes of Baculoviridae and other genes conserved in genus Alphabaculovirus. Phylogenetic inference with an alignment of the core gene nucleotide sequences grouped isolates 3041 (Japan) and 2161 (Korea) separately from a cluster containing isolates from Europe, North America, and Russia. To examine phenotypic diversity, bioassays were carried out with a selection of isolates against neonate larvae from three European gypsy moth (Lymantria dispar dispar) and three Asian gypsy moth (Lymantria dispar asiatica and Lymantria dispar japonica) colonies. LdMNPV isolates 2161 (Korea), 3029 (Russia), and 3041 (Japan) exhibited a greater degree of pathogenicity against all L. dispar strains than LdMNPV from a sample of Gypchek. This study provides additional information on the genetic diversity of LdMNPV isolates and their activity against the Asian gypsy moth, a potential invasive pest of North American trees and forests. Published by Elsevier Inc.

  7. saSNP Approach for Scalable SNP Analyses of Multiple Bacterial or Viral Genomes

    Energy Technology Data Exchange (ETDEWEB)

    Gardner, Shea [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)]; Slezak, Tom [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)]

    2010-07-27

    With the flood of whole genome finished and draft microbial sequences, we need faster, more scalable bioinformatics tools for sequence comparison. An algorithm is described to find single nucleotide polymorphisms (SNPs) in whole genome data. It scales to hundreds of bacterial or viral genomes, and can be used for finished and/or draft genomes available as unassembled contigs. The method is fast to compute, finding SNPs and building a SNP phylogeny in seconds to hours. We use it to identify thousands of putative SNPs from all publicly available Filoviridae, Poxviridae, foot-and-mouth disease virus, Bacillus, and Escherichia coli genomes and plasmids. The SNP-based trees that result are consistent with known taxonomy and trees determined in other studies. The approach we describe can handle as input hundreds of gigabases of sequence in a single run. The algorithm is based on k-mer analysis using a suffix array, so we call it saSNP.
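
    The general idea of k-mer based SNP discovery can be sketched in a few lines: look for k-mers whose flanking bases are identical across genomes while the central base differs. The example below is a toy under stated assumptions, not the saSNP implementation: it uses a plain Python dictionary instead of the suffix array named in the abstract, ignores reverse complements, and the function name `find_kmer_snps` and parameter `k` are hypothetical.

```python
from collections import defaultdict


def find_kmer_snps(genomes, k=7):
    """Map (left_flank, right_flank) to {genome_name: central_base} for
    flank pairs whose central base differs between genomes.

    genomes: dict of genome name -> sequence string.
    k: odd k-mer length; the middle base is the candidate SNP.
    """
    if k % 2 == 0:
        raise ValueError("k must be odd so that each k-mer has a central base")
    half = k // 2
    # flanks -> genome name -> set of central bases seen with those flanks
    table = defaultdict(lambda: defaultdict(set))
    for name, seq in genomes.items():
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            table[(kmer[:half], kmer[half + 1:])][name].add(kmer[half])

    snps = {}
    for flanks, per_genome in table.items():
        # Keep only flanks present in every genome with a single central
        # base each, which skips repeats and ambiguous loci.
        if len(per_genome) != len(genomes):
            continue
        if any(len(bases) != 1 for bases in per_genome.values()):
            continue
        alleles = {name: next(iter(bases)) for name, bases in per_genome.items()}
        if len(set(alleles.values())) > 1:
            snps[flanks] = alleles
    return snps
```

    A dictionary keeps the sketch short but holds every k-mer in memory; an index such as a suffix array is the kind of structure one would reach for at the scale the abstract describes.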

  8. Analysis of the genome-wide variations among multiple strains of the plant pathogenic bacterium Xylella fastidiosa

    Directory of Open Access Journals (Sweden)

    Walker M Andrew

    2006-09-01

    Full Text Available Abstract Background: The Gram-negative, xylem-limited phytopathogenic bacterium Xylella fastidiosa is responsible for causing economically important diseases in grapevine, citrus and many other plant species. Despite its economic impact, relatively little is known about the genomic variations among strains isolated from different hosts and their influence on the population genetics of this pathogen. With the availability of genome sequence information for four strains, it is now possible to perform genome-wide analyses to identify and categorize such DNA variations and to understand their influence on strain functional divergence. Results: There are 1,579 genes and 194 non-coding homologous sequences present in the genomes of all four strains, representing a 76.2% conservation of the sequenced genome. About 60% of the X. fastidiosa unique sequences exist as tandem gene clusters of 6 or more genes. Multiple alignments identified 12,754 SNPs and 14,449 INDELs in the 1528 common genes and 20,779 SNPs and 10,075 INDELs in the 194 non-coding sequences. The average SNP frequency was 1.08 × 10^-2 per base pair of DNA and the average INDEL frequency was 2.06 × 10^-2 per base pair of DNA. On average, 60.33% of the SNPs were synonymous while 39.67% were non-synonymous. The mutation frequency, primarily in the form of external INDELs, was the main type of sequence variation. The relative similarity between the strains was discussed according to the INDEL and SNP differences. The numbers of genes unique to each strain were 60 (9a5c), 54 (Dixon), 83 (Ann1) and 9 (Temecula-1). A sub-set of the strain specific genes showed significant differences in terms of their codon usage and GC composition from the native genes, suggesting their xenologous origin. Tandem repeat analysis of the genomic sequences of the four strains identified associations of repeat sequences with hypothetical and phage related functions. Conclusion: INDELs and strain specific genes

  9. mEBT: multiple-matching Evidence-based Translator of Murine Genomic Responses for Human Immunity Studies.

    Science.gov (United States)

    Tae, Donghyun; Seok, Junhee

    2018-05-29

    In this paper, we introduce the multiple-matching Evidence-based Translator (mEBT) to discover, from murine expression data, genomic responses that are significant in a given condition in mice and are likely to show similar responses in the corresponding human condition. mEBT is evaluated over multiple data sets and shows improved inter-species agreement. mEBT is expected to be useful for research groups who use murine models to study human immunity. Availability: http://cdal.korea.ac.kr/mebt/. Contact: jseok14@korea.ac.kr. Supplementary data are available at Bioinformatics online.

  10. Analysis of Multiple Genomic Sequence Alignments: A Web Resource, Online Tools, and Lessons Learned From Analysis of Mammalian SCL Loci

    Science.gov (United States)

    Chapman, Michael A.; Donaldson, Ian J.; Gilbert, James; Grafham, Darren; Rogers, Jane; Green, Anthony R.; Göttgens, Berthold

    2004-01-01

    Comparative analysis of genomic sequences is becoming a standard technique for studying gene regulation. However, only a limited number of tools are currently available for the analysis of multiple genomic sequences. An extensive data set for the testing and training of such tools is provided by the SCL gene locus. Here we have expanded the data set to eight vertebrate species by sequencing the dog SCL locus and by annotating the dog and rat SCL loci. To provide a resource for the bioinformatics community, all SCL sequences and functional annotations, comprising a collation of the extensive experimental evidence pertaining to SCL regulation, have been made available via a Web server. A Web interface to new tools specifically designed for the display and analysis of multiple sequence alignments was also implemented. The unique SCL data set and new sequence comparison tools allowed us to perform a rigorous examination of the true benefits of multiple sequence comparisons. We demonstrate that multiple sequence alignments are, overall, superior to pairwise alignments for identification of mammalian regulatory regions. In the search for individual transcription factor binding sites, multiple alignments markedly increase the signal-to-noise ratio compared to pairwise alignments. PMID:14718377

  11. Multiple Whole Genome Alignments and Novel Biomedical Applications at the VISTA Portal

    Energy Technology Data Exchange (ETDEWEB)

    Brudno, Michael; Poliakov, Alexander; Minovitsky, Simon; Ratnere, Igor; Dubchak, Inna

    2007-02-01

    The VISTA portal for comparative genomics is designed to give biomedical scientists a unified set of tools to lead them from the raw DNA sequences through the alignment and annotation to the visualization of the results. The VISTA portal also hosts alignments of a number of genomes computed by our group, allowing users to study regions of their interest without having to manually download the individual sequences. Here we describe various algorithmic and functional improvements implemented in the VISTA portal over the last two years. The VISTA Portal is accessible at http://genome.lbl.gov/vista.

  12. Genome-wide association identifies multiple genomic regions associated with susceptibility to and control of ovine lentivirus.

    Directory of Open Access Journals (Sweden)

    Stephen N White

    Full Text Available BACKGROUND: Like human immunodeficiency virus (HIV), ovine lentivirus (OvLV) is macrophage-tropic and causes lifelong infection. OvLV infects one quarter of U.S. sheep and induces pneumonia and body condition wasting. There is no vaccine to prevent OvLV infection and no cost-effective treatment for infected animals. However, breed differences in prevalence and proviral concentration have indicated a genetic basis for susceptibility to OvLV. A recent study identified TMEM154 variants in OvLV susceptibility. The objective here was to identify additional loci associated with odds and/or control of OvLV infection. METHODOLOGY/PRINCIPAL FINDINGS: This genome-wide association study (GWAS) included 964 sheep from Rambouillet, Polypay, and Columbia breeds with serological status and proviral concentration phenotypes. Analytic models accounted for breed and age, as well as genotype. This approach identified TMEM154 (nominal P=9.2×10^-7; empirical P=0.13), provided 12 additional genomic regions associated with odds of infection, and provided 13 regions associated with control of infection (all nominal P<1×10^-5). Rapid decline of linkage disequilibrium with distance suggested many regions included few genes each. Genes in regions associated with odds of infection included DPPA2/DPPA4 (empirical P=0.006), and SYTL3 (P=0.051). Genes in regions associated with control of infection included a zinc finger cluster (ZNF192, ZSCAN16, ZNF389, and ZNF165; P=0.001), C19orf42/TMEM38A (P=0.047), and DLGAP1 (P=0.092). CONCLUSIONS/SIGNIFICANCE: These associations provide targets for mutation discovery in sheep susceptibility to OvLV. Aside from TMEM154, these genes have not been associated previously with lentiviral infection in any species, to our knowledge. Further, data from other species suggest functional hypotheses for future testing of these genes in OvLV and other lentiviral infections. Specifically, SYTL3 binds and may regulate RAB27A, which is required for enveloped

  13. Natural selection affects multiple aspects of genetic variation at putatively neutral sites across the human genome

    DEFF Research Database (Denmark)

    Lohmueller, Kirk E; Albrechtsen, Anders; Li, Yingrui

    2011-01-01

    A major question in evolutionary biology is how natural selection has shaped patterns of genetic variation across the human genome. Previous work has documented a reduction in genetic diversity in regions of the genome with low recombination rates. However, it is unclear whether other summaries...... these questions by analyzing three different genome-wide resequencing datasets from European individuals. We document several significant correlations between different genomic features. In particular, we find that average minor allele frequency and diversity are reduced in regions of low recombination...... and that human diversity, human-chimp divergence, and average minor allele frequency are reduced near genes. Population genetic simulations show that either positive natural selection acting on favorable mutations or negative natural selection acting against deleterious mutations can explain these correlations...

  14. Prostate cancer risk locus at 8q24 as a regulatory hub by physical interactions with multiple genomic loci across the genome.

    Science.gov (United States)

    Du, Meijun; Yuan, Tiezheng; Schilter, Kala F; Dittmar, Rachel L; Mackinnon, Alexander; Huang, Xiaoyi; Tschannen, Michael; Worthey, Elizabeth; Jacob, Howard; Xia, Shu; Gao, Jianzhong; Tillmans, Lori; Lu, Yan; Liu, Pengyuan; Thibodeau, Stephen N; Wang, Liang

    2015-01-01

    The chromosome 8q24 locus contains regulatory variants that modulate genetic risk to various cancers including prostate cancer (PC). However, the biological mechanism underlying this regulation is not well understood. Here, we developed a chromosome conformation capture (3C)-based multi-target sequencing technology and systematically examined three PC risk regions at the 8q24 locus and their potential regulatory targets across the human genome in six cell lines. We observed frequent physical contacts of this risk locus with multiple genomic regions, in particular, inter-chromosomal interaction with CD96 at 3q13 and intra-chromosomal interaction with MYC at 8q24. We identified at least five interaction hot spots within the predicted functional regulatory elements at the 8q24 risk locus. We also found intra-chromosomal interaction genes PVT1, FAM84B and GSDMC and inter-chromosomal interaction gene CXorf36 in most of the six cell lines. Other gene regions appeared to be cell line-specific, such as RRP12 in LNCaP, USP14 in DU-145 and SMIN3 in lymphoblastoid cell line. We further found that the 8q24 functional domains more likely interacted with genomic regions containing genes enriched in critical pathways such as Wnt signaling and promoter motifs such as E2F1 and TCF3. This result suggests that the risk locus may function as a regulatory hub by physical interactions with multiple genes important for prostate carcinogenesis. Further understanding of the genetic effects and biological mechanisms of these chromatin interactions will shed light on the newly discovered regulatory role of the risk locus in PC etiology and progression. © The Author 2014. Published by Oxford University Press.

  15. Species-independent identification of known and novel recurrent genomic entities in multiple cancer patients

    DEFF Research Database (Denmark)

    Friis-Nielsen, Jens; Gonzalez-Izarzugaza, Jose Maria; Brunak, Søren

    2016-01-01

    Here we present a new method for the identification of recurrent genomic entities that play a causative role in the onset of disease. Our approach is particularly amenable to the analysis of high-throughput sequencing data.

  16. An Integrative Bioinformatics Framework for Genome-scale Multiple Level Network Reconstruction of Rice

    Directory of Open Access Journals (Sweden)

    Liu Lili

    2013-06-01

    Full Text Available Understanding how metabolic reactions translate the genome of an organism into its phenotype is a grand challenge in biology. Genome-wide association studies (GWAS) statistically connect genotypes to phenotypes, without any recourse to known molecular interactions, whereas a molecular mechanistic description ties gene function to phenotype through gene regulatory networks (GRNs), protein-protein interactions (PPIs) and molecular pathways. Integration of different regulatory information levels of an organism is expected to provide a good way for mapping genotypes to phenotypes. However, the lack of a curated metabolic model of rice is blocking the exploration of genome-scale multi-level network reconstruction. Here, we have merged GRNs, PPIs and genome-scale metabolic networks (GSMNs) approaches into a single framework for rice via omics’ regulatory information reconstruction and integration. Firstly, we reconstructed a genome-scale metabolic model, containing 4,462 function genes, 2,986 metabolites involved in 3,316 reactions, and compartmentalized into ten subcellular locations. Furthermore, 90,358 pairs of protein-protein interactions, 662,936 pairs of gene regulations and 1,763 microRNA-target interactions were integrated into the metabolic model. Eventually, a database was developed for systematically storing and retrieving the genome-scale multi-level network of rice. This provides a reference for understanding genotype-phenotype relationship of rice, and for analysis of its molecular regulatory network.

  17. Comparative genomic analysis of multiple strains of two unusual plant pathogens: Pseudomonas corrugata and Pseudomonas mediterranea

    Directory of Open Access Journals (Sweden)

    Emmanouil A Trantas

    2015-08-01

    Full Text Available The non-fluorescent pseudomonads, Pseudomonas corrugata (Pcor) and P. mediterranea (Pmed), are closely related species that cause pith necrosis, a disease of tomato that causes severe crop losses. However, they also show strong antagonistic effects against economically important pathogens, demonstrating their potential for utilization as biological control agents. In addition, their metabolic versatility makes them attractive for the production of commercial biomolecules and bioremediation. An extensive comparative genomics study is required to dissect the mechanisms that Pcor and Pmed employ to cause disease, prevent disease caused by other pathogens, and to mine their genomes for commercially significant chemical pathways. Here, we present the draft genomes of nine Pcor and Pmed strains from different geographical locations. This analysis covered significant genetic heterogeneity and allowed in-depth genomic comparison. All examined strains were able to trigger symptoms in tomato plants but not all induced a hypersensitive-like response in Nicotiana benthamiana. Genome-mining revealed the absence of a type III secretion system and of known type III effectors from all examined Pcor and Pmed strains. The lack of a type III secretion system appears to be unique among the plant pathogenic pseudomonads. Several gene clusters coding for type VI secretion system were detected in all genomes.

  18. Novel and rare functional genomic variants in multiple autoimmune syndrome and Sjögren's syndrome.

    Science.gov (United States)

    Johar, Angad S; Mastronardi, Claudio; Rojas-Villarraga, Adriana; Patel, Hardip R; Chuah, Aaron; Peng, Kaiman; Higgins, Angela; Milburn, Peter; Palmer, Stephanie; Silva-Lara, Maria Fernanda; Velez, Jorge I; Andrews, Dan; Field, Matthew; Huttley, Gavin; Goodnow, Chris; Anaya, Juan-Manuel; Arcos-Burgos, Mauricio

    2015-06-02

    Multiple autoimmune syndrome (MAS), an extreme phenotype of autoimmune disorders, is a very well suited trait to tackle genomic variants of these conditions. Whole exome sequencing (WES) is a widely used strategy for detection of protein coding and splicing variants associated with inherited diseases. The DNA of eight patients affected by MAS [all of whom presented with Sjögren's syndrome (SS)], four patients affected by SS alone and 38 unaffected individuals was subjected to WES. Filters to identify novel and rare functional (pathogenic-deleterious) homozygous and/or compound heterozygous variants in these patients and controls were applied. Bioinformatics tools such as the Human Gene Connectome, as well as pathway and network analysis, were applied to test overrepresentation of genes harbouring these variants in critical pathways and networks involved in autoimmunity. Eleven novel and rare functional variants were identified in cases but not in controls, harboured in: MACF1, KIAA0754, DUSP12, ICA1, CELA1, LRP1/STAT6, GRIN3B, ANKLE1, TMEM161A, and FKRP. These were subsequently subjected to network analysis and their functional relatedness to genes already associated with autoimmunity was evaluated. Notably, the LRP1/STAT6 novel mutation was homozygous in one MAS-affected patient and heterozygous in another. LRP1/STAT6 disclosed the strongest plausibility for autoimmunity. LRP1/STAT6 are involved in extracellular and intracellular anti-inflammatory pathways that play key roles in maintaining the homeostasis of the immune system. Furthermore, network, pathway, and interaction analyses showed that LRP1 is functionally related to the HLA-B and IL10 genes and it has a substantial impact within immunological pathways and/or reaction to bacterial and other foreign proteins (phagocytosis, regulation of phospholipase A2 activity, negative regulation of apoptosis and response to lipopolysaccharides). Further, ICA1 and STAT6 were also closely related to AIRE and IRF5, two very

  19. Genome-Wide Identification and Expression Analysis of WRKY Transcription Factors under Multiple Stresses in Brassica napus.

    Science.gov (United States)

    He, Yajun; Mao, Shaoshuai; Gao, Yulong; Zhu, Liying; Wu, Daoming; Cui, Yixin; Li, Jiana; Qian, Wei

    2016-01-01

    WRKY transcription factors play important roles in responses to environmental stress stimuli. Using a genome-wide domain analysis, we identified 287 WRKY genes with 343 WRKY domains in the sequenced genome of Brassica napus, 139 in the A sub-genome and 148 in the C sub-genome. These genes were classified into eight groups based on phylogenetic analysis. In the 343 WRKY domains, a total of 26 members showed divergence in the WRKY domain, and 21 belonged to group I. This finding suggested that WRKY genes in group I are more active and variable compared with genes in other groups. Using genome-wide identification and analysis of the WRKY gene family in Brassica napus, we observed genome duplication, chromosomal/segmental duplications and tandem duplication. All of these duplications contributed to the expansion of the WRKY gene family. The duplicate segments that were detected indicated that genome duplication events occurred in the two diploid progenitors B. rapa and B. oleracea before they combined to form B. napus. Analysis of the public microarray database and EST database for B. napus indicated that 74 WRKY genes were induced or preferentially expressed under stress conditions. According to the public QTL data, we identified 77 WRKY genes in 31 QTL regions related to tolerance of various stresses. We further evaluated the expression of 26 BnaWRKY genes under multiple stresses by qRT-PCR. Most of the genes were induced by low temperature, salinity and drought stress, indicating that the WRKYs play important roles in B. napus stress responses. Further, three BnaWRKY genes were strongly responsive to all three stresses simultaneously, which suggests that these three WRKY genes may have multi-functional roles in stress tolerance and can potentially be used in breeding new rapeseed cultivars. We also found six tandem repeat pairs exhibiting similar expression profiles under the various stress conditions, and three pairs were mapped in the stress related QTL regions

  20. Genome-Wide Identification and Expression Analysis of WRKY Transcription Factors under Multiple Stresses in Brassica napus.

    Directory of Open Access Journals (Sweden)

    Yajun He

    Full Text Available WRKY transcription factors play important roles in responses to environmental stress stimuli. Using a genome-wide domain analysis, we identified 287 WRKY genes with 343 WRKY domains in the sequenced genome of Brassica napus, 139 in the A sub-genome and 148 in the C sub-genome. These genes were classified into eight groups based on phylogenetic analysis. In the 343 WRKY domains, a total of 26 members showed divergence in the WRKY domain, and 21 belonged to group I. This finding suggested that WRKY genes in group I are more active and variable compared with genes in other groups. Using genome-wide identification and analysis of the WRKY gene family in Brassica napus, we observed genome duplication, chromosomal/segmental duplications and tandem duplication. All of these duplications contributed to the expansion of the WRKY gene family. The duplicate segments that were detected indicated that genome duplication events occurred in the two diploid progenitors B. rapa and B. oleracea before they combined to form B. napus. Analysis of the public microarray database and EST database for B. napus indicated that 74 WRKY genes were induced or preferentially expressed under stress conditions. According to the public QTL data, we identified 77 WRKY genes in 31 QTL regions related to tolerance of various stresses. We further evaluated the expression of 26 BnaWRKY genes under multiple stresses by qRT-PCR. Most of the genes were induced by low temperature, salinity and drought stress, indicating that the WRKYs play important roles in B. napus stress responses. Further, three BnaWRKY genes were strongly responsive to all three stresses simultaneously, which suggests that these three WRKY genes may have multi-functional roles in stress tolerance and can potentially be used in breeding new rapeseed cultivars. We also found six tandem repeat pairs exhibiting similar expression profiles under the various stress conditions, and three pairs were mapped in the stress related

  1. Preparation of genomic DNA from a single species of uncultured magnetotactic bacterium by multiple-displacement amplification.

    Science.gov (United States)

    Arakaki, Atsushi; Shibusawa, Mie; Hosokawa, Masahito; Matsunaga, Tadashi

    2010-03-01

    Magnetotactic bacteria comprise a phylogenetically diverse group that is capable of synthesizing intracellular magnetic particles. Although various morphotypes of magnetotactic bacteria have been observed in the environment, bacterial strains available in pure culture are currently limited to a few genera due to difficulties in their enrichment and cultivation. In order to obtain genetic information from uncultured magnetotactic bacteria, a genome preparation method that involves magnetic separation of cells, flow cytometry, and multiple displacement amplification (MDA) using phi29 polymerase was used in this study. The conditions for the MDA reaction using samples containing 1 to 100 cells were evaluated using a pure-culture magnetotactic bacterium, "Magnetospirillum magneticum AMB-1," whose complete genome sequence is available. Uniform gene amplification was confirmed by quantitative PCR (Q-PCR) when 100 cells were used as a template. This method was then applied for genome preparation of uncultured magnetotactic bacteria from complex bacterial communities in an aquatic environment. A sample containing 100 cells of the uncultured magnetotactic coccus was prepared by magnetic cell separation and flow cytometry and used as an MDA template. 16S rRNA sequence analysis of the MDA product from these 100 cells revealed that the amplified genomic DNA was from a single species of magnetotactic bacterium that was phylogenetically affiliated with magnetotactic cocci in the Alphaproteobacteria. The combined use of magnetic separation, flow cytometry, and MDA provides a new strategy to access individual genetic information from magnetotactic bacteria in environmental samples.

  2. Comparative genomic analysis of multiple strains of two unusual plant pathogens: Pseudomonas corrugata and Pseudomonas mediterranea

    Science.gov (United States)

    Trantas, Emmanouil A.; Licciardello, Grazia; Almeida, Nalvo F.; Witek, Kamil; Strano, Cinzia P.; Duxbury, Zane; Ververidis, Filippos; Goumas, Dimitrios E.; Jones, Jonathan D. G.; Guttman, David S.; Catara, Vittoria; Sarris, Panagiotis F.

    2015-01-01

    The non-fluorescent pseudomonads, Pseudomonas corrugata (Pcor) and P. mediterranea (Pmed), are closely related species that cause pith necrosis, a disease of tomato that causes severe crop losses. However, they also show strong antagonistic effects against economically important pathogens, demonstrating their potential for utilization as biological control agents. In addition, their metabolic versatility makes them attractive for the production of commercial biomolecules and bioremediation. An extensive comparative genomics study is required to dissect the mechanisms that Pcor and Pmed employ to cause disease, prevent disease caused by other pathogens, and to mine their genomes for genes that encode proteins involved in commercially important chemical pathways. Here, we present the draft genomes of nine Pcor and Pmed strains from different geographical locations. This analysis covered significant genetic heterogeneity and allowed in-depth genomic comparison. All examined strains were able to trigger symptoms in tomato plants but not all induced a hypersensitive-like response in Nicotiana benthamiana. Genome-mining revealed the absence of type III secretion system and known type III effector-encoding genes from all examined Pcor and Pmed strains. The lack of a type III secretion system appears to be unique among the plant pathogenic pseudomonads. Several gene clusters coding for type VI secretion system were detected in all genomes. Genome-mining also revealed the presence of gene clusters for biosynthesis of siderophores, polyketides, non-ribosomal peptides, and hydrogen cyanide. A highly conserved quorum sensing system was detected in all strains, although species specific differences were observed. Our study provides the basis for in-depth investigations regarding the molecular mechanisms underlying virulence strategies in the battle between plants and microbes. PMID:26300874

  3. Genomic resources for multiple species in the Drosophila ananassae species group.

    Science.gov (United States)

    Signor, Sarah; Seher, Thaddeus; Kopp, Artyom

    2013-01-01

    The development of genomic resources in non-model taxa is essential for understanding the genetic basis of biological diversity. Although the genomes of many Drosophila species have been sequenced, most of the phenotypic diversity in this genus remains to be explored. To facilitate the genetic analysis of interspecific and intraspecific variation, we have generated new genomic resources for seven species and subspecies in the D. ananassae species subgroup. We have generated large amounts of transcriptome sequence data for D. ercepeae, D. merina, D. bipectinata, D. malerkotliana malerkotliana, D. m. pallens, D. pseudoananassae pseudoananassae, and D. p. nigrens. De novo assembly resulted in contigs covering more than half of the predicted transcriptome and matching an average of 59% of annotated genes in the complete genome of D. ananassae. Most contigs, corresponding to an average of 49% of D. ananassae genes, contain sequence polymorphisms that can be used as genetic markers. Subsets of these markers were validated by genotyping the progeny of inter- and intraspecific crosses. The ananassae subgroup is an excellent model system for examining the molecular basis of speciation and phenotypic evolution. The new genomic resources will facilitate the genetic analysis of inter- and intraspecific differences in this lineage. Transcriptome sequencing provides a simple and cost-effective way to identify molecular markers at nearly single-gene density, and is equally applicable to any non-model taxa.

  4. Genome-wide meta-analyses identify multiple loci associated with smoking behavior

    NARCIS (Netherlands)

    H. Furberg (Helena); Y. Kim (Yunjung); J. Dackor (Jennifer); E.A. Boerwinkle (Eric); N. Franceschini (Nora); D. Ardissino (Diego); L. Bernardinelli (Luisa); P.M. Mannucci (Pier); F. Mauri (Francesco); P.A. Merlini (Piera); D. Absher (Devin); T.L. Assimes (Themistocles); S.P. Fortmann (Stephen); C. Iribarren (Carlos); J.W. Knowles (Joshua); T. Quertermous (Thomas); L. Ferrucci (Luigi); T. Tanaka (Toshiko); J.C. Bis (Joshua); T. Haritunians (Talin); B. McKnight (Barbara); B.M. Psaty (Bruce); K.D. Taylor (Kent); E.L. Thacker (Evan); P. Almgren (Peter); L. Groop (Leif); C. Ladenvall (Claes); M. Boehnke (Michael); A.U. Jackson (Anne); K.L. Mohlke (Karen); H.M. Stringham (Heather); J. Tuomilehto (Jaakko); E.J. Benjamin (Emelia); S.J. Hwang; D. Levy (Daniel); S.R. Preis; R.S. Vasan (Ramachandran Srini); J. Duan (Jubao); P.V. Gejman (Pablo); D.F. Levinson (Douglas); A.R. Sanders (Alan); J. Shi (Jianxin); E.H. Lips (Esther); J.D. McKay (James); A. Agudo (Antonio); L. Barzan (Luigi); V. Bencko (Vladimir); S. Benhamou (Simone); X. Castellsagué (Xavier); C. Canova (Cristina); D.I. Conway (David); E. Fabianova (Eleonora); L. Foretova (Lenka); V. Janout (Vladimir); C.M. Healy (Claire); I. Holcátová (Ivana); K. Kjaerheim (Kristina); P. Lagiou; J. Lissowska (Jolanta); R. Lowry (Ray); T.V. MacFarlane (Tatiana); D. Mates (Dana); L. Richiardi (Lorenzo); P. Rudnai (Peter); N. Szeszenia-Dabrowska (Neonilia); D. Zaridze; A. Znaor (Ariana); M. Lathrop (Mark); P. Brennan (Paul); S. Bandinelli (Stefania); T.M. Frayling (Timothy); J.M. Guralnik (Jack); Y. Milaneschi (Yuri); J.R.B. Perry (John); D. Altshuler (David); R. Elosua (Roberto); S. Kathiresan (Sekar); G. Lucas (Gavin); O. Melander (Olle); V. Salomaa (Veikko); S.M. Schwartz (Stephen); B.F. Voight (Benjamin); B.W.J.H. Penninx (Brenda); J.H. Smit (Johannes); N. Vogelzangs (Nicole); D.I. Boomsma (Dorret); E.J.C. de Geus (Eco); J.M. Vink (Jacqueline); G.A.H.M. Willemsen (Gonneke); S.J. Chanock (Stephen); F. Gu (Fangyi); S.E. Hankinson (Susan); D. Hunter (David); A. Hofman (Albert); H.W. Tiemeier (Henning); A.G. Uitterlinden (André); P. Tikka-Kleemola (Päivi); S. Walter (Stefan); D.I. Chasman (Daniel); B.M. Everett (Brendan); G. Pare (Guillaume); P.M. Ridker (Paul); M.D. Li (Ming); H.H. Maes (Hermine); J. Audrain-Mcgovern (Janet); D. Posthuma (Danielle); L.M. Thornton (Laura); C. Lerman (Caryn); J. Kaprio (Jaakko); J.E. Rose (Jed); J.P.A. Ioannidis (John); P. Kraft (Peter); D.Y. Lin (Dan); P.F. Sullivan (Patrick); C.J. O'Donnell (Christopher)

    2010-01-01

    Consistent but indirect evidence has implicated genetic factors in smoking behavior. We report meta-analyses of several smoking phenotypes within cohorts of the Tobacco and Genetics Consortium (n = 74,053). We also partnered with the European Network of Genetic and Genomic Epidemiology

  5. Meta-analysis of genome-wide association studies discovers multiple loci for chronic lymphocytic leukemia

    NARCIS (Netherlands)

    Berndt, Sonja I; Camp, Nicola J; Skibola, Christine F; Vijai, Joseph; Wang, Zhaoming; Gu, Jian; Nieters, Alexandra; Kelly, Rachel S; Smedby, Karin E; Monnereau, Alain; Cozen, Wendy; Cox, Angela; Wang, Sophia S; Lan, Qing; Teras, Lauren R; Machado, Moara; Yeager, Meredith; Brooks-Wilson, Angela R; Hartge, Patricia; Purdue, Mark P; Birmann, Brenda M; Vajdic, Claire M; Cocco, Pierluigi; Zhang, Yawei; Giles, Graham G; Zeleniuch-Jacquotte, Anne; Lawrence, Charles; Montalvan, Rebecca; Burdett, Laurie; Hutchinson, Amy; Ye, Yuanqing; Call, Timothy G; Shanafelt, Tait D; Novak, Anne J; Kay, Neil E; Liebow, Mark; Cunningham, Julie M; Allmer, Cristine; Hjalgrim, Henrik; Adami, Hans-Olov; Melbye, Mads; Glimelius, Bengt; Chang, Ellen T; Glenn, Martha; Curtin, Karen; Cannon-Albright, Lisa A; Diver, W Ryan; Link, Brian K; Weiner, George J; Conde, Lucia; Bracci, Paige M; Riby, Jacques; Arnett, Donna K; Zhi, Degui; Leach, Justin M; Holly, Elizabeth A; Jackson, Rebecca D; Tinker, Lesley F; Benavente, Yolanda; Sala, Núria; Casabonne, Delphine; Becker, Nikolaus; Boffetta, Paolo; Brennan, Paul; Foretova, Lenka; Maynadie, Marc; McKay, James; Staines, Anthony; Chaffee, Kari G; Achenbach, Sara J; Vachon, Celine M; Goldin, Lynn R; Strom, Sara S; Leis, Jose F; Weinberg, J Brice; Caporaso, Neil E; Norman, Aaron D; De Roos, Anneclaire J; Morton, Lindsay M; Severson, Richard K; Riboli, Elio; Vineis, Paolo; Kaaks, Rudolph; Masala, Giovanna; Weiderpass, Elisabete; Chirlaque, María-Dolores; Vermeulen, Roel C H; Travis, Ruth C; Southey, Melissa C; Milne, Roger L; Albanes, Demetrius; Virtamo, Jarmo; Weinstein, Stephanie; Clavel, Jacqueline; Zheng, Tongzhang; Holford, Theodore R; Villano, Danylo J; Maria, Ann; Spinelli, John J; Gascoyne, Randy D; Connors, Joseph M; Bertrand, Kimberly A; Giovannucci, Edward; Kraft, Peter; Kricker, Anne; Turner, Jenny; Ennas, Maria Grazia; Ferri, Giovanni M; Miligi, Lucia; Liang, Liming; Ma, Baoshan; Huang, Jinyan; Crouch, Simon; Park, Ju-Hyun; Chatterjee, Nilanjan; North, Kari E; Snowden, John A; Wright, Josh; Fraumeni, Joseph F; Offit, Kenneth; Wu, Xifeng; de Sanjose, Silvia; Cerhan, James R; Chanock, Stephen J; Rothman, Nathaniel; Slager, Susan L

    2016-01-01

    Chronic lymphocytic leukemia (CLL) is a common lymphoid malignancy with strong heritability. To further understand the genetic susceptibility for CLL and identify common loci associated with risk, we conducted a meta-analysis of four genome-wide association studies (GWAS) composed of 3,100 cases and

  6. Genome-wide meta-analysis identifies multiple novel associations and ethnic heterogeneity of psoriasis susceptibility

    NARCIS (Netherlands)

    Yin, Xianyong; Low, Hui Qi; Wang, Ling; Li, Yonghong; Ellinghaus, Eva; Han, Jiali; Estivill, Xavier; Sun, Liangdan; Zuo, Xianbo; Shen, Changbing; Zhu, Caihong; Zhang, Anping; Sanchez, Fabio; Padyukov, Leonid; Catanese, Joseph J; Krueger, Gerald G; Duffin, Kristina Callis; Mucha, Sören; Weichenthal, Michael; Weidinger, Stephan; Lieb, Wolfgang; Foo, Jia Nee; Li, Yi; Sim, Karseng; Liany, Herty; Irwan, Ishak; Teo, Yikying; Theng, Colin T S; Gupta, Rashmi; Bowcock, Anne; De Jager, Philip L; Qureshi, Abrar A; de Bakker, Paul I W; Seielstad, Mark; Liao, Wilson; Ståhle, Mona; Franke, Andre; Zhang, Xuejun; Liu, Jianjun

    2015-01-01

    Psoriasis is a common inflammatory skin disease with complex genetics and different degrees of prevalence across ethnic populations. Here we present the largest trans-ethnic genome-wide meta-analysis (GWMA) of psoriasis in 15,369 cases and 19,517 controls of Caucasian and Chinese ancestries. We

  7. Genome-wide association study identifies multiple susceptibility loci for diffuse large B cell lymphoma

    NARCIS (Netherlands)

    Cerhan, James R.; Berndt, Sonja I.; Vijai, Joseph; Ghesquières, Hervé; McKay, James; Wang, Sophia S.; Wang, Zhaoming; Yeager, Meredith; Conde, Lucia; De Bakker, Paul I W; Nieters, Alexandra; Cox, David; Burdett, Laurie; Monnereau, Alain; Flowers, Christopher R.; De Roos, Anneclaire J.; Brooks-Wilson, Angela R.; Lan, Qing; Severi, Gianluca; Melbye, Mads; Gu, Jian; Jackson, Rebecca D.; Kane, Eleanor; Teras, Lauren R.; Purdue, Mark P.; Vajdic, Claire M.; Spinelli, John J.; Giles, Graham G.; Albanes, Demetrius; Kelly, Rachel S.; Zucca, Mariagrazia; Bertrand, Kimberly A.; Zeleniuch-Jacquotte, Anne; Lawrence, Charles; Hutchinson, Amy; Zhi, Degui; Habermann, Thomas M.; Link, Brian K.; Novak, Anne J.; Dogan, Ahmet; Asmann, Yan W.; Liebow, Mark; Thompson, Carrie A.; Ansell, Stephen M.; Witzig, Thomas E.; Weiner, George J.; Veron, Amelie S.; Zelenika, Diana; Tilly, Hervé; Haioun, Corinne; Molina, Thierry Jo; Hjalgrim, Henrik; Glimelius, Bengt; Adami, Hans Olov; Bracci, Paige M.; Riby, Jacques; Smith, Martyn T.; Holly, Elizabeth A.; Cozen, Wendy; Hartge, Patricia; Morton, Lindsay M.; Severson, Richard K.; Tinker, Lesley F.; North, Kari E.; Becker, Nikolaus; Benavente, Yolanda; Boffetta, Paolo; Brennan, Paul; Foretova, Lenka; Maynadie, Marc; Staines, Anthony; Lightfoot, Tracy; Crouch, Simon; Smith, Alex; Roman, Eve; Diver, W. Ryan; Offit, Kenneth; Zelenetz, Andrew; Klein, Robert J.; Villano, Danylo J.; Zheng, Tongzhang; Zhang, Yawei; Holford, Theodore R.; Kricker, Anne; Turner, Jenny; Southey, Melissa C.; Clavel, Jacqueline; Virtamo, Jarmo; Weinstein, Stephanie; Riboli, Elio; Vineis, Paolo; Kaaks, Rudolph; Trichopoulos, Dimitrios; Vermeulen, Roel C H; Boeing, Heiner; Tjonneland, Anne; Angelucci, Emanuele; Di Lollo, Simonetta; Rais, Marco; Birmann, Brenda M.; Laden, Francine; Giovannucci, Edward; Kraft, Peter; Huang, Jinyan; Ma, Baoshan; Ye, Yuanqing; Chiu, Brian C H; Sampson, Joshua; Liang, Liming; Park, Ju Hyun; Chung, Charles C.; Weisenburger, Dennis D.; Chatterjee, Nilanjan; Fraumeni, Joseph F.; Slager, Susan L.; Wu, Xifeng; De Sanjose, Silvia; Smedby, Karin E.; Salles, Gilles; Skibola, Christine F.; Rothman, Nathaniel; Chanock, Stephen J.

    2014-01-01

    Diffuse large B cell lymphoma (DLBCL) is the most common lymphoma subtype and is clinically aggressive. To identify genetic susceptibility loci for DLBCL, we conducted a meta-analysis of 3 new genome-wide association studies (GWAS) and 1 previous scan, totaling 3,857 cases and 7,666 controls of

  8. Meta-analyses of genome-wide association studies identify multiple loci associated with pulmonary function

    NARCIS (Netherlands)

    D.B. Hancock (Dana); M. Eijgelsheim (Mark); J.B. Wilk (Jemma); S.A. Gharib (Sina); L.R. Loehr (Laura); K. Marciante (Kristin); N. Franceschini (Nora); Y.M.T.A. van Durme; T.H. Chen; R.G. Barr (Graham); M.B. Schabath (Matthew); D.J. Couper (David); G.G. Brusselle (Guy); B.M. Psaty (Bruce); P. Tikka-Kleemola (Päivi); J.I. Rotter (Jerome); A.G. Uitterlinden (André); A. Hofman (Albert); N.M. Punjabi (Naresh); F. Rivadeneira Ramirez (Fernando); A.C. Morrison (Alanna); P.L. Enright (Paul); K.E. North (Kari); S.R. Heckbert (Susan); T. Lumley (Thomas); B.H.Ch. Stricker (Bruno); G.T. O'Connor (George); S.J. London (Stephanie)

    2010-01-01

    Spirometric measures of lung function are heritable traits that reflect respiratory health and predict morbidity and mortality. We meta-analyzed genome-wide association studies for two clinically important lung-function measures: forced expiratory volume in the first second (FEV1) and

  9. H2DB: a heritability database across multiple species by annotating trait-associated genomic loci.

    Science.gov (United States)

    Kaminuma, Eli; Fujisawa, Takatomo; Tanizawa, Yasuhiro; Sakamoto, Naoko; Kurata, Nori; Shimizu, Tokurou; Nakamura, Yasukazu

    2013-01-01

    H2DB (http://tga.nig.ac.jp/h2db/), an annotation database of genetic heritability estimates for humans and other species, has been developed as a knowledge database to connect trait-associated genomic loci. Heritability estimates have been investigated for individual species, particularly in human twin studies and plant/animal breeding studies. However, there appears to be no comprehensive heritability database for both humans and other species. Here, we introduce an annotation database for genetic heritabilities of various species that was annotated by manually curating online public resources in PUBMED abstracts and journal contents. The proposed heritability database contains attribute information for trait descriptions, experimental conditions, trait-associated genomic loci and broad- and narrow-sense heritability specifications. Annotated trait-associated genomic loci, for which most are single-nucleotide polymorphisms derived from genome-wide association studies, may be valuable resources for experimental scientists. In addition, we assigned phenotype ontologies to the annotated traits for the purposes of discussing heritability distributions based on phenotypic classifications.
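    For readers unfamiliar with the two heritability specifications named in this record, the conventional textbook definitions are sketched below. The symbols are standard quantitative-genetics notation and are an assumption here; they are not taken from H2DB itself. The phenotypic variance V_P is split into genetic (V_G) and environmental (V_E) parts, and V_G is further split into additive (V_A), dominance (V_D) and interaction (V_I) components.

```latex
\begin{align}
  V_P &= V_G + V_E, & V_G &= V_A + V_D + V_I, \\
  H^2 &= \frac{V_G}{V_P} \quad \text{(broad sense)}, &
  h^2 &= \frac{V_A}{V_P} \quad \text{(narrow sense)}.
\end{align}
```

    Because V_A is at most V_G, the narrow-sense value never exceeds the broad-sense value for the same trait and population.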

  10. Unexpected inheritance: multiple integrations of ancient bornavirus and ebolavirus/marburgvirus sequences in vertebrate genomes.

    Science.gov (United States)

    Belyi, Vladimir A; Levine, Arnold J; Skalka, Anna Marie

    2010-07-29

    Vertebrate genomes contain numerous copies of retroviral sequences, acquired over the course of evolution. Until recently they were thought to be the only type of RNA viruses to be so represented, because integration of a DNA copy of their genome is required for their replication. In this study, an extensive sequence comparison was conducted in which 5,666 viral genes from all known non-retroviral families with single-stranded RNA genomes were matched against the germline genomes of 48 vertebrate species, to determine if such viruses could also contribute to the vertebrate genetic heritage. In 19 of the tested vertebrate species, we discovered as many as 80 high-confidence examples of genomic DNA sequences that appear to be derived, as long ago as 40 million years, from ancestral members of 4 currently circulating virus families with single strand RNA genomes. Surprisingly, almost all of the sequences are related to only two families in the Order Mononegavirales: the Bornaviruses and the Filoviruses, which cause lethal neurological disease and hemorrhagic fevers, respectively. Based on signature landmarks some, and perhaps all, of the endogenous virus-like DNA sequences appear to be LINE element-facilitated integrations derived from viral mRNAs. The integrations represent genes that encode viral nucleocapsid, RNA-dependent-RNA-polymerase, matrix and, possibly, glycoproteins. Integrations are generally limited to one or very few copies of a related viral gene per species, suggesting that once the initial germline integration was obtained (or selected), later integrations failed or provided little advantage to the host. The conservation of relatively long open reading frames for several of the endogenous sequences, the virus-like protein regions represented, and a potential correlation between their presence and a species' resistance to the diseases caused by these pathogens, are consistent with the notion that their products provide some important biological

  11. Unexpected inheritance: multiple integrations of ancient bornavirus and ebolavirus/marburgvirus sequences in vertebrate genomes.

    Directory of Open Access Journals (Sweden)

    Vladimir A Belyi

    2010-07-01

    Full Text Available Vertebrate genomes contain numerous copies of retroviral sequences, acquired over the course of evolution. Until recently they were thought to be the only type of RNA viruses to be so represented, because integration of a DNA copy of their genome is required for their replication. In this study, an extensive sequence comparison was conducted in which 5,666 viral genes from all known non-retroviral families with single-stranded RNA genomes were matched against the germline genomes of 48 vertebrate species, to determine if such viruses could also contribute to the vertebrate genetic heritage. In 19 of the tested vertebrate species, we discovered as many as 80 high-confidence examples of genomic DNA sequences that appear to be derived, as long ago as 40 million years, from ancestral members of 4 currently circulating virus families with single strand RNA genomes. Surprisingly, almost all of the sequences are related to only two families in the Order Mononegavirales: the Bornaviruses and the Filoviruses, which cause lethal neurological disease and hemorrhagic fevers, respectively. Based on signature landmarks some, and perhaps all, of the endogenous virus-like DNA sequences appear to be LINE element-facilitated integrations derived from viral mRNAs. The integrations represent genes that encode viral nucleocapsid, RNA-dependent-RNA-polymerase, matrix and, possibly, glycoproteins. Integrations are generally limited to one or very few copies of a related viral gene per species, suggesting that once the initial germline integration was obtained (or selected), later integrations failed or provided little advantage to the host. The conservation of relatively long open reading frames for several of the endogenous sequences, the virus-like protein regions represented, and a potential correlation between their presence and a species' resistance to the diseases caused by these pathogens, are consistent with the notion that their products provide some important

  12. Insights on Genomic and Molecular Alterations in Multiple Myeloma and Their Incorporation towards Risk-Adapted Treatment Strategy: Concise Clinical Review

    Directory of Open Access Journals (Sweden)

    Taiga Nishihori

    2017-01-01

    Full Text Available Although recent advances in novel treatment approaches and therapeutics have shifted the treatment landscape of multiple myeloma, it remains an incurable plasma cell malignancy. Growing knowledge of the genome and expressed genomic information characterizing the biologic behavior of multiple myeloma continues to accumulate. However, translation and incorporation of vast molecular understanding of complex tumor biology to deliver personalized and precision treatment to cure multiple myeloma have not been successful to date. Our review focuses on current evidence and understanding of myeloma biology with characterization in the context of genomic and molecular alterations. We also discuss future clinical application of the genomic and molecular knowledge, and more translational research is needed to benefit our myeloma patients.

  13. Genome-wide SNP identification in multiple morphotypes of allohexaploid tall fescue (Festuca arundinacea Schreb.)

    Directory of Open Access Journals (Sweden)

    Hand Melanie L

    2012-06-01

    Full Text Available Abstract. Background: Single nucleotide polymorphisms (SNPs) provide essential tools for the advancement of research in plant genomics, and the development of SNP resources for many species has been accelerated by the capabilities of second-generation sequencing technologies. The current study aimed to develop and use a novel bioinformatic pipeline to generate a comprehensive collection of SNP markers within the agriculturally important pasture grass tall fescue; an outbreeding allopolyploid species displaying three distinct morphotypes: Continental, Mediterranean and rhizomatous. Results: A bioinformatic pipeline was developed that successfully identified SNPs within genotypes from distinct tall fescue morphotypes, following the sequencing of 414 polymerase chain reaction (PCR)-generated amplicons using 454 GS FLX technology. Equivalent amplicon sets were derived from representative genotypes of each morphotype, including six Continental, five Mediterranean and one rhizomatous. A total of 8,584 and 2,292 SNPs were identified with high confidence within the Continental and Mediterranean morphotypes respectively. The success of the bioinformatic approach was demonstrated through validation (at a rate of 70%) of a subset of 141 SNPs using both SNaPshot™ and GoldenGate™ assay chemistries. Furthermore, the quantitative genotyping capability of the GoldenGate™ assay revealed that approximately 30% of the putative SNPs were accessible to co-dominant scoring, despite the hexaploid genome structure. The sub-genome-specific origin of each SNP validated from Continental tall fescue was predicted using a phylogenetic approach based on comparison with orthologous sequences from predicted progenitor species. Conclusions: Using the appropriate bioinformatic approach, amplicon resequencing based on 454 GS FLX technology is an effective method for the identification of polymorphic SNPs within the genomes of Continental and Mediterranean tall fescue. The

  14. A bi-dimensional genome scan for prolificacy traits in pigs shows the existence of multiple epistatic QTL

    Directory of Open Access Journals (Sweden)

    Bidanel Jean P

    2009-12-01

    Background: Prolificacy is the most important trait influencing the reproductive efficiency of pig production systems. The low heritability and sex-limited expression of prolificacy have hindered to some extent the improvement of this trait through artificial selection. Moreover, the relative contributions of additive, dominant and epistatic QTL to the genetic variance of pig prolificacy remain to be defined. In this work, we have addressed this issue by performing one-dimensional and bi-dimensional genome scans for number of piglets born alive (NBA) and total number of piglets born (TNB) in a three-generation Iberian by Meishan F2 intercross. Results: The one-dimensional genome scan for NBA and TNB revealed the existence of two genome-wide highly significant QTL located on SSC13 (P ...) and SSC17 (P ...) ... Conclusions: The complex inheritance of prolificacy traits in pigs has been evidenced by identifying multiple additive (SSC13 and SSC17), dominant and epistatic QTL in an Iberian × Meishan F2 intercross. Our results demonstrate that a significant fraction of the phenotypic variance of swine prolificacy traits can be attributed to first-order gene-by-gene interactions, emphasizing that the phenotypic effects of alleles might be strongly modulated by the genetic background where they segregate.

  15. Whole Genome Scan to Detect Chromosomal Regions Affecting Multiple Traits in Dairy Cattle

    NARCIS (Netherlands)

    Schrooten, C.; Bink, M.C.A.M.; Bovenhuis, H.

    2004-01-01

    Chromosomal regions affecting multiple traits (multiple trait quantitative trait regions or MQR) in dairy cattle were detected using a method based on results from single trait analyses to detect quantitative trait loci (QTL). The covariance between contrasts for different traits in single trait

  16. Evolutionary changes of multiple visual pigment genes in the complete genome of Pacific bluefin tuna

    OpenAIRE

    Nakamura, Yoji; Mori, Kazuki; Saitoh, Kenji; Oshima, Kenshiro; Mekuchi, Miyuki; Sugaya, Takuma; Shigenobu, Yuya; Ojima, Nobuhiko; Muta, Shigeru; Fujiwara, Atushi; Yasuike, Motoshige; Oohara, Ichiro; Hirakawa, Hideki; Chowdhury, Vishwajit Sur; Kobayashi, Takanori

    2013-01-01

    Tunas are migratory fishes in offshore habitats and top predators with unique features. Despite their ecological importance and high market values, the open-ocean lifestyle of tuna, in which effective sensing systems such as color vision are required for capture of prey, has been poorly understood. To elucidate the genetic and evolutionary basis of optic adaptation of tuna, we determined the genome sequence of the Pacific bluefin tuna (Thunnus orientalis), using next-generation sequencing tec...

  17. From "Cellular" RNA to "Smart" RNA: Multiple Roles of RNA in Genome Stability and Beyond.

    Science.gov (United States)

    Michelini, Flavia; Jalihal, Ameya P; Francia, Sofia; Meers, Chance; Neeb, Zachary T; Rossiello, Francesca; Gioia, Ubaldo; Aguado, Julio; Jones-Weinert, Corey; Luke, Brian; Biamonti, Giuseppe; Nowacki, Mariusz; Storici, Francesca; Carninci, Piero; Walter, Nils G; Fagagna, Fabrizio d'Adda di

    2018-03-30

    Coding for proteins has been considered the main function of RNA since the "central dogma" of biology was proposed. The discovery of noncoding transcripts shed light on additional roles of RNA, ranging from the support of polypeptide synthesis, to the assembly of subnuclear structures, to gene expression modulation. Cellular RNA has therefore been recognized as a central player in often unanticipated biological processes, including genomic stability. This ever-expanding list of functions inspired us to think of RNA as a "smart" phone, which has replaced the older obsolete "cellular" phone. In this review, we summarize the last two decades of advances in research on the interface between RNA biology and genome stability. We start with an account of the emergence of noncoding RNA, and then we discuss the involvement of RNA in DNA damage signaling and repair, telomere maintenance, and genomic rearrangements. We continue with the depiction of single-molecule RNA detection techniques, and we conclude by illustrating the possibilities of RNA modulation in hopes of creating or improving new therapies. The widespread biological functions of RNA have made this molecule a reoccurring theme in basic and translational research, warranting it the transcendence from classically studied "cellular" RNA to "smart" RNA.

  18. The complete sequence of the first Spodoptera frugiperda Betabaculovirus genome: a natural multiple recombinant virus.

    Science.gov (United States)

    Cuartas, Paola E; Barrera, Gloria P; Belaich, Mariano N; Barreto, Emiliano; Ghiringhelli, Pablo D; Villamizar, Laura F

    2015-01-20

    Spodoptera frugiperda (Lepidoptera: Noctuidae) is a major pest in maize crops in Colombia, and affects several regions in America. A granulovirus isolated from S. frugiperda (SfGV VG008) has potential as an enhancer of insecticidal activity of previously described nucleopolyhedrovirus from the same insect species (SfMNPV). The SfGV VG008 genome was sequenced and analyzed showing circular double stranded DNA of 140,913 bp encoding 146 putative ORFs that include 37 Baculoviridae core genes, 88 shared with betabaculoviruses, two shared only with betabaculoviruses from Noctuidae insects, two shared with alphabaculoviruses, three copies of own genes (paralogs) and the other 14 corresponding to unique genes without representation in the other baculovirus species. Particularly, the genome encodes important virulence factors such as 4 chitinases and 2 enhancins. The sequence analysis revealed the existence of eight homologous regions (hrs) and also suggests processes of gene acquisition by horizontal transfer including the SfGV VG008 ORFs 046/047 (paralogs), 059, 089 and 099. The bioinformatics evidence indicates that the genome donors of the mentioned genes could be alpha- and/or betabaculovirus species. The previously reported ability of SfGV VG008 to naturally co-infect the same host with other viruses shows a possible mechanism to capture genes and thus improve its fitness.

  19. Early modern human dispersal from Africa: genomic evidence for multiple waves of migration.

    Science.gov (United States)

    Tassi, Francesca; Ghirotto, Silvia; Mezzavilla, Massimo; Vilaça, Sibelle Torres; De Santi, Lisa; Barbujani, Guido

    2015-01-01

    Anthropological and genetic data agree in indicating the African continent as the main place of origin for anatomically modern humans. However, it is unclear whether early modern humans left Africa through a single, major process, dispersing simultaneously over Asia and Europe, or in two main waves, first through the Arab Peninsula into southern Asia and Oceania, and later through a northern route crossing the Levant. Here, we show that accurate genomic estimates of the divergence times between European and African populations are more recent than those between Australo-Melanesia and Africa and incompatible with the effects of a single dispersal. This difference cannot possibly be accounted for by the effects of either hybridization with archaic human forms in Australo-Melanesia or back migration from Europe into Africa. Furthermore, in several populations of Asia we found evidence for relatively recent genetic admixture events, which could have obscured the signatures of the earliest processes. We conclude that the hypothesis of a single major human dispersal from Africa appears hardly compatible with the observed historical and geographical patterns of genome diversity and that Australo-Melanesian populations seem still to retain a genomic signature of a more ancient divergence from Africa.

  20. The Complete Sequence of the First Spodoptera frugiperda Betabaculovirus Genome: A Natural Multiple Recombinant Virus

    Directory of Open Access Journals (Sweden)

    Paola E. Cuartas

    2015-01-01

    Spodoptera frugiperda (Lepidoptera: Noctuidae) is a major pest in maize crops in Colombia, and affects several regions in America. A granulovirus isolated from S. frugiperda (SfGV VG008) has potential as an enhancer of insecticidal activity of previously described nucleopolyhedrovirus from the same insect species (SfMNPV). The SfGV VG008 genome was sequenced and analyzed showing circular double stranded DNA of 140,913 bp encoding 146 putative ORFs that include 37 Baculoviridae core genes, 88 shared with betabaculoviruses, two shared only with betabaculoviruses from Noctuidae insects, two shared with alphabaculoviruses, three copies of own genes (paralogs) and the other 14 corresponding to unique genes without representation in the other baculovirus species. Particularly, the genome encodes important virulence factors such as 4 chitinases and 2 enhancins. The sequence analysis revealed the existence of eight homologous regions (hrs) and also suggests processes of gene acquisition by horizontal transfer including the SfGV VG008 ORFs 046/047 (paralogs), 059, 089 and 099. The bioinformatics evidence indicates that the genome donors of the mentioned genes could be alpha- and/or betabaculovirus species. The previously reported ability of SfGV VG008 to naturally co-infect the same host with other viruses shows a possible mechanism to capture genes and thus improve its fitness.

  1. Efficient genome-wide association in biobanks using topic modeling identifies multiple novel disease loci.

    Science.gov (United States)

    McCoy, Thomas H; Castro, Victor M; Snapper, Leslie A; Hart, Kamber L; Perlis, Roy H

    2017-08-31

    Biobanks and national registries represent a powerful tool for genomic discovery, but rely on diagnostic codes that may be unreliable and fail to capture the relationship between related diagnoses. We developed an efficient means of conducting genome-wide association studies using combinations of diagnostic codes from electronic health records (EHR) for 10845 participants in a biobanking program at two large academic medical centers. Specifically, we applied latent Dirichlet allocation to fit 50 disease topics based on diagnostic codes, then conducted genome-wide common-variant association for each topic. In sensitivity analysis, these results were contrasted with those obtained from traditional single-diagnosis phenome-wide association analysis, as well as those in which only a subset of diagnostic codes are included per topic. In meta-analysis across three biobank cohorts, we identified 23 disease-associated loci with p<1e-15, including previously associated autoimmune disease loci. In all cases, observed significant associations were of greater magnitude than for single phenome-wide diagnostic codes, and incorporation of less strongly-loading diagnostic codes enhanced association. This strategy provides a more efficient means of phenome-wide association in biobanks with coded clinical data.

  2. The Arabidopsis thaliana Homolog of the Helicase RTEL1 Plays Multiple Roles in Preserving Genome Stability[C][W]

    Science.gov (United States)

    Recker, Julia; Knoll, Alexander; Puchta, Holger

    2014-01-01

    In humans, mutations in the DNA helicase Regulator of Telomere Elongation Helicase1 (RTEL1) lead to Hoyeraal-Hreidarsson syndrome, a severe, multisystem disorder. Here, we demonstrate that the RTEL1 homolog in Arabidopsis thaliana plays multiple roles in preserving genome stability. RTEL1 suppresses homologous recombination in a pathway parallel to that of the DNA translocase FANCM. Cytological analyses of root meristems indicate that RTEL1 is involved in processing DNA replication intermediates independently from FANCM and the nuclease MUS81. Moreover, RTEL1 is involved in interstrand and intrastrand DNA cross-link repair independently from FANCM and (in intrastrand cross-link repair) parallel to MUS81. RTEL1 contributes to telomere homeostasis; the concurrent loss of RTEL1 and the telomerase TERT leads to rapid, severe telomere shortening, which occurs much more rapidly than it does in the single-mutant line tert, resulting in developmental arrest after four generations. The double mutant rtel1-1 recq4A-4 exhibits massive growth defects, indicating that this RecQ family helicase, which is also involved in the suppression of homologous recombination and the repair of DNA lesions, can partially replace RTEL1 in the processing of DNA intermediates. The requirement for RTEL1 in multiple pathways to preserve genome stability in plants can be explained by its putative role in the destabilization of DNA loop structures, such as D-loops and T-loops. PMID:25516598

  3. Genome-Wide Screening of Cytogenetic Abnormalities in Multiple Myeloma Patients Using Array-CGH Technique: A Czech Multicenter Experience

    Directory of Open Access Journals (Sweden)

    Jan Smetana

    2014-01-01

    Characteristic recurrent copy number aberrations (CNAs) play a key role in multiple myeloma (MM) pathogenesis and have important prognostic significance for MM patients. Array-based comparative genomic hybridization (aCGH) provides a powerful tool for genome-wide classification of CNAs and thus should be implemented into MM routine diagnostics. We demonstrate the possibility of effective utilization of oligonucleotide-based aCGH in 91 MM patients. Chromosomal aberrations associated with effect on the prognosis of MM were initially evaluated by I-FISH and were found in 93.4% (85/91). Incidence of hyperdiploidy was 49.5% (45/91); del(13)(q14) was detected in 57.1% (52/91); gain(1)(q21) occurred in 58.2% (53/91); del(17)(p13) was observed in 15.4% (14/91); and t(4;14)(p16;q32) was found in 18.6% (16/86). Genome-wide screening using Agilent 44K aCGH microarrays revealed copy number alterations in 100% (91/91). Most common deletions were found at 13q (58.9%), 1p (39.6%), and 8p (31.1%), whereas gain of whole 1q was the most often duplicated region (50.6%). Furthermore, frequent homozygous deletions of genes playing important role in myeloma biology such as TRAF3, BIRC1/BIRC2, RB1, or CDKN2C were observed. Taken together, we demonstrated the utilization of aCGH technique in clinical diagnostics as powerful tool for identification of unbalanced genomic abnormalities with prognostic significance for MM patients.

  4. Quantitative genome re-sequencing defines multiple mutations conferring chloroquine resistance in rodent malaria

    Science.gov (United States)

    2012-01-01

    Background: Drug resistance in the malaria parasite Plasmodium falciparum severely compromises the treatment and control of malaria. A knowledge of the critical mutations conferring resistance to particular drugs is important in understanding modes of drug action and mechanisms of resistance. Such knowledge is required to design better therapies and limit drug resistance. A mutation in the gene (pfcrt) encoding a membrane transporter has been identified as a principal determinant of chloroquine resistance in P. falciparum, but we lack a full account of higher level chloroquine resistance. Furthermore, the determinants of resistance in the other major human malaria parasite, P. vivax, are not known. To address these questions, we investigated the genetic basis of chloroquine resistance in an isogenic lineage of rodent malaria parasite P. chabaudi in which high level resistance to chloroquine has been progressively selected under laboratory conditions. Results: Loci containing the critical genes were mapped by Linkage Group Selection, using a genetic cross between the high-level chloroquine-resistant mutant and a genetically distinct sensitive strain. A novel high-resolution quantitative whole-genome re-sequencing approach was used to reveal three regions of selection on chr11, chr03 and chr02 that appear progressively at increasing drug doses on three chromosomes. Whole-genome sequencing of the chloroquine-resistant parent identified just four point mutations in different genes on these chromosomes. Three mutations are located at the foci of the selection valleys and are therefore predicted to confer different levels of chloroquine resistance. The critical mutation conferring the first level of chloroquine resistance is found in aat1, a putative amino acid transporter. Conclusions: Quantitative trait loci conferring selectable phenotypes, such as drug resistance, can be mapped directly using progressive genome-wide linkage group selection. Quantitative genome-wide short

  5. Genome-wide meta-analyses identify multiple loci associated with smoking behavior.

    LENUS (Irish Health Repository)

    2010-05-01

    Consistent but indirect evidence has implicated genetic factors in smoking behavior. We report meta-analyses of several smoking phenotypes within cohorts of the Tobacco and Genetics Consortium (n = 74,053). We also partnered with the European Network of Genetic and Genomic Epidemiology (ENGAGE) and Oxford-GlaxoSmithKline (Ox-GSK) consortia to follow up the 15 most significant regions (n > 140,000). We identified three loci associated with number of cigarettes smoked per day. The strongest association was a synonymous 15q25 SNP in the nicotinic receptor gene CHRNA3 (rs1051730[A], beta = 1.03, standard error (s.e.) = 0.053, P = 2.8 × 10^-73). Two 10q25 SNPs (rs1329650[G], beta = 0.367, s.e. = 0.059, P = 5.7 × 10^-10; and rs1028936[A], beta = 0.446, s.e. = 0.074, P = 1.3 × 10^-9) and one 19q13 SNP in EGLN2 (rs3733829[G], beta = 0.333, s.e. = 0.058, P = 1.0 × 10^-8) also exceeded genome-wide significance for cigarettes per day. For smoking initiation, eight SNPs exceeded genome-wide significance, with the strongest association at a nonsynonymous SNP in BDNF on chromosome 11 (rs6265[C], odds ratio (OR) = 1.06, 95% confidence interval (CI) 1.04-1.08, P = 1.8 × 10^-8). One SNP located near DBH on chromosome 9 (rs3025343[G], OR = 1.12, 95% CI 1.08-1.18, P = 3.6 × 10^-8) was significantly associated with smoking cessation.
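
    As a rough illustration of how a reported effect size and standard error relate to a P value, the sketch below applies a plain two-sided Wald test (z = beta / s.e.) to three of the SNPs quoted above. This is an assumption made for illustration, not the consortium's meta-analysis procedure; the function name `wald_p_value` is hypothetical, and because the inputs are rounded the recomputed values only approximate the published ones.

```python
import math


def wald_p_value(beta: float, se: float) -> tuple[float, float]:
    """Return (z, two-sided P) for a Wald test of beta against zero."""
    z = beta / se
    # Two-sided tail probability of a standard normal: P(|Z| > |z|).
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


# Effect sizes and standard errors as quoted (rounded) in the abstract.
reported = {
    "rs1329650[G]": (0.367, 0.059),
    "rs1028936[A]": (0.446, 0.074),
    "rs3733829[G]": (0.333, 0.058),
}

for snp, (beta, se) in reported.items():
    z, p = wald_p_value(beta, se)
    print(f"{snp}: z = {z:.2f}, P = {p:.1e}")
```

    Under this simple test, all three recomputed P values fall below the conventional genome-wide threshold of 5 × 10^-8, in line with the abstract's statement that these SNPs exceeded genome-wide significance.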

  6. Genome-wide association study identifies multiple risk loci for chronic lymphocytic leukemia

    OpenAIRE

    Berndt, S.I.; Skibola, C.F.; Joseph, V.; Camp, N.J.; Nieters, A.; Wang, Z.; Cozen, W.; Monnereau, A.; Wang, S.S.; Kelly, R.S.; Lan, Q.; Teras, L.R.; Chatterjee, N.; Chung, C.C.; Yeager, M.

    2013-01-01

    Genome-wide association studies (GWAS) have previously identified 13 loci associated with risk of chronic lymphocytic leukemia or small lymphocytic lymphoma (CLL). To identify additional CLL susceptibility loci, we conducted the largest meta-analysis for CLL thus far, including four GWAS with a total of 3,100 individuals with CLL (cases) and 7,667 controls. In the meta-analysis, we identified ten independent associated SNPs in nine new loci at 10q23.31 (ACTA2 or FAS (ACTA2/FAS), P = 1.22 × 10...

  7. Multiple source genes of HAmo SINE actively expanded and ongoing retroposition in cyprinid genomes relying on its partner LINE

    Directory of Open Access Journals (Sweden)

    Gan Xiaoni

    2010-04-01

    Background: We recently characterized HAmo SINE and its partner LINE in silver carp and bighead carp based on hybridization capture of repetitive elements from digested genomic DNA in solution using a bead-probe [1]. To reveal the distribution and evolutionary history of SINEs and LINEs in cyprinid genomes, we performed a multi-species search for HAmo SINE and its partner LINE using the bead-probe capture and internal-primer-SINE polymerase chain reaction (PCR) techniques. Results: Sixty-seven full-size and 125 internal-SINE sequences (as well as 34 full-size and 9 internal sequences previously reported in bighead carp and silver carp) from 17 species of the family Cyprinidae were aligned as well as 14 newly isolated HAmoL2 sequences. Four subfamilies (type I, II, III and IV), which were divided based on diagnostic nucleotides in the tRNA-unrelated region, expanded preferentially within a certain lineage or within the whole family of Cyprinidae as multiple active source genes. The copy numbers of HAmo SINEs were estimated to vary from 10^4 to 10^6 in cyprinid genomes by quantitative RT-PCR. Over one hundred type IV members were identified and characterized in the primitive cyprinid Danio rerio genome but only tens of sequences were found to be similar to type I, II and III, since type IV was the oldest subfamily and its members dispersed in almost all investigated cyprinid fishes. To determine the taxonomic distribution of HAmo SINE, inter-primer SINE PCR was conducted in non-cyprinid fishes; the results show that HAmo SINE-related sequences may be dispersed in other families of the order Cypriniformes but are absent from other orders of bony fishes: Siluriformes, Polypteriformes, Lepidosteiformes, Acipenseriformes and Osteoglossiformes. Conclusions: Depending on HAmo LINE2, multiple source genes (subfamilies) of HAmo SINE actively expanded and underwent retroposition in a certain lineage or within the whole family of Cyprinidae. From this

  8. Multiple independent structural dynamic events in the evolution of snake mitochondrial genomes.

    Science.gov (United States)

    Qian, Lifu; Wang, Hui; Yan, Jie; Pan, Tao; Jiang, Shanqun; Rao, Dingqi; Zhang, Baowei

    2018-05-10

    Mitochondrial DNA sequences have long been used in phylogenetic studies. However, little attention has been paid to the changes in gene arrangement patterns in the snake's mitogenome. Here, we analyzed the complete mitogenome sequences and structures of 65 snake species from 14 families and examined their structural patterns, organization and evolution. Our purpose was to further investigate the evolutionary implications and possible rearrangement mechanisms of the mitogenome within snakes. In total, eleven types of mitochondrial gene arrangement patterns were detected (Type I, II, III, III-A, III-B, III-B1, III-C, III-D, III-E, III-F, III-G), with mitochondrial genome rearrangements being a major trend in snakes, especially in Alethinophidia. In snake mitogenomes, the rearrangements mainly involved three processes: gene loss, translocation and duplication. Within Scolecophidia, the O_L was lost several times in Typhlopidae and Leptotyphlopidae, but persisted as a plesiomorphy in the Alethinophidia. Duplication of the control region and translocation of the tRNA-Leu gene are two visible features in Alethinophidian mitochondrial genomes. Independently and stochastically, the duplication of pseudo-Pro (P*) emerged in seven different lineages of unequal size in three families, indicating that the presence of P* was a polytopic event in the mitogenome. The WANCY tRNA gene cluster and the control regions and their adjacent segments were hotspots for mitogenome rearrangement. Maintenance of duplicate control regions may be the source for snake mitogenome structural diversity.

  9. Complete mitochondrial genome phylogeographic analysis of killer whales (Orcinus orca) indicates multiple species

    DEFF Research Database (Denmark)

    Morin, Phillip A; Archer, Frederick I.; Foote, Andrew David

    2010-01-01

    Killer whales (Orcinus orca) currently comprise a single, cosmopolitan species with a diverse diet. However, studies over the last 30 yr have revealed populations of sympatric "ecotypes" with discrete prey preferences, morphology, and behaviors. Although these ecotypes avoid social interactions and are not known to interbreed, genetic studies to date have found extremely low levels of diversity in the mitochondrial control region, and few clear phylogeographic patterns worldwide. This low level of diversity is likely due to low mitochondrial mutation rates that are common to cetaceans. Using killer whales as a case study, we have developed a method to readily sequence, assemble, and analyze complete mitochondrial genomes from large numbers of samples to more accurately assess phylogeography and estimate divergence times. This represents an important tool for wildlife management, not only for killer whales...

  10. Ancient genomes document multiple waves of migration in Southeast Asian prehistory.

    Science.gov (United States)

    Lipson, Mark; Cheronet, Olivia; Mallick, Swapan; Rohland, Nadin; Oxenham, Marc; Pietrusewsky, Michael; Pryce, Thomas Oliver; Willis, Anna; Matsumura, Hirofumi; Buckley, Hallie; Domett, Kate; Hai, Nguyen Giang; Hiep, Trinh Hoang; Kyaw, Aung Aung; Win, Tin Tin; Pradier, Baptiste; Broomandkhoshbacht, Nasreen; Candilio, Francesca; Changmai, Piya; Fernandes, Daniel; Ferry, Matthew; Gamarra, Beatriz; Harney, Eadaoin; Kampuansai, Jatupol; Kutanan, Wibhu; Michel, Megan; Novak, Mario; Oppenheimer, Jonas; Sirak, Kendra; Stewardson, Kristin; Zhang, Zhao; Flegontov, Pavel; Pinhasi, Ron; Reich, David

    2018-05-17

    Southeast Asia is home to rich human genetic and linguistic diversity, but the details of past population movements in the region are not well known. Here, we report genome-wide ancient DNA data from eighteen Southeast Asian individuals spanning from the Neolithic period through the Iron Age (4100-1700 years ago). Early farmers from Man Bac in Vietnam exhibit a mixture of East Asian (southern Chinese agriculturalist) and deeply diverged eastern Eurasian (hunter-gatherer) ancestry characteristic of Austroasiatic speakers, with similar ancestry as far south as Indonesia providing evidence for an expansive initial spread of Austroasiatic languages. By the Bronze Age, in a parallel pattern to Europe, sites in Vietnam and Myanmar show close connections to present-day majority groups, reflecting substantial additional influxes of migrants. Copyright © 2018, American Association for the Advancement of Science.

  11. Germline large genomic alterations on 7q in patients with multiple primary cancers

    DEFF Research Database (Denmark)

    Villacis, Rolando A R; Basso, Tatiane R; Canto, Luisa M

    2017-01-01

    Patients with multiple primary cancers (MPCs) are suspected to have a hereditary cancer syndrome. However, only a small proportion may be explained by mutations in high-penetrance genes. We investigate two unrelated MPC patients that met Hereditary Breast and Ovarian Cancer criteria, both presenti...

  12. A multiple genome analysis of Mycobacterium tuberculosis reveals specific novel genes and mutations associated with pyrazinamide resistance

    KAUST Repository

    Sheen, Patricia

    2017-10-11

    Tuberculosis (TB) is a major global health problem and drug resistance compromises the efforts to control this disease. Pyrazinamide (PZA) is an important drug used in both first and second line treatment regimes. However, its complete mechanisms of action and resistance remain unclear. We genotyped and sequenced the complete genomes of 68 M. tuberculosis strains isolated from unrelated TB patients in Peru. No clustering pattern of the strains was verified based on spoligotyping. We analyzed the association between PZA resistance and non-synonymous mutations and specific genes. We found mutations in pncA and novel genes significantly associated with PZA resistance in strains without pncA mutations. These included genes related to transportation of metal ions, pH regulation and immune system evasion. These results suggest potential alternate mechanisms of PZA resistance that have not been found in other populations, supporting the view that the antibacterial activity of PZA may hit multiple targets.

  13. Quantitative Seq-LGS: Genome-Wide Identification of Genetic Drivers of Multiple Phenotypes in Malaria Parasites

    KAUST Repository

    Abkallo, Hussein M.

    2016-10-01

    Identifying the genetic determinants of phenotypes that impact on disease severity is of fundamental importance for the design of new interventions against malaria. Traditionally, such discovery has relied on labor-intensive approaches that require significant investments of time and resources. By combining Linkage Group Selection (LGS), quantitative whole genome population sequencing and a novel mathematical modeling approach (qSeq-LGS), we simultaneously identified multiple genes underlying two distinct phenotypes, identifying novel alleles for growth rate and strain specific immunity (SSI), while removing the need for traditionally required steps such as cloning, individual progeny phenotyping and marker generation. The detection of novel variants, verified by experimental phenotyping methods, demonstrates the remarkable potential of this approach for the identification of genes controlling selectable phenotypes in malaria and other apicomplexan parasites for which experimental genetic crosses are amenable.

  14. A multiple genome analysis of Mycobacterium tuberculosis reveals specific novel genes and mutations associated with pyrazinamide resistance

    KAUST Repository

    Sheen, Patricia; Requena, David; Gushiken, Eduardo; Gilman, Robert H.; Antiparra, Ricardo; Lucero, Bryan; Lizárraga, Pilar; Cieza, Basilio; Roncal, Elisa; Grandjean, Louis; Pain, Arnab; McNerney, Ruth; Clark, Taane G.; Moore, David; Zimic, Mirko

    2017-01-01

    Tuberculosis (TB) is a major global health problem and drug resistance compromises the efforts to control this disease. Pyrazinamide (PZA) is an important drug used in both first and second line treatment regimes. However, its complete mechanisms of action and resistance remain unclear. We genotyped and sequenced the complete genomes of 68 M. tuberculosis strains isolated from unrelated TB patients in Peru. No clustering pattern of the strains was verified based on spoligotyping. We analyzed the association between PZA resistance and non-synonymous mutations and specific genes. We found mutations in pncA and novel genes significantly associated with PZA resistance in strains without pncA mutations. These included genes related to transportation of metal ions, pH regulation and immune system evasion. These results suggest potential alternate mechanisms of PZA resistance that have not been found in other populations, supporting the view that the antibacterial activity of PZA may hit multiple targets.

  15. Genomic and phenotypic characterization of myxoma virus from Great Britain reveals multiple evolutionary pathways distinct from those in Australia

    Science.gov (United States)

    Kerr, Peter J.; Cattadori, Isabella M.; Fitch, Adam; Geber, Adam; Liu, June; Sim, Derek G.; Boag, Brian; Ghedin, Elodie

    2017-01-01

    The co-evolution of myxoma virus (MYXV) and the European rabbit occurred independently in Australia and Europe from different progenitor viruses. Although this is the canonical study of the evolution of virulence, whether the genomic and phenotypic outcomes of MYXV evolution in Europe mirror those observed in Australia is unknown. We addressed this question using viruses isolated in the United Kingdom early in the MYXV epizootic (1954–1955) and between 2008–2013. The later UK viruses fell into three distinct lineages indicative of a long period of separation and independent evolution. Although rates of evolutionary change were almost identical to those previously described for MYXV in Australia and strongly clock-like, genome evolution in the UK and Australia showed little convergence. The phenotypes of eight UK viruses from three lineages were characterized in laboratory rabbits and compared to the progenitor (release) Lausanne strain. Inferred virulence ranged from highly virulent (grade 1) to highly attenuated (grade 5). Two broad disease types were seen: cutaneous nodular myxomatosis characterized by multiple raised secondary cutaneous lesions, or an amyxomatous phenotype with few or no secondary lesions. A novel clinical outcome was acute death with pulmonary oedema and haemorrhage, often associated with bacteria in many tissues but an absence of inflammatory cells. Notably, reading frame disruptions in genes defined as essential for virulence in the progenitor Lausanne strain were compatible with the acquisition of high virulence. Combined, these data support a model of ongoing host-pathogen co-evolution in which multiple genetic pathways can produce successful outcomes in the field that involve both different virulence grades and disease phenotypes, with alterations in tissue tropism and disease mechanisms. PMID:28253375

  16. Genomic and phenotypic characterization of myxoma virus from Great Britain reveals multiple evolutionary pathways distinct from those in Australia.

    Directory of Open Access Journals (Sweden)

    Peter J Kerr

    2017-03-01

    The co-evolution of myxoma virus (MYXV) and the European rabbit occurred independently in Australia and Europe from different progenitor viruses. Although this is the canonical study of the evolution of virulence, whether the genomic and phenotypic outcomes of MYXV evolution in Europe mirror those observed in Australia is unknown. We addressed this question using viruses isolated in the United Kingdom early in the MYXV epizootic (1954-1955) and between 2008-2013. The later UK viruses fell into three distinct lineages indicative of a long period of separation and independent evolution. Although rates of evolutionary change were almost identical to those previously described for MYXV in Australia and strongly clock-like, genome evolution in the UK and Australia showed little convergence. The phenotypes of eight UK viruses from three lineages were characterized in laboratory rabbits and compared to the progenitor (release) Lausanne strain. Inferred virulence ranged from highly virulent (grade 1) to highly attenuated (grade 5). Two broad disease types were seen: cutaneous nodular myxomatosis characterized by multiple raised secondary cutaneous lesions, or an amyxomatous phenotype with few or no secondary lesions. A novel clinical outcome was acute death with pulmonary oedema and haemorrhage, often associated with bacteria in many tissues but an absence of inflammatory cells. Notably, reading frame disruptions in genes defined as essential for virulence in the progenitor Lausanne strain were compatible with the acquisition of high virulence. Combined, these data support a model of ongoing host-pathogen co-evolution in which multiple genetic pathways can produce successful outcomes in the field that involve both different virulence grades and disease phenotypes, with alterations in tissue tropism and disease mechanisms.

  17. Genomic and phenotypic characterization of myxoma virus from Great Britain reveals multiple evolutionary pathways distinct from those in Australia.

    Science.gov (United States)

    Kerr, Peter J; Cattadori, Isabella M; Rogers, Matthew B; Fitch, Adam; Geber, Adam; Liu, June; Sim, Derek G; Boag, Brian; Eden, John-Sebastian; Ghedin, Elodie; Read, Andrew F; Holmes, Edward C

    2017-03-01

    The co-evolution of myxoma virus (MYXV) and the European rabbit occurred independently in Australia and Europe from different progenitor viruses. Although this is the canonical study of the evolution of virulence, whether the genomic and phenotypic outcomes of MYXV evolution in Europe mirror those observed in Australia is unknown. We addressed this question using viruses isolated in the United Kingdom early in the MYXV epizootic (1954-1955) and between 2008-2013. The later UK viruses fell into three distinct lineages indicative of a long period of separation and independent evolution. Although rates of evolutionary change were almost identical to those previously described for MYXV in Australia and strongly clock-like, genome evolution in the UK and Australia showed little convergence. The phenotypes of eight UK viruses from three lineages were characterized in laboratory rabbits and compared to the progenitor (release) Lausanne strain. Inferred virulence ranged from highly virulent (grade 1) to highly attenuated (grade 5). Two broad disease types were seen: cutaneous nodular myxomatosis characterized by multiple raised secondary cutaneous lesions, or an amyxomatous phenotype with few or no secondary lesions. A novel clinical outcome was acute death with pulmonary oedema and haemorrhage, often associated with bacteria in many tissues but an absence of inflammatory cells. Notably, reading frame disruptions in genes defined as essential for virulence in the progenitor Lausanne strain were compatible with the acquisition of high virulence. Combined, these data support a model of ongoing host-pathogen co-evolution in which multiple genetic pathways can produce successful outcomes in the field that involve both different virulence grades and disease phenotypes, with alterations in tissue tropism and disease mechanisms.

  18. Meta-analysis of genome-wide association studies discovers multiple loci for chronic lymphocytic leukemia

    Science.gov (United States)

    Berndt, Sonja I.; Camp, Nicola J.; Skibola, Christine F.; Vijai, Joseph; Wang, Zhaoming; Gu, Jian; Nieters, Alexandra; Kelly, Rachel S.; Smedby, Karin E.; Monnereau, Alain; Cozen, Wendy; Cox, Angela; Wang, Sophia S.; Lan, Qing; Teras, Lauren R.; Machado, Moara; Yeager, Meredith; Brooks-Wilson, Angela R.; Hartge, Patricia; Purdue, Mark P.; Birmann, Brenda M.; Vajdic, Claire M.; Cocco, Pierluigi; Zhang, Yawei; Giles, Graham G.; Zeleniuch-Jacquotte, Anne; Lawrence, Charles; Montalvan, Rebecca; Burdett, Laurie; Hutchinson, Amy; Ye, Yuanqing; Call, Timothy G.; Shanafelt, Tait D.; Novak, Anne J.; Kay, Neil E.; Liebow, Mark; Cunningham, Julie M.; Allmer, Cristine; Hjalgrim, Henrik; Adami, Hans-Olov; Melbye, Mads; Glimelius, Bengt; Chang, Ellen T.; Glenn, Martha; Curtin, Karen; Cannon-Albright, Lisa A.; Diver, W Ryan; Link, Brian K.; Weiner, George J.; Conde, Lucia; Bracci, Paige M.; Riby, Jacques; Arnett, Donna K.; Zhi, Degui; Leach, Justin M.; Holly, Elizabeth A.; Jackson, Rebecca D.; Tinker, Lesley F.; Benavente, Yolanda; Sala, Núria; Casabonne, Delphine; Becker, Nikolaus; Boffetta, Paolo; Brennan, Paul; Foretova, Lenka; Maynadie, Marc; McKay, James; Staines, Anthony; Chaffee, Kari G.; Achenbach, Sara J.; Vachon, Celine M.; Goldin, Lynn R.; Strom, Sara S.; Leis, Jose F.; Weinberg, J. Brice; Caporaso, Neil E.; Norman, Aaron D.; De Roos, Anneclaire J.; Morton, Lindsay M.; Severson, Richard K.; Riboli, Elio; Vineis, Paolo; Kaaks, Rudolph; Masala, Giovanna; Weiderpass, Elisabete; Chirlaque, María-Dolores; Vermeulen, Roel C. H.; Travis, Ruth C.; Southey, Melissa C.; Milne, Roger L.; Albanes, Demetrius; Virtamo, Jarmo; Weinstein, Stephanie; Clavel, Jacqueline; Zheng, Tongzhang; Holford, Theodore R.; Villano, Danylo J.; Maria, Ann; Spinelli, John J.; Gascoyne, Randy D.; Connors, Joseph M.; Bertrand, Kimberly A.; Giovannucci, Edward; Kraft, Peter; Kricker, Anne; Turner, Jenny; Ennas, Maria Grazia; Ferri, Giovanni M.; Miligi, Lucia; Liang, Liming; Ma, Baoshan; Huang, Jinyan; Crouch, Simon; Park, Ju-Hyun; Chatterjee, Nilanjan; North, Kari E.; Snowden, John A.; Wright, Josh; Fraumeni, Joseph F.; Offit, Kenneth; Wu, Xifeng; de Sanjose, Silvia; Cerhan, James R.; Chanock, Stephen J.; Rothman, Nathaniel; Slager, Susan L.

    2016-01-01

    Chronic lymphocytic leukemia (CLL) is a common lymphoid malignancy with strong heritability. To further understand the genetic susceptibility for CLL and identify common loci associated with risk, we conducted a meta-analysis of four genome-wide association studies (GWAS) composed of 3,100 cases and 7,667 controls with follow-up replication in 1,958 cases and 5,530 controls. Here we report three new loci at 3p24.1 (rs9880772, EOMES, P=2.55 × 10⁻¹¹), 6p25.2 (rs73718779, SERPINB6, P=1.97 × 10⁻⁸) and 3q28 (rs9815073, LPP, P=3.62 × 10⁻⁸), as well as a new independent SNP at the known 2q13 locus (rs9308731, BCL2L11, P=1.00 × 10⁻¹¹) in the combined analysis. We find suggestive evidence (P<5 × 10⁻⁷) for two additional new loci at 4q24 (rs10028805, BANK1, P=7.19 × 10⁻⁸) and 3p22.2 (rs1274963, CSRNP1, P=2.12 × 10⁻⁷). Pathway analyses of new and known CLL loci consistently show a strong role for apoptosis, providing further evidence for the importance of this biological pathway in CLL susceptibility. PMID:26956414

  19. Genomic epidemiology reveals multiple introductions of Zika virus into the United States

    Science.gov (United States)

    Grubaugh, Nathan D.; Ladner, Jason T.; Kraemer, Moritz U. G.; Dudas, Gytis; Tan, Amanda L.; Gangavarapu, Karthik; Wiley, Michael R.; White, Stephen; Thézé, Julien; Magnani, Diogo M.; Prieto, Karla; Reyes, Daniel; Bingham, Andrea M.; Paul, Lauren M.; Robles-Sikisaka, Refugio; Oliveira, Glenn; Pronty, Darryl; Barcellona, Carolyn M.; Metsky, Hayden C.; Baniecki, Mary Lynn; Barnes, Kayla G.; Chak, Bridget; Freije, Catherine A.; Gladden-Young, Adrianne; Gnirke, Andreas; Luo, Cynthia; Macinnis, Bronwyn; Matranga, Christian B.; Park, Daniel J.; Qu, James; Schaffner, Stephen F.; Tomkins-Tinch, Christopher; West, Kendra L.; Winnicki, Sarah M.; Wohl, Shirlee; Yozwiak, Nathan L.; Quick, Joshua; Fauver, Joseph R.; Khan, Kamran; Brent, Shannon E.; Reiner, Robert C.; Lichtenberger, Paola N.; Ricciardi, Michael J.; Bailey, Varian K.; Watkins, David I.; Cone, Marshall R.; Kopp, Edgar W.; Hogan, Kelly N.; Cannons, Andrew C.; Jean, Reynald; Monaghan, Andrew J.; Garry, Robert F.; Loman, Nicholas J.; Faria, Nuno R.; Porcelli, Mario C.; Vasquez, Chalmers; Nagle, Elyse R.; Cummings, Derek A. T.; Stanek, Danielle; Rambaut, Andrew; Sanchez-Lockhart, Mariano; Sabeti, Pardis C.; Gillis, Leah D.; Michael, Scott F.; Bedford, Trevor; Pybus, Oliver G.; Isern, Sharon; Palacios, Gustavo; Andersen, Kristian G.

    2017-06-01

    Zika virus (ZIKV) is causing an unprecedented epidemic linked to severe congenital abnormalities. In July 2016, mosquito-borne ZIKV transmission was reported in the continental United States; since then, hundreds of locally acquired infections have been reported in Florida. To gain insights into the timing, source, and likely route(s) of ZIKV introduction, we tracked the virus from its first detection in Florida by sequencing ZIKV genomes from infected patients and Aedes aegypti mosquitoes. We show that at least 4 introductions, but potentially as many as 40, contributed to the outbreak in Florida and that local transmission is likely to have started in the spring of 2016—several months before its initial detection. By analysing surveillance and genetic data, we show that ZIKV moved among transmission zones in Miami. Our analyses show that most introductions were linked to the Caribbean, a finding corroborated by the high incidence rates and traffic volumes from the region into the Miami area. Our study provides an understanding of how ZIKV initiates transmission in new regions.

  20. Evolutionary changes of multiple visual pigment genes in the complete genome of Pacific bluefin tuna.

    Science.gov (United States)

    Nakamura, Yoji; Mori, Kazuki; Saitoh, Kenji; Oshima, Kenshiro; Mekuchi, Miyuki; Sugaya, Takuma; Shigenobu, Yuya; Ojima, Nobuhiko; Muta, Shigeru; Fujiwara, Atushi; Yasuike, Motoshige; Oohara, Ichiro; Hirakawa, Hideki; Chowdhury, Vishwajit Sur; Kobayashi, Takanori; Nakajima, Kazuhiro; Sano, Motohiko; Wada, Tokio; Tashiro, Kosuke; Ikeo, Kazuho; Hattori, Masahira; Kuhara, Satoru; Gojobori, Takashi; Inouye, Kiyoshi

    2013-07-02

    Tunas are migratory fishes in offshore habitats and top predators with unique features. Despite their ecological importance and high market values, the open-ocean lifestyle of tuna, in which effective sensing systems such as color vision are required for capture of prey, has been poorly understood. To elucidate the genetic and evolutionary basis of optic adaptation of tuna, we determined the genome sequence of the Pacific bluefin tuna (Thunnus orientalis), using next-generation sequencing technology. A total of 26,433 protein-coding genes were predicted from 16,802 assembled scaffolds. From these, we identified five common fish visual pigment genes: red-sensitive (middle/long-wavelength sensitive; M/LWS), UV-sensitive (short-wavelength sensitive 1; SWS1), blue-sensitive (SWS2), rhodopsin (RH1), and green-sensitive (RH2) opsin genes. Sequence comparison revealed that tuna's RH1 gene has an amino acid substitution that causes a short-wave shift in the absorption spectrum (i.e., blue shift). Pacific bluefin tuna has at least five RH2 paralogs, the most among studied fishes; four of the proteins encoded may be tuned to blue light at the amino acid level. Moreover, phylogenetic analysis suggested that gene conversions have occurred in each of the SWS2 and RH2 loci in a short period. Thus, Pacific bluefin tuna has undergone evolutionary changes in three genes (RH1, RH2, and SWS2), which may have contributed to detecting blue-green contrast and measuring the distance to prey in the blue-pelagic ocean. These findings provide basic information on behavioral traits of predatory fish and, thereby, could help to improve the technology to culture such fish in captivity for resource management.

  1. A novel variational Bayes multiple locus Z-statistic for genome-wide association studies with Bayesian model averaging

    Science.gov (United States)

    Logsdon, Benjamin A.; Carty, Cara L.; Reiner, Alexander P.; Dai, James Y.; Kooperberg, Charles

    2012-01-01

    Motivation: For many complex traits, including height, the majority of variants identified by genome-wide association studies (GWAS) have small effects, leaving a significant proportion of the heritable variation unexplained. Although many penalized multiple regression methodologies have been proposed to increase the power to detect associations for complex genetic architectures, they generally lack mechanisms for false-positive control and diagnostics for model over-fitting. Our methodology is the first penalized multiple regression approach that explicitly controls Type I error rates and provides model over-fitting diagnostics through a novel normally distributed statistic defined for every marker within the GWAS, based on results from a variational Bayes spike regression algorithm. Results: We compare the performance of our method to the lasso and single marker analysis on simulated data and demonstrate that our approach has superior performance in terms of power and Type I error control. In addition, using the Women's Health Initiative (WHI) SNP Health Association Resource (SHARe) GWAS of African-Americans, we show that our method has power to detect additional novel associations with body height. These findings replicate by reaching a stringent cutoff of marginal association in a larger cohort. Availability: An R-package, including an implementation of our variational Bayes spike regression (vBsr) algorithm, is available at http://kooperberg.fhcrc.org/soft.html. Contact: blogsdon@fhcrc.org Supplementary information: Supplementary data are available at Bioinformatics online. PMID:22563072

  2. Adaptation of tick-borne encephalitis virus from human brain to different cell cultures induces multiple genomic substitutions.

    Science.gov (United States)

    Ponomareva, Eugenia P; Ternovoi, Vladimir A; Mikryukova, Tamara P; Protopopova, Elena V; Gladysheva, Anastasia V; Shvalov, Alexander N; Konovalova, Svetlana N; Chausov, Eugene V; Loktev, Valery B

    2017-10-01

    The C11-13 strain from the Siberian subtype of tick-borne encephalitis virus (TBEV) was isolated from human brain using pig embryo kidney (PEK), 293, and Neuro-2a cells. Analysis of the complete viral genome of the C11-13 variants during six passages in these cells revealed that the cell-adapted C11-13 variants had multiple amino acid substitutions as compared to TBEV from human brain. Seven out of eight amino acids substitutions in the high-replicating C11-13(PEK) variant mapped to non-structural proteins; 13 out of 14 substitutions in the well-replicating C11-13(293) variant, and all four substitutions in the low-replicating C11-13(Neuro-2a) variant were also localized in non-structural proteins, predominantly in the NS2a (2), NS3 (6) and NS5 (3) proteins. The substitutions NS2a 1067 (Asn → Asp), NS2a 1168 (Leu → Val) in the N-terminus of NS2a and NS3 1745 (His → Gln) in the helicase domain of NS3 were found in all selected variants. We postulate that multiple substitutions in the NS2a, NS3 and NS5 genes play a key role in adaptation of TBEV to different cells.

  3. Mitochondrial genome analyses suggest multiple Trichuris species in humans, baboons, and pigs from different geographical regions

    DEFF Research Database (Denmark)

    Hawash, Mohamed B. F.; Andersen, Lee O.; Gasser, Robin B.

    2015-01-01

    Trichuris from françois' leaf monkey, suggesting multiple whipworm species circulating among non-human primates. The genetic and protein distances between pig Trichuris from Denmark and other regions were roughly 9% and 6%, respectively, while Chinese and Ugandan whipworms were more closely related......) suggesting that they represented different species. Trichuris from the olive baboon in US was genetically related to human Trichuris in China, while the other from the hamadryas baboon in Denmark was nearly identical to human Trichuris from Uganda. Baboon-derived Trichuris was genetically distinct from......BACKGROUND: The whipworms Trichuris trichiura and Trichuris suis are two parasitic nematodes of humans and pigs, respectively. Although whipworms in human and non-human primates historically have been referred to as T. trichiura, recent reports suggest that several Trichuris spp. are found...

  4. Identifying and exploiting trait-relevant tissues with multiple functional annotations in genome-wide association studies

    Science.gov (United States)

    Zhang, Shujun

    2018-01-01

    Genome-wide association studies (GWASs) have identified many disease associated loci, the majority of which have unknown biological functions. Understanding the mechanism underlying trait associations requires identifying trait-relevant tissues and investigating associations in a trait-specific fashion. Here, we extend the widely used linear mixed model to incorporate multiple SNP functional annotations from omics studies with GWAS summary statistics to facilitate the identification of trait-relevant tissues, with which to further construct powerful association tests. Specifically, we rely on a generalized estimating equation based algorithm for parameter inference, a mixture modeling framework for trait-tissue relevance classification, and a weighted sequence kernel association test constructed based on the identified trait-relevant tissues for powerful association analysis. We refer to our analytic procedure as the Scalable Multiple Annotation integration for trait-Relevant Tissue identification and usage (SMART). With extensive simulations, we show how our method can make use of multiple complementary annotations to improve the accuracy for identifying trait-relevant tissues. In addition, our procedure allows us to make use of the inferred trait-relevant tissues, for the first time, to construct more powerful SNP set tests. We apply our method for an in-depth analysis of 43 traits from 28 GWASs using tissue-specific annotations in 105 tissues derived from ENCODE and Roadmap. Our results reveal new trait-tissue relevance, pinpoint important annotations that are informative of trait-tissue relationship, and illustrate how we can use the inferred trait-relevant tissues to construct more powerful association tests in the Wellcome trust case control consortium study. PMID:29377896

  5. Identifying and exploiting trait-relevant tissues with multiple functional annotations in genome-wide association studies.

    Directory of Open Access Journals (Sweden)

    Xingjie Hao

    2018-01-01

    Genome-wide association studies (GWASs) have identified many disease associated loci, the majority of which have unknown biological functions. Understanding the mechanism underlying trait associations requires identifying trait-relevant tissues and investigating associations in a trait-specific fashion. Here, we extend the widely used linear mixed model to incorporate multiple SNP functional annotations from omics studies with GWAS summary statistics to facilitate the identification of trait-relevant tissues, with which to further construct powerful association tests. Specifically, we rely on a generalized estimating equation based algorithm for parameter inference, a mixture modeling framework for trait-tissue relevance classification, and a weighted sequence kernel association test constructed based on the identified trait-relevant tissues for powerful association analysis. We refer to our analytic procedure as the Scalable Multiple Annotation integration for trait-Relevant Tissue identification and usage (SMART). With extensive simulations, we show how our method can make use of multiple complementary annotations to improve the accuracy for identifying trait-relevant tissues. In addition, our procedure allows us to make use of the inferred trait-relevant tissues, for the first time, to construct more powerful SNP set tests. We apply our method for an in-depth analysis of 43 traits from 28 GWASs using tissue-specific annotations in 105 tissues derived from ENCODE and Roadmap. Our results reveal new trait-tissue relevance, pinpoint important annotations that are informative of trait-tissue relationship, and illustrate how we can use the inferred trait-relevant tissues to construct more powerful association tests in the Wellcome trust case control consortium study.

  6. Combined array-comparative genomic hybridization and single-nucleotide polymorphism-loss of heterozygosity analysis reveals complex changes and multiple forms of chromosomal instability in colorectal cancers

    DEFF Research Database (Denmark)

    Gaasenbeek, Michelle; Howarth, Kimberley; Rowan, Andrew J

    2006-01-01

    Cancers with chromosomal instability (CIN) are held to be aneuploid/polyploid with multiple large-scale gains/deletions, but the processes underlying CIN are unclear and different types of CIN might exist. We investigated colorectal cancer cell lines using array-comparative genomic hybridization...

  7. Bayesian analyses of Yemeni mitochondrial genomes suggest multiple migration events with Africa and Western Eurasia.

    Science.gov (United States)

    Vyas, Deven N; Kitchen, Andrew; Miró-Herrans, Aida T; Pearson, Laurel N; Al-Meeri, Ali; Mulligan, Connie J

    2016-03-01

    Anatomically, modern humans are thought to have migrated out of Africa ∼60,000 years ago in the first successful global dispersal. This initial migration may have passed through Yemen, a region that has experienced multiple migration events with Africa and Eurasia throughout human history. We use Bayesian phylogenetics to determine how ancient and recent migrations have shaped Yemeni mitogenomic variation. We sequenced 113 mitogenomes from multiple Yemeni regions with a focus on haplogroups M, N, and L3(xM,N) as these groups have the oldest evolutionary history outside of Africa. We performed Bayesian evolutionary analyses to generate time-measured phylogenies calibrated by Neanderthal and Denisovan mitogenomes in order to determine the age of Yemeni-specific clades. As defined by Yemeni monophyly, Yemeni in situ evolution is limited to the Holocene or latest Pleistocene (ages of clades in subhaplogroups L3b1a1a, L3h2, L3x1, M1a1f, M1a5, N1a1a3, and N1a3 range from 2 to 14 kya) and is often situated within broader Horn of Africa/southern Arabia in situ evolution (L3h2, L3x1, M1a1f, M1a5, and N1a1a3 ages range from 7 to 29 kya). Five subhaplogroups show no monophyly and are candidates for Holocene migration into Yemen (L0a2a2a, L3d1a1a, L3i2, M1a1b, and N1b1a). Yemeni mitogenomes are largely the product of Holocene migration, and subsequent in situ evolution, from Africa and western Eurasia. However, we hypothesize that recent population movements may obscure the genetic signature of more ancient migrations. Additional research, e.g., analyses of Yemeni nuclear genetic data, is needed to better reconstruct the complex population and migration histories associated with Out of Africa. © 2015 Wiley Periodicals, Inc.

  8. Extensive Genome Rearrangements and Multiple Horizontal Gene Transfers in a Population of Pyrococcus Isolates from Vulcano Island, Italy

    Science.gov (United States)

    White, James R.; Escobar-Paramo, Patricia; Mongodin, Emmanuel F.; Nelson, Karen E.; DiRuggiero, Jocelyne

    2008-01-01

    The extent of chromosome rearrangements in Pyrococcus isolates from marine hydrothermal vents in Vulcano Island, Italy, was evaluated by high-throughput genomic methods. The results illustrate the dynamic nature of the genomes of the genus Pyrococcus and raise the possibility of a connection between rapidly changing environmental conditions and adaptive genomic properties. PMID:18723649

  9. Extensive genome rearrangements and multiple horizontal gene transfers in a population of pyrococcus isolates from Vulcano Island, Italy.

    Science.gov (United States)

    White, James R; Escobar-Paramo, Patricia; Mongodin, Emmanuel F; Nelson, Karen E; DiRuggiero, Jocelyne

    2008-10-01

    The extent of chromosome rearrangements in Pyrococcus isolates from marine hydrothermal vents in Vulcano Island, Italy, was evaluated by high-throughput genomic methods. The results illustrate the dynamic nature of the genomes of the genus Pyrococcus and raise the possibility of a connection between rapidly changing environmental conditions and adaptive genomic properties.

  10. Historical Datasets Support Genomic Selection Models for the Prediction of Cotton Fiber Quality Phenotypes Across Multiple Environments.

    Science.gov (United States)

    Gapare, Washington; Liu, Shiming; Conaty, Warren; Zhu, Qian-Hao; Gillespie, Vanessa; Llewellyn, Danny; Stiller, Warwick; Wilson, Iain

    2018-03-20

    Genomic selection (GS) has successfully been used in plant breeding to improve selection efficiency and reduce breeding time and cost. However, there has not been a study to evaluate GS prediction models that may be used for predicting cotton breeding lines across multiple environments. In this study, we evaluated the performance of Bayes Ridge Regression, BayesA, BayesB, BayesC and Reproducing Kernel Hilbert Spaces regression models. We then extended the single-site GS model to accommodate genotype × environment interaction (G×E) in order to assess the merits of multi- over single-environment models in a practical breeding and selection context in cotton, a crop for which this has not previously been evaluated. Our study was based on a population of 215 upland cotton (Gossypium hirsutum) breeding lines which were evaluated for fiber length and strength at multiple locations in Australia and genotyped with 13,330 single nucleotide polymorphic (SNP) markers. BayesB, which assumes unique variance for each marker and a proportion of markers to have large effects, while most other markers have zero effect, was the preferred model. GS accuracy for fiber length based on a single-site model varied across sites, ranging from 0.27 to 0.77 (mean = 0.38), while that of fiber strength ranged from 0.19 to 0.58 (mean = 0.35) using randomly selected sub-populations as the training population. Prediction accuracies from the M×E model were higher than those for single-site and across-site models, with an average accuracy of 0.71 and 0.59 for fiber length and strength, respectively. The use of the M×E model could therefore identify which breeding lines have effects that are stable across environments and which ones are responsible for G×E and so reduce the amount of phenotypic screening required in cotton breeding programs to identify adaptable genotypes. Copyright © 2018, G3: Genes, Genomes, Genetics.

  11. A guild of 45 CRISPR-associated (Cas) protein families and multiple CRISPR/Cas subtypes exist in prokaryotic genomes.

    Directory of Open Access Journals (Sweden)

    Daniel H Haft

    2005-11-01

    Clustered regularly interspaced short palindromic repeats (CRISPRs) are a family of DNA direct repeats found in many prokaryotic genomes. Repeats of 21-37 bp typically show weak dyad symmetry and are separated by regularly sized, nonrepetitive spacer sequences. Four CRISPR-associated (Cas) protein families, designated Cas1 to Cas4, are strictly associated with CRISPR elements and always occur near a repeat cluster. Some spacers originate from mobile genetic elements and are thought to confer "immunity" against the elements that harbor these sequences. In the present study, we have systematically investigated uncharacterized proteins encoded in the vicinity of these CRISPRs and found many additional protein families that are strictly associated with CRISPR loci across multiple prokaryotic species. Multiple sequence alignments and hidden Markov models have been built for 45 Cas protein families. These models identify family members with high sensitivity and selectivity and classify key regulators of development, DevR and DevS, in Myxococcus xanthus as Cas proteins. These identifications show that CRISPR/cas gene regions can be quite large, with up to 20 different, tandem-arranged cas genes next to a repeat cluster or filling the region between two repeat clusters. Distinctive subsets of the collection of Cas proteins recur in phylogenetically distant species and correlate with characteristic repeat periodicity. The analyses presented here support initial proposals of mobility of these units, along with the likelihood that loci of different subtypes interact with one another as well as with host cell defensive, replicative, and regulatory systems. It is evident from this analysis that CRISPR/cas loci are larger, more complex, and more heterogeneous than previously appreciated.

  12. A genome-wide study of DNA methylation patterns and gene expression levels in multiple human and chimpanzee tissues.

    Directory of Open Access Journals (Sweden)

    Athma A Pai

    2011-02-01

    The modification of DNA by methylation is an important epigenetic mechanism that affects the spatial and temporal regulation of gene expression. Methylation patterns have been described in many contexts within and across a range of species. However, the extent to which changes in methylation might underlie inter-species differences in gene regulation, in particular between humans and other primates, has not yet been studied. To this end, we studied DNA methylation patterns in livers, hearts, and kidneys from multiple humans and chimpanzees, using tissue samples for which genome-wide gene expression data were also available. Using the multi-species gene expression and methylation data for 7,723 genes, we were able to study the role of promoter DNA methylation in the evolution of gene regulation across tissues and species. We found that inter-tissue methylation patterns are often conserved between humans and chimpanzees. However, we also found a large number of gene expression differences between species that might be explained, at least in part, by corresponding differences in methylation levels. In particular, we estimate that, in the tissues we studied, inter-species differences in promoter methylation might underlie as much as 12%-18% of differences in gene expression levels between humans and chimpanzees.

  13. Whole-genome sequencing of monozygotic twins discordant for schizophrenia indicates multiple genetic risk factors for schizophrenia

    Institute of Scientific and Technical Information of China (English)

    Jinsong Tang; Fan He; Fengyu Zhang; Yin Yao Shugart; Chunyu Liu; Yanqing Tang; Raymond C.K. Chan; Chuan-Yue Wang; Yong-Gang Yao; Xiaogang Chen; Yu Fan; Hong Li; Qun Xiang; Deng-Feng Zhang; Zongchang Li; Ying He; Yanhui Liao; Ya Wang

    2017-01-01

    Schizophrenia is a common disorder with a high heritability, but its genetic architecture is still elusive. We implemented whole-genome sequencing (WGS) analysis of 8 families with monozygotic (MZ) twin pairs discordant for schizophrenia to assess potential association of de novo mutations (DNMs) or inherited variants with susceptibility to schizophrenia. Eight non-synonymous DNMs (including one splicing site) were identified and shared by twins, which were either located in previously reported schizophrenia risk genes (p.V24689I mutation in TTN, p.S2506T mutation in GCN1L1, IVS3+1G > T in DOCK1) or had a benign to damaging effect according to in silico prediction analysis. By searching the inherited rare damaging or loss-of-function (LOF) variants and common susceptible alleles from three classes of schizophrenia candidate genes, we were able to distill genetic alterations in several schizophrenia risk genes, including GAD1, PLXNA2, RELN and FEZ1. Four inherited copy number variations (CNVs; including a large deletion at 16p13.11) implicated for schizophrenia were identified in four families, respectively. Most families carried both missense DNMs and inherited risk variants, which might suggest that DNMs, inherited rare damaging variants and common risk alleles together contributed to schizophrenia susceptibility. Our results support that schizophrenia is caused by a combination of multiple genetic factors, with each DNM/variant showing a relatively small effect size.

  14. The chloroplast genome sequence of the green alga Leptosira terrestris: multiple losses of the inverted repeat and extensive genome rearrangements within the Trebouxiophyceae

    Directory of Open Access Journals (Sweden)

    Turmel Monique

    2007-07-01

    Background: In the Chlorophyta – the green algal phylum comprising the classes Prasinophyceae, Ulvophyceae, Trebouxiophyceae and Chlorophyceae – the chloroplast genome displays a highly variable architecture. While chlorophycean chloroplast DNAs (cpDNAs) deviate considerably from the ancestral pattern described for the prasinophyte Nephroselmis olivacea, the degree of remodelling sustained by the two ulvophyte cpDNAs completely sequenced to date is intermediate relative to those observed for chlorophycean and trebouxiophyte cpDNAs. Chlorella vulgaris (Chlorellales) is currently the only photosynthetic trebouxiophyte whose complete cpDNA sequence has been reported. To gain insights into the evolutionary trends of the chloroplast genome in the Trebouxiophyceae, we sequenced cpDNA from the filamentous alga Leptosira terrestris (Ctenocladales). Results: The 195,081-bp Leptosira chloroplast genome resembles the 150,613-bp Chlorella genome in lacking a large inverted repeat (IR) but differs greatly in gene order. Six of the conserved genes present in Chlorella cpDNA are missing from the Leptosira gene repertoire. The 106 conserved genes, four introns and 11 free standing open reading frames (ORFs) account for 48.3% of the genome sequence. This is the lowest gene density yet observed among chlorophyte cpDNAs. Contrary to the situation in Chlorella but similar to that in the chlorophycean Scenedesmus obliquus, the gene distribution is highly biased over the two DNA strands in Leptosira. Nine genes, compared to only three in Chlorella, have significantly expanded coding regions relative to their homologues in ancestral-type green algal cpDNAs. As observed in chlorophycean genomes, the rpoB gene is fragmented into two ORFs. Short repeats account for 5.1% of the Leptosira genome sequence and are present mainly in intergenic regions. Conclusion: Our results highlight the great plasticity of the chloroplast genome in the Trebouxiophyceae and indicate

  15. A Tool for Multiple Targeted Genome Deletions that Is Precise, Scar-Free, and Suitable for Automation.

    Directory of Open Access Journals (Sweden)

    Wayne Aubrey

    Many advances in synthetic biology require the removal of a large number of genomic elements from a genome. Most existing deletion methods leave behind markers, and as there are a limited number of markers, such methods can only be applied a fixed number of times. Deletion methods that recycle markers generally are either imprecise (remove untargeted sequences), or leave scar sequences which can cause genome instability and rearrangements. No existing marker recycling method is automation-friendly. We have developed a novel openly available deletion tool that consists of: 1) a method for deleting genomic elements that can be repeatedly used without limit, is precise, scar-free, and suitable for automation; and 2) software to design the method's primers. Our tool is sequence agnostic and could be used to delete large numbers of coding sequences, promoter regions, transcription factor binding sites, terminators, etc. in a single genome. We have validated our tool on the deletion of non-essential open reading frames (ORFs) from S. cerevisiae. The tool is applicable to arbitrary genomes, and we provide primer sequences for the deletion of: 90% of the ORFs from the S. cerevisiae genome, 88% of the ORFs from S. pombe genome, and 85% of the ORFs from the L. lactis genome.

  16. Whole-genome sequencing of monozygotic twins discordant for schizophrenia indicates multiple genetic risk factors for schizophrenia.

    Science.gov (United States)

    Tang, Jinsong; Fan, Yu; Li, Hong; Xiang, Qun; Zhang, Deng-Feng; Li, Zongchang; He, Ying; Liao, Yanhui; Wang, Ya; He, Fan; Zhang, Fengyu; Shugart, Yin Yao; Liu, Chunyu; Tang, Yanqing; Chan, Raymond C K; Wang, Chuan-Yue; Yao, Yong-Gang; Chen, Xiaogang

    2017-06-20

    Schizophrenia is a common disorder with a high heritability, but its genetic architecture is still elusive. We implemented whole-genome sequencing (WGS) analysis of 8 families with monozygotic (MZ) twin pairs discordant for schizophrenia to assess potential association of de novo mutations (DNMs) or inherited variants with susceptibility to schizophrenia. Eight non-synonymous DNMs (including one splicing site) were identified and shared by twins, which were either located in previously reported schizophrenia risk genes (p.V24689I mutation in TTN, p.S2506T mutation in GCN1L1, IVS3+1G > T in DOCK1) or had a benign to damaging effect according to in silico prediction analysis. By searching the inherited rare damaging or loss-of-function (LOF) variants and common susceptible alleles from three classes of schizophrenia candidate genes, we were able to distill genetic alterations in several schizophrenia risk genes, including GAD1, PLXNA2, RELN and FEZ1. Four inherited copy number variations (CNVs; including a large deletion at 16p13.11) implicated for schizophrenia were identified in four families, respectively. Most of families carried both missense DNMs and inherited risk variants, which might suggest that DNMs, inherited rare damaging variants and common risk alleles together conferred to schizophrenia susceptibility. Our results support that schizophrenia is caused by a combination of multiple genetic factors, with each DNM/variant showing a relatively small effect size. Copyright © 2017 Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, and Genetics Society of China. All rights reserved.

  17. Plutella xylostella granulovirus late gene promoter activity in the context of the Autographa californica multiple nucleopolyhedrovirus genome.

    Science.gov (United States)

    Ren, He-Lin; Hu, Yuan; Guo, Ya-Jun; Li, Lu-Lin

    2016-06-01

    Within Baculoviridae, little is known about the molecular mechanisms of replication in betabaculoviruses, despite extensive studies in alphabaculoviruses. In this study, the promoters of nine late genes of the betabaculovirus Plutella xylostella granulovirus (PlxyGV) were cloned into a transient expression vector and the alphabaculovirus Autographa californica multiple nucleopolyhedrovirus (AcMNPV) genome, and compared with homologous late gene promoters of AcMNPV in Sf9 cells. In transient expression assays, all PlxyGV late promoters were activated in cells transfected with the individual reporter plasmids together with an AcMNPV bacmid. In infected cells, reporter gene expression levels with the promoters of PlxyGV e18 and AcMNPV vp39 and gp41 were significantly higher than those of the corresponding AcMNPV or PlxyGV promoters, which had fewer late promoter motifs. Observed expression levels were lower for the PlxyGV p6.9, pk1, gran, p10a, and p10b promoters than for the corresponding AcMNPV promoters, despite equal numbers of late promoter motifs, indicating that species-specific elements contained in some late promoters were favored by the native viral RNA polymerases for optimal transcription. The 8-nt sequence TAAATAAG encompassing the ATAAG motif was conserved in the AcMNPV polh, p10, and pk1 promoters. The 5-nt sequence CAATT located 4 or 5 nt upstream of the T/ATAAG motif was conserved in the promoters of PlxyGV gran, p10c, and pk1. The results of this study demonstrated that PlxyGV late gene promoters could be effectively activated by the RNA polymerase from AcMNPV, implying that late gene expression systems are regulated by similar mechanisms in alphabaculoviruses and betabaculoviruses.
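    The motif observations in this abstract can be illustrated with a small sequence scan. The following is a minimal sketch, not the authors' method: it assumes that "T/ATAAG" means TTAAG or ATAAG, and that "4 or 5 nt upstream" means a gap of 4 or 5 nt between the end of CAATT and the start of the motif. The function names and the example sequence are hypothetical.

```python
import re

# Assumption: "T/ATAAG" is read as the pattern [TA]TAAG (TTAAG or ATAAG).
# The lookahead lets overlapping occurrences be reported as well.
LATE_MOTIF = re.compile(r"(?=([TA]TAAG))")


def find_late_motifs(seq):
    """Return 0-based start positions of every [TA]TAAG occurrence."""
    return [m.start() for m in LATE_MOTIF.finditer(seq.upper())]


def has_upstream_caatt(seq, motif_start, gaps=(4, 5)):
    """Check for CAATT ending 4 or 5 nt before the motif start.

    Assumption: the gap is counted between the last base of CAATT
    and the first base of the motif.
    """
    seq = seq.upper()
    for gap in gaps:
        start = motif_start - gap - len("CAATT")
        if start >= 0 and seq[start:start + 5] == "CAATT":
            return True
    return False


if __name__ == "__main__":
    # Hypothetical promoter fragment, for illustration only.
    fragment = "GGCAATTACGTATAAGCC"
    for pos in find_late_motifs(fragment):
        print(pos, has_upstream_caatt(fragment, pos))
```

    The 8-nt sequence TAAATAAG from the abstract contains the ATAAG motif at offset 3, which the scan reports as a single hit.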

  18. Genome sequence of an enhancin gene-rich nucleopolyhedrovirus (NPV) from Agrotis segetum: collinearity with Spodoptera exigua multiple NPV

    NARCIS (Netherlands)

    Jakubowska, A.K.; Peters, S.A.; Ziemnicka, J.; Vlak, J.M.; Oers, van M.M.

    2006-01-01

    The genome sequence of a Polish isolate of Agrotis segetum nucleopolyhedrovirus (AgseNPV-A) was determined and analysed. The circular genome is composed of 147 544 bp and has a G+C content of 45·7 mol%. It contains 153 putative, non-overlapping open reading frames (ORFs) encoding predicted proteins

  19. MaxBin 2.0: an automated binning algorithm to recover genomes from multiple metagenomic datasets

    Energy Technology Data Exchange (ETDEWEB)

    Wu, Yu-Wei [Joint BioEnergy Inst. (JBEI), Emeryville, CA (United States); Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Simmons, Blake A. [Joint BioEnergy Inst. (JBEI), Emeryville, CA (United States); Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Singer, Steven W. [Joint BioEnergy Inst. (JBEI), Emeryville, CA (United States); Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)

    2015-10-29

    The recovery of genomes from metagenomic datasets is a critical step to defining the functional roles of the underlying uncultivated populations. We previously developed MaxBin, an automated binning approach for high-throughput recovery of microbial genomes from metagenomes. Here, we present an expanded binning algorithm, MaxBin 2.0, which recovers genomes from co-assembly of a collection of metagenomic datasets. Tests on simulated datasets revealed that MaxBin 2.0 is highly accurate in recovering individual genomes, and the application of MaxBin 2.0 to several metagenomes from environmental samples demonstrated that it could achieve two complementary goals: recovering more bacterial genomes compared to binning a single sample as well as comparing the microbial community composition between different sampling environments. Availability and implementation: MaxBin 2.0 is freely available at http://sourceforge.net/projects/maxbin/ under BSD license. Supplementary information: Supplementary data are available at Bioinformatics online.

  20. The mitochondrial genomes of sponges provide evidence for multiple invasions by Repetitive Hairpin-forming Elements (RHE)

    Directory of Open Access Journals (Sweden)

    Lavrov Dennis V

    2009-12-01

    Background: The mitochondrial (mt) genomes of sponges possess a variety of features, which appear to be intermediate between those of Eumetazoa and non-metazoan opisthokonts. Among these features is the presence of long intergenic regions, which are common in other eukaryotes, but generally absent in Eumetazoa. Here we analyse poriferan mitochondrial intergenic regions, paying particular attention to repetitive sequences within them. In this context we introduce the mitochondrial genome of Ircinia strobilina (Lamarck, 1816; Demospongiae: Dictyoceratida) and compare it with mtDNA of other sponges. Results: Mt genomes of dictyoceratid sponges are identical in gene order and content but display major differences in size and organization of intergenic regions. An even higher degree of diversity in the structure of intergenic regions was found among different orders of demosponges. One interesting observation made from such comparisons was of what appears to be recurrent invasions of sponge mitochondrial genomes by repetitive hairpin-forming elements, which cause large genome size differences even among closely related taxa. These repetitive hairpin-forming elements are structurally and compositionally divergent and display a scattered distribution throughout various groups of demosponges. Conclusion: Large intergenic regions of poriferan mt genomes are targets for insertions of repetitive hairpin-forming elements, similar to the ones found in non-metazoan opisthokonts. Such elements were likely present in some lineages early in animal mitochondrial genome evolution but were subsequently lost during the reduction of intergenic regions, which occurred in the Eumetazoa lineage after the split of Porifera. Porifera acquired their elements in several independent events. Patterns of their intra-genomic dispersal can be seen in the mt genome of Vaceletia sp.

  1. The human homolog of S. cerevisiae CDC27, CDC27 Hs, is encoded by a highly conserved intronless gene present in multiple copies in the human genome

    Energy Technology Data Exchange (ETDEWEB)

    Devor, E.J.; Dill-Devor, R.M. [Univ. of Iowa College of Medicine, Iowa City (United States)

    1994-09-01

    We have obtained a number of unique sequences via PCR amplification of human genomic DNA using degenerate primers under low stringency (42°C). One of these, an 853 bp product, has been identified as a partial genomic sequence of the human homolog of the S. cerevisiae CDC27 gene, CDC27Hs (GenBank No. U00001). This gene, reported by Tugendreich et al., is also designated EST00556 from Adams et al. We have undertaken a more detailed examination of our sequence, MCP34N, and have found that: 1. the genomic sequence is nearly identical to CDC27Hs over its entire 853 bp length; 2. an MCP34N-specific PCR assay of several non-human primate species reveals amplification products in chimpanzee and gorilla genomes having greater than 90% sequence identity with CDC27Hs; and 3. an MCP34N-specific PCR assay of the BIOS hybrid cell line panel gives a discordancy pattern suggesting multiple loci. Based upon these data, we present the following initial characterization: 1. the complete MCP34N sequence identity with CDC27Hs indicates that the latter is encoded by an intronless gene; 2. CDC27Hs is highly conserved among higher primates; and 3. CDC27Hs is present in multiple copies in the human genome. These characteristics, taken together with those initially reported for CDC27Hs, suggest that this is an old gene that carries out an important but, as yet, unknown function in the human brain.

  2. Integrative proteomics, genomics, and translational immunology approaches reveal mutated forms of Proteolipid Protein 1 (PLP1) and mutant-specific immune response in multiple sclerosis.

    Science.gov (United States)

    Qendro, Veneta; Bugos, Grace A; Lundgren, Debbie H; Glynn, John; Han, May H; Han, David K

    2017-03-01

    In order to gain mechanistic insights into multiple sclerosis (MS) pathogenesis, we utilized a multi-dimensional approach to test the hypothesis that mutations in myelin proteins lead to immune activation and central nervous system autoimmunity in MS. Mass spectrometry-based proteomic analysis of human MS brain lesions revealed seven unique mutations of PLP1, a key myelin protein that is known to be destroyed in MS. Surprisingly, in-depth genomic analysis of two MS patients at the genomic DNA and mRNA levels confirmed mutated PLP1 in RNA, but not in the genomic DNA. Quantification of wild type and mutant PLP RNA levels by qPCR further validated the presence of mutant PLP RNA in the MS patients. To seek evidence linking mutations in abundant myelin proteins and immune-mediated destruction of myelin, specific immune response against mutant PLP1 in MS patients was examined. Thus, we have designed paired, wild type and mutant peptide microarrays, and examined antibody response to multiple mutated PLP1 in sera from MS patients. Consistent with the idea of different patients exhibiting unique mutation profiles, we found that 13 out of 20 MS patients showed antibody responses against specific but not against all the mutant-PLP1 peptides. Interestingly, we found mutant PLP-directed antibody response against specific mutant peptides in the sera of pre-MS controls. The results from integrative proteomic, genomic, and immune analyses reveal a possible mechanism of mutation-driven pathogenesis in human MS. The study also highlights the need for integrative genomic and proteomic analyses for uncovering pathogenic mechanisms of human diseases. © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  3. Whole-genome sequencing of Bacillus subtilis XF-1 reveals mechanisms for biological control and multiple beneficial properties in plants.

    Science.gov (United States)

    Guo, Shengye; Li, Xingyu; He, Pengfei; Ho, Honhing; Wu, Yixin; He, Yueqiu

    2015-06-01

    Bacillus subtilis XF-1 is a gram-positive, plant-associated bacterium that stimulates plant growth and produces secondary metabolites that suppress soil-borne plant pathogens. It is especially efficient at controlling the clubroot disease of cruciferous crops. Its 4,061,186-bp genome contains an estimated 3853 protein-coding sequences and the 1155 genes of XF-1 are present in most genome-sequenced Bacillus strains: 3757 genes in B. subtilis 168, and 1164 in B. amyloliquefaciens FZB42. Analysis using the Cluster of Orthologous Groups database of proteins shows that 60 genes control bacterial mobility, 221 genes are related to cell wall and membrane biosynthesis, and more than 112 genes are associated with secondary metabolites. In addition, the genes contributed to the strain's plant colonization, bio-control and stimulation of plant growth. Sequencing of the genome is a fundamental step for developing a desired strain to serve as an efficient biological control agent and plant growth stimulator. Similar to other members of the taxon, XF-1 has a genome that contains giant gene clusters for the non-ribosomal synthesis of antifungal lipopeptides (surfactin and fengycin), the polyketides (macrolactin and bacillaene), the siderophore bacillibactin, and the dipeptide bacilysin. There are two synthesis pathways for volatile growth-promoting compounds. The expression of biosynthesized antibiotic peptides in XF-1 was revealed by matrix-assisted laser desorption/ionization-time of flight mass spectrometry.

  4. Rice-Infecting Pseudomonas Genomes Are Highly Accessorized and Harbor Multiple Putative Virulence Mechanisms to Cause Sheath Brown Rot

    Science.gov (United States)

    Quibod, Ian Lorenzo; Grande, Genelou; Oreiro, Eula Gems; Borja, Frances Nikki; Dossa, Gerbert Sylvestre; Mauleon, Ramil; Cruz, Casiana Vera; Oliva, Ricardo

    2015-01-01

    Sheath rot complex and seed discoloration in rice involve a number of pathogenic bacteria that cannot be associated with distinctive symptoms. These pathogens can easily travel on asymptomatic seeds and therefore represent a threat to rice cropping systems. Among the rice-infecting Pseudomonas, P. fuscovaginae has been associated with sheath brown rot disease in several rice growing areas around the world. The appearance of a similar Pseudomonas population, which here we named P. fuscovaginae-like, represents a perfect opportunity to understand common genomic features that can explain the infection mechanism in rice. We showed that the novel population is indeed closely related to P. fuscovaginae. A comparative genomics approach on eight rice-infecting Pseudomonas revealed heterogeneous genomes and a high number of strain-specific genes. The genomes of P. fuscovaginae-like harbor four secretion systems (Type I, II, III, and VI) and other important pathogenicity machinery that could probably facilitate rice colonization. We identified 123 core secreted proteins, most of which have strong signatures of positive selection suggesting functional adaptation. Transcript accumulation of putative pathogenicity-related genes during rice colonization revealed a concerted virulence mechanism. The study suggests that rice-infecting Pseudomonas causing sheath brown rot are intrinsically diverse and maintain a variable set of metabolic capabilities as a potential strategy to occupy a range of environments. PMID:26422147

  5. Genome-wide meta-analyses of multiancestry cohorts identify multiple new susceptibility loci for refractive error and myopia

    NARCIS (Netherlands)

    Verhoeven, Virginie J. M.; Hysi, Pirro G.; Wojciechowski, Robert; Fan, Qiao; Guggenheim, Jeremy A.; Höhn, René; Macgregor, Stuart; Hewitt, Alex W.; Nag, Abhishek; Cheng, Ching-Yu; Yonova-Doing, Ekaterina; Zhou, Xin; Ikram, M. Kamran; Buitendijk, Gabriëlle H. S.; McMahon, George; Kemp, John P.; Pourcain, Beate St; Simpson, Claire L.; Mäkelä, Kari-Matti; Lehtimäki, Terho; Kähönen, Mika; Paterson, Andrew D.; Hosseini, S. Mohsen; Wong, Hoi Suen; Xu, Liang; Jonas, Jost B.; Pärssinen, Olavi; Wedenoja, Juho; Yip, Shea Ping; Ho, Daniel W. H.; Pang, Chi Pui; Chen, Li Jia; Burdon, Kathryn P.; Craig, Jamie E.; Klein, Barbara E. K.; Klein, Ronald; Haller, Toomas; Metspalu, Andres; Khor, Chiea-Chuen; Tai, E.-Shyong; Aung, Tin; Vithana, Eranga; Tay, Wan-Ting; Barathi, Veluchamy A.; Chen, Peng; Li, Ruoying; Liao, Jiemin; Zheng, Yingfeng; Bergen, Arthur A. B.; Chen, Wei

    2013-01-01

    Refractive error is the most common eye disorder worldwide and is a prominent cause of blindness. Myopia affects over 30% of Western populations and up to 80% of Asians. The CREAM consortium conducted genome-wide meta-analyses, including 37,382 individuals from 27 studies of European ancestry and

  6. Meta-analysis of five genome-wide association studies identifies multiple new loci associated with testicular germ cell tumor

    DEFF Research Database (Denmark)

    Wang, Zhaoming; McGlynn, Katherine A.; Rajpert-De Meyts, Ewa

    2017-01-01

    The international Testicular Cancer Consortium (TECAC) combined five published genome-wide association studies of testicular germ cell tumor (TGCT; 3,558 cases and 13,970 controls) to identify new susceptibility loci. We conducted a fixed-effects meta-analysis, including, to our knowledge, the fi...

  7. Genomic characterisation of Arachis porphyrocalyx (Valls & C.E. Simpson, 2005) (Leguminosae): multiple origin of Arachis species with x = 9

    Science.gov (United States)

    Celeste, Silvestri María; Ortiz, Alejandra Marcela; Robledo, Germán Ariel; Valls, José Francisco Montenegro; Lavia, Graciela Inés

    2017-01-01

    The genus Arachis Linnaeus, 1753 comprises four species with x = 9; three belong to the section Arachis: Arachis praecox (Krapov. W.C. Greg. & Valls, 1994), Arachis palustris (Krapov. W.C. Greg. & Valls, 1994) and Arachis decora (Krapov. W.C. Greg. & Valls, 1994) and only one belongs to the section Erectoides: Arachis porphyrocalyx (Valls & C.E. Simpson, 2005). Recently, the x = 9 species of section Arachis have been assigned to G genome, the latest described so far. The genomic relationship of Arachis porphyrocalyx with these species is controversial. In the present work, we carried out a karyotypic characterisation of Arachis porphyrocalyx to evaluate its genomic structure and analyse the origin of all x = 9 Arachis species. Arachis porphyrocalyx showed a karyotype formula of 14m+4st, one pair of A chromosomes, satellited chromosomes type 8, one pair of 45S rDNA sites in the SAT chromosomes, one pair of 5S rDNA sites and pericentromeric C-DAPI+ bands in all chromosomes. Karyotype structure indicates that Arachis porphyrocalyx does not share the same genome type with the other three x = 9 species and neither with the remaining Erectoides species. Taking into account the geographic distribution, morphological and cytogenetic features, the origin of species with x = 9 of the genus Arachis cannot be unique; instead, they originated at least twice in the evolutionary history of the genus. PMID:28919947

  8. ISOLATION OF THE GENOME SEQUENCE STRAIN MYCOBACTERIUM AVIUM 104 FROM MULTIPLE PATIENTS OVER A 17-YEAR PERIOD

    Science.gov (United States)

    The genome sequence strain 104 of the opportunistic pathogen Mycobacterium avium was isolated from an adult AIDS patient in Southern California in 1983. Isolates of non-paratuberculosis M. avium from 207 other patients in Southern California and elsewhere were examined for genoty...

  9. Detection of Multiple Parallel Transmission Outbreak of Streptococcus suis Human Infection by Use of Genome Epidemiology, China, 2005.

    Science.gov (United States)

    Du, Pengcheng; Zheng, Han; Zhou, Jieping; Lan, Ruiting; Ye, Changyun; Jing, Huaiqi; Jin, Dong; Cui, Zhigang; Bai, Xuemei; Liang, Jianming; Liu, Jiantao; Xu, Lei; Zhang, Wen; Chen, Chen; Xu, Jianguo

    2017-02-01

    Streptococcus suis sequence type 7 emerged and caused 2 of the largest human infection outbreaks in China in 1998 and 2005. To determine the major risk factors and source of the infections, we analyzed whole genomes of 95 outbreak-associated isolates, identified 160 single nucleotide polymorphisms, and classified them into 6 clades. Molecular clock analysis revealed that clade 1 (responsible for the 1998 outbreak) emerged in October 1997. Clades 2-6 (responsible for the 2005 outbreak) emerged separately during February 2002-August 2004. A total of 41 lineages of S. suis emerged by the end of 2004 and rapidly expanded to 68 genome types through single base mutations when the outbreak occurred in June 2005. We identified 32 identical isolates and classified them into 8 groups, which were distributed in a large geographic area with no transmission link. These findings suggest that persons were infected in parallel in respective geographic sites.

  10. Genome-wide association study of clinically defined gout identifies multiple risk loci and its association with clinical subtypes

    OpenAIRE

    Matsuo, Hirotaka; Yamamoto, Ken; Nakaoka, Hirofumi; Nakayama, Akiyoshi; Sakiyama, Masayuki; Chiba, Toshinori; Takahashi, Atsushi; Nakamura, Takahiro; Nakashima, Hiroshi; Takada, Yuzo; Danjoh, Inaho; Shimizu, Seiko; Abe, Junko; Kawamura, Yusuke; Terashige, Sho

    2015-01-01

    Objective Gout, caused by hyperuricaemia, is a multifactorial disease. Although genome-wide association studies (GWASs) of gout have been reported, they included self-reported gout cases in which clinical information was insufficient. Therefore, the relationship between genetic variation and clinical subtypes of gout remains unclear. Here, we first performed a GWAS of clinically defined gout cases only. Methods A GWAS was conducted with 945 patients with clinically defined gout and 1213 contr...

  11. Multiple-integrations of HPV16 genome and altered transcription of viral oncogenes and cellular genes are associated with the development of cervical cancer.

    Directory of Open Access Journals (Sweden)

    Xulian Lu

    The constitutive expression of the high-risk HPV E6 and E7 viral oncogenes is the major cause of cervical cancer. To comprehensively explore the composition of HPV16 early transcripts and their genomic annotation, cervical squamous epithelial tissues from 40 HPV16-infected patients were collected for analysis of papillomavirus oncogene transcripts (APOT). We observed different transcription patterns of HPV16 oncogenes in progression of cervical lesions to cervical cancer and identified one novel transcript. Multiple-integration events in the tissues of cervical carcinoma (CxCa) are significantly more frequent than in those of low-grade squamous intraepithelial lesions (LSIL) and high-grade squamous intraepithelial lesions (HSIL). Moreover, most cellular genes within or near these integration sites are cancer-associated genes. Taken together, this study suggests that multiple integrations of the HPV genome during persistent viral infection, which alter the expression patterns of viral oncogenes and integration-related cellular genes, play a crucial role in progression of cervical lesions to cervical cancer.

  12. Application of single-step genomic best linear unbiased prediction with a multiple-lactation random regression test-day model for Japanese Holsteins.

    Science.gov (United States)

    Baba, Toshimi; Gotoh, Yusaku; Yamaguchi, Satoshi; Nakagawa, Satoshi; Abe, Hayato; Masuda, Yutaka; Kawahara, Takayoshi

    2017-08-01

    This study aimed to evaluate the validation reliability of single-step genomic best linear unbiased prediction (ssGBLUP) with a multiple-lactation random regression test-day model and to investigate the effect of adding genotyped cows on the reliability. Two data sets for test-day records from the first three lactations were used: full data from February 1975 to December 2015 (60 850 534 records from 2 853 810 cows) and reduced data cut off in 2011 (53 091 066 records from 2 502 307 cows). We used marker genotypes of 4480 bulls and 608 cows. Genomic enhanced breeding values (GEBV) of 305-day milk yield in all the lactations were estimated for at least 535 young bulls using two marker data sets: bull genotypes only and both bull and cow genotypes. The realized reliability (R²) from linear regression analysis was used as an indicator of validation reliability. Using only genotyped bulls, R² ranged from 0.41 to 0.46 and was always higher than that of parent averages. Very similar R² values were observed when genotyped cows were added. An application of ssGBLUP to a multiple-lactation random regression model is feasible and adding a limited number of genotyped cows has no significant effect on reliability of GEBV for genotyped bulls. © 2016 Japanese Society of Animal Science.
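    As an illustration of the validation statistic mentioned in this abstract, the coefficient of determination from a simple linear regression can be computed as below. This is a minimal sketch under stated assumptions: the abstract does not say which quantities were regressed on which, so the pairing of predicted values with later observed values, the function name, and the example numbers are all hypothetical.

```python
def r_squared(predicted, observed):
    """Coefficient of determination of a simple linear regression.

    For a regression with one predictor and an intercept, R^2 equals the
    squared Pearson correlation between predictor and response.
    """
    n = len(predicted)
    if n != len(observed) or n < 2:
        raise ValueError("need two equally long samples of size >= 2")
    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n
    # Centered sums of squares and cross-products.
    s_pp = sum((p - mean_p) ** 2 for p in predicted)
    s_oo = sum((o - mean_o) ** 2 for o in observed)
    s_po = sum((p - mean_p) * (o - mean_o)
               for p, o in zip(predicted, observed))
    if s_pp == 0 or s_oo == 0:
        raise ValueError("both samples must vary")
    return s_po ** 2 / (s_pp * s_oo)


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    early_predictions = [1.0, 2.0, 3.0, 4.0]
    later_observations = [1.1, 1.9, 3.2, 3.8]
    print(round(r_squared(early_predictions, later_observations), 3))
```

    Comparing this value for genomic predictions against the value for parent averages mirrors the kind of comparison the abstract reports.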

  13. Comparative genomic analysis reveals multiple long terminal repeats, lineage-specific amplification, and frequent interelement recombination for Cassandra retrotransposon in pear (Pyrus bretschneideri Rehd.).

    Science.gov (United States)

    Yin, Hao; Du, Jianchang; Li, Leiting; Jin, Cong; Fan, Lian; Li, Meng; Wu, Jun; Zhang, Shaoling

    2014-06-04

    Cassandra transposable elements belong to a specific group of terminal-repeat retrotransposons in miniature (TRIM). Although Cassandra TRIM elements have been found in almost all vascular plants, detailed investigations on the nature, abundance, amplification timeframe, and evolution have not been performed in an individual genome. We therefore conducted a comprehensive analysis of Cassandra retrotransposons using the newly sequenced pear genome along with four other Rosaceae species, including apple, peach, mei, and woodland strawberry. Our data reveal several interesting findings for this particular retrotransposon family: 1) A large number of the intact copies contain three, four, or five long terminal repeats (LTRs) (∼20% in pear); 2) intact copies and solo LTRs with or without target site duplications are both common (∼80% vs. 20%) in each genome; 3) the elements exhibit an overall unbiased distribution among the chromosomes; 4) the elements are most successfully amplified in pear (5,032 copies); and 5) the evolutionary relationships of these elements vary among different lineages, species, and evolutionary time. These results indicate that Cassandra retrotransposons contain more complex structures (elements with multiple LTRs) than what we have known previously, and that frequent interelement unequal recombination followed by transposition may play a critical role in shaping and reshaping host genomes. Thus this study provides insights into the property, propensity, and molecular mechanisms governing the formation and amplification of Cassandra retrotransposons, and enhances our understanding of the structural variation, evolutionary history, and transposition process of LTR retrotransposons in plants. © The Author(s) 2014. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  14. Efficient Multiple Genome Modifications Induced by the crRNAs, tracrRNA and Cas9 Protein Complex in Zebrafish

    Science.gov (United States)

    Ohga, Rie; Ota, Satoshi; Kawahara, Atsuo

    2015-01-01

    The type II clustered regularly interspaced short palindromic repeats (CRISPR) associated with Cas9 endonuclease (CRISPR/Cas9) has become a powerful genetic tool for understanding the function of a gene of interest. In zebrafish, the injection of Cas9 mRNA and guide-RNA (gRNA), which are prepared using an in vitro transcription system, efficiently induces DNA double-strand breaks (DSBs) at the targeted genomic locus. Because gRNA was originally constructed by fusing two short RNAs, CRISPR RNA (crRNA) and trans-activating crRNA (tracrRNA), we examined the effect of synthetic crRNAs and tracrRNA with Cas9 mRNA or Cas9 protein on the genome editing activity. We previously reported that the disruption of tyrosinase (tyr) by tyr-gRNA/Cas9 mRNA causes a retinal pigment defect, whereas the disruption of spns2 by spns2-gRNA1/Cas9 mRNA leads to a cardiac progenitor migration defect in zebrafish. Here, we found that the injection of spns2-crRNA1, tyr-crRNA and tracrRNA with Cas9 mRNA or Cas9 protein simultaneously caused a migration defect in cardiac progenitors and a pigment defect in retinal epithelial cells. A time course analysis demonstrated that the injection of crRNAs and tracrRNA with Cas9 protein rapidly induced genome modifications compared with the injection of crRNAs and tracrRNA with Cas9 mRNA. We further show that the crRNA-tracrRNA-Cas9 protein complex is functional for the visualization of endogenous gene expression; therefore, this is a very powerful, ready-to-use system in zebrafish. PMID:26010089

  15. Genome-wide analysis of the sox family in the calcareous sponge Sycon ciliatum: multiple genes with unique expression patterns

    Directory of Open Access Journals (Sweden)

    Fortunato Sofia

    2012-07-01

    Background: Sox genes are HMG-domain containing transcription factors with important roles in developmental processes in animals; many of them appear to have conserved functions among eumetazoans. Demosponges have fewer Sox genes than eumetazoans, but their roles remain unclear. The aim of this study is to gain insight into the early evolutionary history of the Sox gene family by identification and expression analysis of Sox genes in the calcareous sponge Sycon ciliatum. Methods: Calcaronean Sox-related sequences were retrieved by searching recently generated genomic and transcriptome sequence resources and analyzed using a variety of phylogenetic methods and identification of conserved motifs. Expression was studied by whole mount in situ hybridization. Results: We have identified seven Sox genes and four Sox-related genes in the complete genome of Sycon ciliatum. Phylogenetic and conserved motif analyses showed that five of the Sycon Sox genes represent groups B, C, E, and F present in cnidarians and bilaterians. Two additional genes are classified as Sox genes but cannot be assigned to specific subfamilies, and four genes are more similar to Sox genes than to other HMG-containing genes. Thus, the repertoire of Sox genes is larger in this representative of calcareous sponges than in the demosponge Amphimedon queenslandica. It remains unclear whether this is due to the expansion of the gene family in Sycon or a secondary reduction in the Amphimedon genome. In situ hybridization of Sycon Sox genes revealed a variety of expression patterns during embryogenesis and in specific cell types of adult sponges. Conclusions: In this study, we describe a large family of Sox genes in Sycon ciliatum with dynamic expression patterns, indicating that Sox genes are regulators in development and cell type determination in sponges, as observed in higher animals. The revealed differences between demosponge and calcisponge Sox genes repertoire highlight the need to

  16. Genome-wide association study identifies multiple loci associated with both mammographic density and breast cancer risk

    Science.gov (United States)

    Lindström, Sara; Thompson, Deborah J.; Paterson, Andrew D.; Li, Jingmei; Gierach, Gretchen L.; Scott, Christopher; Stone, Jennifer; Douglas, Julie A.; dos-Santos-Silva, Isabel; Fernandez-Navarro, Pablo; Verghase, Jajini; Smith, Paula; Brown, Judith; Luben, Robert; Wareham, Nicholas J.; Loos, Ruth J.F.; Heit, John A.; Pankratz, V. Shane; Norman, Aaron; Goode, Ellen L.; Cunningham, Julie M.; deAndrade, Mariza; Vierkant, Robert A.; Czene, Kamila; Fasching, Peter A.; Baglietto, Laura; Southey, Melissa C.; Giles, Graham G.; Shah, Kaanan P.; Chan, Heang-Ping; Helvie, Mark A.; Beck, Andrew H.; Knoblauch, Nicholas W.; Hazra, Aditi; Hunter, David J.; Kraft, Peter; Pollan, Marina; Figueroa, Jonine D.; Couch, Fergus J.; Hopper, John L.; Hall, Per; Easton, Douglas F.; Boyd, Norman F.; Vachon, Celine M.; Tamimi, Rulla M.

    2015-01-01

    Mammographic density reflects the amount of stromal and epithelial tissues in relation to adipose tissue in the breast and is a strong risk factor for breast cancer. Here we report the results from meta-analysis of genome-wide association studies (GWAS) of three mammographic density phenotypes: dense area, non-dense area and percent density in up to 7,916 women in stage 1 and an additional 10,379 women in stage 2. We identify genome-wide significant (P<5×10⁻⁸) loci for dense area (AREG, ESR1, ZNF365, LSP1/TNNT3, IGF1, TMEM184B, SGSM3/MKL1), non-dense area (8p11.23) and percent density (PRDM6, 8p11.23, TMEM184B). Four of these regions are known breast cancer susceptibility loci, and four additional regions were found to be associated with breast cancer (P<0.05) in a large meta-analysis. These results provide further evidence of a shared genetic basis between mammographic density and breast cancer and illustrate the power of studying intermediate quantitative phenotypes to identify putative disease susceptibility loci. PMID:25342443

  17. Combining Multiple Hypothesis Testing with Machine Learning Increases the Statistical Power of Genome-wide Association Studies

    Science.gov (United States)

    Mieth, Bettina; Kloft, Marius; Rodríguez, Juan Antonio; Sonnenburg, Sören; Vobruba, Robin; Morcillo-Suárez, Carlos; Farré, Xavier; Marigorta, Urko M.; Fehr, Ernst; Dickhaus, Thorsten; Blanchard, Gilles; Schunk, Daniel; Navarro, Arcadi; Müller, Klaus-Robert

    2016-01-01

    The standard approach to the analysis of genome-wide association studies (GWAS) is based on testing each position in the genome individually for statistical significance of its association with the phenotype under investigation. To improve the analysis of GWAS, we propose a combination of machine learning and statistical testing that takes correlation structures within the set of SNPs under investigation in a mathematically well-controlled manner into account. The novel two-step algorithm, COMBI, first trains a support vector machine to determine a subset of candidate SNPs and then performs hypothesis tests for these SNPs together with an adequate threshold correction. Applying COMBI to data from a WTCCC study (2007) and measuring performance as replication by independent GWAS published within the 2008–2015 period, we show that our method outperforms ordinary raw p-value thresholding as well as other state-of-the-art methods. COMBI presents higher power and precision than the examined alternatives while yielding fewer false (i.e. non-replicated) and more true (i.e. replicated) discoveries when its results are validated on later GWAS studies. More than 80% of the discoveries made by COMBI upon WTCCC data have been validated by independent studies. Implementations of the COMBI method are available as a part of the GWASpi toolbox 2.0. PMID:27892471

  18. Genome-wide association study in discordant sibships identifies multiple inherited susceptibility alleles linked to lung cancer.

    Science.gov (United States)

    Galvan, Antonella; Falvella, Felicia S; Frullanti, Elisa; Spinola, Monica; Incarbone, Matteo; Nosotti, Mario; Santambrogio, Luigi; Conti, Barbara; Pastorino, Ugo; Gonzalez-Neira, Anna; Dragani, Tommaso A

    2010-03-01

    We analyzed a series of young (median age = 52 years) non-smoker lung cancer patients and their unaffected siblings as controls, using a genome-wide 620 901 single-nucleotide polymorphism (SNP) array analysis and a case-control DNA pooling approach. We identified 82 putatively associated SNPs that were retested by individual genotyping followed by use of the sib transmission disequilibrium test, pointing to 36 SNPs associated with lung cancer risk in the discordant sibs series. Analysis of these 36 SNPs in a polygenic model characterized by additive and interchangeable effects of rare alleles revealed a highly statistically significant dosage-dependent association between risk allele carrier status and proportion of cancer cases. Replication of the same 36 SNPs in a population-based series confirmed the association with lung cancer for three SNPs, suggesting that phenocopies and genetic heterogeneity can play a major role in the complex genetics of lung cancer risk in the general population.

  19. Statistical methods for QTL mapping and genomic prediction of multiple traits and environments: case studies in pepper

    NARCIS (Netherlands)

    Alimi, Nurudeen Adeniyi

    2016-01-01

    In this thesis we describe the results of a number of quantitative techniques that were used to understand the genetics of yield in pepper as an example of complex trait measured in a number of environments. Main objectives were; i) to propose a number of mixed models to detect QTLs for multiple

  20. Genome-wide mRNA and miRNA expression profiling reveal multiple regulatory networks in colorectal cancer

    DEFF Research Database (Denmark)

    Vishnubalaji, R; Hamam, R; Abdulla, M-H

    2015-01-01

    Despite recent advances in cancer management, colorectal cancer (CRC) remains the third most common cancer and a major health-care problem worldwide. MicroRNAs have recently emerged as key regulators of cancer development and progression by targeting multiple cancer-related genes; however, such r...

  1. Multiple sex-associated regions and a putative sex chromosome in zebrafish revealed by RAD mapping and population genomics.

    Directory of Open Access Journals (Sweden)

    Jennifer L Anderson

    Within vertebrates, major sex determining genes can differ among taxa and even within species. In zebrafish (Danio rerio), neither heteromorphic sex chromosomes nor single sex determination genes of large effect, like Sry in mammals, have yet been identified. Furthermore, environmental factors can influence zebrafish sex determination. Although progress has been made in understanding zebrafish gonad differentiation (e.g. the influence of germ cells on gonad fate), the primary genetic basis of zebrafish sex determination remains poorly understood. To identify genetic loci associated with sex, we analyzed F2 offspring of reciprocal crosses between Oregon *AB and Nadia (NA) wild-type zebrafish stocks. Genome-wide linkage analysis, using more than 5,000 sequence-based polymorphic restriction site associated (RAD-tag) markers and population genomic analysis of more than 30,000 single nucleotide polymorphisms in our *ABxNA crosses revealed a sex-associated locus on the end of the long arm of chr-4 for both cross families, and an additional locus in the middle of chr-3 in one cross family. Additional sequencing showed that two SNPs in dmrt1 previously suggested to be functional candidates for sex determination in a cross of ABxIndia wild-type zebrafish, are not associated with sex in our AB fish. Our data show that sex determination in zebrafish is polygenic and that different genes may influence sex determination in different strains or that different genes become more important under different environmental conditions. The association of the end of chr-4 with sex is remarkable because, unique in the karyotype, this chromosome arm shares features with known sex chromosomes: it is highly heterochromatic, repetitive, late replicating, and has reduced recombination. Our results reveal that chr-4 has functional and structural properties expected of a sex chromosome.

  2. Integrative Genomics: Quantifying significance of phenotype-genotype relationships from multiple sources of high-throughput data

    Directory of Open Access Journals (Sweden)

    Eric Gamazon

    2013-05-01

    Given recent advances in the generation of high-throughput data such as whole genome genetic variation and transcriptome expression, it is critical to come up with novel methods to integrate these heterogeneous datasets and to assess the significance of identified phenotype-genotype relationships. Recent studies show that genome-wide association findings are likely to fall in loci with gene regulatory effects such as expression quantitative trait loci (eQTLs), demonstrating the utility of such integrative approaches. For cases in which genotype and gene expression data are available on the same individuals, we developed methods wherein top phenotype-associated genetic variants are prioritized if they are associated, as eQTLs, with gene expression traits that are themselves associated with the phenotype. Yet there has been no method to determine an overall p-value for the findings that arise specifically from the integrative nature of the approach. We propose a computationally feasible permutation method that accounts for the assimilative nature of the method and the correlation structure among gene expression traits and among genotypes. We apply the method to data from a study of cellular sensitivity to etoposide, one of the most widely used chemotherapeutic drugs. To our knowledge, this study is the first statistically sound quantification of the significance of the genotype-phenotype relationships resulting from applying an integrative approach. This method can be easily extended to cases in which gene expression data are replaced by other molecular phenotypes of interest, e.g., microRNA or proteomic data. This study has important implications for studies seeking to expand on genetic association studies by the use of omics data. Finally, we provide R code to compute the empirical FDR when p-values for the observed and simulated phenotypes are available.
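
    The empirical FDR mentioned at the end of this abstract can be illustrated with a minimal Python sketch. It is not the authors' R code: the function name `empirical_fdr`, its arguments, and the estimator used here (mean number of null hits per simulation divided by the number of observed hits at the same threshold) are assumptions chosen for illustration.

```python
def empirical_fdr(observed_p, simulated_p_sets, threshold):
    """Estimate the false discovery rate at a p-value threshold.

    observed_p:        p-values obtained with the real phenotype.
    simulated_p_sets:  one list of p-values per simulated (null) phenotype.
    threshold:         p-value cutoff used to call a finding.

    Hypothetical helper for illustration only.
    """
    n_observed = sum(1 for p in observed_p if p <= threshold)
    if n_observed == 0:
        # Nothing was called significant, so there are no discoveries to be false.
        return 0.0
    # Average number of "discoveries" that a null simulation produces.
    null_hits = [sum(1 for p in sim if p <= threshold) for sim in simulated_p_sets]
    mean_null = sum(null_hits) / len(simulated_p_sets)
    # The ratio can exceed 1 when the null produces more hits than the data; cap it.
    return min(1.0, mean_null / n_observed)


# Toy input: five observed p-values and two simulated null sets (made-up numbers).
observed = [0.001, 0.004, 0.02, 0.3, 0.8]
simulated = [
    [0.003, 0.2, 0.5, 0.6, 0.9],
    [0.04, 0.1, 0.4, 0.7, 0.95],
]
fdr_strict = empirical_fdr(observed, simulated, 0.01)
fdr_loose = empirical_fdr(observed, simulated, 0.05)
```

    In this toy input, two observed p-values fall below 0.01 while the simulations yield 0.5 such hits on average, so the estimate at that threshold is 0.25.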

  3. Distinctive mitochondrial genome of Calanoid copepod Calanus sinicus with multiple large non-coding regions and reshuffled gene order: Useful molecular markers for phylogenetic and population studies

    Science.gov (United States)

    2011-01-01

    Background: Copepods are highly diverse and abundant, resulting in extensive ecological radiation in marine ecosystems. Calanus sinicus dominates continental shelf waters in the northwest Pacific Ocean and plays an important role in the local ecosystem by linking primary production to higher trophic levels. A lack of effective molecular markers has hindered phylogenetic and population genetic studies concerning copepods. As they are genome-level informative, mitochondrial DNA sequences can be used as markers for population genetic studies and phylogenetic studies. Results: The mitochondrial genome of C. sinicus is distinct from other arthropods owing to the concurrence of multiple non-coding regions and a reshuffled gene arrangement. Further particularities in the mitogenome of C. sinicus include low A + T-content, symmetrical nucleotide composition between strands, abbreviated stop codons for several PCGs and extended lengths of the genes atp6 and atp8 relative to other copepods. The monophyletic Copepoda should be placed within the Vericrustacea. The close affinity between Cyclopoida and Poecilostomatoida suggests reassigning the latter as subordinate to the former. Monophyly of Maxillopoda is rejected. Within the alignment of 11 C. sinicus mitogenomes, there are 397 variable sites harbouring three 'hotspot' variable sites and three microsatellite loci. Conclusion: The occurrence of the circular subgenomic fragment during laboratory assays suggests that special caution should be taken when sequencing mitogenomes using long PCR. Such a phenomenon may provide additional evidence of mitochondrial DNA recombination, which appears to have been a prerequisite for shaping the present mitochondrial profile of C. sinicus during its evolution. The lack of synapomorphic gene arrangements among copepods has cast doubt on the utility of gene order as a useful molecular marker for deep phylogenetic analysis. However, mitochondrial genomic sequences have been valuable markers for

  4. Molecular evolution of avian reovirus: evidence for genetic diversity and reassortment of the S-class genome segments and multiple cocirculating lineages

    International Nuclear Information System (INIS)

    Liu, Hung J.; Lee, Long H.; Hsu, Hsiao W.; Kuo, Liam C.; Liao, Ming H.

    2003-01-01

    Nucleotide sequences of the S-class genome segments of 17 field-isolates and vaccine strains of avian reovirus (ARV) isolated over a 23-year period from different hosts, pathotypes, and geographic locations were examined and analyzed to define phylogenetic profiles and evolutionary mechanism. The S1 genome segment showed noticeably higher divergence than the other S-class genes. The σC-encoding gene has evolved into six distinct lineages. In contrast, the other S-class genes showed less divergence than that of the σC-encoding gene and have evolved into two to three major distinct lineages, respectively. Comparative sequence analysis provided evidence indicating extensive sequence divergence between ARV and other orthoreoviruses. The evolutionary trees of each gene were distinct, suggesting that these genes evolve in an independent manner. Furthermore, variable topologies were the result of frequent genetic reassortment among multiple cocirculating lineages. Results showed genetic diversity correlated more closely with date of isolation and geographic sites than with host species and pathotypes. This is the first evidence demonstrating genetic variability among circulating ARVs through a combination of evolutionary mechanisms involving multiple cocirculating lineages and genetic reassortment. The evolutionary rates and patterns of base substitutions were examined. The evolutionary rate for the σC-encoding gene and σC protein was higher than for the other S-class genes and other families of viruses. With the exception of the σC-encoding gene, in which nonsynonymous substitutions predominate over synonymous ones, the evolutionary process of the other S-class genes can be explained by the neutral theory of molecular evolution. Results revealed that synonymous substitutions predominate over nonsynonymous ones in the S-class genes, even though genetic diversity and substitution rates vary among the viruses.

  5. Nuclear topography of the 1q21 genomic region and MCl-1 protein levels associated with pathophysiology of multiple myeloma

    Czech Academy of Sciences Publication Activity Database

    Legartová, Soňa; Krejčí, Jana; Harničarová, Andrea; Hájek, R.; Kozubek, Stanislav; Bártová, Eva

    2009-01-01

    Vol. 56, No. 5 (2009), pp. 404-413. ISSN 0028-2685. R&D Projects: GA MŠk(CZ) LC06027; GA AV ČR(CZ) 1QS500040508. Institutional research plan: CEZ:AV0Z50040507; CEZ:AV0Z50040702. Keywords: multiple myeloma; 1q21; DNA-FISH. Subject RIV: BO - Biophysics. Impact factor: 1.192 (2009).

  6. Genome-wide association study of clinically defined gout identifies multiple risk loci and its association with clinical subtypes.

    Science.gov (United States)

    Matsuo, Hirotaka; Yamamoto, Ken; Nakaoka, Hirofumi; Nakayama, Akiyoshi; Sakiyama, Masayuki; Chiba, Toshinori; Takahashi, Atsushi; Nakamura, Takahiro; Nakashima, Hiroshi; Takada, Yuzo; Danjoh, Inaho; Shimizu, Seiko; Abe, Junko; Kawamura, Yusuke; Terashige, Sho; Ogata, Hiraku; Tatsukawa, Seishiro; Yin, Guang; Okada, Rieko; Morita, Emi; Naito, Mariko; Tokumasu, Atsumi; Onoue, Hiroyuki; Iwaya, Keiichi; Ito, Toshimitsu; Takada, Tappei; Inoue, Katsuhisa; Kato, Yukio; Nakamura, Yukio; Sakurai, Yutaka; Suzuki, Hiroshi; Kanai, Yoshikatsu; Hosoya, Tatsuo; Hamajima, Nobuyuki; Inoue, Ituro; Kubo, Michiaki; Ichida, Kimiyoshi; Ooyama, Hiroshi; Shimizu, Toru; Shinomiya, Nariyoshi

    2016-04-01

    Gout, caused by hyperuricaemia, is a multifactorial disease. Although genome-wide association studies (GWASs) of gout have been reported, they included self-reported gout cases in which clinical information was insufficient. Therefore, the relationship between genetic variation and clinical subtypes of gout remains unclear. Here, we first performed a GWAS of clinically defined gout cases only. A GWAS was conducted with 945 patients with clinically defined gout and 1213 controls in a Japanese male population, followed by a replication study of 1048 clinically defined cases and 1334 controls. Five gout susceptibility loci were identified at the genome-wide significance level (p<…); these included the genes ABCG2 and SLC2A9 and additional genes: rs1260326 (p=1.9×10⁻¹²; OR=1.36) of GCKR (a gene for glucose and lipid metabolism), rs2188380 (p=1.6×10⁻²³; OR=1.75) of MYL2-CUX2 (genes associated with cholesterol and diabetes mellitus) and rs4073582 (p=6.4×10⁻⁹; OR=1.66) of CNIH-2 (a gene for regulation of glutamate signalling). The latter two are identified as novel gout loci. Furthermore, among the identified single-nucleotide polymorphisms (SNPs), we demonstrated that the SNPs of ABCG2 and SLC2A9 were differentially associated with types of gout and clinical parameters underlying specific subtypes (renal underexcretion type and renal overload type). The effect of the risk allele of each SNP on clinical parameters showed significant linear relationships with the ratio of the case-control ORs for two distinct types of gout (r=0.96 [p=4.8×10⁻⁴] for urate clearance and r=0.96 [p=5.0×10⁻⁴] for urinary urate excretion). Our findings provide clues to better understand the pathogenesis of gout and will be useful for development of companion diagnostics. Published by the BMJ Publishing Group Limited.

  7. An Arabidopsis introgression zone studied at high spatio-temporal resolution: interglacial and multiple genetic contact exemplified using whole nuclear and plastid genomes.

    Science.gov (United States)

    Hohmann, Nora; Koch, Marcus A

    2017-10-23

    Gene flow between species, across ploidal levels, and even between evolutionary lineages is a common phenomenon in the genus Arabidopsis. However, apart from two genetically fully stabilized allotetraploid species that have been investigated in detail, the extent and temporal dynamics of hybridization are not well understood. An introgression zone, with tetraploid A. arenosa introgressing into A. lyrata subsp. petraea in the Eastern Austrian Forealps and subsequent expansion towards pannonical lowlands, was described previously based on morphological observations as well as molecular data using microsatellite and plastid DNA markers. Here we investigate the spatio-temporal context of this suture zone, making use of the potential of next-generation sequencing and whole-genome data. By utilizing a combination of nuclear and plastid genomic data, the extent, direction and temporal dynamics of gene flow are elucidated in detail and Late Pleistocene evolutionary processes are resolved. Analysis of nuclear genomic data significantly recognizes the clinal structure of the introgression zone, but also reveals that hybridization and introgression is more common and substantial than previously thought. Also tetraploid A. lyrata and A. arenosa subsp. borbasii from outside the previously defined suture zone show genomic signals of past introgression. A. lyrata is shown to serve usually as the maternal parent in these hybridizations, but one exception is identified from plastome-based phylogenetic reconstruction. Using plastid phylogenomics with secondary time calibration, the origin of A. lyrata and A. arenosa lineages is pre-dating the last three glaciation complexes (approx. 550,000 years ago). Hybridization and introgression followed during the last two glacial-interglacial periods (since approx. 300,000 years ago) with later secondary contact at the northern and southern border of the introgression zone during the Holocene. Footprints of adaptive introgression in the

  8. Comprehensive genetic assessment of the human embryo: can empiric application of microarray comparative genomic hybridization reduce multiple gestation rate by single fresh blastocyst transfer?

    Science.gov (United States)

    Sills, Eric Scott; Yang, Zhihong; Walsh, David J; Salem, Shala A

    2012-09-01

    The unacceptable multiple gestation rate currently associated with in vitro fertilization (IVF) would be substantially alleviated if the routine practice of transferring more than one embryo were reconsidered. While transferring a single embryo is an effective method to reduce the clinical problem of multiple gestation, rigid adherence to this approach has been criticized for negatively impacting clinical pregnancy success in IVF. In general, single embryo transfer is viewed cautiously by IVF patients although greater acceptance would result from a more effective embryo selection method. Selection of one embryo for fresh transfer on the basis of chromosomal normalcy should achieve the dual objective of maintaining satisfactory clinical pregnancy rates and minimizing the multiple gestation problem, because embryo aneuploidy is a major contributing factor in implantation failure and miscarriage in IVF. The initial techniques for preimplantation genetic screening unfortunately lacked sufficient sensitivity and did not yield the expected results in IVF. However, newer molecular genetic methods could be incorporated with standard IVF to bring the goal of single embryo transfer within reach. Aiming to make multiple embryo transfers obsolete and unnecessary, and recognizing that array comparative genomic hybridization (aCGH) will typically require an additional 12 h of laboratory time to complete, we propose adopting aCGH for mainstream use in clinical IVF practice. As aCGH technology continues to develop and becomes increasingly available at lower cost, it may soon be considered unusual for IVF laboratories to select a single embryo for fresh transfer without regard to its chromosomal competency. In this report, we provide a rationale supporting aCGH as the preferred methodology to provide a comprehensive genetic assessment of the single embryo before fresh transfer in IVF. The logistics and cost of integrating aCGH with IVF to enable fresh embryo transfer are also

  9. Comparative Genomic Hybridization of Human Malignant Gliomas Reveals Multiple Amplification Sites and Nonrandom Chromosomal Gains and Losses

    Science.gov (United States)

    Schröck, Evelin; Thiel, Gundula; Lozanova, Tanka; du Manoir, Stanislas; Meffert, Marie-Christine; Jauch, Anna; Speicher, Michael R.; Nürnberg, Peter; Vogel, Siegfried; Janisch, Werner; Donis-Keller, Helen; Ried, Thomas; Witkowski, Regine; Cremer, Thomas

    1994-01-01

    Nine human malignant gliomas (2 astrocytomas grade III and 7 glioblastomas) were analyzed using comparative genomic hybridization (CGH). In addition to the amplification of the EGFR gene at 7p12 in 4 of 9 cases, six new amplification sites were mapped to 1q32, 4q12, 7q21.1, 7q21.2-3, 12p, and 22q12. Nonrandom chromosomal gains and losses were identified with overrepresentation of chromosome 7 and underrepresentation of chromosome 10 as the most frequent events (1 of 2 astrocytomas, 7 of 7 glioblastomas). Gain of a part or the whole chromosome 19 and losses of chromosome bands 9pter-23 and 22q13 were detected each in five cases. Loss of chromosome band 17p13 and gain of chromosome 20 were revealed each in three cases. The validity of the CGH data was confirmed using interphase cytogenetics with YAC clones, chromosome painting in tumor metaphase spreads, and DNA fingerprinting. A comparison of CGH data with the results of chromosome banding analyses indicates that metaphase spreads accessible in primary tumor cell cultures may not represent the clones predominant in the tumor tissue. PMID:8203461

  10. IS-seq: a novel high throughput survey of in vivo IS6110 transposition in multiple Mycobacterium tuberculosis genomes

    Directory of Open Access Journals (Sweden)

    Reyes Alejandro

    2012-06-01

    Background: The insertion element IS6110 is one of the main sources of genomic variability in Mycobacterium tuberculosis, the etiological agent of human tuberculosis. Although IS6110 has been used extensively as an epidemiological marker, the identification of the precise chromosomal insertion sites has been limited by technical challenges. Here, we present IS-seq, a novel method that combines high-throughput sequencing using Illumina technology with efficient combinatorial sample multiplexing to simultaneously probe 519 clinical isolates, identifying almost all the flanking regions of the element in a single experiment. Results: We identified a total of 6,976 IS6110 flanking regions in the different isolates. When validated using reference strains, the method had 100% specificity and 98% positive predictive value. The insertions mapped to both coding and non-coding regions, and in some cases interrupted genes thought to be essential for virulence or in vitro growth. Strains were classified into families using insertion sites, and high agreement with previous studies was observed. Conclusions: This high-throughput IS-seq method, which can also be used to map insertions in other organisms, extends previous surveys of in vivo interrupted loci and provides a baseline for probing the consequences of disruptions in M. tuberculosis strains.

  11. Cre/lox-based multiple markerless gene disruption in the genome of the extreme thermophile Thermus thermophilus.

    Science.gov (United States)

    Togawa, Yoichiro; Nunoshiba, Tatsuo; Hiratsu, Keiichiro

    2018-02-01

    Markerless gene-disruption technology is particularly useful for effective genetic analyses of Thermus thermophilus (T. thermophilus), which has a limited number of selectable markers. In an attempt to develop a novel system for the markerless disruption of genes in T. thermophilus, we applied a Cre/lox system to construct a triple gene disruptant. To achieve this, we constructed two genetic tools, a loxP-htk-loxP cassette and a cre-expressing plasmid, pSH-Cre, for gene disruption and removal of the selectable marker by Cre-mediated recombination. We found that the Cre/lox system was compatible with the proliferation of the T. thermophilus HB27 strain at the lowest growth temperature (50 °C), and thus succeeded in establishing a triple gene disruptant, the (∆TTC1454::loxP, ∆TTC1535KpnI::loxP, ∆TTC1576::loxP) strain, without leaving behind a selectable marker. During the sequential disruption of multiple genes, we observed undesired deletion and inversion of the chromosomal region between multiple loxP sites, induced by Cre-mediated recombination. Therefore, we examined the effects of a lox66-htk-lox71 cassette by exploiting the mutant lox sites, lox66 and lox71, instead of native loxP sites. We successfully constructed a (∆TTC1535::lox72, ∆TTC1537::lox72) double gene disruptant without inducing the undesired deletion of the 0.7-kbp region between the two directly oriented lox72 sites created by the Cre-mediated recombination of the lox66-htk-lox71 cassette. This is the first demonstration of a Cre/lox system being applicable to extreme thermophiles in a genetic manipulation. Our results indicate that this system is a powerful tool for multiple markerless gene disruption in T. thermophilus.

  12. Genome-wide association study of multiple sclerosis confirms a novel locus at 5p13.1.

    Directory of Open Access Journals (Sweden)

    Fuencisla Matesanz

    Multiple Sclerosis (MS) is the most common progressive and disabling neurological condition affecting young adults in the world today. From a genetic point of view, MS is a complex disorder resulting from the combination of genetic and non-genetic factors. We aimed to identify previously unidentified loci by conducting a new GWAS of MS in a sample of 296 MS cases and 801 controls from the Spanish population. A meta-analysis of our data in combination with previous GWAS was performed. A total of 17 GWAS-significant SNPs, corresponding to three different loci, were identified: HLA, IL2RA, and 5p13.1. All three have been previously reported as GWAS-significant. We confirmed our observation in 5p13.1 for rs9292777 using two additional independent Spanish samples to make a total of 4912 MS cases and 7498 controls (pooled OR = 0.84; 95% CI: 0.80-0.89; p = 1.36 × 10⁻⁹). This SNP differs from the one reported within this locus in a recent GWAS. Although it is unclear whether both signals are tapping the same genetic association, it seems clear that this locus plays an important role in the pathogenesis of MS.
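
    For readers unfamiliar with how an odds ratio and its 95% confidence interval are obtained, the following Python sketch computes both from a single 2x2 allele-count table using the standard log-scale (Woolf) interval. It is an illustration under stated assumptions, not the pooled meta-analysis used in the study: the function name `odds_ratio_ci` and all counts are hypothetical.

```python
import math


def odds_ratio_ci(case_alt, case_ref, control_alt, control_ref, z=1.96):
    """Odds ratio and Woolf (log-scale) confidence interval for a 2x2 table.

    case_alt / case_ref:        allele counts in cases (tested allele / other allele).
    control_alt / control_ref:  the same counts in controls.
    z:                          normal quantile; 1.96 gives a 95% interval.

    Hypothetical helper for illustration; all four counts must be positive.
    """
    odds_ratio = (case_alt * control_ref) / (case_ref * control_alt)
    # Standard error of ln(OR) from the four cell counts.
    se_log_or = math.sqrt(
        1 / case_alt + 1 / case_ref + 1 / control_alt + 1 / control_ref
    )
    log_or = math.log(odds_ratio)
    lower = math.exp(log_or - z * se_log_or)
    upper = math.exp(log_or + z * se_log_or)
    return odds_ratio, lower, upper


# Made-up counts, unrelated to the study's data.
or_value, ci_low, ci_high = odds_ratio_ci(100, 200, 150, 250)
```

    An odds ratio below 1, as reported for rs9292777, means the tested allele is less frequent in cases than in controls; an interval that excludes 1 corresponds to a significant association at the chosen level.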

  13. Chlamydomonas chloroplasts can use short dispersed repeats and multiple pathways to repair a double-strand break in the genome.

    Science.gov (United States)

    Odom, Obed W; Baek, Kwang-Hyun; Dani, Radhika N; Herrin, David L

    2008-03-01

    Certain group I introns insert into intronless DNA via an endonuclease that creates a double-strand break (DSB). There are two models for intron homing in phage: synthesis-dependent strand annealing (SDSA) and double-strand break repair (DSBR). The Cr.psbA4 intron homes efficiently from a plasmid into the chloroplast psbA gene in Chlamydomonas, but little is known about the mechanism. Analysis of co-transformants selected using a spectinomycin-resistant 16S gene (16S(spec)) provided evidence for both pathways. We also examined the consequences of the donor DNA having only one-sided or no homology with the psbA gene. When there was no homology with the donor DNA, deletions of up to 5 kb involving direct repeats that flank the psbA gene were obtained. Remarkably, repeats as short as 15 bp were used for this repair, which is consistent with the single-strand annealing (SSA) pathway. When the donor had one-sided homology, the DSB in most co-transformants was repaired using two DNAs, the donor and the 16S(spec) plasmid, which, coincidentally, contained a region that is repeated upstream of psbA. DSB repair using two separate DNAs provides further evidence for the SDSA pathway. These data show that the chloroplast can repair a DSB using short dispersed repeats located proximally, distally, or even on separate molecules relative to the DSB. They also provide a rationale for the extensive repertoire of repeated sequences in this genome.

  14. Genome-Wide Association Mapping Reveals Multiple QTLs Governing Tolerance Response for Seedling Stage Chilling Stress in Indica Rice

    Directory of Open Access Journals (Sweden)

    Sharat K. Pradhan

    2017-04-01

    Rice is sensitive to cold stress at the seedling stage. A panel representing 304 shortlisted germplasm lines was studied for seedling-stage chilling tolerance in indica rice. Six phenotypic classes were exposed to six low-temperature stress regimes in a controlled phenotyping facility to investigate the response pattern. A panel of 66 genotypes representing all phenotypic classes was used to assess genetic diversity, population structure and association mapping for the trait using 58 simple sequence repeat (SSR) and 2 direct trait-linked markers. A moderate level of genetic diversity was detected in the panel population for the trait. Deviation from Hardy-Weinberg expectation was detected in the studied population using Wright's F statistic. The panel showed 30% variation among populations and 70% among individuals. The entire population was categorized into three sub-populations through STRUCTURE analysis. This revealed that tolerance for the trait had a common primary ancestor for each sub-population, with few admixed individuals. The panel population showed the presence of many QTLs for cold stress tolerance in the individuals representing like genome-wide expression of the trait. Nineteen SSR markers were significantly associated with tolerance at chilling stress of 8°C to 4°C for 7–21 days duration. Thus, the primers linked to the seedling-stage cold tolerance QTLs, namely qCTS9, qCTS-2, qCTS6.1, qSCT2, qSCT11, qSCT1a, qCTS-3.1, qCTS11.1, qCTS12.1, qCTS-1b, and CTB2, need to be pyramided for development of a strongly chilling-tolerant variety.

  15. Genome-Wide Analysis of Simple Sequence Repeats and Efficient Development of Polymorphic SSR Markers Based on Whole Genome Re-Sequencing of Multiple Isolates of the Wheat Stripe Rust Fungus.

    Directory of Open Access Journals (Sweden)

    Huaiyong Luo

    The biotrophic parasitic fungus Puccinia striiformis f. sp. tritici (Pst) causes stripe rust, a devastating disease of wheat, endangering global food security. Because the Pst population is highly dynamic, it is difficult to develop wheat cultivars with durable and highly effective resistance. Simple sequence repeats (SSRs) are widely used as molecular markers in genetic studies to determine population structure in many organisms. However, only a small number of SSR markers have been developed for Pst. In this study, a total of 4,792 SSR loci were identified using the whole genome sequences of six isolates from different regions of the world, with a marker density of one SSR per 22.95 kb. The majority of the SSRs were di- and tri-nucleotide repeats. A database containing 1,113 SSR markers was established. Through in silico comparison, the previously reported SSR markers were found mainly in exons, whereas the SSR markers in the database were mostly in intergenic regions. Furthermore, 105 polymorphic SSR markers were confirmed in silico by their identical positions and nucleotide variations with INDELs identified among the six isolates. When 104 in silico polymorphic SSR markers were used to genotype 21 Pst isolates, 84 produced the target bands, and 82 of them were polymorphic and revealed the genetic relationships among the isolates. The results show that whole genome re-sequencing of multiple isolates provides an ideal resource for developing SSR markers, and the newly developed SSR markers are useful for genetic and population studies of the wheat stripe rust fungus.

  16. Genome-Wide Analysis of Simple Sequence Repeats and Efficient Development of Polymorphic SSR Markers Based on Whole Genome Re-Sequencing of Multiple Isolates of the Wheat Stripe Rust Fungus.

    Science.gov (United States)

    Luo, Huaiyong; Wang, Xiaojie; Zhan, Gangming; Wei, Guorong; Zhou, Xinli; Zhao, Jing; Huang, Lili; Kang, Zhensheng

    2015-01-01

    The biotrophic parasitic fungus Puccinia striiformis f. sp. tritici (Pst) causes stripe rust, a devastating disease of wheat, endangering global food security. Because the Pst population is highly dynamic, it is difficult to develop wheat cultivars with durable and highly effective resistance. Simple sequence repeats (SSRs) are widely used as molecular markers in genetic studies to determine population structure in many organisms. However, only a small number of SSR markers have been developed for Pst. In this study, a total of 4,792 SSR loci were identified using the whole genome sequences of six isolates from different regions of the world, with a marker density of one SSR per 22.95 kb. The majority of the SSRs were di- and tri-nucleotide repeats. A database containing 1,113 SSR markers was established. Through in silico comparison, the previously reported SSR markers were found mainly in exons, whereas the SSR markers in the database were mostly in intergenic regions. Furthermore, 105 polymorphic SSR markers were confirmed in silico by their identical positions and nucleotide variations with INDELs identified among the six isolates. When 104 in silico polymorphic SSR markers were used to genotype 21 Pst isolates, 84 produced the target bands, and 82 of them were polymorphic and revealed the genetic relationships among the isolates. The results show that whole genome re-sequencing of multiple isolates provides an ideal resource for developing SSR markers, and the newly developed SSR markers are useful for genetic and population studies of the wheat stripe rust fungus.
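
    As an illustration of how di- and tri-nucleotide SSR loci of the kind described above can be located in a sequence, the following is a minimal sketch only. The abstract does not name the software or the repeat thresholds used, so the function name `find_ssrs` and the minimum repeat counts (6 for di-, 5 for tri-nucleotide motifs) are assumptions chosen for the example, and only perfect (uninterrupted) repeats are reported.

```python
import re


def find_ssrs(seq, min_repeats=None):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs.

    min_repeats maps motif length -> minimum number of tandem copies.
    The defaults below are hypothetical, not taken from the abstract.
    """
    if min_repeats is None:
        min_repeats = {2: 6, 3: 5}
    seq = seq.upper()
    hits = []
    for motif_len, min_rep in sorted(min_repeats.items()):
        # One motif copy followed by at least (min_rep - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip homopolymer runs such as "AA" or "TTT".
            if len(set(motif)) == 1:
                continue
            hits.append((match.start(), motif, len(match.group(0)) // motif_len))
    return sorted(hits)


if __name__ == "__main__":
    # Toy sequence: seven AT repeats and five GAC repeats with short flanks.
    example = "GG" + "AT" * 7 + "TT" + "GAC" * 5 + "TT"
    for start, motif, count in find_ssrs(example):
        print(start, motif, count)
```

    Dividing the total sequence length in kb by the number of loci found would give a marker density comparable in form to the "one SSR per 22.95 kb" figure quoted in the abstract.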

  17. Translocation t(11;14)(q13;q32) and genomic imbalances in multi-ethnic multiple myeloma patients: a Malaysian study

    Directory of Open Access Journals (Sweden)

    Ivyna Bong Pau Ni

    2012-09-01

    More than 50% of myeloma cases have normal karyotypes under conventional cytogenetic analysis due to low mitotic activity and content of plasma cells in the bone marrow. We used a polymerase chain reaction (PCR)-based translocation detection assay to detect BCL1/JH t(11;14)(q13;q32) in 105 myeloma patients, and randomly selected 8 translocation-positive samples for array comparative genomic hybridization (aCGH) analysis. Our findings revealed that 14.3% of myeloma samples were positive for the BCL1/JH t(11;14)(q13;q32) translocation (n=15 of 105). We found no significant correlation between this translocation and age (P=0.420), gender (P=0.317), ethnicity (P=0.066) or new/relapsed status of multiple myeloma (P=0.412) at the 95% confidence interval level by χ² test. In addition, aCGH results showed genomic imbalances in all samples analyzed. Frequent chromosomal gains were identified at regions 1q, 2q, 3p, 3q, 4p, 4q, 5q, 7q, 9q, 11q, 13q, 15q, 21q, 22q and Xq, while chromosomal losses were detected at 4q and 14q. Copy number variations at genetic loci that contain NAMPT, IVNS1ABP and STK17B genes are new findings that have not previously been reported in myeloma patients. Besides fluorescence in situ hybridization, PCR is another rapid, sensitive and simple technique that can be used for detecting the BCL1/JH t(11;14)(q13;q32) translocation in multiple myeloma patients. Genes located in the chromosomal aberration regions in our study, such as NAMPT, IVNS1ABP, IRF2BP2, PICALM, STAT1, STK17B, FBXL5, ACSL1, LAMP2, SAMSN1 and ATP8B4, might be potential prognostic markers and therapeutic targets in the treatment and management of multiple myeloma patients positive for the BCL1/JH t(11;14)(q13;q32) translocation.

  18. A genomic library-based amplification approach (GL-PCR) for the mapping of multiple IS6110 insertion sites and strain differentiation of Mycobacterium tuberculosis.

    Science.gov (United States)

    Namouchi, Amine; Mardassi, Helmi

    2006-11-01

    Evidence suggests that insertion of the IS6110 element is not without consequence to the biology of Mycobacterium tuberculosis complex strains. Thus, mapping of multiple IS6110 insertion sites in the genome of biomedically relevant clinical isolates would result in a better understanding of the role of this mobile element, particularly with regard to transmission, adaptability and virulence. In the present paper, we describe a versatile strategy, referred to as GL-PCR, that amplifies IS6110-flanking sequences based on the construction of a genomic library. M. tuberculosis chromosomal DNA is fully digested with HincII and then ligated into a plasmid vector between T7 and T3 promoter sequences. The ligation reaction product is transformed into Escherichia coli, and selective PCR amplification targeting both 5' and 3' IS6110-flanking sequences is performed on the plasmid library DNA. For this purpose, four separate PCR reactions are performed, each combining an outward primer specific for one IS6110 end with either the T7 or the T3 primer. Determination of the nucleotide sequence of the PCR products generated from a single ligation reaction allowed mapping of 21 out of the 24 IS6110 copies of two 12-banded M. tuberculosis strains, yielding an overall sensitivity of 87.5%. Furthermore, by simply comparing the migration pattern of GL-PCR-generated products, the strategy proved to be as valuable as IS6110 RFLP for molecular typing of M. tuberculosis complex strains. Importantly, GL-PCR was able to discriminate between strains differing by a single IS6110 band.
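
    The four-reaction design and the reported sensitivity can be checked with a short sketch. The primer labels below are hypothetical placeholders (the abstract gives no primer names), and `sensitivity` is a helper introduced only for this example; the counts 21 and 24 are the figures stated in the abstract.

```python
from itertools import product

# Hypothetical labels for the two outward-facing IS6110 primers.
is6110_primers = ["IS6110-5prime-outward", "IS6110-3prime-outward"]
# Vector primers flanking the cloning site, as named in the abstract.
vector_primers = ["T7", "T3"]

# Each IS6110 end is paired with each vector primer: 2 x 2 = 4 reactions.
reactions = list(product(is6110_primers, vector_primers))


def sensitivity(mapped, total):
    """Percentage of IS6110 copies whose insertion site was mapped."""
    return 100.0 * mapped / total


if __name__ == "__main__":
    for insertion_primer, vector_primer in reactions:
        print(insertion_primer, "+", vector_primer)
    print(sensitivity(21, 24))
```

    Pairing each IS6110 end with both vector primers is presumably needed because a genomic fragment can be ligated into the plasmid in either orientation; this rationale is an inference, not a statement from the abstract.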

  19. Outbreak of Invasive Wound Mucormycosis in a Burn Unit Due to Multiple Strains of Mucor circinelloides f. circinelloides Resolved by Whole-Genome Sequencing

    Directory of Open Access Journals (Sweden)

    Dea Garcia-Hermoso

    2018-04-01

    Mucorales are ubiquitous environmental molds responsible for mucormycosis in diabetic, immunocompromised, and severely burned patients. Small outbreaks of invasive wound mucormycosis (IWM) have already been reported in burn units without extensive microbiological investigations. We faced an outbreak of IWM in our center and investigated the clinical isolates with whole-genome sequencing (WGS) analysis. We analyzed M. circinelloides isolates from patients in our burn unit (BU1, Hôpital Saint-Louis, Paris, France) together with nonoutbreak isolates from Burn Unit 2 (BU2, Paris area) and from France over a 2-year period (2013 to 2015). A total of 21 isolates, including 14 isolates from six BU1 patients, were analyzed by whole-genome sequencing (WGS). Phylogenetic classification based on de novo assembly and assembly-free approaches showed that the clinical isolates clustered in four highly divergent clades. Clade 1 contained at least one of the strains from the six epidemiologically linked BU1 patients. The clinical isolates were specific to each patient. Two patients were infected with more than two strains from different clades, suggesting that an environmental reservoir of clonally unrelated isolates was the source of contamination. Only two patients from BU1 shared one strain, which could correspond to direct transmission or contamination with the same environmental source. In conclusion, WGS of several isolates per patient coupled with precise epidemiological data revealed a complex situation combining potential cross-transmission between patients and multiple contaminations with a heterogeneous pool of strains from a cryptic environmental reservoir.

  20. A Novel Bifunctional Amino Acid Racemase With Multiple Substrate Specificity, MalY From Lactobacillus sakei LT-13: Genome-Based Identification and Enzymological Characterization

    Directory of Open Access Journals (Sweden)

    Shiro Kato

    2018-03-01

    The Lactobacillus sakei strain LK-145 isolated from Moto, a starter of sake, produces potentially large amounts of three D-amino acids, D-Ala, D-Glu, and D-Asp, in a medium containing amylase-digested rice as a carbon source. The comparison of metabolic pathways deduced from the complete genome sequence of strain LK-145 to the type culture strain of Lactobacillus sakei strain LT-13 showed that the L- and D-amino acid metabolic pathways are similar between the two strains. However, a marked difference was observed in the putative cysteine/methionine metabolic pathways of strain LK-145 and LT-13. The cystathionine β-lyase homolog gene malY was annotated only in the genome of strain LT-13. Cystathionine β-lyase is an important enzyme in the cysteine/methionine metabolic pathway that catalyzes the conversion of L-cystathionine into L-homocysteine. In addition to malY, most genome-sequenced strains of L. sakei including LT-13 lacked the homologous genes encoding other putative enzymes in this pathway. Accordingly, the cysteine/methionine metabolic pathway likely does not function well in almost all strains of L. sakei. We succeeded in cloning and expressing the malY gene from strain LT-13 (Ls-malY) in the cells of Escherichia coli BL21 (DE3) and characterized the enzymological properties of Ls-MalY. Spectral analysis of purified Ls-MalY showed that Ls-MalY contained a pyridoxal 5′-phosphate (PLP) as a cofactor, and this observation agreed well with the prediction based on its primary structure. Ls-MalY showed amino acid racemase activity and cystathionine β-lyase activity. Ls-MalY showed amino acid racemase activities in various amino acids, such as Ala, Arg, Asn, Glu, Gln, His, Leu, Lys, Met, Ser, Thr, Trp, and Val. Mutational analysis revealed that the ε-amino group of Lys233 in the primary structure of Ls-MalY likely bound to PLP, and Lys233 was an essential residue for Ls-MalY to catalyze both the amino acid racemase and β-lyase reactions. In

  1. Genomic and cranial phenotype data support multiple modern human dispersals from Africa and a southern route into Asia.

    Science.gov (United States)

    Reyes-Centeno, Hugo; Ghirotto, Silvia; Détroit, Florent; Grimaud-Hervé, Dominique; Barbujani, Guido; Harvati, Katerina

    2014-05-20

    Despite broad consensus on Africa as the main place of origin for anatomically modern humans, their dispersal pattern out of the continent continues to be intensely debated. In extant human populations, the observation of decreasing genetic and phenotypic diversity at increasing distances from sub-Saharan Africa has been interpreted as evidence for a single dispersal, accompanied by a series of founder effects. In such a scenario, modern human genetic and phenotypic variation was primarily generated through successive population bottlenecks and drift during a rapid worldwide expansion out of Africa in the Late Pleistocene. However, recent genetic studies, as well as accumulating archaeological and paleoanthropological evidence, challenge this parsimonious model. They suggest instead a "southern route" dispersal into Asia as early as the late Middle Pleistocene, followed by a separate dispersal into northern Eurasia. Here we test these competing out-of-Africa scenarios by modeling hypothetical geographical migration routes and assessing their correlation with neutral population differentiation, as measured by genetic polymorphisms and cranial shape variables of modern human populations from Africa and Asia. We show that both lines of evidence support a multiple-dispersals model in which Australo-Melanesian populations are relatively isolated descendants of an early dispersal, whereas other Asian populations are descended from, or highly admixed with, members of a subsequent migration event.

  2. Statistical Methods in Integrative Genomics

    Science.gov (United States)

    Richardson, Sylvia; Tseng, George C.; Sun, Wei

    2016-01-01

    Statistical methods in integrative genomics aim to answer important biology questions by jointly analyzing multiple types of genomic data (vertical integration) or aggregating the same type of data across multiple studies (horizontal integration). In this article, we introduce different types of genomic data and data resources, and then review statistical methods of integrative genomics, with emphasis on the motivation and rationale of these methods. We conclude with some summary points and future research directions. PMID:27482531

  3. Impact of Genomics Platform and Statistical Filtering on Transcriptional Benchmark Doses (BMD) and Multiple Approaches for Selection of Chemical Point of Departure (PoD).

    Directory of Open Access Journals (Sweden)

    A Francina Webster

    Many regulatory agencies are exploring ways to integrate toxicogenomic data into their chemical risk assessments. The major challenge lies in determining how to distill the complex data produced by high-content, multi-dose gene expression studies into quantitative information. It has been proposed that benchmark dose (BMD) values derived from toxicogenomics data be used as point of departure (PoD) values in chemical risk assessments. However, there is limited information regarding which genomics platforms are most suitable and how to select appropriate PoD values. In this study, we compared BMD values modeled from RNA sequencing-, microarray-, and qPCR-derived gene expression data from a single study, and explored multiple approaches for selecting a single PoD from these data. The strategies evaluated include several that do not require prior mechanistic knowledge of the compound for selection of the PoD, thus providing approaches for assessing data-poor chemicals. We used RNA extracted from the livers of female mice exposed to non-carcinogenic (0, 2 mg/kg/day, mkd) and carcinogenic (4, 8 mkd) doses of furan for 21 days. We show that transcriptional BMD values were consistent across technologies and highly predictive of the two-year cancer bioassay-based PoD. We also demonstrate that filtering data based on statistically significant changes in gene expression prior to BMD modeling creates more conservative BMD values. Taken together, this case study on mice exposed to furan demonstrates that high-content toxicogenomics studies produce robust data for BMD modelling that are minimally affected by inter-technology variability and highly predictive of cancer-based PoD doses.

  4. Use of Multiple Sequencing Technologies To Produce a High-Quality Genome of the Fungus Pseudogymnoascus destructans, the Causative Agent of Bat White-Nose Syndrome.

    Science.gov (United States)

    Drees, Kevin P; Palmer, Jonathan M; Sebra, Robert; Lorch, Jeffrey M; Chen, Cynthia; Wu, Cheng-Cang; Bok, Jin Woo; Keller, Nancy P; Blehert, David S; Cuomo, Christina A; Lindner, Daniel L; Foster, Jeffrey T

    2016-06-30

    White-nose syndrome has recently emerged as one of the most devastating wildlife diseases recorded, causing widespread mortality in numerous bat species throughout eastern North America. Here, we present an improved reference genome of the fungal pathogen Pseudogymnoascus destructans for use in comparative genomic studies. Copyright © 2016 Drees et al.

  5. Use of multiple sequencing technologies to produce a high-quality genome of the fungus Pseudogymnoascus destructans, the causative agent of bat White-Nose syndrome

    Science.gov (United States)

    Drees, Kevin P.; Palmer, Jonathan M.; Sebra, Robert; Lorch, Jeffrey M.; Chen, Cynthia; Wu, ChengCang; Bok, Jin Woo; Keller, Nancy F.; Blehert, David; Cuomo, Christina A.; Linder, Daniel L.; Foster, Jeffrey T.

    2016-01-01

    White-Nose syndrome has recently emerged as one of the most devastating wildlife diseases recorded, causing widespread mortality in numerous bat species throughout eastern North America. Here, we present an improved reference genome of the fungal pathogen Pseudogymnoascus destructans for use in comparative genomic studies.

  6. A Large-Scale Multi-ancestry Genome-wide Study Accounting for Smoking Behavior Identifies Multiple Significant Loci for Blood Pressure

    DEFF Research Database (Denmark)

    Sung, Yun J; Winkler, Thomas W; de Las Fuentes, Lisa

    2018-01-01

    Genome-wide association analysis advanced understanding of blood pressure (BP), a major risk factor for vascular conditions such as coronary heart disease and stroke. Accounting for smoking behavior may help identify BP loci and extend our knowledge of its genetic architecture. We performed genom...

  7. Employment of Near Full-Length Ribosome Gene TA-Cloning and Primer-Blast to Detect Multiple Species in a Natural Complex Microbial Community Using Species-Specific Primers Designed with Their Genome Sequences.

    Science.gov (United States)

    Zhang, Huimin; He, Hongkui; Yu, Xiujuan; Xu, Zhaohui; Zhang, Zhizhou

    2016-11-01

    It remains an unsolved problem to quantify a natural microbial community by rapidly and conveniently measuring multiple species with functional significance. Most widely used high throughput next-generation sequencing methods can only generate information mainly for genus-level taxonomic identification and quantification, and detection of multiple species in a complex microbial community is still heavily dependent on approaches based on near full-length ribosome RNA gene or genome sequence information. In this study, we used near full-length rRNA gene library sequencing plus Primer-Blast to design species-specific primers based on whole microbial genome sequences. The primers were intended to be specific at the species level within relevant microbial communities, i.e., a defined genomics background. The primers were tested with samples collected from the Daqu (also called fermentation starters) and pit mud of a traditional Chinese liquor production plant. Sixteen pairs of primers were found to be suitable for identification of individual species. Among them, seven pairs were chosen to measure the abundance of microbial species through quantitative PCR. The combination of near full-length ribosome RNA gene library sequencing and Primer-Blast may represent a broadly useful protocol to quantify multiple species in complex microbial population samples with species-specific primers.

  8. Comparative Genomic Analyses of Multiple Pseudomonas Strains Infecting Corylus avellana Trees Reveal the Occurrence of Two Genetic Clusters with Both Common and Distinctive Virulence and Fitness Traits

    Science.gov (United States)

    Marcelletti, Simone; Scortichini, Marco

    2015-01-01

    The European hazelnut (Corylus avellana) is threatened in Europe by several pseudomonads which cause symptoms ranging from twig dieback to tree death. A comparison of the draft genomes of nine Pseudomonas strains isolated from symptomatic C. avellana trees was performed to identify common and distinctive genomic traits. The thorough assessment of genetic relationships among the strains revealed two clearly distinct clusters: P. avellanae and P. syringae, the latter including the pathovars avellanae, coryli and syringae. Between these two clusters, no recombination event was found. A genomic island of approximately 20 kb, containing the hrp/hrc type III secretion system gene cluster, was found to be present without any genomic difference in all nine pseudomonads. The type III secretion system effector repertoires were remarkably different in the two groups, with P. avellanae showing a higher number of effectors. Homologue genes of the antimetabolite mangotoxin and ice nucleation activity clusters were found solely in all P. syringae pathovar strains, whereas the siderophore yersiniabactin was only present in P. avellanae. All nine strains have genes coding for pectic enzymes and sucrose metabolism. By contrast, they do not have genes coding for indoleacetic acid and anti-insect toxin. Collectively, this study reveals that genomically different Pseudomonas can converge on the same host plant by suppressing the host defence mechanisms with the use of different virulence weapons. The integration into their genomes of a horizontally acquired genomic island could play a fundamental role in their evolution, perhaps giving them the ability to exploit new ecological niches. PMID:26147218

  9. Meta-genome-wide association studies identify a locus on chromosome 1 and multiple variants in the MHC region for serum C-peptide in type 1 diabetes.

    Science.gov (United States)

    Roshandel, Delnaz; Gubitosi-Klug, Rose; Bull, Shelley B; Canty, Angelo J; Pezzolesi, Marcus G; King, George L; Keenan, Hillary A; Snell-Bergeon, Janet K; Maahs, David M; Klein, Ronald; Klein, Barbara E K; Orchard, Trevor J; Costacou, Tina; Weedon, Michael N; Oram, Richard A; Paterson, Andrew D

    2018-05-01

    The aim of this study was to identify genetic variants associated with beta cell function in type 1 diabetes, as measured by serum C-peptide levels, through meta-genome-wide association studies (meta-GWAS). We performed a meta-GWAS to combine the results from five studies in type 1 diabetes with cross-sectionally measured stimulated, fasting or random C-peptide levels, including 3479 European participants. The p values across studies were combined, taking into account sample size and direction of effect. We also performed separate meta-GWAS for stimulated (n = 1303), fasting (n = 2019) and random (n = 1497) C-peptide levels. In the meta-GWAS for stimulated/fasting/random C-peptide levels, a SNP on chromosome 1, rs559047 (Chr1:238753916, T>A, minor allele frequency [MAF] 0.24-0.26), was associated with C-peptide (p = 4.13 × 10^-8), meeting the genome-wide significance threshold (p C>T, MAF 0.07-0.10, p = 8.43 × 10^-8). In the stimulated C-peptide meta-GWAS, rs61211515 (Chr6:30100975, T/-, MAF 0.17-0.19) in the MHC region was associated with stimulated C-peptide (β [SE] = -0.39 [0.07], p = 9.72 × 10^-8). rs61211515 was also associated with the rate of stimulated C-peptide decline over time in a subset of individuals (n = 258) with annual repeated measures for up to 6 years (p = 0.02). In the meta-GWAS of random C-peptide, another MHC region SNP, rs3135002 (Chr6:32668439, C>A, MAF 0.02-0.06), was associated with C-peptide (p = 3.49 × 10^-8). Conditional analyses suggested that the three identified variants in the MHC region were independent of each other. rs9260151 and rs3135002 have been associated with type 1 diabetes, whereas rs559047 and rs61211515 have not been associated with a risk of developing type 1 diabetes. We identified a locus on chromosome 1 and multiple variants in the MHC region, at least some of which were distinct from type 1 diabetes risk loci, that were associated with C

  10. A genome-wide association meta-analysis of circulating sex hormone-binding globulin reveals multiple Loci implicated in sex steroid hormone regulation.

    Directory of Open Access Journals (Sweden)

    Andrea D Coviello

    Sex hormone-binding globulin (SHBG) is a glycoprotein responsible for the transport and biologic availability of sex steroid hormones, primarily testosterone and estradiol. SHBG has been associated with chronic diseases including type 2 diabetes (T2D) and with hormone-sensitive cancers such as breast and prostate cancer. We performed a genome-wide association study (GWAS) meta-analysis of 21,791 individuals from 10 epidemiologic studies and validated these findings in 7,046 individuals in an additional six studies. We identified twelve genomic regions (SNPs) associated with circulating SHBG concentrations. Loci near the identified SNPs included SHBG (rs12150660, 17p13.1, p = 1.8 × 10^-106), PRMT6 (rs17496332, 1p13.3, p = 1.4 × 10^-11), GCKR (rs780093, 2p23.3, p = 2.2 × 10^-16), ZBTB10 (rs440837, 8q21.13, p = 3.4 × 10^-09), JMJD1C (rs7910927, 10q21.3, p = 6.1 × 10^-35), SLCO1B1 (rs4149056, 12p12.1, p = 1.9 × 10^-08), NR2F2 (rs8023580, 15q26.2, p = 8.3 × 10^-12), ZNF652 (rs2411984, 17q21.32, p = 3.5 × 10^-14), TDGF3 (rs1573036, Xq22.3, p = 4.1 × 10^-14), LHCGR (rs10454142, 2p16.3, p = 1.3 × 10^-07), BAIAP2L1 (rs3779195, 7q21.3, p = 2.7 × 10^-08), and UGT2B15 (rs293428, 4q13.2, p = 5.5 × 10^-06). These genes encompass multiple biologic pathways, including hepatic function, lipid metabolism, carbohydrate metabolism and T2D, androgen and estrogen receptor function, epigenetic effects, and the biology of sex steroid hormone-responsive cancers including breast and prostate cancer. We found evidence of sex-differentiated genetic influences on SHBG. In a sex-specific GWAS, the locus 4q13.2-UGT2B15 was significant in men only (men p = 2.5 × 10^-08, women p = 0.66, heterogeneity p = 0.003). Additionally, three loci showed strong sex-differentiated effects: 17p13.1-SHBG and Xq22.3-TDGF3 were stronger in men, whereas 8q21.12-ZBTB10 was stronger in women. Conditional analyses identified additional signals at the SHBG gene that together almost double the proportion

  11. Genome-wide comparison and taxonomic relatedness of multiple Xylella fastidiosa strains reveal the occurrence of three subspecies and a new Xylella species.

    Science.gov (United States)

    Marcelletti, Simone; Scortichini, Marco

    2016-10-01

    A total of 21 Xylella fastidiosa strains were assessed by comparing their genomes to infer their taxonomic relationships. Whole-genome-based average nucleotide identity and tetranucleotide frequency correlation coefficient analyses were performed. In addition, a consensus tree based on comparisons of 956 core gene families, a genome-wide phylogenetic tree and a Neighbor-net network were constructed with 820,088 nucleotides (i.e., approximately 30-33% of the entire X. fastidiosa genome). All approaches revealed the occurrence of three well-demarcated genetic clusters that represent X. fastidiosa subspecies fastidiosa, multiplex and pauca, with the latter appearing to diverge. We suggest that the proposed but never formally described subspecies 'sandyi' and 'morus' are instead members of the subspecies fastidiosa. These analyses support the view that the Xylella strain isolated from Pyrus pyrifolia in Taiwan is likely to be a new species. A widely used multilocus sequence typing analysis yielded conflicting results.

  12. A Large-Scale Multi-ancestry Genome-wide Study Accounting for Smoking Behavior Identifies Multiple Significant Loci for Blood Pressure

    NARCIS (Netherlands)

    Sung, Yun J.; Winkler, Thomas W.; de las Fuentes, Lisa; Bentley, Amy R.; Brown, Michael R.; Kraja, Aldi T.; Schwander, Karen; Ntalla, Ioanna; Guo, Xiuqing; Franceschini, Nora; Lu, Yingchang; Cheng, Ching-Yu; Sim, Xueling; Vojinovic, Dina; Marten, Jonathan; Musani, Solomon K.; Li, Changwei; Feitosa, Mary F.; Kilpelainen, Tuomas O.; Richard, Melissa A.; Noordam, Raymond; Aslibekyan, Stella; Aschard, Hugues; Bartz, Traci M.; Dorajoo, Rajkumar; Liu, Yongmei; Manning, Alisa K.; Rankinen, Tuomo; Smith, Albert Vernon; Tajuddin, Salman M.; Tayo, Bamidele O.; Warren, Helen R.; Zhao, Wei; Zhou, Yanhua; Matoba, Nana; Sofer, Tamar; Alver, Maris; Amini, Marzyeh; Boissel, Mathilde; Chai, Jin Fang; Chen, Xu; Divers, Jasmin; Gandin, Ilaria; Gao, Chuan; Giulianini, Franco; Goel, Anuj; Harris, Sarah E.; Hartwig, Fernando Pires; Horimoto, Andrea R. V. R.; Hsu, Fang-Chi; Jackson, Anne U.; Kahonen, Mika; Kasturiratne, Anuradhani; Kuhnel, Brigitte; Leander, Karin; Lee, Wen-Jane; Lin, Keng-Hung; Luan, Jian' an; McKenzie, Colin A.; He Meian,; Nelson, Christopher P.; Rauramaa, Rainer; Schupf, Nicole; Scott, Robert A.; Sheu, Wayne H. H.; Stancakova, Alena; Takeuchi, Fumihiko; van der Most, Peter J.; Varga, Tibor V.; Wang, Heming; Wang, Yajuan; Ware, Erin B.; Weiss, Stefan; Wen, Wanqing; Yanek, Lisa R.; Zhang, Weihua; Zhao, Jing Hua; Afaq, Saima; Alfred, Tamuno; Amin, Najaf; Arking, Dan; Aung, Tin; Barr, R. Graham; Bielak, Lawrence F.; Boerwinkle, Eric; Bottinger, Erwin P.; Braund, Peter S.; Brody, Jennifer A.; Broeckel, Ulrich; Cabrera, Claudia P.; Cade, Brian; Yu Caizheng,; Campbell, Archie; Canouil, Mickael; Chakravarti, Aravinda; Chauhan, Ganesh; Christensen, Kaare; Cocca, Massimiliano; Collins, Francis S.; Connell, John M.; de Mutsert, Renee; de Silva, H. 
Janaka; Debette, Stephanie; Dorr, Marcus; Duan, Qing; Eaton, Charles B.; Ehret, Georg; Evangelou, Evangelos; Faul, Jessica D.; Fisher, Virginia A.; Forouhi, Nita G.; Franco, Oscar H.; Friedlander, Yechiel; Gao, He; Gigante, Bruna; Graff, Misa; Gu, C. Charles; Gu, Dongfeng; Gupta, Preeti; Hagenaars, Saskia P.; Harris, Tamara B.; He, Jiang; Heikkinen, Sami; Heng, Chew-Kiat; Hirata, Makoto; Hofman, Albert; Howard, Barbara V.; Hunt, Steven; Irvin, Marguerite R.; Jia, Yucheng; Joehanes, Roby; Justice, Anne E.; Katsuya, Tomohiro; Kaufman, Joel; Kerrison, Nicola D.; Khor, Chiea Chuen; Koh, Woon-Puay; Koistinen, Heikki A.; Komulainen, Pirjo; Kooperberg, Charles; Krieger, Jose E.; Kubo, Michiaki; Kuusisto, Johanna; Langefeld, Carl D.; Langenberg, Claudia; Launer, Lenore J.; Lehne, Benjamin; Lewis, Cora E.; Li, Yize; Lim, Sing Hui; Lin, Shiow; Liu, Ching-Ti; Liu, Jianjun; Liu, Jingmin; Liu, Kiang; Liu, Yeheng; Loh, Marie; Lohman, Kurt K.; Long, Jirong; Louie, Tin; Magi, Reedik; Mahajan, Anubha; Meitinger, Thomas; Metspalu, Andres; Milani, Lili; Momozawa, Yukihide; Morris, Andrew P.; Mosley, Thomas H.; Munson, Peter; Murray, Alison D.; Nalls, Mike A.; Nasri, Ubaydah; Norris, Jill M.; North, Kari; Ogunniyi, Adesola; Padmanabhan, Sandosh; Palmas, Walter R.; Palmer, Nicholette D.; Pankow, James S.; Pedersen, Nancy L.; Peters, Annette; Peyser, Patricia A.; Polasek, Ozren; Raitakari, Olli T.; Renstrom, Frida; Rice, Treva K.; Ridker, Paul M.; Robino, Antonietta; Robinson, Jennifer G.; Rose, Lynda M.; Rudan, Igor; Sabanayagam, Charumathi; Salako, Babatunde L.; Sandow, Kevin; Schmidt, Carsten O.; Schreiner, Pamela J.; Scott, William R.; Seshadri, Sudha; Sever, Peter; Sitlani, Colleen M.; Smith, Jennifer A.; Snieder, Harold; Starr, John M.; Strauch, Konstantin; Tang, Hua; Taylor, Kent D.; Teo, Yik Ying; Tham, Yih Chung; Ultterlinden, Andre G.; Waldenberger, Melanie; Wang, Lihua; Wang, Ya X.; Bin Wei, Wen; Williams, Christine; Wilson, Gregory; Wojczynski, Mary K.; Yao, Jie; Yuan, 
Jian-Min; Zonderman, Alan B.; Becker, Diane M.; Boehnke, Michael; Bowden, Donald W.; Chambers, John C.; Chen, Yii-Der Ida; de Faire, Ulf; Deary, Ian J.; Esko, Tonu; Farrall, Martin; Forrester, Terrence; Franks, Paul W.; Freedman, Barry I.; Froguel, Philippe; Gasparini, Paolo; Gieger, Christian; Horta, Bernardo Lessa; Hung, Yi-Jen; Jonas, Jost B.; Kato, Norihiro; Kooner, Jaspal S.; Laakso, Markku; Lehtimaki, Terho; Liang, Kae-Woei; Magnusson, Patrik K. E.; Newman, Anne B.; Oldehinkel, Albertine J.; Pereira, Alexandre C.; Redline, Susan; Rettig, Rainer; Samani, Nilesh J.; Scott, James; Shu, Xiao-Ou; van der Harst, Pim; Wagenknecht, Lynne E.; Wareham, Nicholas J.; Watkins, Hugh; Weir, David R.; Wickremasinghe, Ananda R.; Wu, Tangchun; Zheng, Wei; Kamatani, Yoichiro; Laurie, Cathy C.; Bouchard, Claude; Cooper, Richard S.; Evans, Michele K.; Gudnason, Vilmundur; Kardia, Sharon L. R.; Kritchevsky, Stephen B.; Levy, Daniel; O'Connell, Jeff R.; Psaty, Bruce M.; van Dam, Rob M.; Sims, Mario; Arnett, Donna K.; Mook-Kanamori, Dennis O.; Kelly, Tanika N.; Fox, Ervin R.; Hayward, Caroline; Fornage, Myriam; Rotimi, Charles N.; Province, Michael A.; van Duijn, Cornelia M.; Tai, E. Shyong; Wong, Tien Yin; Loos, Ruth J. F.; Reiner, Alex P.; Rotter, Jerome I.; Zhu, Xiaofeng; Bierut, Laura J.; Gauderman, W. James; Caulfield, Mark J.; Elliott, Paul; Rice, Kenneth; Munroe, Patricia B.; Morrison, Alanna C.; Cupples, L. Adrienne; Rao, Dabeeru C.; Chasman, Daniel I.; Study, Lifelines Cohort

    2018-01-01

    Genome-wide association analysis advanced understanding of blood pressure (BP), a major risk factor for vascular conditions such as coronary heart disease and stroke. Accounting for smoking behavior may help identify BP loci and extend our knowledge of its genetic architecture. We performed

  13. The Sequenced Angiosperm Genomes and Genome Databases.

    Science.gov (United States)

    Chen, Fei; Dong, Wei; Zhang, Jiawei; Guo, Xinyue; Chen, Junhao; Wang, Zhengjia; Lin, Zhenguo; Tang, Haibao; Zhang, Liangsheng

    2018-01-01

    Angiosperms, the flowering plants, provide the essential resources for human life, such as food, energy, oxygen, and materials. They also promoted the evolution of humans, animals, and the planet Earth. Despite the numerous advances in genome reports and sequencing technologies, no review covers all the released angiosperm genomes and the genome databases for data sharing. Based on the rapid advances and innovations in database reconstruction in the last few years, here we provide a comprehensive review of three major types of angiosperm genome databases, including databases for a single species, for a specific angiosperm clade, and for multiple angiosperm species. The scope, tools, and data of each type of database and their features are concisely discussed. The genome databases for a single species or a clade of species are especially popular among specific groups of researchers, while a timely updated comprehensive database is more powerful for addressing major scientific mysteries at the genome scale. Considering the low coverage of flowering plants in any available database, we propose construction of a comprehensive database to facilitate large-scale comparative studies of angiosperm genomes and to promote collaborative studies of important questions in plant biology.

  14. Genomics and fish adaptation

    Directory of Open Access Journals (Sweden)

    Agostinho Antunes

    2015-12-01

    The completion of the human genome sequencing in 2003 opened a new perspective on the importance of whole genome sequencing projects, and currently multiple species are having their genomes completely sequenced, from simple organisms, such as bacteria, to more complex taxa, such as mammals. The voluminous sequencing data generated across multiple organisms also provide the framework to better understand the genetic makeup of such species and related ones, allowing exploration of the genetic changes underlying the evolution of diverse phenotypic traits. Here, recent results from our group, retrieved from comparative evolutionary genomic analyses of varied fish species, will be considered to exemplify how gene novelty and gene enhancement by positive selection might have been determinant in the success of adaptive radiations into diverse habitats and lifestyles.

  15. Genome-wide association study of multiple congenital heart disease phenotypes identifies a susceptibility locus for atrial septal defect at chromosome 4p16

    Science.gov (United States)

    Cordell, Heather J.; Bentham, Jamie; Topf, Ana; Zelenika, Diana; Heath, Simon; Mamasoula, Chrysovalanto; Cosgrove, Catherine; Blue, Gillian; Granados-Riveron, Javier; Setchfield, Kerry; Thornborough, Chris; Breckpot, Jeroen; Soemedi, Rachel; Martin, Ruairidh; Rahman, Thahira J.; Hall, Darroch; van Engelen, Klaartje; Moorman, Antoon F.M.; Zwinderman, Aelko H; Barnett, Phil; Koopmann, Tamara T.; Adriaens, Michiel E.; Varro, Andras; George, Alfred L.; dos Remedios, Christobal; Bishopric, Nanette H.; Bezzina, Connie R.; O’Sullivan, John; Gewillig, Marc; Bu’Lock, Frances A.; Winlaw, David; Bhattacharya, Shoumo; Devriendt, Koen; Brook, J. David; Mulder, Barbara J.M.; Mital, Seema; Postma, Alex V.; Lathrop, G. Mark; Farrall, Martin; Goodship, Judith A.; Keavney, Bernard D.

    2013-01-01

    We carried out a genome-wide association study (GWAS) of congenital heart disease (CHD). Our discovery cohort comprised 1,995 CHD cases and 5,159 controls, and included patients from each of the three major clinical CHD categories (septal, obstructive and cyanotic defects). When all CHD phenotypes were considered together, no regions achieved genome-wide significant association. However, a region on chromosome 4p16, adjacent to the MSX1 and STX18 genes, was associated (P=9.5×10^-7) with the risk of ostium secundum atrial septal defect (ASD) in the discovery cohort (N=340 cases), and this was replicated in a further 417 ASD cases and 2,520 controls (replication P=5.0×10^-5; OR in replication cohort 1.40 [95% CI 1.19-1.65]; combined P=2.6×10^-10). Genotype accounted for ~9% of the population attributable risk of ASD. PMID:23708191

  16. Environmental Response and Genomic Regions Correlated with Rice Root Growth and Yield under Drought in the OryzaSNP Panel across Multiple Study Systems.

    Directory of Open Access Journals (Sweden)

    Len J Wade

    The rapid progress in rice genotyping must be matched by advances in phenotyping. A better understanding of genetic variation in rice for drought response, root traits, and practical methods for studying them are needed. In this study, the OryzaSNP set (20 diverse genotypes that have been genotyped for SNP markers) was phenotyped in a range of field and container studies to study the diversity of rice root growth and response to drought. Of the root traits measured across more than 20 root experiments, root dry weight showed the most stable genotypic performance across studies. The environment (E) component had the strongest effect on yield and root traits. We identified genomic regions correlated with root dry weight, percent deep roots, maximum root depth, and grain yield based on a correlation analysis with the phenotypes and aus, indica, or japonica introgression regions using the SNP data. Two genomic regions were identified as hot spots in which root traits and grain yield were co-located: on chromosome 1 (39.7-40.7 Mb) and on chromosome 8 (20.3-21.9 Mb). Across experiments, the soil type/growth medium showed more correlations with plant growth than the container dimensions. Although the correlations among studies and genetic co-location of root traits from a range of study systems point to their potential utility to represent responses in field studies, the best correlations were observed when the two setups had some similar properties. Due to the co-location of the identified genomic regions (from introgression block analysis) with QTL for a number of previously reported root and drought traits, these regions are good candidates for detailed characterization to contribute to understanding rice improvement for response to drought. This study also highlights the utility of characterizing a small set of 20 genotypes for root growth, drought response, and related genomic regions.

  17. Prosecutor: parameter-free inference of gene function for prokaryotes using DNA microarray data, genomic context and multiple gene annotation sources

    Directory of Open Access Journals (Sweden)

    van Hijum Sacha AFT

    2008-10-01

    Background: Despite a plethora of functional genomic efforts, the function of many genes in sequenced genomes remains unknown. The increasing amount of microarray data for many species allows employing the guilt-by-association principle to predict function on a large scale: genes exhibiting similar expression patterns are more likely to participate in shared biological processes. Results: We developed Prosecutor, an application that enables researchers to rapidly infer gene function based on available gene expression data and functional annotations. Our parameter-free functional prediction method uses a sensitive algorithm to achieve a high association rate of linking genes with unknown function to annotated genes. Furthermore, Prosecutor utilizes additional biological information such as genomic context and known regulatory mechanisms that are specific for prokaryotes. We analyzed publicly available transcriptome data sets and used literature sources to validate putative functions suggested by Prosecutor. We supply the complete results of our analysis for 11 prokaryotic organisms on a dedicated website. Conclusion: The Prosecutor software and supplementary datasets available at http://www.prosecutor.nl allow researchers working on any of the analyzed organisms to quickly identify the putative functions of their genes of interest. A de novo analysis allows new organisms to be studied.

  18. Chromatin dynamics in genome stability

    DEFF Research Database (Denmark)

    Nair, Nidhi; Shoaib, Muhammad; Sørensen, Claus Storgaard

    2017-01-01

    Genomic DNA is compacted into chromatin through packaging with histone and non-histone proteins. Importantly, DNA accessibility is dynamically regulated to ensure genome stability. This is exemplified in the response to DNA damage where chromatin relaxation near genomic lesions serves to promote...... access of relevant enzymes to specific DNA regions for signaling and repair. Furthermore, recent data highlight genome maintenance roles of chromatin through the regulation of endogenous DNA-templated processes including transcription and replication. Here, we review research that shows the importance...... of chromatin structure regulation in maintaining genome integrity by multiple mechanisms including facilitating DNA repair and directly suppressing endogenous DNA damage....

  19. Clusters of orthologous genes for 41 archaeal genomes and implications for evolutionary genomics of archaea

    OpenAIRE

    Wolf Yuri I; Novichkov Pavel S; Sorokin Alexander V; Makarova Kira S; Koonin Eugene V

    2007-01-01

    Background: An evolutionary classification of genes from sequenced genomes that distinguishes between orthologs and paralogs is indispensable for genome annotation and evolutionary reconstruction. Shortly after multiple genome sequences of bacteria, archaea, and unicellular eukaryotes became available, an attempt at such a classification was implemented in Clusters of Orthologous Groups of proteins (COGs). Rapid accumulation of genome sequences creates opportunities for refining COGs ...

  20. Multiple Genes Related to Muscle Identified through a Joint Analysis of a Two-stage Genome-wide Association Study for Racing Performance of 1,156 Thoroughbreds

    Directory of Open Access Journals (Sweden)

    Dong-Hyun Shin

    2015-06-01

    Thoroughbred, a relatively recent horse breed, is best known for its use in horse racing. Although myostatin (MSTN) variants have been reported to be highly associated with horse racing performance, the trait is more likely to be polygenic in nature. The purpose of this study was to identify genetic variants strongly associated with racing performance by using estimated breeding value (EBV) for race time as a phenotype. We conducted a two-stage genome-wide association study to search for genetic variants associated with the EBV. In the first stage of the genome-wide association study, a relatively large number of markers (~54,000 single-nucleotide polymorphisms, SNPs) were evaluated in a small number of samples (240 horses). In the second stage, a relatively small number of markers identified to have large effects (170 SNPs) were evaluated in a much larger number of samples (1,156 horses). We also validated the SNPs related to MSTN known to have large effects on racing performance and found significant associations in the stage two analysis, but not in stage one. We identified 28 significant SNPs related to 17 genes. Among these, six genes have a function related to myogenesis and five genes are involved in muscle maintenance. To our knowledge, these genes are newly reported for genetic association with racing performance of Thoroughbreds. This complements recent horse genome-wide association studies of racing performance that identified other SNPs and genes as the most significant variants. These results will help to expand our knowledge of the polygenic nature of racing performance in Thoroughbreds.

  1. Whole-genome comparison of two Campylobacter jejuni isolates of the same sequence type reveals multiple loci of different ancestral lineage.

    Directory of Open Access Journals (Sweden)

    Patrick J Biggs

    Campylobacter jejuni ST-474 is the most important human enteric pathogen in New Zealand, and yet this genotype is rarely found elsewhere in the world. Insight into the evolution of this organism was gained by a whole genome comparison of two ST-474, flaA SVR-14 isolates and other available C. jejuni isolates and genomes. The two isolates were collected from different sources, human (H22082) and retail poultry (P110b), at the same time and from the same geographical location. Solexa sequencing of each isolate resulted in ~1.659 Mb (H22082) and ~1.656 Mb (P110b) of assembled sequences within 28 (H22082) and 29 (P110b) contigs. We analysed 1502 genes for which we had sequences within both ST-474 isolates and within at least one of 11 C. jejuni reference genomes. Although 94.5% of genes were identical between the two ST-474 isolates, we identified 83 genes that differed by at least one nucleotide, including 55 genes with non-synonymous substitutions. These covered 101 kb and contained 672 point differences. We inferred that 22 (3.3%) of these differences were due to mutation and 650 (96.7%) were imported via recombination. Our analysis estimated 38 recombinant breakpoints within these 83 genes, which correspond to recombination events affecting at least 19 loci regions and give a tract length estimate of ~2 kb. This includes a ~12 kb region displaying non-homologous recombination in one of the ST-474 genomes, with the insertion of two genes, including ykgC, a putative oxidoreductase, and a conserved hypothetical protein of unknown function. Furthermore, our analysis indicates that the source of this recombined DNA is more likely to have come from C. jejuni strains that are more closely related to ST-474. This suggests that the rates of recombination and mutation are similar in order of magnitude, but that recombination has been much more important for generating divergence between the two ST-474 isolates.

  2. Analysis of the Genome and Mobilome of a Dissimilatory Arsenate Reducing Aeromonas sp. O23A Reveals Multiple Mechanisms for Heavy Metal Resistance and Metabolism

    Directory of Open Access Journals (Sweden)

    Witold Uhrynowski

    2017-05-01

    Aeromonas spp. are among the most ubiquitous microorganisms, as they have been isolated from different environmental niches including waters, soil, as well as wounds and digestive tracts of poikilothermic animals and humans. Although much attention has been paid to the pathogenicity of Aeromonads, the role of these bacteria in environmentally important processes, such as transformation of heavy metals, remains to be discovered. Therefore, the aim of this study was a detailed genomic characterization of Aeromonas sp. O23A, the first representative of this genus capable of dissimilatory arsenate reduction. The strain was isolated from microbial mats from the Zloty Stok mine (SW Poland), an environment strongly contaminated with arsenic. Previous physiological studies indicated that O23A may be involved in both mobilization and immobilization of this metalloid in the environment. To discover the molecular basis of the mechanisms behind the observed abilities, the genome of O23A (∼5.0 Mbp) was sequenced and annotated, and genes for arsenic respiration, heavy metal resistance (hmr) and other phenotypic traits, including siderophore production, were identified. The functionality of the indicated gene modules was assessed in a series of minimal inhibitory concentration analyses for various metals and metalloids, as well as mineral dissolution experiments. Interestingly, comparative analyses revealed that O23A is related to a fish pathogen Aeromonas salmonicida subsp. salmonicida A449 which, however, does not carry genes for arsenic respiration. This indicates that the dissimilatory arsenate reduction ability may have been lost during genome reduction in pathogenic strains, or acquired through horizontal gene transfer. Therefore, particular emphasis was placed upon the mobilome of O23A, consisting of four plasmids, a phage, and numerous transposable elements, which may play a role in the dissemination of hmr and arsenic metabolism genes in the

  3. Analysis of the Genome and Mobilome of a Dissimilatory Arsenate Reducing Aeromonas sp. O23A Reveals Multiple Mechanisms for Heavy Metal Resistance and Metabolism.

    Science.gov (United States)

    Uhrynowski, Witold; Decewicz, Przemyslaw; Dziewit, Lukasz; Radlinska, Monika; Krawczyk, Pawel S; Lipinski, Leszek; Adamska, Dorota; Drewniak, Lukasz

    2017-01-01

    Aeromonas spp. are among the most ubiquitous microorganisms, as they have been isolated from different environmental niches including waters, soil, as well as wounds and digestive tracts of poikilothermic animals and humans. Although much attention has been paid to the pathogenicity of Aeromonads, the role of these bacteria in environmentally important processes, such as transformation of heavy metals, remains to be discovered. Therefore, the aim of this study was a detailed genomic characterization of Aeromonas sp. O23A, the first representative of this genus capable of dissimilatory arsenate reduction. The strain was isolated from microbial mats from the Zloty Stok mine (SW Poland), an environment strongly contaminated with arsenic. Previous physiological studies indicated that O23A may be involved in both mobilization and immobilization of this metalloid in the environment. To discover the molecular basis of the mechanisms behind the observed abilities, the genome of O23A (∼5.0 Mbp) was sequenced and annotated, and genes for arsenic respiration, heavy metal resistance (hmr) and other phenotypic traits, including siderophore production, were identified. The functionality of the indicated gene modules was assessed in a series of minimal inhibitory concentration analyses for various metals and metalloids, as well as mineral dissolution experiments. Interestingly, comparative analyses revealed that O23A is related to a fish pathogen Aeromonas salmonicida subsp. salmonicida A449 which, however, does not carry genes for arsenic respiration. This indicates that the dissimilatory arsenate reduction ability may have been lost during genome reduction in pathogenic strains, or acquired through horizontal gene transfer. Therefore, particular emphasis was placed upon the mobilome of O23A, consisting of four plasmids, a phage, and numerous transposable elements, which may play a role in the dissemination of hmr and arsenic metabolism genes in the environment. The obtained

  4. Analysis of the Genome and Mobilome of a Dissimilatory Arsenate Reducing Aeromonas sp. O23A Reveals Multiple Mechanisms for Heavy Metal Resistance and Metabolism

    Science.gov (United States)

    Uhrynowski, Witold; Decewicz, Przemyslaw; Dziewit, Lukasz; Radlinska, Monika; Krawczyk, Pawel S.; Lipinski, Leszek; Adamska, Dorota; Drewniak, Lukasz

    2017-01-01

    Aeromonas spp. are among the most ubiquitous microorganisms, as they have been isolated from different environmental niches including waters, soil, as well as wounds and digestive tracts of poikilothermic animals and humans. Although much attention has been paid to the pathogenicity of Aeromonads, the role of these bacteria in environmentally important processes, such as transformation of heavy metals, remains to be discovered. Therefore, the aim of this study was a detailed genomic characterization of Aeromonas sp. O23A, the first representative of this genus capable of dissimilatory arsenate reduction. The strain was isolated from microbial mats from the Zloty Stok mine (SW Poland), an environment strongly contaminated with arsenic. Previous physiological studies indicated that O23A may be involved in both mobilization and immobilization of this metalloid in the environment. To discover the molecular basis of the mechanisms behind the observed abilities, the genome of O23A (∼5.0 Mbp) was sequenced and annotated, and genes for arsenic respiration, heavy metal resistance (hmr) and other phenotypic traits, including siderophore production, were identified. The functionality of the indicated gene modules was assessed in a series of minimal inhibitory concentration analyses for various metals and metalloids, as well as mineral dissolution experiments. Interestingly, comparative analyses revealed that O23A is related to a fish pathogen Aeromonas salmonicida subsp. salmonicida A449 which, however, does not carry genes for arsenic respiration. This indicates that the dissimilatory arsenate reduction ability may have been lost during genome reduction in pathogenic strains, or acquired through horizontal gene transfer. Therefore, particular emphasis was placed upon the mobilome of O23A, consisting of four plasmids, a phage, and numerous transposable elements, which may play a role in the dissemination of hmr and arsenic metabolism genes in the environment. The obtained

  5. Integrating genome-wide association study and expression quantitative trait loci data identifies multiple genes and gene set associated with neuroticism.

    Science.gov (United States)

    Fan, Qianrui; Wang, Wenyu; Hao, Jingcan; He, Awen; Wen, Yan; Guo, Xiong; Wu, Cuiyan; Ning, Yujie; Wang, Xi; Wang, Sen; Zhang, Feng

    2017-08-01

    Neuroticism is a fundamental personality trait with a significant genetic determinant. To identify novel susceptibility genes for neuroticism, we conducted an integrative analysis of genomic and transcriptomic data from a genome-wide association study (GWAS) and an expression quantitative trait locus (eQTL) study. GWAS summary data were derived from published studies of neuroticism, involving a total of 170,906 subjects. An eQTL dataset containing 927,753 eQTLs was obtained from an eQTL meta-analysis of 5311 samples. Integrative analysis of GWAS and eQTL data was conducted with summary data-based Mendelian randomization (SMR) analysis software. To identify neuroticism-associated gene sets, the SMR analysis results were further subjected to gene set enrichment analysis (GSEA). The gene set annotation dataset (containing 13,311 annotated gene sets) of the GSEA Molecular Signatures Database was used. SMR single gene analysis identified 6 significant genes for neuroticism, including MSRA (p value=2.27×10^-10), MGC57346 (p value=6.92×10^-7), BLK (p value=1.01×10^-6), XKR6 (p value=1.11×10^-6), C17ORF69 (p value=1.12×10^-6) and KIAA1267 (p value=4.00×10^-6). Gene set enrichment analysis observed significant association for the Chr8p23 gene set (false discovery rate=0.033). Our results provide novel clues for genetic mechanism studies of neuroticism. Copyright © 2017. Published by Elsevier Inc.

  6. Comparative Genomics Reveals High Genomic Diversity in the Genus Photobacterium.

    Science.gov (United States)

    Machado, Henrique; Gram, Lone

    2017-01-01

    Vibrionaceae is a large marine bacterial family, which can constitute up to 50% of the prokaryotic population in marine waters. Photobacterium is the second largest genus in the family and we used comparative genomics on 35 strains representing 16 of the 28 species described so far, to understand the genomic diversity present in the Photobacterium genus. Such understanding is important for ecophysiology studies of the genus. We used whole genome sequences to evaluate phylogenetic relationships using several analyses (16S rRNA, MLSA, fur, amino-acid usage, ANI), which allowed us to identify two misidentified strains. Genome analyses also revealed occurrence of higher and lower GC content clades, correlating with phylogenetic clusters. Pan- and core-genome analysis revealed the conservation of 25% of the genome throughout the genus, with a large and open pan-genome. The major source of genomic diversity could be traced to the smaller chromosome and plasmids. Several of the physiological traits studied in the genus did not correlate with phylogenetic data. Since horizontal gene transfer (HGT) is often suggested as a source of genetic diversity and a potential driver of genomic evolution in bacterial species, we looked into evidence of such in Photobacterium genomes. Genomic islands were the source of genomic differences between strains of the same species. Also, we found transposase genes and CRISPR arrays that suggest multiple encounters with foreign DNA. Presence of genomic exchange traits was widespread and abundant in the genus, suggesting a role in genomic evolution. The high genetic variability and indications of genetic exchange make it difficult to elucidate genome evolutionary paths and raise the awareness of the roles of foreign DNA in the genomic evolution of environmental organisms.

  7. Genome-Wide Meta-Analyses of Breast, Ovarian, and Prostate Cancer Association Studies Identify Multiple New Susceptibility Loci Shared by at Least Two Cancer Types

    DEFF Research Database (Denmark)

    Kar, Siddhartha P; Beesley, Jonathan; Amin Al Olama, Ali

    2016-01-01

    Breast, ovarian, and prostate cancers are hormone-related and may have a shared genetic basis, but this has not been investigated systematically by genome-wide association (GWA) studies. Meta-analyses combining the largest GWA meta-analysis data sets for these cancers totaling 112...... (rs200182588/9q31/SMC2; rs8037137/15q26/RCCD1), and two breast and prostate cancer risk loci (rs5013329/1p34/NSUN4; rs9375701/6q23/L3MBTL3). Index variants in five additional regions previously associated with only one cancer also showed clear association with a second cancer type. Cell......-type-specific expression quantitative trait locus and enhancer-gene interaction annotations suggested target genes with potential cross-cancer roles at the new loci. Pathway analysis revealed significant enrichment of death receptor signaling genes near loci with P cancer meta-analysis. SIGNIFICANCE...

  8. NCI collaborates with Multiple Myeloma Research Foundation

    Science.gov (United States)

    The National Cancer Institute (NCI) announced a collaboration with the Multiple Myeloma Research Foundation (MMRF) to incorporate MMRF's wealth of genomic and clinical data on the disease into the NCI Genomic Data Commons (GDC), a publicly available datab

  9. Extreme genomes

    OpenAIRE

    DeLong, Edward F

    2000-01-01

    The complete genome sequence of Thermoplasma acidophilum, an acid- and heat-loving archaeon, has recently been reported. Comparative genomic analysis of this 'extremophile' is providing new insights into the metabolic machinery, ecology and evolution of thermophilic archaea.

  10. Grass genomes

    OpenAIRE

    Bennetzen, Jeffrey L.; SanMiguel, Phillip; Chen, Mingsheng; Tikhonov, Alexander; Francki, Michael; Avramova, Zoya

    1998-01-01

    For the most part, studies of grass genome structure have been limited to the generation of whole-genome genetic maps or the fine structure and sequence analysis of single genes or gene clusters. We have investigated large contiguous segments of the genomes of maize, sorghum, and rice, primarily focusing on intergenic spaces. Our data indicate that much (>50%) of the maize genome is composed of interspersed repetitive DNAs, primarily nested retrotransposons that in...

  11. A Thousand Fly Genomes: An Expanded Drosophila Genome Nexus.

    Science.gov (United States)

    Lack, Justin B; Lange, Jeremy D; Tang, Alison D; Corbett-Detig, Russell B; Pool, John E

    2016-12-01

    The Drosophila Genome Nexus is a population genomic resource that provides D. melanogaster genomes from multiple sources. To facilitate comparisons across data sets, genomes are aligned using a common reference alignment pipeline which involves two rounds of mapping. Regions of residual heterozygosity, identity-by-descent, and recent population admixture are annotated to enable data filtering based on the user's needs. Here, we present a significant expansion of the Drosophila Genome Nexus, which brings the current data object to a total of 1,121 wild-derived genomes. New additions include 305 previously unpublished genomes from inbred lines representing six population samples in Egypt, Ethiopia, France, and South Africa, along with another 193 genomes added from recently published data sets. We also provide an aligned D. simulans genome to facilitate divergence comparisons. This improved resource will broaden the range of population genomic questions that can be addressed from multi-population allele frequencies and haplotypes in this model species. The larger set of genomes will also enhance the discovery of functionally relevant natural variation that exists within and between populations. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  12. Cancer genomics

    DEFF Research Database (Denmark)

    Norrild, Bodil; Guldberg, Per; Ralfkiær, Elisabeth Methner

    2007-01-01

    Almost all cells in the human body contain a complete copy of the genome with an estimated number of 25,000 genes. The sequences of these genes make up about three percent of the genome and comprise the inherited set of genetic information. The genome also contains information that determines whe...

  13. Genome-wide Meta-analyses of Breast, Ovarian and Prostate Cancer Association Studies Identify Multiple New Susceptibility Loci Shared by At Least Two Cancer Types

    Science.gov (United States)

    Kar, Siddhartha P.; Beesley, Jonathan; Al Olama, Ali Amin; Michailidou, Kyriaki; Tyrer, Jonathan; Kote-Jarai, ZSofia; Lawrenson, Kate; Lindstrom, Sara; Ramus, Susan J.; Thompson, Deborah J.; Kibel, Adam S.; Dansonka-Mieszkowska, Agnieszka; Michael, Agnieszka; Dieffenbach, Aida K.; Gentry-Maharaj, Aleksandra; Whittemore, Alice S.; Wolk, Alicja; Monteiro, Alvaro; Peixoto, Ana; Kierzek, Andrzej; Cox, Angela; Rudolph, Anja; Gonzalez-Neira, Anna; Wu, Anna H.; Lindblom, Annika; Swerdlow, Anthony; Ziogas, Argyrios; Ekici, Arif B.; Burwinkel, Barbara; Karlan, Beth Y.; Nordestgaard, Børge G.; Blomqvist, Carl; Phelan, Catherine; McLean, Catriona; Pearce, Celeste Leigh; Vachon, Celine; Cybulski, Cezary; Slavov, Chavdar; Stegmaier, Christa; Maier, Christiane; Ambrosone, Christine B.; Høgdall, Claus K.; Teerlink, Craig C.; Kang, Daehee; Tessier, Daniel C.; Schaid, Daniel J.; Stram, Daniel O.; Cramer, Daniel W.; Neal, David E.; Eccles, Diana; Flesch-Janys, Dieter; Velez Edwards, Digna R.; Wokozorczyk, Dominika; Levine, Douglas A.; Yannoukakos, Drakoulis; Sawyer, Elinor J.; Bandera, Elisa V.; Poole, Elizabeth M.; Goode, Ellen L.; Khusnutdinova, Elza; Høgdall, Estrid; Song, Fengju; Bruinsma, Fiona; Heitz, Florian; Modugno, Francesmary; Hamdy, Freddie C.; Wiklund, Fredrik; Giles, Graham G.; Olsson, Håkan; Wildiers, Hans; Ulmer, Hans-Ulrich; Pandha, Hardev; Risch, Harvey A.; Darabi, Hatef; Salvesen, Helga B.; Nevanlinna, Heli; Gronberg, Henrik; Brenner, Hermann; Brauch, Hiltrud; Anton-Culver, Hoda; Song, Honglin; Lim, Hui-Yi; McNeish, Iain; Campbell, Ian; Vergote, Ignace; Gronwald, Jacek; Lubiński, Jan; Stanford, Janet L.; Benítez, Javier; Doherty, Jennifer A.; Permuth, Jennifer B.; Chang-Claude, Jenny; Donovan, Jenny L.; Dennis, Joe; Schildkraut, Joellen M.; Schleutker, Johanna; Hopper, John L.; Kupryjanczyk, Jolanta; Park, Jong Y.; Figueroa, Jonine; Clements, Judith A.; Knight, Julia A.; Peto, Julian; Cunningham, Julie M.; Pow-Sang, Julio; Batra, Jyotsna; Czene, Kamila; Lu, 
Karen H.; Herkommer, Kathleen; Khaw, Kay-Tee; Matsuo, Keitaro; Muir, Kenneth; Offitt, Kenneth; Chen, Kexin; Moysich, Kirsten B.; Aittomäki, Kristiina; Odunsi, Kunle; Kiemeney, Lambertus A.; Massuger, Leon F.A.G.; Fitzgerald, Liesel M.; Cook, Linda S.; Cannon-Albright, Lisa; Hooning, Maartje J.; Pike, Malcolm C.; Bolla, Manjeet K.; Luedeke, Manuel; Teixeira, Manuel R.; Goodman, Marc T.; Schmidt, Marjanka K.; Riggan, Marjorie; Aly, Markus; Rossing, Mary Anne; Beckmann, Matthias W.; Moisse, Matthieu; Sanderson, Maureen; Southey, Melissa C.; Jones, Michael; Lush, Michael; Hildebrandt, Michelle A. T.; Hou, Ming-Feng; Schoemaker, Minouk J.; Garcia-Closas, Montserrat; Bogdanova, Natalia; Rahman, Nazneen; Le, Nhu D.; Orr, Nick; Wentzensen, Nicolas; Pashayan, Nora; Peterlongo, Paolo; Guénel, Pascal; Brennan, Paul; Paulo, Paula; Webb, Penelope M.; Broberg, Per; Fasching, Peter A.; Devilee, Peter; Wang, Qin; Cai, Qiuyin; Li, Qiyuan; Kaneva, Radka; Butzow, Ralf; Kopperud, Reidun Kristin; Schmutzler, Rita K.; Stephenson, Robert A.; MacInnis, Robert J.; Hoover, Robert N.; Winqvist, Robert; Ness, Roberta; Milne, Roger L.; Travis, Ruth C.; Benlloch, Sara; Olson, Sara H.; McDonnell, Shannon K.; Tworoger, Shelley S.; Maia, Sofia; Berndt, Sonja; Lee, Soo Chin; Teo, Soo-Hwang; Thibodeau, Stephen N.; Bojesen, Stig E.; Gapstur, Susan M.; Kjær, Susanne Krüger; Pejovic, Tanja; Tammela, Teuvo L.J.; Dörk, Thilo; Brüning, Thomas; Wahlfors, Tiina; Key, Tim J.; Edwards, Todd L.; Menon, Usha; Hamann, Ute; Mitev, Vanio; Kosma, Veli-Matti; Setiawan, Veronica Wendy; Kristensen, Vessela; Arndt, Volker; Vogel, Walther; Zheng, Wei; Sieh, Weiva; Blot, William J.; Kluzniak, Wojciech; Shu, Xiao-Ou; Gao, Yu-Tang; Schumacher, Fredrick; Freedman, Matthew L.; Berchuck, Andrew; Dunning, Alison M.; Simard, Jacques; Haiman, Christopher A.; Spurdle, Amanda; Sellers, Thomas A.; Hunter, David J.; Henderson, Brian E.; Kraft, Peter; Chanock, Stephen J.; Couch, Fergus J.; Hall, Per; Gayther, Simon A.; Easton, 
Douglas F.; Chenevix-Trench, Georgia; Eeles, Rosalind; Pharoah, Paul D.P.; Lambrechts, Diether

    2016-01-01

    Breast, ovarian, and prostate cancers are hormone-related and may have a shared genetic basis, but this has not been investigated systematically by genome-wide association (GWA) studies. Meta-analyses combining the largest GWA meta-analysis data sets for these cancers, totaling 112,349 cases and 116,421 controls of European ancestry, all together and in pairs, identified at P < 10^-8 seven new cross-cancer loci: three associated with susceptibility to all three cancers (rs17041869/2q13/BCL2L11; rs7937840/11q12/INCENP; rs1469713/19p13/GATAD2A), two breast and ovarian cancer risk loci (rs200182588/9q31/SMC2; rs8037137/15q26/RCCD1), and two breast and prostate cancer risk loci (rs5013329/1p34/NSUN4; rs9375701/6q23/L3MBTL3). Index variants in five additional regions previously associated with only one cancer also showed clear association with a second cancer type. Cell-type specific expression quantitative trait locus and enhancer-gene interaction annotations suggested target genes with potential cross-cancer roles at the new loci. Pathway analysis revealed significant enrichment of death receptor signaling genes near loci with P < 10^-5 in the three-cancer meta-analysis. PMID:27432226

  14. Genome engineering in Vibrio cholerae

    DEFF Research Database (Denmark)

    Val, Marie-Eve; Skovgaard, Ole; Ducos-Galand, Magaly

    2012-01-01

    Although bacteria with multipartite genomes are prevalent, our knowledge of the mechanisms maintaining their genome is very limited, and much remains to be learned about the structural and functional interrelationships of multiple chromosomes. Owing to its bi-chromosomal genome architecture and its....... This difficulty was surmounted using a unique and powerful strategy based on massive rearrangement of prokaryotic genomes. We developed a site-specific recombination-based engineering tool, which allows targeted, oriented, and reciprocal DNA exchanges. Using this genetic tool, we obtained a panel of V. cholerae...

  15. Replication and Relevance of Multiple Susceptibility Loci Discovered from Genome Wide Association Studies for Type 2 Diabetes in an Indian Population.

    Directory of Open Access Journals (Sweden)

    Nagaraja M Phani

    Several genetic variants for type 2 diabetes (T2D) have been identified through genome-wide association studies (GWAS) in Caucasian populations; however, replication studies have not been consistent across ethnicities. The objective of the current study is to examine the possible correlation of the 9 most significant GWAS single nucleotide polymorphisms (SNPs) with T2D susceptibility, as well as the interactive effect of these variants on the risk of T2D in an Indian population. Case-control cohorts of 1156 individuals from an Indian population were genotyped for 9 SNPs. Association analyses were performed using logistic regression after adjusting for covariates. Multifactor dimensionality reduction (MDR) analysis was adopted to determine gene-gene interactions, and the discriminatory power of the combined SNP effect was assessed by grouping individuals based on the number of risk alleles and by calculating the area under the receiver-operator characteristic curve (AUC). We confirm the association of TCF7L2 (rs7903146) and SLC30A8 (rs13266634) with T2D. MDR analysis showed statistically significant interactions among four SNPs of the SLC30A8 (rs13266634), IGF2BP2 (rs4402960), HHEX (rs1111875) and CDKN2A (rs10811661) genes. Cumulative analysis showed an increase in odds ratio against the baseline group of individuals carrying 5 to 6 risk alleles, and the discriminatory power of a genetic test based on 9 variants showed a higher AUC value when analyzed along with body mass index (BMI). These results provide strong evidence for an independent association between T2D and SNPs in TCF7L2 and SLC30A8. MDR analysis demonstrates that independently non-significant variants may interact with one another, resulting in increased disease susceptibility in the population tested.
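    The risk-allele grouping and AUC calculation described in this abstract can be illustrated with a minimal sketch. The genotypes below are invented toy data (3 SNPs rather than the study's 9), and the function names are hypothetical; the AUC is computed with the standard rank-based definition, which may differ from the software the authors used.

```python
def risk_allele_count(genotype):
    """Total number of risk alleles for one person (0, 1 or 2 per SNP)."""
    return sum(genotype)


def auc(case_scores, control_scores):
    """Probability that a random case scores higher than a random control.

    Ties count one half, which makes this equal to the Mann-Whitney U
    statistic divided by (number of cases * number of controls).
    """
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Toy genotypes (assumed, not from the study): one list per individual.
cases = [[1, 2, 0], [2, 1, 1], [1, 1, 1]]
controls = [[0, 1, 0], [1, 0, 1], [1, 1, 1]]

case_scores = [risk_allele_count(g) for g in cases]
control_scores = [risk_allele_count(g) for g in controls]
print(round(auc(case_scores, control_scores), 3))
```

    An AUC of 0.5 would mean the risk-allele count does not separate cases from controls at all; values closer to 1 indicate better discrimination.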

  16. Genome update: the 1000th genome - a cautionary tale

    DEFF Research Database (Denmark)

    Lagesen, Karin; Ussery, David; Wassenaar, Gertrude Maria

    2010-01-01

    There are now more than 1000 sequenced prokaryotic genomes deposited in public databases and available for analysis. Currently, although the sequence databases GenBank, DNA Database of Japan and EMBL are synchronized continually, there are slight differences in content at the genomes level for a variety of logistical reasons, including differences in format and loading errors, such as those caused by file transfer protocol interruptions. This means that the 1000th genome will be different in the various databases. Some of the data on the highly accessed web pages are inaccurate, leading to false conclusions, for example about the largest bacterial genome sequenced. Biological diversity is far greater than many have thought. For example, analysis of multiple Escherichia coli genomes has led to an estimate of around 45 000 gene families, more genes than are recognized in the human genome. Moreover...

  17. Comparative genomics of multidrug resistance-encoding IncA/C plasmids from commensal and pathogenic Escherichia coli from multiple animal sources.

    Science.gov (United States)

    Fernández-Alarcón, Claudia; Singer, Randall S; Johnson, Timothy J

    2011-01-01

    Incompatibility group A/C (IncA/C) plasmids have received recent attention for their broad host range and ability to confer resistance to multiple antimicrobial agents. Due to the potential spread of multidrug resistance (MDR) phenotypes from foodborne pathogens to human pathogens, the dissemination of these plasmids represents a public health risk. In this study, four animal-source IncA/C plasmids isolated from Escherichia coli were sequenced and analyzed, including isolates from commercial dairy cows, pigs and turkeys in the U.S. and Chile. These plasmids were initially selected because they either contained the floR and tetA genes encoding for florfenicol and tetracycline resistance, respectively, and/or the bla(CMY-2) gene encoding for extended spectrum β-lactamase resistance. Overall, sequence analysis revealed that each of the four plasmids retained a remarkably stable and conserved backbone sequence, with differences observed primarily within their accessory regions, which presumably have evolved via horizontal gene transfer events involving multiple modules. Comparison of these plasmids with other available IncA/C plasmid sequences further defined the core and accessory elements of these plasmids in E. coli and Salmonella. Our results suggest that the bla(CMY-2) plasmid lineage appears to have derived from an ancestral IncA/C plasmid type harboring floR-tetAR-strAB and Tn21-like accessory modules. Evidence is mounting that IncA/C plasmids are widespread among enteric bacteria of production animals and these emergent plasmids have flexibility in their acquisition of MDR-encoding modules, necessitating further study to understand the evolutionary mechanisms involved in their dissemination and stability in bacterial populations.

  18. Comparative genomics of multidrug resistance-encoding IncA/C plasmids from commensal and pathogenic Escherichia coli from multiple animal sources.

    Directory of Open Access Journals (Sweden)

    Claudia Fernández-Alarcón

    Incompatibility group A/C (IncA/C) plasmids have received recent attention for their broad host range and ability to confer resistance to multiple antimicrobial agents. Due to the potential spread of multidrug resistance (MDR) phenotypes from foodborne pathogens to human pathogens, the dissemination of these plasmids represents a public health risk. In this study, four animal-source IncA/C plasmids isolated from Escherichia coli were sequenced and analyzed, including isolates from commercial dairy cows, pigs and turkeys in the U.S. and Chile. These plasmids were initially selected because they either contained the floR and tetA genes encoding for florfenicol and tetracycline resistance, respectively, and/or the bla(CMY-2) gene encoding for extended spectrum β-lactamase resistance. Overall, sequence analysis revealed that each of the four plasmids retained a remarkably stable and conserved backbone sequence, with differences observed primarily within their accessory regions, which presumably have evolved via horizontal gene transfer events involving multiple modules. Comparison of these plasmids with other available IncA/C plasmid sequences further defined the core and accessory elements of these plasmids in E. coli and Salmonella. Our results suggest that the bla(CMY-2) plasmid lineage appears to have derived from an ancestral IncA/C plasmid type harboring floR-tetAR-strAB and Tn21-like accessory modules. Evidence is mounting that IncA/C plasmids are widespread among enteric bacteria of production animals and these emergent plasmids have flexibility in their acquisition of MDR-encoding modules, necessitating further study to understand the evolutionary mechanisms involved in their dissemination and stability in bacterial populations.

  19. Genome Imprinting

    Indian Academy of Sciences (India)

    the cell nucleus (mitochondrial and chloroplast genomes), and. (3) traits governed ... tively good embryonic development but very poor development of membranes and ... Human homologies for the type of situation described above are naturally ..... imprint; (b) New modifications of the paternal genome in germ cells of each ...

  20. Baculovirus Genomics

    NARCIS (Netherlands)

    Oers, van M.M.; Vlak, J.M.

    2007-01-01

    Baculovirus genomes are covalently closed circles of double-stranded DNA varying in size between 80 and 180 kilobase pairs. The genomes of more than forty-one baculoviruses have been sequenced to date. The majority of these (37) are pathogenic to lepidopteran hosts; three infect sawflies

  1. Genomic Testing

    Science.gov (United States)

    ... this database. Evaluation of Genomic Applications in Practice and Prevention (EGAPP™) In 2004, the Centers for Disease Control and Prevention launched the EGAPP initiative to establish and test a ... and other applications of genomic technology that are in transition from ...

  2. Ancient genomes

    OpenAIRE

    Hoelzel, A Rus

    2005-01-01

    Ever since its invention, the polymerase chain reaction has been the method of choice for work with ancient DNA. In an application of modern genomic methods to material from the Pleistocene, a recent study has instead undertaken to clone and sequence a portion of the ancient genome of the cave bear.

  3. An overview of wheat genome sequencing and its implications for ...

    Indian Academy of Sciences (India)

    National Institute of Plant Genome Research, Aruna Asaf Ali Marg, New Delhi 110 067, India ... Wheat (Triticum aestivum L.) serves as the staple food for 30% of the global .... bread wheat genome is a product of multiple rounds of hybrid.

  4. Healthcare- and Community-Associated Methicillin-Resistant Staphylococcus aureus (MRSA) and Fatal Pneumonia with Pediatric Deaths in Krasnoyarsk, Siberian Russia: Unique MRSA's Multiple Virulence Factors, Genome, and Stepwise Evolution

    Science.gov (United States)

    Khokhlova, Olga E.; Hung, Wei-Chun; Wan, Tsai-Wen; Iwao, Yasuhisa; Takano, Tomomi; Higuchi, Wataru; Yachenko, Svetlana V.; Teplyakova, Olga V.; Kamshilova, Vera V.; Kotlovsky, Yuri V.; Nishiyama, Akihito; Reva, Ivan V.; Sidorenko, Sergey V.; Peryanova, Olga V.; Reva, Galina V.; Teng, Lee-Jene; Salmina, Alla B.; Yamamoto, Tatsuo

    2015-01-01

    Methicillin-resistant Staphylococcus aureus (MRSA) is a common multidrug-resistant (MDR) pathogen. We herein discussed MRSA and its infections in Krasnoyarsk, Siberian Russia between 2007 and 2011. The incidence of MRSA in 3,662 subjects was 22.0% and 2.9% for healthcare- and community-associated MRSA (HA- and CA-MRSA), respectively. The 15-day mortality rates for MRSA hospital- and community-acquired pneumonia (HAP and CAP) were 6.5% and 50%, respectively. MRSA CAP cases included pediatric deaths; of the MRSA pneumonia episodes available, ≥27.3% were associated with bacteremia. Most cases of HA-MRSA examined exhibited ST239/spa3(t037)/SCCmecIII.1.1.2 (designated as ST239Kras), while all CA-MRSA cases examined were ST8/spa1(t008)/SCCmecIV.3.1.1(IVc) (designated as ST8Kras). ST239Kras and ST8Kras strongly expressed cytolytic peptide (phenol-soluble modulin α, PSMα; and δ-hemolysin, Hld) genes, similar to CA-MRSA. ST239Kras pneumonia may have been attributed to a unique set of multiple virulence factors (MVFs): toxic shock syndrome toxin-1 (TSST-1), elevated PSMα/Hld expression, α-hemolysin, the staphylococcal enterotoxin SEK/SEQ, the immune evasion factor SCIN/SAK, and collagen adhesin. Regarding ST8Kras, SEA was included in MVFs, some of which were common to ST239Kras. The ST239Kras (strain OC3) genome contained: a completely unique phage, φSa7-like (W), with no att repetition; S. aureus pathogenicity island SaPI2R, the first TSST-1 gene-positive (tst+) SaPI in the ST239 lineage; and a super copy of IS256 (≥22 copies/genome). ST239Kras carried the Brazilian SCCmecIII.1.1.2 and United Kingdom-type tst. ST239Kras and ST8Kras were MDR, with the same levofloxacin resistance mutations; small, but transmissible chloramphenicol resistance plasmids spread widely enough to not be ignored. These results suggest that novel MDR and MVF+ HA- and CA-MRSA (ST239Kras and ST8Kras) emerged in Siberian Russia (Krasnoyarsk) associated with fatal pneumonia, and also with ST

  5. An integrative genomic approach reveals coordinated expression of intronic miR-335, miR-342, and miR-561 with deregulated host genes in multiple myeloma

    Directory of Open Access Journals (Sweden)

    Agnelli Luca

    2008-08-01

    Background: The role of microRNAs (miRNAs) in multiple myeloma (MM) has yet to be fully elucidated. To identify miRNAs that are potentially deregulated in MM, we investigated those mapping within transcription units, based on evidence that intronic miRNAs are frequently coexpressed with their host genes. To this end, we monitored host transcript expression values in a panel of 20 human MM cell lines (HMCLs) and focused on transcripts whose expression varied significantly across the dataset. Methods: miRNA expression was quantified by Quantitative Real-Time PCR. Gene expression and genome profiling data were generated on Affymetrix oligonucleotide microarrays. Significant Analysis of Microarrays algorithm was used to investigate differentially expressed transcripts. Conventional statistics were used to test correlations for significance. Public libraries were queried to predict putative miRNA targets. Results: We identified transcripts specific to six miRNA host genes (CCPG1, GULP1, EVL, TACSTD1, MEST, and TNIK) whose average changes in expression varied at least 2-fold from the mean of the examined dataset. We evaluated the expression levels of the corresponding intronic miRNAs and identified a significant correlation between the expression levels of MEST, EVL, and GULP1 and those of the corresponding miRNAs miR-335, miR-342-3p, and miR-561, respectively. Genome-wide profiling of the 20 HMCLs indicated that the increased expression of the three host genes and their corresponding intronic miRNAs was not correlated with local copy number variations. Notably, miRNAs and their host genes were overexpressed in a fraction of primary tumors with respect to normal plasma cells; however, this finding was not correlated with known molecular myeloma groups. The predicted putative miRNA targets and the transcriptional profiles associated with the primary tumors suggest that MEST/miR-335 and EVL/miR-342-3p may play a role in plasma cell homing and

  6. Herbarium genomics

    DEFF Research Database (Denmark)

    Bakker, Freek T.; Lei, Di; Yu, Jiaying

    2016-01-01

    Herbarium genomics is proving promising as next-generation sequencing approaches are well suited to deal with the usually fragmented nature of archival DNA. We show that routine assembly of partial plastome sequences from herbarium specimens is feasible, from total DNA extracts and with specimens...... up to 146 years old. We use genome skimming and an automated assembly pipeline, Iterative Organelle Genome Assembly, that assembles paired-end reads into a series of candidate assemblies, the best one of which is selected based on likelihood estimation. We used 93 specimens from 12 different...... correlation between plastome coverage and nuclear genome size (C value) in our samples, but the range of C values included is limited. Finally, we conclude that routine plastome sequencing from herbarium specimens is feasible and cost-effective (compared with Sanger sequencing or plastome...

  7. A universal genomic coordinate translator for comparative genomics.

    Science.gov (United States)

    Zamani, Neda; Sundström, Görel; Meadows, Jennifer R S; Höppner, Marc P; Dainat, Jacques; Lantz, Henrik; Haas, Brian J; Grabherr, Manfred G

    2014-06-30

    Genomic duplications constitute major events in the evolution of species, allowing paralogous copies of genes to take on fine-tuned biological roles. The orthology relationship between copies across multiple genomes can be resolved unambiguously by synteny, i.e. the conserved order of genomic sequences. However, a comprehensive analysis of duplication events and their contributions to evolution would require all-to-all genome alignments, the number of which grows as N^2 with the number of available genomes, N. Here, we introduce Kraken, software that omits the all-to-all requirement by recursively traversing a graph of pairwise alignments and dynamically re-computing orthology. Kraken scales linearly with the number of targeted genomes, N, which allows for including large numbers of genomes in analyses. We first evaluated the method on the set of 12 Drosophila genomes, finding that orthologous correspondence computed indirectly through a graph of multiple synteny maps comes at minimal cost in terms of sensitivity, but reduces overall computational runtime by an order of magnitude. We then used the method on three well-annotated mammalian genomes, human, mouse, and rat, and show that up to 93% of protein coding transcripts have unambiguous pairwise orthologous relationships across the genomes. On a nucleotide level, 70 to 83% of exons match exactly at both splice junctions, and up to 97% on at least one junction. Finally, we applied Kraken to an RNA-sequencing dataset from multiple vertebrates and diverse tissues, where we confirmed that brain-specific gene family members, i.e. one-to-many or many-to-many homologs, are more highly correlated across species than single-copy (i.e. one-to-one homologous) genes. Not limited to protein coding genes, Kraken also identifies thousands of newly identified transcribed loci, likely non-coding RNAs that are consistently transcribed in human, chimpanzee and gorilla, and maintain significant correlation of expression levels across
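    The idea of reaching a target genome indirectly through a graph of pairwise maps, rather than through all-to-all alignments, can be sketched as follows. This is an illustrative toy and not Kraken's implementation: the block format, genome names and coordinates are assumptions, only forward-strand one-directional maps are handled, and a breadth-first search stands in for the recursive traversal the abstract mentions.

```python
from collections import deque

# Hypothetical pairwise synteny maps: (source, target) -> list of collinear
# blocks given as (source_start, source_end, target_start), end exclusive.
PAIRWISE = {
    ("human", "mouse"): [(100, 200, 1100)],
    ("mouse", "rat"): [(1000, 1300, 5000)],
}


def translate_once(blocks, pos):
    """Map one position through a list of blocks, or return None."""
    for src_start, src_end, dst_start in blocks:
        if src_start <= pos < src_end:
            return dst_start + (pos - src_start)
    return None


def translate(maps, source, target, pos):
    """Find the target coordinate by walking the graph of pairwise maps."""
    queue = deque([(source, pos)])
    seen = {source}
    while queue:
        genome, here = queue.popleft()
        if genome == target:
            return here
        for (src, dst), blocks in maps.items():
            if src != genome or dst in seen:
                continue
            mapped = translate_once(blocks, here)
            if mapped is not None:
                seen.add(dst)
                queue.append((dst, mapped))
    return None  # no chain of pairwise maps covers this position


# No direct human-to-rat map exists; the route goes through mouse.
print(translate(PAIRWISE, "human", "rat", 150))
```

    With N genomes, a connected graph needs as few as N - 1 pairwise maps, whereas an all-to-all design needs N(N - 1)/2, which is the scaling contrast the abstract draws.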

  8. Theory of microbial genome evolution

    Science.gov (United States)

    Koonin, Eugene

    Bacteria and archaea have small genomes tightly packed with protein-coding genes. This compactness is commonly perceived as evidence of adaptive genome streamlining caused by strong purifying selection in large microbial populations. In such populations, even the small cost incurred by nonfunctional DNA because of extra energy and time expenditure is thought to be sufficient for this extra genetic material to be eliminated by selection. However, contrary to the predictions of this model, there exists a consistent, positive correlation between the strength of selection at the protein sequence level, measured as the ratio of nonsynonymous to synonymous substitution rates, and microbial genome size. By fitting the genome size distributions in multiple groups of prokaryotes to predictions of mathematical models of population evolution, we show that only models in which acquisition of additional genes is, on average, slightly beneficial yield a good fit to genomic data. Thus, the number of genes in prokaryotic genomes seems to reflect the equilibrium between the benefit of additional genes that diminishes as the genome grows and deletion bias. New genes acquired by microbial genomes, on average, appear to be adaptive. Evolution of bacterial and archaeal genomes involves extensive horizontal gene transfer and gene loss. Many microbes have open pangenomes, where each newly sequenced genome contains more than 10% 'ORFans', genes without detectable homologues in other species. A simple, steady-state evolutionary model reveals two sharply distinct classes of microbial genes, one of which (ORFans) is characterized by effectively instantaneous gene replacement, whereas the other consists of genes with finite, distributed replacement rates. These findings imply a conservative estimate of at least a billion distinct genes in the prokaryotic genomic universe.

  9. Ebolavirus comparative genomics

    Science.gov (United States)

    Jun, Se-Ran; Leuze, Michael R.; Nookaew, Intawat; Uberbacher, Edward C.; Land, Miriam; Zhang, Qian; Wanchai, Visanu; Chai, Juanjuan; Nielsen, Morten; Trolle, Thomas; Lund, Ole; Buzard, Gregory S.; Pedersen, Thomas D.; Wassenaar, Trudy M.; Ussery, David W.

    2015-01-01

    The 2014 Ebola outbreak in West Africa is the largest documented for this virus. To examine the dynamics of this genome, we compare more than 100 currently available ebolavirus genomes to each other and to other viral genomes. Based on oligomer frequency analysis, the family Filoviridae forms a distinct group from all other sequenced viral genomes. All filovirus genomes sequenced to date encode proteins with similar functions and gene order, although there is considerable divergence in sequences between the three genera Ebolavirus, Cuevavirus and Marburgvirus within the family Filoviridae. Whereas all ebolavirus genomes are quite similar (multiple sequences of the same strain are often identical), variation is most common in the intergenic regions and within specific areas of the genes encoding the glycoprotein (GP), nucleoprotein (NP) and polymerase (L). We predict regions that could contain epitope-binding sites, which might be good vaccine targets. This information, combined with glycosylation sites and experimentally determined epitopes, can identify the most promising regions for the development of therapeutic strategies. This manuscript has been authored by UT-Battelle, LLC under Contract No. DE-AC05-00OR22725 with the U.S. Department of Energy. The United States Government retains and the publisher, by accepting the article for publication, acknowledges that the United States Government retains a non-exclusive, paid-up, irrevocable, world-wide license to publish or reproduce the published form of this manuscript, or allow others to do so, for United States Government purposes. The Department of Energy will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan (http://energy.gov/downloads/doe-public-access-plan). PMID:26175035

  10. The Nostoc punctiforme Genome

    Energy Technology Data Exchange (ETDEWEB)

    John C. Meeks

    2001-12-31

    Nostoc punctiforme is a filamentous cyanobacterium with extensive phenotypic characteristics and a relatively large genome, approaching 10 Mb. The phenotypic characteristics include a photoautotrophic, diazotrophic mode of growth, but N. punctiforme is also facultatively heterotrophic; its vegetative cells have multiple development alternatives, including terminal differentiation into nitrogen-fixing heterocysts and transient differentiation into spore-like akinetes or motile filaments called hormogonia; and N. punctiforme has broad symbiotic competence with fungi and terrestrial plants, including bryophytes, gymnosperms and an angiosperm. The shotgun-sequencing phase of the N. punctiforme strain ATCC 29133 genome has been completed by the Joint Genome Institute. Annotation of an 8.9 Mb database yielded 7432 open reading frames, 45% of which encode proteins with known or probable known function and 29% of which are unique to N. punctiforme. Comparative analysis of the sequence indicates a genome that is highly plastic and in a state of flux, with numerous insertion sequences and multilocus repeats, as well as genes encoding transposases and DNA modification enzymes. The sequence also reveals the presence of genes encoding putative proteins that collectively define almost all characteristics of cyanobacteria as a group. N. punctiforme has an extensive potential to sense and respond to environmental signals as reflected by the presence of more than 400 genes encoding sensor protein kinases, response regulators and other transcriptional factors. The signal transduction systems and any of the large number of unique genes may play essential roles in the cell differentiation and symbiotic interaction properties of N. punctiforme.

  11. Cephalopod genomics

    DEFF Research Database (Denmark)

    Albertin, Caroline B.; Bonnaud, Laure; Brown, C. Titus

    2012-01-01

    The Cephalopod Sequencing Consortium (CephSeq Consortium) was established at a NESCent Catalysis Group Meeting, "Paths to Cephalopod Genomics-Strategies, Choices, Organization," held in Durham, North Carolina, USA on May 24-27, 2012. Twenty-eight participants representing nine countries (Austria, Australia, China, Denmark, France, Italy, Japan, Spain and the USA) met to address the pressing need for genome sequencing of cephalopod mollusks. This group, drawn from cephalopod biologists, neuroscientists, developmental and evolutionary biologists, materials scientists, bioinformaticians and researchers active in sequencing, assembling and annotating genomes, agreed on a set of cephalopod species of particular importance for initial sequencing and developed strategies and an organization (CephSeq Consortium) to promote this sequencing. The conclusions and recommendations of this meeting are described...

  12. Genome Sequencing

    DEFF Research Database (Denmark)

    Sato, Shusei; Andersen, Stig Uggerhøj

    2014-01-01

    The current Lotus japonicus reference genome sequence is based on a hybrid assembly of Sanger TAC/BAC, Sanger shotgun and Illumina shotgun sequencing data generated from the Miyakojima-MG20 accession. It covers nearly all expressed L. japonicus genes and has been annotated mainly based on transcr...

  13. Creating a platform for collaborative genomic research

    Directory of Open Access Journals (Sweden)

    Mark Smithson

    2017-04-01

    The developed genomics informatics platform provides a step-change in this type of genetic research, accelerating reproducible collaborative research across multiple disparate organisations and data sources, of varying type and complexity.

  14. The genome portal of the Department of Energy Joint Genome Institute: 2014 updates

    Energy Technology Data Exchange (ETDEWEB)

    Nordberg, Henrik [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States); Cantor, Michael [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States); Dusheyko, Serge [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States); Hua, Susan [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States); Poliakov, Alexander [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States); Shabalov, Igor [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States); Smirnova, Tatyana [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States); Grigoriev, Igor V. [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States); Dubchak, Inna [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States)

    2013-11-12

    The U.S. Department of Energy (DOE) Joint Genome Institute (JGI), a national user facility, serves the diverse scientific community by providing integrated high-throughput sequencing and computational analysis to enable system-based scientific approaches in support of DOE missions related to clean energy generation and environmental characterization. The JGI Genome Portal (http://genome.jgi.doe.gov) provides unified access to all JGI genomic databases and analytical tools. The JGI maintains extensive data management systems and specialized analytical capabilities to manage and interpret complex genomic data. A user can search, download and explore multiple data sets available for all DOE JGI sequencing projects including their status, assemblies and annotations of sequenced genomes. In this paper, we describe major updates of the Genome Portal in the past 2 years with a specific emphasis on efficient handling of the rapidly growing amount of diverse genomic data accumulated in JGI.

  15. Genome chaos: survival strategy during crisis.

    Science.gov (United States)

    Liu, Guo; Stevens, Joshua B; Horne, Steven D; Abdallah, Batoul Y; Ye, Karen J; Bremer, Steven W; Ye, Christine J; Chen, David J; Heng, Henry H

    2014-01-01

    Genome chaos, a process of complex, rapid genome re-organization, results in the formation of chaotic genomes, which is followed by the potential to establish stable genomes. It was initially detected through cytogenetic analyses, and recently confirmed by whole-genome sequencing efforts which identified multiple subtypes including "chromothripsis", "chromoplexy", "chromoanasynthesis", and "chromoanagenesis". Although genome chaos occurs commonly in tumors, both the mechanism and detailed aspects of the process are unknown due to the inability of observing its evolution over time in clinical samples. Here, an experimental system to monitor the evolutionary process of genome chaos was developed to elucidate its mechanisms. Genome chaos occurs following exposure to chemotherapeutics with different mechanisms, which act collectively as stressors. Characterization of the karyotype and its dynamic changes prior to, during, and after induction of genome chaos demonstrates that chromosome fragmentation (C-Frag) occurs just prior to chaotic genome formation. Chaotic genomes seem to form by random rejoining of chromosomal fragments, in part through non-homologous end joining (NHEJ). Stress induced genome chaos results in increased karyotypic heterogeneity. Such increased evolutionary potential is demonstrated by the identification of increased transcriptome dynamics associated with high levels of karyotypic variance. In contrast to impacting on a limited number of cancer genes, re-organized genomes lead to new system dynamics essential for cancer evolution. Genome chaos acts as a mechanism of rapid, adaptive, genome-based evolution that plays an essential role in promoting rapid macroevolution of new genome-defined systems during crisis, which may explain some unwanted consequences of cancer treatment.

  16. Comparative Genomics

    Indian Academy of Sciences (India)

    Comparative Genomics - A Powerful New Tool in Biology. Anand K Bachhawat. Resonance – Journal of Science Education, General Article, Volume 11, Issue 8, August 2006, pp 22-40.

  17. Single virus genomics: a new tool for virus discovery.

    Directory of Open Access Journals (Sweden)

    Lisa Zeigler Allen

    Whole genome amplification and sequencing of single microbial cells has significantly influenced genomics and microbial ecology by facilitating direct recovery of reference genome data. However, viral genomics continues to suffer due to difficulties related to the isolation and characterization of uncultivated viruses. We report here on a new approach called 'Single Virus Genomics', which enabled the isolation and complete genome sequencing of the first single virus particle. A mixed assemblage composed of two known viruses, E. coli bacteriophages lambda and T4, was sorted using flow cytometric methods and subsequently immobilized in an agarose matrix. Genome amplification was then achieved in situ via multiple displacement amplification (MDA). The complete lambda phage genome was recovered with an average depth of coverage of approximately 437X. The isolation and genome sequencing of uncultivated viruses using Single Virus Genomics approaches will enable researchers to address questions about viral diversity, evolution, adaptation and ecology that were previously unattainable.

  18. RPAN: rice pan-genome browser for ∼3000 rice genomes.

    Science.gov (United States)

    Sun, Chen; Hu, Zhiqiang; Zheng, Tianqing; Lu, Kuangchen; Zhao, Yue; Wang, Wensheng; Shi, Jianxin; Wang, Chunchao; Lu, Jinyuan; Zhang, Dabing; Li, Zhikang; Wei, Chaochun

    2017-01-25

    A pan-genome is the union of the gene sets of all the individuals of a clade or a species and it provides a new dimension of genome complexity with the presence/absence variations (PAVs) of genes among these genomes. With the progress of sequencing technologies, pan-genome study is becoming affordable for eukaryotes with large-sized genomes. The Asian cultivated rice, Oryza sativa L., is one of the major food sources for the world and a model organism in plant biology. Recently, the 3000 Rice Genome Project (3K RGP) sequenced more than 3000 rice genomes with a mean sequencing depth of 14.3×, which provided a tremendous resource for rice research. In this paper, we present a genome browser, Rice Pan-genome Browser (RPAN), as a tool to search and visualize the rice pan-genome derived from 3K RGP. RPAN contains a database of the basic information of 3010 rice accessions, including genomic sequences, gene annotations, PAV information and gene expression data of the rice pan-genome. At least 12 000 novel genes absent in the reference genome were included. RPAN also provides multiple search and visualization functions. RPAN can be a rich resource for rice biology and rice breeding. It is available at http://cgm.sjtu.edu.cn/3kricedb/ or http://www.rmbreeding.cn/pan3k. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
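
    The definition above (a pan-genome as the union of per-accession gene sets, with presence/absence variations recorded per gene) can be illustrated with a minimal Python sketch. The accession names, gene identifiers and function names below are hypothetical placeholders chosen for illustration; this is not RPAN's data model or code.

```python
from typing import Dict, Set

def pan_genome(gene_sets: Dict[str, Set[str]]) -> Set[str]:
    """Return the union of the gene sets of all accessions."""
    result: Set[str] = set()
    for genes in gene_sets.values():
        result |= genes
    return result

def pav_matrix(gene_sets: Dict[str, Set[str]]) -> Dict[str, Dict[str, int]]:
    """Presence/absence table: 1 if the accession carries the gene, else 0."""
    genes = sorted(pan_genome(gene_sets))
    return {
        accession: {gene: int(gene in present) for gene in genes}
        for accession, present in gene_sets.items()
    }

# Hypothetical toy data: three accessions, five genes in total.
accessions = {
    "acc_A": {"g1", "g2", "g3"},
    "acc_B": {"g1", "g3", "g4"},
    "acc_C": {"g1", "g2", "g5"},
}
reference_genes = accessions["acc_A"]  # assumption: acc_A plays the role of the reference

pan = pan_genome(accessions)
pav = pav_matrix(accessions)
# Genes in the pan-genome that are absent from the reference gene set.
novel = pan - reference_genes
```

    In this toy setting, `novel` plays the role of the genes "absent in the reference genome" mentioned in the abstract; real PAV calling from sequencing data involves read mapping and coverage thresholds that this sketch does not attempt to model.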

  19. Genomic Diversity and Evolution of the Lyssaviruses

    Science.gov (United States)

    Delmas, Olivier; Holmes, Edward C.; Talbi, Chiraz; Larrous, Florence; Dacheux, Laurent; Bouchier, Christiane; Bourhy, Hervé

    2008-01-01

    Lyssaviruses are RNA viruses with single-strand, negative-sense genomes responsible for rabies-like diseases in mammals. To date, genomic and evolutionary studies have most often utilized partial genome sequences, particularly of the nucleoprotein and glycoprotein genes, with little consideration of genome-scale evolution. Herein, we report the first genomic and evolutionary analysis using complete genome sequences of all recognised lyssavirus genotypes, including 14 new complete genomes of field isolates from 6 genotypes and one genotype that is completely sequenced for the first time. In doing so we significantly increase the extent of genome sequence data available for these important viruses. Our analysis of these genome sequence data reveals that all lyssaviruses have the same genomic organization. A phylogenetic analysis reveals strong geographical structuring, with the greatest genetic diversity in Africa, and an independent origin for the two known genotypes that infect European bats. We also suggest that multiple genotypes may exist within the diversity of viruses currently classified as ‘Lagos Bat’. In sum, we show that rigorous phylogenetic techniques based on full length genome sequence provide the best discriminatory power for genotype classification within the lyssaviruses. PMID:18446239

  20. Genomic diversity and evolution of the lyssaviruses.

    Directory of Open Access Journals (Sweden)

    Olivier Delmas

    2008-04-01

    Lyssaviruses are RNA viruses with single-strand, negative-sense genomes responsible for rabies-like diseases in mammals. To date, genomic and evolutionary studies have most often utilized partial genome sequences, particularly of the nucleoprotein and glycoprotein genes, with little consideration of genome-scale evolution. Herein, we report the first genomic and evolutionary analysis using complete genome sequences of all recognised lyssavirus genotypes, including 14 new complete genomes of field isolates from 6 genotypes and one genotype that is completely sequenced for the first time. In doing so we significantly increase the extent of genome sequence data available for these important viruses. Our analysis of these genome sequence data reveals that all lyssaviruses have the same genomic organization. A phylogenetic analysis reveals strong geographical structuring, with the greatest genetic diversity in Africa, and an independent origin for the two known genotypes that infect European bats. We also suggest that multiple genotypes may exist within the diversity of viruses currently classified as 'Lagos Bat'. In sum, we show that rigorous phylogenetic techniques based on full length genome sequence provide the best discriminatory power for genotype classification within the lyssaviruses.

  1. Genomic Feature Models

    DEFF Research Database (Denmark)

    Sørensen, Peter; Edwards, Stefan McKinnon; Rohde, Palle Duun

    Whole-genome sequences and multiple trait phenotypes from large numbers of individuals will soon be available in many populations. Well established statistical modeling approaches enable the genetic analyses of complex trait phenotypes while accounting for a variety of additive and non-additive genetic mechanisms. These modeling approaches have proven to be highly useful to determine population genetic parameters as well as prediction of genetic risk or value. We present a series of statistical modelling approaches that use prior biological information for evaluating the collective action ... regions and gene ontologies) that provide better model fit and increase predictive ability of the statistical model for this trait.

  2. Personal genomics services: whose genomes?

    Science.gov (United States)

    Gurwitz, David; Bregman-Eschet, Yael

    2009-07-01

    New companies offering personal whole-genome information services over the internet are dynamic and highly visible players in the personal genomics field. For fees currently ranging from US$399 to US$2500 and a vial of saliva, individuals can now purchase online access to their individual genetic information regarding susceptibility to a range of chronic diseases and phenotypic traits based on a genome-wide SNP scan. Most of the companies offering such services are based in the United States, but their clients may come from nearly anywhere in the world. Although the scientific validity, clinical utility and potential future implications of such services are being hotly debated, several ethical and regulatory questions related to direct-to-consumer (DTC) marketing strategies of genetic tests have not yet received sufficient attention. For example, how can we minimize the risk of unauthorized third parties submitting other people's DNA for testing? Another pressing question concerns the ownership of (genotypic and phenotypic) information, as well as the unclear legal status of customers regarding their own personal information. Current legislation in the US and Europe falls short of providing clear answers to these questions. Until the regulation of personal genomics services catches up with the technology, we call upon commercial providers to self-regulate and coordinate their activities to minimize potential risks to individual privacy. We also point out some specific steps, along the trustee model, that providers of DTC personal genomics services as well as regulators and policy makers could consider for addressing some of the concerns raised below.

  3. Molecular cytogenetic and genomic analyses reveal new insights into the origin of the wheat B genome.

    Science.gov (United States)

    Zhang, Wei; Zhang, Mingyi; Zhu, Xianwen; Cao, Yaping; Sun, Qing; Ma, Guojia; Chao, Shiaoman; Yan, Changhui; Xu, Steven S; Cai, Xiwen

    2018-02-01

    This work pinpointed the goatgrass chromosomal segment in the wheat B genome using modern cytogenetic and genomic technologies, and provided novel insights into the origin of the wheat B genome. Wheat is a typical allopolyploid with three homoeologous subgenomes (A, B, and D). The donors of the subgenomes A and D had been identified, but not for the subgenome B. The goatgrass Aegilops speltoides (genome SS) has been controversially considered a possible candidate for the donor of the wheat B genome. However, the relationship of the Ae. speltoides S genome with the wheat B genome remains largely obscure. The present study assessed the homology of the B and S genomes using an integrative cytogenetic and genomic approach, and revealed the contribution of Ae. speltoides to the origin of the wheat B genome. We discovered noticeable homology between wheat chromosome 1B and Ae. speltoides chromosome 1S, but not between other chromosomes in the B and S genomes. An Ae. speltoides-originated segment spanning a genomic region of approximately 10.46 Mb was detected on the long arm of wheat chromosome 1B (1BL). The Ae. speltoides-originated segment on 1BL was found to co-evolve with the rest of the B genome. Evidently, Ae. speltoides had been involved in the origin of the wheat B genome, but should not be considered an exclusive donor of this genome. The wheat B genome might have a polyphyletic origin with multiple ancestors involved, including Ae. speltoides. These novel findings will facilitate genome studies in wheat and other polyploids.

  4. Visualization for genomics: the Microbial Genome Viewer.

    NARCIS (Netherlands)

    Kerkhoven, R.; Enckevort, F.H.J. van; Boekhorst, J.; Molenaar, D; Siezen, R.J.

    2004-01-01

    SUMMARY: A Web-based visualization tool, the Microbial Genome Viewer, is presented that allows the user to combine complex genomic data in a highly interactive way. This Web tool enables the interactive generation of chromosome wheels and linear genome maps from genome annotation data stored in a

  5. Aligning the unalignable: bacteriophage whole genome alignments.

    Science.gov (United States)

    Bérard, Sèverine; Chateau, Annie; Pompidor, Nicolas; Guertin, Paul; Bergeron, Anne; Swenson, Krister M

    2016-01-13

    In recent years, many studies focused on the description and comparison of large sets of related bacteriophage genomes. Due to the peculiar mosaic structure of these genomes, few informative approaches for comparing whole genomes exist: dot plot diagrams give a mostly qualitative assessment of the similarity/dissimilarity between two or more genomes, and clustering techniques are used to classify genomes. Multiple alignments are conspicuously absent from this scene. Indeed, whole genome aligners interpret lack of similarity between sequences as an indication of rearrangements, insertions, or losses. This behavior makes them ill-prepared to align bacteriophage genomes, where even closely related strains can accomplish the same biological function with highly dissimilar sequences. In this paper, we propose a multiple alignment strategy that exploits functional collinearity shared by related strains of bacteriophages, and uses partial orders to capture mosaicism of sets of genomes. As classical alignments do, the computed alignments can be used to predict that genes have the same biological function, even in the absence of detectable similarity. The Alpha aligner implements these ideas in visual interactive displays, and is used to compute several examples of alignments of Staphylococcus aureus and Mycobacterium bacteriophages, involving up to 29 genomes. Using these datasets, we prove that Alpha alignments are at least as good as those computed by standard aligners. Comparison with the progressive Mauve aligner - which implements a partial order strategy, but whose alignments are linearized - shows a greatly improved interactive graphic display, while avoiding misalignments. Multiple alignments of whole bacteriophage genomes work, and will become an important conceptual and visual tool in comparative genomics of sets of related strains. A Python implementation of Alpha, along with installation instructions for Ubuntu and OSX, is available on bitbucket (https://bitbucket.org/thekswenson/alpha).

  6. The Saccharomyces Genome Database Variant Viewer.

    Science.gov (United States)

    Sheppard, Travis K; Hitz, Benjamin C; Engel, Stacia R; Song, Giltae; Balakrishnan, Rama; Binkley, Gail; Costanzo, Maria C; Dalusag, Kyla S; Demeter, Janos; Hellerstedt, Sage T; Karra, Kalpana; Nash, Robert S; Paskov, Kelley M; Skrzypek, Marek S; Weng, Shuai; Wong, Edith D; Cherry, J Michael

    2016-01-04

    The Saccharomyces Genome Database (SGD; http://www.yeastgenome.org) is the authoritative community resource for the Saccharomyces cerevisiae reference genome sequence and its annotation. In recent years, we have moved toward increased representation of sequence variation and allelic differences within S. cerevisiae. The publication of numerous additional genomes has motivated the creation of new tools for their annotation and analysis. Here we present the Variant Viewer: a dynamic open-source web application for the visualization of genomic and proteomic differences. Multiple sequence alignments have been constructed across high quality genome sequences from 11 different S. cerevisiae strains and stored in the SGD. The alignments and summaries are encoded in JSON and used to create a two-tiered dynamic view of the budding yeast pan-genome, available at http://www.yeastgenome.org/variant-viewer. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

  7. One bacterial cell, one complete genome.

    Directory of Open Access Journals (Sweden)

    Tanja Woyke

    2010-04-01

    While the bulk of the finished microbial genomes sequenced to date are derived from cultured bacterial and archaeal representatives, the vast majority of microorganisms elude current culturing attempts, severely limiting the ability to recover complete or even partial genomes from these environmental species. Single cell genomics is a novel culture-independent approach, which enables access to the genetic material of an individual cell. No single cell genome has to our knowledge been closed and finished to date. Here we report the completed genome from an uncultured single cell of Candidatus Sulcia muelleri DMIN. Digital PCR on single symbiont cells isolated from the bacteriome of the green sharpshooter Draeculacephala minerva allowed us to assess that this bacterium is polyploid with genome copies ranging from approximately 200-900 per cell, making it a most suitable target for single cell finishing efforts. For single cell shotgun sequencing, an individual Sulcia cell was isolated and whole genome amplified by multiple displacement amplification (MDA). Sanger-based finishing methods allowed us to close the genome. To verify the correctness of our single cell genome and exclude MDA-derived artifacts, we independently shotgun sequenced and assembled the Sulcia genome from pooled bacteriomes using a metagenomic approach, yielding a nearly identical genome. Four variations we detected appear to be genuine biological differences between the two samples. Comparison of the single cell genome with bacteriome metagenomic sequence data detected two single nucleotide polymorphisms (SNPs), indicating extremely low genetic diversity within a Sulcia population. This study demonstrates the power of single cell genomics to generate a complete, high quality, non-composite reference genome within an environmental sample, which can be used for population genetic analyses.

  8. One Bacterial Cell, One Complete Genome

    Energy Technology Data Exchange (ETDEWEB)

    Woyke, Tanja; Tighe, Damon; Mavrommatis, Konstantinos; Clum, Alicia; Copeland, Alex; Schackwitz, Wendy; Lapidus, Alla; Wu, Dongying; McCutcheon, John P.; McDonald, Bradon R.; Moran, Nancy A.; Bristow, James; Cheng, Jan-Fang

    2010-04-26

    While the bulk of the finished microbial genomes sequenced to date are derived from cultured bacterial and archaeal representatives, the vast majority of microorganisms elude current culturing attempts, severely limiting the ability to recover complete or even partial genomes from these environmental species. Single cell genomics is a novel culture-independent approach, which enables access to the genetic material of an individual cell. No single cell genome has to our knowledge been closed and finished to date. Here we report the completed genome from an uncultured single cell of Candidatus Sulcia muelleri DMIN. Digital PCR on single symbiont cells isolated from the bacteriome of the green sharpshooter Draeculacephala minerva allowed us to assess that this bacterium is polyploid with genome copies ranging from approximately 200-900 per cell, making it a most suitable target for single cell finishing efforts. For single cell shotgun sequencing, an individual Sulcia cell was isolated and whole genome amplified by multiple displacement amplification (MDA). Sanger-based finishing methods allowed us to close the genome. To verify the correctness of our single cell genome and exclude MDA-derived artifacts, we independently shotgun sequenced and assembled the Sulcia genome from pooled bacteriomes using a metagenomic approach, yielding a nearly identical genome. Four variations we detected appear to be genuine biological differences between the two samples. Comparison of the single cell genome with bacteriome metagenomic sequence data detected two single nucleotide polymorphisms (SNPs), indicating extremely low genetic diversity within a Sulcia population. This study demonstrates the power of single cell genomics to generate a complete, high quality, non-composite reference genome within an environmental sample, which can be used for population genetic analyses.

  9. Ancient genomics

    DEFF Research Database (Denmark)

    Der Sarkissian, Clio; Allentoft, Morten Erik; Avila Arcos, Maria del Carmen

    2015-01-01

    throughput of next generation sequencing platforms and the ability to target short and degraded DNA molecules. Many ancient specimens previously unsuitable for DNA analyses because of extensive degradation can now successfully be used as source materials. Additionally, the analytical power obtained...... by increasing the number of sequence reads to billions effectively means that contamination issues that have haunted aDNA research for decades, particularly in human studies, can now be efficiently and confidently quantified. At present, whole genomes have been sequenced from ancient anatomically modern humans...

  10. Marine genomics

    DEFF Research Database (Denmark)

    Oliveira Ribeiro, Ângela Maria; Foote, Andrew David; Kupczok, Anne

    2017-01-01

    Marine ecosystems occupy 71% of the surface of our planet, yet we know little about their diversity. Although the inventory of species is continually increasing, as registered by the Census of Marine Life program, only about 10% of the estimated two million marine species are known. This lag......-throughput sequencing approaches have been helping to improve our knowledge of marine biodiversity, from the rich microbial biota that forms the base of the tree of life to a wealth of plant and animal species. In this review, we present an overview of the applications of genomics to the study of marine life, from...

  11. Multiple Perspectives / Multiple Readings

    Directory of Open Access Journals (Sweden)

    Simon Biggs

    2005-01-01

    People experience things from their own physical point of view. What they see is usually a function of where they are and what physical attitude they adopt relative to the subject. With augmented vision (periscopes, mirrors, remote cameras, etc.) we are able to see things from places where we are not present. With time-shifting technologies, such as the video recorder, we can also see things from the past; a time and a place we may never have visited. In recent artistic work I have been exploring the implications of digital technology, interactivity and internet connectivity that allow people to not so much space/time-shift their visual experience of things but rather see what happens when everybody is simultaneously able to see what everybody else can see. This is extrapolated through the remote networking of sites that are actual installation spaces, where the physical movements of viewers in the space generate multiple perspectives, linked to other similar sites at remote locations or to other viewers entering the shared data-space through a web-based version of the work. This text explores the processes involved in such a practice and reflects on related questions regarding the non-singularity of being and the sense of self as linked to time and place.

  12. Genome-wide identification of the regulatory targets of a transcription factor using biochemical characterization and computational genomic analysis

    Directory of Open Access Journals (Sweden)

    Jolly Emmitt R

    2005-11-01

    Background: A major challenge in computational genomics is the development of methodologies that allow accurate genome-wide prediction of the regulatory targets of a transcription factor. We present a method for target identification that combines experimental characterization of binding requirements with computational genomic analysis. Results: Our method identified potential target genes of the transcription factor Ndt80, a key transcriptional regulator involved in yeast sporulation, using the combined information of binding affinity, positional distribution, and conservation of the binding sites across multiple species. We have also developed a mathematical approach to compute the false positive rate and the total number of targets in the genome based on the multiple selection criteria. Conclusion: We have shown that combining biochemical characterization and computational genomic analysis leads to accurate identification of the genome-wide targets of a transcription factor. The method can be extended to other transcription factors and can complement other genomic approaches to transcriptional regulation.
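
    As a rough illustration of how several selection criteria (binding affinity, position of the site, conservation across species) can be combined into one filter, consider the sketch below. It is not the authors' method: the class, field names, scales and thresholds are all hypothetical, and the false-positive-rate calculation described in the abstract is not reproduced here.

```python
from dataclasses import dataclass
from typing import Iterable, List

@dataclass(frozen=True)
class CandidateSite:
    gene: str               # gene whose upstream region contains the site
    affinity: float         # relative binding affinity on a 0..1 scale (assumed scale)
    upstream_distance: int  # distance of the site upstream of the gene start, in bp
    conserved_in: int       # number of other species in which the site is conserved

def select_targets(
    sites: Iterable[CandidateSite],
    min_affinity: float = 0.5,   # hypothetical threshold
    max_distance: int = 600,     # hypothetical threshold, in bp
    min_conserved: int = 2,      # hypothetical threshold
) -> List[str]:
    """Return genes having at least one site that passes all three criteria."""
    selected = {
        s.gene
        for s in sites
        if s.affinity >= min_affinity
        and 0 <= s.upstream_distance <= max_distance
        and s.conserved_in >= min_conserved
    }
    return sorted(selected)

# Hypothetical candidate sites; gene names are placeholders.
sites = [
    CandidateSite("GENE_A", 0.9, 200, 3),   # passes all criteria
    CandidateSite("GENE_B", 0.9, 200, 0),   # fails conservation
    CandidateSite("GENE_C", 0.6, 550, 2),   # passes all criteria
    CandidateSite("GENE_D", 0.2, 100, 4),   # fails affinity
    CandidateSite("GENE_E", 0.8, 1500, 3),  # fails position
]
targets = select_targets(sites)
```

    Requiring all criteria at once trades sensitivity for specificity; how strict each threshold should be is exactly the kind of question a false-positive-rate estimate would inform.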

  13. Multiple sclerosis

    Science.gov (United States)

    ... indwelling catheter Osteoporosis or thinning of the bones Pressure sores Side effects of medicines used to treat the ... Daily bowel care program Multiple sclerosis - discharge Preventing pressure ulcers Swallowing problems Images Multiple sclerosis MRI of the ...

  14. Ensembl Genomes 2016: more genomes, more complexity.

    Science.gov (United States)

    Kersey, Paul Julian; Allen, James E; Armean, Irina; Boddu, Sanjay; Bolt, Bruce J; Carvalho-Silva, Denise; Christensen, Mikkel; Davis, Paul; Falin, Lee J; Grabmueller, Christoph; Humphrey, Jay; Kerhornou, Arnaud; Khobova, Julia; Aranganathan, Naveen K; Langridge, Nicholas; Lowy, Ernesto; McDowall, Mark D; Maheswari, Uma; Nuhn, Michael; Ong, Chuang Kee; Overduin, Bert; Paulini, Michael; Pedro, Helder; Perry, Emily; Spudich, Giulietta; Tapanari, Electra; Walts, Brandon; Williams, Gareth; Tello-Ruiz, Marcela; Stein, Joshua; Wei, Sharon; Ware, Doreen; Bolser, Daniel M; Howe, Kevin L; Kulesha, Eugene; Lawson, Daniel; Maslen, Gareth; Staines, Daniel M

    2016-01-04

    Ensembl Genomes (http://www.ensemblgenomes.org) is an integrating resource for genome-scale data from non-vertebrate species, complementing the resources for vertebrate genomics developed in the context of the Ensembl project (http://www.ensembl.org). Together, the two resources provide a consistent set of programmatic and interactive interfaces to a rich range of data including reference sequence, gene models, transcriptional data, genetic variation and comparative analysis. This paper provides an update to the previous publications about the resource, with a focus on recent developments. These include the development of new analyses and views to represent polyploid genomes (of which bread wheat is the primary exemplar); and the continued up-scaling of the resource, which now includes over 23 000 bacterial genomes, 400 fungal genomes and 100 protist genomes, in addition to 55 genomes from invertebrate metazoa and 39 genomes from plants. This dramatic increase in the number of included genomes is one part of a broader effort to automate the integration of archival data (genome sequence, but also associated RNA sequence data and variant calls) within the context of reference genomes and make it available through the Ensembl user interfaces. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

  15. Rodent malaria parasites : genome organization & comparative genomics

    NARCIS (Netherlands)

    Kooij, Taco W.A.

    2006-01-01

    The aim of the studies described in this thesis was to investigate the genome organization of rodent malaria parasites (RMPs) and compare the organization and gene content of the genomes of RMPs and the human malaria parasite P. falciparum. The release of the complete genome sequence of P.

  16. Punctuated evolution of prostate cancer genomes.

    Science.gov (United States)

    Baca, Sylvan C; Prandi, Davide; Lawrence, Michael S; Mosquera, Juan Miguel; Romanel, Alessandro; Drier, Yotam; Park, Kyung; Kitabayashi, Naoki; MacDonald, Theresa Y; Ghandi, Mahmoud; Van Allen, Eliezer; Kryukov, Gregory V; Sboner, Andrea; Theurillat, Jean-Philippe; Soong, T David; Nickerson, Elizabeth; Auclair, Daniel; Tewari, Ashutosh; Beltran, Himisha; Onofrio, Robert C; Boysen, Gunther; Guiducci, Candace; Barbieri, Christopher E; Cibulskis, Kristian; Sivachenko, Andrey; Carter, Scott L; Saksena, Gordon; Voet, Douglas; Ramos, Alex H; Winckler, Wendy; Cipicchio, Michelle; Ardlie, Kristin; Kantoff, Philip W; Berger, Michael F; Gabriel, Stacey B; Golub, Todd R; Meyerson, Matthew; Lander, Eric S; Elemento, Olivier; Getz, Gad; Demichelis, Francesca; Rubin, Mark A; Garraway, Levi A

    2013-04-25

    The analysis of exonic DNA from prostate cancers has identified recurrently mutated genes, but the spectrum of genome-wide alterations has not been profiled extensively in this disease. We sequenced the genomes of 57 prostate tumors and matched normal tissues to characterize somatic alterations and to study how they accumulate during oncogenesis and progression. By modeling the genesis of genomic rearrangements, we identified abundant DNA translocations and deletions that arise in a highly interdependent manner. This phenomenon, which we term "chromoplexy," frequently accounts for the dysregulation of prostate cancer genes and appears to disrupt multiple cancer genes coordinately. Our modeling suggests that chromoplexy may induce considerable genomic derangement over relatively few events in prostate cancer and other neoplasms, supporting a model of punctuated cancer evolution. By characterizing the clonal hierarchy of genomic lesions in prostate tumors, we charted a path of oncogenic events along which chromoplexy may drive prostate carcinogenesis. Copyright © 2013 Elsevier Inc. All rights reserved.

  17. Genomics of Colorectal Cancer in African Americans

    OpenAIRE

    Brim, Hassan; Ashktorab, Hassan

    2016-01-01

    Genome-wide studies are increasingly becoming a must, especially for complex diseases such as cancer where multiple genes and diverse molecular mechanisms are known to be involved in genes’ function alteration. In this review, we report our latest genomic and epigenomic findings in African-American colorectal cancer patients. This population suffers a higher burden of the disease and most investigators in this field are looking for the underlying genetic and epigenetic targets that might be r...

  18. Genome-wide comparative analysis of four Indian Drosophila species.

    Science.gov (United States)

    Mohanty, Sujata; Khanna, Radhika

    2017-12-01

    Comparative analysis of multiple genomes of closely or distantly related Drosophila species undoubtedly creates excitement among evolutionary biologists in exploring the genomic changes with an ecology and evolutionary perspective. We present herewith the de novo assembled whole genome sequences of four Drosophila species, D. bipectinata, D. takahashii, D. biarmipes and D. nasuta of Indian origin using Next Generation Sequencing technology on an Illumina platform along with their detailed assembly statistics. The comparative genomics analysis, e.g. gene predictions and annotations, functional and orthogroup analysis of coding sequences and genome wide SNP distribution were performed. The whole genome of Zaprionus indianus of Indian origin published earlier by us and the genome sequences of previously sequenced 12 Drosophila species available in the NCBI database were included in the analysis. The present work is a part of our ongoing genomics project of Indian Drosophila species.

  19. Funding Opportunity: Genomic Data Centers

    Science.gov (United States)

    Funding Opportunity CCG, Funding Opportunity Center for Cancer Genomics, CCG, Center for Cancer Genomics, CCG RFA, Center for cancer genomics rfa, genomic data analysis network, genomic data analysis network centers,

  20. Genomic instability and radiation

    Energy Technology Data Exchange (ETDEWEB)

    Little, John B [Harvard School of Public Health, Boston, MA 02115 (United States)

    2003-06-01

    Genomic instability is a hallmark of cancer cells, and is thought to be involved in the process of carcinogenesis. Indeed, a number of rare genetic disorders associated with a predisposition to cancer are characterised by genomic instability occurring in somatic cells. Of particular interest is the observation that transmissible instability can be induced in somatic cells from normal individuals by exposure to ionising radiation, leading to a persistent enhancement in the rate at which mutations and chromosomal aberrations arise in the progeny of the irradiated cells after many generations of replication. If such induced instability is involved in radiation carcinogenesis, it would imply that the initial carcinogenic event may not be a rare mutation occurring in a specific gene or set of genes. Rather, radiation may induce a process of instability in many cells in a population, enhancing the rate at which the multiple gene mutations necessary for the development of cancer may arise in a given cell lineage. Furthermore, radiation could act at any stage in the development of cancer by facilitating the accumulation of the remaining genetic events required to produce a fully malignant tumour. The experimental evidence for such induced instability is reviewed. (review)

  1. Genomic instability and radiation

    International Nuclear Information System (INIS)

    Little, John B

    2003-01-01

    Genomic instability is a hallmark of cancer cells, and is thought to be involved in the process of carcinogenesis. Indeed, a number of rare genetic disorders associated with a predisposition to cancer are characterised by genomic instability occurring in somatic cells. Of particular interest is the observation that transmissible instability can be induced in somatic cells from normal individuals by exposure to ionising radiation, leading to a persistent enhancement in the rate at which mutations and chromosomal aberrations arise in the progeny of the irradiated cells after many generations of replication. If such induced instability is involved in radiation carcinogenesis, it would imply that the initial carcinogenic event may not be a rare mutation occurring in a specific gene or set of genes. Rather, radiation may induce a process of instability in many cells in a population, enhancing the rate at which the multiple gene mutations necessary for the development of cancer may arise in a given cell lineage. Furthermore, radiation could act at any stage in the development of cancer by facilitating the accumulation of the remaining genetic events required to produce a fully malignant tumour. The experimental evidence for such induced instability is reviewed. (review)

  2. MULTIPLE OBJECTS

    Directory of Open Access Journals (Sweden)

    A. A. Bosov

    2015-04-01

    Purpose. The development of complicated techniques of production and management processes, information systems, computer science, applied objects of systems theory and others requires improvement of mathematical methods and new approaches for research on application systems. The variety and diversity of subject systems make it necessary to develop a model that generalizes the classical sets and their development, sets of sets. Multiple objects, unlike sets, are constructed by multiple structures and represented by structure and content. The aim of the work is the analysis of multiple structures generating multiple objects, and the further development of operations on these objects in application systems. Methodology. To achieve the research objectives, the structure of multiple objects is represented as a constructive trio consisting of media, signatures and axiomatic. A multiple object is determined by its structure and content, and is represented by a hybrid superposition composed of sets, multi-sets, ordered sets (lists) and heterogeneous sets (sequences, corteges). Findings. In this paper we study the properties and characteristics of the components of hybrid multiple objects of complex systems, propose assessments of their complexity, and show the rules of internal and external operations on objects of implementation. We introduce the relation of arbitrary order over multiple objects, and we define the description of functions and display on objects of multiple structures. Originality. In this paper we consider the development of multiple structures generating multiple objects. Practical value. The transition from the abstract to the subject of multiple structures requires the transformation of the system and multiple objects. Transformation involves three successive stages: specification (binding to the domain), interpretation (multiple sites) and particularization (goals). The proposed describe systems approach based on hybrid sets

  3. Genomic and Epigenomic Alterations in Cancer.

    Science.gov (United States)

    Chakravarthi, Balabhadrapatruni V S K; Nepal, Saroj; Varambally, Sooryanarayana

    2016-07-01

    Multiple genetic and epigenetic events characterize tumor progression and define the identity of the tumors. Advances in high-throughput technologies, like gene expression profiling, next-generation sequencing, proteomics, and metabolomics, have enabled detailed molecular characterization of various tumors. The integration and analyses of these high-throughput data have unraveled many novel molecular aberrations and network alterations in tumors. These molecular alterations include multiple cancer-driving mutations, gene fusions, amplification, deletion, and post-translational modifications, among others. Many of these genomic events are being used in cancer diagnosis, whereas others are therapeutically targeted with small-molecule inhibitors. Multiple genes/enzymes that play a role in DNA and histone modifications are also altered in various cancers, changing the epigenomic landscape during cancer initiation and progression. Apart from protein-coding genes, studies are uncovering the critical regulatory roles played by noncoding RNAs and noncoding regions of the genome during cancer progression. Many of these genomic and epigenetic events function in tandem to drive tumor development and metastasis. Concurrent advances in genome-modulating technologies, like gene silencing and genome editing, are providing ability to understand in detail the process of cancer initiation, progression, and signaling as well as opening up avenues for therapeutic targeting. In this review, we discuss some of the recent advances in cancer genomic and epigenomic research. Copyright © 2016 American Society for Investigative Pathology. Published by Elsevier Inc. All rights reserved.

  4. Genome organization, instabilities, stem cells, and cancer

    Directory of Open Access Journals (Sweden)

    Senthil Kumar Pazhanisamy

    2009-01-01

    It is now widely recognized that advances in exploring genome organization provide remarkable insights into the induction and progression of chromosome abnormalities. Much of what we know about how mutations evolve and consequently transform into genome instabilities has been characterized in the context of the spatial organization of chromatin. Nevertheless, many underlying concepts concerning the impact of chromatin organization on the perpetuation of multiple mutations and on the propagation of chromosomal aberrations remain to be investigated in detail. The genesis of genome instabilities from the accumulation of multiple mutations that drive tumorigenesis is increasingly becoming a focal theme in cancer studies. This review focuses on structural alterations that evolve to give rise to a variety of genome instabilities, manifested at the nucleotide, gene or sub-chromosomal, and whole-chromosome levels of the genome. Here we explore an underlying connection between genome instability and cancer in the light of genome architecture. This review is limited to studies directed towards spatial organizational aspects of the origin and propagation of aberrations into genetically unstable tumors.

  5. GAAP: Genome-organization-framework-Assisted Assembly Pipeline for prokaryotic genomes.

    Science.gov (United States)

    Yuan, Lina; Yu, Yang; Zhu, Yanmin; Li, Yulai; Li, Changqing; Li, Rujiao; Ma, Qin; Siu, Gilman Kit-Hang; Yu, Jun; Jiang, Taijiao; Xiao, Jingfa; Kang, Yu

    2017-01-25

    Next-generation sequencing (NGS) technologies have greatly promoted the genomic study of prokaryotes. However, highly fragmented assemblies due to short reads from NGS are still a limiting factor in gaining insights into the genome biology. Reference-assisted tools are promising in genome assembly, but tend to result in false assembly when the assigned reference has extensive rearrangements. Herein, we present GAAP, a genome assembly pipeline for scaffolding based on core-gene-defined Genome Organizational Framework (cGOF) described in our previous study. Instead of assigning references, we use the multiple-reference-derived cGOFs as indexes to assist in order and orientation of the scaffolds and build a skeleton structure, and then use read pairs to extend scaffolds, called local scaffolding, and distinguish between true and chimeric adjacencies in the scaffolds. In our performance tests using both empirical and simulated data of 15 genomes in six species with diverse genome size, complexity, and all three categories of cGOFs, GAAP outcompetes or achieves comparable results when compared to three other reference-assisted programs, AlignGraph, Ragout and MeDuSa. GAAP uses both cGOF and pair-end reads to create assemblies in genomic scale, and performs better than the currently available reference-assisted assembly tools as it recovers more assemblies and makes fewer false locations, especially for species with extensive rearranged genomes. Our method is a promising solution for reconstruction of genome sequence from short reads of NGS.

  6. The accuracy of prediction of genomic selection in elite hybrid rye populations surpasses the accuracy of marker-assisted selection and is equally augmented by multiple field evaluation locations and test years.

    Science.gov (United States)

    Wang, Yu; Mette, Michael Florian; Miedaner, Thomas; Gottwald, Marlen; Wilde, Peer; Reif, Jochen C; Zhao, Yusheng

    2014-07-04

    Marker-assisted selection (MAS) and genomic selection (GS) based on genome-wide marker data provide powerful tools to predict the genotypic value of selection material in plant breeding. However, case-to-case optimization of these approaches is required to achieve maximum accuracy of prediction with reasonable input. Based on extended field evaluation data for grain yield, plant height, starch content and total pentosan content of elite hybrid rye derived from testcrosses involving two bi-parental populations that were genotyped with 1048 molecular markers, we compared the accuracy of prediction of MAS and GS in a cross-validation approach. MAS delivered generally lower and in addition potentially over-estimated accuracies of prediction than GS by ridge regression best linear unbiased prediction (RR-BLUP). The grade of relatedness of the plant material included in the estimation and test sets clearly affected the accuracy of prediction of GS. Within each of the two bi-parental populations, accuracies differed depending on the relatedness of the respective parental lines. Across populations, accuracy increased when both populations contributed to estimation and test set. In contrast, accuracy of prediction based on an estimation set from one population to a test set from the other population was low despite that the two bi-parental segregating populations under scrutiny shared one parental line. Limiting the number of locations or years in field testing reduced the accuracy of prediction of GS equally, supporting the view that to establish robust GS calibration models a sufficient number of test locations is of similar importance as extended testing for more than one year. In hybrid rye, genomic selection is superior to marker-assisted selection. However, it achieves high accuracies of prediction only for selection candidates closely related to the plant material evaluated in field trials, resulting in a rather pessimistic prognosis for distantly related material

  7. Exploring Other Genomes: Bacteria.

    Science.gov (United States)

    Flannery, Maura C.

    2001-01-01

    Points out the importance of genomes other than the human genome project and provides information on the identified bacterial genomes Pseudomonas aeruginosa, Leprosy, Cholera, Meningitis, Tuberculosis, Bubonic Plague, and plant pathogens. Considers the computer's use in genome studies. (Contains 14 references.) (YDS)

  8. Genomics With Cloud Computing

    OpenAIRE

    Sukhamrit Kaur; Sandeep Kaur

    2015-01-01

    Genomics is the study of genomes, which produces large amounts of data requiring large storage and computation power. These issues are addressed by cloud computing, which provides various cloud platforms for genomics. These platforms provide many services to users, such as easy access to data, easy sharing and transfer, storage in the hundreds of terabytes, and more computational power. Some cloud platforms are Google Genomics, DNAnexus and Globus Genomics. Various features of cloud computin...

  9. Reduced representation approaches to interrogate genome diversity in large repetitive plant genomes.

    Science.gov (United States)

    Hirsch, Cory D; Evans, Joseph; Buell, C Robin; Hirsch, Candice N

    2014-07-01

    Technology and software improvements in the last decade now provide methodologies to access the genome sequence of not only a single accession, but also multiple accessions of plant species. This provides a means to interrogate species diversity at the genome level. Ample diversity among accessions in a collection of species can be found, including single-nucleotide polymorphisms, insertions and deletions, copy number variation and presence/absence variation. For species with small, non-repetitive rich genomes, re-sequencing of query accessions is robust, highly informative, and economically feasible. However, for species with moderate to large sized repetitive-rich genomes, technical and economic barriers prevent en masse genome re-sequencing of accessions. Multiple approaches to access a focused subset of loci in species with larger genomes have been developed, including reduced representation sequencing, exome capture and transcriptome sequencing. Collectively, these approaches have enabled interrogation of diversity on a genome scale for large plant genomes, including crop species important to worldwide food security. © The Author 2014. Published by Oxford University Press. All rights reserved. For permissions, please email: journals.permissions@oup.com.

  10. Dynamics of genome rearrangement in bacterial populations.

    Directory of Open Access Journals (Sweden)

    Aaron E Darling

    2008-07-01

    Genome structure variation has profound impacts on phenotype in organisms ranging from microbes to humans, yet little is known about how natural selection acts on genome arrangement. Pathogenic bacteria such as Yersinia pestis, which causes bubonic and pneumonic plague, often exhibit a high degree of genomic rearrangement. The recent availability of several Yersinia genomes offers an unprecedented opportunity to study the evolution of genome structure and arrangement. We introduce a set of statistical methods to study patterns of rearrangement in circular chromosomes and apply them to the Yersinia. We constructed a multiple alignment of eight Yersinia genomes using Mauve software to identify 78 conserved segments that are internally free from genome rearrangement. Based on the alignment, we applied Bayesian statistical methods to infer the phylogenetic inversion history of Yersinia. The sampling of genome arrangement reconstructions contains seven parsimonious tree topologies, each having different histories of 79 inversions. Topologies with a greater number of inversions also exist, but were sampled less frequently. The inversion phylogenies agree with results suggested by SNP patterns. We then analyzed reconstructed inversion histories to identify patterns of rearrangement. We confirm an over-representation of "symmetric inversions", that is, inversions with endpoints that are equally distant from the origin of chromosomal replication. Ancestral genome arrangements demonstrate moderate preference for replichore balance in Yersinia. We found that all inversions are shorter than expected under a neutral model, whereas inversions acting within a single replichore are much shorter than expected. We also found evidence for a canonical configuration of the origin and terminus of replication. Finally, breakpoint reuse analysis reveals that inversions with endpoints proximal to the origin of DNA replication are nearly three times more frequent. Our findings

  11. Genome Maps, a new generation genome browser.

    Science.gov (United States)

    Medina, Ignacio; Salavert, Francisco; Sanchez, Rubén; de Maria, Alejandro; Alonso, Roberto; Escobar, Pablo; Bleda, Marta; Dopazo, Joaquín

    2013-07-01

    Genome browsers have gained importance as more genomes and related genomic information become available. However, the increase of information brought about by new generation sequencing technologies is, at the same time, causing a subtle but continuous decrease in the efficiency of conventional genome browsers. Here, we present Genome Maps, a genome browser that implements an innovative model of data transfer and management. The program uses highly efficient technologies from the new HTML5 standard, such as scalable vector graphics, that optimize workloads at both server and client sides and ensure future scalability. Thus, data management and representation are entirely carried out by the browser, without the need of any Java Applet, Flash or other plug-in technology installation. Relevant biological data on genes, transcripts, exons, regulatory features, single-nucleotide polymorphisms, karyotype and so forth, are imported from web services and are available as tracks. In addition, several DAS servers are already included in Genome Maps. As a novelty, this web-based genome browser allows the local upload of huge genomic data files (e.g. VCF or BAM) that can be dynamically visualized in real time at the client side, thus facilitating the management of medical data affected by privacy restrictions. Finally, Genome Maps can easily be integrated in any web application by including only a few lines of code. Genome Maps is an open source collaborative initiative available in the GitHub repository (https://github.com/compbio-bigdata-viz/genome-maps). Genome Maps is available at: http://www.genomemaps.org.

  12. Evolution of genes and genomes on the Drosophila phylogeny

    DEFF Research Database (Denmark)

    Clark, Andrew G; Eisen, Michael B; Smith, Douglas R

    2007-01-01

    Comparative analysis of multiple genomes in a phylogenetic framework dramatically improves the precision and sensitivity of evolutionary inference, producing more robust results than single-genome analyses can provide. The genomes of 12 Drosophila species, ten of which are presented here for the ... tools that have made Drosophila melanogaster a pre-eminent model for animal genetics, and will further catalyse fundamental research on mechanisms of development, cell biology, genetics, disease, neurobiology, behaviour, physiology and evolution. Despite remarkable similarities among these Drosophila...

  13. Multiple sclerosis

    International Nuclear Information System (INIS)

    Grunwald, I.Q.; Kuehn, A.L.; Backens, M.; Papanagiotou, P.; Shariat, K.; Kostopoulos, P.

    2008-01-01

    Multiple sclerosis is the most common chronic inflammatory disease of myelin with interspersed lesions in the white matter of the central nervous system. Magnetic resonance imaging (MRI) plays a key role in the diagnosis and monitoring of white matter diseases. This article focuses on key findings in multiple sclerosis as detected by MRI. (orig.)

  14. JGI Fungal Genomics Program

    Energy Technology Data Exchange (ETDEWEB)

    Grigoriev, Igor V.

    2011-03-14

    Genomes of energy and environment fungi are in focus of the Fungal Genomic Program at the US Department of Energy Joint Genome Institute (JGI). Its key project, the Genomics Encyclopedia of Fungi, targets fungi related to plant health (symbionts, pathogens, and biocontrol agents) and biorefinery processes (cellulose degradation, sugar fermentation, industrial hosts), and explores fungal diversity by means of genome sequencing and analysis. Over 50 fungal genomes have been sequenced by JGI to date and released through MycoCosm (www.jgi.doe.gov/fungi), a fungal web-portal, which integrates sequence and functional data with genome analysis tools for user community. Sequence analysis supported by functional genomics leads to developing parts list for complex systems ranging from ecosystems of biofuel crops to biorefineries. Recent examples of such 'parts' suggested by comparative genomics and functional analysis in these areas are presented here

  15. Genomic Encyclopedia of Fungi

    Energy Technology Data Exchange (ETDEWEB)

    Grigoriev, Igor

    2012-08-10

    Genomes of fungi relevant to energy and environment are in focus of the Fungal Genomic Program at the US Department of Energy Joint Genome Institute (JGI). Its key project, the Genomics Encyclopedia of Fungi, targets fungi related to plant health (symbionts, pathogens, and biocontrol agents) and biorefinery processes (cellulose degradation, sugar fermentation, industrial hosts), and explores fungal diversity by means of genome sequencing and analysis. Over 150 fungal genomes have been sequenced by JGI to date and released through MycoCosm (www.jgi.doe.gov/fungi), a fungal web-portal, which integrates sequence and functional data with genome analysis tools for user community. Sequence analysis supported by functional genomics leads to developing parts list for complex systems ranging from ecosystems of biofuel crops to biorefineries. Recent examples of such parts suggested by comparative genomics and functional analysis in these areas are presented here.

  16. Genomics With Cloud Computing

    Directory of Open Access Journals (Sweden)

    Sukhamrit Kaur

    2015-04-01

    Genomics is the study of genomes, which produces large amounts of data requiring large storage and computation power. These issues are addressed by cloud computing, which provides various cloud platforms for genomics. These platforms provide many services to users, such as easy access to data, easy sharing and transfer, storage in the hundreds of terabytes, and more computational power. Some cloud platforms are Google Genomics, DNAnexus and Globus Genomics. Features that cloud computing brings to genomics include easy access to and sharing of data, data security, and lower cost of resources, but there are still some drawbacks, such as the long time needed to transfer data and limited network bandwidth.

  17. Evolution of linear chromosomes and multipartite genomes in yeast mitochondria

    Science.gov (United States)

    Valach, Matus; Farkas, Zoltan; Fricova, Dominika; Kovac, Jakub; Brejova, Brona; Vinar, Tomas; Pfeiffer, Ilona; Kucsera, Judit; Tomaska, Lubomir; Lang, B. Franz; Nosek, Jozef

    2011-01-01

    Mitochondrial genome diversity in closely related species provides an excellent platform for investigation of chromosome architecture and its evolution by means of comparative genomics. In this study, we determined the complete mitochondrial DNA sequences of eight Candida species and analyzed their molecular architectures. Our survey revealed a puzzling variability of genome architecture, including circular- and linear-mapping and multipartite linear forms. We propose that the arrangement of large inverted repeats identified in these genomes plays a crucial role in alterations of their molecular architectures. In specific arrangements, the inverted repeats appear to function as resolution elements, allowing genome conversion among different topologies, eventually leading to genome fragmentation into multiple linear DNA molecules. We suggest that molecular transactions generating linear mitochondrial DNA molecules with defined telomeric structures may parallel the evolutionary emergence of linear chromosomes and multipartite genomes in general and may provide clues for the origin of telomeres and pathways implicated in their maintenance. PMID:21266473

  18. Multiple homicides.

    Science.gov (United States)

    Copeland, A R

    1989-09-01

    A study of multiple homicides or multiple deaths involving a solitary incident of violence by another individual was performed on the case files of the Office of the Medical Examiner of Metropolitan Dade County in Miami, Florida, during 1983-1987. A total of 107 multiple homicides were studied: 88 double, 17 triple, one quadruple, and one quintuple. The 236 victims were analyzed regarding age, race, sex, cause of death, toxicologic data, perpetrator, locale of the incident, and reason for the incident. This article compares this type of slaying with other types of homicide including those perpetrated by serial killers. Suggestions for future research in this field are offered.

  19. Mining genome sequencing data to identify the genomic features linked to breast cancer histopathology

    Science.gov (United States)

    Ping, Zheng; Siegal, Gene P.; Almeida, Jonas S.; Schnitt, Stuart J.; Shen, Dejun

    2014-01-01

    Background: Genetics and genomics have radically altered our understanding of breast cancer progression. However, the genomic basis of various histopathologic features of breast cancer is not yet well-defined. Materials and Methods: The Cancer Genome Atlas (TCGA) is an international database containing a large collection of human cancer genome sequencing data. cBioPortal is a web tool developed for mining these sequencing data. We performed mining of TCGA sequencing data in an attempt to characterize the genomic features correlated with breast cancer histopathology. We first assessed the quality of the TCGA data using a group of genes with known alterations in various cancers. Both genome-wide gene mutation and copy number changes as well as a group of genes with a high frequency of genetic changes were then correlated with various histopathologic features of invasive breast cancer. Results: Validation of TCGA data using a group of genes with known alterations in breast cancer suggests that the TCGA has accurately documented the genomic abnormalities of multiple malignancies. Further analysis of TCGA breast cancer sequencing data shows that accumulation of specific genomic defects is associated with higher tumor grade, larger tumor size and receptor negativity. Distinct groups of genomic changes were found to be associated with the different grades of invasive ductal carcinoma. The mutator role of the TP53 gene was validated by genomic sequencing data of invasive breast cancer and TP53 mutation was found to play a critical role in defining high tumor grade. Conclusions: Data mining of the TCGA genome sequencing data is an innovative and reliable method to help characterize the genomic abnormalities associated with histopathologic features of invasive breast cancer. PMID:24672738

  20. Mining genome sequencing data to identify the genomic features linked to breast cancer histopathology

    Directory of Open Access Journals (Sweden)

    Zheng Ping

    2014-01-01

    Background: Genetics and genomics have radically altered our understanding of breast cancer progression. However, the genomic basis of various histopathologic features of breast cancer is not yet well-defined. Materials and Methods: The Cancer Genome Atlas (TCGA) is an international database containing a large collection of human cancer genome sequencing data. cBioPortal is a web tool developed for mining these sequencing data. We performed mining of TCGA sequencing data in an attempt to characterize the genomic features correlated with breast cancer histopathology. We first assessed the quality of the TCGA data using a group of genes with known alterations in various cancers. Both genome-wide gene mutation and copy number changes as well as a group of genes with a high frequency of genetic changes were then correlated with various histopathologic features of invasive breast cancer. Results: Validation of TCGA data using a group of genes with known alterations in breast cancer suggests that the TCGA has accurately documented the genomic abnormalities of multiple malignancies. Further analysis of TCGA breast cancer sequencing data shows that accumulation of specific genomic defects is associated with higher tumor grade, larger tumor size and receptor negativity. Distinct groups of genomic changes were found to be associated with the different grades of invasive ductal carcinoma. The mutator role of the TP53 gene was validated by genomic sequencing data of invasive breast cancer and TP53 mutation was found to play a critical role in defining high tumor grade. Conclusions: Data mining of the TCGA genome sequencing data is an innovative and reliable method to help characterize the genomic abnormalities associated with histopathologic features of invasive breast cancer.

  1. MBGD update 2015: microbial genome database for flexible ortholog analysis utilizing a diverse set of genomic data.

    Science.gov (United States)

    Uchiyama, Ikuo; Mihara, Motohiro; Nishide, Hiroyo; Chiba, Hirokazu

    2015-01-01

    The microbial genome database for comparative analysis (MBGD) (available at http://mbgd.genome.ad.jp/) is a comprehensive ortholog database for flexible comparative analysis of microbial genomes, where the users are allowed to create an ortholog table among any specified set of organisms. Because of the rapid increase in microbial genome data owing to the next-generation sequencing technology, it becomes increasingly challenging to maintain high-quality orthology relationships while allowing the users to incorporate the latest genomic data available into an analysis. Because many of the recently accumulating genomic data are draft genome sequences for which some complete genome sequences of the same or closely related species are available, MBGD now stores draft genome data and allows the users to incorporate them into a user-specific ortholog database using the MyMBGD functionality. In this function, draft genome data are incorporated into an existing ortholog table created only from the complete genome data in an incremental manner to prevent low-quality draft data from affecting clustering results. In addition, to provide high-quality orthology relationships, the standard ortholog table containing all the representative genomes, which is first created by the rapid classification program DomClust, is now refined using DomRefine, a recently developed program for improving domain-level clustering using multiple sequence alignment information. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.

  2. Multiple Sclerosis

    Science.gov (United States)

    Multiple sclerosis (MS) is a nervous system disease that affects your brain and spinal cord. It damages the myelin sheath, the material that surrounds and protects your nerve cells. This damage slows down ...

  3. Multiple myeloma.

    LENUS (Irish Health Repository)

    Collins, Conor D

    2012-02-01

    Advances in the imaging and treatment of multiple myeloma have occurred over the past decade. This article summarises the current status and highlights how an understanding of both is necessary for optimum management.

  4. Multiple mononeuropathy

    Science.gov (United States)

    ... with multiple mononeuropathy are prone to new nerve injuries at pressure points such as the knees and elbows. They should avoid putting pressure on these areas, for example, by not leaning on the elbows, crossing the knees, ...

  5. pico-PLAZA, a genome database of microbial photosynthetic eukaryotes.

    Science.gov (United States)

    Vandepoele, Klaas; Van Bel, Michiel; Richard, Guilhem; Van Landeghem, Sofie; Verhelst, Bram; Moreau, Hervé; Van de Peer, Yves; Grimsley, Nigel; Piganeau, Gwenael

    2013-08-01

    With the advent of next generation genome sequencing, the number of sequenced algal genomes and transcriptomes is rapidly growing. Although a few genome portals exist to browse individual genome sequences, exploring complete genome information from multiple species for the analysis of user-defined sequences or gene lists remains a major challenge. pico-PLAZA is a web-based resource (http://bioinformatics.psb.ugent.be/pico-plaza/) for algal genomics that combines different data types with intuitive tools to explore genomic diversity, perform integrative evolutionary sequence analysis and study gene functions. Apart from homologous gene families, multiple sequence alignments, phylogenetic trees, Gene Ontology, InterPro and text-mining functional annotations, different interactive viewers are available to study genome organization using gene collinearity and synteny information. Different search functions, documentation pages, export functions and an extensive glossary are available to guide non-expert scientists. To illustrate the versatility of the platform, different case studies are presented demonstrating how pico-PLAZA can be used to functionally characterize large-scale EST/RNA-Seq data sets and to perform environmental genomics. Functional enrichments analysis of 16 Phaeodactylum tricornutum transcriptome libraries offers a molecular view on diatom adaptation to different environments of ecological relevance. Furthermore, we show how complementary genomic data sources can easily be combined to identify marker genes to study the diversity and distribution of algal species, for example in metagenomes, or to quantify intraspecific diversity from environmental strains. © 2013 John Wiley & Sons Ltd and Society for Applied Microbiology.

  6. Comparative Genome Analysis and Genome Evolution

    NARCIS (Netherlands)

    Snel, Berend

    2002-01-01

    This thesis describes a collection of bioinformatic analyses of complete genome sequence data. We have studied the evolution of gene content and find that vertical inheritance dominates over horizontal gene transfer, even to the extent that we can use gene content to make genome phylogenies.

  7. Comparative Genome Analysis of Enterobacter cloacae

    Science.gov (United States)

    Liu, Wing-Yee; Wong, Chi-Fat; Chung, Karl Ming-Kar; Jiang, Jing-Wei; Leung, Frederick Chi-Ching

    2013-01-01

    The Enterobacter cloacae species includes an extremely diverse group of bacteria that are associated with plants, soil and humans. Publication of the complete genome sequence of the plant growth-promoting endophytic E. cloacae subsp. cloacae ENHKU01 provided an opportunity to perform the first comparative genome analysis between strains of this dynamic species. Examination of the pan-genome of E. cloacae showed that the conserved core genome retains the general physiological and survival genes of the species, while genomic factors in plasmids and variable regions determine the virulence of the human pathogenic E. cloacae strain; additionally, the diversity of fimbriae contributes to variation in colonization and host determination of different E. cloacae strains. Comparative genome analysis further illustrated that E. cloacae strains possess multiple mechanisms for antagonistic action against other microorganisms, which involve the production of siderophores and various antimicrobial compounds, such as bacteriocins, chitinases and antibiotic resistance proteins. The presence of Type VI secretion systems is expected to provide further fitness advantages for E. cloacae in microbial competition, thus allowing it to survive in different environments. Competition assays were performed to support our observations in genomic analysis, where E. cloacae subsp. cloacae ENHKU01 demonstrated antagonistic activities against a wide range of plant pathogenic fungal and bacterial species. PMID:24069314

  8. Genome Sequence of the Palaeopolyploid soybean

    Energy Technology Data Exchange (ETDEWEB)

    Schmutz, Jeremy; Cannon, Steven B.; Schlueter, Jessica; Ma, Jianxin; Mitros, Therese; Nelson, William; Hyten, David L.; Song, Qijian; Thelen, Jay J.; Cheng, Jianlin; Xu, Dong; Hellsten, Uffe; May, Gregory D.; Yu, Yeisoo; Sakura, Tetsuya; Umezawa, Taishi; Bhattacharyya, Madan K.; Sandhu, Devinder; Valliyodan, Babu; Lindquist, Erika; Peto, Myron; Grant, David; Shu, Shengqiang; Goodstein, David; Barry, Kerrie; Futrell-Griggs, Montona; Abernathy, Brian; Du, Jianchang; Tian, Zhixi; Zhu, Liucun; Gill, Navdeep; Joshi, Trupti; Libault, Marc; Sethuraman, Anand; Zhang, Xue-Cheng; Shinozaki, Kazuo; Nguyen, Henry T.; Wing, Rod A.; Cregan, Perry; Specht, James; Grimwood, Jane; Rokhsar, Dan; Stacey, Gary; Shoemaker, Randy C.; Jackson, Scott A.

    2009-08-03

    Soybean (Glycine max) is one of the most important crop plants for seed protein and oil content, and for its capacity to fix atmospheric nitrogen through symbioses with soil-borne microorganisms. We sequenced the 1.1-gigabase genome by a whole-genome shotgun approach and integrated it with physical and high-density genetic maps to create a chromosome-scale draft sequence assembly. We predict 46,430 protein-coding genes, 70% more than Arabidopsis and similar to the poplar genome which, like soybean, is an ancient polyploid (palaeopolyploid). About 78% of the predicted genes occur in chromosome ends, which comprise less than one-half of the genome but account for nearly all of the genetic recombination. Genome duplications occurred at approximately 59 and 13 million years ago, resulting in a highly duplicated genome with nearly 75% of the genes present in multiple copies. The two duplication events were followed by gene diversification and loss, and numerous chromosome rearrangements. An accurate soybean genome sequence will facilitate the identification of the genetic basis of many soybean traits, and accelerate the creation of improved soybean varieties.

  9. Genome-wide association study of clinical dimensions of schizophrenia

    DEFF Research Database (Denmark)

    Fanous, Ayman H; Zhou, Baiyu; Aggen, Steven H

    2012-01-01

    Multiple sources of evidence suggest that genetic factors influence variation in clinical features of schizophrenia. The authors present the first genome-wide association study (GWAS) of dimensional symptom scores among individuals with schizophrenia.

  10. Genomic Data Commons launches

    Science.gov (United States)

    The Genomic Data Commons (GDC), a unified data system that promotes sharing of genomic and clinical data between researchers, launched today with a visit from Vice President Joe Biden to the operations center at the University of Chicago.

  11. Rat Genome Database (RGD)

    Data.gov (United States)

    U.S. Department of Health & Human Services — The Rat Genome Database (RGD) is a collaborative effort between leading research institutions involved in rat genetic and genomic research to collect, consolidate,...

  12. Visualization for genomics: the Microbial Genome Viewer.

    Science.gov (United States)

    Kerkhoven, Robert; van Enckevort, Frank H J; Boekhorst, Jos; Molenaar, Douwe; Siezen, Roland J

    2004-07-22

    A Web-based visualization tool, the Microbial Genome Viewer, is presented that allows the user to combine complex genomic data in a highly interactive way. This Web tool enables the interactive generation of chromosome wheels and linear genome maps from genome annotation data stored in a MySQL database. The generated images are in scalable vector graphics (SVG) format, which is suitable for creating high-quality scalable images and dynamic Web representations. Gene-related data such as transcriptome and time-course microarray experiments can be superimposed on the maps for visual inspection. The Microbial Genome Viewer 1.0 is freely available at http://www.cmbi.kun.nl/MGV

  13. Genomic prediction using subsampling

    OpenAIRE

    Xavier, Alencar; Xu, Shizhong; Muir, William; Rainey, Katy Martin

    2017-01-01

    Background: Genome-wide assisted selection is a critical tool for the genetic improvement of plants and animals. Whole-genome regression models in a Bayesian framework represent the main family of prediction methods. Fitting such models with a large number of observations involves a prohibitive computational burden. We propose the use of subsampling bootstrap Markov chain in genomic prediction. Such method consists of fitting whole-genome regression models by subsampling observations in each rou...

  14. GFVO: the Genomic Feature and Variation Ontology

    KAUST Repository

    Baran, Joachim

    2015-05-05

    Falling costs in genomic laboratory experiments have led to a steady increase of genomic feature and variation data. Multiple genomic data formats exist for sharing these data, and whilst they are similar, they are addressing slightly different data viewpoints and are consequently not fully compatible with each other. The fragmentation of data format specifications makes it hard to integrate and interpret data for further analysis with information from multiple data providers. As a solution, a new ontology is presented here for annotating and representing genomic feature and variation dataset contents. The Genomic Feature and Variation Ontology (GFVO) specifically addresses genomic data as it is regularly shared using the GFF3 (incl. FASTA), GTF, GVF and VCF file formats. GFVO simplifies data integration and enables linking of genomic annotations across datasets through common semantics of genomic types and relations. Availability and implementation. The latest stable release of the ontology is available via its base URI; previous and development versions are available at the ontology’s GitHub repository: https://github.com/BioInterchange/Ontologies; versions of the ontology are indexed through BioPortal (without external class-/property-equivalences due to BioPortal release 4.10 limitations); examples and reference documentation is provided on a separate web-page: http://www.biointerchange.org/ontologies.html. GFVO version 1.0.2 is licensed under the CC0 1.0 Universal license (https://creativecommons.org/publicdomain/zero/1.0) and therefore de facto within the public domain; the ontology can be appropriated without attribution for commercial and non-commercial use.

  15. Ebolavirus comparative genomics

    DEFF Research Database (Denmark)

    Jun, Se-Ran; Leuze, Michael R.; Nookaew, Intawat

    2015-01-01

    The 2014 Ebola outbreak in West Africa is the largest documented for this virus. To examine the dynamics of this genome, we compare more than 100 currently available ebolavirus genomes to each other and to other viral genomes. Based on oligomer frequency analysis, the family Filoviridae forms...

  16. [Multiple meningiomas].

    Science.gov (United States)

    Terrier, L-M; François, P

    2016-06-01

    Multiple meningiomas (MMs), or meningiomatosis, are defined by the presence of at least 2 lesions that appear, simultaneously or not, at different intracranial locations, without associated neurofibromatosis. They represent 1-9% of meningiomas, with a female predominance. The origin of multiple meningiomas is not clear. There are 2 main hypotheses for their development: one supports the independent evolution of these tumors, and the other, completely opposite, suggests the propagation through cerebrospinal fluid of tumor cells arising from the transformation of a single clone. NF2 gene mutation is an important intrinsic risk factor in the etiology of multiple meningiomas; some exogenous risk factors have been suspected, but only ionizing radiation exposure has been proven. These tumors can grow anywhere in the skull, but they are more frequently observed in supratentorial locations. Their histologic types are similar to those of solitary meningiomas (psammomatous, fibroblastic, meningothelial or transitional), and in most cases they are benign tumors. The prognosis of these tumors is generally good and does not differ from that of solitary tumors, except for radiation-induced multiple meningiomas, cases in the context of NF2, or cases diagnosed in children, where the outcome is less favorable. Each meningioma lesion should be dealt with individually, and their multiple character should not justify their resection at all costs. Copyright © 2016 Elsevier Masson SAS. All rights reserved.

  17. Draft genome sequences of seven isolates of Phytophthora ramorum EU2 from Northern Ireland

    Directory of Open Access Journals (Sweden)

    Lourdes de la Mata Saez

    2015-12-01

    Here we present draft-quality genome sequence assemblies for the oomycete Phytophthora ramorum genetic lineage EU2. We sequenced genomes of seven isolates collected in Northern Ireland between 2010 and 2012. Multiple genome sequences from P. ramorum EU2 will be valuable for identifying genetic variation within the clonal lineage that can be useful for tracking its spread.

  18. Mountain gorilla genomes reveal the impact of long-term population decline and inbreeding

    DEFF Research Database (Denmark)

    Xue, Yali; Prado-Martinez, Javier; Sudmant, Peter H

    2015-01-01

    Mountain gorillas are an endangered great ape subspecies and a prominent focus for conservation, yet we know little about their genomic diversity and evolutionary past. We sequenced whole genomes from multiple wild individuals and compared the genomes of all four Gorilla subspecies. We found that...

  19. Multiple different defense mechanisms are activated in the young transgenic tobacco plants which express the full length genome of the Tobacco mosaic virus, and are resistant against this virus.

    Science.gov (United States)

    Jada, Balaji; Soitamo, Arto J; Siddiqui, Shahid Aslam; Murukesan, Gayatri; Aro, Eva-Mari; Salakoski, Tapio; Lehto, Kirsi

    2014-01-01

    Previously described transgenic tobacco lines express the full length infectious Tobacco mosaic virus (TMV) genome under the 35S promoter (Siddiqui et al., 2007. Mol Plant Microbe Interact, 20: 1489-1494). Through their young stages these plants exhibit strong resistance against both the endogenously expressed and exogenously inoculated TMV, but at the age of about 7-8 weeks they break into TMV infection, with typical severe virus symptoms. Infections with some other viruses (Potato viruses Y, A, and X) induce the breaking of the TMV resistance and lead to synergistic proliferation of both viruses. To deduce the gene functions related to this early resistance, we have performed microarray analysis of the transgenic plants during the early resistant stage, and after the resistance break, and also of TMV-infected wild type tobacco plants. Comparison of these transcriptomes to those of corresponding wild type healthy plants indicated that 1362, 1150 and 550 transcripts were up-regulated in the transgenic plants before and after the resistance break, and in the TMV-infected wild type tobacco plants, respectively, and 1422, 1200 and 480 transcripts were down-regulated in these plants, respectively. These transcriptome alterations were distinctly different between the three types of plants, and it appears that several different mechanisms, such as the enhanced expression of the defense, hormone signaling and protein degradation pathways contributed to the TMV-resistance in the young transgenic plants. In addition to these alterations, we also observed a distinct and unique gene expression alteration in these plants, which was the strong suppression of the translational machinery. This may also contribute to the resistance by slowing down the synthesis of viral proteins. 
    Viral replication potential may also be suppressed, to some extent, by the reduction of the translation initiation and elongation factors eIF-3 and eEF1A and B, which are required for the TMV replication.

  20. gmos: Rapid Detection of Genome Mosaicism over Short Evolutionary Distances.

    Science.gov (United States)

    Domazet-Lošo, Mirjana; Domazet-Lošo, Tomislav

    2016-01-01

    Prokaryotic and viral genomes are often altered by recombination and horizontal gene transfer. The existing methods for detecting recombination are primarily aimed at viral genomes or sets of loci, since the expensive computation of underlying statistical models often hinders the comparison of complete prokaryotic genomes. As an alternative, alignment-free solutions are more efficient, but cannot map (align) a query to subject genomes. To address this problem, we have developed gmos (Genome MOsaic Structure), a new program that determines the mosaic structure of query genomes when compared to a set of closely related subject genomes. The program first computes local alignments between query and subject genomes and then reconstructs the query mosaic structure by choosing the best local alignment for each query region. To accomplish the analysis quickly, the program mostly relies on pairwise alignments and constructs multiple sequence alignments over short overlapping subject regions only when necessary. This fine-tuned implementation achieves an efficiency comparable to an alignment-free tool. The program performs well for simulated and real data sets of closely related genomes and can be used for fast recombination detection; for instance, when a new prokaryotic pathogen is discovered. As an example, gmos was used to detect genome mosaicism in a pathogenic Enterococcus faecium strain compared to seven closely related genomes. The analysis took less than two minutes on a single 2.1 GHz processor. The output is available in fasta format and can be visualized using an accessory program, gmosDraw (freely available with gmos).
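
    The step described above, choosing the best local alignment for each query region, can be illustrated with a minimal sketch. This is not the gmos implementation: the LocalAlignment record, the mosaic_structure function and the use of a single alignment-wide score as a per-position proxy are all assumptions made for illustration.

```python
from typing import List, NamedTuple, Optional, Tuple


class LocalAlignment(NamedTuple):
    """Hypothetical record for one query-to-subject local alignment."""
    q_start: int   # first aligned query position (0-based, inclusive)
    q_end: int     # one past the last aligned query position
    subject: str   # name of the subject genome
    score: float   # alignment score; higher is better


def mosaic_structure(
    query_len: int, alignments: List[LocalAlignment]
) -> List[Tuple[int, int, Optional[str]]]:
    """Label every query position with its best-scoring subject, then
    merge runs of equal labels into (start, end, subject) segments.
    Positions covered by no alignment are labelled None."""
    best_subject: List[Optional[str]] = [None] * query_len
    best_score = [float("-inf")] * query_len
    for aln in alignments:
        for pos in range(max(0, aln.q_start), min(query_len, aln.q_end)):
            if aln.score > best_score[pos]:
                best_score[pos] = aln.score
                best_subject[pos] = aln.subject

    segments: List[Tuple[int, int, Optional[str]]] = []
    start = 0
    for pos in range(1, query_len + 1):
        # Close the current segment at the end of the query or on a label change.
        if pos == query_len or best_subject[pos] != best_subject[start]:
            segments.append((start, pos, best_subject[start]))
            start = pos
    return segments


example = mosaic_structure(
    12,
    [LocalAlignment(0, 6, "subjectA", 50.0),
     LocalAlignment(4, 10, "subjectB", 80.0)],
)
```

    In this toy input the two alignments overlap at query positions 4-5, and the higher-scoring one wins there. A real tool would score the overlapping region itself rather than reuse the whole-alignment score.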

  1. Genomic characterization reconfirms the taxonomic status of Lactobacillus parakefiri

    Science.gov (United States)

    TANIZAWA, Yasuhiro; KOBAYASHI, Hisami; KAMINUMA, Eli; SAKAMOTO, Mitsuo; OHKUMA, Moriya; NAKAMURA, Yasukazu; ARITA, Masanori; TOHNO, Masanori

    2017-01-01

    Whole-genome sequencing was performed for Lactobacillus parakefiri JCM 8573T to confirm its hitherto controversial taxonomic position. Here, we report its first reliable reference genome. Genome-wide metrics, such as average nucleotide identity and digital DNA-DNA hybridization, and phylogenomic analysis based on multiple genes supported its taxonomic status as a distinct species in the genus Lactobacillus. The availability of a reliable genome sequence will aid future investigations on the industrial applications of L. parakefiri in functional foods such as kefir grains. PMID:28748134

  2. Phylogenomic Analysis and Dynamic Evolution of Chloroplast Genomes in Salicaceae

    Directory of Open Access Journals (Sweden)

    Yuan Huang

    2017-06-01

    Chloroplast genomes of plants are highly conserved in both gene order and gene content. Analysis of the whole chloroplast genome is known to provide many more informative DNA sites and thus generates high resolution for plant phylogenies. Here, we report the complete chloroplast genomes of three Salix species in the family Salicaceae. The phylogeny of Salicaceae inferred from complete chloroplast genomes is generally consistent with previous studies but resolved with higher statistical support. Incongruences of phylogeny, however, are observed in the genus Populus, which most likely result from homoplasy. By comparing three Salix chloroplast genomes with the published chloroplast genomes of other Salicaceae species, we demonstrate that the synteny and length of chloroplast genomes in Salicaceae are highly conserved but experienced dynamic evolution among species. We identify seven positively selected chloroplast genes in Salicaceae, which might be related to the adaptive evolution of Salicaceae species. Comparative chloroplast genome analysis within the family also indicates that some chloroplast genes were lost or became pseudogenes, suggesting that these chloroplast genes were horizontally transferred to the nuclear genome. Based on the complete nuclear genome sequences of two Salicaceae species, we find, remarkably, that the entire chloroplast genome was indeed transferred and integrated into the nuclear genome at least once in the individual used for the P. trichocarpa reference genome. This observation, along with the presence of large nuclear plastid DNA segments (NUPTs) and of NUPTs containing multiple chloroplast genes in their original chloroplast order, favors the DNA-mediated hypothesis of organelle-to-nucleus DNA transfer. Overall, the phylogenomic analysis using complete chloroplast genomes clearly elucidates the phylogeny of Salicaceae. The identification of positively selected chloroplast genes and dynamic chloroplast-to-nucleus gene transfers in

  3. Multiple sclerosis

    DEFF Research Database (Denmark)

    Stenager, Egon; Stenager, E N; Knudsen, Lone

    1994-01-01

    In a cross-sectional study of 117 randomly selected patients (52 men, 65 women) with definite multiple sclerosis, it was found that 76 percent were married or cohabitant, 8 percent divorced. Social contacts remained unchanged for 70 percent, but outgoing social contacts were reduced for 45 percent......, need for structural changes in home and need for pension became greater with increasing physical handicap. No significant differences between gender were found. It is concluded that patients and relatives are under increased social strain, when multiple sclerosis progresses to a moderate handicap...

  4. The evolution of genome size in ants

    Directory of Open Access Journals (Sweden)

    Spagna Joseph C

    2008-02-01

    Background: Despite the economic and ecological importance of ants, genomic tools for this family (Formicidae) remain woefully scarce. Knowledge of genome size, for example, is a useful and necessary prerequisite for the development of many genomic resources, yet it has been reported for only one ant species (Solenopsis invicta), and the two published estimates for this species differ by 146.7 Mb (0.15 pg). Results: Here, we report the genome size for 40 species of ants distributed across 10 of the 20 currently recognized subfamilies, thus making Formicidae the 4th most surveyed insect family and elevating the Hymenoptera to the 5th most surveyed insect order. Our analysis spans much of the ant phylogeny, from the less derived Amblyoponinae and Ponerinae to the more derived Myrmicinae, Formicinae and Dolichoderinae. We include a number of interesting and important taxa, including the invasive Argentine ant (Linepithema humile), Neotropical army ants (genera Eciton and Labidus), trapjaw ants (Odontomachus), fungus-growing ants (Apterostigma, Atta and Sericomyrmex), harvester ants (Messor, Pheidole and Pogonomyrmex), carpenter ants (Camponotus), a fire ant (Solenopsis), and a bulldog ant (Myrmecia). Our results show that ants possess small genomes relative to most other insects, yet genome size varies three-fold across this insect family. Moreover, our data suggest that two whole-genome duplications may have occurred in the ancestors of the modern Ectatomma and Apterostigma. Although some previous studies of other taxa have revealed a relationship between genome size and body size, our phylogenetically-controlled analysis of this correlation did not reveal a significant relationship. Conclusion: This is the first analysis of genome size in ants (Formicidae) and the first across multiple species of social insects. We show that genome size is a variable trait that can evolve gradually over long time spans, as well as rapidly, through processes that may

  5. Integrated analysis of whole genome and transcriptome sequencing reveals diverse transcriptomic aberrations driven by somatic genomic changes in liver cancers.

    Directory of Open Access Journals (Sweden)

    Yuichi Shiraishi

    Recent studies applying high-throughput sequencing technologies have identified several recurrently mutated genes and pathways in multiple cancer genomes. However, the transcriptional consequences of these genomic alterations in cancer genomes remain unclear. In this study, we performed integrated and comparative analyses of whole genomes and transcriptomes of 22 hepatitis B virus (HBV)-related hepatocellular carcinomas (HCCs) and their matched controls. Comparison of whole genome sequence (WGS) and RNA-Seq revealed much evidence that various types of genomic mutations triggered diverse transcriptional changes. Not only splice-site mutations, but also silent mutations in coding regions, deep intronic mutations and structural changes caused splicing aberrations. HBV integrations generated diverse patterns of virus-human fusion transcripts depending on the affected gene, such as TERT, CDK15, FN1 and MLL4. Structural variations could drive over-expression of genes such as WNT ligands, with or without creating gene fusions. Furthermore, by taking account of genomic mutations causing transcriptional aberrations, we could improve the sensitivity of deleterious mutation detection in known cancer driver genes (TP53, AXIN1, ARID2, RPS6KA3), and identified recurrent disruptions in putative cancer driver genes such as HNF4A, CPS1, TSC1 and THRAP3 in HCCs. These findings indicate that genomic alterations in cancer genomes have diverse transcriptomic effects, and that integrated analysis of WGS and RNA-Seq can facilitate the interpretation of the large number of genomic alterations detected in cancer genomes.

  6. Multiple myeloma

    International Nuclear Information System (INIS)

    Sohn, Jeong Ick; Ha, Choon Ho; Choi, Karp Shik

    1994-01-01

    Multiple myeloma is a malignant plasma cell tumor that is thought to originate from the proliferation of a single clone of abnormal plasma cells, resulting in the production of a monoclonal paraprotein. The authors experienced a case of multiple myeloma with severe mandibular osteolytic lesions in a 46-year-old female. After careful analysis of the clinical, radiological and histopathological features and the laboratory findings, we diagnosed it as multiple myeloma, and the following results were obtained. 1. The main clinical symptoms were intermittent dull pain in the mandibular body area, abnormal sensation of the lip, and pain due to a fracture of the right clavicle. 2. Laboratory findings revealed an M-spike, a reversed serum albumin-globulin ratio, markedly elevated ESR and hypercalcemia. 3. Radiographically, multiple osteolytic punched-out radiolucencies were evident in the skull, zygoma, jaw bones, ribs, clavicle and upper extremities. An enlarged liver and increased uptake at the lesional sites on RN scan were also observed. 4. Histopathologically, markedly hypercellular marrow with sheets of plasmoblasts and megakaryocytes was observed.

  7. Multiple sclerosis

    DEFF Research Database (Denmark)

    Stenager, E; Jensen, K

    1988-01-01

    Forty-two (12%) of a total of 366 patients with multiple sclerosis (MS) had psychiatric admissions. Of these, 34 (81%) had their first psychiatric admission in conjunction with or after the onset of MS. Classification by psychiatric diagnosis showed that there was a significant positive correlation...

  8. Multiple sclerosis

    DEFF Research Database (Denmark)

    Stenager, E; Knudsen, L; Jensen, K

    1991-01-01

    In a cross-sectional investigation of 116 patients with multiple sclerosis, the social and sparetime activities of the patient were assessed by both patient and his/her family. The assessments were correlated to physical disability which showed that particularly those who were moderately disabled...

  9. Multiple sclerosis

    DEFF Research Database (Denmark)

    Stenager, E; Jensen, K

    1990-01-01

    An investigation on the correlation between ability to read TV subtitles and the duration of visual evoked potential (VEP) latency in 14 patients with definite multiple sclerosis (MS), indicated that VEP latency in patients unable to read the TV subtitles was significantly delayed in comparison...

  10. Multiple sclerosis

    DEFF Research Database (Denmark)

    Stenager, E; Knudsen, L; Jensen, K

    1994-01-01

    In a cross-sectional study of 94 patients (42 males, 52 females) with definite multiple sclerosis (MS) in the age range 25-55 years, the correlation of neuropsychological tests with the ability to read TV-subtitles and with the use of sedatives is examined. A logistic regression analysis reveals...

  11. Multiple Sclerosis.

    Science.gov (United States)

    Plummer, Nancy; Michael, Nancy, Ed.

    This module on multiple sclerosis is intended for use in inservice or continuing education programs for persons who administer medications in long-term care facilities. Instructor information, including teaching suggestions, and a listing of recommended audiovisual materials and their sources appear first. The module goal and objectives are then…

  12. Parenting Multiples

    Science.gov (United States)

    ... when your babies do. Though it can be hard to let go of the thousand other things you need to do, remember that your well-being is key to your ability to take care of your babies. What Problems Can Happen? It may be hard to tell multiple babies apart when they first ...

  13. SIGMA: A System for Integrative Genomic Microarray Analysis of Cancer Genomes

    Directory of Open Access Journals (Sweden)

    Davies Jonathan J

    2006-12-01

    Background: The prevalence of high resolution profiling of genomes has created a need for the integrative analysis of information generated from multiple methodologies and platforms. Although the majority of data in the public domain are gene expression profiles, and expression analysis software is available, the increase of array CGH studies has enabled integration of high throughput genomic and gene expression datasets. However, tools for direct mining and analysis of array CGH data are limited. Hence, there is a great need for analytical and display software tailored to cross platform integrative analysis of cancer genomes. Results: We have created a user-friendly Java application to facilitate sophisticated visualization and analysis such as cross-tumor and cross-platform comparisons. To demonstrate the utility of this software, we assembled array CGH data representing Affymetrix SNP chip, Stanford cDNA arrays and whole genome tiling path array platforms for cross comparison. This cancer genome database contains 267 profiles from commonly used cancer cell lines representing 14 different tissue types. Conclusion: In this study we have developed an application for the visualization and analysis of data from high resolution array CGH platforms that can be adapted for analysis of multiple types of high throughput genomic datasets. Furthermore, we invite researchers using array CGH technology to deposit both their raw and processed data, as this will be a continually expanding database of cancer genomes. This publicly available resource, the System for Integrative Genomic Microarray Analysis (SIGMA) of cancer genomes, can be accessed at http://sigma.bccrc.ca.

  14. Long identical multispecies elements in plant and animal genomes.

    Science.gov (United States)

    Reneker, Jeff; Lyons, Eric; Conant, Gavin C; Pires, J Chris; Freeling, Michael; Shyu, Chi-Ren; Korkin, Dmitry

    2012-05-08

    Ultraconserved elements (UCEs) are DNA sequences that are 100% identical (no base substitutions, insertions, or deletions) and located in syntenic positions in at least two genomes. Although hundreds of UCEs have been found in animal genomes, little is known about the incidence of ultraconservation in plant genomes. Using an alignment-free information-retrieval approach, we have comprehensively identified all long identical multispecies elements (LIMEs), which include both syntenic and nonsyntenic regions, of at least 100 identical base pairs shared by at least two genomes. Among six animal genomes, we found the previously known syntenic UCEs as well as previously undescribed nonsyntenic elements. In contrast, among six plant genomes, we only found nonsyntenic LIMEs. LIMEs can also be classified as either simple (repetitive) or complex (nonrepetitive), they may occur in multiple copies in a genome, and they are often spread across multiple chromosomes. Although complex LIMEs were found in both animal and plant genomes, they differed significantly in their composition and copy number. Further analyses of plant LIMEs revealed their functional diversity, encompassing elements found near rRNA and enzyme-coding genes, as well as those found in transposons and noncoding DNA. We conclude that despite the common presence of LIMEs in both animal and plant lineages, the evolutionary processes involved in the creation and maintenance of these elements differ in the two groups and are likely attributable to several mechanisms, including transfer of genetic material from organellar to nuclear genomes, de novo sequence manufacturing, and purifying selection.
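
    The definition used above (runs of at least 100 identical base pairs shared by at least two genomes) can be made concrete with a small sketch. It is not the authors' alignment-free information-retrieval method: it swaps in a plain k-mer index with exact extension, works on two sequences only, ignores reverse complements, and the function name and min_len parameter are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def shared_identical_elements(
    a: str, b: str, min_len: int = 100
) -> List[Tuple[int, int, int]]:
    """Return (pos_in_a, pos_in_b, length) for every maximal run of
    identical bases of length >= min_len shared by sequences a and b."""
    # Index every window of length min_len in a by its start position.
    index: Dict[str, List[int]] = defaultdict(list)
    for i in range(len(a) - min_len + 1):
        index[a[i:i + min_len]].append(i)

    hits = set()
    for j in range(len(b) - min_len + 1):
        for i in index.get(b[j:j + min_len], ()):
            # Skip seeds that can be extended to the left: the same run
            # is reported once, from its leftmost seed.
            if i > 0 and j > 0 and a[i - 1] == b[j - 1]:
                continue
            length = min_len
            # Extend to the right while the bases stay identical.
            while (i + length < len(a) and j + length < len(b)
                   and a[i + length] == b[j + length]):
                length += 1
            hits.add((i, j, length))
    return sorted(hits)


# Toy sequences sharing the 8-base run "ACGGATCC"; min_len is lowered to 6
# only so the example fits on one line.
toy_hits = shared_identical_elements("TTACGGATCCAT", "CCACGGATCCGG", min_len=6)
```

    Indexing every window costs memory proportional to sequence length times min_len, so this form suits only short test sequences. Genome-scale searches would need a suffix array or a similar compact index.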

  15. Short and long-term genome stability analysis of prokaryotic genomes.

    Science.gov (United States)

    Brilli, Matteo; Liò, Pietro; Lacroix, Vincent; Sagot, Marie-France

    2013-05-08

    Gene organization dynamics is actively studied because it provides useful evolutionary information, makes functional annotation easier and often enables the characterization of pathogens. There is therefore a strong interest in understanding the variability of this trait and its possible correlations with life-style. Two kinds of events affect genome organization: on one hand, translocations and recombinations change the relative position of genes shared by two genomes (i.e. the backbone gene order); on the other, insertions and deletions leave the backbone gene order unchanged but alter the gene neighborhoods by breaking the syntenic regions. A complete picture of genome organization evolution therefore requires accounting for both kinds of events. We developed an approach where we model chromosomes as graphs on which we compute different stability estimators; we consider genome rearrangements as well as the effect of gene insertions and deletions. In the first part of the paper, we fit a measure of backbone gene order conservation (hereinafter called backbone stability) against phylogenetic distance for over 3000 genome comparisons, improving existing models for the divergence in time of backbone stability. Intra- and inter-specific comparisons were treated separately to focus on different time-scales. The use of multiple genomes of the same species allowed us to identify genomes whose gene order diverges from that of their conspecifics. The inter-species analysis indicates that pathogens are more often unstable than non-pathogens. In the second part of the text, we show that in pathogens, gene content dynamics (insertions and deletions) have a much more dramatic effect on genome organization stability than backbone rearrangements. In this work, we studied genome organization divergence taking into account the contribution of both gene order rearrangements and gene content dynamics. By studying species with multiple sequenced genomes available, we were

  16. Advances in Genomics of Entomopathogenic Fungi.

    Science.gov (United States)

    Wang, J B; St Leger, R J; Wang, C

    2016-01-01

    Fungi are the commonest pathogens of insects and crucial regulators of insect populations. The rapid advance of genome technologies has revolutionized our understanding of entomopathogenic fungi with multiple Metarhizium spp. sequenced, as well as Beauveria bassiana, Cordyceps militaris, and Ophiocordyceps sinensis among others. Phylogenomic analysis suggests that the ancestors of many of these fungi were plant endophytes or pathogens, with entomopathogenicity being an acquired characteristic. These fungi now occupy a wide range of habitats and hosts, and their genomes have provided a wealth of information on the evolution of virulence-related characteristics, as well as the protein families and genomic structure associated with ecological and econutritional heterogeneity, genome evolution, and host range diversification. In particular, their evolutionary transition from plant pathogens or endophytes to insect pathogens provides a novel perspective on how new functional mechanisms important for host switching and virulence are acquired. Importantly, genomic resources have helped make entomopathogenic fungi ideal model systems for answering basic questions in parasitology, entomology, and speciation. At the same time, identifying the selective forces that act upon entomopathogen fitness traits could underpin both the development of new mycoinsecticides and further our understanding of the natural roles of these fungi in nature. These roles frequently include mutualistic relationships with plants. Genomics has also facilitated the rapid identification of genes encoding biologically useful molecules, with implications for the development of pharmaceuticals and the use of these fungi as bioreactors. Copyright © 2016 Elsevier Inc. All rights reserved.

  17. Endogenous viral elements in animal genomes.

    Directory of Open Access Journals (Sweden)

    Aris Katzourakis

    2010-11-01

    Integration into the nuclear genome of germ line cells can lead to vertical inheritance of retroviral genes as host alleles. For other viruses, germ line integration has only rarely been documented. Nonetheless, we identified endogenous viral elements (EVEs) derived from ten non-retroviral families by systematic in silico screening of animal genomes, including the first endogenous representatives of double-stranded RNA, reverse-transcribing DNA, and segmented RNA viruses, and the first endogenous DNA viruses in mammalian genomes. Phylogenetic and genomic analysis of EVEs across multiple host species revealed novel information about the origin and evolution of diverse virus groups. Furthermore, several of the elements identified here encode intact open reading frames or are expressed as mRNA. For one element in the primate lineage, we provide statistically robust evidence for exaptation. Our findings establish that genetic material derived from all known viral genome types and replication strategies can enter the animal germ line, greatly broadening the scope of paleovirological studies and indicating a more significant evolutionary role for gene flow from virus to animal genomes than has previously been recognized.

  18. Efficient oligonucleotide probe selection for pan-genomic tiling arrays

    Directory of Open Access Journals (Sweden)

    Zhang Wei

    2009-09-01

    Background: Array comparative genomic hybridization is a fast and cost-effective method for detecting, genotyping, and comparing the genomic sequence of unknown bacterial isolates. This method, as with all microarray applications, requires adequate coverage of probes targeting the regions of interest. An unbiased tiling of probes across the entire length of the genome is the most flexible design approach. However, such a whole-genome tiling requires that the genome sequence is known in advance. For the accurate analysis of uncharacterized bacteria, an array must query a fully representative set of sequences from the species' pan-genome. Prior microarrays have included only a single strain per array or the conserved sequences of gene families. These arrays omit potentially important genes and sequence variants from the pan-genome. Results: This paper presents a new probe selection algorithm (PanArray) that can tile multiple whole genomes using a minimal number of probes. Unlike arrays built on clustered gene families, PanArray uses an unbiased, probe-centric approach that does not rely on annotations, gene clustering, or multi-alignments. Instead, probes are evenly tiled across all sequences of the pan-genome at a consistent level of coverage. To minimize the required number of probes, probes conserved across multiple strains in the pan-genome are selected first, and additional probes are used only where necessary to span polymorphic regions of the genome. The viability of the algorithm is demonstrated by array designs for seven different bacterial pan-genomes and, in particular, the design of a 385,000 probe array that fully tiles the genomes of 20 different Listeria monocytogenes strains with overlapping probes at greater than twofold coverage. Conclusion: PanArray is an oligonucleotide probe selection algorithm for tiling multiple genome sequences using a minimal number of probes. It is capable of fully tiling all genomes of a species on
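
    The selection order described in this record (probes shared by several strains first, extra probes only where needed) can be illustrated with a toy greedy cover. This is a minimal sketch and not PanArray's actual implementation: the function name tile_pan_genome, the use of exact substrings as probes, and the "most uncovered bases first" scoring rule are all assumptions made for illustration, and the sketch covers each base once rather than enforcing a consistent coverage level.

```python
def tile_pan_genome(genomes, probe_len):
    """Greedily pick probes until every base of every genome is covered.

    Hypothetical toy model: a probe is an exact substring of length
    probe_len, and it covers every base of every occurrence in any genome.
    """
    if any(len(seq) < probe_len for seq in genomes):
        raise ValueError("every genome must be at least probe_len long")

    # Map each candidate probe to the set of (genome, position) bases it covers.
    covers = {}
    for g, seq in enumerate(genomes):
        for start in range(len(seq) - probe_len + 1):
            probe = seq[start:start + probe_len]
            bases = covers.setdefault(probe, set())
            bases.update((g, start + off) for off in range(probe_len))

    uncovered = {(g, i) for g, seq in enumerate(genomes) for i in range(len(seq))}
    chosen = []
    while uncovered:
        # Probes conserved across strains cover the most bases, so they win first.
        best = max(covers, key=lambda p: (len(covers[p] & uncovered), p))
        chosen.append(best)
        uncovered -= covers[best]
    return chosen


if __name__ == "__main__":
    print(tile_pan_genome(["ACGTACGT", "ACGTTTTT"], 4))
```

    In this toy input the shared probe is chosen before the strain-specific one, which mirrors the ordering the abstract describes; a real design would also have to handle hybridization constraints that the sketch ignores.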

  19. Multiple sclerosis

    International Nuclear Information System (INIS)

    Sadashima, Hiromichi; Kusaka, Hirofumi; Imai, Terukuni; Takahashi, Ryosuke; Matsumoto, Sadayuki; Yamamoto, Toru; Yamasaki, Masahiro; Maya, Kiyomi

    1986-01-01

    Eleven patients with a definite diagnosis of multiple sclerosis were examined in terms of correlations between the clinical features and the results of cranial computed tomography (CT), and magnetic resonance imaging (MRI). Results: In 5 of the 11 patients, both CT and MRI demonstrated lesions consistent with a finding of multiple sclerosis. In 3 patients, only MRI demonstrated lesions. In the remaining 3 patients, neither CT nor MRI revealed any lesion in the brain. All 5 patients who showed abnormal findings on both CT and MRI had clinical signs either of cerebral or brainstem - cerebellar lesions. On the other hand, two of the 3 patients with normal CT and MRI findings had optic-nerve and spinal-cord signs. Therefore, our results suggested relatively good correlations between the clinical features, CT, and MRI. MRI revealed cerebral lesions in two of the four patients with clinical signs of only optic-nerve and spinal-cord lesions. MRI demonstrated sclerotic lesions in 3 of the 6 patients whose plaques were not detected by CT. In conclusion, MRI proved to be more helpful in the demonstration of lesions attributable to chronic multiple sclerosis. (author)

  20. Multiplex Genome Editing in Escherichia coli

    DEFF Research Database (Denmark)

    Ingemann Jensen, Sheila; Nielsen, Alex Toftgaard

    2018-01-01

    Lambda Red recombineering is an easy and efficient method for generating genetic modifications in Escherichia coli. For gene deletions, lambda Red recombineering is combined with the use of selectable markers, which are removed through the action of, e.g., flippase (Flp) recombinase. This PCR-based engineering method has also been applied to a number of other bacteria. In this chapter, we describe a recently developed one plasmid-based method as well as the use of a strain with genomically integrated recombineering genes, which significantly speeds up the engineering of strains with multiple genomic...

  1. Broad genomic and transcriptional analysis reveals a highly derived genome in dinoflagellate mitochondria

    Directory of Open Access Journals (Sweden)

    Keeling Patrick J

    2007-09-01

    Background: Dinoflagellates comprise an ecologically significant and diverse eukaryotic phylum that is sister to the phylum containing apicomplexan endoparasites. The mitochondrial genome of apicomplexans is uniquely reduced in gene content and size, encoding only three proteins and two ribosomal RNAs (rRNAs) within a highly compacted 6 kb DNA. Dinoflagellate mitochondrial genomes have been comparatively poorly studied: limited available data suggest some similarities with apicomplexan mitochondrial genomes but an even more radical type of genomic organization. Here, we investigate structure, content and expression of dinoflagellate mitochondrial genomes. Results: From two dinoflagellates, Crypthecodinium cohnii and Karlodinium micrum, we generated over 42 kb of mitochondrial genomic data that indicate a reduced gene content paralleling that of mitochondrial genomes in apicomplexans, i.e., only three protein-encoding genes and at least eight conserved components of the highly fragmented large and small subunit rRNAs. Unlike in apicomplexans, dinoflagellate mitochondrial genes occur in multiple copies, often as gene fragments, and in numerous genomic contexts. Analysis of cDNAs suggests several novel aspects of dinoflagellate mitochondrial gene expression. Polycistronic transcripts were found, standard start codons are absent, and oligoadenylation occurs upstream of stop codons, resulting in the absence of termination codons. Transcripts of at least one gene, cox3, are apparently trans-spliced to generate full-length mRNAs. RNA substitutional editing, a process previously identified for mRNAs in dinoflagellate mitochondria, is also implicated in rRNA expression. Conclusion: The dinoflagellate mitochondrial genome shares the same gene complement and fragmentation of rRNA genes with its apicomplexan counterpart. However, it also exhibits several unique characteristics. Most notable are the expansion of gene copy numbers and their arrangements

  2. ChloroMitoCU: Codon patterns across organelle genomes for functional genomics and evolutionary applications.

    Science.gov (United States)

    Sablok, Gaurav; Chen, Ting-Wen; Lee, Chi-Ching; Yang, Chi; Gan, Ruei-Chi; Wegrzyn, Jill L; Porta, Nicola L; Nayak, Kinshuk C; Huang, Po-Jung; Varotto, Claudio; Tang, Petrus

    2017-06-01

    Organelle genomes are widely thought to have arisen from reduction events involving cyanobacterial and archaeal genomes, in the case of chloroplasts, or α-proteobacterial genomes, in the case of mitochondria. Heterogeneity in base composition and codon preference has long been the subject of investigation of topics ranging from phylogenetic distortion to the design of overexpression cassettes for transgenic expression. From the overexpression point of view, it is critical to systematically analyze the codon usage patterns of the organelle genomes. In light of the importance of codon usage patterns in the development of hyper-expression organelle transgenics, we present ChloroMitoCU, the first-ever curated, web-based reference catalog of the codon usage patterns in organelle genomes. ChloroMitoCU contains the pre-compiled codon usage patterns of 328 chloroplast genomes (29,960 CDS) and 3,502 mitochondrial genomes (49,066 CDS), enabling genome-wide exploration and comparative analysis of codon usage patterns across species. ChloroMitoCU allows the phylogenetic comparison of codon usage patterns across organelle genomes, the prediction of codon usage patterns based on user-submitted transcripts or assembled organelle genes, and comparative analysis with the pre-compiled patterns across species of interest. ChloroMitoCU can increase our understanding of the biased patterns of codon usage in organelle genomes across multiple clades. ChloroMitoCU can be accessed at: http://chloromitocu.cgu.edu.tw/. © The Author 2017. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.
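
    As an illustration of what a codon usage pattern is, the sketch below tabulates codon frequencies per thousand codons for a single coding sequence. It is a minimal example under stated assumptions, not ChloroMitoCU's code: the function name codon_usage, the per-thousand normalization, and the decision to skip codons containing ambiguous bases are choices made here for illustration.

```python
from collections import Counter


def codon_usage(cds):
    """Return {codon: occurrences per 1000 codons} for one in-frame CDS.

    Hypothetical helper: reads the sequence in frame 0, treats U as T,
    ignores a trailing partial codon and any codon with non-ACGT symbols.
    """
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    codons = [c for c in codons if set(c) <= set("ACGT")]
    counts = Counter(codons)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {codon: 1000 * n / total for codon, n in counts.items()}


if __name__ == "__main__":
    for codon, per_thousand in sorted(codon_usage("ATGGCTGCTTAA").items()):
        print(codon, per_thousand)
```

    Comparing such tables across genomes, for example per species or per clade, is the kind of analysis the record describes; measures that normalize within each amino acid would additionally need a genetic code table, which differs between organelles.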

  3. GenoSets: visual analytic methods for comparative genomics.

    Directory of Open Access Journals (Sweden)

    Aurora A Cain

    Many important questions in biology are, fundamentally, comparative, and this extends to our analysis of a growing number of sequenced genomes. Existing genomic analysis tools are often organized around literal views of genomes as linear strings. Even when information is highly condensed, these views grow cumbersome as larger numbers of genomes are added. Data aggregation and summarization methods from the field of visual analytics can provide abstracted comparative views, suitable for sifting large multi-genome datasets to identify critical similarities and differences. We introduce a software system for visual analysis of comparative genomics data. The system automates the process of data integration, and provides the analysis platform to identify and explore features of interest within these large datasets. GenoSets borrows techniques from business intelligence and visual analytics to provide a rich interface of interactive visualizations supported by a multi-dimensional data warehouse. In GenoSets, visual analytic approaches are used to enable querying based on orthology, functional assignment, and taxonomic or user-defined groupings of genomes. GenoSets links this information together with coordinated, interactive visualizations for both detailed and high-level categorical analysis of summarized data. GenoSets has been designed to simplify the exploration of multiple genome datasets and to facilitate reasoning about genomic comparisons. Case examples are included showing the use of this system in the analysis of 12 Brucella genomes. GenoSets software and the case study dataset are freely available at http://genosets.uncc.edu. We demonstrate that the integration of genomic data using a coordinated multiple view approach can simplify the exploration of large comparative genomic data sets, and facilitate reasoning about comparisons and features of interest.

  4. A quantitative account of genomic island acquisitions in prokaryotes

    Directory of Open Access Journals (Sweden)

    Roos Tom E

    2011-08-01

    Background: Microbial genomes do not merely evolve through the slow accumulation of mutations, but also, and often more dramatically, by taking up new DNA in a process called horizontal gene transfer. These innovation leaps in the acquisition of new traits can take place via the introgression of single genes, but also through the acquisition of large gene clusters, which are termed Genomic Islands. Since only a small proportion of all the DNA diversity has been sequenced, it can be hard to find the appropriate donors for acquired genes via sequence alignments from databases. In contrast, relative oligonucleotide frequencies represent a remarkably stable genomic signature in prokaryotes, which facilitates compositional comparisons as an alignment-free alternative for phylogenetic relatedness. In this project, we test whether Genomic Islands identified in individual bacterial genomes have a similar genomic signature, in terms of relative dinucleotide frequencies, and can therefore be expected to originate from a common donor species. Results: When multiple Genomic Islands are present within a single genome, we find that up to 28% of these are compositionally very similar to each other, indicative of frequent recurring acquisitions from the same donor to the same acceptor. Conclusions: This represents the first quantitative assessment of common directional transfer events in prokaryotic evolutionary history. We suggest that many of the resident Genomic Islands per prokaryotic genome originated from the same source, which may have implications with respect to their regulatory interactions, and for the elucidation of the common origins of these acquired gene clusters.
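
    A compositional comparison of this kind can be sketched as follows. The abstract does not give the exact formula, so this example assumes one common definition of relative dinucleotide frequency, the odds ratio f_xy / (f_x * f_y), and compares two sequences by the mean absolute difference over all 16 dinucleotides. The function names and the single-strand counting are assumptions for illustration, not the authors' method.

```python
from collections import Counter
from itertools import product

BASES = "ACGT"


def dinucleotide_signature(seq):
    """Return {dinucleotide: f_xy / (f_x * f_y)} counted on one strand."""
    seq = seq.upper()
    mono = Counter(b for b in seq if b in BASES)
    di = Counter(
        seq[i:i + 2]
        for i in range(len(seq) - 1)
        if seq[i] in BASES and seq[i + 1] in BASES
    )
    n_mono, n_di = sum(mono.values()), sum(di.values())
    if n_mono == 0 or n_di == 0:
        raise ValueError("sequence has no countable dinucleotides")
    signature = {}
    for x, y in product(BASES, repeat=2):
        expected = (mono[x] / n_mono) * (mono[y] / n_mono)
        observed = di[x + y] / n_di
        # A base that never occurs makes the ratio undefined; report 0.0.
        signature[x + y] = observed / expected if expected else 0.0
    return signature


def signature_distance(seq_a, seq_b):
    """Mean absolute difference between two dinucleotide signatures."""
    sig_a = dinucleotide_signature(seq_a)
    sig_b = dinucleotide_signature(seq_b)
    return sum(abs(sig_a[k] - sig_b[k]) for k in sig_a) / len(sig_a)


if __name__ == "__main__":
    print(signature_distance("ACGTACGTAC" * 5, "AATTAATTGG" * 5))
```

    Two islands with a small distance would count as compositionally similar in this toy setting; the threshold for "very similar" is a separate choice that the sketch does not make.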

  5. Insights into bilaterian evolution from three spiralian genomes

    Energy Technology Data Exchange (ETDEWEB)

    Simakov, Oleg; Marletaz, Ferdinand; Cho, Sung-Jin; Edsinger-Gonzales, Eric; Havlak, Paul; Hellsten, Uffe; Kuo, Dian-Han; Larsson, Tomas; Lv, Jie; Arendt, Detlev; Savage, Robert; Osoegawa, Kazutoyo; de Jong, Pieter; Grimwood, Jane; Chapman, Jarrod A.; Shapiro, Harris; Otillar, Robert P.; Terry, Astrid Y.; Boore, Jeffrey L.; Grigoriev, Igor V.; Lindberg, David R.; Seaver, Elaine C.; Weisblat, David A.; Putnam, Nicholas H.; Rokhsar, Daniel S.; Aerts, Andrea

    2012-01-07

    Current genomic perspectives on animal diversity neglect two prominent phyla, the molluscs and annelids, that together account for nearly one-third of known marine species and are important both ecologically and as experimental systems in classical embryology [1, 2, 3]. Here we describe the draft genomes of the owl limpet (Lottia gigantea), a marine polychaete (Capitella teleta) and a freshwater leech (Helobdella robusta), and compare them with other animal genomes to investigate the origin and diversification of bilaterians from a genomic perspective. We find that the genome organization, gene structure and functional content of these species are more similar to those of some invertebrate deuterostome genomes (for example, amphioxus and sea urchin) than those of other protostomes that have been sequenced to date (flies, nematodes and flatworms). The conservation of these genomic features enables us to expand the inventory of genes present in the last common bilaterian ancestor, establish the tripartite diversification of bilaterians using multiple genomic characteristics and identify ancient conserved long- and short-range genetic linkages across metazoans. Superimposed on this broadly conserved pan-bilaterian background we find examples of lineage-specific genome evolution, including varying rates of rearrangement, intron gain and loss, expansions and contractions of gene families, and the evolution of clade-specific genes that produce the unique content of each genome.

  6. Bioinformatics decoding the genome

    CERN Multimedia

    CERN. Geneva; Deutsch, Sam; Michielin, Olivier; Thomas, Arthur; Descombes, Patrick

    2006-01-01

    Extracting the fundamental genomic sequence from the DNA. From Genome to Sequence: Biology in the early 21st century has been radically transformed by the availability of the full genome sequences of an ever increasing number of life forms, from bacteria to major crop plants and to humans. The lecture will concentrate on the computational challenges associated with the production, storage and analysis of genome sequence data, with an emphasis on mammalian genomes. The quality and usability of genome sequences is increasingly conditioned by the careful integration of strategies for data collection and computational analysis, from the construction of maps and libraries to the assembly of raw data into sequence contigs and chromosome-sized scaffolds. Once the sequence is assembled, a major challenge is the mapping of biologically relevant information onto this sequence: promoters, introns and exons of protein-encoding genes, regulatory elements, functional RNAs, pseudogenes, transposons, etc. The methodological ...

  7. Genomic research in Eucalyptus.

    Science.gov (United States)

    Poke, Fiona S; Vaillancourt, René E; Potts, Brad M; Reid, James B

    2005-09-01

    Eucalyptus L'Hérit. is a genus comprised of more than 700 species that is of vital importance ecologically to Australia and to the forestry industry world-wide, being grown in plantations for the production of solid wood products as well as pulp for paper. With the sequencing of the genomes of Arabidopsis thaliana and Oryza sativa and the recent completion of the first tree genome sequence, Populus trichocarpa, attention has turned to the current status of genomic research in Eucalyptus. For several eucalypt species, large segregating families have been established, high-resolution genetic maps constructed and large EST databases generated. Collaborative efforts have been initiated for the integration of diverse genomic projects and will provide the framework for future research including exploiting the sequence of the entire eucalypt genome which is currently being sequenced. This review summarises the current position of genomic research in Eucalyptus and discusses the direction of future research.

  8. Approaches for Comparative Genomics in Aspergillus and Penicillium

    DEFF Research Database (Denmark)

    Rasmussen, Jane Lind Nybo; Theobald, Sebastian; Brandl, Julian

    2016-01-01

    and applicable for many types of studies. In this chapter, we provide an overview of the state-of-the-art of comparative genomics in these fungi, along with recommended methods. The chapter describes databases for fungal comparative genomics. Based on experience, we suggest strategies for multiple types...... of comparative genomics, ranging from analysis of single genes, over gene clusters and CaZymes to genome-scale comparative genomics. Furthermore, we have examined published comparative genomics papers to summarize the preferred bioinformatic methods and parameters for a given type of analysis, highly useful...... comparative genomics to the development in bacterial genomics, where the comparison of hundreds of genomes has been performed for a while....

  9. The Drosophila genome nexus: a population genomic resource of 623 Drosophila melanogaster genomes, including 197 from a single ancestral range population.

    Science.gov (United States)

    Lack, Justin B; Cardeno, Charis M; Crepeau, Marc W; Taylor, William; Corbett-Detig, Russell B; Stevens, Kristian A; Langley, Charles H; Pool, John E

    2015-04-01

    Hundreds of wild-derived Drosophila melanogaster genomes have been published, but rigorous comparisons across data sets are precluded by differences in alignment methodology. The most common approach to reference-based genome assembly is a single round of alignment followed by quality filtering and variant detection. We evaluated variations and extensions of this approach and settled on an assembly strategy that utilizes two alignment programs and incorporates both substitutions and short indels to construct an updated reference for a second round of mapping prior to final variant detection. Utilizing this approach, we reassembled published D. melanogaster population genomic data sets and added unpublished genomes from several sub-Saharan populations. Most notably, we present aligned data from phase 3 of the Drosophila Population Genomics Project (DPGP3), which provides 197 genomes from a single ancestral range population of D. melanogaster (from Zambia). The large sample size, high genetic diversity, and potentially simpler demographic history of the DPGP3 sample will make this a highly valuable resource for fundamental population genetic research. The complete set of assemblies described here, termed the Drosophila Genome Nexus, presently comprises 623 consistently aligned genomes and is publicly available in multiple formats with supporting documentation and bioinformatic tools. This resource will greatly facilitate population genomic analysis in this model species by reducing the methodological differences between data sets. Copyright © 2015 by the Genetics Society of America.

  10. Genome packaging in viruses

    OpenAIRE

    Sun, Siyang; Rao, Venigalla B.; Rossmann, Michael G.

    2010-01-01

    Genome packaging is a fundamental process in a viral life cycle. Many viruses assemble preformed capsids into which the genomic material is subsequently packaged. These viruses use a packaging motor protein that is driven by the hydrolysis of ATP to condense the nucleic acids into a confined space. How these motor proteins package viral genomes had been poorly understood until recently, when a few X-ray crystal structures and cryo-electron microscopy structures became available. Here we discu...

  11. Genomic sovereignty and the African promise: mining the African genome for the benefit of Africa.

    Science.gov (United States)

    de Vries, Jantina; Pepper, Michael

    2012-08-01

    Scientific interest in genomics in Africa is on the rise with a number of funding initiatives aimed specifically at supporting research in this area. Genomics research on material of African origin raises a number of important ethical issues. A prominent concern relates to sample export, which is increasingly seen by researchers and ethics committees across the continent as being problematic. The concept of genomic sovereignty proposes that unique patterns of genomic variation can be found in human populations, and that these are commercially, scientifically or symbolically valuable and in need of protection against exploitation. Although it is appealing as a response to increasing concerns regarding sample export, there are a number of important conceptual problems relating to the term. It is not clear, for instance, whether it is appropriate that ownership over human genomic samples should rest with national governments. Furthermore, ethnic groups in Africa are frequently spread across multiple nation states, and protection offered in one state may not prevent researchers from accessing the same group elsewhere. Lastly, scientific evidence suggests that the assumption that genomic data is unique for population groups is false. Although the frequency with which particular variants are found can differ between groups, such genes or variants per se are not unique to any population group. In this paper, the authors describe these concerns in detail and argue that the concept of genomic sovereignty alone may not be adequate to protect the genetic resources of people of African descent.

  12. Between Two Fern Genomes

    Science.gov (United States)

    2014-01-01

    Ferns are the only major lineage of vascular plants not represented by a sequenced nuclear genome. This lack of genome sequence information significantly impedes our ability to understand and reconstruct genome evolution not only in ferns, but across all land plants. Azolla and Ceratopteris are ideal and complementary candidates to be the first ferns to have their nuclear genomes sequenced. They differ dramatically in genome size, life history, and habit, and thus represent the immense diversity of extant ferns. Together, this pair of genomes will facilitate myriad large-scale comparative analyses across ferns and all land plants. Here we review the unique biological characteristics of ferns and describe a number of outstanding questions in plant biology that will benefit from the addition of ferns to the set of taxa with sequenced nuclear genomes. We explain why the fern clade is pivotal for understanding genome evolution across land plants, and we provide a rationale for how knowledge of fern genomes will enable progress in research beyond the ferns themselves. PMID:25324969

  13. Causes of genome instability

    DEFF Research Database (Denmark)

    Langie, Sabine A S; Koppen, Gudrun; Desaulniers, Daniel

    2015-01-01

    function, chromosome segregation, telomere length). The purpose of this review is to describe the crucial aspects of genome instability, to outline the ways in which environmental chemicals can affect this cancer hallmark and to identify candidate chemicals for further study. The overall aim is to make......Genome instability is a prerequisite for the development of cancer. It occurs when genome maintenance systems fail to safeguard the genome's integrity, whether as a consequence of inherited defects or induced via exposure to environmental agents (chemicals, biological agents and radiation). Thus...

  14. Fungal Genomics Program

    Energy Technology Data Exchange (ETDEWEB)

    Grigoriev, Igor

    2012-03-12

    The JGI Fungal Genomics Program aims to scale up sequencing and analysis of fungal genomes to explore the diversity of fungi important for energy and the environment, and to promote functional studies on a system level. Combining new sequencing technologies and comparative genomics tools, JGI is now leading the world in fungal genome sequencing and analysis. Over 120 sequenced fungal genomes with analytical tools are available via MycoCosm (www.jgi.doe.gov/fungi), a web-portal for fungal biologists. Our model of interacting with user communities, unique among other sequencing centers, helps organize these communities, improves genome annotation and analysis work, and facilitates new larger-scale genomic projects. This resulted in 20 high-profile papers published in 2011 alone and contributing to the Genomics Encyclopedia of Fungi, which targets fungi related to plant health (symbionts, pathogens, and biocontrol agents) and biorefinery processes (cellulose degradation, sugar fermentation, industrial hosts). Our next grand challenges include larger scale exploration of fungal diversity (1000 fungal genomes), developing molecular tools for DOE-relevant model organisms, and analysis of complex systems and metagenomes.

  15. MIPS plant genome information resources.

    Science.gov (United States)

    Spannagl, Manuel; Haberer, Georg; Ernst, Rebecca; Schoof, Heiko; Mayer, Klaus F X

    2007-01-01

    The Munich Institute for Protein Sequences (MIPS) has been involved in maintaining plant genome databases since the Arabidopsis thaliana genome project. Genome databases and analysis resources have focused on individual genomes and aim to provide flexible and maintainable data sets for model plant genomes as a backbone against which experimental data, for example from high-throughput functional genomics, can be organized and evaluated. In addition, model genomes also form a scaffold for comparative genomics, and much can be learned from genome-wide evolutionary studies.

  16. Comparative genomic data of the Avian Phylogenomics Project.

    Science.gov (United States)

    Zhang, Guojie; Li, Bo; Li, Cai; Gilbert, M Thomas P; Jarvis, Erich D; Wang, Jun

    2014-01-01

    The evolutionary relationships of modern birds are among the most challenging to understand in systematic biology and have been debated for centuries. To address this challenge, we assembled or collected the genomes of 48 avian species spanning most orders of birds, including all Neognathae and two of the five Palaeognathae orders, and used the genomes to construct a genome-scale avian phylogenetic tree and perform comparative genomics analyses (Jarvis et al. in press; Zhang et al. in press). Here we release assemblies and datasets associated with the comparative genome analyses, which include 38 newly sequenced avian genomes plus previously released or simultaneously released genomes of Chicken, Zebra finch, Turkey, Pigeon, Peregrine falcon, Duck, Budgerigar, Adelie penguin, Emperor penguin and the Medium Ground Finch. We hope that this resource will serve future efforts in phylogenomics and comparative genomics. The 38 bird genomes were sequenced using the Illumina HiSeq 2000 platform and assembled using a whole genome shotgun strategy. The 48 genomes were categorized into two groups according to the N50 scaffold size of the assemblies: a high depth group comprising 23 species sequenced at high coverage (>50X) with multiple insert size libraries resulting in N50 scaffold sizes greater than 1 Mb (except the White-throated Tinamou and Bald Eagle); and a low depth group comprising 25 species sequenced at a low coverage (~30X) with two insert size libraries resulting in an average N50 scaffold size of about 50 kb. Repetitive elements comprised 4%-22% of the bird genomes. The assembled scaffolds allowed the homology-based annotation of 13,000-17,000 protein coding genes in each avian genome relative to chicken, zebra finch and human, as well as comparative and sequence conservation analyses. Here we release full genome assemblies of 38 newly sequenced avian species, link genome assembly downloads for 7 of the remaining 10 species, and provide a guideline of
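
    The N50 scaffold size used above to group the assemblies can be computed as in the sketch below. It follows the usual definition (the largest length L such that scaffolds of length at least L together contain at least half of the assembled bases); the function name n50 and the example lengths are illustrative and are not taken from the project's data.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold (or contig) lengths."""
    lengths = sorted(lengths, reverse=True)
    if not lengths:
        raise ValueError("need at least one scaffold length")
    total = sum(lengths)
    running = 0
    for length in lengths:
        running += length
        # Stop at the first scaffold that brings the running sum to half the total.
        if 2 * running >= total:
            return length


if __name__ == "__main__":
    # Illustrative lengths only.
    print(n50([2, 3, 4, 5, 6, 7, 8, 9, 10]))
```

    Because N50 depends only on the length distribution, two assemblies with the same N50 can still differ widely in completeness or accuracy, which is why the record also reports coverage and library design.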

  17. Multiple inflation

    International Nuclear Information System (INIS)

    Murphy, P.J.

    1987-01-01

    The Theory of Inflation, namely, that at some point the entropy content of the universe was greatly increased, has much promise. It may solve the puzzles of homogeneity and the creation of structure. However, no particle physics model has yet been found that can successfully drive inflation. The difficulty in satisfying the constraint that the isotropy of the microwave background places on the effective potential of prospective models is immense. In this work we have codified the requirements of such models in a most general form. We have carefully calculated the amounts of inflation the various problems of the Standard Model need for their solution. We have derived a completely model independent upper bound on the inflationary Hubble parameter. We have developed a general notation with which to probe the possibilities of Multiple Inflation. We have shown that only in very unlikely circumstances will any evidence of an earlier inflation survive the de Sitter period of its successor. In particular, it is demonstrated that it is most unlikely that two bouts of inflation will yield high amplitudes of density perturbations on small scales and low amplitudes on large. We conclude that, while multiple inflation will be of great theoretical interest, it is unlikely to have any observational impact.

  18. Recombinant Vaccinia Virus: Immunization against Multiple Pathogens

    Science.gov (United States)

    Perkus, Marion E.; Piccini, Antonia; Lipinskas, Bernard R.; Paoletti, Enzo

    1985-09-01

    The coding sequences for the hepatitis B virus surface antigen, the herpes simplex virus glycoprotein D, and the influenza virus hemagglutinin were inserted into a single vaccinia virus genome. Rabbits inoculated intravenously or intradermally with this polyvalent vaccinia virus recombinant produced antibodies reactive to all three authentic foreign antigens. In addition, the feasibility of multiple rounds of vaccination with recombinant vaccinia virus was demonstrated.

  19. Cardiovascular Precision Medicine in the Genomics Era

    Directory of Open Access Journals (Sweden)

    Alexandra M. Dainis, BS

    2018-04-01

    Summary: Precision medicine strives to delineate disease using multiple data sources, from genomics to digital health metrics, in order to be more precise and accurate in our diagnoses, definitions, and treatments of disease subtypes. By defining disease at a deeper level, we can treat patients based on an understanding of the molecular underpinnings of their presentations, rather than grouping patients into broad categories with one-size-fits-all treatments. In this review, the authors examine how precision medicine, specifically that surrounding genetic testing and genetic therapeutics, has begun to make strides in both common and rare cardiovascular diseases in the clinic and the laboratory, and how these advances are beginning to enable us to more effectively define risk, diagnose disease, and deliver therapeutics for each individual patient. Key Words: genome sequencing, genomics, precision medicine, targeted therapeutics

  20. Advances and Challenges in Genomic Selection for Disease Resistance.

    Science.gov (United States)

    Poland, Jesse; Rutkoski, Jessica

    2016-08-04

    Breeding for disease resistance is a central focus of plant breeding programs, as any successful variety must have the complete package of high yield, disease resistance, agronomic performance, and end-use quality. With the need to accelerate the development of improved varieties, genomics-assisted breeding is becoming an important tool in breeding programs. With marker-assisted selection, there has been success in breeding for disease resistance; however, much of this work and research has focused on identifying, mapping, and selecting for major resistance genes that tend to be highly effective but vulnerable to breakdown with rapid changes in pathogen races. In contrast, breeding for minor-gene quantitative resistance tends to produce more durable varieties but is a more challenging breeding objective. As the genetic architecture of resistance shifts from single major R genes to a diffused architecture of many minor genes, the best approach for molecular breeding will shift from marker-assisted selection to genomic selection. Genomics-assisted breeding for quantitative resistance will therefore necessitate whole-genome prediction models and selection methodology as implemented for classical complex traits such as yield. Here, we examine multiple case studies testing whole-genome prediction models and genomic selection for disease resistance. In general, whole-genome models for disease resistance can produce prediction accuracy suitable for application in breeding. These models also largely outperform multiple linear regression as would be applied in marker-assisted selection. With the implementation of genomic selection for yield and other agronomic traits, whole-genome marker profiles will be available for the entire set of breeding lines, enabling genomic selection for disease at no additional direct cost. In this context, the scope of implementing genomic selection for disease resistance, and specifically for quantitative resistance and quarantined pathogens

  1. Computational genomics of hyperthermophiles

    NARCIS (Netherlands)

    Werken, van de H.J.G.

    2008-01-01

    With the ever increasing number of completely sequenced prokaryotic genomes and the subsequent use of functional genomics tools, e.g. DNA microarray and proteomics, computational data analysis and the integration of microbial and molecular data is inevitable. This thesis describes the computational

  2. Safeguarding genome integrity

    DEFF Research Database (Denmark)

    Sørensen, Claus Storgaard; Syljuåsen, Randi G

    2012-01-01

    Mechanisms that preserve genome integrity are highly important during the normal life cycle of human cells. Loss of genome protective mechanisms can lead to the development of diseases such as cancer. Checkpoint kinases function in the cellular surveillance pathways that help cells to cope with D...

  3. Human genome I

    International Nuclear Information System (INIS)

    Anon.

    1989-01-01

    An international conference, Human Genome I, was held Oct. 2-4, 1989 in San Diego, Calif. Selected speakers discussed: Current Status of the Genome Project; Technique Innovations; Interesting Regions; Applications; and Organization - Different Views of Current and Future Science and Procedures. Posters, consisting of 119 presentations, were displayed during the sessions. 119 were indexed for inclusion in the Energy Data Base.

  4. Single-Cell Whole-Genome Amplification and Sequencing: Methodology and Applications.

    Science.gov (United States)

    Huang, Lei; Ma, Fei; Chapman, Alec; Lu, Sijia; Xie, Xiaoliang Sunney

    2015-01-01

    We present a survey of single-cell whole-genome amplification (WGA) methods, including degenerate oligonucleotide-primed polymerase chain reaction (DOP-PCR), multiple displacement amplification (MDA), and multiple annealing and looping-based amplification cycles (MALBAC). The key parameters to characterize the performance of these methods are defined, including genome coverage, uniformity, reproducibility, unmappable rates, chimera rates, allele dropout rates, false positive rates for calling single-nucleotide variations, and ability to call copy-number variations. Using these parameters, we compare five commercial WGA kits by performing deep sequencing of multiple single cells. We also discuss several major applications of single-cell genomics, including studies of whole-genome de novo mutation rates, the early evolution of cancer genomes, circulating tumor cells (CTCs), meiotic recombination of germ cells, preimplantation genetic diagnosis (PGD), and preimplantation genomic screening (PGS) for in vitro-fertilized embryos.
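
    Three of the performance parameters listed above (genome coverage, uniformity and allele dropout rate) can be sketched numerically. The definitions below are common working definitions chosen for illustration, not necessarily the ones used in the survey; the function names, the `min_depth` parameter and the toy inputs are all assumptions.

```python
from statistics import mean, pstdev


def coverage_breadth(depths, min_depth=1):
    """Fraction of positions (or bins) covered by at least min_depth reads."""
    if not depths:
        raise ValueError("need at least one depth value")
    return sum(d >= min_depth for d in depths) / len(depths)


def coverage_cv(depths):
    """Coefficient of variation of depth; lower values mean more uniform coverage."""
    m = mean(depths)
    if m == 0:
        raise ValueError("mean depth is zero")
    return pstdev(depths) / m


def allele_dropout_rate(calls):
    """Fraction of called sites, known to be heterozygous in bulk DNA,
    that appear homozygous in the single cell.

    calls: iterable of 'het', 'hom' or None (None = no call at that site).
    """
    called = [c for c in calls if c is not None]
    if not called:
        raise ValueError("no called sites")
    return sum(c == "hom" for c in called) / len(called)


# Hypothetical per-bin depths and genotype calls, not data from the survey.
toy_depths = [0, 4]
toy_calls = ["het", "hom", "het", "hom", None]
```

    Using the coefficient of variation for uniformity is one simple choice; Lorenz curves or power spectra of read depth are alternatives that capture more of the amplification bias structure.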

  5. Rumen microbial genomics

    International Nuclear Information System (INIS)

    Morrison, M.; Nelson, K.E.

    2005-01-01

    Improving microbial degradation of plant cell wall polysaccharides remains one of the highest priority goals for all livestock enterprises, including the cattle herds and draught animals of developing countries. The North American Consortium for Genomics of Fibrolytic Ruminal Bacteria was created to promote the sequencing and comparative analysis of rumen microbial genomes, offering the potential to fully assess the genetic potential in a functional and comparative fashion. It has been found that the Fibrobacter succinogenes genome encodes many more endoglucanases and cellodextrinases than previously isolated, and several new processive endoglucanases have been identified by genome and proteomic analysis of Ruminococcus albus, in addition to a variety of strategies for its adhesion to fibre. The ramifications of acquiring genome sequence data for rumen microorganisms are profound, including the potential to elucidate and overcome the biochemical, ecological or physiological processes that are rate limiting for ruminal fibre degradation. (author)

  6. Microbial Genomes Multiply

    Science.gov (United States)

    Doolittle, Russell F.

    2002-01-01

    The publication of the first complete sequence of a bacterial genome in 1995 was a signal event, underscored by the fact that the article has been cited more than 2,100 times during the intervening seven years. It was a marvelous technical achievement, made possible by automatic DNA-sequencing machines. The feat is the more impressive in that complete genome sequencing has now been adopted in many different laboratories around the world. Four years ago in these columns I examined the situation after a dozen microbial genomes had been completed. Now, with upwards of 60 microbial genome sequences determined and twice that many in progress, it seems reasonable to assess just what is being learned. Are new concepts emerging about how cells work? Have there been practical benefits in the fields of medicine and agriculture? Is it feasible to determine the genomic sequence of every bacterial species on Earth? The answers to these questions may be Yes, Perhaps, and No, respectively.

  7. Musa as a Genome Model

    Directory of Open Access Journals (Sweden)

    RITA MEGIA

    2005-12-01

    During the meeting in Arlington, USA in 2001, the scientists grouped in PROMUSA agreed to launch the Global Musa Genomics Consortium. The Consortium aims to apply genomics technologies to the improvement of this important crop. These genome projects make banana the third model species, after Arabidopsis and rice, to be analyzed and sequenced. Compared with Arabidopsis and rice, the banana genome provides unique and powerful insight into structural and functional genomics that could not be found in those two species. This paper discusses these subjects, including the importance of banana as the fourth main food in the world, and the evolution and biodiversity of this genetic resource and its parasite.

  8. The genome editing revolution

    DEFF Research Database (Denmark)

    Stella, Stefano; Montoya, Guillermo

    2016-01-01

    In the last 10 years, we have witnessed a blooming of targeted genome editing systems and applications. The area was revolutionized by the discovery and characterization of the transcription activator-like effector proteins, which are easier to engineer to target new DNA sequences than ... sequence). This ribonucleoprotein complex protects bacteria from invading DNAs, and it was adapted to be used in genome editing. The CRISPR ribonucleic acid (RNA) molecule guides the Cas9 nuclease to the specific DNA site to cleave the DNA target. Two years and more than 1000 publications later, the CRISPR-Cas system has become the main tool for genome editing in many laboratories. Currently the targeted genome editing technology has been used in many fields and may be a possible approach for human gene therapy. Furthermore, it can also be used to modify the genomes of model organisms for studying human ...

  9. The genomic applications in practice and prevention network.

    Science.gov (United States)

    Khoury, Muin J; Feero, W Gregory; Reyes, Michele; Citrin, Toby; Freedman, Andrew; Leonard, Debra; Burke, Wylie; Coates, Ralph; Croyle, Robert T; Edwards, Karen; Kardia, Sharon; McBride, Colleen; Manolio, Teri; Randhawa, Gurvaneet; Rasooly, Rebekah; St Pierre, Jeannette; Terry, Sharon

    2009-07-01

    The authors describe the rationale and initial development of a new collaborative initiative, the Genomic Applications in Practice and Prevention Network. The network convened by the Centers for Disease Control and Prevention and the National Institutes of Health includes multiple stakeholders from academia, government, health care, public health, industry and consumers. The premise of Genomic Applications in Practice and Prevention Network is that there is an unaddressed chasm between gene discoveries and demonstration of their clinical validity and utility. This chasm is due to the lack of readily accessible information about the utility of most genomic applications and the lack of necessary knowledge by consumers and providers to implement what is known. The mission of Genomic Applications in Practice and Prevention Network is to accelerate and streamline the effective integration of validated genomic knowledge into the practice of medicine and public health, by empowering and sponsoring research, evaluating research findings, and disseminating high quality information on candidate genomic applications in practice and prevention. Genomic Applications in Practice and Prevention Network will develop a process that links ongoing collection of information on candidate genomic applications to four crucial domains: (1) knowledge synthesis and dissemination for new and existing technologies, and the identification of knowledge gaps, (2) a robust evidence-based recommendation development process, (3) translation research to evaluate validity, utility and impact in the real world and how to disseminate and implement recommended genomic applications, and (4) programs to enhance practice, education, and surveillance.

  10. Phytozome Comparative Plant Genomics Portal

    Energy Technology Data Exchange (ETDEWEB)

    Goodstein, David; Batra, Sajeev; Carlson, Joseph; Hayes, Richard; Phillips, Jeremy; Shu, Shengqiang; Schmutz, Jeremy; Rokhsar, Daniel

    2014-09-09

    The Dept. of Energy Joint Genome Institute is a genomics user facility supporting DOE mission science in the areas of Bioenergy, Carbon Cycling, and Biogeochemistry. The Plant Program at the JGI applies genomic, analytical, computational and informatics platforms and methods to: 1. Understand and accelerate the improvement (domestication) of bioenergy crops 2. Characterize and moderate plant response to climate change 3. Use comparative genomics to identify constrained elements and infer gene function 4. Build high quality genomic resource platforms of JGI Plant Flagship genomes for functional and experimental work 5. Expand functional genomic resources for Plant Flagship genomes

  11. Annotation of selection strengths in viral genomes

    DEFF Research Database (Denmark)

    McCauley, Stephen; de Groot, Saskia; Mailund, Thomas

    2007-01-01

    Motivation: Viral genomes tend to code in overlapping reading frames to maximize information content. This may result in atypical codon bias and particular evolutionary constraints. Due to the fast mutation rate of viruses, there is additional strong evidence for varying selection between intra- and intergenomic regions. The presence of multiple coding regions complicates the concept of Ka/Ks ratio, and thus begs for an alternative approach when investigating selection strengths. Building on the paper by McCauley & Hein (2006), we develop a method for annotating a viral genome coding in overlapping ... may thus achieve an annotation both of coding regions as well as selection strengths, allowing us to investigate different selection patterns and hypotheses. Results: We illustrate our method by applying it to a multiple alignment of four HIV2 sequences, as well as four Hepatitis B sequences. We ...

  12. Inversion variants in human and primate genomes.

    Science.gov (United States)

    Catacchio, Claudia Rita; Maggiolini, Flavia Angela Maria; D'Addabbo, Pietro; Bitonto, Miriana; Capozzi, Oronzo; Signorile, Martina Lepore; Miroballo, Mattia; Archidiacono, Nicoletta; Eichler, Evan E; Ventura, Mario; Antonacci, Francesca

    2018-05-18

    For many years, inversions have been proposed to be a direct driving force in speciation since they suppress recombination when heterozygous. Inversions are the most common large-scale differences among humans and great apes. Nevertheless, they represent large events easily distinguishable by classical cytogenetics, whose resolution, however, is limited. Here, we performed a genome-wide comparison between human, great ape, and macaque genomes using the net alignments for the most recent releases of genome assemblies. We identified a total of 156 putative inversions, between 103 kb and 91 Mb, corresponding to 136 human loci. Combining literature, sequence, and experimental analyses, we analyzed 109 of these loci and found 67 regions inverted in one or multiple primates, including 28 newly identified inversions. These events overlap with 81 human genes at their breakpoints, and seven correspond to sites of recurrent rearrangements associated with human disease. This work doubles the number of validated primate inversions larger than 100 kb, beyond what was previously documented. We identified 74 sites of errors, where the sequence has been assembled in the wrong orientation, in the reference genomes analyzed. Our data serve two purposes: First, we generated a map of evolutionary inversions in these genomes representing a resource for interrogating differences among these species at a functional level; second, we provide a list of misassembled regions in these primate genomes, involving over 300 Mb of DNA and 1978 human genes. Accurately annotating these regions in the genome references has immediate applications for evolutionary and biomedical studies on primates. © 2018 Catacchio et al.; Published by Cold Spring Harbor Laboratory Press.

  13. VISTA - computational tools for comparative genomics

    Energy Technology Data Exchange (ETDEWEB)

    Frazer, Kelly A.; Pachter, Lior; Poliakov, Alexander; Rubin, Edward M.; Dubchak, Inna

    2004-01-01

    Comparison of DNA sequences from different species is a fundamental method for identifying functional elements in genomes. Here we describe the VISTA family of tools created to assist biologists in carrying out this task. Our first VISTA server at http://www-gsd.lbl.gov/VISTA/ was launched in the summer of 2000 and was designed to align long genomic sequences and visualize these alignments with associated functional annotations. Currently the VISTA site includes multiple comparative genomics tools and provides users with rich capabilities to browse pre-computed whole-genome alignments of large vertebrate genomes and other groups of organisms with VISTA Browser, submit their own sequences of interest to several VISTA servers for various types of comparative analysis, and obtain detailed comparative analysis results for a set of cardiovascular genes. We illustrate capabilities of the VISTA site by the analysis of a 180 kilobase (kb) interval on human chromosome 5 that encodes the kinesin family member 3A (KIF3A) protein.

  14. Genomics and the making of yeast biodiversity.

    Science.gov (United States)

    Hittinger, Chris Todd; Rokas, Antonis; Bai, Feng-Yan; Boekhout, Teun; Gonçalves, Paula; Jeffries, Thomas W; Kominek, Jacek; Lachance, Marc-André; Libkind, Diego; Rosa, Carlos A; Sampaio, José Paulo; Kurtzman, Cletus P

    2015-12-01

    Yeasts are unicellular fungi that do not form fruiting bodies. Although the yeast lifestyle has evolved multiple times, most known species belong to the subphylum Saccharomycotina (syn. Hemiascomycota, hereafter yeasts). This diverse group includes the premier eukaryotic model system, Saccharomyces cerevisiae; the common human commensal and opportunistic pathogen, Candida albicans; and over 1000 other known species (with more continuing to be discovered). Yeasts are found in every biome and continent and are more genetically diverse than angiosperms or chordates. Ease of culture, simple life cycles, and small genomes (∼10-20Mbp) have made yeasts exceptional models for molecular genetics, biotechnology, and evolutionary genomics. Here we discuss recent developments in understanding the genomic underpinnings of the making of yeast biodiversity, comparing and contrasting natural and human-associated evolutionary processes. Only a tiny fraction of yeast biodiversity and metabolic capabilities has been tapped by industry and science. Expanding the taxonomic breadth of deep genomic investigations will further illuminate how genome function evolves to encode their diverse metabolisms and ecologies. Copyright © 2015 Elsevier Ltd. All rights reserved.

  15. Whole Genome Sequencing for Genomics-Guided Investigations of Escherichia coli O157:H7 Outbreaks.

    Science.gov (United States)

    Rusconi, Brigida; Sanjar, Fatemeh; Koenig, Sara S K; Mammel, Mark K; Tarr, Phillip I; Eppinger, Mark

    2016-01-01

    Multi-isolate whole genome sequencing (WGS) and typing for outbreak investigations has become a reality in the post-genomics era. We applied this technology to strains from Escherichia coli O157:H7 outbreaks. These include isolates from seven North American outbreaks, as well as multiple isolates from the same patient and from different infected individuals in the same household. Customized high-resolution bioinformatics sequence typing strategies were developed to assess the core genome and mobilome plasticity. Sequence typing was performed using an in-house single nucleotide polymorphism (SNP) discovery and validation pipeline. Discriminatory power becomes of particular importance for the investigation of isolates from outbreaks in which macrogenomic techniques such as pulsed-field gel electrophoresis or multiple locus variable number tandem repeat analysis do not differentiate closely related organisms. We also characterized differences in the phage inventory, allowing us to identify plasticity among outbreak strains that is not detectable at the core genome level. Our comprehensive analysis of the mobilome identified multiple plasmids that have not previously been associated with this lineage. Applied phylogenomics approaches provide strong molecular evidence for exceptionally little heterogeneity of strains within outbreaks and demonstrate the value of intra-cluster comparisons, rather than basing the analysis on archetypal reference strains. Next generation sequencing and whole genome typing strategies provide the technological foundation for genomic epidemiology outbreak investigation utilizing its significantly higher sample throughput, cost efficiency, and phylogenetic relatedness accuracy. These phylogenomics approaches have major public health relevance in translating information from the sequence-based survey to support timely and informed countermeasures. Polymorphisms identified in this work offer robust phylogenetic signals that index both short- and

  16. Genome-derived vaccines.

    Science.gov (United States)

    De Groot, Anne S; Rappuoli, Rino

    2004-02-01

    Vaccine research entered a new era when the complete genome of a pathogenic bacterium was published in 1995. Since then, more than 97 bacterial pathogens have been sequenced and at least 110 additional projects are now in progress. Genome sequencing has also dramatically accelerated: high-throughput facilities can draft the sequence of an entire microbe (two to four megabases) in 1 to 2 days. Vaccine developers are using microarrays, immunoinformatics, proteomics and high-throughput immunology assays to reduce the truly unmanageable volume of information available in genome databases to a manageable size. Vaccines composed of novel antigens discovered from genome mining are already in clinical trials. Within 5 years we can expect to see a novel class of vaccines composed of genome-predicted, assembled and engineered T- and B-cell epitopes. This article addresses the convergence of three forces--microbial genome sequencing, computational immunology and new vaccine technologies--that are shifting genome mining for vaccines onto the forefront of immunology research.

  17. The Banana Genome Hub

    Science.gov (United States)

    Droc, Gaëtan; Larivière, Delphine; Guignon, Valentin; Yahiaoui, Nabila; This, Dominique; Garsmeur, Olivier; Dereeper, Alexis; Hamelin, Chantal; Argout, Xavier; Dufayard, Jean-François; Lengelle, Juliette; Baurens, Franc-Christophe; Cenci, Alberto; Pitollat, Bertrand; D’Hont, Angélique; Ruiz, Manuel; Rouard, Mathieu; Bocs, Stéphanie

    2013-01-01

    Banana is one of the world’s favorite fruits and one of the most important crops for developing countries. The banana reference genome sequence (Musa acuminata) was recently released. Given the taxonomic position of Musa, the completed genomic sequence has particular comparative value to provide fresh insights about the evolution of the monocotyledons. The study of the banana genome has been enhanced by a number of tools and resources that allow harnessing its sequence. First, we set up essential tools such as a Community Annotation System, phylogenomics resources and metabolic pathways. Then, to support post-genomic efforts, we improved existing banana systems (e.g. web front end, query builder), integrated available Musa data into generic systems (e.g. markers and genetic maps, synteny blocks), made other existing systems containing Musa data (e.g. transcriptomics, rice reference genome, workflow manager) interoperable with the banana hub and, finally, generated new results from sequence analyses (e.g. SNP and polymorphism analysis). Several use cases illustrate how the Banana Genome Hub can be used to study gene families. Overall, with this collaborative effort, we discuss the importance of interoperability for data integration between existing information systems. Database URL: http://banana-genome.cirad.fr/ PMID:23707967

  18. Genomic instability following irradiation

    International Nuclear Information System (INIS)

    Hacker-Klom, U.B.; Goehde, W.

    2001-01-01

    Ionising irradiation may induce genomic instability. The broad spectrum of stress reactions in eukaryotic cells to irradiation complicates the discovery of cellular targets and pathways inducing genomic instability. Irradiation may initiate genomic instability by deletion of genes controlling stability, by induction of genes stimulating instability and/or by activating endogenous cellular viruses. Alternatively or additionally it is discussed that the initiation of genomic instability may be a consequence of radiation or other agents independently of DNA damage, implying non-nuclear targets, e.g. signal cascades. As a further mechanism possibly involved, our own results may suggest radiation-induced changes in chromatin structure. Once initiated, the process of genomic instability is probably perpetuated by endogenous processes necessary for proliferation. Genomic instability may be a cause or a consequence of the neoplastic phenotype. As a conclusion from the data available up to now, a new interpretation of low level radiation effects for radiation protection and in radiotherapy appears useful. The detection of the molecular mechanisms of genomic instability will be important in this context and may contribute to a better understanding of phenomena occurring at low doses <10 cSv which are not well understood up to now. (orig.)

  19. Comparative genomics and evolution of eukaryotic phospholipid biosynthesis

    Energy Technology Data Exchange (ETDEWEB)

    Lykidis, Athanasios

    2006-12-01

    Phospholipid biosynthetic enzymes produce diverse molecular structures and are often present in multiple forms encoded by different genes. This work utilizes comparative genomics and phylogenetics for exploring the distribution, structure and evolution of phospholipid biosynthetic genes and pathways in 26 eukaryotic genomes. Although the basic structure of the pathways was formed early in eukaryotic evolution, the emerging picture indicates that individual enzyme families followed unique evolutionary courses. For example, choline and ethanolamine kinases and cytidylyltransferases emerged in ancestral eukaryotes, whereas multiple forms of the corresponding phosphatidyltransferases evolved mainly in a lineage-specific manner. Furthermore, several unicellular eukaryotes maintain bacterial-type enzymes and reactions for the synthesis of phosphatidylglycerol and cardiolipin. Also, base-exchange phosphatidylserine synthases are widespread and ancestral enzymes. The multiplicity of phospholipid biosynthetic enzymes has been largely generated by gene expansion in a lineage-specific manner. Thus, these observations suggest that phospholipid biosynthesis has been an actively evolving system. Finally, comparative genomic analysis indicates the existence of novel phosphatidyltransferases and provides a candidate for the uncharacterized eukaryotic phosphatidylglycerol phosphate phosphatase.

  20. Traditional medicine and genomics

    Directory of Open Access Journals (Sweden)

    Kalpana Joshi

    2010-01-01

    'Omics' developments in the form of genomics, proteomics and metabolomics have increased the impetus of traditional medicine research. Studies exploring the genomic, proteomic and metabolomic basis of human constitutional types based on Ayurveda and other systems of oriental medicine are becoming popular. Such studies remain important to developing better understanding of human variations and individual differences. Countries like India, Korea, China and Japan are investing in research on evidence-based traditional medicines and scientific validation of fundamental principles. This review provides an account of studies addressing relationships between traditional medicine and genomics.

  1. Traditional medicine and genomics.

    Science.gov (United States)

    Joshi, Kalpana; Ghodke, Yogita; Shintre, Pooja

    2010-01-01

    'Omics' developments in the form of genomics, proteomics and metabolomics have increased the impetus of traditional medicine research. Studies exploring the genomic, proteomic and metabolomic basis of human constitutional types based on Ayurveda and other systems of oriental medicine are becoming popular. Such studies remain important to developing better understanding of human variations and individual differences. Countries like India, Korea, China and Japan are investing in research on evidence-based traditional medicines and scientific validation of fundamental principles. This review provides an account of studies addressing relationships between traditional medicine and genomics.

  2. Bacillus subtilis genome diversity.

    Science.gov (United States)

    Earl, Ashlee M; Losick, Richard; Kolter, Roberto

    2007-02-01

    Microarray-based comparative genomic hybridization (M-CGH) is a powerful method for rapidly identifying regions of genome diversity among closely related organisms. We used M-CGH to examine the genome diversity of 17 strains belonging to the nonpathogenic species Bacillus subtilis. Our M-CGH results indicate that there is considerable genetic heterogeneity among members of this species; nearly one-third of Bsu168-specific genes exhibited variability, as measured by the microarray hybridization intensities. The variable loci include those encoding proteins involved in antibiotic production, cell wall synthesis, sporulation, and germination. The diversity in these genes may reflect this organism's ability to survive in diverse natural settings.

  3. Genomic taxonomy of vibrios

    Directory of Open Access Journals (Sweden)

    Iida Tetsuya

    2009-10-01

    Background: Vibrio taxonomy has been based on a polyphasic approach. In this study, we retrieve useful taxonomic information (i.e. data that can be used to distinguish different taxonomic levels, such as species and genera) from 32 genome sequences of different vibrio species. We use a variety of tools to explore the taxonomic relationship between the sequenced genomes, including Multilocus Sequence Analysis (MLSA), supertrees, Average Amino Acid Identity (AAI), genomic signatures, and Genome BLAST atlases. Our aim is to analyse the usefulness of these tools for species identification in vibrios. Results: We have generated four new genome sequences of three Vibrio species, i.e., V. alginolyticus 40B, V. harveyi-like 1DA3, and V. mimicus strains VM573 and VM603, and present a broad analysis of these genomes along with other sequenced Vibrio species. The genome atlas and pangenome plots provide a tantalizing image of the genomic differences that occur between closely related sister species, e.g. V. cholerae and V. mimicus. The vibrio pangenome contains around 26504 genes. The V. cholerae core genome and pangenome consist of 1520 and 6923 genes, respectively. Pangenomes might allow different strains of V. cholerae to occupy different niches. MLSA and supertree analyses resulted in a similar phylogenetic picture, with a clear distinction of four groups (Vibrio core group, V. cholerae-V. mimicus, Aliivibrio spp., and Photobacterium spp.). A Vibrio species is defined as a group of strains that share > 95% DNA identity in MLSA and supertree analysis, > 96% AAI, ≤ 10 genome signature dissimilarity, and > 61% proteome identity. Strains of the same species and species of the same genus will form monophyletic groups on the basis of MLSA and supertree. Conclusion: The combination of different analytical and bioinformatics tools will enable the most accurate species identification through genomic computational analysis. This endeavour will culminate in
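
    The four species thresholds quoted in this abstract can be written down as a single check. The sketch below is only an illustration: the class and function names are hypothetical, and applying the thresholds to one pairwise comparison at a time is an assumption, since the abstract states them for a group of strains.

```python
from dataclasses import dataclass


@dataclass
class PairwiseComparison:
    """Similarity measures for one pair of strains (hypothetical container)."""
    mlsa_identity: float            # % DNA identity in MLSA / supertree analysis
    aai: float                      # % average amino acid identity
    signature_dissimilarity: float  # genome signature dissimilarity
    proteome_identity: float        # % proteome identity


def same_vibrio_species(c: PairwiseComparison) -> bool:
    """Apply the thresholds quoted in the abstract to one comparison."""
    return (
        c.mlsa_identity > 95.0
        and c.aai > 96.0
        and c.signature_dissimilarity <= 10.0
        and c.proteome_identity > 61.0
    )
```

    Note that three of the cut-offs are strict inequalities, so a pair with exactly 96% AAI would not pass under this reading.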

  4. Human Genome Project

    Energy Technology Data Exchange (ETDEWEB)

    Block, S. [The MITRE Corporation, McLean, VA (US). JASON Program Office; Cornwall, J. [The MITRE Corporation, McLean, VA (US). JASON Program Office; Dally, W. [The MITRE Corporation, McLean, VA (US). JASON Program Office; Dyson, F. [The MITRE Corporation, McLean, VA (US). JASON Program Office; Fortson, N. [The MITRE Corporation, McLean, VA (US). JASON Program Office; Joyce, G. [The MITRE Corporation, McLean, VA (US). JASON Program Office; Kimble, H. J. [The MITRE Corporation, McLean, VA (US). JASON Program Office; Lewis, N. [The MITRE Corporation, McLean, VA (US). JASON Program Office; Max, C. [The MITRE Corporation, McLean, VA (US). JASON Program Office; Prince, T. [The MITRE Corporation, McLean, VA (US). JASON Program Office; Schwitters, R. [The MITRE Corporation, McLean, VA (US). JASON Program Office; Weinberger, P. [The MITRE Corporation, McLean, VA (US). JASON Program Office; Woodin, W. H. [The MITRE Corporation, McLean, VA (US). JASON Program Office

    1998-01-04

    The study reviews Department of Energy supported aspects of the United States Human Genome Project, the joint National Institutes of Health/Department of Energy program to characterize all human genetic material, to discover the set of human genes, and to render them accessible for further biological study. The study concentrates on issues of technology, quality assurance/control, and informatics relevant to current effort on the genome project and needs beyond it. Recommendations are presented on areas of the genome program that are of particular interest to and supported by the Department of Energy.

  5. Human Genome Program

    Energy Technology Data Exchange (ETDEWEB)

    1993-01-01

    The DOE Human Genome program has grown tremendously, as shown by the marked increase in the number of genome-funded projects since the last workshop held in 1991. The abstracts in this book describe the genome research of DOE-funded grantees and contractors and invited guests, and all projects are represented at the workshop by posters. The 3-day meeting includes plenary sessions on ethical, legal, and social issues pertaining to the availability of genetic data; sequencing techniques, informatics support; and chromosome and cDNA mapping and sequencing.

  6. Genomic signal processing

    CERN Document Server

    Shmulevich, Ilya

    2007-01-01

    Genomic signal processing (GSP) can be defined as the analysis, processing, and use of genomic signals to gain biological knowledge, and the translation of that knowledge into systems-based applications that can be used to diagnose and treat genetic diseases. Situated at the crossroads of engineering, biology, mathematics, statistics, and computer science, GSP requires the development of both nonlinear dynamical models that adequately represent genomic regulation, and diagnostic and therapeutic tools based on these models. This book facilitates these developments by providing rigorous mathema

  7. The Genome of the Chicken DT40 Bursal Lymphoma Cell Line

    DEFF Research Database (Denmark)

    Molnar, Janos; Poti, Adam; Pipek, Orsolya

    2014-01-01

    The chicken DT40 cell line is a widely used model system in the study of multiple cellular processes due to the efficiency of homologous gene targeting. The cell line was derived from a bursal lymphoma induced by avian leukosis virus infection. In this study we characterized the genome of the cell...... chicken genomes and the Gallus gallus reference genome, we found no unique mutational processes shaping the DT40 genome except for a mild increase in insertion and deletion events, particularly deletions at tandem repeats. We mapped coding sequence mutations that are unique to the DT40 genome; mutations...

  8. A Guide to the PLAZA 3.0 Plant Comparative Genomic Database.

    Science.gov (United States)

    Vandepoele, Klaas

    2017-01-01

    PLAZA 3.0 is an online resource for comparative genomics and offers a versatile platform to study gene functions and gene families or to analyze genome organization and evolution in the green plant lineage. Starting from genome sequence information for over 35 plant species, precomputed comparative genomic data sets cover homologous gene families, multiple sequence alignments, phylogenetic trees, and genomic colinearity information within and between species. Complementary functional data sets, a Workbench, and interactive visualization tools are available through a user-friendly web interface, making PLAZA an excellent starting point to translate sequence or omics data sets into biological knowledge. PLAZA is available at http://bioinformatics.psb.ugent.be/plaza/ .

  9. The Harvest suite for rapid core-genome alignment and visualization of thousands of intraspecific microbial genomes.

    Science.gov (United States)

    Treangen, Todd J; Ondov, Brian D; Koren, Sergey; Phillippy, Adam M

    2014-01-01

    Whole-genome sequences are now available for many microbial species and clades; however, existing whole-genome alignment methods are limited in their ability to perform sequence comparisons of multiple sequences simultaneously. Here we present the Harvest suite of core-genome alignment and visualization tools for the rapid and simultaneous analysis of thousands of intraspecific microbial strains. Harvest includes Parsnp, a fast core-genome multi-aligner, and Gingr, a dynamic visual platform. Together they provide interactive core-genome alignments, variant calls, recombination detection, and phylogenetic trees. Using simulated and real data we demonstrate that our approach exhibits unrivaled speed while maintaining the accuracy of existing methods. The Harvest suite is open-source and freely available from: http://github.com/marbl/harvest.

  10. Automated ensemble assembly and validation of microbial genomes

    Science.gov (United States)

    2014-01-01

    Background: The continued democratization of DNA sequencing has sparked a new wave of development of genome assembly and assembly validation methods. As individual research labs, rather than centralized centers, begin to sequence the majority of new genomes, it is important to establish best practices for genome assembly. However, recent evaluations such as GAGE and the Assemblathon have concluded that there is no single best approach to genome assembly. Instead, it is preferable to generate multiple assemblies and validate them to determine which is most useful for the desired analysis; this is a labor-intensive process that is often impossible or unfeasible. Results: To encourage best practices supported by the community, we present iMetAMOS, an automated ensemble assembly pipeline; iMetAMOS encapsulates the process of running, validating, and selecting a single assembly from multiple assemblies. iMetAMOS packages several leading open-source tools into a single binary that automates parameter selection and execution of multiple assemblers, scores the resulting assemblies based on multiple validation metrics, and annotates the assemblies for genes and contaminants. We demonstrate the utility of the ensemble process on 225 previously unassembled Mycobacterium tuberculosis genomes as well as a Rhodobacter sphaeroides benchmark dataset. On these real data, iMetAMOS reliably produces validated assemblies and identifies potential contamination without user intervention. In addition, intelligent parameter selection produces assemblies of R. sphaeroides comparable to or exceeding the quality of those from the GAGE-B evaluation, affecting the relative ranking of some assemblers. Conclusions: Ensemble assembly with iMetAMOS provides users with multiple, validated assemblies for each genome. Although computationally limited to small or mid-sized genomes, this approach is the most effective and reproducible means for generating high-quality assemblies and enables users to

  11. Low-pass sequencing for microbial comparative genomics

    Directory of Open Access Journals (Sweden)

    Kennedy Sean

    2004-01-01

    Background: We studied four extremely halophilic archaea by low-pass shotgun sequencing: (1) the metabolically versatile Haloarcula marismortui; (2) the non-pigmented Natrialba asiatica; (3) the psychrophile Halorubrum lacusprofundi and (4) the Dead Sea isolate Halobaculum gomorrense. Approximately one thousand single pass genomic sequences per genome were obtained. The data were analyzed by comparative genomic analyses using the completed Halobacterium sp. NRC-1 genome as a reference. Low-pass shotgun sequencing is a simple, inexpensive, and rapid approach that can readily be performed on any cultured microbe. Results: As expected, the four archaeal halophiles analyzed exhibit both bacterial and eukaryotic characteristics as well as uniquely archaeal traits. All five halophiles exhibit greater than sixty percent GC content and low isoelectric points (pI) for their predicted proteins. Multiple insertion sequence (IS) elements, often involved in genome rearrangements, were identified in H. lacusprofundi and H. marismortui. The core biological functions that govern cellular and genetic mechanisms of H. sp. NRC-1 appear to be conserved in these four other halophiles. Multiple TATA box binding protein (TBP) and transcription factor IIB (TFB) homologs were identified from most of the four shotgunned halophiles. The reconstructed molecular tree of all five halophiles shows a large divergence between these species, but with the closest relationship being between H. sp. NRC-1 and H. lacusprofundi. Conclusion: Despite the diverse habitats of these species, all five halophiles share (1) high GC content and (2) low protein isoelectric points, which are characteristics associated with environmental exposure to UV radiation and hypersalinity, respectively. Identification of multiple IS elements in the genome of H. lacusprofundi and H. marismortui suggests that genome structure and dynamic genome reorganization might be similar to that previously observed in the

  12. Generalizing genetical genomics: getting added value from environmental perturbation.

    Science.gov (United States)

    Li, Yang; Breitling, Rainer; Jansen, Ritsert C

    2008-10-01

    Genetical genomics is a useful approach for studying the effect of genetic perturbations on biological systems at the molecular level. However, molecular networks depend on the environmental conditions and, thus, a comprehensive understanding of biological systems requires studying them across multiple environments. We propose a generalization of genetical genomics, which combines genetic and sensibly chosen environmental perturbations, to study the plasticity of molecular networks. This strategy forms a crucial step toward understanding why individuals respond differently to drugs, toxins, pathogens, nutrients and other environmental influences. Here we outline a strategy for selecting and allocating individuals to particular treatments, and we discuss the promises and pitfalls of the generalized genetical genomics approach.

  13. Patient-controlled encrypted genomic data: an approach to advance clinical genomics

    Directory of Open Access Journals (Sweden)

    Trakadis Yannis J

    2012-07-01

    Background: The revolution in DNA sequencing technologies over the past decade has made it feasible to sequence an individual’s whole genome at a relatively low cost. The potential value of the information generated by genomic technologies for medicine and society is enormous. However, in order for exome sequencing, and eventually whole genome sequencing, to be implemented clinically, a number of major challenges need to be overcome. For instance, obtaining meaningful informed consent, managing incidental findings and the great volume of data generated (including multiple findings with uncertain clinical significance), re-interpreting the genomic data and providing additional counselling to patients as genetic knowledge evolves are issues that need to be addressed. It appears that medical genetics is shifting from the present “phenotype-first” medical model to a “data-first” model which leads to multiple complexities. Discussion: This manuscript discusses the different challenges associated with integrating genomic technologies into clinical practice and describes a “phenotype-first” approach, namely, “Individualized Mutation-weighed Phenotype Search”, and its benefits. The proposed approach allows for a more efficient prioritization of the genes to be tested in a clinical lab based on both the patient’s phenotype and his/her entire genomic data. It simplifies “informed-consent” for clinical use of genomic technologies and helps to protect the patient’s autonomy and privacy. Overall, this approach could potentially render widespread use of genomic technologies, in the immediate future, practical, ethical and clinically useful. Summary: The “Individualized Mutation-weighed Phenotype Search” approach allows for an incremental integration of genomic technologies into clinical practice. It ensures that we do not over-medicalize genomic data but, rather, continue our current medical model which is based on serving

  14. Population genetic inference from personal genome data: impact of ancestry and admixture on human genomic variation.

    Science.gov (United States)

    Kidd, Jeffrey M; Gravel, Simon; Byrnes, Jake; Moreno-Estrada, Andres; Musharoff, Shaila; Bryc, Katarzyna; Degenhardt, Jeremiah D; Brisbin, Abra; Sheth, Vrunda; Chen, Rong; McLaughlin, Stephen F; Peckham, Heather E; Omberg, Larsson; Bormann Chung, Christina A; Stanley, Sarah; Pearlstein, Kevin; Levandowsky, Elizabeth; Acevedo-Acevedo, Suehelay; Auton, Adam; Keinan, Alon; Acuña-Alonzo, Victor; Barquera-Lozano, Rodrigo; Canizales-Quinteros, Samuel; Eng, Celeste; Burchard, Esteban G; Russell, Archie; Reynolds, Andy; Clark, Andrew G; Reese, Martin G; Lincoln, Stephen E; Butte, Atul J; De La Vega, Francisco M; Bustamante, Carlos D

    2012-10-05

    Full sequencing of individual human genomes has greatly expanded our understanding of human genetic variation and population history. Here, we present a systematic analysis of 50 human genomes from 11 diverse global populations sequenced at high coverage. Our sample includes 12 individuals who have admixed ancestry and who have varying degrees of recent (within the last 500 years) African, Native American, and European ancestry. We found over 21 million single-nucleotide variants that contribute to a 1.75-fold range in nucleotide heterozygosity across diverse human genomes. This heterozygosity ranged from a high of one heterozygous site per kilobase in west African genomes to a low of 0.57 heterozygous sites per kilobase in segments inferred to have diploid Native American ancestry from the genomes of Mexican and Puerto Rican individuals. We show evidence of all three continental ancestries in the genomes of Mexican, Puerto Rican, and African American populations, and the genome-wide statistics are highly consistent across individuals from a population once ancestry proportions have been accounted for. Using a generalized linear model, we identified subtle variations across populations in the proportion of neutral versus deleterious variation and found that genome-wide statistics vary in admixed populations even once ancestry proportions have been factored in. We further infer that multiple periods of gene flow shaped the diversity of admixed populations in the Americas: 70% of the European ancestry in today's African Americans dates back to European gene flow happening only 7-8 generations ago. Copyright © 2012 The American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.
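
    The 1.75-fold range quoted in this abstract appears to follow directly from the two per-kilobase extremes it reports. A minimal arithmetic check, with illustrative variable names that are not taken from the paper:

```python
# Heterozygous sites per kilobase, as quoted in the abstract.
HIGH_PER_KB = 1.0   # west African genomes
LOW_PER_KB = 0.57   # segments with diploid Native American ancestry

# Ratio of the highest to the lowest reported heterozygosity.
fold_range = HIGH_PER_KB / LOW_PER_KB

print(f"fold range = {fold_range:.2f}")
```

    The ratio rounds to 1.75, which matches the figure given in the abstract.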

  15. [Epigenetics 2.0: The multiple faces of the genome].

    Science.gov (United States)

    Rubinstein, Marcelo

    2016-09-01

    Epigenetics is the branch of genetics that studies the dynamic relationship between stable genotypes and varying phenotypes. To this end, epigenetics aims to discover the molecular mechanisms that explain how different nutrients and hormones, environmental changes, and emotional, social and cognitive experiences modify gene expression and behaviors, even permanently so. Psychiatry has learned that diseases with strong genetic predisposition, such as schizophrenia, show a concordance of around 50% between monozygotic twins, thus evidencing the importance of the genetic background and the presence of environmental variables that stimulate or block phenotypic development. The interest in epigenetics has increased during the last few years due to fundamental discoveries made in molecular and behavioral genetics, although within this framework factual knowledge coexists with fictional expectations and wrong concepts. Is it possible that epigenetic variants modify temperament and human behavior? May abused or neglected children develop long-lasting epigenetic marks in their DNA? May bipolar states correlate with different epigenetic signatures? Studying these subjects is not an easy task, but experiments performed in lab animals suggest that these conjectures are reasonable, although there is still a long distance between hypotheses and scientifically proven facts.

  16. Lophotrochozoan mitochondrial genomes

    Energy Technology Data Exchange (ETDEWEB)

    Valles, Yvonne; Boore, Jeffrey L.

    2005-10-01

    Progress in both molecular techniques and phylogenetic methods has challenged many of the interpretations of traditional taxonomy. One example is in the recognition of the animal superphylum Lophotrochozoa (annelids, mollusks, echiurans, platyhelminthes, brachiopods, and other phyla), although the relationships within this group and the inclusion of some phyla remain uncertain. While much of this progress in phylogenetic reconstruction has been based on comparing single gene sequences, we are beginning to see the potential of comparing large-scale features of genomes, such as the relative order of genes. Even though tremendous progress is being made on the sequence determination of whole nuclear genomes, the dataset of choice for genome-level characters for many animals across a broad taxonomic range remains mitochondrial genomes. We review here what is known about mitochondrial genomes of the lophotrochozoans and discuss the promise that this dataset will enable insight into their relationships.

  17. Mouse Genome Informatics (MGI)

    Data.gov (United States)

    U.S. Department of Health & Human Services — MGI is the international database resource for the laboratory mouse, providing integrated genetic, genomic, and biological data to facilitate the study of human...

  18. Genomic definition of species

    Energy Technology Data Exchange (ETDEWEB)

    Crkvenjakov, R.; Drmanac, R.

    1991-07-01

    The subject of this paper is the definition of species based on the assumption that the genome is the fundamental level for the origin and maintenance of biological diversity. For this view to be logically consistent it is necessary to assume the existence and operation of a new law, which we call the genome law. For this reason the genome law is included in the explanation of the species phenomenon presented here even if its precise formulation and elaboration are left for the future. The intellectual underpinnings of this definition can be traced to Goldschmidt. We wish to explore some philosophical aspects of the definition of species in terms of the genome. The point of proposing the definition on these grounds is that any real advance in evolutionary theory has to be correct in both its philosophy and its science.

  19. Structural genomics in endocrinology

    NARCIS (Netherlands)

    Smit, J. W.; Romijn, J. A.

    2001-01-01

    Traditionally, endocrine research evolved from the phenotypical characterisation of endocrine disorders to the identification of underlying molecular pathophysiology. This approach has been, and still is, extremely successful. The introduction of genomics and proteomics has resulted in a reversal of

  20. Epidemiology & Genomics Research Program

    Science.gov (United States)

    The Epidemiology and Genomics Research Program, in the National Cancer Institute's Division of Cancer Control and Population Sciences, funds research in human populations to understand the determinants of cancer occurrence and outcomes.

  1. Annotating individual human genomes.

    Science.gov (United States)

    Torkamani, Ali; Scott-Van Zeeland, Ashley A; Topol, Eric J; Schork, Nicholas J

    2011-10-01

    Advances in DNA sequencing technologies have made it possible to rapidly, accurately and affordably sequence entire individual human genomes. As impressive as this ability seems, however, it will not likely amount to much if one cannot extract meaningful information from individual sequence data. Annotating variations within individual genomes and providing information about their biological or phenotypic impact will thus be crucially important in moving individual sequencing projects forward, especially in the context of the clinical use of sequence information. In this paper we consider the various ways in which one might annotate individual sequence variations and point out limitations in the available methods for doing so. It is arguable that, in the foreseeable future, DNA sequencing of individual genomes will become routine for clinical, research, forensic, and personal purposes. We therefore also consider directions and areas for further research in annotating genomic variants. Copyright © 2011 Elsevier Inc. All rights reserved.

  2. ANNOTATING INDIVIDUAL HUMAN GENOMES*

    Science.gov (United States)

    Torkamani, Ali; Scott-Van Zeeland, Ashley A.; Topol, Eric J.; Schork, Nicholas J.

    2014-01-01

    Advances in DNA sequencing technologies have made it possible to rapidly, accurately and affordably sequence entire individual human genomes. As impressive as this ability seems, however, it will not likely amount to much if one cannot extract meaningful information from individual sequence data. Annotating variations within individual genomes and providing information about their biological or phenotypic impact will thus be crucially important in moving individual sequencing projects forward, especially in the context of the clinical use of sequence information. In this paper we consider the various ways in which one might annotate individual sequence variations and point out limitations in the available methods for doing so. It is arguable that, in the foreseeable future, DNA sequencing of individual genomes will become routine for clinical, research, forensic, and personal purposes. We therefore also consider directions and areas for further research in annotating genomic variants. PMID:21839162

  3. Yeast genome sequencing:

    DEFF Research Database (Denmark)

    Piskur, Jure; Langkjær, Rikke Breinhold

    2004-01-01

    For decades, unicellular yeasts have been general models to help understand the eukaryotic cell and also our own biology. Recently, over a dozen yeast genomes have been sequenced, providing the basis to resolve several complex biological questions. Analysis of the novel sequence data has shown...... of closely related species helps in gene annotation and to answer how many genes there really are within the genomes. Analysis of non-coding regions among closely related species has provided an example of how to determine novel gene regulatory sequences, which were previously difficult to analyse because...... they are short and degenerate and occupy different positions. Comparative genomics helps to understand the origin of yeasts and points out crucial molecular events in yeast evolutionary history, such as whole-genome duplication and horizontal gene transfer(s). In addition, the accumulating sequence data provide...

  4. Identification of optimum sequencing depth especially for de novo genome assembly of small genomes using next generation sequencing data.

    Science.gov (United States)

    Desai, Aarti; Marwah, Veer Singh; Yadav, Akshay; Jha, Vineet; Dhaygude, Kishor; Bangar, Ujwala; Kulkarni, Vivek; Jere, Abhay

    2013-01-01

    Next Generation Sequencing (NGS) is a disruptive technology that has found widespread acceptance in the life sciences research community. The high throughput and low cost of sequencing has encouraged researchers to undertake ambitious genomic projects, especially in de novo genome sequencing. Currently, NGS systems generate sequence data as short reads and de novo genome assembly using these short reads is computationally very intensive. Due to lower cost of sequencing and higher throughput, NGS systems now provide the ability to sequence genomes at high depth. However, currently no report is available highlighting the impact of high sequence depth on genome assembly using real data sets and multiple assembly algorithms. Recently, some studies have evaluated the impact of sequence coverage, error rate and average read length on genome assembly using multiple assembly algorithms; however, these evaluations were performed using simulated datasets. One limitation of using simulated datasets is that variables such as error rates, read length and coverage which are known to impact genome assembly are carefully controlled. Hence, this study was undertaken to identify the minimum depth of sequencing required for de novo assembly for different-sized genomes using graph-based assembly algorithms and real datasets. Illumina reads for E. coli (4.6 Mb), S. kudriavzevii (11.18 Mb) and C. elegans (100 Mb) were assembled using SOAPdenovo, Velvet, ABySS, Meraculous and IDBA-UD. Our analysis shows that 50X is the optimum read depth for assembling these genomes using all assemblers except Meraculous, which requires 100X read depth. Moreover, our analysis shows that de novo assembly from 50X read data requires only 6-40 GB RAM depending on the genome size and assembly algorithm used. We believe that this information can be extremely valuable for researchers in designing experiments and multiplexing which will enable optimum utilization of sequencing as well as analysis resources.
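
    To make the read-depth figures concrete, the sketch below estimates how many reads a given depth implies, using the standard relation depth = (number of reads × read length) / genome size. The genome sizes are those quoted in the abstract; the 100 bp read length is an assumption, since the abstract does not state one, and the function name is illustrative.

```python
# Genome sizes in base pairs, as quoted in the abstract.
GENOME_SIZES_BP = {
    "E. coli": 4_600_000,
    "S. kudriavzevii": 11_180_000,
    "C. elegans": 100_000_000,
}


def reads_for_depth(genome_size_bp: int, depth: int, read_length_bp: int) -> int:
    """Smallest number of reads whose total bases reach the target depth."""
    total_bases = depth * genome_size_bp
    # Integer ceiling division avoids floating-point rounding.
    return -(-total_bases // read_length_bp)


ASSUMED_READ_LENGTH_BP = 100  # assumption, not taken from the abstract

for name, size in GENOME_SIZES_BP.items():
    n_reads = reads_for_depth(size, 50, ASSUMED_READ_LENGTH_BP)
    print(f"{name}: {n_reads:,} reads for 50X")
```

    Under this assumption, 50X coverage of the 4.6 Mb E. coli genome corresponds to 2.3 million reads; doubling the depth to 100X doubles the read count.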

  5. Genetical Genomics for Evolutionary Studies

    NARCIS (Netherlands)

    Prins, J.C.P.; Smant, G.; Jansen, R.C.

    2012-01-01

    Genetical genomics combines acquired high-throughput genomic data with genetic analysis. In this chapter, we discuss the application of genetical genomics for evolutionary studies, where new high-throughput molecular technologies are combined with mapping quantitative trait loci (QTL) on the genome

  6. Development of genome- and transcriptome-derived microsatellites in related species of snapping shrimps with highly duplicated genomes.

    Science.gov (United States)

    Gaynor, Kaitlyn M; Solomon, Joseph W; Siller, Stefanie; Jessell, Linnet; Duffy, J Emmett; Rubenstein, Dustin R

    2017-11-01

    Molecular markers are powerful tools for studying patterns of relatedness and parentage within populations and for making inferences about social evolution. However, the development of molecular markers for simultaneous study of multiple species presents challenges, particularly when species exhibit genome duplication or polyploidy. We developed microsatellite markers for Synalpheus shrimp, a genus in which species exhibit not only great variation in social organization, but also interspecific variation in genome size and partial genome duplication. From the four primary clades within Synalpheus, we identified microsatellites in the genomes of four species and in the consensus transcriptome of two species. Ultimately, we designed and tested primers for 143 microsatellite markers across 25 species. Although the majority of markers were disomic, many markers were polysomic for certain species. Surprisingly, we found no relationship between genome size and the number of polysomic markers. As expected, markers developed for a given species amplified better for closely related species than for more distant relatives. Finally, the markers developed from the transcriptome were more likely to work successfully and to be disomic than those developed from the genome, suggesting that consensus transcriptomes are likely to be conserved across species. Our findings suggest that the transcriptome, particularly consensus sequences from multiple species, can be a valuable source of molecular markers for taxa with complex, duplicated genomes. © 2017 John Wiley & Sons Ltd.

  7. The human genome project

    International Nuclear Information System (INIS)

    Worton, R.

    1996-01-01

    The Human Genome Project is a massive international research project, costing 3 to 5 billion dollars and expected to take 15 years, which will identify all the genes in the human genome - i.e. the complete sequence of bases in human DNA. The prize will be the ability to identify genes causing or predisposing to disease, and in some cases the development of gene therapy, but this new knowledge will raise important ethical issues.

  8. Decoding the human genome

    CERN Multimedia

    CERN. Geneva. Audiovisual Unit; Antonerakis, S E

    2002-01-01

    Decoding the human genome is a very up-to-date topic, raising several questions besides purely scientific ones, in view of the two competing teams (public and private), the ethics of using the results, and the fact that the project apparently went faster and more easily than expected. The lecture series will address the following chapters: Scientific basis and challenges. Ethical and social aspects of genomics.

  9. Molluscan Evolutionary Genomics

    Energy Technology Data Exchange (ETDEWEB)

    Simison, W. Brian; Boore, Jeffrey L.

    2005-12-01

    In the last 20 years there have been dramatic advances in techniques of high-throughput DNA sequencing, most recently accelerated by the Human Genome Project, a program that has determined the three billion base pair code on which we are based. Now this tremendous capability is being directed at other genome targets that are being sampled across the broad range of life. This opens up opportunities as never before for evolutionary and organismal biologists to address questions of both processes and patterns of organismal change. We stand at the dawn of a new 'modern synthesis' period, paralleling that of the early 20th century when the fledgling field of genetics first identified the underlying basis for Darwin's theory. We must now unite the efforts of systematists, paleontologists, mathematicians, computer programmers, molecular biologists, developmental biologists, and others in the pursuit of discovering what genomics can teach us about the diversity of life. Genome-level sampling for mollusks to date has mostly been limited to mitochondrial genomes and it is likely that these will continue to provide the best targets for broad phylogenetic sampling in the near future. However, we are just beginning to see an inroad into complete nuclear genome sequencing, with several mollusks and other eutrochozoans having been selected for work about to begin. Here, we provide an overview of the state of molluscan mitochondrial genomics, highlight a few of the discoveries from this research, outline the promise of broadening this dataset, describe upcoming projects to sequence whole mollusk nuclear genomes, and challenge the community to prepare for making the best use of these data.

  10. Human Germline Genome Editing

    OpenAIRE

    Ormond, Kelly E.; Mortlock, Douglas P.; Scholes, Derek T.; Bombard, Yvonne; Brody, Lawrence C.; Faucett, W. Andrew; Garrison, Nanibaa’ A.; Hercher, Laura; Isasi, Rosario; Middleton, Anna; Musunuru, Kiran; Shriner, Daniel; Virani, Alice; Young, Caroline E.

    2017-01-01

    With CRISPR/Cas9 and other genome-editing technologies, successful somatic and germline genome editing are becoming feasible. To respond, an American Society of Human Genetics (ASHG) workgroup developed this position statement, which was approved by the ASHG Board in March 2017. The workgroup included representatives from the UK Association of Genetic Nurses and Counsellors, Canadian Association of Genetic Counsellors, International Genetic Epidemiology Society, and US National Society of Gen...

  11. CAGO: a software tool for dynamic visual comparison and correlation measurement of genome organization.

    Directory of Open Access Journals (Sweden)

    Yi-Feng Chang

    CAGO (Comparative Analysis of Genome Organization) was developed to address two critical shortcomings of conventional genome atlas plotters: lack of dynamic exploratory functions and absence of signal analysis for genomic properties. With dynamic exploratory functions, users can directly manipulate chromosome tracks of a genome atlas and intuitively identify distinct genomic signals by visual comparison. Signal analysis of genomic properties can further detect inconspicuous patterns from noisy genomic properties and calculate correlations between genomic properties across various genomes. To implement dynamic exploratory functions, CAGO presents each genome atlas in Scalable Vector Graphics (SVG) format and allows users to interact with it using an SVG viewer through JavaScript. Signal analysis functions are implemented using R statistical software and a discrete wavelet transformation package, waveslim. CAGO is not only a plotter for generating complex genome atlases, but also a platform for exploring genome atlases with dynamic exploratory functions for visual comparison and with signal analysis for comparing genomic properties across multiple organisms. The web-based application of CAGO, its source code, user guides, video demos, and live examples are publicly available and can be accessed at http://cbs.ym.edu.tw/cago.

  12. RadGenomics project

    Energy Technology Data Exchange (ETDEWEB)

    Iwakawa, Mayumi; Imai, Takashi; Harada, Yoshinobu [National Inst. of Radiological Sciences, Chiba (Japan). Frontier Research Center] [and others]

    2002-06-01

    Human health is determined by a complex interplay of factors, predominantly between genetic susceptibility, environmental conditions and aging. The ultimate aim of the RadGenomics (Radiation Genomics) project is to understand the implications of heterogeneity in responses to ionizing radiation arising from genetic variation between individuals in the human population. The rapid progression of the human genome sequencing and the recent development of new technologies in molecular genetics are providing us with new opportunities to understand the genetic basis of individual differences in susceptibility to natural and/or artificial environmental factors, including radiation exposure. The RadGenomics project will inevitably lead to improved protocols for personalized radiotherapy and reductions in the potential side effects of such treatment. The project will contribute to future research into the molecular mechanisms of radiation sensitivity in humans and will stimulate the development of new high-throughput technologies for a broader application of biological and medical sciences. The staff members are specialists in a variety of fields, including genome science, radiation biology, medical science, molecular biology, and informatics, and have joined the RadGenomics project from various universities, companies, and research institutes. The project started in April 2001. (author)

  13. Comparative Genome Viewer

    International Nuclear Information System (INIS)

    Molineris, I.; Sales, G.

    2009-01-01

    The amount of information about genomes, both in the form of complete sequences and annotations, has been exponentially increasing in the last few years. As a result there is the need for tools providing a graphical representation of such information that should be comprehensive and intuitive. Visual representation is especially important in the comparative genomics field since it should provide a combined view of data belonging to different genomes. We believe that existing tools are limited in this respect as they focus on a single genome at a time (conservation histograms) or compress alignment representation to a single dimension. We have therefore developed a web-based tool called Comparative Genome Viewer (Cgv): it integrates a bidimensional representation of alignments between two regions, both at small and big scales, with the richness of annotations present in other genome browsers. We give access to our system through a web-based interface that provides the user with an interactive representation that can be updated in real time using the mouse to move from region to region and to zoom in on interesting details.

  14. Human social genomics.

    Directory of Open Access Journals (Sweden)

    Steven W Cole

    2014-08-01

    A growing literature in human social genomics has begun to analyze how everyday life circumstances influence human gene expression. Social-environmental conditions such as urbanity, low socioeconomic status, social isolation, social threat, and low or unstable social status have been found to associate with differential expression of hundreds of gene transcripts in leukocytes and diseased tissues such as metastatic cancers. In leukocytes, diverse types of social adversity evoke a common conserved transcriptional response to adversity (CTRA) characterized by increased expression of proinflammatory genes and decreased expression of genes involved in innate antiviral responses and antibody synthesis. Mechanistic analyses have mapped the neural "social signal transduction" pathways that stimulate CTRA gene expression in response to social threat and may contribute to social gradients in health. Research has also begun to analyze the functional genomics of optimal health and thriving. Two emerging opportunities now stand to revolutionize our understanding of the everyday life of the human genome: network genomics analyses examining how systems-level capabilities emerge from groups of individual socially sensitive genomes, and near-real-time transcriptional biofeedback to empirically optimize individual well-being in the context of the unique genetic, geographic, historical, developmental, and social contexts that jointly shape the transcriptional realization of our innate human genomic potential for thriving.

  15. RadGenomics project

    International Nuclear Information System (INIS)

    Iwakawa, Mayumi; Imai, Takashi; Harada, Yoshinobu

    2002-01-01

    Human health is determined by a complex interplay of factors, predominantly between genetic susceptibility, environmental conditions and aging. The ultimate aim of the RadGenomics (Radiation Genomics) project is to understand the implications of heterogeneity in responses to ionizing radiation arising from genetic variation between individuals in the human population. The rapid progression of the human genome sequencing and the recent development of new technologies in molecular genetics are providing us with new opportunities to understand the genetic basis of individual differences in susceptibility to natural and/or artificial environmental factors, including radiation exposure. The RadGenomics project will inevitably lead to improved protocols for personalized radiotherapy and reductions in the potential side effects of such treatment. The project will contribute to future research into the molecular mechanisms of radiation sensitivity in humans and will stimulate the development of new high-throughput technologies for a broader application of biological and medical sciences. The staff members are specialists in a variety of fields, including genome science, radiation biology, medical science, molecular biology, and informatics, and have joined the RadGenomics project from various universities, companies, and research institutes. The project started in April 2001. (author)

  16. Ultrafast comparison of personal genomes

    OpenAIRE

    Mauldin, Denise; Hood, Leroy; Robinson, Max; Glusman, Gustavo

    2017-01-01

    We present an ultra-fast method for comparing personal genomes. We transform the standard genome representation (lists of variants relative to a reference) into 'genome fingerprints' that can be readily compared across sequencing technologies and reference versions. Because of their reduced size, computation on the genome fingerprints is fast and requires little memory. This enables scaling up a variety of important genome analyses, including quantifying relatedness, recognizing duplicative s...

  17. Genomic value prediction for quantitative traits under the epistatic model

    Directory of Open Access Journals (Sweden)

    Xu Shizhong

    2011-01-01

    Background: Most quantitative traits are controlled by multiple quantitative trait loci (QTL). The contribution of each locus may be negligible but the collective contribution of all loci is usually significant. Genome selection that uses markers of the entire genome to predict the genomic values of individual plants or animals can be more efficient than selection on phenotypic values and pedigree information alone for genetic improvement. When a quantitative trait is contributed by epistatic effects, using all markers (main effects) and marker pairs (epistatic effects) to predict the genomic values of plants can achieve the maximum efficiency for genetic improvement. Results: In this study, we created 126 recombinant inbred lines of soybean and genotyped 80 markers across the genome. We applied the genome selection technique to predict the genomic value of somatic embryo number (a quantitative trait) for each line. Cross-validation analysis showed that the squared correlation coefficient between the observed and predicted embryo numbers was 0.33 when only main (additive) effects were used for prediction. When the interaction (epistatic) effects were also included in the model, the squared correlation coefficient reached 0.78. Conclusions: This study provided an excellent example for the application of genome selection to plant breeding.
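
    The two ingredients named in this abstract, a predictor set of main effects plus marker pairs and a squared correlation between observed and predicted values, can be sketched as follows. This is only an illustration: the +1/-1 genotype coding and the function names are assumptions, and the model fitting and cross-validation procedure of the study are not reproduced here.

```python
from itertools import combinations


def epistatic_features(genotype):
    """Main effects followed by all pairwise marker products.

    `genotype` is one line's marker vector; the +1/-1 coding for the two
    parental alleles is an assumption made for this sketch.
    """
    main = list(genotype)
    pairs = [a * b for a, b in combinations(genotype, 2)]
    return main + pairs


def squared_correlation(observed, predicted):
    """Squared Pearson correlation between observed and predicted values."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov * cov / (var_o * var_p)


# With 80 markers the feature vector has 80 main effects plus
# 80 * 79 / 2 = 3160 pairwise terms.
n_features = len(epistatic_features([1] * 80))
```

    With 126 lines and several thousand candidate effects, any such model needs some form of shrinkage or variable selection; that step is deliberately left out of the sketch.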

  18. Elevated Rate of Genome Rearrangements in Radiation-Resistant Bacteria.

    Science.gov (United States)

    Repar, Jelena; Supek, Fran; Klanjscek, Tin; Warnecke, Tobias; Zahradka, Ksenija; Zahradka, Davor

    2017-04-01

    A number of bacterial, archaeal, and eukaryotic species are known for their resistance to ionizing radiation. One of the challenges these species face is a potent environmental source of DNA double-strand breaks, potential drivers of genome structure evolution. Efficient and accurate DNA double-strand break repair systems have been demonstrated in several unrelated radiation-resistant species and are putative adaptations to the DNA damaging environment. Such adaptations are expected to compensate for the genome-destabilizing effect of environmental DNA damage and may be expected to result in a more conserved gene order in radiation-resistant species. However, here we show that rates of genome rearrangements, measured as loss of gene order conservation with time, are higher in radiation-resistant species in multiple, phylogenetically independent groups of bacteria. Comparison of indicators of selection for genome organization between radiation-resistant and phylogenetically matched, nonresistant species argues against tolerance to disruption of genome structure as a strategy for radiation resistance. Interestingly, an important mechanism affecting genome rearrangements in prokaryotes, the symmetrical inversions around the origin of DNA replication, shapes genome structure of both radiation-resistant and nonresistant species. In conclusion, the opposing effects of environmental DNA damage and DNA repair result in elevated rates of genome rearrangements in radiation-resistant bacteria. Copyright © 2017 Repar et al.

  19. A computational genomics pipeline for prokaryotic sequencing projects.

    Science.gov (United States)

    Kislyuk, Andrey O; Katz, Lee S; Agrawal, Sonia; Hagen, Matthew S; Conley, Andrew B; Jayaraman, Pushkala; Nelakuditi, Viswateja; Humphrey, Jay C; Sammons, Scott A; Govil, Dhwani; Mair, Raydel D; Tatti, Kathleen M; Tondella, Maria L; Harcourt, Brian H; Mayer, Leonard W; Jordan, I King

    2010-08-01

    New sequencing technologies have accelerated research on prokaryotic genomes and have made genome sequencing operations outside major genome sequencing centers routine. However, no off-the-shelf solution exists for the combined assembly, gene prediction, genome annotation and data presentation necessary to interpret sequencing data. The resulting requirement to invest significant resources into custom informatics support for genome sequencing projects remains a major impediment to the accessibility of high-throughput sequence data. We present a self-contained, automated high-throughput open source genome sequencing and computational genomics pipeline suitable for prokaryotic sequencing projects. The pipeline has been used at the Georgia Institute of Technology and the Centers for Disease Control and Prevention for the analysis of Neisseria meningitidis and Bordetella bronchiseptica genomes. The pipeline is capable of enhanced or manually assisted reference-based assembly using multiple assemblers and modes; gene predictor combining; and functional annotation of genes and gene products. Because every component of the pipeline is executed on a local machine with no need to access resources over the Internet, the pipeline is suitable for projects of a sensitive nature. Annotation of virulence-related features makes the pipeline particularly useful for projects working with pathogenic prokaryotes. The pipeline is licensed under the open-source GNU General Public License and available at the Georgia Tech Neisseria Base (http://nbase.biology.gatech.edu/). The pipeline is implemented with a combination of Perl, Bourne Shell and MySQL and is compatible with Linux and other Unix systems.

  20. Clusters of orthologous genes for 41 archaeal genomes and implications for evolutionary genomics of archaea

    Directory of Open Access Journals (Sweden)

    Wolf Yuri I

    2007-11-01

    Background: An evolutionary classification of genes from sequenced genomes that distinguishes between orthologs and paralogs is indispensable for genome annotation and evolutionary reconstruction. Shortly after multiple genome sequences of bacteria, archaea, and unicellular eukaryotes became available, an attempt at such a classification was implemented in Clusters of Orthologous Groups of proteins (COGs). Rapid accumulation of genome sequences creates opportunities for refining COGs but also represents a challenge because of error amplification. One of the practical strategies involves construction of refined COGs for phylogenetically compact subsets of genomes. Results: New Archaeal Clusters of Orthologous Genes (arCOGs) were constructed for 41 archaeal genomes (13 Crenarchaeota, 27 Euryarchaeota and one Nanoarchaeon) using an improved procedure that employs a similarity tree between smaller, group-specific clusters, semi-automatically partitions orthology domains in multidomain proteins, and uses profile searches for identification of remote orthologs. The annotation of arCOGs is a consensus between three assignments based on the COGs, the CDD database, and the annotations of homologs in the NR database. The 7538 arCOGs, on average, cover ~88% of the genes in a genome compared to a ~76% coverage in COGs. The finer granularity of ortholog identification in the arCOGs is apparent from the fact that 4538 arCOGs correspond to 2362 COGs; ~40% of the arCOGs are new. The archaeal gene core (protein-coding genes found in all 41 genomes) consists of 166 arCOGs. The arCOGs were used to reconstruct gene loss and gene gain events during archaeal evolution and gene sets of ancestral forms. The Last Archaeal Common Ancestor (LACA) is conservatively estimated to possess 996 genes compared to 1245 and 1335 genes for the last common ancestors of Crenarchaeota and Euryarchaeota, respectively. It is inferred that LACA was a chemoautotrophic hyperthermophile

  1. CoCoNUT: an efficient system for the comparison and analysis of genomes

    Directory of Open Access Journals (Sweden)

    Kurtz Stefan

    2008-11-01

    Background: Comparative genomics is the analysis and comparison of genomes from different species. This area of research is driven by the large number of sequenced genomes and heavily relies on efficient algorithms and software to perform pairwise and multiple genome comparisons. Results: Most of the software tools available are tailored for one specific task. In contrast, we have developed a novel system, CoCoNUT (Computational Comparative geNomics Utility Toolkit), that allows solving several different tasks in a unified framework: (1) finding regions of high similarity among multiple genomic sequences and aligning them, (2) comparing two draft or multi-chromosomal genomes, (3) locating large segmental duplications in large genomic sequences, and (4) mapping cDNA/EST to genomic sequences. Conclusion: CoCoNUT is competitive with other software tools w.r.t. the quality of the results. The use of state-of-the-art algorithms and data structures allows CoCoNUT to solve comparative genomics tasks more efficiently than previous tools. With the improved user interface (including an interactive visualization component), CoCoNUT provides a unified, versatile, and easy-to-use software tool for large-scale studies in comparative genomics.

  2. Decoding Synteny Blocks and Large-Scale Duplications in Mammalian and Plant Genomes

    Science.gov (United States)

    Peng, Qian; Alekseyev, Max A.; Tesler, Glenn; Pevzner, Pavel A.

    The existing synteny block reconstruction algorithms use anchors (e.g., orthologous genes) shared over all genomes to construct the synteny blocks for multiple genomes. This approach, while efficient for a few genomes, cannot be scaled to address the need to construct synteny blocks in many mammalian genomes that are currently being sequenced. The problem is that the number of anchors shared among all genomes quickly decreases with the increase in the number of genomes. Another problem is that many genomes (plant genomes in particular) had extensive duplications, which makes decoding of genomic architecture and rearrangement analysis in plants difficult. The existing synteny block generation algorithms in plants do not address the issue of generating non-overlapping synteny blocks suitable for analyzing rearrangements and evolution history of duplications. We present a new algorithm based on the A-Bruijn graph framework that overcomes these difficulties and provides a unified approach to synteny block reconstruction for multiple genomes, and for genomes with large duplications.

  3. gmos: Rapid Detection of Genome Mosaicism over Short Evolutionary Distances.

    Directory of Open Access Journals (Sweden)

    Mirjana Domazet-Lošo

    Prokaryotic and viral genomes are often altered by recombination and horizontal gene transfer. The existing methods for detecting recombination are primarily aimed at viral genomes or sets of loci, since the expensive computation of underlying statistical models often hinders the comparison of complete prokaryotic genomes. As an alternative, alignment-free solutions are more efficient, but cannot map (align) a query to subject genomes. To address this problem, we have developed gmos (Genome MOsaic Structure), a new program that determines the mosaic structure of query genomes when compared to a set of closely related subject genomes. The program first computes local alignments between query and subject genomes and then reconstructs the query mosaic structure by choosing the best local alignment for each query region. To accomplish the analysis quickly, the program mostly relies on pairwise alignments and constructs multiple sequence alignments over short overlapping subject regions only when necessary. This fine-tuned implementation achieves an efficiency comparable to an alignment-free tool. The program performs well for simulated and real data sets of closely related genomes and can be used for fast recombination detection; for instance, when a new prokaryotic pathogen is discovered. As an example, gmos was used to detect genome mosaicism in a pathogenic Enterococcus faecium strain compared to seven closely related genomes. The analysis took less than two minutes on a single 2.1 GHz processor. The output is available in FASTA format and can be visualized using an accessory program, gmosDraw (freely available with gmos).
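
    The idea of "choosing the best local alignment for each query region" can be illustrated with a small sketch. It is not the gmos implementation: the tuple layout (start, end, subject, score), the half-open coordinates, and the per-position tie-breaking are all assumptions made for illustration.

```python
def mosaic(query_len, alignments):
    """Assign each query position to the subject of its best-scoring alignment.

    `alignments` is an iterable of (start, end, subject, score) tuples with
    half-open query coordinates [start, end). Returns a list of
    (start, end, subject) segments; subject is None where nothing aligns.
    """
    best_subject = [None] * query_len
    best_score = [float("-inf")] * query_len
    for start, end, subject, score in alignments:
        for pos in range(max(start, 0), min(end, query_len)):
            if score > best_score[pos]:
                best_score[pos] = score
                best_subject[pos] = subject

    # Merge consecutive positions that share a subject into one segment.
    segments = []
    for pos, subject in enumerate(best_subject):
        if segments and segments[-1][2] == subject:
            segments[-1] = (segments[-1][0], pos + 1, subject)
        else:
            segments.append((pos, pos + 1, subject))
    return segments


# Hypothetical input: two overlapping local alignments on a 12-base query.
example = mosaic(12, [(0, 6, "A", 50), (4, 10, "B", 80)])
# example == [(0, 4, "A"), (4, 10, "B"), (10, 12, None)]
```

    A per-position scan like this is quadratic in the worst case; an interval-based sweep would be the natural choice for whole genomes.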

  4. Detecting uber-operons in prokaryotic genomes.

    Science.gov (United States)

    Che, Dongsheng; Li, Guojun; Mao, Fenglou; Wu, Hongwei; Xu, Ying

    2006-01-01

    We present a study on computational identification of uber-operons in a prokaryotic genome, each of which represents a group of operons that are evolutionarily or functionally associated through operons in other (reference) genomes. Uber-operons represent a rich set of footprints of operon evolution, whose full utilization could lead to new and more powerful tools for elucidation of biological pathways and networks than what operons have provided, and a better understanding of prokaryotic genome structures and evolution. Our prediction algorithm predicts uber-operons through identifying groups of functionally or transcriptionally related operons, whose gene sets are conserved across the target and multiple reference genomes. Using this algorithm, we have predicted uber-operons for each of a group of 91 genomes, using the other 90 genomes as references. In particular, we predicted 158 uber-operons in Escherichia coli K12 covering 1830 genes, and found that many of the uber-operons correspond to parts of known regulons or biological pathways or are involved in highly related biological processes based on their Gene Ontology (GO) assignments. For some of the predicted uber-operons that are not parts of known regulons or pathways, our analyses indicate that their genes are highly likely to work together in the same biological processes, suggesting the possibility of new regulons and pathways. We believe that our uber-operon prediction provides a highly useful capability and a rich information source for elucidation of complex biological processes, such as pathways in microbes. All the prediction results are available at our Uber-Operon Database: http://csbl.bmb.uga.edu/uber, the first of its kind.

  5. Genomics using the Assembly of the Mink Genome

    DEFF Research Database (Denmark)

    Guldbrandtsen, Bernt; Cai, Zexi; Sahana, Goutam

    2018-01-01

    The American Mink’s (Neovison vison) genome has recently been sequenced. This opens numerous avenues of research both for studying the basic genetics and physiology of the mink as well as genetic improvement in mink. Using genotyping-by-sequencing (GBS) generated marker data for 2,352 Danish farm...... mink runs of homozygosity (ROH) were detected in mink genomes. Detectable ROH made up on average 1.7% of the genome indicating the presence of at most a moderate level of genomic inbreeding. The fraction of genome regions found in ROH varied. Ten percent of the included regions were never found in ROH....... The ability to detect ROH in the mink genome also demonstrates the general reliability of the new mink genome assembly. Keywords: american mink, run of homozygosity, genome, selection, genomic inbreeding...
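
    As a rough illustration of what detecting runs of homozygosity involves, the sketch below scans an ordered list of marker calls and reports stretches of consecutive homozygous markers. The marker-count threshold and the boolean encoding are assumptions for this example; the abstract does not state the detection criteria used in the study, and practical tools typically also consider physical length and tolerate occasional heterozygous calls.

```python
def find_roh(is_het, min_markers):
    """Return half-open (start, end) index ranges of homozygous runs.

    `is_het` holds one truthy/falsy value per marker in genome order
    (truthy = heterozygous call). Only runs of at least `min_markers`
    consecutive homozygous markers are reported.
    """
    runs = []
    start = None
    for i, het in enumerate(is_het):
        if not het:
            if start is None:
                start = i  # a new homozygous stretch begins here
        else:
            if start is not None and i - start >= min_markers:
                runs.append((start, i))
            start = None
    # Close a run that extends to the last marker.
    if start is not None and len(is_het) - start >= min_markers:
        runs.append((start, len(is_het)))
    return runs


def roh_fraction(is_het, min_markers):
    """Fraction of markers that fall inside reported runs."""
    covered = sum(end - start for start, end in find_roh(is_het, min_markers))
    return covered / len(is_het)


calls = [0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0]  # hypothetical marker calls
runs = find_roh(calls, min_markers=3)
# runs == [(0, 3), (4, 9)]
```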

  6. Genome size analyses of Pucciniales reveal the largest fungal genomes.

    Science.gov (United States)

    Tavares, Sílvia; Ramos, Ana Paula; Pires, Ana Sofia; Azinheira, Helena G; Caldeirinha, Patrícia; Link, Tobias; Abranches, Rita; Silva, Maria do Céu; Voegele, Ralf T; Loureiro, João; Talhinhas, Pedro

    2014-01-01

    Rust fungi (Basidiomycota, Pucciniales) are biotrophic plant pathogens which exhibit diverse complexities in their life cycles and host ranges. The completion of genome sequencing of a few rust fungi has revealed the occurrence of large genomes. Sequencing efforts for other rust fungi have been hampered by uncertainty concerning their genome sizes. Flow cytometry was recently applied to estimate the genome size of a few rust fungi, and confirmed the occurrence of large genomes in this order (averaging 225.3 Mbp, while the average for Basidiomycota was 49.9 Mbp and was 37.7 Mbp for all fungi). In this work, we have used an innovative and simple approach to simultaneously isolate nuclei from the rust and its host plant in order to estimate the genome size of 30 rust species by flow cytometry. Genome sizes varied over 10-fold, from 70 to 893 Mbp, with an average genome size value of 380.2 Mbp. Compared to the genome sizes of over 1800 fungi, Gymnosporangium confusum possesses the largest fungal genome ever reported (893.2 Mbp). Moreover, even the smallest rust genome determined in this study is larger than the vast majority of fungal genomes (94%). The average genome size of the Pucciniales is now 305.5 Mbp, while the average Basidiomycota genome size has shifted to 70.4 Mbp and the average for all fungi has reached 44.2 Mbp. Although no correlation could be drawn between the genome sizes, the phylogenomics or the life cycle of rust fungi, it is interesting to note that rusts with Fabaceae hosts present genomes clearly larger than those with Poaceae hosts. Although this study comprises only a small fraction of the more than 7000 rust species described, it already seems evident that the Pucciniales represent a group where genome size expansion could be a common characteristic. This is in sharp contrast to sister taxa, placing this order in a relevant position in fungal genomics research.

  7. Global organization of a positive-strand RNA virus genome.

    Directory of Open Access Journals (Sweden)

    Baodong Wu

    The genomes of plus-strand RNA viruses contain many regulatory sequences and structures that direct different viral processes. The traditional view of these RNA elements is that they are local structures present in non-coding regions. However, this view is changing due to the discovery of regulatory elements in coding regions and functional long-range intra-genomic base pairing interactions. The ∼4.8 kb long RNA genome of the tombusvirus tomato bushy stunt virus (TBSV) contains these types of structural features, including six different functional long-distance interactions. We hypothesized that to achieve these multiple interactions this viral genome must utilize a large-scale organizational strategy and, accordingly, we sought to assess the global conformation of the entire TBSV genome. Atomic force micrographs of the genome indicated a mostly condensed structure composed of interconnected protrusions extending from a central hub. This configuration was consistent with the genomic secondary structure model generated using high-throughput selective 2'-hydroxyl acylation analysed by primer extension (i.e. SHAPE), which predicted different sized RNA domains originating from a central region. Known RNA elements were identified in both domain and inter-domain regions, and novel structural features were predicted and functionally confirmed. Interestingly, only two of the six long-range interactions known to form were present in the structural model. However, for those interactions that did not form, complementary partner sequences were positioned relatively close to each other in the structure, suggesting that the secondary structure level of viral genome structure could provide a basic scaffold for the formation of different long-range interactions. The higher-order structural model for the TBSV RNA genome provides a snapshot of the complex framework that allows multiple functional components to operate in concert within a confined context.

  8. Genomes to Proteomes

    Energy Technology Data Exchange (ETDEWEB)

    Panisko, Ellen A. [Pacific Northwest National Lab. (PNNL), Richland, WA (United States); Grigoriev, Igor [USDOE Joint Genome Inst., Walnut Creek, CA (United States); Daly, Don S. [Pacific Northwest National Lab. (PNNL), Richland, WA (United States); Webb-Robertson, Bobbie-Jo [Pacific Northwest National Lab. (PNNL), Richland, WA (United States); Baker, Scott E. [Pacific Northwest National Lab. (PNNL), Richland, WA (United States)

    2009-03-01

    Biologists are awash with genomic sequence data. In large part, this is due to the rapid acceleration in the generation of DNA sequence that occurred as public and private research institutes raced to sequence the human genome. In parallel with the large human genome effort, mostly smaller genomes of other important model organisms were sequenced. Projects following on these initial efforts have made use of technological advances and the DNA sequencing infrastructure that was built for the human and other organism genome projects. As a result, the genome sequences of many organisms are available in high quality draft form. While in many ways this is good news, there are limitations to the biological insights that can be gleaned from DNA sequences alone; genome sequences offer only a bird's eye view of the biological processes endemic to an organism or community. Fortunately, the genome sequences now being produced at such a high rate can serve as the foundation for other global experimental platforms such as proteomics. Proteomic methods offer a snapshot of the proteins present at a point in time for a given biological sample. Current global proteomics methods combine enzymatic digestion, separations, mass spectrometry and database searching for peptide identification. One key aspect of proteomics is the prediction of peptide sequences from mass spectrometry data. Global proteomic analysis uses computational matching of experimental mass spectra with predicted spectra based on databases of gene models that are often generated computationally. Thus, the quality of gene models predicted from a genome sequence is crucial in the generation of high quality peptide identifications. Once peptides are identified they can be assigned to their parent protein. Proteins identified as expressed in a given experiment are most useful when compared to other expressed proteins in a larger biological context or biochemical pathway. In this chapter we will discuss the automatic

  9. Experimental Induction of Genome Chaos.

    Science.gov (United States)

    Ye, Christine J; Liu, Guo; Heng, Henry H

    2018-01-01

    Genome chaos, or karyotype chaos, represents a powerful survival strategy for somatic cells under high levels of stress/selection. Since the genome context, not the gene content, encodes the genomic blueprint of the cell, stress-induced rapid and massive reorganization of genome topology functions as a very important mechanism for genome (karyotype) evolution. In recent years, the phenomenon of genome chaos has been confirmed by various sequencing efforts, and many different terms have been coined to describe different subtypes of the chaotic genome including "chromothripsis," "chromoplexy," and "structural mutations." To advance this exciting field, we need an effective experimental system to induce and characterize the karyotype reorganization process. In this chapter, an experimental protocol to induce chaotic genomes is described, following a brief discussion of the mechanism and implication of genome chaos in cancer evolution.

  10. SAGE: String-overlap Assembly of GEnomes.

    Science.gov (United States)

    Ilie, Lucian; Haider, Bahlul; Molnar, Michael; Solis-Oba, Roberto

    2014-09-15

    De novo genome assembly of next-generation sequencing data is one of the most important current problems in bioinformatics, essential in many biological applications. In spite of a significant amount of work in this area, better solutions are still very much needed. We present a new program, SAGE, for de novo genome assembly. As opposed to most assemblers, which are de Bruijn graph based, SAGE uses the string-overlap graph. SAGE builds upon great existing work on the string-overlap graph and maximum likelihood assembly, bringing an important number of new ideas, such as the efficient computation of the transitive reduction of the string-overlap graph, the use of (generalized) edge multiplicity statistics for more accurate estimation of read copy counts, and the improved use of mate pairs and min-cost flow for supporting edge merging. The assemblies produced by SAGE for several short and medium-size genomes compared favourably with those of existing leading assemblers. SAGE benefits from innovations in almost every aspect of the assembly process: error correction of input reads, string-overlap graph construction, read copy count estimation, overlap graph analysis and reduction, contig extraction, and scaffolding. We hope that these new ideas will help advance the current state of the art in an essential area of research in genomics.
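
    To make the idea of a string-overlap graph and its transitive reduction concrete, the following is a minimal sketch in Python. It is not SAGE's algorithm: the function names, the `min_len` threshold, and the naive cubic-time reduction are all assumptions chosen for illustration, and the abstract describes a far more efficient computation.

```python
from itertools import permutations


def overlap_len(a: str, b: str, min_len: int) -> int:
    """Longest suffix of `a` equal to a prefix of `b`; 0 if shorter than min_len."""
    for k in range(min(len(a), len(b)) - 1, min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def build_overlap_graph(reads, min_len=3):
    """Map each read to {successor read: overlap length} (hypothetical helper)."""
    graph = {r: {} for r in reads}
    for a, b in permutations(reads, 2):
        k = overlap_len(a, b, min_len)
        if k:
            graph[a][b] = k
    return graph


def transitive_reduction(graph):
    """Drop edge a->c whenever some b provides a->b and b->c (naive version)."""
    reduced = {a: dict(succ) for a, succ in graph.items()}
    for a, succ in graph.items():
        for b in succ:
            for c in graph[b]:
                if c != a and c in succ:
                    reduced[a].pop(c, None)
    return reduced


if __name__ == "__main__":
    reads = ["ACGTAC", "GTACGG", "ACGGTT"]  # toy reads from "ACGTACGGTT"
    g = build_overlap_graph(reads, min_len=2)
    print(transitive_reduction(g))
```

    In the toy input, the first read overlaps both later reads, but its short overlap with the third read is implied by the path through the second read, so the reduction removes that edge and leaves a simple chain.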

  11. The agents of natural genome editing.

    Science.gov (United States)

    Witzany, Guenther

    2011-06-01

    The DNA serves as a stable information storage medium and every protein which is needed by the cell is produced from this blueprint via an RNA intermediate code. More recently it was found that an abundance of various RNA elements cooperate in a variety of steps and substeps as regulatory and catalytic units with multiple competencies to act on RNA transcripts. Natural genome editing on one side is the competent agent-driven generation and integration of meaningful DNA nucleotide sequences into pre-existing genomic content arrangements, and the ability to (re-)combine and (re-)regulate them according to context-dependent (i.e. adaptational) purposes of the host organism. Natural genome editing on the other side designates the integration of all RNA activities acting on RNA transcripts without altering DNA-encoded genes. If we take the genetic code seriously as a natural code, there must be agents that are competent to act on this code because no natural code codes itself as no natural language speaks itself. As code editing agents, viral and subviral agents have been suggested because there are several indicators that demonstrate viruses competent in both RNA and DNA natural genome editing.

  12. Hierarchical role for transcription factors and chromatin structure in genome organization along adipogenesis

    DEFF Research Database (Denmark)

    Sarusi Portuguez, Avital; Schwartz, Michal; Siersbaek, Rasmus

    2017-01-01

    The three dimensional folding of mammalian genomes is cell type specific and difficult to alter suggesting that it is an important component of gene regulation. However, given the multitude of chromatin-associating factors, the mechanisms driving the colocalization of active chromosomal domains...... by PPARγ and Lpin1, undergoes orchestrated reorganization during adipogenesis. Coupling the dynamics of genome architecture with multiple chromatin datasets indicated that among all the transcription factors (TFs) tested, RXR is central to genome reorganization at the beginning of adipogenesis...

  13. Discovery of functional elements in 12 Drosophila genomes using evolutionary signatures

    DEFF Research Database (Denmark)

    Stark, Alexander; Lin, Michael F; Kheradpour, Pouya

    2007-01-01

    Sequencing of multiple related species followed by comparative genomics analysis constitutes a powerful approach for the systematic understanding of any genome. Here, we use the genomes of 12 Drosophila species for the de novo discovery of functional elements in the fly. Each type of functional e...... individual motif instances with high confidence. We also study how discovery power scales with the divergence and number of species compared, and we provide general guidelines for comparative studies....

  14. Genome Sequences of Oryza Species

    KAUST Repository

    Kumagai, Masahiko; Tanaka, Tsuyoshi; Ohyanagi, Hajime; Hsing, Yue-Ie C.; Itoh, Takeshi

    2018-01-01

    This chapter summarizes recent data obtained from genome sequencing, annotation projects, and studies on the genome diversity of Oryza sativa and related Oryza species. O. sativa, commonly known as Asian rice, is the first monocot species whose complete genome sequence was deciphered based on physical mapping by an international collaborative effort. This genome, along with its accurate and comprehensive annotation, has become an indispensable foundation for crop genomics and breeding. With the development of innovative sequencing technologies, genomic studies of O. sativa have dramatically increased; in particular, a large number of cultivars and wild accessions have been sequenced and compared with the reference rice genome. Since de novo genome sequencing has become cost-effective, the genome of African cultivated rice, O. glaberrima, has also been determined. Comparative genomic studies have highlighted the independent domestication processes of different rice species, but it also turned out that Asian and African rice share a common gene set that has experienced similar artificial selection. An international project aimed at constructing reference genomes and examining the genome diversity of wild Oryza species is currently underway, and the genomes of some species are publicly available. This project provides a platform for investigations such as the evolution, development, polyploidization, and improvement of crops. Studies on the genomic diversity of Oryza species, including wild species, should provide new insights to solve the problem of growing food demands in the face of rapid climatic changes.

  15. Genome Sequences of Oryza Species

    KAUST Repository

    Kumagai, Masahiko

    2018-02-14

    This chapter summarizes recent data obtained from genome sequencing, annotation projects, and studies on the genome diversity of Oryza sativa and related Oryza species. O. sativa, commonly known as Asian rice, is the first monocot species whose complete genome sequence was deciphered based on physical mapping by an international collaborative effort. This genome, along with its accurate and comprehensive annotation, has become an indispensable foundation for crop genomics and breeding. With the development of innovative sequencing technologies, genomic studies of O. sativa have dramatically increased; in particular, a large number of cultivars and wild accessions have been sequenced and compared with the reference rice genome. Since de novo genome sequencing has become cost-effective, the genome of African cultivated rice, O. glaberrima, has also been determined. Comparative genomic studies have highlighted the independent domestication processes of different rice species, but it also turned out that Asian and African rice share a common gene set that has experienced similar artificial selection. An international project aimed at constructing reference genomes and examining the genome diversity of wild Oryza species is currently underway, and the genomes of some species are publicly available. This project provides a platform for investigations such as the evolution, development, polyploidization, and improvement of crops. Studies on the genomic diversity of Oryza species, including wild species, should provide new insights to solve the problem of growing food demands in the face of rapid climatic changes.

  16. Genome position specific priors for genomic prediction

    DEFF Research Database (Denmark)

    Brøndum, Rasmus Froberg; Su, Guosheng; Lund, Mogens Sandø

    2012-01-01

    causal mutation is different between the populations but affects the same gene. Proportions of a four-distribution mixture for SNP effects in segments of fixed size along the genome are derived from one population and set as location specific prior proportions of distributions of SNP effects...... for the target population. The model was tested using dairy cattle populations of different breeds: 540 Australian Jersey bulls, 2297 Australian Holstein bulls and 5214 Nordic Holstein bulls. The traits studied were protein-, fat- and milk yield. Genotypic data was Illumina 777K SNPs, real or imputed Results...

  17. Diversity of Pseudomonas Genomes, Including Populus-Associated Isolates, as Revealed by Comparative Genome Analysis.

    Science.gov (United States)

    Jun, Se-Ran; Wassenaar, Trudy M; Nookaew, Intawat; Hauser, Loren; Wanchai, Visanu; Land, Miriam; Timm, Collin M; Lu, Tse-Yuan S; Schadt, Christopher W; Doktycz, Mitchel J; Pelletier, Dale A; Ussery, David W

    2016-01-01

    The Pseudomonas genus contains a metabolically versatile group of organisms that are known to occupy numerous ecological niches, including the rhizosphere and endosphere of many plants. Their diversity influences the phylogenetic diversity and heterogeneity of these communities. On the basis of average amino acid identity, comparative genome analysis of >1,000 Pseudomonas genomes, including 21 Pseudomonas strains isolated from the roots of native Populus deltoides (eastern cottonwood) trees resulted in consistent and robust genomic clusters with phylogenetic homogeneity. All Pseudomonas aeruginosa genomes clustered together, and these were clearly distinct from other Pseudomonas species groups on the basis of pangenome and core genome analyses. In contrast, the genomes of Pseudomonas fluorescens were organized into 20 distinct genomic clusters, representing enormous diversity and heterogeneity. Most of our 21 Populus-associated isolates formed three distinct subgroups within the major P. fluorescens group, supported by pathway profile analysis, while two isolates were more closely related to Pseudomonas chlororaphis and Pseudomonas putida. Genes specific to Populus-associated subgroups were identified. Genes specific to subgroup 1 include several sensory systems that act in two-component signal transduction, a TonB-dependent receptor, and a phosphorelay sensor. Genes specific to subgroup 2 contain hypothetical genes, and genes specific to subgroup 3 were annotated with hydrolase activity. This study justifies the need to sequence multiple isolates, especially from P. fluorescens, which displays the most genetic variation, in order to study functional capabilities from a pangenomic perspective. This information will prove useful when choosing Pseudomonas strains for use to promote growth and increase disease resistance in plants. Copyright © 2015 Jun et al.

  18. Vitamin D and the brain: Genomic and non-genomic actions.

    Science.gov (United States)

    Cui, Xiaoying; Gooch, Helen; Petty, Alice; McGrath, John J; Eyles, Darryl

    2017-09-15

    1,25(OH)2D3 (vitamin D) is well-recognized as a neurosteroid that modulates multiple brain functions. A growing body of evidence indicates that vitamin D plays a pivotal role in brain development, neurotransmission, neuroprotection and immunomodulation. However, the precise molecular mechanisms by which vitamin D exerts these functions in the brain are still unclear. Vitamin D signalling occurs via the vitamin D receptor (VDR), a zinc-finger protein in the nuclear receptor superfamily. Like other nuclear steroids, vitamin D has both genomic and non-genomic actions. The transcriptional activity of vitamin D occurs via the nuclear VDR. Its faster, non-genomic actions can occur when the VDR is distributed outside the nucleus. The VDR is present in the developing and adult brain where it mediates the effects of vitamin D on brain development and function. The purpose of this review is to summarise the in vitro and in vivo work that has been conducted to characterise the genomic and non-genomic actions of vitamin D in the brain. Additionally we link these processes to functional neurochemical and behavioural outcomes. Elucidation of the precise molecular mechanisms underpinning vitamin D signalling in the brain may prove useful in understanding the role this steroid plays in brain ontogeny and function. Copyright © 2017 Elsevier B.V. All rights reserved.

  19. The spectrum of genomic signatures: from dinucleotides to chaos game representation.

    Science.gov (United States)

    Wang, Yingwei; Hill, Kathleen; Singh, Shiva; Kari, Lila

    2005-02-14

    In the post genomic era, access to complete genome sequence data for numerous diverse species has opened multiple avenues for examining and comparing primary DNA sequence organization of entire genomes. Previously, the concept of a genomic signature was introduced with the observation of species-type specific Dinucleotide Relative Abundance Profiles (DRAPs); dinucleotides were identified as the subsequences with the greatest bias in representation in a majority of genomes. Herein, we demonstrate that DRAP is one particular genomic signature contained within a broader spectrum of signatures. Within this spectrum, an alternative genomic signature, Chaos Game Representation (CGR), provides a unique visualization of patterns in sequence organization. A genomic signature is associated with a particular integer order or subsequence length that represents a measure of the resolution or granularity in the analysis of primary DNA sequence organization. We quantitatively explore the organizational information provided by genomic signatures of different orders through different distance measures, including a novel Image Distance. The Image Distance and other existing distance measures are evaluated by comparing the phylogenetic trees they generate for 26 complete mitochondrial genomes from a diversity of species. The phylogenetic tree generated by the Image Distance is compatible with the known relatedness of species. Quantitative evaluation of the spectrum of genomic signatures may be used to ultimately gain insight into the determinants and biological relevance of the genome signatures.
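
    The two signatures named in this abstract can be illustrated with a short Python sketch. It assumes the commonly used definitions: the relative abundance of a dinucleotide xy is f(xy) / (f(x) f(y)), and a Chaos Game Representation point is the midpoint between the previous point and the corner assigned to the current nucleotide. The corner assignment below, the function names, and the unsymmetrized (single-strand) frequencies are assumptions for illustration, not details taken from the paper.

```python
from collections import Counter

# Assumed corner assignment on the unit square (one common convention).
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def dinucleotide_relative_abundance(seq):
    """Return rho[xy] = f(xy) / (f(x) * f(y)) for each observed dinucleotide."""
    seq = seq.upper()
    mono = Counter(seq)
    di = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    n_mono, n_di = len(seq), len(seq) - 1
    return {
        xy: (count / n_di) / ((mono[xy[0]] / n_mono) * (mono[xy[1]] / n_mono))
        for xy, count in di.items()
    }


def cgr_points(seq):
    """Return the list of CGR coordinates, starting from the square's centre."""
    x, y = 0.5, 0.5
    points = []
    for base in seq.upper():
        if base not in CORNERS:  # skip ambiguous symbols such as N
            continue
        cx, cy = CORNERS[base]
        x, y = (x + cx) / 2, (y + cy) / 2
        points.append((x, y))
    return points


if __name__ == "__main__":
    print(dinucleotide_relative_abundance("ACGTACGT"))
    print(cgr_points("ACGT"))
```

    Counting how many CGR points fall in each cell of a 2^k by 2^k grid gives the frequencies of all subsequences of length k, which is one way to read the abstract's remark that each signature is tied to an integer order.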

  20. Systematic determination of the mosaic structure of bacterial genomes: species backbone versus strain-specific loops

    Directory of Open Access Journals (Sweden)

    Gendrault-Jacquemard A

    2005-07-01

    Background: Public databases now contain a multitude of complete bacterial genomes, including several genomes of the same species. The available data offer new opportunities to address questions about bacterial genome evolution, a task that requires reliable fine-grained comparison data for closely related genomes. Recent analyses have shown, using pairwise whole genome alignments, that it is possible to segment bacterial genomes into a common conserved backbone and strain-specific sequences called loops. Results: Here, we generalize this approach and propose a strategy that allows systematic and non-biased genome segmentation based on multiple genome alignments. Segmentation analyses, as applied to 13 different bacterial species, confirmed the feasibility of our approach to discern the 'mosaic' organization of bacterial genomes. Segmentation results are available through a Web interface permitting functional analysis, extraction and visualization of the backbone/loops structure of documented genomes. To illustrate the potential of this approach, we performed a precise analysis of the mosaic organization of three E. coli strains and functional characterization of the loops. Conclusion: The segmentation results including the backbone/loops structure of 13 bacterial species genomes are new and available for use by the scientific community at the URL: http://genome.jouy.inra.fr/mosaic.
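
    The backbone/loops idea can be sketched on a toy multiple alignment. The Python below is an illustrative assumption rather than the authors' procedure: it treats an alignment as equal-length gapped strings, labels a column as backbone when every genome has a base there, labels it as loop otherwise, and merges consecutive columns with the same label into segments. The function name and the gap symbol are hypothetical.

```python
from itertools import groupby


def segment_alignment(rows, gap="-"):
    """Return (label, start, end) runs over alignment columns, end exclusive."""
    if len({len(r) for r in rows}) != 1:
        raise ValueError("all aligned rows must have the same length")
    labels = [
        "backbone" if all(r[i] != gap for r in rows) else "loop"
        for i in range(len(rows[0]))
    ]
    segments, pos = [], 0
    for label, run in groupby(labels):
        length = len(list(run))
        segments.append((label, pos, pos + length))
        pos += length
    return segments


if __name__ == "__main__":
    alignment = [
        "ACGT--ACGT",
        "ACGTTTACGT",
        "ACGT--AC-T",
    ]
    for segment in segment_alignment(alignment):
        print(segment)
```

    A real segmentation would also need minimum segment lengths and an identity threshold so that single gapped columns do not fragment the backbone; those parameters are left out of this sketch.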

  1. Genomics of Volvocine Algae

    Science.gov (United States)

    Umen, James G.; Olson, Bradley J.S.C.

    2015-01-01

    Volvocine algae are a group of chlorophytes that together comprise a unique model for evolutionary and developmental biology. The species Chlamydomonas reinhardtii and Volvox carteri represent extremes in morphological diversity within the Volvocine clade. Chlamydomonas is unicellular and reflects the ancestral state of the group, while Volvox is multicellular and has evolved numerous innovations including germ-soma differentiation, sexual dimorphism, and complex morphogenetic patterning. The Chlamydomonas genome sequence has shed light on several areas of eukaryotic cell biology, metabolism and evolution, while the Volvox genome sequence has enabled a comparison with Chlamydomonas that reveals some of the underlying changes that enabled its transition to multicellularity, but also underscores the subtlety of this transition. Many of the tools and resources are in place to further develop Volvocine algae as a model for evolutionary genomics. PMID:25883411

  2. Genomics of Preterm Birth

    Science.gov (United States)

    Swaggart, Kayleigh A.; Pavlicev, Mihaela; Muglia, Louis J.

    2015-01-01

    The molecular mechanisms controlling human birth timing at term, or resulting in preterm birth, have been the focus of considerable investigation, but limited insights have been gained over the past 50 years. In part, these processes have remained elusive because of divergence in reproductive strategies and physiology shown by model organisms, making extrapolation to humans uncertain. Here, we summarize the evolution of progesterone signaling and variation in pregnancy maintenance and termination. We use this comparative physiology to support the hypothesis that selective pressure on genomic loci involved in the timing of parturition has shaped human birth timing, and that these loci can be identified with comparative genomic strategies. Previous limitations imposed by divergence of mechanisms provide an important new opportunity to elucidate fundamental pathways of parturition control through increasing availability of sequenced genomes and associated reproductive physiology characteristics across diverse organisms. PMID:25646385

  3. Genomics of Salmonella Species

    Science.gov (United States)

    Canals, Rocio; McClelland, Michael; Santiviago, Carlos A.; Andrews-Polymenis, Helene

    Progress in the study of Salmonella survival, colonization, and virulence has increased rapidly with the advent of complete genome sequencing and higher capacity assays for transcriptomic and proteomic analysis. Although many of these techniques have yet to be used to directly assay Salmonella growth on foods, these assays are currently in use to determine Salmonella factors necessary for growth in animal models including livestock animals and in in vitro conditions that mimic many different environments. As sequencing of the Salmonella genome and microarray analysis have revolutionized genomics and transcriptomics of salmonellae over the last decade, so are new high-throughput sequencing technologies currently accelerating the pace of our studies and allowing us to approach complex problems that were not previously experimentally tractable.

  4. Quality control and conduct of genome-wide association meta-analyses

    DEFF Research Database (Denmark)

    Winkler, Thomas W; Day, Felix R; Croteau-Chonka, Damien C

    2014-01-01

    Rigorous organization and quality control (QC) are necessary to facilitate successful genome-wide association meta-analyses (GWAMAs) of statistics aggregated across multiple genome-wide association studies. This protocol provides guidelines for (i) organizational aspects of GWAMAs, and for (ii) QC...

  5. Using Microbial Genome Annotation as a Foundation for Collaborative Student Research

    Science.gov (United States)

    Reed, Kelynne E.; Richardson, John M.

    2013-01-01

    We used the Integrated Microbial Genomes Annotation Collaboration Toolkit as a framework to incorporate microbial genomics research into a microbiology and biochemistry course in a way that promoted student learning of bioinformatics and research skills and emphasized teamwork and collaboration as evidenced through multiple assessment mechanisms.…

  6. Genome-Wide Association Study and Linkage Analysis of the Healthy Aging Index

    DEFF Research Database (Denmark)

    Minster, Ryan L; Sanders, Jason L; Singh, Jatinder

    2015-01-01

    BACKGROUND: The Healthy Aging Index (HAI) is a tool for measuring the extent of health and disease across multiple systems. METHODS: We conducted a genome-wide association study and a genome-wide linkage analysis to map quantitative trait loci associated with the HAI and a modified HAI weighted...

  7. Direct Mutagenesis of Thousands of Genomic Targets using Microarray-derived Oligonucleotides

    DEFF Research Database (Denmark)

    Bonde, Mads; Kosuri, Sriram; Genee, Hans Jasper

    2015-01-01

    Multiplex Automated Genome Engineering (MAGE) allows simultaneous mutagenesis of multiple target sites in bacterial genomes using short oligonucleotides. However, large-scale mutagenesis requires hundreds to thousands of unique oligos, which are costly to synthesize and impossible to scale-up by ...

  8. Brief Guide to Genomics: DNA, Genes and Genomes

    Science.gov (United States)

    ... clinic. Most new drugs based on genome-based research are estimated to be at least 10 to 15 years away, though recent genome-driven efforts in lipid-lowering therapy have considerably shortened that interval. According ...

  9. Genomic Prediction in Barley

    DEFF Research Database (Denmark)

    Edriss, Vahid; Cericola, Fabio; Jensen, Jens D

    2015-01-01

    to next generation. The main goal of this study was to assess the potential of using genomic prediction in a commercial barley breeding program. The data used in this study were from the Nordic Seed company, which is located in Denmark. Around 350 advanced lines were genotyped with the 9K barley chip from Illumina....... Traits used in this study were grain yield, plant height and heading date. Heading date is the number of days after 1 June that it takes for the plant to head. Heritabilities were 0.33, 0.44 and 0.48 for yield, height and heading, respectively, for the average of nine plots. The GBLUP model was used for genomic...

  10. Mycobacterial species as case-study of comparative genome analysis.

    Science.gov (United States)

    Zakham, F; Belayachi, L; Ussery, D; Akrim, M; Benjouad, A; El Aouad, R; Ennaji, M M

    2011-02-08

    The genus Mycobacterium comprises more than 120 species, including important human pathogens that cause major public health problems and illnesses. Further, with more than 100 genome sequences available from this genus, comparative genome analysis can provide new insights for better understanding the evolutionary events of these species and improving drugs, vaccines, and diagnostic tools for controlling mycobacterial diseases. In the present study we aim to outline a comparative genome analysis of fourteen mycobacterial genomes: M. avium subsp. paratuberculosis K-10, M. bovis AF2122/97, M. bovis BCG str. Pasteur 1173P2, M. leprae Br4923, M. marinum M, M. sp. KMS, M. sp. MCS, M. tuberculosis CDC1551, M. tuberculosis F11, M. tuberculosis H37Ra, M. tuberculosis H37Rv, M. tuberculosis KZN 1435, M. ulcerans Agy99, and M. vanbaalenii PYR-1. For this purpose, the genomes were compared on the basis of genome length, GC content, and number of genes in different databases (GenBank, RefSeq, and Prodigal). A BLAST matrix of these genomes was computed to summarize the similarity between species in a simple scheme. As a result of the multiple genome analysis, the pan and core genomes were defined for twelve mycobacterial species. We also introduce the genome atlas of the reference strain M. tuberculosis H37Rv, which gives a good overview of this genome. To examine the phylogenetic relationships among these bacteria, a phylogenetic tree was constructed from the 16S rRNA gene for tuberculous and non-tuberculous mycobacteria to understand the evolutionary events of these species.
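
    Two of the quantities compared in this abstract, GC content and the pan and core genome, have simple set-and-count definitions that a short Python sketch can show. The function names and the toy gene-family labels below are hypothetical, and the sketch assumes that genes have already been clustered into families (for example from BLAST results), which is the hard part in practice.

```python
def gc_content(seq):
    """Fraction of G and C among unambiguous bases (A, C, G, T) in a sequence."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    return (seq.count("G") + seq.count("C")) / acgt if acgt else 0.0


def pan_and_core(gene_families):
    """Given {genome: set of gene-family ids}, return (pan genome, core genome).

    The pan genome is the union of all families; the core genome is the
    intersection, i.e. families present in every genome.
    """
    sets = [set(families) for families in gene_families.values()]
    pan = set().union(*sets)
    core = set.intersection(*sets) if sets else set()
    return pan, core


if __name__ == "__main__":
    toy = {  # hypothetical family ids, not real annotations
        "genome_1": {"famA", "famB", "famC"},
        "genome_2": {"famB", "famC", "famD"},
        "genome_3": {"famB", "famC"},
    }
    pan, core = pan_and_core(toy)
    print(sorted(pan), sorted(core))
    print(gc_content("GGCCAT"))
```

    Adding genomes one at a time and recording the sizes of the two sets gives the usual pan and core genome curves: the pan genome can only grow and the core genome can only shrink.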

  11. BEACON: automated tool for Bacterial GEnome Annotation ComparisON.

    Science.gov (United States)

    Kalkatawi, Manal; Alam, Intikhab; Bajic, Vladimir B

    2015-08-18

    Genome annotation is one way of summarizing the existing knowledge about genomic characteristics of an organism. There has been an increased interest during the last several decades in computer-based structural and functional genome annotation. Many methods for this purpose have been developed for eukaryotes and prokaryotes. Our study focuses on comparison of functional annotations of prokaryotic genomes. To the best of our knowledge there is no fully automated system for detailed comparison of functional genome annotations generated by different annotation methods (AMs). The presence of many AMs and development of new ones introduce needs to: a/ compare different annotations for a single genome, and b/ generate annotation by combining individual ones. To address these issues we developed an Automated Tool for Bacterial GEnome Annotation ComparisON (BEACON) that benefits both AM developers and annotation analysers. BEACON provides detailed comparison of gene function annotations of prokaryotic genomes obtained by different AMs and generates extended annotations through combination of individual ones. For the illustration of BEACON's utility, we provide a comparison analysis of multiple different annotations generated for four genomes and show on these examples that the extended annotation can increase the number of genes annotated by putative functions up to 27%, while the number of genes without any function assignment is reduced. We developed BEACON, a fast tool for an automated and a systematic comparison of different annotations of single genomes. The extended annotation assigns putative functions to many genes with unknown functions. BEACON is available under GNU General Public License version 3.0 and is accessible at: http://www.cbrc.kaust.edu.sa/BEACON/ .
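
    The notion of an extended annotation built by combining several annotation methods can be sketched as follows. This is not BEACON's algorithm or interface: the data layout (one dictionary per method, mapping gene id to a function string), the list of uninformative labels, and the function names are all assumptions made for illustration.

```python
# Labels treated as "no function assigned" (an assumed, incomplete list).
UNINFORMATIVE = {"", "hypothetical protein", "unknown function"}


def is_informative(function):
    """True if the function string says something beyond 'unknown'."""
    return function is not None and function.strip().lower() not in UNINFORMATIVE


def extend_annotation(annotations):
    """Combine {method: {gene_id: function}} into {gene_id: [(method, function)]}.

    Only informative assignments are kept, so a gene with an empty list has no
    putative function under any of the methods.
    """
    genes = set().union(*(ann.keys() for ann in annotations.values()))
    return {
        gene: [
            (method, ann[gene])
            for method, ann in annotations.items()
            if is_informative(ann.get(gene))
        ]
        for gene in sorted(genes)
    }


if __name__ == "__main__":
    toy = {  # hypothetical annotation methods and gene ids
        "AM1": {"g1": "DNA polymerase", "g2": "hypothetical protein"},
        "AM2": {"g1": "DNA polymerase III", "g2": "ABC transporter",
                "g3": "hypothetical protein"},
    }
    extended = extend_annotation(toy)
    annotated = sum(1 for hits in extended.values() if hits)
    print(extended)
    print(f"{annotated} of {len(extended)} genes have a putative function")
```

    In the toy data, AM1 alone assigns a function to one gene, while the combined annotation covers two, which mirrors in miniature the kind of gain the abstract reports.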

  12. BEACON: automated tool for Bacterial GEnome Annotation ComparisON

    KAUST Repository

    Kalkatawi, Manal M.

    2015-08-18

    Background: Genome annotation is one way of summarizing the existing knowledge about genomic characteristics of an organism. There has been an increased interest during the last several decades in computer-based structural and functional genome annotation. Many methods for this purpose have been developed for eukaryotes and prokaryotes. Our study focuses on comparison of functional annotations of prokaryotic genomes. To the best of our knowledge there is no fully automated system for detailed comparison of functional genome annotations generated by different annotation methods (AMs). Results: The presence of many AMs and development of new ones introduce needs to: a/ compare different annotations for a single genome, and b/ generate annotation by combining individual ones. To address these issues we developed an Automated Tool for Bacterial GEnome Annotation ComparisON (BEACON) that benefits both AM developers and annotation analysers. BEACON provides detailed comparison of gene function annotations of prokaryotic genomes obtained by different AMs and generates extended annotations through combination of individual ones. For the illustration of BEACON's utility, we provide a comparison analysis of multiple different annotations generated for four genomes and show on these examples that the extended annotation can increase the number of genes annotated by putative functions up to 27%, while the number of genes without any function assignment is reduced. Conclusions: We developed BEACON, a fast tool for an automated and a systematic comparison of different annotations of single genomes. The extended annotation assigns putative functions to many genes with unknown functions. BEACON is available under GNU General Public License version 3.0 and is accessible at: http://www.cbrc.kaust.edu.sa/BEACON/

  13. Systematic evaluation of bias in microbial community profiles induced by whole genome amplification.

    NARCIS (Netherlands)

    Direito, S.; Zaura, E.; Little, M.; Ehrenfreund, P.; Roling, W.F.M.

    2014-01-01

    Whole genome amplification methods facilitate the detection and characterization of microbial communities in low biomass environments. We examined the extent to which the actual community structure is reliably revealed and factors contributing to bias. One widely used [multiple displacement

  14. The Micronutrient Genomics Project: A community-driven knowledge base for micronutrient research

    NARCIS (Netherlands)

    Ommen, B. van; El-Sohemy, A.; Hesketh, J.; Kaput, J.; Fenech, M.; Evelo, C.T.; McArdle, H.J.; Bouwman, J.; Lietz, G.; Mathers, J.C.; Fairweather-Tait, S.; Kranen, H. van; Elliott, R.; Wopereis, S.; Ferguson, L.R.; Méplan, C.; Perozzi, G.; Allen, L.; Rivero, D.

    2010-01-01

    Micronutrients influence multiple metabolic pathways including oxidative and inflammatory processes. Optimum micronutrient supply is important for the maintenance of homeostasis in metabolism and, ultimately, for maintaining good health. With advances in systems biology and genomics technologies, it

  15. Systematic evaluation of bias in microbial community profiles induced by whole genome amplification

    NARCIS (Netherlands)

    Direito, S.O.L.; Zaura, E.; Little, M.; Ehrenfreund, P.; Röling, W.F.M.

    2014-01-01

    Whole genome amplification methods facilitate the detection and characterization of microbial communities in low biomass environments. We examined the extent to which the actual community structure is reliably revealed and factors contributing to bias. One widely used [multiple displacement

  16. Sparse redundancy analysis of high-dimensional genetic and genomic data

    NARCIS (Netherlands)

    Csala, Attila; Voorbraak, Frans P. J. M.; Zwinderman, Aeilko H.; Hof, Michel H.

    2017-01-01

    Motivation: Recent technological developments have enabled the possibility of genetic and genomic integrated data analysis approaches, where multiple omics datasets from various biological levels are combined and used to describe (disease) phenotypic variations. The main goal is to explain and

  17. Multiple Input - Multiple Output (MIMO) SAR

    Data.gov (United States)

    National Aeronautics and Space Administration — This effort will research and implement advanced Multiple-Input Multiple-Output (MIMO) Synthetic Aperture Radar (SAR) techniques which have the potential to improve...

  18. Partitioning of genomic variance using biological pathways

    DEFF Research Database (Denmark)

    Edwards, Stefan McKinnon; Janss, Luc; Madsen, Per

    and that these variants are enriched for genes that are connected in biological pathways or for likely functional effects on genes. These biological findings provide valuable insight for developing better genomic models. These are statistical models for predicting complex trait phenotypes on the basis of SNP......-data and trait phenotypes and can account for a much larger fraction of the heritable component. A disadvantage is that this “black-box” modelling approach conceals the biological mechanisms underlying the trait. We propose to open the “black-box” by building SNP-set genomic models that evaluate the collective...... action of multiple SNPs in genes, biological pathways or other external findings on the trait phenotype. As proof of concept we have tested the modelling framework on several traits in dairy cattle....

  19. Passage relevance models for genomics search

    Directory of Open Access Journals (Sweden)

    Frieder Ophir

    2009-03-01

    Full Text Available Abstract We present a passage relevance model for integrating syntactic and semantic evidence of biomedical concepts and topics using a probabilistic graphical model. Component models of topics, concepts, terms, and document are represented as potential functions within a Markov Random Field. The probability of a passage being relevant to a biologist's information need is represented as the joint distribution across all potential functions. Relevance model feedback of top ranked passages is used to improve distributional estimates of query concepts and topics in context, and a dimensional indexing strategy is used for efficient aggregation of concept and term statistics. By integrating multiple sources of evidence including dependencies between topics, concepts, and terms, we seek to improve genomics literature passage retrieval precision. Using this model, we are able to demonstrate statistically significant improvements in retrieval precision using a large genomics literature corpus.
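
    The scoring idea in this abstract can be illustrated with a minimal sketch. It assumes each potential function (topic, concept, term) has already been evaluated to a positive number for a given passage. The names `passage_score` and `rank_passages`, the weights, and the example values are hypothetical and are not taken from the paper. The normalising constant of the Markov Random Field is omitted because it is the same for every passage and does not change the ranking.

```python
import math


def passage_score(potentials, weights):
    """Unnormalised log-score of a passage.

    The joint distribution of an MRF is proportional to the product of its
    potential functions, so the log-score is a weighted sum of log-potentials.
    All potential values are assumed to be strictly positive.
    """
    return sum(weights[name] * math.log(value)
               for name, value in potentials.items())


def rank_passages(passages, weights):
    """Return passages sorted from most to least relevant by log-score."""
    return sorted(passages,
                  key=lambda p: passage_score(p["potentials"], weights),
                  reverse=True)


# Hypothetical example values, for illustration only.
weights = {"topic": 1.0, "concept": 1.0, "term": 1.0}
passages = [
    {"id": "p2", "potentials": {"topic": 0.2, "concept": 0.9, "term": 0.5}},
    {"id": "p1", "potentials": {"topic": 0.8, "concept": 0.6, "term": 0.5}},
]
ranked = rank_passages(passages, weights)
```

    Working in log space avoids numerical underflow when many potentials are multiplied together.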

  20. Challenges, Solutions, and Quality Metrics of Personal Genome Assembly in Advancing Precision Medicine

    Directory of Open Access Journals (Sweden)

    Wenming Xiao

    2016-04-01

    Full Text Available Even though each of us shares more than 99% of the DNA sequences in our genome, there are millions of sequence codes or structures in small regions that differ between individuals, giving us different characteristics of appearance or responsiveness to medical treatments. Currently, genetic variants in diseased tissues, such as tumors, are uncovered by exploring the differences between the reference genome and the sequences detected in the diseased tissue. However, the public reference genome was derived from the DNA of multiple individuals. As a result, the reference genome is incomplete and may misrepresent the sequence variants of the general population. A more reliable solution is to compare sequences of diseased tissue with the individual's own genome sequence derived from tissue in a normal state. As the price of sequencing the human genome has dropped dramatically to around $1000, documenting the personal genome of every individual looks promising. However, de novo assembly of individual genomes at an affordable cost is still challenging. Thus, to date, only a few human genomes have been fully assembled. In this review, we introduce the history of human genome sequencing and the evolution of sequencing platforms, from Sanger sequencing to emerging “third generation sequencing” technologies. We present the currently available de novo assembly and post-assembly software packages for human genome assembly and their requirements for computational infrastructures. We recommend that a combined hybrid assembly with long and short reads would be a promising way to generate good quality human genome assemblies and specify parameters for the quality assessment of assembly outcomes. We provide a perspective view of the benefit of using personal genomes as references and suggestions for obtaining a quality personal genome. Finally, we discuss the usage of the personal genome in aiding vaccine design and development, monitoring host

  1. phiGENOME: an integrative navigation throughout bacteriophage genomes.

    Science.gov (United States)

    Stano, Matej; Klucar, Lubos

    2011-11-01

    phiGENOME is a web-based genome browser generating dynamic and interactive graphical representation of phage genomes stored in the phiSITE, database of gene regulation in bacteriophages. phiGENOME is an integral part of the phiSITE web portal (http://www.phisite.org/phigenome) and it was optimised for visualisation of phage genomes with the emphasis on the gene regulatory elements. phiGENOME consists of three components: (i) genome map viewer built using Adobe Flash technology, providing dynamic and interactive graphical display of phage genomes; (ii) sequence browser based on precisely formatted HTML tags, providing detailed exploration of genome features on the sequence level and (iii) regulation illustrator, based on Scalable Vector Graphics (SVG) and designed for graphical representation of gene regulations. Bringing 542 complete genome sequences accompanied with their rich annotations and references, makes phiGENOME a unique information resource in the field of phage genomics. Copyright © 2011 Elsevier Inc. All rights reserved.

  2. Complete genome sequence of Truepera radiovictrix type strain (RQ-24).

    Science.gov (United States)

    Ivanova, Natalia; Rohde, Christine; Munk, Christine; Nolan, Matt; Lucas, Susan; Del Rio, Tijana Glavina; Tice, Hope; Deshpande, Shweta; Cheng, Jan-Fang; Tapia, Roxane; Han, Cliff; Goodwin, Lynne; Pitluck, Sam; Liolios, Konstantinos; Mavromatis, Konstantinos; Mikhailova, Natalia; Pati, Amrita; Chen, Amy; Palaniappan, Krishna; Land, Miriam; Hauser, Loren; Chang, Yun-Juan; Jeffries, Cynthia D; Brambilla, Evelyne; Rohde, Manfred; Göker, Markus; Tindall, Brian J; Woyke, Tanja; Bristow, James; Eisen, Jonathan A; Markowitz, Victor; Hugenholtz, Philip; Kyrpides, Nikos C; Klenk, Hans-Peter; Lapidus, Alla

    2011-02-22

    Truepera radiovictrix Albuquerque et al. 2005 is the type species of the genus Truepera within the phylum "Deinococcus/Thermus". T. radiovictrix is of special interest not only because of its isolated phylogenetic location in the order Deinococcales, but also because of its ability to grow under multiple extreme conditions in alkaline, moderately saline, and high temperature habitats. Of particular interest is the fact that T. radiovictrix is also remarkably resistant to ionizing radiation, a feature it shares with members of the genus Deinococcus. This is the first completed genome sequence of a member of the family Trueperaceae and the fourth type strain genome sequence from a member of the order Deinococcales. The 3,260,398 bp long genome with its 2,994 protein-coding and 52 RNA genes consists of one circular chromosome and is a part of the Genomic Encyclopedia of Bacteria and Archaea project.

  3. The genomic landscape at a late stage of stickleback speciation: High genomic divergence interspersed by small localized regions of introgression.

    Directory of Open Access Journals (Sweden)

    Mark Ravinet

    2018-05-01

    Full Text Available Speciation is a continuous process and analysis of species pairs at different stages of divergence provides insight into how it unfolds. Previous genomic studies on young species pairs have revealed peaks of divergence and heterogeneous genomic differentiation. Yet less known is how localised peaks of differentiation progress to genome-wide divergence during the later stages of speciation in the presence of persistent gene flow. Spanning the speciation continuum, stickleback species pairs are ideal for investigating how genomic divergence builds up during speciation. However, attention has largely focused on young postglacial species pairs, with little knowledge of the genomic signatures of divergence and introgression in older stickleback systems. The Japanese stickleback species pair, composed of the Pacific Ocean three-spined stickleback (Gasterosteus aculeatus) and the Japan Sea stickleback (G. nipponicus), which co-occur in the Japanese islands, is at a late stage of speciation. Divergence likely started well before the end of the last glacial period and crosses between Japan Sea females and Pacific Ocean males result in hybrid male sterility. Here we use coalescent analyses and Approximate Bayesian Computation to show that the two species split approximately 0.68-1 million years ago but that they have continued to exchange genes at a low rate throughout divergence. Population genomic data revealed that, despite gene flow, a high level of genomic differentiation is maintained across the majority of the genome. However, we identified multiple, small regions of introgression, occurring mainly in areas of low recombination rate. Our results demonstrate that a high level of genome-wide divergence can establish in the face of persistent introgression and that gene flow can be localized to small genomic regions at the later stages of speciation with gene flow.

  4. The genomic landscape at a late stage of stickleback speciation: High genomic divergence interspersed by small localized regions of introgression.

    Science.gov (United States)

    Ravinet, Mark; Yoshida, Kohta; Shigenobu, Shuji; Toyoda, Atsushi; Fujiyama, Asao; Kitano, Jun

    2018-05-01

    Speciation is a continuous process and analysis of species pairs at different stages of divergence provides insight into how it unfolds. Previous genomic studies on young species pairs have revealed peaks of divergence and heterogeneous genomic differentiation. Yet less known is how localised peaks of differentiation progress to genome-wide divergence during the later stages of speciation in the presence of persistent gene flow. Spanning the speciation continuum, stickleback species pairs are ideal for investigating how genomic divergence builds up during speciation. However, attention has largely focused on young postglacial species pairs, with little knowledge of the genomic signatures of divergence and introgression in older stickleback systems. The Japanese stickleback species pair, composed of the Pacific Ocean three-spined stickleback (Gasterosteus aculeatus) and the Japan Sea stickleback (G. nipponicus), which co-occur in the Japanese islands, is at a late stage of speciation. Divergence likely started well before the end of the last glacial period and crosses between Japan Sea females and Pacific Ocean males result in hybrid male sterility. Here we use coalescent analyses and Approximate Bayesian Computation to show that the two species split approximately 0.68-1 million years ago but that they have continued to exchange genes at a low rate throughout divergence. Population genomic data revealed that, despite gene flow, a high level of genomic differentiation is maintained across the majority of the genome. However, we identified multiple, small regions of introgression, occurring mainly in areas of low recombination rate. Our results demonstrate that a high level of genome-wide divergence can establish in the face of persistent introgression and that gene flow can be localized to small genomic regions at the later stages of speciation with gene flow.

  5. BLAST Ring Image Generator (BRIG): simple prokaryote genome comparisons

    Directory of Open Access Journals (Sweden)

    Beatson Scott A

    2011-08-01

    Full Text Available Abstract Background: Visualisation of genome comparisons is invaluable for helping to determine genotypic differences between closely related prokaryotes. New visualisation and abstraction methods are required in order to improve the validation, interpretation and communication of genome sequence information; especially with the increasing amount of data arising from next-generation sequencing projects. Visualising a prokaryote genome as a circular image has become a powerful means of displaying informative comparisons of one genome to a number of others. Several programs, imaging libraries and internet resources already exist for this purpose, however, most are either limited in the number of comparisons they can show, are unable to adequately utilise draft genome sequence data, or require a knowledge of command-line scripting for implementation. Currently, there is no freely available desktop application that enables users to rapidly visualise comparisons between hundreds of draft or complete genomes in a single image. Results: BLAST Ring Image Generator (BRIG) can generate images that show multiple prokaryote genome comparisons, without an arbitrary limit on the number of genomes compared. The output image shows similarity between a central reference sequence and other sequences as a set of concentric rings, where BLAST matches are coloured on a sliding scale indicating a defined percentage identity. Images can also include draft genome assembly information to show read coverage, assembly breakpoints and collapsed repeats. In addition, BRIG supports the mapping of unassembled sequencing reads against one or more central reference sequences. Many types of custom data and annotations can be shown using BRIG, making it a versatile approach for visualising a range of genomic comparison data. BRIG is readily accessible to any user, as it assumes no specialist computational knowledge and will perform all required file parsing and BLAST comparisons

  6. BLAST Ring Image Generator (BRIG): simple prokaryote genome comparisons.

    Science.gov (United States)

    Alikhan, Nabil-Fareed; Petty, Nicola K; Ben Zakour, Nouri L; Beatson, Scott A

    2011-08-08

    Visualisation of genome comparisons is invaluable for helping to determine genotypic differences between closely related prokaryotes. New visualisation and abstraction methods are required in order to improve the validation, interpretation and communication of genome sequence information; especially with the increasing amount of data arising from next-generation sequencing projects. Visualising a prokaryote genome as a circular image has become a powerful means of displaying informative comparisons of one genome to a number of others. Several programs, imaging libraries and internet resources already exist for this purpose, however, most are either limited in the number of comparisons they can show, are unable to adequately utilise draft genome sequence data, or require a knowledge of command-line scripting for implementation. Currently, there is no freely available desktop application that enables users to rapidly visualise comparisons between hundreds of draft or complete genomes in a single image. BLAST Ring Image Generator (BRIG) can generate images that show multiple prokaryote genome comparisons, without an arbitrary limit on the number of genomes compared. The output image shows similarity between a central reference sequence and other sequences as a set of concentric rings, where BLAST matches are coloured on a sliding scale indicating a defined percentage identity. Images can also include draft genome assembly information to show read coverage, assembly breakpoints and collapsed repeats. In addition, BRIG supports the mapping of unassembled sequencing reads against one or more central reference sequences. Many types of custom data and annotations can be shown using BRIG, making it a versatile approach for visualising a range of genomic comparison data. BRIG is readily accessible to any user, as it assumes no specialist computational knowledge and will perform all required file parsing and BLAST comparisons automatically. There is a clear need for a user

  7. Nannochloropsis genomes reveal evolution of microalgal oleaginous traits.

    Directory of Open Access Journals (Sweden)

    Dongmei Wang

    2014-01-01

    Full Text Available Oleaginous microalgae are promising feedstock for biofuels, yet the genetic diversity, origin and evolution of oleaginous traits remain largely unknown. Here we present a detailed phylogenomic analysis of five oleaginous Nannochloropsis species (a total of six strains) and one time-series transcriptome dataset for triacylglycerol (TAG) synthesis on one representative strain. Despite small genome sizes, high coding potential and relative paucity of mobile elements, the genomes feature small cores of ca. 2,700 protein-coding genes and a large pan-genome of >38,000 genes. The six genomes share key oleaginous traits, such as the enrichment of selected lipid biosynthesis genes and certain glycoside hydrolase genes that potentially shift carbon flux from chrysolaminaran to TAG synthesis. The eleven type II diacylglycerol acyltransferase genes (DGAT-2) in every strain, each expressed during TAG synthesis, likely originated from three ancient genomes, including the secondary endosymbiosis host and the engulfed green and red algae. Horizontal gene transfers were inferred in most lipid synthesis nodes with expanded gene doses and many glycoside hydrolase genes. Thus multiple genome pooling and horizontal genetic exchange, together with selective inheritance of lipid synthesis genes and species-specific gene loss, have led to the enormous genetic apparatus for oleaginousness and the wide genomic divergence among present-day Nannochloropsis. These findings have important implications in the screening and genetic engineering of microalgae for biofuels.

  8. Implementation of genomics research in Africa: challenges and recommendations.

    Science.gov (United States)

    Adebamowo, Sally N; Francis, Veronica; Tambo, Ernest; Diallo, Seybou H; Landouré, Guida; Nembaware, Victoria; Dareng, Eileen; Muhamed, Babu; Odutola, Michael; Akeredolu, Teniola; Nerima, Barbara; Ozumba, Petronilla J; Mbhele, Slee; Ghanash, Anita; Wachinou, Ablo P; Ngomi, Nicholas

    2018-01-01

    There is exponential growth in the interest and implementation of genomics research in Africa. This growth has been facilitated by the Human Hereditary and Health in Africa (H3Africa) initiative, which aims to promote a contemporary research approach to the study of genomics and environmental determinants of common diseases in African populations. The purpose of this article is to describe important challenges affecting genomics research implementation in Africa. The observations, challenges and recommendations presented in this article were obtained through discussions by African scientists at teleconferences and face-to-face meetings, seminars at consortium conferences and in-depth individual discussions. Challenges affecting genomics research implementation in Africa, which are related to limited resources include ill-equipped facilities, poor accessibility to research centers, lack of expertise and an enabling environment for research activities in local hospitals. Challenges related to the research study include delayed funding, extensive procedures and interventions requiring multiple visits, delays setting up research teams and insufficient staff training, language barriers and an underappreciation of cultural norms. While many African countries are struggling to initiate genomics projects, others have set up genomics research facilities that meet international standards. The lessons learned in implementing successful genomics projects in Africa are recommended as strategies to overcome these challenges. These recommendations may guide the development and application of new research programs in low-resource settings.

  9. Implementation of genomics research in Africa: challenges and recommendations

    Science.gov (United States)

    Adebamowo, Sally N.; Francis, Veronica; Tambo, Ernest; Diallo, Seybou H.; Landouré, Guida; Nembaware, Victoria; Dareng, Eileen; Muhamed, Babu; Odutola, Michael; Akeredolu, Teniola; Nerima, Barbara; Ozumba, Petronilla J.; Mbhele, Slee; Ghanash, Anita; Wachinou, Ablo P.; Ngomi, Nicholas

    2018-01-01

    ABSTRACT Background: There is exponential growth in the interest and implementation of genomics research in Africa. This growth has been facilitated by the Human Hereditary and Health in Africa (H3Africa) initiative, which aims to promote a contemporary research approach to the study of genomics and environmental determinants of common diseases in African populations. Objective: The purpose of this article is to describe important challenges affecting genomics research implementation in Africa. Methods: The observa