WorldWideScience

Sample records for genome level analyses

  1. Genome-wide DNA polymorphism analyses using VariScan

    Directory of Open Access Journals (Sweden)

    Vilella Albert J

    2006-09-01

    Full Text Available Abstract Background DNA sequence polymorphisms analysis can provide valuable information on the evolutionary forces shaping nucleotide variation, and provides an insight into the functional significance of genomic regions. The recent ongoing genome projects will radically improve our capabilities to detect specific genomic regions shaped by natural selection. Current available methods and software, however, are unsatisfactory for such genome-wide analysis. Results We have developed methods for the analysis of DNA sequence polymorphisms at the genome-wide scale. These methods, which have been tested on a coalescent-simulated and actual data files from mouse and human, have been implemented in the VariScan software package version 2.0. Additionally, we have also incorporated a graphical-user interface. The main features of this software are: i exhaustive population-genetic analyses including those based on the coalescent theory; ii analysis adapted to the shallow data generated by the high-throughput genome projects; iii use of genome annotations to conduct a comprehensive analyses separately for different functional regions; iv identification of relevant genomic regions by the sliding-window and wavelet-multiresolution approaches; v visualization of the results integrated with current genome annotations in commonly available genome browsers. Conclusion VariScan is a powerful and flexible suite of software for the analysis of DNA polymorphisms. The current version implements new algorithms, methods, and capabilities, providing an important tool for an exhaustive exploratory analysis of genome-wide DNA polymorphism data.

  2. PopGenome: An Efficient Swiss Army Knife for Population Genomic Analyses in R

    OpenAIRE

    Pfeifer, Bastian; Wittelsbürger, Ulrich; Ramos-Onsins, Sebastian E.; Lercher, Martin J.

    2014-01-01

    Although many computer programs can perform population genetics calculations, they are typically limited in the analyses and data input formats they offer; few applications can process the large data sets produced by whole-genome resequencing projects. Furthermore, there is no coherent framework for the easy integration of new statistics into existing pipelines, hindering the development and application of new population genetics and genomics approaches. Here, we present PopGenome, a populati...

  3. Hal: an automated pipeline for phylogenetic analyses of genomic data.

    Science.gov (United States)

    Robbertse, Barbara; Yoder, Ryan J; Boyd, Alex; Reeves, John; Spatafora, Joseph W

    2011-02-07

    The rapid increase in genomic and genome-scale data is resulting in unprecedented levels of discrete sequence data available for phylogenetic analyses. Major analytical impasses exist, however, prior to analyzing these data with existing phylogenetic software. Obstacles include the management of large data sets without standardized naming conventions, identification and filtering of orthologous clusters of proteins or genes, and the assembly of alignments of orthologous sequence data into individual and concatenated super alignments. Here we report the production of an automated pipeline, Hal that produces multiple alignments and trees from genomic data. These alignments can be produced by a choice of four alignment programs and analyzed by a variety of phylogenetic programs. In short, the Hal pipeline connects the programs BLASTP, MCL, user specified alignment programs, GBlocks, ProtTest and user specified phylogenetic programs to produce species trees. The script is available at sourceforge (http://sourceforge.net/projects/bio-hal/). The results from an example analysis of Kingdom Fungi are briefly discussed.

  4. Quantitative metagenomic analyses based on average genome size normalization

    DEFF Research Database (Denmark)

    Frank, Jeremy Alexander; Sørensen, Søren Johannes

    2011-01-01

    provide not just a census of the community members but direct information on metabolic capabilities and potential interactions among community members. Here we introduce a method for the quantitative characterization and comparison of microbial communities based on the normalization of metagenomic data...... marine sources using both conventional small-subunit (SSU) rRNA gene analyses and our quantitative method to calculate the proportion of genomes in each sample that are capable of a particular metabolic trait. With both environments, to determine what proportion of each community they make up and how......). These analyses demonstrate how genome proportionality compares to SSU rRNA gene relative abundance and how factors such as average genome size and SSU rRNA gene copy number affect sampling probability and therefore both types of community analysis....

  5. Genome sequencing elucidates Sardinian genetic architecture and augments association analyses for lipid and blood inflammatory markers

    Science.gov (United States)

    Zoledziewska, Magdalena; Mulas, Antonella; Pistis, Giorgio; Steri, Maristella; Danjou, Fabrice; Kwong, Alan; Ortega del Vecchyo, Vicente Diego; Chiang, Charleston W. K.; Bragg-Gresham, Jennifer; Pitzalis, Maristella; Nagaraja, Ramaiah; Tarrier, Brendan; Brennan, Christine; Uzzau, Sergio; Fuchsberger, Christian; Atzeni, Rossano; Reinier, Frederic; Berutti, Riccardo; Huang, Jie; Timpson, Nicholas J; Toniolo, Daniela; Gasparini, Paolo; Malerba, Giovanni; Dedoussis, George; Zeggini, Eleftheria; Soranzo, Nicole; Jones, Chris; Lyons, Robert; Angius, Andrea; Kang, Hyun M.; Novembre, John; Sanna, Serena; Schlessinger, David; Cucca, Francesco; Abecasis, Gonçalo R

    2015-01-01

    We report ~17.6M genetic variants from whole-genome sequencing of 2,120 Sardinians; 22% are absent from prior sequencing-based compilations and enriched for predicted functional consequence. Furthermore, ~76K variants common in our sample (frequency >5%) are rare elsewhere (Genomes Project). We assessed the impact of these variants on circulating lipid levels and five inflammatory biomarkers. Fourteen signals, including two major new loci, were observed for lipid levels, and 19, including two novel loci, for inflammatory markers. New associations would be missed in analyses based on 1000 Genomes data, underlining the advantages of large-scale sequencing in this founder population. PMID:26366554

  6. Statistical analyses of conserved features of genomic islands in bacteria.

    Science.gov (United States)

    Guo, F-B; Xia, Z-K; Wei, W; Zhao, H-L

    2014-03-17

    We performed statistical analyses of five conserved features of genomic islands of bacteria. Analyses were made based on 104 known genomic islands, which were identified by comparative methods. Four of these features include sequence size, abnormal G+C content, flanking tRNA gene, and embedded mobility gene, which are frequently investigated. One relatively new feature, G+C homogeneity, was also investigated. Among the 104 known genomic islands, 88.5% were found to fall in the typical length of 10-200 kb and 80.8% had G+C deviations with absolute values larger than 2%. For the 88 genomic islands whose hosts have been sequenced and annotated, 52.3% of them were found to have flanking tRNA genes and 64.7% had embedded mobility genes. For the homogeneity feature, 85% had an h homogeneity index less than 0.1, indicating that their G+C content is relatively uniform. Taking all the five features into account, 87.5% of 88 genomic islands had three of them. Only one genomic island had only one conserved feature and none of the genomic islands had zero features. These statistical results should help to understand the general structure of known genomic islands. We found that larger genomic islands tend to have relatively small G+C deviations relative to absolute values. For example, the absolute G+C deviations of 9 genomic islands longer than 100,000 bp were all less than 5%. This is a novel but reasonable result given that larger genomic islands should have greater restrictions in their G+C contents, in order to maintain the stable G+C content of the recipient genome.

  7. Quality control and conduct of genome-wide association meta-analyses

    DEFF Research Database (Denmark)

    Winkler, Thomas W; Day, Felix R; Croteau-Chonka, Damien C

    2014-01-01

    Rigorous organization and quality control (QC) are necessary to facilitate successful genome-wide association meta-analyses (GWAMAs) of statistics aggregated across multiple genome-wide association studies. This protocol provides guidelines for (i) organizational aspects of GWAMAs, and for (ii) QC...

  8. Genomic analyses of modern dog breeds.

    Science.gov (United States)

    Parker, Heidi G

    2012-02-01

    A rose may be a rose by any other name, but when you call a dog a poodle it becomes a very different animal than if you call it a bulldog. Both the poodle and the bulldog are examples of dog breeds of which there are >400 recognized worldwide. Breed creation has played a significant role in shaping the modern dog from the length of his leg to the cadence of his bark. The selection and line-breeding required to maintain a breed has also reshaped the genome of the dog, resulting in a unique genetic pattern for each breed. The breed-based population structure combined with extensive morphologic variation and shared human environments have made the dog a popular model for mapping both simple and complex traits and diseases. In order to obtain the most benefit from the dog as a genetic system, it is necessary to understand the effect structured breeding has had on the genome of the species. That is best achieved by looking at genomic analyses of the breeds, their histories, and their relationships to each other.

  9. Genome-based comparative analyses of Antarctic and temperate species of Paenibacillus.

    Directory of Open Access Journals (Sweden)

    Melissa Dsouza

    Full Text Available Antarctic soils represent a unique environment characterised by extremes of temperature, salinity, elevated UV radiation, low nutrient and low water content. Despite the harshness of this environment, members of 15 bacterial phyla have been identified in soils of the Ross Sea Region (RSR. However, the survival mechanisms and ecological roles of these phyla are largely unknown. The aim of this study was to investigate whether strains of Paenibacillus darwinianus owe their resilience to substantial genomic changes. For this, genome-based comparative analyses were performed on three P. darwinianus strains, isolated from gamma-irradiated RSR soils, together with nine temperate, soil-dwelling Paenibacillus spp. The genome of each strain was sequenced to over 1,000-fold coverage, then assembled into contigs totalling approximately 3 Mbp per genome. Based on the occurrence of essential, single-copy genes, genome completeness was estimated at approximately 88%. Genome analysis revealed between 3,043-3,091 protein-coding sequences (CDSs, primarily associated with two-component systems, sigma factors, transporters, sporulation and genes induced by cold-shock, oxidative and osmotic stresses. These comparative analyses provide an insight into the metabolic potential of P. darwinianus, revealing potential adaptive mechanisms for survival in Antarctic soils. However, a large proportion of these mechanisms were also identified in temperate Paenibacillus spp., suggesting that these mechanisms are beneficial for growth and survival in a range of soil environments. These analyses have also revealed that the P. darwinianus genomes contain significantly fewer CDSs and have a lower paralogous content. Notwithstanding the incompleteness of the assemblies, the large differences in genome sizes, determined by the number of genes in paralogous clusters and the CDS content, are indicative of genome content scaling. Finally, these sequences are a resource for further

  10. Genomic analyses of the CAM plant pineapple.

    Science.gov (United States)

    Zhang, Jisen; Liu, Juan; Ming, Ray

    2014-07-01

    The innovation of crassulacean acid metabolism (CAM) photosynthesis in arid and/or low CO2 conditions is a remarkable case of adaptation in flowering plants. As the most important crop that utilizes CAM photosynthesis, the genetic and genomic resources of pineapple have been developed over many years. Genetic diversity studies using various types of DNA markers led to the reclassification of the two genera Ananas and Pseudananas and nine species into one genus Ananas and two species, A. comosus and A. macrodontes with five botanical varieties in A. comosus. Five genetic maps have been constructed using F1 or F2 populations, and high-density genetic maps generated by genotype sequencing are essential resources for sequencing and assembling the pineapple genome and for marker-assisted selection. There are abundant expression sequence tag resources but limited genomic sequences in pineapple. Genes involved in the CAM pathway has been analysed in several CAM plants but only a few of them are from pineapple. A reference genome of pineapple is being generated and will accelerate genetic and genomic research in this major CAM crop. This reference genome of pineapple provides the foundation for studying the origin and regulatory mechanism of CAM photosynthesis, and the opportunity to evaluate the classification of Ananas species and botanical cultivars. © The Author 2014. Published by Oxford University Press on behalf of the Society for Experimental Biology. All rights reserved. For permissions, please email: journals.permissions@oup.com.

  11. Genome-Wide Analyses Suggest Mechanisms Involving Early B-Cell Development in Canine IgA Deficiency.

    Directory of Open Access Journals (Sweden)

    Mia Olsson

    Full Text Available Immunoglobulin A deficiency (IgAD is the most common primary immune deficiency disorder in both humans and dogs, characterized by recurrent mucosal tract infections and a predisposition for allergic and other immune mediated diseases. In several dog breeds, low IgA levels have been observed at a high frequency and with a clinical resemblance to human IgAD. In this study, we used genome-wide association studies (GWAS to identify genomic regions associated with low IgA levels in dogs as a comparative model for human IgAD. We used a novel percentile groups-approach to establish breed-specific cut-offs and to perform analyses in a close to continuous manner. GWAS performed in four breeds prone to low IgA levels (German shepherd, Golden retriever, Labrador retriever and Shar-Pei identified 35 genomic loci suggestively associated (p <0.0005 to IgA levels. In German shepherd, three genomic regions (candidate genes include KIRREL3 and SERPINA9 were genome-wide significantly associated (p <0.0002 with IgA levels. A ~20kb long haplotype on CFA28, significantly associated (p = 0.0005 to IgA levels in Shar-Pei, was positioned within the first intron of the gene SLIT1. Both KIRREL3 and SLIT1 are highly expressed in the central nervous system and in bone marrow and are potentially important during B-cell development. SERPINA9 expression is restricted to B-cells and peaks at the time-point when B-cells proliferate into antibody-producing plasma cells. The suggestively associated regions were enriched for genes in Gene Ontology gene sets involving inflammation and early immune cell development.

  12. Tracing common origins of Genomic Islands in prokaryotes based on genome signature analyses.

    Science.gov (United States)

    van Passel, Mark Wj

    2011-09-01

    Horizontal gene transfer constitutes a powerful and innovative force in evolution, but often little is known about the actual origins of transferred genes. Sequence alignments are generally of limited use in tracking the original donor, since still only a small fraction of the total genetic diversity is thought to be uncovered. Alternatively, approaches based on similarities in the genome specific relative oligonucleotide frequencies do not require alignments. Even though the exact origins of horizontally transferred genes may still not be established using these compositional analyses, it does suggest that compositionally very similar regions are likely to have had a common origin. These analyses have shown that up to a third of large acquired gene clusters that reside in the same genome are compositionally very similar, indicative of a shared origin. This brings us closer to uncovering the original donors of horizontally transferred genes, and could help in elucidating possible regulatory interactions between previously unlinked sequences.

  13. Nature vs. nurture in human sociality: multi-level genomic analyses of social conformity.

    Science.gov (United States)

    Chen, Biqing; Zhu, Zijian; Wang, Yingying; Ding, Xiaohu; Guo, Xiaobo; He, Mingguang; Fang, Wan; Zhou, Qin; Zhou, Shanbi; Lei, Han; Huang, Ailong; Chen, Tingmei; Ni, Dongsheng; Gu, Yuping; Liu, Jianing; Rao, Yi

    2018-05-01

    Social conformity is fundamental to human societies and has been studied for more than six decades, but our understanding of its mechanisms remains limited. Individual differences in conformity have been attributed to social and cultural environmental influences, but not to genes. Here we demonstrate a genetic contribution to conformity after analyzing 1,140 twins and single-nucleotide polymorphism (SNP)-based studies of 2,130 young adults. A two-step genome-wide association study (GWAS) revealed replicable associations in 9 genomic loci, and a meta-analysis of three GWAS with a sample size of ~2,600 further confirmed one locus, corresponding to the NAV3 (Neuron Navigator 3) gene which encodes a protein important for axon outgrowth and guidance. Further multi-level (haplotype, gene, pathway) GWAS strongly associated genes including NAV3, PTPRD (protein tyrosine phosphatase receptor type D), ARL10 (ADP ribosylation factor-like GTPase 10), and CTNND2 (catenin delta 2), with conformity. Magnetic resonance imaging of 64 subjects shows correlation of activation or structural features of brain regions with the SNPs of these genes, supporting their functional significance. Our results suggest potential moderate genetic influence on conformity, implicate several specific genetic elements in conformity and will facilitate further research on cellular and molecular mechanisms underlying human conformity.

  14. The complete mitochondrial genome of the styloperlid stonefly species Styloperla spinicercia Wu (Insecta: Plecoptera) with family-level phylogenetic analyses of the Pteronarcyoidea.

    Science.gov (United States)

    Wang, Ying; Cao, Jinjun; Li, Weihai

    2017-03-13

    We present the complete mitochondrial (mt) genome sequence of the stonefly, Styloperla spinicercia Wu, 1935 (Plecoptera: Styloperlidae), the type species of the genus Styloperla and the first complete mt genome for the family Styloperlidae. The genome is circular, 16,129 base pairs long, has an A+T content of 70.7%, and contains 37 genes including the large and small ribosomal RNA (rRNA) subunits, 13 protein coding genes (PCGs), 22 tRNA genes and a large non-coding region (CR). All of the PCGs use the standard initiation codon ATN except ND1 and ND5, which start with TTG and GTG. Twelve of the PCGs stop with conventional terminal codons TAA and TAG, except ND5 which shows an incomplete terminator signal T. All tRNAs have the classic clover-leaf structures with the dihydrouridine (DHU) arm of tRNASer(AGN) forming a simple loop. Secondary structures of the two ribosomal RNAs are presented with reference to previous models. The structural elements and the variable numbers of tandem repeats are described within the control region. Phylogenetic analyses using both Bayesian (BI) and Maximum Likelihood (ML) methods support the previous hypotheses regarding family level relationships within the Pteronarcyoidea. The genetic distance calculated based on 13 PCGs and two rRNAs between Styloperla sp. and S. spinicercia is provided and interspecific divergence is discussed.

  15. [Complete genome sequencing and analyses of rabies viruses isolated from wild animals (Chinese Ferret-Badger) in Zhejiang province].

    Science.gov (United States)

    Lei, Yong-Liang; Wang, Xiao-Guang; Liu, Fu-Ming; Chen, Xiu-Ying; Ye, Bi-Feng; Mei, Jian-Hua; Lan, Jin-Quan; Tang, Qing

    2009-08-01

    Based on sequencing the full-length genomes of two Chinese Ferret-Badger, we analyzed the properties of rabies viruses genetic variation in molecular level to get information on prevalence and variation of rabies viruses in Zhejiang, and to enrich the genome database of rabies viruses street strains isolated from Chinese wildlife. Overlapped fragments were amplified by RT-PCR and full-length genomes were assembled to analyze the nucleotide and deduced protein similarities and phylogenetic analyses of the N genes from Chinese Ferret-Badger, sika deer, vole, dog. Vaccine strains were then determined. The two full-length genomes were completely sequenced to find out that they had the same genetic structure with 11 923 nts including 58 nts-Leader, 1353 nts-NP, 894 nts-PP, 609 nts-MP, 1575 nts-GP, 6386 nts-LP, and 2, 5, 5 nts- intergenic regions (IGRs), 423 nts-Pseudogene-like sequence (Psi), 70 nts-Trailer. The two full-length genomes were in accordance with the properties of Rhabdoviridae Lyssa virus by blast and multi-sequence alignment. The nucleotide and amino acid sequences among Chinese strains had the highest similarity, especially among animals of the same species. Of the two full-length genomes, the similarity in amino acid level was dramatically higher than that in nucleotide level, so that the nucleotide mutations happened in these two genomes were most probably as synonymous mutations. Compared to the referenced rabies viruses, the lengths of the five protein coding regions did not show any changes or recombination, but only with a few-point mutations. It was evident that the five proteins appeared to be stable. The variation sites and types of the two ferret badgers genomes were similar to the referenced vaccine or street strains. The two strains were genotype 1 according to the multi-sequence and phylogenetic analyses, which possessing the distinct geographyphic characteristics of China. All the evidence suggested a cue that these two ferret badgers

  16. CrusView: a Java-based visualization platform for comparative genomics analyses in Brassicaceae species.

    Science.gov (United States)

    Chen, Hao; Wang, Xiangfeng

    2013-09-01

    In plants and animals, chromosomal breakage and fusion events based on conserved syntenic genomic blocks lead to conserved patterns of karyotype evolution among species of the same family. However, karyotype information has not been well utilized in genomic comparison studies. We present CrusView, a Java-based bioinformatic application utilizing Standard Widget Toolkit/Swing graphics libraries and a SQLite database for performing visualized analyses of comparative genomics data in Brassicaceae (crucifer) plants. Compared with similar software and databases, one of the unique features of CrusView is its integration of karyotype information when comparing two genomes. This feature allows users to perform karyotype-based genome assembly and karyotype-assisted genome synteny analyses with preset karyotype patterns of the Brassicaceae genomes. Additionally, CrusView is a local program, which gives its users high flexibility when analyzing unpublished genomes and allows users to upload self-defined genomic information so that they can visually study the associations between genome structural variations and genetic elements, including chromosomal rearrangements, genomic macrosynteny, gene families, high-frequency recombination sites, and tandem and segmental duplications between related species. This tool will greatly facilitate karyotype, chromosome, and genome evolution studies using visualized comparative genomics approaches in Brassicaceae species. CrusView is freely available at http://www.cmbb.arizona.edu/CrusView/.

  17. Comparative genomics analyses revealed two virulent Listeria monocytogenes strains isolated from ready-to-eat food.

    Science.gov (United States)

    Lim, Shu Yong; Yap, Kien-Pong; Thong, Kwai Lin

    2016-01-01

    Listeria monocytogenes is an important foodborne pathogen that causes considerable morbidity in humans with high mortality rates. In this study, we have sequenced the genomes and performed comparative genomics analyses on two strains, LM115 and LM41, isolated from ready-to-eat food in Malaysia. The genome size of LM115 and LM41 was 2,959,041 and 2,963,111 bp, respectively. These two strains shared approximately 90% homologous genes. Comparative genomics and phylogenomic analyses revealed that LM115 and LM41 were more closely related to the reference strains F2365 and EGD-e, respectively. Our virulence profiling indicated a total of 31 virulence genes shared by both analysed strains. These shared genes included those that encode for internalins and L. monocytogenes pathogenicity island 1 (LIPI-1). Both the Malaysian L. monocytogenes strains also harboured several genes associated with stress tolerance to counter the adverse conditions. Seven antibiotic and efflux pump related genes which may confer resistance against lincomycin, erythromycin, fosfomycin, quinolone, tetracycline, and penicillin, and macrolides were identified in the genomes of both strains. Whole genome sequencing and comparative genomics analyses revealed two virulent L. monocytogenes strains isolated from ready-to-eat foods in Malaysia. The identification of strains with pathogenic, persistent, and antibiotic resistant potentials from minimally processed food warrant close attention from both healthcare and food industry.

  18. Comparative Genomic and Transcriptional Analyses of CRISPR Systems Across the Genus Pyrobaculum

    Directory of Open Access Journals (Sweden)

    David L Bernick

    2012-07-01

    Full Text Available Within the domain Archaea, the CRISPR immune system appears to be nearly ubiquitous based on computational genome analyses. Initial studies in bacteria demonstrated that the CRISPR system targets invading plasmid and viral DNA. Recent experiments in the model archaeon Pyrococcus furiosus uncovered a novel RNA-targeting variant of the CRISPR system potentially unique to archaea. Because our understanding of CRISPR system evolution in other archaea is limited, we have taken a comparative genomic and transcriptomic view of the CRISPR arrays across six diverse species within the crenarchaeal genus Pyrobaculum. We present transcriptional data from each of four species in the genus (P. aerophilum, P. islandicum, P. calidifontis, P. arsenaticum, analyzing mature CRISPR-associated small RNA abundance from over 20 arrays. Within the genus, there is remarkable conservation of CRISPR array structure, as well as unique features that are have not been studied in other archaeal systems. These unique features include: a nearly invariant CRISPR promoter, conservation of direct repeat families, the 5' polarity of CRISPR-associated small RNA abundance, and a novel CRISPR-specific association with homologues of nurA and herA. These analyses provide a genus-level evolutionary perspective on archaeal CRISPR systems, broadening our understanding beyond existing non-comparative model systems.

  19. CrusView: A Java-Based Visualization Platform for Comparative Genomics Analyses in Brassicaceae Species[OPEN

    Science.gov (United States)

    Chen, Hao; Wang, Xiangfeng

    2013-01-01

    In plants and animals, chromosomal breakage and fusion events based on conserved syntenic genomic blocks lead to conserved patterns of karyotype evolution among species of the same family. However, karyotype information has not been well utilized in genomic comparison studies. We present CrusView, a Java-based bioinformatic application utilizing Standard Widget Toolkit/Swing graphics libraries and a SQLite database for performing visualized analyses of comparative genomics data in Brassicaceae (crucifer) plants. Compared with similar software and databases, one of the unique features of CrusView is its integration of karyotype information when comparing two genomes. This feature allows users to perform karyotype-based genome assembly and karyotype-assisted genome synteny analyses with preset karyotype patterns of the Brassicaceae genomes. Additionally, CrusView is a local program, which gives its users high flexibility when analyzing unpublished genomes and allows users to upload self-defined genomic information so that they can visually study the associations between genome structural variations and genetic elements, including chromosomal rearrangements, genomic macrosynteny, gene families, high-frequency recombination sites, and tandem and segmental duplications between related species. This tool will greatly facilitate karyotype, chromosome, and genome evolution studies using visualized comparative genomics approaches in Brassicaceae species. CrusView is freely available at http://www.cmbb.arizona.edu/CrusView/. PMID:23898041

  20. Using relational databases for improved sequence similarity searching and large-scale genomic analyses.

    Science.gov (United States)

    Mackey, Aaron J; Pearson, William R

    2004-10-01

    Relational databases are designed to integrate diverse types of information and manage large sets of search results, greatly simplifying genome-scale analyses. Relational databases are essential for management and analysis of large-scale sequence analyses, and can also be used to improve the statistical significance of similarity searches by focusing on subsets of sequence libraries most likely to contain homologs. This unit describes using relational databases to improve the efficiency of sequence similarity searching and to demonstrate various large-scale genomic analyses of homology-related data. This unit describes the installation and use of a simple protein sequence database, seqdb_demo, which is used as a basis for the other protocols. These include basic use of the database to generate a novel sequence library subset, how to extend and use seqdb_demo for the storage of sequence similarity search results and making use of various kinds of stored search results to address aspects of comparative genomic analysis.

  1. Genome size analyses of Pucciniales reveal the largest fungal genomes.

    Science.gov (United States)

    Tavares, Sílvia; Ramos, Ana Paula; Pires, Ana Sofia; Azinheira, Helena G; Caldeirinha, Patrícia; Link, Tobias; Abranches, Rita; Silva, Maria do Céu; Voegele, Ralf T; Loureiro, João; Talhinhas, Pedro

    2014-01-01

    Rust fungi (Basidiomycota, Pucciniales) are biotrophic plant pathogens which exhibit diverse complexities in their life cycles and host ranges. The completion of genome sequencing of a few rust fungi has revealed the occurrence of large genomes. Sequencing efforts for other rust fungi have been hampered by uncertainty concerning their genome sizes. Flow cytometry was recently applied to estimate the genome size of a few rust fungi, and confirmed the occurrence of large genomes in this order (averaging 225.3 Mbp, while the average for Basidiomycota was 49.9 Mbp and was 37.7 Mbp for all fungi). In this work, we have used an innovative and simple approach to simultaneously isolate nuclei from the rust and its host plant in order to estimate the genome size of 30 rust species by flow cytometry. Genome sizes varied over 10-fold, from 70 to 893 Mbp, with an average genome size value of 380.2 Mbp. Compared to the genome sizes of over 1800 fungi, Gymnosporangium confusum possesses the largest fungal genome ever reported (893.2 Mbp). Moreover, even the smallest rust genome determined in this study is larger than the vast majority of fungal genomes (94%). The average genome size of the Pucciniales is now of 305.5 Mbp, while the average Basidiomycota genome size has shifted to 70.4 Mbp and the average for all fungi reached 44.2 Mbp. Despite the fact that no correlation could be drawn between the genome sizes, the phylogenomics or the life cycle of rust fungi, it is interesting to note that rusts with Fabaceae hosts present genomes clearly larger than those with Poaceae hosts. Although this study comprises only a small fraction of the more than 7000 rust species described, it seems already evident that the Pucciniales represent a group where genome size expansion could be a common characteristic. This is in sharp contrast to sister taxa, placing this order in a relevant position in fungal genomics research.

  2. Complete genome of a European hepatitis C virus subtype 1g isolate: phylogenetic and genetic analyses.

    Science.gov (United States)

    Bracho, Maria A; Saludes, Verónica; Martró, Elisa; Bargalló, Ana; González-Candelas, Fernando; Ausina, Vicent

    2008-06-05

    Hepatitis C virus isolates have been classified into six main genotypes and a variable number of subtypes within each genotype, mainly based on phylogenetic analysis. Analyses of the genetic relationship among genotypes and subtypes are more reliable when complete genome sequences (or at least the full coding region) are used; however, so far 31 of 80 confirmed or proposed subtypes have at least one complete genome available. Of these, 20 correspond to confirmed subtypes of epidemic interest. We present and analyse the first complete genome sequence of a HCV subtype 1g isolate. Phylogenetic and genetic distance analyses reveal that HCV-1g is the most divergent subtype among the HCV-1 confirmed subtypes. Potential genomic recombination events between genotypes or subtype 1 genomes were ruled out. We demonstrate phylogenetic congruence of previously deposited partial sequences of HCV-1g with respect to our sequence. In light of this, we propose changing the current status of its subtype-specific designation from provisional to confirmed.

  3. Comparative Genome Analyses of Serratia marcescens FS14 Reveals Its High Antagonistic Potential

    Science.gov (United States)

    Li, Pengpeng; Kwok, Amy H. Y.; Jiang, Jingwei; Ran, Tingting; Xu, Dongqing; Wang, Weiwu; Leung, Frederick C.

    2015-01-01

    S. marcescens FS14 was isolated from an Atractylodes macrocephala Koidz plant that was infected by Fusarium oxysporum and showed symptoms of root rot. With the completion of the genome sequence of FS14, the first comprehensive comparative-genomic analysis of the Serratia genus was performed. Pan-genome and COG analyses showed that the majority of the conserved core genes are involved in basic cellular functions, while genomic factors such as prophages contribute considerably to genome diversity. Additionally, a Type I restriction-modification system, a Type III secretion system and tellurium resistance genes are found in only some Serratia species. Comparative analysis further identified that S. marcescens FS14 possesses multiple mechanisms for antagonism against other microorganisms, including the production of prodigiosin, bacteriocins, and multi-antibiotic resistant determinants as well as chitinases. The presence of two evolutionarily distinct Type VI secretion systems (T6SSs) in FS14 may provide further competitive advantages for FS14 against other microbes. To our knowledge, this is the first report of comparative analysis on T6SSs in the genus, which identifies four types of T6SSs in Serratia spp.. Competition bioassays of FS14 against the vital plant pathogenic bacterium Ralstonia solanacearum and fungi Fusarium oxysporum and Sclerotinia sclerotiorum were performed to support our genomic analyses, in which FS14 demonstrated high antagonistic activities against both bacterial and fungal phytopathogens. PMID:25856195

  4. Comparative genome analyses of Serratia marcescens FS14 reveals its high antagonistic potential.

    Science.gov (United States)

    Li, Pengpeng; Kwok, Amy H Y; Jiang, Jingwei; Ran, Tingting; Xu, Dongqing; Wang, Weiwu; Leung, Frederick C

    2015-01-01

    S. marcescens FS14 was isolated from an Atractylodes macrocephala Koidz plant that was infected by Fusarium oxysporum and showed symptoms of root rot. With the completion of the genome sequence of FS14, the first comprehensive comparative-genomic analysis of the Serratia genus was performed. Pan-genome and COG analyses showed that the majority of the conserved core genes are involved in basic cellular functions, while genomic factors such as prophages contribute considerably to genome diversity. Additionally, a Type I restriction-modification system, a Type III secretion system and tellurium resistance genes are found in only some Serratia species. Comparative analysis further identified that S. marcescens FS14 possesses multiple mechanisms for antagonism against other microorganisms, including the production of prodigiosin, bacteriocins, and multi-antibiotic resistant determinants as well as chitinases. The presence of two evolutionarily distinct Type VI secretion systems (T6SSs) in FS14 may provide further competitive advantages for FS14 against other microbes. To our knowledge, this is the first report of comparative analysis on T6SSs in the genus, which identifies four types of T6SSs in Serratia spp.. Competition bioassays of FS14 against the vital plant pathogenic bacterium Ralstonia solanacearum and fungi Fusarium oxysporum and Sclerotinia sclerotiorum were performed to support our genomic analyses, in which FS14 demonstrated high antagonistic activities against both bacterial and fungal phytopathogens.

  5. Complete genome of a European hepatitis C virus subtype 1g isolate: phylogenetic and genetic analyses

    Directory of Open Access Journals (Sweden)

    Bargalló Ana

    2008-06-01

    Full Text Available Abstract Background Hepatitis C virus isolates have been classified into six main genotypes and a variable number of subtypes within each genotype, mainly based on phylogenetic analysis. Analyses of the genetic relationship among genotypes and subtypes are more reliable when complete genome sequences (or at least the full coding region are used; however, so far 31 of 80 confirmed or proposed subtypes have at least one complete genome available. Of these, 20 correspond to confirmed subtypes of epidemic interest. Results We present and analyse the first complete genome sequence of a HCV subtype 1g isolate. Phylogenetic and genetic distance analyses reveal that HCV-1g is the most divergent subtype among the HCV-1 confirmed subtypes. Potential genomic recombination events between genotypes or subtype 1 genomes were ruled out. We demonstrate phylogenetic congruence of previously deposited partial sequences of HCV-1g with respect to our sequence. Conclusion In light of this, we propose changing the current status of its subtype-specific designation from provisional to confirmed.

  6. Comparative genome analyses of Serratia marcescens FS14 reveals its high antagonistic potential.

    Directory of Open Access Journals (Sweden)

    Pengpeng Li

    Full Text Available S. marcescens FS14 was isolated from an Atractylodes macrocephala Koidz plant that was infected by Fusarium oxysporum and showed symptoms of root rot. With the completion of the genome sequence of FS14, the first comprehensive comparative-genomic analysis of the Serratia genus was performed. Pan-genome and COG analyses showed that the majority of the conserved core genes are involved in basic cellular functions, while genomic factors such as prophages contribute considerably to genome diversity. Additionally, a Type I restriction-modification system, a Type III secretion system and tellurium resistance genes are found in only some Serratia species. Comparative analysis further identified that S. marcescens FS14 possesses multiple mechanisms for antagonism against other microorganisms, including the production of prodigiosin, bacteriocins, and multi-antibiotic resistant determinants as well as chitinases. The presence of two evolutionarily distinct Type VI secretion systems (T6SSs in FS14 may provide further competitive advantages for FS14 against other microbes. To our knowledge, this is the first report of comparative analysis on T6SSs in the genus, which identifies four types of T6SSs in Serratia spp.. Competition bioassays of FS14 against the vital plant pathogenic bacterium Ralstonia solanacearum and fungi Fusarium oxysporum and Sclerotinia sclerotiorum were performed to support our genomic analyses, in which FS14 demonstrated high antagonistic activities against both bacterial and fungal phytopathogens.

  7. Genomic analyses inform on migration events during the peopling of Eurasia.

    Science.gov (United States)

    Pagani, Luca; Lawson, Daniel John; Jagoda, Evelyn; Mörseburg, Alexander; Eriksson, Anders; Mitt, Mario; Clemente, Florian; Hudjashov, Georgi; DeGiorgio, Michael; Saag, Lauri; Wall, Jeffrey D; Cardona, Alexia; Mägi, Reedik; Wilson Sayres, Melissa A; Kaewert, Sarah; Inchley, Charlotte; Scheib, Christiana L; Järve, Mari; Karmin, Monika; Jacobs, Guy S; Antao, Tiago; Iliescu, Florin Mircea; Kushniarevich, Alena; Ayub, Qasim; Tyler-Smith, Chris; Xue, Yali; Yunusbayev, Bayazit; Tambets, Kristiina; Mallick, Chandana Basu; Saag, Lehti; Pocheshkhova, Elvira; Andriadze, George; Muller, Craig; Westaway, Michael C; Lambert, David M; Zoraqi, Grigor; Turdikulova, Shahlo; Dalimova, Dilbar; Sabitov, Zhaxylyk; Sultana, Gazi Nurun Nahar; Lachance, Joseph; Tishkoff, Sarah; Momynaliev, Kuvat; Isakova, Jainagul; Damba, Larisa D; Gubina, Marina; Nymadawa, Pagbajabyn; Evseeva, Irina; Atramentova, Lubov; Utevska, Olga; Ricaut, François-Xavier; Brucato, Nicolas; Sudoyo, Herawati; Letellier, Thierry; Cox, Murray P; Barashkov, Nikolay A; Skaro, Vedrana; Mulahasanovic, Lejla; Primorac, Dragan; Sahakyan, Hovhannes; Mormina, Maru; Eichstaedt, Christina A; Lichman, Daria V; Abdullah, Syafiq; Chaubey, Gyaneshwer; Wee, Joseph T S; Mihailov, Evelin; Karunas, Alexandra; Litvinov, Sergei; Khusainova, Rita; Ekomasova, Natalya; Akhmetova, Vita; Khidiyatova, Irina; Marjanović, Damir; Yepiskoposyan, Levon; Behar, Doron M; Balanovska, Elena; Metspalu, Andres; Derenko, Miroslava; Malyarchuk, Boris; Voevoda, Mikhail; Fedorova, Sardana A; Osipova, Ludmila P; Lahr, Marta Mirazón; Gerbault, Pascale; Leavesley, Matthew; Migliano, Andrea Bamberg; Petraglia, Michael; Balanovsky, Oleg; Khusnutdinova, Elza K; Metspalu, Ene; Thomas, Mark G; Manica, Andrea; Nielsen, Rasmus; Villems, Richard; Willerslev, Eske; Kivisild, Toomas; Metspalu, Mait

    2016-10-13

    High-coverage whole-genome sequence studies have so far focused on a limited number of geographically restricted populations, or been targeted at specific diseases, such as cancer. Nevertheless, the availability of high-resolution genomic data has led to the development of new methodologies for inferring population history and refuelled the debate on the mutation rate in humans. Here we present the Estonian Biocentre Human Genome Diversity Panel (EGDP), a dataset of 483 high-coverage human genomes from 148 populations worldwide, including 379 new genomes from 125 populations, which we group into diversity and selection sets. We analyse this dataset to refine estimates of continent-wide patterns of heterozygosity, long- and short-distance gene flow, archaic admixture, and changes in effective population size through time as well as for signals of positive or balancing selection. We find a genetic signature in present-day Papuans that suggests that at least 2% of their genome originates from an early and largely extinct expansion of anatomically modern humans (AMHs) out of Africa. Together with evidence from the western Asian fossil record, and admixture between AMHs and Neanderthals predating the main Eurasian expansion, our results contribute to the mounting evidence for the presence of AMHs out of Africa earlier than 75,000 years ago.

  8. Genomic analyses inform on migration events during the peopling of Eurasia

    Science.gov (United States)

    Pagani, Luca; Lawson, Daniel John; Jagoda, Evelyn; Mörseburg, Alexander; Eriksson, Anders; Mitt, Mario; Clemente, Florian; Hudjashov, Georgi; Degiorgio, Michael; Saag, Lauri; Wall, Jeffrey D.; Cardona, Alexia; Mägi, Reedik; Sayres, Melissa A. Wilson; Kaewert, Sarah; Inchley, Charlotte; Scheib, Christiana L.; Järve, Mari; Karmin, Monika; Jacobs, Guy S.; Antao, Tiago; Iliescu, Florin Mircea; Kushniarevich, Alena; Ayub, Qasim; Tyler-Smith, Chris; Xue, Yali; Yunusbayev, Bayazit; Tambets, Kristiina; Mallick, Chandana Basu; Saag, Lehti; Pocheshkhova, Elvira; Andriadze, George; Muller, Craig; Westaway, Michael C.; Lambert, David M.; Zoraqi, Grigor; Turdikulova, Shahlo; Dalimova, Dilbar; Sabitov, Zhaxylyk; Sultana, Gazi Nurun Nahar; Lachance, Joseph; Tishkoff, Sarah; Momynaliev, Kuvat; Isakova, Jainagul; Damba, Larisa D.; Gubina, Marina; Nymadawa, Pagbajabyn; Evseeva, Irina; Atramentova, Lubov; Utevska, Olga; Ricaut, François-Xavier; Brucato, Nicolas; Sudoyo, Herawati; Letellier, Thierry; Cox, Murray P.; Barashkov, Nikolay A.; Škaro, Vedrana; Mulaha´, Lejla; Primorac, Dragan; Sahakyan, Hovhannes; Mormina, Maru; Eichstaedt, Christina A.; Lichman, Daria V.; Abdullah, Syafiq; Chaubey, Gyaneshwer; Wee, Joseph T. S.; Mihailov, Evelin; Karunas, Alexandra; Litvinov, Sergei; Khusainova, Rita; Ekomasova, Natalya; Akhmetova, Vita; Khidiyatova, Irina; Marjanović, Damir; Yepiskoposyan, Levon; Behar, Doron M.; Balanovska, Elena; Metspalu, Andres; Derenko, Miroslava; Malyarchuk, Boris; Voevoda, Mikhail; Fedorova, Sardana A.; Osipova, Ludmila P.; Lahr, Marta Mirazón; Gerbault, Pascale; Leavesley, Matthew; Migliano, Andrea Bamberg; Petraglia, Michael; Balanovsky, Oleg; Khusnutdinova, Elza K.; Metspalu, Ene; Thomas, Mark G.; Manica, Andrea; Nielsen, Rasmus; Villems, Richard; Willerslev, Eske; Kivisild, Toomas; Metspalu, Mait

    2016-10-01

    High-coverage whole-genome sequence studies have so far focused on a limited number of geographically restricted populations, or been targeted at specific diseases, such as cancer. Nevertheless, the availability of high-resolution genomic data has led to the development of new methodologies for inferring population history and refuelled the debate on the mutation rate in humans. Here we present the Estonian Biocentre Human Genome Diversity Panel (EGDP), a dataset of 483 high-coverage human genomes from 148 populations worldwide, including 379 new genomes from 125 populations, which we group into diversity and selection sets. We analyse this dataset to refine estimates of continent-wide patterns of heterozygosity, long- and short-distance gene flow, archaic admixture, and changes in effective population size through time as well as for signals of positive or balancing selection. We find a genetic signature in present-day Papuans that suggests that at least 2% of their genome originates from an early and largely extinct expansion of anatomically modern humans (AMHs) out of Africa. Together with evidence from the western Asian fossil record, and admixture between AMHs and Neanderthals predating the main Eurasian expansion, our results contribute to the mounting evidence for the presence of AMHs out of Africa earlier than 75,000 years ago.

  9. Genome-wide nucleosome map and cytosine methylation levels of an ancient human genome

    DEFF Research Database (Denmark)

    Pedersen, Jakob Skou; Valen, Eivind; Velazquez, Amhed Missael Vargas

    2014-01-01

    Epigenetic information is available from contemporary organisms, but is difficult to track back in evolutionary time. Here, we show that genome-wide epigenetic information can be gathered directly from next-generation sequence reads of DNA isolated from ancient remains. Using the genome sequence...... data generated from hair shafts of a 4000-yr-old Paleo-Eskimo belonging to the Saqqaq culture, we generate the first ancient nucleosome map coupled with a genome-wide survey of cytosine methylation levels. The validity of both nucleosome map and methylation levels were confirmed by the recovery...

  10. Genome-wide meta-analyses identify multiple loci associated with smoking behavior

    NARCIS (Netherlands)

    H. Furberg (Helena); Y. Kim (Yunjung); J. Dackor (Jennifer); E.A. Boerwinkle (Eric); N. Franceschini (Nora); D. Ardissino (Diego); L. Bernardinelli (Luisa); P.M. Mannucci (Pier); F. Mauri (Francesco); P.A. Merlini (Piera); D. Absher (Devin); T.L. Assimes (Themistocles); S.P. Fortmann (Stephen); C. Iribarren (Carlos); J.W. Knowles (Joshua); T. Quertermous (Thomas); L. Ferrucci (Luigi); T. Tanaka (Toshiko); J.C. Bis (Joshua); T. Haritunians (Talin); B. McKnight (Barbara); B.M. Psaty (Bruce); K.D. Taylor (Kent); E.L. Thacker (Evan); P. Almgren (Peter); L. Groop (Leif); C. Ladenvall (Claes); M. Boehnke (Michael); A.U. Jackson (Anne); K.L. Mohlke (Karen); H.M. Stringham (Heather); J. Tuomilehto (Jaakko); E.J. Benjamin (Emelia); S.J. Hwang; D. Levy (Daniel); S.R. Preis; R.S. Vasan (Ramachandran Srini); J. Duan (Jubao); P.V. Gejman (Pablo); D.F. Levinson (Douglas); A.R. Sanders (Alan); J. Shi (Jianxin); E.H. Lips (Esther); J.D. McKay (James); A. Agudo (Antonio); L. Barzan (Luigi); V. Bencko (Vladimir); S. Benhamou (Simone); X. Castellsagué (Xavier); C. Canova (Cristina); D.I. Conway (David); E. Fabianova (Eleonora); L. Foretova (Lenka); V. Janout (Vladimir); C.M. Healy (Claire); I. Holcátová (Ivana); K. Kjaerheim (Kristina); P. Lagiou; J. Lissowska (Jolanta); R. Lowry (Ray); T.V. MacFarlane (Tatiana); D. Mates (Dana); L. Richiardi (Lorenzo); P. Rudnai (Peter); N. Szeszenia-Dabrowska (Neonilia); D. Zaridze; A. Znaor (Ariana); M. Lathrop (Mark); P. Brennan (Paul); S. Bandinelli (Stefania); T.M. Frayling (Timothy); J.M. Guralnik (Jack); Y. Milaneschi (Yuri); J.R.B. Perry (John); D. Altshuler (David); R. Elosua (Roberto); S. Kathiresan (Sekar); G. Lucas (Gavin); O. Melander (Olle); V. Salomaa (Veikko); S.M. Schwartz (Stephen); B.F. Voight (Benjamin); B.W.J.H. Penninx (Brenda); J.H. Smit (Johannes); N. Vogelzangs (Nicole); D.I. Boomsma (Dorret); E.J.C. de Geus (Eco); J.M. Vink (Jacqueline); G.A.H.M. Willemsen (Gonneke); S.J. Chanock (Stephen); F. Gu (Fangyi); S.E. Hankinson (Susan); D. Hunter (David); A. Hofman (Albert); H.W. Tiemeier (Henning); A.G. Uitterlinden (André); P. Tikka-Kleemola (Päivi); S. Walter (Stefan); D.I. Chasman (Daniel); B.M. Everett (Brendan); G. Pare (Guillaume); P.M. Ridker (Paul); M.D. Li (Ming); H.H. Maes (Hermine); J. Audrain-Mcgovern (Janet); D. Posthuma (Danielle); L.M. Thornton (Laura); C. Lerman (Caryn); J. Kaprio (Jaakko); J.E. Rose (Jed); J.P.A. Ioannidis (John); P. Kraft (Peter); D.Y. Lin (Dan); P.F. Sullivan (Patrick); C.J. O'Donnell (Christopher)

    2010-01-01

    textabstractConsistent but indirect evidence has implicated genetic factors in smoking behavior. We report meta-analyses of several smoking phenotypes within cohorts of the Tobacco and Genetics Consortium (n = 74,053). We also partnered with the European Network of Genetic and Genomic Epidemiology

  11. MutSpec: a Galaxy toolbox for streamlined analyses of somatic mutation spectra in human and mouse cancer genomes.

    Science.gov (United States)

    Ardin, Maude; Cahais, Vincent; Castells, Xavier; Bouaoun, Liacine; Byrnes, Graham; Herceg, Zdenko; Zavadil, Jiri; Olivier, Magali

    2016-04-18

    The nature of somatic mutations observed in human tumors at single gene or genome-wide levels can reveal information on past carcinogenic exposures and mutational processes contributing to tumor development. While large amounts of sequencing data are being generated, the associated analysis and interpretation of mutation patterns that may reveal clues about the natural history of cancer present complex and challenging tasks that require advanced bioinformatics skills. To make such analyses accessible to a wider community of researchers with no programming expertise, we have developed within the web-based user-friendly platform Galaxy a first-of-its-kind package called MutSpec. MutSpec includes a set of tools that perform variant annotation and use advanced statistics for the identification of mutation signatures present in cancer genomes and for comparing the obtained signatures with those published in the COSMIC database and other sources. MutSpec offers an accessible framework for building reproducible analysis pipelines, integrating existing methods and scripts developed in-house with publicly available R packages. MutSpec may be used to analyse data from whole-exome, whole-genome or targeted sequencing experiments performed on human or mouse genomes. Results are provided in various formats including rich graphical outputs. An example is presented to illustrate the package functionalities, the straightforward workflow analysis and the richness of the statistics and publication-grade graphics produced by the tool. MutSpec offers an easy-to-use graphical interface embedded in the popular Galaxy platform that can be used by researchers with limited programming or bioinformatics expertise to analyse mutation signatures present in cancer genomes. MutSpec can thus effectively assist in the discovery of complex mutational processes resulting from exogenous and endogenous carcinogenic insults.

  12. Coalescent-based genome analyses resolve the early branches of the euarchontoglires.

    Directory of Open Access Journals (Sweden)

    Vikas Kumar

    Full Text Available Despite numerous large-scale phylogenomic studies, certain parts of the mammalian tree are extraordinarily difficult to resolve. We used the coding regions from 19 completely sequenced genomes to study the relationships within the super-clade Euarchontoglires (Primates, Rodentia, Lagomorpha, Dermoptera and Scandentia because the placement of Scandentia within this clade is controversial. The difficulty in resolving this issue is due to the short time spans between the early divergences of Euarchontoglires, which may cause incongruent gene trees. The conflict in the data can be depicted by network analyses and the contentious relationships are best reconstructed by coalescent-based analyses. This method is expected to be superior to analyses of concatenated data in reconstructing a species tree from numerous gene trees. The total concatenated dataset used to study the relationships in this group comprises 5,875 protein-coding genes (9,799,170 nucleotides from all orders except Dermoptera (flying lemurs. Reconstruction of the species tree from 1,006 gene trees using coalescent models placed Scandentia as sister group to the primates, which is in agreement with maximum likelihood analyses of concatenated nucleotide sequence data. Additionally, both analytical approaches favoured the Tarsier to be sister taxon to Anthropoidea, thus belonging to the Haplorrhine clade. When divergence times are short such as in radiations over periods of a few million years, even genome scale analyses struggle to resolve phylogenetic relationships. On these short branches processes such as incomplete lineage sorting and possibly hybridization occur and make it preferable to base phylogenomic analyses on coalescent methods.

  13. Increasing phylogenetic resolution at low taxonomic levels using massively parallel sequencing of chloroplast genomes

    Directory of Open Access Journals (Sweden)

    Cronn Richard

    2009-12-01

    Full Text Available Abstract Background Molecular evolutionary studies share the common goal of elucidating historical relationships, and the common challenge of adequately sampling taxa and characters. Particularly at low taxonomic levels, recent divergence, rapid radiations, and conservative genome evolution yield limited sequence variation, and dense taxon sampling is often desirable. Recent advances in massively parallel sequencing make it possible to rapidly obtain large amounts of sequence data, and multiplexing makes extensive sampling of megabase sequences feasible. Is it possible to efficiently apply massively parallel sequencing to increase phylogenetic resolution at low taxonomic levels? Results We reconstruct the infrageneric phylogeny of Pinus from 37 nearly-complete chloroplast genomes (average 109 kilobases each of an approximately 120 kilobase genome generated using multiplexed massively parallel sequencing. 30/33 ingroup nodes resolved with ≥ 95% bootstrap support; this is a substantial improvement relative to prior studies, and shows massively parallel sequencing-based strategies can produce sufficient high quality sequence to reach support levels originally proposed for the phylogenetic bootstrap. Resampling simulations show that at least the entire plastome is necessary to fully resolve Pinus, particularly in rapidly radiating clades. Meta-analysis of 99 published infrageneric phylogenies shows that whole plastome analysis should provide similar gains across a range of plant genera. A disproportionate amount of phylogenetic information resides in two loci (ycf1, ycf2, highlighting their unusual evolutionary properties. Conclusion Plastome sequencing is now an efficient option for increasing phylogenetic resolution at lower taxonomic levels in plant phylogenetic and population genetic analyses. With continuing improvements in sequencing capacity, the strategies herein should revolutionize efforts requiring dense taxon and character sampling

  14. Comparative analyses of chloroplast genome data representing nine green algae in Sphaeropleales (Chlorophyceae, Chlorophyta

    Directory of Open Access Journals (Sweden)

    Karolina Fučíková

    2016-06-01

    Full Text Available The chloroplast genomes of green algae are highly variable in their architecture. In this article we summarize gene content across newly obtained and published chloroplast genomes in Chlorophyceae, including new data from nine of species in Sphaeropleales (Chlorophyceae, Chlorophyta. We present genome architecture information, including genome synteny analysis across two groups of species. Also, we provide a phylogenetic tree obtained from analysis of gene order data for species in Chlorophyceae with fully sequenced chloroplast genomes. Further analyses and interpretation of the data can be found in “Chloroplast phylogenomic data from the green algal order Sphaeropleales (Chlorophyceae, Chlorophyta reveal complex patterns of sequence evolution” (Fučíková et al., In review [1].

  15. Genomic analyses inform on migration events during the peopling of Eurasia

    KAUST Repository

    Pagani, Luca

    2016-09-20

    High-Coverage whole-genome sequence studies have so far focused on a limited number of geographically restricted populations, or been targeted at specific diseases, such as cancer. Nevertheless, the availability of high-resolution genomic data has led to the development of new methodologies for inferring population history and refuelled the debate on the mutation rate in humans. Here we present the Estonian Biocentre Human Genome Diversity Panel (EGDP), a dataset of 483 high-coverage human genomes from 148 populations worldwide, including 379 new genomes from 125 populations, which we group into diversity and selection sets. We analyse this dataset to refine estimates of continent-wide patterns of heterozygosity, long-and short-distance gene flow, archaic admixture, and changes in effective population size through time as well as for signals of positive or balancing selection. We find a genetic signature in present-day Papuans that suggests that at least 2% of their genome originates from an early and largely extinct expansion of anatomically modern humans (AMHs) out of Africa. Together with evidence from the western Asian fossil record, and admixture between AMHs and Neanderthals predating the main Eurasian expansion, our results contribute to the mounting evidence for the presence of AMHs out of Africa earlier than 75,000 years ago. © 2016 Macmillan Publishers Limited, part of Springer Nature.

  16. Genomic analyses inform on migration events during the peopling of Eurasia

    KAUST Repository

    Pagani, Luca; Lawson, Daniel John; Jagoda, Evelyn; Mö rseburg, Alexander; Eriksson, Anders; Mitt, Mario; Clemente, Florian; Hudjashov, Georgi; DeGiorgio, Michael; Saag, Lauri; Wall, Jeffrey D.; Cardona, Alexia; Mä gi, Reedik; Sayres, Melissa A. Wilson; Kaewert, Sarah; Inchley, Charlotte; Scheib, Christiana L.; Jä rve, Mari; Karmin, Monika; Jacobs, Guy S.; Antao, Tiago; Iliescu, Florin Mircea; Kushniarevich, Alena; Ayub, Qasim; Tyler-Smith, Chris; Xue, Yali; Yunusbayev, Bayazit; Tambets, Kristiina; Mallick, Chandana Basu; Saag, Lehti; Pocheshkhova, Elvira; Andriadze, George; Muller, Craig; Westaway, Michael C.; Lambert, David M.; Zoraqi, Grigor; Turdikulova, Shahlo; Dalimova, Dilbar; Sabitov, Zhaxylyk; Sultana, Gazi Nurun Nahar; Lachance, Joseph; Tishkoff, Sarah; Momynaliev, Kuvat; Isakova, Jainagul; Damba, Larisa D.; Gubina, Marina; Nymadawa, Pagbajabyn; Evseeva, Irina; Atramentova, Lubov; Utevska, Olga; Ricaut, Franç ois-Xavier; Brucato, Nicolas; Sudoyo, Herawati; Letellier, Thierry; Cox, Murray P.; Barashkov, Nikolay A.; Škaro, Vedrana; Mulahasanovic´ , Lejla; Primorac, Dragan; Sahakyan, Hovhannes; Mormina, Maru; Eichstaedt, Christina A.; Lichman, Daria V.; Abdullah, Syafiq; Chaubey, Gyaneshwer; Wee, Joseph T. S.; Mihailov, Evelin; Karunas, Alexandra; Litvinov, Sergei; Khusainova, Rita; Ekomasova, Natalya; Akhmetova, Vita; Khidiyatova, Irina; Marjanović, Damir; Yepiskoposyan, Levon; Behar, Doron M.; Balanovska, Elena; Metspalu, Andres; Derenko, Miroslava; Malyarchuk, Boris; Voevoda, Mikhail; Fedorova, Sardana A.; Osipova, Ludmila P.; Lahr, Marta Mirazó n; Gerbault, Pascale; Leavesley, Matthew; Migliano, Andrea Bamberg; Petraglia, Michael; Balanovsky, Oleg; Khusnutdinova, Elza K.; Metspalu, Ene; Thomas, Mark G.; Manica, Andrea; Nielsen, Rasmus; Villems, Richard; Willerslev, Eske; Kivisild, Toomas; Metspalu, Mait

    2016-01-01

    High-Coverage whole-genome sequence studies have so far focused on a limited number of geographically restricted populations, or been targeted at specific diseases, such as cancer. Nevertheless, the availability of high-resolution genomic data has led to the development of new methodologies for inferring population history and refuelled the debate on the mutation rate in humans. Here we present the Estonian Biocentre Human Genome Diversity Panel (EGDP), a dataset of 483 high-coverage human genomes from 148 populations worldwide, including 379 new genomes from 125 populations, which we group into diversity and selection sets. We analyse this dataset to refine estimates of continent-wide patterns of heterozygosity, long-and short-distance gene flow, archaic admixture, and changes in effective population size through time as well as for signals of positive or balancing selection. We find a genetic signature in present-day Papuans that suggests that at least 2% of their genome originates from an early and largely extinct expansion of anatomically modern humans (AMHs) out of Africa. Together with evidence from the western Asian fossil record, and admixture between AMHs and Neanderthals predating the main Eurasian expansion, our results contribute to the mounting evidence for the presence of AMHs out of Africa earlier than 75,000 years ago. © 2016 Macmillan Publishers Limited, part of Springer Nature.

  17. Significance of functional disease-causal/susceptible variants identified by whole-genome analyses for the understanding of human diseases.

    Science.gov (United States)

    Hitomi, Yuki; Tokunaga, Katsushi

    2017-01-01

    Human genome variation may cause differences in traits and disease risks. Disease-causal/susceptible genes and variants for both common and rare diseases can be detected by comprehensive whole-genome analyses, such as whole-genome sequencing (WGS), using next-generation sequencing (NGS) technology and genome-wide association studies (GWAS). Here, in addition to the application of an NGS as a whole-genome analysis method, we summarize approaches for the identification of functional disease-causal/susceptible variants from abundant genetic variants in the human genome and methods for evaluating their functional effects in human diseases, using an NGS and in silico and in vitro functional analyses. We also discuss the clinical applications of the functional disease causal/susceptible variants to personalized medicine.

  18. Genome wide analyses of metal responsive genes in Caenorhabditis elegans

    Directory of Open Access Journals (Sweden)

    Michael eAschner

    2012-04-01

    Full Text Available Metals are major contaminants that influence human health. Many metals have physiologic roles, but excessive levels can be harmful. Advances in technology have made toxicogenomic analyses possible to characterize the effects of metal exposure on the entire genome. Much of what is known about cellular responses to metals has come from mammalian systems; however the use of non-mammalian species is gaining wider attention. Caenorhabditis elegans (C. elegans is a small round worm whose genome has been fully sequenced and its development from egg to adult is well characterized. It is an attractive model for high throughput screens due to its short lifespan, ease of genetic mutability, low cost and high homology with humans. Research performed in C. elegans has led to insights in apoptosis, gene expression and neurodegeneration, all of which can be altered by metal exposure. Additionally, by using worms one can potentially study how the mechanisms that underline differential responses to metals in nematodes and humans, allowing for identification of novel pathways and therapeutic targets. In this review, toxicogenomic studies performed in C. elegans exposed to various metals will be discussed, highlighting how this non-mammalian system can be utilized to study cellular processes and pathways induced by metals. Recent work focusing on neurodegeneration in Parkinson’s disease will be discussed as an example of the usefulness of genetic screens in C. elegans and the novel findings that can be produced.

  19. Complete chloroplast genome sequence of Elodea canadensis and comparative analyses with other monocot plastid genomes.

    Science.gov (United States)

    Huotari, Tea; Korpelainen, Helena

    2012-10-15

    Elodea canadensis is an aquatic angiosperm native to North America. It has attracted great attention due to its invasive nature when transported to new areas in its non-native range. We have determined the complete nucleotide sequence of the chloroplast (cp) genome of Elodea. Taxonomically Elodea is a basal monocot, and only few monocot cp genomes representing early lineages of monocots have been sequenced so far. The genome is a circular double-stranded DNA molecule 156,700 bp in length, and has a typical structure with large (LSC 86,194 bp) and small (SSC 17,810 bp) single-copy regions separated by a pair of inverted repeats (IRs 26,348 bp each). The Elodea cp genome contains 113 unique genes and 16 duplicated genes in the IR regions. A comparative analysis showed that the gene order and organization of the Elodea cp genome is almost identical to that of Amborella trichopoda, a basal angiosperm. The structure of IRs in Elodea is unique among monocot species with the whole cp genome sequenced. In Elodea and another monocot Lemna minor the borders between IRs and LSC are located upstream of rps 19 gene and downstream of trnH-GUG gene, while in most monocots, IR has extended to include both trnH and rps 19 genes. A phylogenetic analysis conducted using Bayesian method, based on the DNA sequences of 81 chloroplast genes from 17 monocot taxa provided support for the placement of Elodea together with Lemna as a basal monocot and the next diverging lineage of monocots after Acorales. In comparison with other monocots, the Elodea cp genome has gone through only few rearrangements or gene losses. IR of Elodea has a unique structure among the monocot species studied so far as its structure is similar to that of a basal angiosperm Amborella. This result together with phylogenetic analyses supports the placement of Elodea as a basal monocot to the next diverging lineage of monocots after Acorales. So far, only few cp genomes representing early lineages of monocots have been

  20. The complete chloroplast genome sequences of Lychnis wilfordii and Silene capitata and comparative analyses with other Caryophyllaceae genomes.

    Science.gov (United States)

    Kang, Jong-Soo; Lee, Byoung Yoon; Kwak, Myounghai

    2017-01-01

    The complete chloroplast genomes of Lychnis wilfordii and Silene capitata were determined and compared with ten previously reported Caryophyllaceae chloroplast genomes. The chloroplast genome sequences of L. wilfordii and S. capitata contain 152,320 bp and 150,224 bp, respectively. The gene contents and orders among 12 Caryophyllaceae species are consistent, but several microstructural changes have occurred. Expansion of the inverted repeat (IR) regions at the large single copy (LSC)/IRb and small single copy (SSC)/IR boundaries led to partial or entire gene duplications. Additionally, rearrangements of the LSC region were caused by gene inversions and/or transpositions. The 18 kb inversions, which occurred three times in different lineages of tribe Sileneae, were thought to be facilitated by the intermolecular duplicated sequences. Sequence analyses of the L. wilfordii and S. capitata genomes revealed 39 and 43 repeats, respectively, including forward, palindromic, and reverse repeats. In addition, a total of 67 and 56 simple sequence repeats were discovered in the L. wilfordii and S. capitata chloroplast genomes, respectively. Finally, we constructed phylogenetic trees of the 12 Caryophyllaceae species and two Amaranthaceae species based on 73 protein-coding genes using both maximum parsimony and likelihood methods.

  1. Chloroplast phylogenomic analyses resolve deep-level relationships of an intractable bamboo tribe Arundinarieae (poaceae).

    Science.gov (United States)

    Ma, Peng-Fei; Zhang, Yu-Xiao; Zeng, Chun-Xia; Guo, Zhen-Hua; Li, De-Zhu

    2014-11-01

    The temperate woody bamboos constitute a distinct tribe Arundinarieae (Poaceae: Bambusoideae) with high species diversity. Estimating phylogenetic relationships among the 11 major lineages of Arundinarieae has been particularly difficult, owing to a possible rapid radiation and the extremely low rate of sequence divergence. Here, we explore the use of chloroplast genome sequencing for phylogenetic inference. We sampled 25 species (22 temperate bamboos and 3 outgroups) for the complete genome representing eight major lineages of Arundinarieae in an attempt to resolve backbone relationships. Phylogenetic analyses of coding versus noncoding sequences, and of different regions of the genome (large single copy and small single copy, and inverted repeat regions) yielded no well-supported contradicting topologies but potential incongruence was found between the coding and noncoding sequences. The use of various data partitioning schemes in analysis of the complete sequences resulted in nearly identical topologies and node support values, although the partitioning schemes were decisively different from each other as to the fit to the data. Our full genomic data set substantially increased resolution along the backbone and provided strong support for most relationships despite the very short internodes and long branches in the tree. The inferred relationships were also robust to potential confounding factors (e.g., long-branch attraction) and received support from independent indels in the genome. We then added taxa from the three Arundinarieae lineages that were not included in the full-genome data set; each of these were sampled for more than 50% genome sequences. The resulting trees not only corroborated the reconstructed deep-level relationships but also largely resolved the phylogenetic placements of these three additional lineages. Furthermore, adding 129 additional taxa sampled for only eight chloroplast loci to the combined data set yielded almost identical

  2. Deciphering the Cryptic Genome: Genome-wide Analyses of the Rice Pathogen Fusarium fujikuroi Reveal Complex Regulation of Secondary Metabolism and Novel Metabolites

    Science.gov (United States)

    Studt, Lena; Niehaus, Eva-Maria; Espino, Jose J.; Huß, Kathleen; Michielse, Caroline B.; Albermann, Sabine; Wagner, Dominik; Bergner, Sonja V.; Connolly, Lanelle R.; Fischer, Andreas; Reuter, Gunter; Kleigrewe, Karin; Bald, Till; Wingfield, Brenda D.; Ophir, Ron; Freeman, Stanley; Hippler, Michael; Smith, Kristina M.; Brown, Daren W.; Proctor, Robert H.; Münsterkötter, Martin; Freitag, Michael; Humpf, Hans-Ulrich; Güldener, Ulrich; Tudzynski, Bettina

    2013-01-01

    The fungus Fusarium fujikuroi causes “bakanae” disease of rice due to its ability to produce gibberellins (GAs), but it is also known for producing harmful mycotoxins. However, the genetic capacity for the whole arsenal of natural compounds and their role in the fungus' interaction with rice remained unknown. Here, we present a high-quality genome sequence of F. fujikuroi that was assembled into 12 scaffolds corresponding to the 12 chromosomes described for the fungus. We used the genome sequence along with ChIP-seq, transcriptome, proteome, and HPLC-FTMS-based metabolome analyses to identify the potential secondary metabolite biosynthetic gene clusters and to examine their regulation in response to nitrogen availability and plant signals. The results indicate that expression of most but not all gene clusters correlate with proteome and ChIP-seq data. Comparison of the F. fujikuroi genome to those of six other fusaria revealed that only a small number of gene clusters are conserved among these species, thus providing new insights into the divergence of secondary metabolism in the genus Fusarium. Noteworthy, GA biosynthetic genes are present in some related species, but GA biosynthesis is limited to F. fujikuroi, suggesting that this provides a selective advantage during infection of the preferred host plant rice. Among the genome sequences analyzed, one cluster that includes a polyketide synthase gene (PKS19) and another that includes a non-ribosomal peptide synthetase gene (NRPS31) are unique to F. fujikuroi. The metabolites derived from these clusters were identified by HPLC-FTMS-based analyses of engineered F. fujikuroi strains overexpressing cluster genes. In planta expression studies suggest a specific role for the PKS19-derived product during rice infection. Thus, our results indicate that combined comparative genomics and genome-wide experimental analyses identified novel genes and secondary metabolites that contribute to the evolutionary success of F

  3. Arthropod phylogenetics in light of three novel millipede (myriapoda: diplopoda mitochondrial genomes with comments on the appropriateness of mitochondrial genome sequence data for inferring deep level relationships.

    Directory of Open Access Journals (Sweden)

    Michael S Brewer

    Full Text Available BACKGROUND: Arthropods are the most diverse group of eukaryotic organisms, but their phylogenetic relationships are poorly understood. Herein, we describe three mitochondrial genomes representing orders of millipedes for which complete genomes had not been characterized. Newly sequenced genomes are combined with existing data to characterize the protein coding regions of myriapods and to attempt to reconstruct the evolutionary relationships within the Myriapoda and Arthropoda. RESULTS: The newly sequenced genomes are similar to previously characterized millipede sequences in terms of synteny and length. Unique translocations occurred within the newly sequenced taxa, including one half of the Appalachioria falcifera genome, which is inverted with respect to other millipede genomes. Across myriapods, amino acid conservation levels are highly dependent on the gene region. Additionally, individual loci varied in the level of amino acid conservation. Overall, most gene regions showed low levels of conservation at many sites. Attempts to reconstruct the evolutionary relationships suffered from questionable relationships and low support values. Analyses of phylogenetic informativeness show the lack of signal deep in the trees (i.e., genes evolve too quickly. As a result, the myriapod tree resembles previously published results but lacks convincing support, and, within the arthropod tree, well established groups were recovered as polyphyletic. CONCLUSIONS: The novel genome sequences described herein provide useful genomic information concerning millipede groups that had not been investigated. Taken together with existing sequences, the variety of compositions and evolution of myriapod mitochondrial genomes are shown to be more complex than previously thought. Unfortunately, the use of mitochondrial protein-coding regions in deep arthropod phylogenetics appears problematic, a result consistent with previously published studies. Lack of phylogenetic

  4. Arthropod phylogenetics in light of three novel millipede (myriapoda: diplopoda) mitochondrial genomes with comments on the appropriateness of mitochondrial genome sequence data for inferring deep level relationships.

    Science.gov (United States)

    Brewer, Michael S; Swafford, Lynn; Spruill, Chad L; Bond, Jason E

    2013-01-01

    Arthropods are the most diverse group of eukaryotic organisms, but their phylogenetic relationships are poorly understood. Herein, we describe three mitochondrial genomes representing orders of millipedes for which complete genomes had not been characterized. Newly sequenced genomes are combined with existing data to characterize the protein coding regions of myriapods and to attempt to reconstruct the evolutionary relationships within the Myriapoda and Arthropoda. The newly sequenced genomes are similar to previously characterized millipede sequences in terms of synteny and length. Unique translocations occurred within the newly sequenced taxa, including one half of the Appalachioria falcifera genome, which is inverted with respect to other millipede genomes. Across myriapods, amino acid conservation levels are highly dependent on the gene region. Additionally, individual loci varied in the level of amino acid conservation. Overall, most gene regions showed low levels of conservation at many sites. Attempts to reconstruct the evolutionary relationships suffered from questionable relationships and low support values. Analyses of phylogenetic informativeness show the lack of signal deep in the trees (i.e., genes evolve too quickly). As a result, the myriapod tree resembles previously published results but lacks convincing support, and, within the arthropod tree, well established groups were recovered as polyphyletic. The novel genome sequences described herein provide useful genomic information concerning millipede groups that had not been investigated. Taken together with existing sequences, the variety of compositions and evolution of myriapod mitochondrial genomes are shown to be more complex than previously thought. Unfortunately, the use of mitochondrial protein-coding regions in deep arthropod phylogenetics appears problematic, a result consistent with previously published studies. Lack of phylogenetic signal renders the resulting tree topologies as suspect

  5. Complete mitochondrial genome sequences of three bats species and whole genome mitochondrial analyses reveal patterns of codon bias and lend support to a basal split in Chiroptera.

    Science.gov (United States)

    Meganathan, P R; Pagan, Heidi J T; McCulloch, Eve S; Stevens, Richard D; Ray, David A

    2012-01-15

    Order Chiroptera is a unique group of mammals whose members have attained self-powered flight as their main mode of locomotion. Much speculation persists regarding bat evolution; however, lack of sufficient molecular data hampers evolutionary and conservation studies. Of ~1200 species, complete mitochondrial genome sequences are available for only eleven. Additional sequences should be generated if we are to resolve many questions concerning these fascinating mammals. Herein, we describe the complete mitochondrial genomes of three bats: Corynorhinus rafinesquii, Lasiurus borealis and Artibeus lituratus. We also compare the currently available mitochondrial genomes and analyze codon usage in Chiroptera. C. rafinesquii, L. borealis and A. lituratus mitochondrial genomes are 16438 bp, 17048 bp and 16709 bp, respectively. Genome organization and gene arrangements are similar to other bats. Phylogenetic analyses using complete mitochondrial genome sequences support previously established phylogenetic relationships and suggest utility in future studies focusing on the evolutionary aspects of these species. Comprehensive analyses of available bat mitochondrial genomes reveal distinct nucleotide patterns and synonymous codon preferences corresponding to different chiropteran families. These patterns suggest that mutational and selection forces are acting to different extents within Chiroptera and shape their mitochondrial genomes. Copyright © 2011 Elsevier B.V. All rights reserved.

  6. Comparative analyses identified species-specific functional roles in oral microbial genomes

    Science.gov (United States)

    Chen, Tsute; Gajare, Prasad; Olsen, Ingar; Dewhirst, Floyd E.

    2017-01-01

    ABSTRACT The advent of next generation sequencing is producing more genomic sequences for various strains of many human oral microbial species and allows for insightful functional comparisons at both intra- and inter-species levels. This study performed in-silico functional comparisons for currently available genomic sequences of major species associated with periodontitis including Aggregatibacter actinomycetemcomitans (AA), Porphyromonas gingivalis (PG), Treponema denticola (TD), and Tannerella forsythia (TF), as well as several cariogenic and commensal streptococcal species. Complete or draft sequences were annotated with the RAST to infer structured functional subsystems for each genome. The subsystems profiles were clustered to groups of functions with similar patterns. Functional enrichment and depletion were evaluated based on hypergeometric distribution to identify subsystems that are unique or missing between two groups of genomes. Unique or missing metabolic pathways and biological functions were identified in different species. For example, components involved in flagellar motility were found only in the motile species TD, as expected, with few exceptions scattered in several streptococcal species, likely associated with chemotaxis. Transposable elements were only found in the two Bacteroidales species PG and TF, and half of the AA genomes. Genes involved in CRISPR were prevalent in most oral species. Furthermore, prophage related subsystems were also commonly found in most species except for PG and Streptococcus mutans, in which very few genomes contain prophage components. Comparisons between pathogenic (P) and nonpathogenic (NP) genomes also identified genes potentially important for virulence. Two such comparisons were performed between AA (P) and several A. aphrophilus (NP) strains, and between S. mutans + S. sobrinus (P) and other oral streptococcal species (NP). This comparative genomics approach can be readily used to identify functions unique to

  7. Level II Ergonomic Analyses, Dover AFB, DE

    Science.gov (United States)

    1999-02-01

    IERA-RS-BR-TR-1999-0002 UNITED STATES AIR FORCE IERA Level II Ergonomie Analyses, Dover AFB, DE Andrew Marcotte Marilyn Joyce The Joyce...Project (070401881, Washington, DC 20503. 1. AGENCY USE ONLY (Leave blank) 2. REPORT DATE 4. TITLE AND SUBTITLE Level II Ergonomie Analyses, Dover...1.0 INTRODUCTION 1-1 1.1 Purpose Of The Level II Ergonomie Analyses : 1-1 1.2 Approach 1-1 1.2.1 Initial Shop Selection and Administration of the

  8. Impacts of Genome-Wide Analyses on Our Understanding of Human Herpesvirus Diversity and Evolution.

    Science.gov (United States)

    Renner, Daniel W; Szpara, Moriah L

    2018-01-01

    Until fairly recently, genome-wide evolutionary dynamics and within-host diversity were more commonly examined in the context of small viruses than in the context of large double-stranded DNA viruses such as herpesviruses. The high mutation rates and more compact genomes of RNA viruses have inspired the investigation of population dynamics for these species, and recent data now suggest that herpesviruses might also be considered candidates for population modeling. High-throughput sequencing (HTS) and bioinformatics have expanded our understanding of herpesviruses through genome-wide comparisons of sequence diversity, recombination, allele frequency, and selective pressures. Here we discuss recent data on the mechanisms that generate herpesvirus genomic diversity and underlie the evolution of these virus families. We focus on human herpesviruses, with key insights drawn from veterinary herpesviruses and other large DNA virus families. We consider the impacts of cell culture on herpesvirus genomes and how to accurately describe the viral populations under study. The need for a strong foundation of high-quality genomes is also discussed, since it underlies all secondary genomic analyses such as RNA sequencing (RNA-Seq), chromatin immunoprecipitation, and ribosome profiling. Areas where we foresee future progress, such as the linking of viral genetic differences to phenotypic or clinical outcomes, are highlighted as well. Copyright © 2017 Renner and Szpara.

  9. Impacts of Genome-Wide Analyses on Our Understanding of Human Herpesvirus Diversity and Evolution

    Science.gov (United States)

    Renner, Daniel W.

    2017-01-01

    ABSTRACT Until fairly recently, genome-wide evolutionary dynamics and within-host diversity were more commonly examined in the context of small viruses than in the context of large double-stranded DNA viruses such as herpesviruses. The high mutation rates and more compact genomes of RNA viruses have inspired the investigation of population dynamics for these species, and recent data now suggest that herpesviruses might also be considered candidates for population modeling. High-throughput sequencing (HTS) and bioinformatics have expanded our understanding of herpesviruses through genome-wide comparisons of sequence diversity, recombination, allele frequency, and selective pressures. Here we discuss recent data on the mechanisms that generate herpesvirus genomic diversity and underlie the evolution of these virus families. We focus on human herpesviruses, with key insights drawn from veterinary herpesviruses and other large DNA virus families. We consider the impacts of cell culture on herpesvirus genomes and how to accurately describe the viral populations under study. The need for a strong foundation of high-quality genomes is also discussed, since it underlies all secondary genomic analyses such as RNA sequencing (RNA-Seq), chromatin immunoprecipitation, and ribosome profiling. Areas where we foresee future progress, such as the linking of viral genetic differences to phenotypic or clinical outcomes, are highlighted as well. PMID:29046445

  10. Dense and accurate whole-chromosome haplotyping of individual genomes

    NARCIS (Netherlands)

    Porubsky, David; Garg, Shilpa; Sanders, Ashley D.; Korbel, Jan O.; Guryev, Victor; Lansdorp, Peter M.; Marschall, Tobias

    2017-01-01

    The diploid nature of the human genome is neglected in many analyses done today, where a genome is perceived as a set of unphased variants with respect to a reference genome. This lack of haplotype-level analyses can be explained by a lack of methods that can produce dense and accurate

  11. Comparative Genomics and Transcriptomics Analyses Reveal Divergent Lifestyle Features of Nematode Endoparasitic Fungus Hirsutella minnesotensis

    Science.gov (United States)

    Lai, Yiling; Liu, Keke; Zhang, Xinyu; Zhang, Xiaoling; Li, Kuan; Wang, Niuniu; Shu, Chi; Wu, Yunpeng; Wang, Chengshu; Bushley, Kathryn E.; Xiang, Meichun; Liu, Xingzhong

    2014-01-01

    Hirsutella minnesotensis [Ophiocordycipitaceae (Hypocreales, Ascomycota)] is a dominant endoparasitic fungus by using conidia that adhere to and penetrate the secondary stage juveniles of soybean cyst nematode. Its genome was de novo sequenced and compared with five entomopathogenic fungi in the Hypocreales and three nematode-trapping fungi in the Orbiliales (Ascomycota). The genome of H. minnesotensis is 51.4 Mb and encodes 12,702 genes enriched with transposable elements up to 32%. Phylogenomic analysis revealed that H. minnesotensis was diverged from entomopathogenic fungi in Hypocreales. Genome of H. minnesotensis is similar to those of entomopathogenic fungi to have fewer genes encoding lectins for adhesion and glycoside hydrolases for cellulose degradation, but is different from those of nematode-trapping fungi to possess more genes for protein degradation, signal transduction, and secondary metabolism. Those results indicate that H. minnesotensis has evolved different mechanism for nematode endoparasitism compared with nematode-trapping fungi. Transcriptomics analyses for the time-scale parasitism revealed the upregulations of lectins, secreted proteases and the genes for biosynthesis of secondary metabolites that could be putatively involved in host surface adhesion, cuticle degradation, and host manipulation. Genome and transcriptome analyses provided comprehensive understanding of the evolution and lifestyle of nematode endoparasitism. PMID:25359922

  12. Genome Sequence of Azospirillum brasilense CBG497 and Comparative Analyses of Azospirillum Core and Accessory Genomes provide Insight into Niche Adaptation

    Science.gov (United States)

    Wisniewski-Dyé, Florence; Lozano, Luis; Acosta-Cruz, Erika; Borland, Stéphanie; Drogue, Benoît; Prigent-Combaret, Claire; Rouy, Zoé; Barbe, Valérie; Mendoza Herrera, Alberto; González, Victor; Mavingui, Patrick

    2012-01-01

    Bacteria of the genus Azospirillum colonize roots of important cereals and grasses, and promote plant growth by several mechanisms, notably phytohormone synthesis. The genomes of several Azospirillum strains belonging to different species, isolated from various host plants and locations, were recently sequenced and published. In this study, an additional genome of an A. brasilense strain, isolated from maize grown on an alkaline soil in the northeast of Mexico, strain CBG497, was obtained. Comparative genomic analyses were performed on this new genome and three other genomes (A. brasilense Sp245, A. lipoferum 4B and Azospirillum sp. B510). The Azospirillum core genome was established and consists of 2,328 proteins, representing between 30% to 38% of the total encoded proteins within a genome. It is mainly chromosomally-encoded and contains 74% of genes of ancestral origin shared with some aquatic relatives. The non-ancestral part of the core genome is enriched in genes involved in signal transduction, in transport and in metabolism of carbohydrates and amino-acids, and in surface properties features linked to adaptation in fluctuating environments, such as soil and rhizosphere. Many genes involved in colonization of plant roots, plant-growth promotion (such as those involved in phytohormone biosynthesis), and properties involved in rhizosphere adaptation (such as catabolism of phenolic compounds, uptake of iron) are restricted to a particular strain and/or species, strongly suggesting niche-specific adaptation. PMID:24705077

  13. Genome Sequence of Azospirillum brasilense CBG497 and Comparative Analyses of Azospirillum Core and Accessory Genomes provide Insight into Niche Adaptation

    Directory of Open Access Journals (Sweden)

    Victor González

    2012-09-01

    Full Text Available Bacteria of the genus Azospirillum colonize roots of important cereals and grasses, and promote plant growth by several mechanisms, notably phytohormone synthesis. The genomes of several Azospirillum strains belonging to different species, isolated from various host plants and locations, were recently sequenced and published. In this study, an additional genome of an A. brasilense strain, isolated from maize grown on an alkaline soil in the northeast of Mexico, strain CBG497, was obtained. Comparative genomic analyses were performed on this new genome and three other genomes (A. brasilense Sp245, A. lipoferum 4B and Azospirillum sp. B510. The Azospirillum core genome was established and consists of 2,328 proteins, representing between 30% to 38% of the total encoded proteins within a genome. It is mainly chromosomally-encoded and contains 74% of genes of ancestral origin shared with some aquatic relatives. The non-ancestral part of the core genome is enriched in genes involved in signal transduction, in transport and in metabolism of carbohydrates and amino-acids, and in surface properties features linked to adaptation in fluctuating environments, such as soil and rhizosphere. Many genes involved in colonization of plant roots, plant-growth promotion (such as those involved in phytohormone biosynthesis, and properties involved in rhizosphere adaptation (such as catabolism of phenolic compounds, uptake of iron are restricted to a particular strain and/or species, strongly suggesting niche-specific adaptation.

  14. Application of the inter-line PCR for the analyse of genomic rearrangements in radiation-transformed mammalian cell lines; Anwendung der Inter-Line PCR zur Analyse von genomischen Veraenderungen in strahlentransformierten Saeugerzellinien

    Energy Technology Data Exchange (ETDEWEB)

    Leibhard, S.; Smida, J. [Muenchen Univ. (Germany). Strahlenbiologisches Inst.; Eckardt-Schupp, F.; Hieber, L. [GSF-Inst. fuer Strahlenbiologie, Oberschleissheim (Germany)

    1996-12-31

    Repetitive DNA sequences of the LINE-family (long interspersed elements) that are widely distributed among the mammalian genome can be activated or altered by the exposure to ionizing radiation [1]. By the integration at new sites in the genome alterations in the expression of genes that are involved in cell transformation and/or carcinogenesis may occur [2, 3]. A new technique - the inter-LINE PCR - has been developed in order to detect and analyse such genomic rearrangements in radiation-transformed cell lines. From the sites of transformation- or tumour-specific changes in the genome it might be possible to develop new tumour markers for diagnostic purpose. (orig.) [Deutsch] Repetitive DNA-Sequenzen der LINE-Familie, die weit verbreitet im Genom von Saeugerzellen vorkommen, koennen durch Exposition mit ionisierender Strahlung aktiviert und veraendert werden [1]. Durch eine Neu- bzw. Reintegration an anderen Positionen im Genom kann es zu bedeutenden Veraenderungen im Genom der Zelle kommen. Die Expression von Genen, die bei den Prozessen der Zelltransformation bzw. der Karzinogenese beteiligt sind, kann dadurch veraendert werden [2, 3]. Mithilfe der von uns entwickelten Inter-LINE PCR und der anschliessenden Analyse der veraenderten Produktmuster nach gelelektrophoretischer Auftrennung koennen solche `genomic rearrangements` unter Beteiligung von LINE-Elementen untersucht und naeher charakterisiert werden. Durch Klonierung und Sequenzierung transformations- bzw. tumorspezifischer PCR-Produkte sollte es moeglich sein Tumormarker fuer diagnostische Zwecke zu entwickeln. Die Methode wurde fuer die Analyse von Zellen des Syrischen Hamster aufgebaut, sie ist jedoch universell fuer alle Saeuger anwendbar. (orig.)

  15. Comparative chloroplast genomics: Analyses including new sequencesfrom the angiosperms Nuphar advena and Ranunculus macranthus

    Energy Technology Data Exchange (ETDEWEB)

    Raubeso, Linda A.; Peery, Rhiannon; Chumley, Timothy W.; Dziubek,Chris; Fourcade, H. Matthew; Boore, Jeffrey L.; Jansen, Robert K.

    2007-03-01

    The number of completely sequenced plastid genomes available is growing rapidly. This new array of sequences presents new opportunities to perform comparative analyses. In comparative studies, it is most useful to compare across wide phylogenetic spans and, within angiosperms, to include representatives from basally diverging lineages such as the new genomes reported here: Nuphar advena (from a basal-most lineage) and Ranunculus macranthus (from the basal group of eudicots). We report these two new plastid genome sequences and make comparisons (within angiosperms, seed plants, or all photosynthetic lineages) to evaluate features such as the status of ycf15 and ycf68 as protein coding genes, the distribution of simple sequence repeats (SSRs) and longer dispersed repeats (SDR), and patterns of nucleotide composition.

  16. Analysing human genomes at different scales

    DEFF Research Database (Denmark)

    Liu, Siyang

    The thriving of the Next-Generation sequencing (NGS) technologies in the past decade has dramatically revolutionized the field of human genetics. We are experiencing a wave of several large-scale whole genome sequencing studies of humans in the world. Those studies vary greatly regarding cohort...... will be reflected by the analysis of real data. This thesis covers studies in two human genome sequencing projects that distinctly differ in terms of studied population, sample size and sequencing depth. In the first project, we sequenced 150 Danish individuals from 50 trio families to 78x coverage....... The sophisticated experimental design enables high-quality de novo assembly of the genomes and provides a good opportunity for mapping the structural variations in the human population. We developed the AsmVar approach to discover, genotype and characterize the structural variations from the assemblies. Our...

  17. Comparative analyses of plastid genomes from fourteen Cornales species: inferences for phylogenetic relationships and genome evolution.

    Science.gov (United States)

    Fu, Chao-Nan; Li, Hong-Tao; Milne, Richard; Zhang, Ting; Ma, Peng-Fei; Yang, Jing; Li, De-Zhu; Gao, Lian-Ming

    2017-12-08

    The Cornales is the basal lineage of the asterids, the largest angiosperm clade. Phylogenetic relationships within the order were previously not fully resolved. Fifteen plastid genomes representing 14 species, ten genera and seven families of Cornales were newly sequenced for comparative analyses of genome features, evolution, and phylogenomics based on different partitioning schemes and filtering strategies. All plastomes of the 14 Cornales species had the typical quadripartite structure with a genome size ranging from 156,567 bp to 158,715 bp, which included two inverted repeats (25,859-26,451 bp) separated by a large single-copy region (86,089-87,835 bp) and a small single-copy region (18,250-18,856 bp) region. These plastomes encoded the same set of 114 unique genes including 31 transfer RNA, 4 ribosomal RNA and 79 coding genes, with an identical gene order across all examined Cornales species. Two genes (rpl22 and ycf15) contained premature stop codons in seven and five species respectively. The phylogenetic relationships among all sampled species were fully resolved with maximum support. Different filtering strategies (none, light and strict) of sequence alignment did not have an effect on these relationships. The topology recovered from coding and noncoding data sets was the same as for the whole plastome, regardless of filtering strategy. Moreover, mutational hotspots and highly informative regions were identified. Phylogenetic relationships among families and intergeneric relationships within family of Cornales were well resolved. Different filtering strategies and partitioning schemes do not influence the relationships. Plastid genomes have great potential to resolve deep phylogenetic relationships of plants.

  18. Mobile Genome Express (MGE: A comprehensive automatic genetic analyses pipeline with a mobile device.

    Directory of Open Access Journals (Sweden)

    Jun-Hee Yoon

    Full Text Available The development of next-generation sequencing (NGS technology allows to sequence whole exomes or genome. However, data analysis is still the biggest bottleneck for its wide implementation. Most laboratories still depend on manual procedures for data handling and analyses, which translates into a delay and decreased efficiency in the delivery of NGS results to doctors and patients. Thus, there is high demand for developing an automatic and an easy-to-use NGS data analyses system. We developed comprehensive, automatic genetic analyses controller named Mobile Genome Express (MGE that works in smartphones or other mobile devices. MGE can handle all the steps for genetic analyses, such as: sample information submission, sequencing run quality check from the sequencer, secured data transfer and results review. We sequenced an Actrometrix control DNA containing multiple proven human mutations using a targeted sequencing panel, and the whole analysis was managed by MGE, and its data reviewing program called ELECTRO. All steps were processed automatically except for the final sequencing review procedure with ELECTRO to confirm mutations. The data analysis process was completed within several hours. We confirmed the mutations that we have identified were consistent with our previous results obtained by using multi-step, manual pipelines.

  19. Mobile Genome Express (MGE): A comprehensive automatic genetic analyses pipeline with a mobile device.

    Science.gov (United States)

    Yoon, Jun-Hee; Kim, Thomas W; Mendez, Pedro; Jablons, David M; Kim, Il-Jin

    2017-01-01

    The development of next-generation sequencing (NGS) technology allows to sequence whole exomes or genome. However, data analysis is still the biggest bottleneck for its wide implementation. Most laboratories still depend on manual procedures for data handling and analyses, which translates into a delay and decreased efficiency in the delivery of NGS results to doctors and patients. Thus, there is high demand for developing an automatic and an easy-to-use NGS data analyses system. We developed comprehensive, automatic genetic analyses controller named Mobile Genome Express (MGE) that works in smartphones or other mobile devices. MGE can handle all the steps for genetic analyses, such as: sample information submission, sequencing run quality check from the sequencer, secured data transfer and results review. We sequenced an Actrometrix control DNA containing multiple proven human mutations using a targeted sequencing panel, and the whole analysis was managed by MGE, and its data reviewing program called ELECTRO. All steps were processed automatically except for the final sequencing review procedure with ELECTRO to confirm mutations. The data analysis process was completed within several hours. We confirmed the mutations that we have identified were consistent with our previous results obtained by using multi-step, manual pipelines.

  20. Whole-Genome Analyses of Korean Native and Holstein Cattle Breeds by Massively Parallel Sequencing

    Science.gov (United States)

    Stothard, Paul; Chung, Won-Hyong; Jeon, Heoyn-Jeong; Miller, Stephen P.; Choi, So-Young; Lee, Jeong-Koo; Yang, Bokyoung; Lee, Kyung-Tai; Han, Kwang-Jin; Kim, Hyeong-Cheol; Jeong, Dongkee; Oh, Jae-Don; Kim, Namshin; Kim, Tae-Hun; Lee, Hak-Kyo; Lee, Sung-Jin

    2014-01-01

    A main goal of cattle genomics is to identify DNA differences that account for variations in economically important traits. In this study, we performed whole-genome analyses of three important cattle breeds in Korea—Hanwoo, Jeju Heugu, and Korean Holstein—using the Illumina HiSeq 2000 sequencing platform. We achieved 25.5-, 29.6-, and 29.5-fold coverage of the Hanwoo, Jeju Heugu, and Korean Holstein genomes, respectively, and identified a total of 10.4 million single nucleotide polymorphisms (SNPs), of which 54.12% were found to be novel. We also detected 1,063,267 insertions–deletions (InDels) across the genomes (78.92% novel). Annotations of the datasets identified a total of 31,503 nonsynonymous SNPs and 859 frameshift InDels that could affect phenotypic variations in traits of interest. Furthermore, genome-wide copy number variation regions (CNVRs) were detected by comparing the Hanwoo, Jeju Heugu, and previously published Chikso genomes against that of Korean Holstein. A total of 992, 284, and 1881 CNVRs, respectively, were detected throughout the genome. Moreover, 53, 65, 45, and 82 putative regions of homozygosity (ROH) were identified in Hanwoo, Jeju Heugu, Chikso, and Korean Holstein respectively. The results of this study provide a valuable foundation for further investigations to dissect the molecular mechanisms underlying variation in economically important traits in cattle and to develop genetic markers for use in cattle breeding. PMID:24992012

  1. Genome-Wide Identification, Phylogenetic and Expression Analyses of the Ubiquitin-Conjugating Enzyme Gene Family in Maize

    Science.gov (United States)

    Jue, Dengwei; Sang, Xuelian; Lu, Shengqiao; Dong, Chen; Zhao, Qiufang; Chen, Hongliang; Jia, Liqiang

    2015-01-01

    Background Ubiquitination is a post-translation modification where ubiquitin is attached to a substrate. Ubiquitin-conjugating enzymes (E2s) play a major role in the ubiquitin transfer pathway, as well as a variety of functions in plant biological processes. To date, no genome-wide characterization of this gene family has been conducted in maize (Zea mays). Methodology/Principal Findings In the present study, a total of 75 putative ZmUBC genes have been identified and located in the maize genome. Phylogenetic analysis revealed that ZmUBC proteins could be divided into 15 subfamilies, which include 13 ubiquitin-conjugating enzymes (ZmE2s) and two independent ubiquitin-conjugating enzyme variant (UEV) groups. The predicted ZmUBC genes were distributed across 10 chromosomes at different densities. In addition, analysis of exon-intron junctions and sequence motifs in each candidate gene has revealed high levels of conservation within and between phylogenetic groups. Tissue expression analysis indicated that most ZmUBC genes were expressed in at least one of the tissues, indicating that these are involved in various physiological and developmental processes in maize. Moreover, expression profile analyses of ZmUBC genes under different stress treatments (4°C, 20% PEG6000, and 200 mM NaCl) and various expression patterns indicated that these may play crucial roles in the response of plants to stress. Conclusions Genome-wide identification, chromosome organization, gene structure, evolutionary and expression analyses of ZmUBC genes have facilitated in the characterization of this gene family, as well as determined its potential involvement in growth, development, and stress responses. This study provides valuable information for better understanding the classification and putative functions of the UBC-encoding genes of maize. PMID:26606743

  2. Comparison of genome-wide association methods in analyses of admixed populations with complex familial relationships.

    Directory of Open Access Journals (Sweden)

    Naveen K Kadri

    Full Text Available Population structure is known to cause false-positive detection in association studies. We compared the power, precision, and type-I error rates of various association models in analyses of a simulated dataset with structure at the population (admixture from two populations; P and family (K levels. We also compared type-I error rates among models in analyses of publicly available human and dog datasets. The models corrected for none, one, or both structure levels. Correction for K was performed with linear mixed models incorporating familial relationships estimated from pedigrees or genetic markers. Linear models that ignored K were also tested. Correction for P was performed using principal component or structured association analysis. In analyses of simulated and real data, linear mixed models that corrected for K were able to control for type-I error, regardless of whether they also corrected for P. In contrast, correction for P alone in linear models was insufficient. The power and precision of linear mixed models with and without correction for P were similar. Furthermore, power, precision, and type-I error rate were comparable in linear mixed models incorporating pedigree and genomic relationships. In summary, in association studies using samples with both P and K, ancestries estimated using principal components or structured assignment were not sufficient to correct type-I errors. In such cases type-I errors may be controlled by use of linear mixed models with relationships derived from either pedigree or from genetic markers.

  3. Measuring the Levels of Ribonucleotides Embedded in Genomic DNA.

    Science.gov (United States)

    Meroni, Alice; Nava, Giulia M; Sertic, Sarah; Plevani, Paolo; Muzi-Falconi, Marco; Lazzaro, Federico

    2018-01-01

    Ribonucleotides (rNTPs) are incorporated into genomic DNA at a relatively high frequency during replication. They have beneficial effects but, if not removed from the chromosomes, increase genomic instability. Here, we describe a fast method to easily estimate the amounts of embedded ribonucleotides into the genome. The protocol described is performed in Saccharomyces cerevisiae and allows us to quantify altered levels of rNMPs due to different mutations in the replicative polymerase ε. However, this protocol can be easily applied to cells derived from any organism.

  4. Clinical, polysomnographic and genome-wide association analyses of narcolepsy with cataplexy

    DEFF Research Database (Denmark)

    Luca, Gianina; Haba-Rubio, José; Dauvilliers, Yves

    2013-01-01

    diagnosed according to International Classification of Sleep Disorders-2. Demographic and clinical characteristics, polysomnography and multiple sleep latency test data, hypocretin-1 levels, and genome-wide genotypes were available. We found a significantly lower age at sleepiness onset (men versus women...

  5. Genome and metagenome enabled analyses reveal new insight into the global biogeography and potential urea utilization in marine Thaumarchaeota.

    Science.gov (United States)

    Ahlgren, N.; Parada, A. E.; Fuhrman, J. A.

    2016-02-01

    Marine Thaumarchaea are an abundant, important group of marine microbial communities as they fix carbon, oxidize ammonium, and thus contribute to key N and C cycles in the oceans. From an enrichment culture, we have sequenced the complete genome of a new Thaumarchaeota strain, SPOT01. Analysis of this genome and other Thaumarchaeal genomes contributes new insight into its role in N cycling and clarifies the broader biogeography of marine Thaumarchaeal genera. Phylogenomics of Thaumarchaeota genomes reveal coherent separation into clusters roughly equivalent to the genus level, and SPOT01 represents a new genus of marine Thaumarchaea. Competitive fragment recruitment of globally distributed metagenomes from TARA, Ocean Sampling Day, and those generated from a station off California shows that the SPOT01 genus is often the most abundant genus, especially where total Thaumarchaea are most abundant in the overall community. The SPOT01 genome contains urease genes allowing it to use an alternative form of N. Genomic and metagenomic analysis also reveal that among planktonic genomes and populations, the urease genes in general are more frequently found in members of the SPOT01 genus and another genus dominant in deep waters, thus we predict these two genera contribute most significantly to urea utilization among marine Thaumarchaea. Recruitment also revealed broader biogeographic and ecological patterns of the putative genera. The SPOT01 genus was most abundant at colder temperatures (45 degrees). The genus containing Nitrosopumilus maritimus had the highest temperature range, and the genus containing Candidatus Nitrosopelagicus brevis was typically most abundant at intermediate temperatures and intermediate latitudes ( 35-45 degrees). Together these genome and metagenome enabled analyses provide significant new insight into the ecology and biogeochemical contributions of marine archaea.

  6. The mitochondrial genome of Frankliniella intonsa: insights into the evolution of mitochondrial genomes at lower taxonomic levels in Thysanoptera.

    Science.gov (United States)

    Yan, Dankan; Tang, Yunxia; Hu, Min; Liu, Fengquan; Zhang, Dongfang; Fan, Jiaqin

    2014-10-01

    Thrips is an ideal group for studying the evolution of mitochondrial (mt) genomes in the genus and family due to independent rearrangements within this order. The complete sequence of the mitochondrial DNA (mtDNA) of the flower thrips Frankliniella intonsa has been completed and annotated in this study. The circular genome is 15,215bp in length with an A+T content of 75.9% and contains the typical 37 genes and it has triplicate putative control regions. Nucleotide composition is A+T biased, and the majority of the protein-coding genes present opposite CG skew which is reflected by the nucleotide composition, codon and amino acid usage. Although the known thrips have massive gene rearrangements, it showed no reversal of strand asymmetry. Gene rearrangements have been found in the lower taxonomic levels of thrips. Three tRNA genes were translocated in the genus Frankliniella and eight tRNA genes in the family Thripidae. Although the gene arrangements of mt genomes of all three thrips species differ massively from the ancestral insect, they are all very similar to each other, indicating that there was a large rearrangement somewhere before the most recent common ancestor of these three species and very little genomic evolution or rearrangements after then. The extremely similar sequences among the CRs suggest that they are ongoing concerted evolution. Analyses of the up and downstream sequence of CRs reveal that the CR2 is actually the ancestral CR. The three CRs are in the same spot in each of the three thrips mt genomes which have the identical inverted genes. These characteristics might be obtained from the most recent common ancestor of this three thrips. Above observations suggest that the mt genomes of the three thrips keep a single massive rearrangement from the common ancestor and have low evolutionary rates among them. Copyright © 2014 Elsevier Inc. All rights reserved.

  7. Genome-level homology and phylogeny of Shewanella (Gammaproteobacteria: lteromonadales: Shewanellaceae

    Directory of Open Access Journals (Sweden)

    Dikow Rebecca B

    2011-05-01

    Full Text Available Abstract Background The explosion in availability of whole genome data provides the opportunity to build phylogenetic hypotheses based on these data as well as the ability to learn more about the genomes themselves. The biological history of genes and genomes can be investigated based on the taxomonic history provided by the phylogeny. A phylogenetic hypothesis based on complete genome data is presented for the genus Shewanella (Gammaproteobacteria: Alteromonadales: Shewanellaceae. Nineteen taxa from Shewanella (16 species and 3 additional strains of one species as well as three outgroup species representing the genera Aeromonas (Gammaproteobacteria: Aeromonadales: Aeromonadaceae, Alteromonas (Gammaproteobacteria: Alteromonadales: Alteromonadaceae and Colwellia (Gammaproteobacteria: Alteromonadales: Colwelliaceae are included for a total of 22 taxa. Results Putatively homologous regions were found across unannotated genomes and tested with a phylogenetic analysis. Two genome-wide data-sets are considered, one including only those genomic regions for which all taxa are represented, which included 3,361,015 aligned nucleotide base-pairs (bp and a second that additionally includes those regions present in only subsets of taxa, which totaled 12,456,624 aligned bp. Alignment columns in these large data-sets were then randomly sampled to create smaller data-sets. After the phylogenetic hypothesis was generated, genome annotations were projected onto the DNA sequence alignment to compare the historical hypothesis generated by the phylogeny with the functional hypothesis posited by annotation. Conclusions Individual phylogenetic analyses of the 243 locally co-linear genome regions all failed to recover the genome topology, but the smaller data-sets that were random samplings of the large concatenated alignments all produced the genome topology. It is shown that there is not a single orthologous copy of 16S rRNA across the taxon sampling included in this

  8. Cross-disorder genome-wide analyses suggest a complex genetic relationship between Tourette's syndrome and OCD.

    Science.gov (United States)

    Yu, Dongmei; Mathews, Carol A; Scharf, Jeremiah M; Neale, Benjamin M; Davis, Lea K; Gamazon, Eric R; Derks, Eske M; Evans, Patrick; Edlund, Christopher K; Crane, Jacquelyn; Fagerness, Jesen A; Osiecki, Lisa; Gallagher, Patience; Gerber, Gloria; Haddad, Stephen; Illmann, Cornelia; McGrath, Lauren M; Mayerfeld, Catherine; Arepalli, Sampath; Barlassina, Cristina; Barr, Cathy L; Bellodi, Laura; Benarroch, Fortu; Berrió, Gabriel Bedoya; Bienvenu, O Joseph; Black, Donald W; Bloch, Michael H; Brentani, Helena; Bruun, Ruth D; Budman, Cathy L; Camarena, Beatriz; Campbell, Desmond D; Cappi, Carolina; Silgado, Julio C Cardona; Cavallini, Maria C; Chavira, Denise A; Chouinard, Sylvain; Cook, Edwin H; Cookson, M R; Coric, Vladimir; Cullen, Bernadette; Cusi, Daniele; Delorme, Richard; Denys, Damiaan; Dion, Yves; Eapen, Valsama; Egberts, Karin; Falkai, Peter; Fernandez, Thomas; Fournier, Eduardo; Garrido, Helena; Geller, Daniel; Gilbert, Donald L; Girard, Simon L; Grabe, Hans J; Grados, Marco A; Greenberg, Benjamin D; Gross-Tsur, Varda; Grünblatt, Edna; Hardy, John; Heiman, Gary A; Hemmings, Sian M J; Herrera, Luis D; Hezel, Dianne M; Hoekstra, Pieter J; Jankovic, Joseph; Kennedy, James L; King, Robert A; Konkashbaev, Anuar I; Kremeyer, Barbara; Kurlan, Roger; Lanzagorta, Nuria; Leboyer, Marion; Leckman, James F; Lennertz, Leonhard; Liu, Chunyu; Lochner, Christine; Lowe, Thomas L; Lupoli, Sara; Macciardi, Fabio; Maier, Wolfgang; Manunta, Paolo; Marconi, Maurizio; McCracken, James T; Mesa Restrepo, Sandra C; Moessner, Rainald; Moorjani, Priya; Morgan, Jubel; Muller, Heike; Murphy, Dennis L; Naarden, Allan L; Nurmi, Erika; Ochoa, William Cornejo; Ophoff, Roel A; Pakstis, Andrew J; Pato, Michele T; Pato, Carlos N; Piacentini, John; Pittenger, Christopher; Pollak, Yehuda; Rauch, Scott L; Renner, Tobias; Reus, Victor I; Richter, Margaret A; Riddle, Mark A; Robertson, Mary M; Romero, Roxana; Rosário, Maria C; Rosenberg, David; Ruhrmann, Stephan; Sabatti, Chiara; Salvi, Erika; Sampaio, Aline S; Samuels, Jack; Sandor, Paul; Service, Susan K; Sheppard, Brooke; Singer, Harvey S; Smit, Jan H; Stein, Dan J; Strengman, Eric; Tischfield, Jay A; Turiel, Maurizio; Valencia Duarte, Ana V; Vallada, Homero; Veenstra-VanderWeele, Jeremy; Walitza, Susanne; Wang, Ying; Weale, Mike; Weiss, Robert; Wendland, Jens R; Westenberg, Herman G M; Shugart, Yin Yao; Hounie, Ana G; Miguel, Euripedes C; Nicolini, Humberto; Wagner, Michael; Ruiz-Linares, Andres; Cath, Danielle C; McMahon, William; Posthuma, Danielle; Oostra, Ben A; Nestadt, Gerald; Rouleau, Guy A; Purcell, Shaun; Jenike, Michael A; Heutink, Peter; Hanna, Gregory L; Conti, David V; Arnold, Paul D; Freimer, Nelson B; Stewart, S Evelyn; Knowles, James A; Cox, Nancy J; Pauls, David L

    2015-01-01

    Obsessive-compulsive disorder (OCD) and Tourette's syndrome are highly heritable neurodevelopmental disorders that are thought to share genetic risk factors. However, the identification of definitive susceptibility genes for these etiologically complex disorders remains elusive. The authors report a combined genome-wide association study (GWAS) of Tourette's syndrome and OCD. The authors conducted a GWAS in 2,723 cases (1,310 with OCD, 834 with Tourette's syndrome, 579 with OCD plus Tourette's syndrome/chronic tics), 5,667 ancestry-matched controls, and 290 OCD parent-child trios. GWAS summary statistics were examined for enrichment of functional variants associated with gene expression levels in brain regions. Polygenic score analyses were conducted to investigate the genetic architecture within and across the two disorders. Although no individual single-nucleotide polymorphisms (SNPs) achieved genome-wide significance, the GWAS signals were enriched for SNPs strongly associated with variations in brain gene expression levels (expression quantitative loci, or eQTLs), suggesting the presence of true functional variants that contribute to risk of these disorders. Polygenic score analyses identified a significant polygenic component for OCD (p=2×10(-4)), predicting 3.2% of the phenotypic variance in an independent data set. In contrast, Tourette's syndrome had a smaller, nonsignificant polygenic component, predicting only 0.6% of the phenotypic variance (p=0.06). No significant polygenic signal was detected across the two disorders, although the sample is likely underpowered to detect a modest shared signal. Furthermore, the OCD polygenic signal was significantly attenuated when cases with both OCD and co-occurring Tourette's syndrome/chronic tics were included in the analysis (p=0.01). Previous work has shown that Tourette's syndrome and OCD have some degree of shared genetic variation. However, the data from this study suggest that there are also distinct

  9. Genome-wide Comparative Analyses Reveal the Dynamic Evolution of Nucleotide-Binding Leucine-Rich Repeat Gene Family among Solanaceae Plants

    Directory of Open Access Journals (Sweden)

    Eunyoung Seo

    2016-08-01

    Full Text Available Plants have evolved an elaborate innate immune system against invading pathogens. Within this system, intracellular nucleotide-binding leucine-rich repeat (NLR immune receptors are known play critical roles in effector-triggered immunity (ETI plant defense. We performed genome-wide identification and classification of NLR-coding sequences from the genomes of pepper, tomato, and potato using fixed criteria. We then compared genomic duplication and evolution features. We identified intact 267, 443, and 755 NLR-encoding genes in tomato, potato, and pepper genomes, respectively. Phylogenetic analyses and classification of Solanaceae NLRs revealed that the majority of NLR super family members fell into 14 subgroups, including a TIR-NLR (TNL subgroup and 13 non-TNL subgroups. Specific subgroups have expanded in each genome, with the expansion in pepper showing subgroup-specific physical clusters. Comparative analysis of duplications showed distinct duplication patterns within pepper and among Solanaceae plants suggesting subgroup- or species-specific gene duplication events after speciation, resulting in divergent evolution. Taken together, genome-wide analyses of NLR family members provide insights into their evolutionary history in Solanaceae. These findings also provide important foundational knowledge for understanding NLR evolution and will empower broader characterization of disease resistance genes to be used for crop breeding.

  10. An object model for genome information at all levels of resolution

    Energy Technology Data Exchange (ETDEWEB)

    Honda, S.; Parrott, N.W.; Smith, R.; Lawrence, C.

    1993-12-31

    An object model for genome data at all levels of resolution is described. The model was derived by considering the requirements for representing genome related objects in three application domains: genome maps, large-scale DNA sequencing, and exploring functional information in gene and protein sequences. The methodology used for the object-oriented analysis is also described.

  11. Application of the inter-line PCR for the analyse of genomic rearrangements in radiation-transformed mammalian cell lines

    International Nuclear Information System (INIS)

    Leibhard, S.; Smida, J.

    1996-01-01

    Repetitive DNA sequences of the LINE-family (long interspersed elements) that are widely distributed among the mammalian genome can be activated or altered by the exposure to ionizing radiation [1]. By the integration at new sites in the genome alterations in the expression of genes that are involved in cell transformation and/or carcinogenesis may occur [2, 3]. A new technique -the inter-LINE PCR - has been developed in order to detect and analyse such genomic rearrangements in radiation-transformed cell lines. From the sites of transformation- or tumour-specific changes in the genome it might be possible to develop new tumour markers for diagnostic purpose. (orig.) [de

  12. Big Data Analysis of Human Genome Variations

    KAUST Repository

    Gojobori, Takashi

    2016-01-01

    Since the human genome draft sequence was in public for the first time in 2000, genomic analyses have been intensively extended to the population level. The following three international projects are good examples for large-scale studies of human

  13. Resolution effects in reconstructing ancestral genomes.

    Science.gov (United States)

    Zheng, Chunfang; Jeong, Yuji; Turcotte, Madisyn Gabrielle; Sankoff, David

    2018-05-09

    The reconstruction of ancestral genomes must deal with the problem of resolution, necessarily involving a trade-off between trying to identify genomic details and being overwhelmed by noise at higher resolutions. We use the median reconstruction at the synteny block level, of the ancestral genome of the order Gentianales, based on coffee, Rhazya stricta and grape, to exemplify the effects of resolution (granularity) on comparative genomic analyses. We show how decreased resolution blurs the differences between evolving genomes, with respect to rate, mutational process and other characteristics.

  14. Analyses of charophyte chloroplast genomes help characterize the ancestral chloroplast genome of land plants.

    Science.gov (United States)

    Civaň, Peter; Foster, Peter G; Embley, Martin T; Séneca, Ana; Cox, Cymon J

    2014-04-01

    Despite the significance of the relationships between embryophytes and their charophyte algal ancestors in deciphering the origin and evolutionary success of land plants, few chloroplast genomes of the charophyte algae have been reconstructed to date. Here, we present new data for three chloroplast genomes of the freshwater charophytes Klebsormidium flaccidum (Klebsormidiophyceae), Mesotaenium endlicherianum (Zygnematophyceae), and Roya anglica (Zygnematophyceae). The chloroplast genome of Klebsormidium has a quadripartite organization with exceptionally large inverted repeat (IR) regions and, uniquely among streptophytes, has lost the rrn5 and rrn4.5 genes from the ribosomal RNA (rRNA) gene cluster operon. The chloroplast genome of Roya differs from other zygnematophycean chloroplasts, including the newly sequenced Mesotaenium, by having a quadripartite structure that is typical of other streptophytes. On the basis of the improbability of the novel gain of IR regions, we infer that the quadripartite structure has likely been lost independently in at least three zygnematophycean lineages, although the absence of the usual rRNA operonic synteny in the IR regions of Roya may indicate their de novo origin. Significantly, all zygnematophycean chloroplast genomes have undergone substantial genomic rearrangement, which may be the result of ancient retroelement activity evidenced by the presence of integrase-like and reverse transcriptase-like elements in the Roya chloroplast genome. Our results corroborate the close phylogenetic relationship between Zygnematophyceae and land plants and identify 89 protein-coding genes and 22 introns present in the chloroplast genome at the time of the evolutionary transition of plants to land, all of which can be found in the chloroplast genomes of extant charophytes.

  15. Gene Set Analyses of Genome-Wide Association Studies on 49 Quantitative Traits Measured in a Single Genetic Epidemiology Dataset

    Directory of Open Access Journals (Sweden)

    Jihye Kim

    2013-09-01

    Full Text Available Gene set analysis is a powerful tool for interpreting a genome-wide association study result and is gaining popularity these days. Comparison of the gene sets obtained for a variety of traits measured from a single genetic epidemiology dataset may give insights into the biological mechanisms underlying these traits. Based on the previously published single nucleotide polymorphism (SNP genotype data on 8,842 individuals enrolled in the Korea Association Resource project, we performed a series of systematic genome-wide association analyses for 49 quantitative traits of basic epidemiological, anthropometric, or blood chemistry parameters. Each analysis result was subjected to subsequent gene set analyses based on Gene Ontology (GO terms using gene set analysis software, GSA-SNP, identifying a set of GO terms significantly associated to each trait (pcorr < 0.05. Pairwise comparison of the traits in terms of the semantic similarity in their GO sets revealed surprising cases where phenotypically uncorrelated traits showed high similarity in terms of biological pathways. For example, the pH level was related to 7 other traits that showed low phenotypic correlations with it. A literature survey implies that these traits may be regulated partly by common pathways that involve neuronal or nerve systems.

  16. Molecular cytogenetic and genomic analyses reveal new insights into the origin of the wheat B genome.

    Science.gov (United States)

    Zhang, Wei; Zhang, Mingyi; Zhu, Xianwen; Cao, Yaping; Sun, Qing; Ma, Guojia; Chao, Shiaoman; Yan, Changhui; Xu, Steven S; Cai, Xiwen

    2018-02-01

    This work pinpointed the goatgrass chromosomal segment in the wheat B genome using modern cytogenetic and genomic technologies, and provided novel insights into the origin of the wheat B genome. Wheat is a typical allopolyploid with three homoeologous subgenomes (A, B, and D). The donors of the subgenomes A and D had been identified, but not for the subgenome B. The goatgrass Aegilops speltoides (genome SS) has been controversially considered a possible candidate for the donor of the wheat B genome. However, the relationship of the Ae. speltoides S genome with the wheat B genome remains largely obscure. The present study assessed the homology of the B and S genomes using an integrative cytogenetic and genomic approach, and revealed the contribution of Ae. speltoides to the origin of the wheat B genome. We discovered noticeable homology between wheat chromosome 1B and Ae. speltoides chromosome 1S, but not between other chromosomes in the B and S genomes. An Ae. speltoides-originated segment spanning a genomic region of approximately 10.46 Mb was detected on the long arm of wheat chromosome 1B (1BL). The Ae. speltoides-originated segment on 1BL was found to co-evolve with the rest of the B genome. Evidently, Ae. speltoides had been involved in the origin of the wheat B genome, but should not be considered an exclusive donor of this genome. The wheat B genome might have a polyphyletic origin with multiple ancestors involved, including Ae. speltoides. These novel findings will facilitate genome studies in wheat and other polyploids.

  17. Human social genomics.

    Directory of Open Access Journals (Sweden)

    Steven W Cole

    2014-08-01

    Full Text Available A growing literature in human social genomics has begun to analyze how everyday life circumstances influence human gene expression. Social-environmental conditions such as urbanity, low socioeconomic status, social isolation, social threat, and low or unstable social status have been found to associate with differential expression of hundreds of gene transcripts in leukocytes and diseased tissues such as metastatic cancers. In leukocytes, diverse types of social adversity evoke a common conserved transcriptional response to adversity (CTRA characterized by increased expression of proinflammatory genes and decreased expression of genes involved in innate antiviral responses and antibody synthesis. Mechanistic analyses have mapped the neural "social signal transduction" pathways that stimulate CTRA gene expression in response to social threat and may contribute to social gradients in health. Research has also begun to analyze the functional genomics of optimal health and thriving. Two emerging opportunities now stand to revolutionize our understanding of the everyday life of the human genome: network genomics analyses examining how systems-level capabilities emerge from groups of individual socially sensitive genomes and near-real-time transcriptional biofeedback to empirically optimize individual well-being in the context of the unique genetic, geographic, historical, developmental, and social contexts that jointly shape the transcriptional realization of our innate human genomic potential for thriving.

  18. Adaptation, Ecology, and Evolution of the Halophilic Stromatolite Archaeon Halococcus hamelinensis Inferred through Genome Analyses

    Directory of Open Access Journals (Sweden)

    Reema K. Gudhka

    2015-01-01

    Full Text Available Halococcus hamelinensis was the first archaeon isolated from stromatolites. These geomicrobial ecosystems are thought to be some of the earliest known on Earth, yet, despite their evolutionary significance, the role of Archaea in these systems is still not well understood. Detailed here is the genome sequencing and analysis of an archaeon isolated from stromatolites. The genome of H. hamelinensis consisted of 3,133,046 base pairs with an average G+C content of 60.08% and contained 3,150 predicted coding sequences or ORFs, 2,196 (68.67% of which were protein-coding genes with functional assignments and 954 (29.83% of which were of unknown function. Codon usage of the H. hamelinensis genome was consistent with a highly acidic proteome, a major adaptive mechanism towards high salinity. Amino acid transport and metabolism, inorganic ion transport and metabolism, energy production and conversion, ribosomal structure, and unknown function COG genes were overrepresented. The genome of H. hamelinensis also revealed characteristics reflecting its survival in its extreme environment, including putative genes/pathways involved in osmoprotection, oxidative stress response, and UV damage repair. Finally, genome analyses indicated the presence of putative transposases as well as positive matches of genes of H. hamelinensis against various genomes of Bacteria, Archaea, and viruses, suggesting the potential for horizontal gene transfer.

  19. Genome-wide analyses reveal a role for peptide hormones in planarian germline development.

    Directory of Open Access Journals (Sweden)

    James J Collins

    Full Text Available Bioactive peptides (i.e., neuropeptides or peptide hormones represent the largest class of cell-cell signaling molecules in metazoans and are potent regulators of neural and physiological function. In vertebrates, peptide hormones play an integral role in endocrine signaling between the brain and the gonads that controls reproductive development, yet few of these molecules have been shown to influence reproductive development in invertebrates. Here, we define a role for peptide hormones in controlling reproductive physiology of the model flatworm, the planarian Schmidtea mediterranea. Based on our observation that defective neuropeptide processing results in defects in reproductive system development, we employed peptidomic and functional genomic approaches to characterize the planarian peptide hormone complement, identifying 51 prohormone genes and validating 142 peptides biochemically. Comprehensive in situ hybridization analyses of prohormone gene expression revealed the unanticipated complexity of the flatworm nervous system and identified a prohormone specifically expressed in the nervous system of sexually reproducing planarians. We show that this member of the neuropeptide Y superfamily is required for the maintenance of mature reproductive organs and differentiated germ cells in the testes. Additionally, comparative analyses of our biochemically validated prohormones with the genomes of the parasitic flatworms Schistosoma mansoni and Schistosoma japonicum identified new schistosome prohormones and validated half of all predicted peptide-encoding genes in these parasites. These studies describe the peptide hormone complement of a flatworm on a genome-wide scale and reveal a previously uncharacterized role for peptide hormones in flatworm reproduction. Furthermore, they suggest new opportunities for using planarians as free-living models for understanding the reproductive biology of flatworm parasites.

  20. Comparative genomics reveals insights into avian genome evolution and adaptation

    Science.gov (United States)

    Zhang, Guojie; Li, Cai; Li, Qiye; Li, Bo; Larkin, Denis M.; Lee, Chul; Storz, Jay F.; Antunes, Agostinho; Greenwold, Matthew J.; Meredith, Robert W.; Ödeen, Anders; Cui, Jie; Zhou, Qi; Xu, Luohao; Pan, Hailin; Wang, Zongji; Jin, Lijun; Zhang, Pei; Hu, Haofu; Yang, Wei; Hu, Jiang; Xiao, Jin; Yang, Zhikai; Liu, Yang; Xie, Qiaolin; Yu, Hao; Lian, Jinmin; Wen, Ping; Zhang, Fang; Li, Hui; Zeng, Yongli; Xiong, Zijun; Liu, Shiping; Zhou, Long; Huang, Zhiyong; An, Na; Wang, Jie; Zheng, Qiumei; Xiong, Yingqi; Wang, Guangbiao; Wang, Bo; Wang, Jingjing; Fan, Yu; da Fonseca, Rute R.; Alfaro-Núñez, Alonzo; Schubert, Mikkel; Orlando, Ludovic; Mourier, Tobias; Howard, Jason T.; Ganapathy, Ganeshkumar; Pfenning, Andreas; Whitney, Osceola; Rivas, Miriam V.; Hara, Erina; Smith, Julia; Farré, Marta; Narayan, Jitendra; Slavov, Gancho; Romanov, Michael N; Borges, Rui; Machado, João Paulo; Khan, Imran; Springer, Mark S.; Gatesy, John; Hoffmann, Federico G.; Opazo, Juan C.; Håstad, Olle; Sawyer, Roger H.; Kim, Heebal; Kim, Kyu-Won; Kim, Hyeon Jeong; Cho, Seoae; Li, Ning; Huang, Yinhua; Bruford, Michael W.; Zhan, Xiangjiang; Dixon, Andrew; Bertelsen, Mads F.; Derryberry, Elizabeth; Warren, Wesley; Wilson, Richard K; Li, Shengbin; Ray, David A.; Green, Richard E.; O’Brien, Stephen J.; Griffin, Darren; Johnson, Warren E.; Haussler, David; Ryder, Oliver A.; Willerslev, Eske; Graves, Gary R.; Alström, Per; Fjeldså, Jon; Mindell, David P.; Edwards, Scott V.; Braun, Edward L.; Rahbek, Carsten; Burt, David W.; Houde, Peter; Zhang, Yong; Yang, Huanming; Wang, Jian; Jarvis, Erich D.; Gilbert, M. Thomas P.; Wang, Jun

    2015-01-01

    Birds are the most species-rich class of tetrapod vertebrates and have wide relevance across many research fields. We explored bird macroevolution using full genomes from 48 avian species representing all major extant clades. The avian genome is principally characterized by its constrained size, which predominantly arose because of lineage-specific erosion of repetitive elements, large segmental deletions, and gene loss. Avian genomes furthermore show a remarkably high degree of evolutionary stasis at the levels of nucleotide sequence, gene synteny, and chromosomal structure. Despite this pattern of conservation, we detected many non-neutral evolutionary changes in protein-coding genes and noncoding regions. These analyses reveal that pan-avian genomic diversity covaries with adaptations to different lifestyles and convergent evolution of traits. PMID:25504712

  1. Whole-genome analyses of DS-1-like human G2P[4] and G8P[4] rotavirus strains from Eastern, Western and Southern Africa.

    Science.gov (United States)

    Nyaga, Martin M; Stucker, Karla M; Esona, Mathew D; Jere, Khuzwayo C; Mwinyi, Bakari; Shonhai, Annie; Tsolenyanu, Enyonam; Mulindwa, Augustine; Chibumbya, Julia N; Adolfine, Hokororo; Halpin, Rebecca A; Roy, Sunando; Stockwell, Timothy B; Berejena, Chipo; Seheri, Mapaseka L; Mwenda, Jason M; Steele, A Duncan; Wentworth, David E; Mphahlele, M Jeffrey

    2014-10-01

    Group A rotaviruses (RVAs) with distinct G and P genotype combinations have been reported globally. We report the genome composition and possible origin of seven G8P[4] and five G2P[4] human RVA strains based on the genetic evolution of all 11 genome segments at the nucleotide level. Twelve RVA ELISA positive stool samples collected in the representative countries of Eastern, Southern and West Africa during the 2007-2012 surveillance seasons were subjected to sequencing using the Ion Torrent PGM and Illumina MiSeq platforms. A reference-based assembly was performed using CLC Bio's clc_ref_assemble_long program, and full-genome consensus sequences were obtained. With the exception of the neutralising antigen, VP7, all study strains exhibited the DS-1-like genome constellation (P[4]-I2-R2-C2-M2-A2-N2-T2-E2-H2) and clustered phylogenetically with reference strains having a DS-1-like genetic backbone. Comparison of the nucleotide and amino acid sequences with selected global cognate genome segments revealed nucleotide and amino acid sequence identities of 81.7-100 % and 90.6-100 %, respectively, with NSP4 gene segment showing the most diversity among the strains. Bayesian analyses of all gene sequences to estimate the time of divergence of the lineage indicated that divergence times ranged from 16 to 44 years, except for the NSP4 gene where the lineage seemed to arise in the more distant past at an estimated 203 years ago. However, the long-term effects of changes found within the NSP4 genome segment should be further explored, and thus we recommend continued whole-genome analyses from larger sample sets to determine the evolutionary mechanisms of the DS-1-like strains collected in Africa.

  2. Genome-wide meta-analyses identify multiple loci associated with smoking behavior.

    LENUS (Irish Health Repository)

    2010-05-01

    Consistent but indirect evidence has implicated genetic factors in smoking behavior. We report meta-analyses of several smoking phenotypes within cohorts of the Tobacco and Genetics Consortium (n = 74,053). We also partnered with the European Network of Genetic and Genomic Epidemiology (ENGAGE) and Oxford-GlaxoSmithKline (Ox-GSK) consortia to follow up the 15 most significant regions (n > 140,000). We identified three loci associated with number of cigarettes smoked per day. The strongest association was a synonymous 15q25 SNP in the nicotinic receptor gene CHRNA3 (rs1051730[A], beta = 1.03, standard error (s.e.) = 0.053, P = 2.8 x 10(-73)). Two 10q25 SNPs (rs1329650[G], beta = 0.367, s.e. = 0.059, P = 5.7 x 10(-10); and rs1028936[A], beta = 0.446, s.e. = 0.074, P = 1.3 x 10(-9)) and one 9q13 SNP in EGLN2 (rs3733829[G], beta = 0.333, s.e. = 0.058, P = 1.0 x 10(-8)) also exceeded genome-wide significance for cigarettes per day. For smoking initiation, eight SNPs exceeded genome-wide significance, with the strongest association at a nonsynonymous SNP in BDNF on chromosome 11 (rs6265[C], odds ratio (OR) = 1.06, 95% confidence interval (Cl) 1.04-1.08, P = 1.8 x 10(-8)). One SNP located near DBH on chromosome 9 (rs3025343[G], OR = 1.12, 95% Cl 1.08-1.18, P = 3.6 x 10(-8)) was significantly associated with smoking cessation.

  3. Comparative analyses reveal high levels of conserved colinearity between the finger millet and rice genomes.

    Science.gov (United States)

    Srinivasachary; Dida, Mathews M; Gale, Mike D; Devos, Katrien M

    2007-08-01

    Finger millet is an allotetraploid (2n = 4x = 36) grass that belongs to the Chloridoideae subfamily. A comparative analysis has been carried out to determine the relationship of the finger millet genome with that of rice. Six of the nine finger millet homoeologous groups corresponded to a single rice chromosome each. Each of the remaining three finger millet groups were orthologous to two rice chromosomes, and in all the three cases one rice chromosome was inserted into the centromeric region of a second rice chromosome to give the finger millet chromosomal configuration. All observed rearrangements were, among the grasses, unique to finger millet and, possibly, the Chloridoideae subfamily. Gene orders between rice and finger millet were highly conserved, with rearrangements being limited largely to single marker transpositions and small putative inversions encompassing at most three markers. Only some 10% of markers mapped to non-syntenic positions in rice and finger millet and the majority of these were located in the distal 14% of chromosome arms, supporting a possible correlation between recombination and sequence evolution as has previously been observed in wheat. A comparison of the organization of finger millet, Panicoideae and Pooideae genomes relative to rice allowed us to infer putative ancestral chromosome configurations in the grasses.

  4. Exceptionally high levels of recombination across the honey bee genome.

    Science.gov (United States)

    Beye, Martin; Gattermeier, Irene; Hasselmann, Martin; Gempe, Tanja; Schioett, Morten; Baines, John F; Schlipalius, David; Mougel, Florence; Emore, Christine; Rueppell, Olav; Sirviö, Anu; Guzmán-Novoa, Ernesto; Hunt, Greg; Solignac, Michel; Page, Robert E

    2006-11-01

    The first draft of the honey bee genome sequence and improved genetic maps are utilized to analyze a genome displaying 10 times higher levels of recombination (19 cM/Mb) than previously analyzed genomes of higher eukaryotes. The exceptionally high recombination rate is distributed genome-wide, but varies by two orders of magnitude. Analysis of chromosome, sequence, and gene parameters with respect to recombination showed that local recombination rate is associated with distance to the telomere, GC content, and the number of simple repeats as described for low-recombining genomes. Recombination rate does not decrease with chromosome size. On average 5.7 recombination events per chromosome pair per meiosis are found in the honey bee genome. This contrasts with a wide range of taxa that have a uniform recombination frequency of about 1.6 per chromosome pair. The excess of recombination activity does not support a mechanistic role of recombination in stabilizing pairs of homologous chromosome during chromosome pairing. Recombination rate is associated with gene size, suggesting that introns are larger in regions of low recombination and may improve the efficacy of selection in these regions. Very few transposons and no retrotransposons are present in the high-recombining genome. We propose evolutionary explanations for the exceptionally high genome-wide recombination rate.

  5. Big Data Analysis of Human Genome Variations

    KAUST Repository

    Gojobori, Takashi

    2016-01-25

    Since the human genome draft sequence was in public for the first time in 2000, genomic analyses have been intensively extended to the population level. The following three international projects are good examples for large-scale studies of human genome variations: 1) HapMap Data (1,417 individuals) (http://hapmap.ncbi.nlm.nih.gov/downloads/genotypes/2010-08_phaseII+III/forward/), 2) HGDP (Human Genome Diversity Project) Data (940 individuals) (http://www.hagsc.org/hgdp/files.html), 3) 1000 genomes Data (2,504 individuals) http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/ If we can integrate all three data into a single volume of data, we should be able to conduct a more detailed analysis of human genome variations for a total number of 4,861 individuals (= 1,417+940+2,504 individuals). In fact, we successfully integrated these three data sets by use of information on the reference human genome sequence, and we conducted the big data analysis. In particular, we constructed a phylogenetic tree of about 5,000 human individuals at the genome level. As a result, we were able to identify clusters of ethnic groups, with detectable admixture, that were not possible by an analysis of each of the three data sets. Here, we report the outcome of this kind of big data analyses and discuss evolutionary significance of human genomic variations. Note that the present study was conducted in collaboration with Katsuhiko Mineta and Kosuke Goto at KAUST.

  6. Classification and regression tree (CART) analyses of genomic signatures reveal sets of tetramers that discriminate temperature optima of archaea and bacteria

    Science.gov (United States)

    Dyer, Betsey D.; Kahn, Michael J.; LeBlanc, Mark D.

    2008-01-01

    Classification and regression tree (CART) analysis was applied to genome-wide tetranucleotide frequencies (genomic signatures) of 195 archaea and bacteria. Although genomic signatures have typically been used to classify evolutionary divergence, in this study, convergent evolution was the focus. Temperature optima for most of the organisms examined could be distinguished by CART analyses of tetranucleotide frequencies. This suggests that pervasive (nonlinear) qualities of genomes may reflect certain environmental conditions (such as temperature) in which those genomes evolved. The predominant use of GAGA and AGGA as the discriminating tetramers in CART models suggests that purine-loading and codon biases of thermophiles may explain some of the results. PMID:19054742

  7. Base-By-Base: single nucleotide-level analysis of whole viral genome alignments.

    Science.gov (United States)

    Brodie, Ryan; Smith, Alex J; Roper, Rachel L; Tcherepanov, Vasily; Upton, Chris

    2004-07-14

    With ever increasing numbers of closely related virus genomes being sequenced, it has become desirable to be able to compare two genomes at a level more detailed than gene content because two strains of an organism may share the same set of predicted genes but still differ in their pathogenicity profiles. For example, detailed comparison of multiple isolates of the smallpox virus genome (each approximately 200 kb, with 200 genes) is not feasible without new bioinformatics tools. A software package, Base-By-Base, has been developed that provides visualization tools to enable researchers to 1) rapidly identify and correct alignment errors in large, multiple genome alignments; and 2) generate tabular and graphical output of differences between the genomes at the nucleotide level. Base-By-Base uses detailed annotation information about the aligned genomes and can list each predicted gene with nucleotide differences, display whether variations occur within promoter regions or coding regions and whether these changes result in amino acid substitutions. Base-By-Base can connect to our mySQL database (Virus Orthologous Clusters; VOCs) to retrieve detailed annotation information about the aligned genomes or use information from text files. Base-By-Base enables users to quickly and easily compare large viral genomes; it highlights small differences that may be responsible for important phenotypic differences such as virulence. It is available via the Internet using Java Web Start and runs on Macintosh, PC and Linux operating systems with the Java 1.4 virtual machine.

  8. Gigwa-Genotype investigator for genome-wide analyses.

    Science.gov (United States)

    Sempéré, Guilhem; Philippe, Florian; Dereeper, Alexis; Ruiz, Manuel; Sarah, Gautier; Larmande, Pierre

    2016-06-06

    Exploring the structure of genomes and analyzing their evolution is essential to understanding the ecological adaptation of organisms. However, with the large amounts of data being produced by next-generation sequencing, computational challenges arise in terms of storage, search, sharing, analysis and visualization. This is particularly true with regards to studies of genomic variation, which are currently lacking scalable and user-friendly data exploration solutions. Here we present Gigwa, a web-based tool that provides an easy and intuitive way to explore large amounts of genotyping data by filtering it not only on the basis of variant features, including functional annotations, but also on genotype patterns. The data storage relies on MongoDB, which offers good scalability properties. Gigwa can handle multiple databases and may be deployed in either single- or multi-user mode. In addition, it provides a wide range of popular export formats. The Gigwa application is suitable for managing large amounts of genomic variation data. Its user-friendly web interface makes such processing widely accessible. It can either be simply deployed on a workstation or be used to provide a shared data portal for a given community of researchers.

  9. Complex analyses of inverted repeats in mitochondrial genomes revealed their importance and variability.

    Science.gov (United States)

    Cechová, Jana; Lýsek, Jirí; Bartas, Martin; Brázda, Václav

    2018-04-01

    The NCBI database contains mitochondrial DNA (mtDNA) genomes from numerous species. We investigated the presence and locations of inverted repeat sequences (IRs) in these mtDNA sequences, which are known to be important for regulating nuclear genomes. IRs were identified in mtDNA in all species. IR lengths and frequencies correlate with evolutionary age and the greatest variability was detected in subgroups of plants and fungi and the lowest variability in mammals. IR presence is non-random and evolutionary favoured. The frequency of IRs generally decreased with IR length, but not for IRs 24 or 30 bp long, which are 1.5 times more abundant. IRs are enriched in sequences from the replication origin, followed by D-loop, stem-loop and miscellaneous sequences, pointing to the importance of IRs in regulatory regions of mitochondrial DNA. Data were produced using Palindrome analyser, freely available on the web at http://bioinformatics.ibp.cz. vaclav@ibp.cz. Supplementary data are available at Bioinformatics online.

  10. Functional and comparative genomics analyses of pmp22 in medaka fish

    Directory of Open Access Journals (Sweden)

    Kawarabayasi Yutaka

    2009-06-01

    Full Text Available Abstract Background Pmp22, a member of the junction protein family Claudin/EMP/PMP22, plays an important role in myelin formation. Increase of pmp22 transcription causes peripheral neuropathy, Charcot-Marie-Tooth disease type1A (CMT1A. The pathophysiological phenotype of CMT1A is aberrant axonal myelination which induces a reduction in nerve conduction velocity (NCV. Several CMT1A model rodents have been established by overexpressing pmp22. Thus, it is thought that pmp22 expression must be tightly regulated for correct myelin formation in mammals. Interestingly, the myelin sheath is also present in other jawed vertebrates. The purpose of this study is to analyze the evolutionary conservation of the association between pmp22 transcription level and vertebrate myelin formation, and to find the conserved non-coding sequences for pmp22 regulation by comparative genomics analyses between jawed fishes and mammals. Results A transgenic pmp22 over-expression medaka fish line was established. The transgenic fish had approximately one fifth the peripheral NCV values of controls, and aberrant myelination of transgenic fish in the peripheral nerve system (PNS was observed. We successfully confirmed that medaka fish pmp22 has the same exon-intron structure as mammals, and identified some known conserved regulatory motifs. Furthermore, we found novel conserved sequences in the first intron and 3'UTR. Conclusion Medaka fish undergo abnormalities in the PNS when pmp22 transcription increases. This result indicates that an adequate pmp22 transcription level is necessary for correct myelination of jawed vertebrates. Comparison of pmp22 orthologs between distantly related species identifies evolutionary conserved sequences that contribute to precise regulation of pmp22 expression.

  11. Genome-wide-analyses of Listeria monocytogenes from food-processing plants reveal clonal diversity and date the emergence of persisting sequence types.

    Science.gov (United States)

    Knudsen, Gitte M; Nielsen, Jesper Boye; Marvig, Rasmus L; Ng, Yin; Worning, Peder; Westh, Henrik; Gram, Lone

    2017-08-01

    Whole genome sequencing is increasing used in epidemiology, e.g. for tracing outbreaks of food-borne diseases. This requires in-depth understanding of pathogen emergence, persistence and genomic diversity along the food production chain including in food processing plants. We sequenced the genomes of 80 isolates of Listeria monocytogenes sampled from Danish food processing plants over a time-period of 20 years, and analysed the sequences together with 10 public available reference genomes to advance our understanding of interplant and intraplant genomic diversity of L. monocytogenes. Except for three persisting sequence types (ST) based on Multi Locus Sequence Typing being ST7, ST8 and ST121, long-term persistence of clonal groups was limited, and new clones were introduced continuously, potentially from raw materials. No particular gene could be linked to the persistence phenotype. Using time-based phylogenetic analyses of the persistent STs, we estimate the L. monocytogenes evolutionary rate to be 0.18-0.35 single nucleotide polymorphisms/year, suggesting that the persistent STs emerged approximately 100 years ago, which correlates with the onset of industrialization and globalization of the food market. © 2017 Society for Applied Microbiology and John Wiley & Sons Ltd.

  12. Single-trait and multi-trait genome-wide association analyses identify novel loci for blood pressure in African-ancestry populations

    OpenAIRE

    Liang, Jingjing; Le, Thu H.; Edwards, Digna R. Velez; Tayo, Bamidele O.; Gaulton, Kyle J.; Smith, Jennifer A.; Lu, Yingchang; Jensen, Richard A.; Chen, Guanjie; Yanek, Lisa R.; Schwander, Karen; Tajuddin, Salman M.; Sofer, Tamar; Kim, Wonji; Kayima, James

    2017-01-01

    © 2017 Public Library of Science. All Rights Reserved. Hypertension is a leading cause of global disease, mortality, and disability. While individuals of African descent suffer a disproportionate burden of hypertension and its complications, they have been underrepresented in genetic studies. To identify novel susceptibility loci for blood pressure and hypertension in people of African ancestry, we performed both single and multiple-trait genome-wide association analyses. We analyzed 21 genom...

  13. Genome size evolution at the speciation level: the cryptic species complex Brachionus plicatilis (Rotifera).

    Science.gov (United States)

    Stelzer, Claus-Peter; Riss, Simone; Stadler, Peter

    2011-04-07

    Studies on genome size variation in animals are rarely done at lower taxonomic levels, e.g., slightly above/below the species level. Yet, such variation might provide important clues on the tempo and mode of genome size evolution. In this study we used the flow-cytometry method to study the evolution of genome size in the rotifer Brachionus plicatilis, a cryptic species complex consisting of at least 14 closely related species. We found an unexpectedly high variation in this species complex, with genome sizes ranging approximately seven-fold (haploid '1C' genome sizes: 0.056-0.416 pg). Most of this variation (67%) could be ascribed to the major clades of the species complex, i.e. clades that are well separated according to most species definitions. However, we also found substantial variation (32%) at lower taxonomic levels--within and among genealogical species--and, interestingly, among species pairs that are not completely reproductively isolated. In one genealogical species, called B. 'Austria', we found greatly enlarged genome sizes that could roughly be approximated as multiples of the genomes of its closest relatives, which suggests that whole-genome duplications have occurred early during separation of this lineage. Overall, genome size was significantly correlated to egg size and body size, even though the latter became non-significant after controlling for phylogenetic non-independence. Our study suggests that substantial genome size variation can build up early during speciation, potentially even among isolated populations. An alternative, but not mutually exclusive interpretation might be that reproductive isolation tends to build up unusually slow in this species complex.

  14. Single-trait and multi-trait genome-wide association analyses identify novel loci for blood pressure in African-ancestry populations.

    Directory of Open Access Journals (Sweden)

    Jingjing Liang

    2017-05-01

    Full Text Available Hypertension is a leading cause of global disease, mortality, and disability. While individuals of African descent suffer a disproportionate burden of hypertension and its complications, they have been underrepresented in genetic studies. To identify novel susceptibility loci for blood pressure and hypertension in people of African ancestry, we performed both single and multiple-trait genome-wide association analyses. We analyzed 21 genome-wide association studies comprised of 31,968 individuals of African ancestry, and validated our results with additional 54,395 individuals from multi-ethnic studies. These analyses identified nine loci with eleven independent variants which reached genome-wide significance (P < 1.25×10-8 for either systolic and diastolic blood pressure, hypertension, or for combined traits. Single-trait analyses identified two loci (TARID/TCF21 and LLPH/TMBIM4 and multiple-trait analyses identified one novel locus (FRMD3 for blood pressure. At these three loci, as well as at GRP20/CDH17, associated variants had alleles common only in African-ancestry populations. Functional annotation showed enrichment for genes expressed in immune and kidney cells, as well as in heart and vascular cells/tissues. Experiments driven by these findings and using angiotensin-II induced hypertension in mice showed altered kidney mRNA expression of six genes, suggesting their potential role in hypertension. Our study provides new evidence for genes related to hypertension susceptibility, and the need to study African-ancestry populations in order to identify biologic factors contributing to hypertension.

  15. Genome Size Dynamics and Evolution in Monocots

    Directory of Open Access Journals (Sweden)

    Ilia J. Leitch

    2010-01-01

    Full Text Available Monocot genomic diversity includes striking variation at many levels. This paper compares various genomic characters (e.g., range of chromosome numbers and ploidy levels, occurrence of endopolyploidy, GC content, chromosome packaging and organization, genome size between monocots and the remaining angiosperms to discern just how distinctive monocot genomes are. One of the most notable features of monocots is their wide range and diversity of genome sizes, including the species with the largest genome so far reported in plants. This genomic character is analysed in greater detail, within a phylogenetic context. By surveying available genome size and chromosome data it is apparent that different monocot orders follow distinctive modes of genome size and chromosome evolution. Further insights into genome size-evolution and dynamics were obtained using statistical modelling approaches to reconstruct the ancestral genome size at key nodes across the monocot phylogenetic tree. Such approaches reveal that while the ancestral genome size of all monocots was small (1C=1.9 pg, there have been several major increases and decreases during monocot evolution. In addition, notable increases in the rates of genome size-evolution were found in Asparagales and Poales compared with other monocot lineages.

  16. Base-By-Base: Single nucleotide-level analysis of whole viral genome alignments

    Directory of Open Access Journals (Sweden)

    Tcherepanov Vasily

    2004-07-01

    Full Text Available Abstract Background With ever increasing numbers of closely related virus genomes being sequenced, it has become desirable to be able to compare two genomes at a level more detailed than gene content because two strains of an organism may share the same set of predicted genes but still differ in their pathogenicity profiles. For example, detailed comparison of multiple isolates of the smallpox virus genome (each approximately 200 kb, with 200 genes is not feasible without new bioinformatics tools. Results A software package, Base-By-Base, has been developed that provides visualization tools to enable researchers to 1 rapidly identify and correct alignment errors in large, multiple genome alignments; and 2 generate tabular and graphical output of differences between the genomes at the nucleotide level. Base-By-Base uses detailed annotation information about the aligned genomes and can list each predicted gene with nucleotide differences, display whether variations occur within promoter regions or coding regions and whether these changes result in amino acid substitutions. Base-By-Base can connect to our mySQL database (Virus Orthologous Clusters; VOCs to retrieve detailed annotation information about the aligned genomes or use information from text files. Conclusion Base-By-Base enables users to quickly and easily compare large viral genomes; it highlights small differences that may be responsible for important phenotypic differences such as virulence. It is available via the Internet using Java Web Start and runs on Macintosh, PC and Linux operating systems with the Java 1.4 virtual machine.

  17. Genomic taxonomy of vibrios

    Directory of Open Access Journals (Sweden)

    Iida Tetsuya

    2009-10-01

    Full Text Available Abstract Background Vibrio taxonomy has been based on a polyphasic approach. In this study, we retrieve useful taxonomic information (i.e. data that can be used to distinguish different taxonomic levels, such as species and genera from 32 genome sequences of different vibrio species. We use a variety of tools to explore the taxonomic relationship between the sequenced genomes, including Multilocus Sequence Analysis (MLSA, supertrees, Average Amino Acid Identity (AAI, genomic signatures, and Genome BLAST atlases. Our aim is to analyse the usefulness of these tools for species identification in vibrios. Results We have generated four new genome sequences of three Vibrio species, i.e., V. alginolyticus 40B, V. harveyi-like 1DA3, and V. mimicus strains VM573 and VM603, and present a broad analyses of these genomes along with other sequenced Vibrio species. The genome atlas and pangenome plots provide a tantalizing image of the genomic differences that occur between closely related sister species, e.g. V. cholerae and V. mimicus. The vibrio pangenome contains around 26504 genes. The V. cholerae core genome and pangenome consist of 1520 and 6923 genes, respectively. Pangenomes might allow different strains of V. cholerae to occupy different niches. MLSA and supertree analyses resulted in a similar phylogenetic picture, with a clear distinction of four groups (Vibrio core group, V. cholerae-V. mimicus, Aliivibrio spp., and Photobacterium spp.. A Vibrio species is defined as a group of strains that share > 95% DNA identity in MLSA and supertree analysis, > 96% AAI, ≤ 10 genome signature dissimilarity, and > 61% proteome identity. Strains of the same species and species of the same genus will form monophyletic groups on the basis of MLSA and supertree. Conclusion The combination of different analytical and bioinformatics tools will enable the most accurate species identification through genomic computational analysis. This endeavour will culminate in

  18. Evolutionary trajectories of snake genes and genomes revealed by comparative analyses of five-pacer viper

    Science.gov (United States)

    Yin, Wei; Wang, Zong-ji; Li, Qi-ye; Lian, Jin-ming; Zhou, Yang; Lu, Bing-zheng; Jin, Li-jun; Qiu, Peng-xin; Zhang, Pei; Zhu, Wen-bo; Wen, Bo; Huang, Yi-jun; Lin, Zhi-long; Qiu, Bi-tao; Su, Xing-wen; Yang, Huan-ming; Zhang, Guo-jie; Yan, Guang-mei; Zhou, Qi

    2016-01-01

    Snakes have numerous features distinctive from other tetrapods and a rich history of genome evolution that is still obscure. Here, we report the high-quality genome of the five-pacer viper, Deinagkistrodon acutus, and comparative analyses with other representative snake and lizard genomes. We map the evolutionary trajectories of transposable elements (TEs), developmental genes and sex chromosomes onto the snake phylogeny. TEs exhibit dynamic lineage-specific expansion, and many viper TEs show brain-specific gene expression along with their nearby genes. We detect signatures of adaptive evolution in olfactory, venom and thermal-sensing genes and also functional degeneration of genes associated with vision and hearing. Lineage-specific relaxation of functional constraints on respective Hox and Tbx limb-patterning genes supports fossil evidence for a successive loss of forelimbs then hindlimbs during snake evolution. Finally, we infer that the ZW sex chromosome pair had undergone at least three recombination suppression events in the ancestor of advanced snakes. These results altogether forge a framework for our deep understanding into snakes' history of molecular evolution. PMID:27708285

  19. Genomic analyses of the Chlamydia trachomatis core genome show an association between chromosomal genome, plasmid type and disease

    NARCIS (Netherlands)

    Versteeg, Bart; Bruisten, Sylvia M.; Pannekoek, Yvonne; Jolley, Keith A.; Maiden, Martin C. J.; van der Ende, Arie; Harrison, Odile B.

    2018-01-01

    Background: Chlamydia trachomatis (Ct) plasmid has been shown to encode genes essential for infection. We evaluated the population structure of Ct using whole-genome sequence data (WGS). In particular, the relationship between the Ct genome, plasmid and disease was investigated. Results: WGS data

  20. An Integrative Bioinformatics Framework for Genome-scale Multiple Level Network Reconstruction of Rice

    Directory of Open Access Journals (Sweden)

    Liu Lili

    2013-06-01

    Full Text Available Understanding how metabolic reactions translate the genome of an organism into its phenotype is a grand challenge in biology. Genome-wide association studies (GWAS statistically connect genotypes to phenotypes, without any recourse to known molecular interactions, whereas a molecular mechanistic description ties gene function to phenotype through gene regulatory networks (GRNs, protein-protein interactions (PPIs and molecular pathways. Integration of different regulatory information levels of an organism is expected to provide a good way for mapping genotypes to phenotypes. However, the lack of curated metabolic model of rice is blocking the exploration of genome-scale multi-level network reconstruction. Here, we have merged GRNs, PPIs and genome-scale metabolic networks (GSMNs approaches into a single framework for rice via omics’ regulatory information reconstruction and integration. Firstly, we reconstructed a genome-scale metabolic model, containing 4,462 function genes, 2,986 metabolites involved in 3,316 reactions, and compartmentalized into ten subcellular locations. Furthermore, 90,358 pairs of protein-protein interactions, 662,936 pairs of gene regulations and 1,763 microRNA-target interactions were integrated into the metabolic model. Eventually, a database was developped for systematically storing and retrieving the genome-scale multi-level network of rice. This provides a reference for understanding genotype-phenotype relationship of rice, and for analysis of its molecular regulatory network.

  1. Low levels of LTR retrotransposon deletion by ectopic recombination in the gigantic genomes of salamanders.

    Science.gov (United States)

    Frahry, Matthew Blake; Sun, Cheng; Chong, Rebecca A; Mueller, Rachel Lockridge

    2015-02-01

    Across the tree of life, species vary dramatically in nuclear genome size. Mutations that add or remove sequences from genomes-insertions or deletions, or indels-are the ultimate source of this variation. Differences in the tempo and mode of insertion and deletion across taxa have been proposed to contribute to evolutionary diversity in genome size. Among vertebrates, most of the largest genomes are found within the salamanders, an amphibian clade with genome sizes ranging from ~14 to ~120 Gb. Salamander genomes have been shown to experience slower rates of DNA loss through small (i.e., genomes. However, no studies have addressed DNA loss from salamander genomes resulting from larger deletions. Here, we focus on one type of large deletion-ectopic-recombination-mediated removal of LTR retrotransposon sequences. In ectopic recombination, double-strand breaks are repaired using a "wrong" (i.e., ectopic, or non-allelic) template sequence-typically another locus of similar sequence. When breaks occur within the LTR portions of LTR retrotransposons, ectopic-recombination-mediated repair can produce deletions that remove the internal transposon sequence and the equivalent of one of the two LTR sequences. These deletions leave a signature in the genome-a solo LTR sequence. We compared levels of solo LTRs in the genomes of four salamander species with levels present in five vertebrates with smaller genomes. Our results demonstrate that salamanders have low levels of solo LTRs, suggesting that ectopic-recombination-mediated deletion of LTR retrotransposons occurs more slowly than in other vertebrates with smaller genomes.

  2. Comparative genome analyses reveal distinct structure in the saltwater crocodile MHC.

    Directory of Open Access Journals (Sweden)

    Weerachai Jaratlerdsiri

    Full Text Available The major histocompatibility complex (MHC is a dynamic genome region with an essential role in the adaptive immunity of vertebrates, especially antigen presentation. The MHC is generally divided into subregions (classes I, II and III containing genes of similar function across species, but with different gene number and organisation. Crocodylia (crocodilians are widely distributed and represent an evolutionary distinct group among higher vertebrates, but the genomic organisation of MHC within this lineage has been largely unexplored. Here, we studied the MHC region of the saltwater crocodile (Crocodylus porosus and compared it with that of other taxa. We characterised genomic clusters encompassing MHC class I and class II genes in the saltwater crocodile based on sequencing of bacterial artificial chromosomes. Six gene clusters spanning ∼452 kb were identified to contain nine MHC class I genes, six MHC class II genes, three TAP genes, and a TRIM gene. These MHC class I and class II genes were in separate scaffold regions and were greater in length (2-6 times longer than their counterparts in well-studied fowl B loci, suggesting that the compaction of avian MHC occurred after the crocodilian-avian split. Comparative analyses between the saltwater crocodile MHC and that from the alligator and gharial showed large syntenic areas (>80% identity with similar gene order. Comparisons with other vertebrates showed that the saltwater crocodile had MHC class I genes located along with TAP, consistent with birds studied. Linkage between MHC class I and TRIM39 observed in the saltwater crocodile resembled MHC in eutherians compared, but absent in avian MHC, suggesting that the saltwater crocodile MHC appears to have gene organisation intermediate between these two lineages. These observations suggest that the structure of the saltwater crocodile MHC, and other crocodilians, can help determine the MHC that was present in the ancestors of archosaurs.

  3. Shifts in the evolutionary rate and intensity of purifying selection between two Brassica genomes revealed by analyses of orthologous transposons and relics of a whole genome triplication.

    Science.gov (United States)

    Zhao, Meixia; Du, Jianchang; Lin, Feng; Tong, Chaobo; Yu, Jingyin; Huang, Shunmou; Wang, Xiaowu; Liu, Shengyi; Ma, Jianxin

    2013-10-01

    Recent sequencing of the Brassica rapa and Brassica oleracea genomes revealed extremely contrasting genomic features such as the abundance and distribution of transposable elements between the two genomes. However, whether and how these structural differentiations may have influenced the evolutionary rates of the two genomes since their split from a common ancestor are unknown. Here, we investigated and compared the rates of nucleotide substitution between two long terminal repeats (LTRs) of individual orthologous LTR-retrotransposons, the rates of synonymous and non-synonymous substitution among triplicated genes retained in both genomes from a shared whole genome triplication event, and the rates of genetic recombination estimated/deduced by the comparison of physical and genetic distances along chromosomes and ratios of solo LTRs to intact elements. Overall, LTR sequences and genic sequences showed more rapid nucleotide substitution in B. rapa than in B. oleracea. Synonymous substitution of triplicated genes retained from a shared whole genome triplication was detected at higher rates in B. rapa than in B. oleracea. Interestingly, non-synonymous substitution was observed at lower rates in the former than in the latter, indicating shifted densities of purifying selection between the two genomes. In addition to evolutionary asymmetry, orthologous genes differentially regulated and/or disrupted by transposable elements between the two genomes were also characterized. Our analyses suggest that local genomic and epigenomic features, such as recombination rates and chromatin dynamics reshaped by independent proliferation of transposable elements and elimination between the two genomes, are perhaps partially the causes and partially the outcomes of the observed inter-specific asymmetric evolution. © 2013 Purdue University The Plant Journal © 2013 John Wiley & Sons Ltd.

  4. Genome size evolution at the speciation level: The cryptic species complex Brachionus plicatilis (Rotifera

    Directory of Open Access Journals (Sweden)

    Riss Simone

    2011-04-01

    Full Text Available Abstract Background Studies on genome size variation in animals are rarely done at lower taxonomic levels, e.g., slightly above/below the species level. Yet, such variation might provide important clues on the tempo and mode of genome size evolution. In this study we used the flow-cytometry method to study the evolution of genome size in the rotifer Brachionus plicatilis, a cryptic species complex consisting of at least 14 closely related species. Results We found an unexpectedly high variation in this species complex, with genome sizes ranging approximately seven-fold (haploid '1C' genome sizes: 0.056-0.416 pg. Most of this variation (67% could be ascribed to the major clades of the species complex, i.e. clades that are well separated according to most species definitions. However, we also found substantial variation (32% at lower taxonomic levels - within and among genealogical species - and, interestingly, among species pairs that are not completely reproductively isolated. In one genealogical species, called B. 'Austria', we found greatly enlarged genome sizes that could roughly be approximated as multiples of the genomes of its closest relatives, which suggests that whole-genome duplications have occurred early during separation of this lineage. Overall, genome size was significantly correlated to egg size and body size, even though the latter became non-significant after controlling for phylogenetic non-independence. Conclusions Our study suggests that substantial genome size variation can build up early during speciation, potentially even among isolated populations. An alternative, but not mutually exclusive interpretation might be that reproductive isolation tends to build up unusually slow in this species complex.

  5. Genome-wide signatures of flowering adaptation to climate temperature: Regional analyses in a highly diverse native range of Arabidopsis thaliana.

    Science.gov (United States)

    Tabas-Madrid, Daniel; Méndez-Vigo, Belén; Arteaga, Noelia; Marcer, Arnald; Pascual-Montano, Alberto; Weigel, Detlef; Xavier Picó, F; Alonso-Blanco, Carlos

    2018-03-08

    Current global change is fueling an interest to understand the genetic and molecular mechanisms of plant adaptation to climate. In particular, altered flowering time is a common strategy for escape from unfavourable climate temperature. In order to determine the genomic bases underlying flowering time adaptation to this climatic factor, we have systematically analysed a collection of 174 highly diverse Arabidopsis thaliana accessions from the Iberian Peninsula. Analyses of 1.88 million single nucleotide polymorphisms provide evidence for a spatially heterogeneous contribution of demographic and adaptive processes to geographic patterns of genetic variation. Mountains appear to be allele dispersal barriers, whereas the relationship between flowering time and temperature depended on the precise temperature range. Environmental genome-wide associations supported an overall genome adaptation to temperature, with 9.4% of the genes showing significant associations. Furthermore, phenotypic genome-wide associations provided a catalogue of candidate genes underlying flowering time variation. Finally, comparison of environmental and phenotypic genome-wide associations identified known (Twin Sister of FT, FRIGIDA-like 1, and Casein Kinase II Beta chain 1) and new (Epithiospecifer Modifier 1 and Voltage-Dependent Anion Channel 5) genes as candidates for adaptation to climate temperature by altered flowering time. Thus, this regional collection provides an excellent resource to address the spatial complexity of climate adaptation in annual plants. © 2018 John Wiley & Sons Ltd.

  6. A system-level model for the microbial regulatory genome.

    Science.gov (United States)

    Brooks, Aaron N; Reiss, David J; Allard, Antoine; Wu, Wei-Ju; Salvanha, Diego M; Plaisier, Christopher L; Chandrasekaran, Sriram; Pan, Min; Kaur, Amardeep; Baliga, Nitin S

    2014-07-15

    Microbes can tailor transcriptional responses to diverse environmental challenges despite having streamlined genomes and a limited number of regulators. Here, we present data-driven models that capture the dynamic interplay of the environment and genome-encoded regulatory programs of two types of prokaryotes: Escherichia coli (a bacterium) and Halobacterium salinarum (an archaeon). The models reveal how the genome-wide distributions of cis-acting gene regulatory elements and the conditional influences of transcription factors at each of those elements encode programs for eliciting a wide array of environment-specific responses. We demonstrate how these programs partition transcriptional regulation of genes within regulons and operons to re-organize gene-gene functional associations in each environment. The models capture fitness-relevant co-regulation by different transcriptional control mechanisms acting across the entire genome, to define a generalized, system-level organizing principle for prokaryotic gene regulatory networks that goes well beyond existing paradigms of gene regulation. An online resource (http://egrin2.systemsbiology.net) has been developed to facilitate multiscale exploration of conditional gene regulation in the two prokaryotes. © 2014 The Authors. Published under the terms of the CC BY 4.0 license.

  7. Characterization of Aeromonas hydrophila wound pathotypes by comparative genomic and functional analyses of virulence genes.

    Science.gov (United States)

    Grim, Christopher J; Kozlova, Elena V; Sha, Jian; Fitts, Eric C; van Lier, Christina J; Kirtley, Michelle L; Joseph, Sandeep J; Read, Timothy D; Burd, Eileen M; Tall, Ben D; Joseph, Sam W; Horneman, Amy J; Chopra, Ashok K; Shak, Joshua R

    2013-04-23

    Aeromonas hydrophila has increasingly been implicated as a virulent and antibiotic-resistant etiologic agent in various human diseases. In a previously published case report, we described a subject with a polymicrobial wound infection that included a persistent and aggressive strain of A. hydrophila (E1), as well as a more antibiotic-resistant strain of A. hydrophila (E2). To better understand the differences between pathogenic and environmental strains of A. hydrophila, we conducted comparative genomic and functional analyses of virulence-associated genes of these two wound isolates (E1 and E2), the environmental type strain A. hydrophila ATCC 7966(T), and four other isolates belonging to A. aquariorum, A. veronii, A. salmonicida, and A. caviae. Full-genome sequencing of strains E1 and E2 revealed extensive differences between the two and strain ATCC 7966(T). The more persistent wound infection strain, E1, harbored coding sequences for a cytotoxic enterotoxin (Act), a type 3 secretion system (T3SS), flagella, hemolysins, and a homolog of exotoxin A found in Pseudomonas aeruginosa. Corresponding phenotypic analyses with A. hydrophila ATCC 7966(T) and SSU as reference strains demonstrated the functionality of these virulence genes, with strain E1 displaying enhanced swimming and swarming motility, lateral flagella on electron microscopy, the presence of T3SS effector AexU, and enhanced lethality in a mouse model of Aeromonas infection. By combining sequence-based analysis and functional assays, we characterized an A. hydrophila pathotype, exemplified by strain E1, that exhibited increased virulence in a mouse model of infection, likely because of encapsulation, enhanced motility, toxin secretion, and cellular toxicity. Aeromonas hydrophila is a common aquatic bacterium that has increasingly been implicated in serious human infections. While many determinants of virulence have been identified in Aeromonas, rapid identification of pathogenic versus nonpathogenic

  8. Genome-wide meta-analyses of multiancestry cohorts identify multiple new susceptibility loci for refractive error and myopia

    NARCIS (Netherlands)

    Verhoeven, Virginie J. M.; Hysi, Pirro G.; Wojciechowski, Robert; Fan, Qiao; Guggenheim, Jeremy A.; Höhn, René; Macgregor, Stuart; Hewitt, Alex W.; Nag, Abhishek; Cheng, Ching-Yu; Yonova-Doing, Ekaterina; Zhou, Xin; Ikram, M. Kamran; Buitendijk, Gabriëlle H. S.; McMahon, George; Kemp, John P.; Pourcain, Beate St; Simpson, Claire L.; Mäkelä, Kari-Matti; Lehtimäki, Terho; Kähönen, Mika; Paterson, Andrew D.; Hosseini, S. Mohsen; Wong, Hoi Suen; Xu, Liang; Jonas, Jost B.; Pärssinen, Olavi; Wedenoja, Juho; Yip, Shea Ping; Ho, Daniel W. H.; Pang, Chi Pui; Chen, Li Jia; Burdon, Kathryn P.; Craig, Jamie E.; Klein, Barbara E. K.; Klein, Ronald; Haller, Toomas; Metspalu, Andres; Khor, Chiea-Chuen; Tai, E.-Shyong; Aung, Tin; Vithana, Eranga; Tay, Wan-Ting; Barathi, Veluchamy A.; Chen, Peng; Li, Ruoying; Liao, Jiemin; Zheng, Yingfeng; Bergen, Arthur A. B.; Chen, Wei

    2013-01-01

    Refractive error is the most common eye disorder worldwide and is a prominent cause of blindness. Myopia affects over 30% of Western populations and up to 80% of Asians. The CREAM consortium conducted genome-wide meta-analyses, including 37,382 individuals from 27 studies of European ancestry and

  9. Genomic analyses of metal resistance genes in three plant growth promoting bacteria of legume plants in Northwest mine tailings, China.

    Science.gov (United States)

    Xie, Pin; Hao, Xiuli; Herzberg, Martin; Luo, Yantao; Nies, Dietrich H; Wei, Gehong

    2015-01-01

    To better understand the diversity of metal resistance genetic determinant from microbes that survived at metal tailings in northwest of China, a highly elevated level of heavy metal containing region, genomic analyses was conducted using genome sequence of three native metal-resistant plant growth promoting bacteria (PGPB). It shows that: Mesorhizobium amorphae CCNWGS0123 contains metal transporters from P-type ATPase, CDF (Cation Diffusion Facilitator), HupE/UreJ and CHR (chromate ion transporter) family involved in copper, zinc, nickel as well as chromate resistance and homeostasis. Meanwhile, the putative CopA/CueO system is expected to mediate copper resistance in Sinorhizobium meliloti CCNWSX0020 while ZntA transporter, assisted with putative CzcD, determines zinc tolerance in Agrobacterium tumefaciens CCNWGS0286. The greenhouse experiment provides the consistent evidence of the plant growth promoting effects of these microbes on their hosts by nitrogen fixation and/or indoleacetic acid (IAA) secretion, indicating a potential in-site phytoremediation usage in the mining tailing regions of China. Copyright © 2014. Published by Elsevier B.V.

  10. A universal genomic coordinate translator for comparative genomics.

    Science.gov (United States)

    Zamani, Neda; Sundström, Görel; Meadows, Jennifer R S; Höppner, Marc P; Dainat, Jacques; Lantz, Henrik; Haas, Brian J; Grabherr, Manfred G

    2014-06-30

    Genomic duplications constitute major events in the evolution of species, allowing paralogous copies of genes to take on fine-tuned biological roles. Unambiguously identifying the orthology relationship between copies across multiple genomes can be resolved by synteny, i.e. the conserved order of genomic sequences. However, a comprehensive analysis of duplication events and their contributions to evolution would require all-to-all genome alignments, which increases at N2 with the number of available genomes, N. Here, we introduce Kraken, software that omits the all-to-all requirement by recursively traversing a graph of pairwise alignments and dynamically re-computing orthology. Kraken scales linearly with the number of targeted genomes, N, which allows for including large numbers of genomes in analyses. We first evaluated the method on the set of 12 Drosophila genomes, finding that orthologous correspondence computed indirectly through a graph of multiple synteny maps comes at minimal cost in terms of sensitivity, but reduces overall computational runtime by an order of magnitude. We then used the method on three well-annotated mammalian genomes, human, mouse, and rat, and show that up to 93% of protein coding transcripts have unambiguous pairwise orthologous relationships across the genomes. On a nucleotide level, 70 to 83% of exons match exactly at both splice junctions, and up to 97% on at least one junction. We last applied Kraken to an RNA-sequencing dataset from multiple vertebrates and diverse tissues, where we confirmed that brain-specific gene family members, i.e. one-to-many or many-to-many homologs, are more highly correlated across species than single-copy (i.e. one-to-one homologous) genes. Not limited to protein coding genes, Kraken also identifies thousands of newly identified transcribed loci, likely non-coding RNAs that are consistently transcribed in human, chimpanzee and gorilla, and maintain significant correlation of expression levels across

  11. Genotyping-by-sequencing for Populus population genomics: an assessment of genome sampling patterns and filtering approaches.

    Directory of Open Access Journals (Sweden)

    Martin P Schilling

    Full Text Available Continuing advances in nucleotide sequencing technology are inspiring a suite of genomic approaches in studies of natural populations. Researchers are faced with data management and analytical scales that are increasing by orders of magnitude. With such dramatic advances comes a need to understand biases and error rates, which can be propagated and magnified in large-scale data acquisition and processing. Here we assess genomic sampling biases and the effects of various population-level data filtering strategies in a genotyping-by-sequencing (GBS protocol. We focus on data from two species of Populus, because this genus has a relatively small genome and is emerging as a target for population genomic studies. We estimate the proportions and patterns of genomic sampling by examining the Populus trichocarpa genome (Nisqually-1, and demonstrate a pronounced bias towards coding regions when using the methylation-sensitive ApeKI restriction enzyme in this species. Using population-level data from a closely related species (P. tremuloides, we also investigate various approaches for filtering GBS data to retain high-depth, informative SNPs that can be used for population genetic analyses. We find a data filter that includes the designation of ambiguous alleles resulted in metrics of population structure and Hardy-Weinberg equilibrium that were most consistent with previous studies of the same populations based on other genetic markers. Analyses of the filtered data (27,910 SNPs also resulted in patterns of heterozygosity and population structure similar to a previous study using microsatellites. Our application demonstrates that technically and analytically simple approaches can readily be developed for population genomics of natural populations.

  12. Comparative genomic analyses of nickel, cobalt and vitamin B12 utilization

    Directory of Open Access Journals (Sweden)

    Gelfand Mikhail S

    2009-02-01

    Full Text Available Abstract Background Nickel (Ni and cobalt (Co are trace elements required for a variety of biological processes. Ni is directly coordinated by proteins, whereas Co is mainly used as a component of vitamin B12. Although a number of Ni and Co-dependent enzymes have been characterized, systematic evolutionary analyses of utilization of these metals are limited. Results We carried out comparative genomic analyses to examine occurrence and evolutionary dynamics of the use of Ni and Co at the level of (i transport systems, and (ii metalloproteomes. Our data show that both metals are widely used in bacteria and archaea. Cbi/NikMNQO is the most common prokaryotic Ni/Co transporter, while Ni-dependent urease and Ni-Fe hydrogenase, and B12-dependent methionine synthase (MetH, ribonucleotide reductase and methylmalonyl-CoA mutase are the most widespread metalloproteins for Ni and Co, respectively. Occurrence of other metalloenzymes showed a mosaic distribution and a new B12-dependent protein family was predicted. Deltaproteobacteria and Methanosarcina generally have larger Ni- and Co-dependent proteomes. On the other hand, utilization of these two metals is limited in eukaryotes, and very few of these organisms utilize both of them. The Ni-utilizing eukaryotes are mostly fungi (except saccharomycotina and plants, whereas most B12-utilizing organisms are animals. The NiCoT transporter family is the most widespread eukaryotic Ni transporter, and eukaryotic urease and MetH are the most common Ni- and B12-dependent enzymes, respectively. Finally, investigation of environmental and other conditions and identity of organisms that show dependence on Ni or Co revealed that host-associated organisms (particularly obligate intracellular parasites and endosymbionts have a tendency for loss of Ni/Co utilization. Conclusion Our data provide information on the evolutionary dynamics of Ni and Co utilization and highlight widespread use of these metals in the three

  13. Genomic Analyses Reveal Demographic History and Temperate Adaptation of the Newly Discovered Honey Bee Subspecies Apis mellifera sinisxinyuan n. ssp.

    Science.gov (United States)

    Chen, Chao; Liu, Zhiguang; Pan, Qi; Chen, Xiao; Wang, Huihua; Guo, Haikun; Liu, Shidong; Lu, Hongfeng; Tian, Shilin; Li, Ruiqiang; Shi, Wei

    2016-05-01

    Studying the genetic signatures of climate-driven selection can produce insights into local adaptation and the potential impacts of climate change on populations. The honey bee (Apis mellifera) is an interesting species to study local adaptation because it originated in tropical/subtropical climatic regions and subsequently spread into temperate regions. However, little is known about the genetic basis of its adaptation to temperate climates. Here, we resequenced the whole genomes of ten individual bees from a newly discovered population in temperate China and downloaded resequenced data from 35 individuals from other populations. We found that the new population is an undescribed subspecies in the M-lineage of A. mellifera (Apis mellifera sinisxinyuan). Analyses of population history show that long-term global temperature has strongly influenced the demographic history of A. m. sinisxinyuan and its divergence from other subspecies. Further analyses comparing temperate and tropical populations identified several candidate genes related to fat body and the Hippo signaling pathway that are potentially involved in adaptation to temperate climates. Our results provide insights into the demographic history of the newly discovered A. m. sinisxinyuan, as well as the genetic basis of adaptation of A. mellifera to temperate climates at the genomic level. These findings will facilitate the selective breeding of A. mellifera to improve the survival of overwintering colonies. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  14. Heritability and Genome-Wide Association Analyses of Serum Uric Acid in Middle and Old-Aged Chinese Twins

    Directory of Open Access Journals (Sweden)

    Weijing Wang

    2018-03-01

    Full Text Available Serum uric acid (SUA, as the end product of purine metabolism, has proven emerging roles in human disorders. Here based on a sample of 379 middle and old-aged Chinese twin pairs, we aimed to explore the magnitude of genetic impact on SUA variation by performing sex-limitation twin modeling analyses and further detect specific genetic variants related to SUA by conducting a genome-wide association study. Monozygotic (MZ twin correlation for SUA level (rMZ = 0.56 was larger than for dizygotic (DZ twin correlation (rDZ = 0.39. The common effects sex-limitation model provided the best fit with additive genetic parameter (A accounting for 46.3%, common or shared environmental parameter (C accounting for 26.3% and unique/nonshared environmental parameter (E accounting for 27.5% for females and 29.9, 33.1, and 37.0% for males, respectively. Although no SUA-related genetic variants reached genome-wide significance level, 25 SNPs were suggestive of association (P < 1 × 10−5. Most of the SNPs were located in an intronic region and detected to have regulatory effects on gene transcription. The cell-type specific enhancer of skeletal muscle was detected which has been reported to implicate SUA. Two promising genetic regions on chromosome 17 around rs2253277 and chromosome 14 around rs11621523 were found. Gene-based analysis found 167 genes nominally associated with SUA level (P < 0.05, including PTGR2, ENTPD5, well-known SLC2A9, etc. Enrichment analysis identified one pathway of transmembrane transport of small molecules and 20 GO gene sets involving in ion transport, transmembrane transporter activity, hydrolase activity acting on acid anhydrides, etc. In conclusion, SUA shows moderate heritability in women and low heritability in men in the Chinese population and genetic variations are significantly involved in functional genes and regulatory domains that mediate SUA level. Our findings provide clues to further elucidate molecular

  15. Bat Biology, Genomes, and the Bat1K Project: To Generate Chromosome-Level Genomes for All Living Bat Species.

    Science.gov (United States)

    Teeling, Emma C; Vernes, Sonja C; Dávalos, Liliana M; Ray, David A; Gilbert, M Thomas P; Myers, Eugene

    2018-02-15

    Bats are unique among mammals, possessing some of the rarest mammalian adaptations, including true self-powered flight, laryngeal echolocation, exceptional longevity, unique immunity, contracted genomes, and vocal learning. They provide key ecosystem services, pollinating tropical plants, dispersing seeds, and controlling insect pest populations, thus driving healthy ecosystems. They account for more than 20% of all living mammalian diversity, and their crown-group evolutionary history dates back to the Eocene. Despite their great numbers and diversity, many species are threatened and endangered. Here we announce Bat1K, an initiative to sequence the genomes of all living bat species (n∼1,300) to chromosome-level assembly. The Bat1K genome consortium unites bat biologists (>148 members as of writing), computational scientists, conservation organizations, genome technologists, and any interested individuals committed to a better understanding of the genetic and evolutionary mechanisms that underlie the unique adaptations of bats. Our aim is to catalog the unique genetic diversity present in all living bats to better understand the molecular basis of their unique adaptations; uncover their evolutionary history; link genotype with phenotype; and ultimately better understand, promote, and conserve bats. Here we review the unique adaptations of bats and highlight how chromosome-level genome assemblies can uncover the molecular basis of these traits. We present a novel sequencing and assembly strategy and review the striking societal and scientific benefits that will result from the Bat1K initiative.

  16. Genomic analyses of tropical beef cattle fertility based on genotyping pools of Brahman cows with unknown pedigree.

    Science.gov (United States)

    Reverter, A; Porto-Neto, L R; Fortes, M R S; McCulloch, R; Lyons, R E; Moore, S; Nicol, D; Henshall, J; Lehnert, S A

    2016-10-01

    We introduce an innovative approach to lowering the overall cost of obtaining genomic EBV (GEBV) and encourage their use in commercial extensive herds of Brahman beef cattle. In our approach, the DNA genotyping of cow herds from 2 independent properties was performed using a high-density bovine SNP chip on DNA from pooled blood samples, grouped according to the result of a pregnancy test following their first and second joining opportunities. For the DNA pooling strategy, 15 to 28 blood samples from the same phenotype and contemporary group were allocated to pools. Across the 2 properties, a total of 183 pools were created representing 4,164 cows. In addition, blood samples from 309 bulls from the same properties were also taken. After genotyping and quality control, 74,584 remaining SNP were used for analyses. Pools and individual DNA samples were related by means of a "hybrid" genomic relationship matrix. The pooled genotyping analysis of 2 large and independent commercial populations of tropical beef cattle was able to recover significant and plausible associations between SNP and pregnancy test outcome. We discuss 24 SNP with significant association ( < 1.0 × 10) and mapped within 40 kb of an annotated gene. We have established a method to estimate the GEBV in young herd bulls for a trait that is currently unable to be predicted at all. In summary, our novel approach allowed us to conduct genomic analyses of fertility in 2 large commercial Brahman herds managed under extensive pastoral conditions.

  17. Comprehensive analyses of genomes, transcriptomes and metabolites of neem tree

    Directory of Open Access Journals (Sweden)

    Nagesh A. Kuravadi

    2015-08-01

    Full Text Available Neem (Azadirachta indica A. Juss is one of the most versatile tropical evergreen tree species known in India since the Vedic period (1500 BC–600 BC. Neem tree is a rich source of limonoids, having a wide spectrum of activity against insect pests and microbial pathogens. Complex tetranortriterpenoids such as azadirachtin, salanin and nimbin are the major active principles isolated from neem seed. Absolutely nothing is known about the biochemical pathways of these metabolites in neem tree. To identify genes and pathways in neem, we sequenced neem genomes and transcriptomes using next generation sequencing technologies. Assembly of Illumina and 454 sequencing reads resulted in 267 Mb, which accounts for 70% of estimated size of neem genome. We predicted 44,495 genes in the neem genome, of which 32,278 genes were expressed in neem tissues. Neem genome consists about 32.5% (87 Mb of repetitive DNA elements. Neem tree is phylogenetically related to citrus, Citrus sinensis. Comparative analysis anchored 62% (161 Mb of assembled neem genomic contigs onto citrus chromomes. Ultrahigh performance liquid chromatography-mass spectrometry-selected reaction monitoring (UHPLC-MS/SRM method was used to quantify azadirachtin, nimbin, and salanin from neem tissues. Weighted Correlation Network Analysis (WCGNA of expressed genes and metabolites resulted in identification of possible candidate genes involved in azadirachtin biosynthesis pathway. This study provides genomic, transcriptomic and quantity of top three neem metabolites resource, which will accelerate basic research in neem to understand biochemical pathways.

  18. Comprehensive analyses of genomes, transcriptomes and metabolites of neem tree

    Science.gov (United States)

    Rangiah, Kannan; Mahesh, HB; Rajamani, Anantharamanan; Shirke, Meghana D.; Russiachand, Heikham; Loganathan, Ramya Malarini; Shankara Lingu, Chandana; Siddappa, Shilpa; Ramamurthy, Aishwarya; Sathyanarayana, BN

    2015-01-01

    Neem (Azadirachta indica A. Juss) is one of the most versatile tropical evergreen tree species known in India since the Vedic period (1500 BC–600 BC). Neem tree is a rich source of limonoids, having a wide spectrum of activity against insect pests and microbial pathogens. Complex tetranortriterpenoids such as azadirachtin, salanin and nimbin are the major active principles isolated from neem seed. Absolutely nothing is known about the biochemical pathways of these metabolites in neem tree. To identify genes and pathways in neem, we sequenced neem genomes and transcriptomes using next generation sequencing technologies. Assembly of Illumina and 454 sequencing reads resulted in 267 Mb, which accounts for 70% of estimated size of neem genome. We predicted 44,495 genes in the neem genome, of which 32,278 genes were expressed in neem tissues. Neem genome consists about 32.5% (87 Mb) of repetitive DNA elements. Neem tree is phylogenetically related to citrus, Citrus sinensis. Comparative analysis anchored 62% (161 Mb) of assembled neem genomic contigs onto citrus chromomes. Ultrahigh performance liquid chromatography-mass spectrometry-selected reaction monitoring (UHPLC-MS/SRM) method was used to quantify azadirachtin, nimbin, and salanin from neem tissues. Weighted Correlation Network Analysis (WCGNA) of expressed genes and metabolites resulted in identification of possible candidate genes involved in azadirachtin biosynthesis pathway. This study provides genomic, transcriptomic and quantity of top three neem metabolites resource, which will accelerate basic research in neem to understand biochemical pathways. PMID:26290780

  19. Comparative Genome Analysis and Genome Evolution

    NARCIS (Netherlands)

    Snel, Berend

    2002-01-01

    This thesis described a collection of bioinformatic analyses on complete genome sequence data. We have studied the evolution of gene content and find that vertical inheritance dominates over horizontal gene trasnfer, even to the extent that we can use the gene content to make genome phylogenies.

  20. Comparative Analyses of Nonpathogenic, Opportunistic, and Totally Pathogenic Mycobacteria Reveal Genomic and Biochemical Variabilities and Highlight the Survival Attributes of Mycobacterium tuberculosis

    Science.gov (United States)

    Singh, Yadvir; Kohli, Sakshi; Ahmad, Javeed; Ehtesham, Nasreen Z.; Tyagi, Anil K.

    2014-01-01

    ABSTRACT Mycobacterial evolution involves various processes, such as genome reduction, gene cooption, and critical gene acquisition. Our comparative genome size analysis of 44 mycobacterial genomes revealed that the nonpathogenic (NP) genomes were bigger than those of opportunistic (OP) or totally pathogenic (TP) mycobacteria, with the TP genomes being smaller yet variable in size—their genomic plasticity reflected their ability to evolve and survive under various environmental conditions. From the 44 mycobacterial species, 13 species, representing TP, OP, and NP, were selected for genomic-relatedness analyses. Analysis of homologous protein-coding genes shared between Mycobacterium indicus pranii (NP), Mycobacterium intracellulare ATCC 13950 (OP), and Mycobacterium tuberculosis H37Rv (TP) revealed that 4,995 (i.e., ~95%) M. indicaus pranii proteins have homology with M. intracellulare, whereas the homologies among M. indicus pranii, M. intracellulare ATCC 13950, and M. tuberculosis H37Rv were significantly lower. A total of 4,153 (~79%) M. indicus pranii proteins and 4,093 (~79%) M. intracellulare ATCC 13950 proteins exhibited homology with the M. tuberculosis H37Rv proteome, while 3,301 (~82%) and 3,295 (~82%) M. tuberculosis H37Rv proteins showed homology with M. indicus pranii and M. intracellulare ATCC 13950 proteomes, respectively. Comparative metabolic pathway analyses of TP/OP/NP mycobacteria showed enzymatic plasticity between M. indicus pranii (NP) and M. intracellulare ATCC 13950 (OP), Mycobacterium avium 104 (OP), and M. tuberculosis H37Rv (TP). Mycobacterium tuberculosis seems to have acquired novel alternate pathways with possible roles in metabolism, host-pathogen interactions, virulence, and intracellular survival, and by implication some of these could be potential drug targets. PMID:25370496

  1. Phylogenetic analyses of Vitis (Vitaceae) based on complete chloroplast genome sequences: effects of taxon sampling and phylogenetic methods on resolving relationships among rosids.

    Science.gov (United States)

    Jansen, Robert K; Kaittanis, Charalambos; Saski, Christopher; Lee, Seung-Bum; Tomkins, Jeffrey; Alverson, Andrew J; Daniell, Henry

    2006-04-09

    The Vitaceae (grape) is an economically important family of angiosperms whose phylogenetic placement is currently unresolved. Recent phylogenetic analyses based on one to several genes have suggested several alternative placements of this family, including sister to Caryophyllales, asterids, Saxifragales, Dilleniaceae or to rest of rosids, though support for these different results has been weak. There has been a recent interest in using complete chloroplast genome sequences for resolving phylogenetic relationships among angiosperms. These studies have clarified relationships among several major lineages but they have also emphasized the importance of taxon sampling and the effects of different phylogenetic methods for obtaining accurate phylogenies. We sequenced the complete chloroplast genome of Vitis vinifera and used these data to assess relationships among 27 angiosperms, including nine taxa of rosids. The Vitis vinifera chloroplast genome is 160,928 bp in length, including a pair of inverted repeats of 26,358 bp that are separated by small and large single copy regions of 19,065 bp and 89,147 bp, respectively. The gene content and order of Vitis is identical to many other unrearranged angiosperm chloroplast genomes, including tobacco. Phylogenetic analyses using maximum parsimony and maximum likelihood were performed on DNA sequences of 61 protein-coding genes for two datasets with 28 or 29 taxa, including eight or nine taxa from four of the seven currently recognized major clades of rosids. Parsimony and likelihood phylogenies of both data sets provide strong support for the placement of Vitaceae as sister to the remaining rosids. However, the position of the Myrtales and support for the monophyly of the eurosid I clade differs between the two data sets and the two methods of analysis. In parsimony analyses, the inclusion of Gossypium is necessary to obtain trees that support the monophyly of the eurosid I clade. However, maximum likelihood analyses place

  2. Phylogenetic analyses of Vitis (Vitaceae based on complete chloroplast genome sequences: effects of taxon sampling and phylogenetic methods on resolving relationships among rosids

    Directory of Open Access Journals (Sweden)

    Alverson Andrew J

    2006-04-01

    Full Text Available Abstract Background The Vitaceae (grape is an economically important family of angiosperms whose phylogenetic placement is currently unresolved. Recent phylogenetic analyses based on one to several genes have suggested several alternative placements of this family, including sister to Caryophyllales, asterids, Saxifragales, Dilleniaceae or to rest of rosids, though support for these different results has been weak. There has been a recent interest in using complete chloroplast genome sequences for resolving phylogenetic relationships among angiosperms. These studies have clarified relationships among several major lineages but they have also emphasized the importance of taxon sampling and the effects of different phylogenetic methods for obtaining accurate phylogenies. We sequenced the complete chloroplast genome of Vitis vinifera and used these data to assess relationships among 27 angiosperms, including nine taxa of rosids. Results The Vitis vinifera chloroplast genome is 160,928 bp in length, including a pair of inverted repeats of 26,358 bp that are separated by small and large single copy regions of 19,065 bp and 89,147 bp, respectively. The gene content and order of Vitis is identical to many other unrearranged angiosperm chloroplast genomes, including tobacco. Phylogenetic analyses using maximum parsimony and maximum likelihood were performed on DNA sequences of 61 protein-coding genes for two datasets with 28 or 29 taxa, including eight or nine taxa from four of the seven currently recognized major clades of rosids. Parsimony and likelihood phylogenies of both data sets provide strong support for the placement of Vitaceae as sister to the remaining rosids. However, the position of the Myrtales and support for the monophyly of the eurosid I clade differs between the two data sets and the two methods of analysis. In parsimony analyses, the inclusion of Gossypium is necessary to obtain trees that support the monophyly of the eurosid I clade

  3. Genome-wide association analyses of expression phenotypes.

    Science.gov (United States)

    Chen, Gary K; Zheng, Tian; Witte, John S; Goode, Ellen L; Gao, Lei; Hu, Pingzhao; Suh, Young Ju; Suktitipat, Bhoom; Szymczak, Silke; Woo, Jung Hoon; Zhang, Wei

    2007-01-01

    A number of issues arise when analyzing the large amount of data from high-throughput genotype and expression microarray experiments, including design and interpretation of genome-wide association studies of expression phenotypes. These issues were considered by contributions submitted to Group 1 of the Genetic Analysis Workshop 15 (GAW15), which focused on the association of quantitative expression data. These contributions evaluated diverse hypotheses, including those relevant to cancer and obesity research, and used various analytic techniques, many of which were derived from information theory. Several observations from these reports stand out. First, one needs to consider the genetic model of the trait of interest and carefully select which single nucleotide polymorphisms and individuals are included early in the design stage of a study. Second, by targeting specific pathways when analyzing genome-wide data, one can generate more interpretable results than agnostic approaches. Finally, for datasets with small sample sizes but a large number of features like the Genetic Analysis Workshop 15 dataset, machine learning approaches may be more practical than traditional parametric approaches. (c) 2007 Wiley-Liss, Inc.

  4. Aye-aye population genomic analyses highlight an important center of endemism in northern Madagascar.

    Science.gov (United States)

    Perry, George H; Louis, Edward E; Ratan, Aakrosh; Bedoya-Reina, Oscar C; Burhans, Richard C; Lei, Runhua; Johnson, Steig E; Schuster, Stephan C; Miller, Webb

    2013-04-09

    We performed a population genomics study of the aye-aye, a highly specialized nocturnal lemur from Madagascar. Aye-ayes have low population densities and extensive range requirements that could make this flagship species particularly susceptible to extinction. Therefore, knowledge of genetic diversity and differentiation among aye-aye populations is critical for conservation planning. Such information may also advance our general understanding of Malagasy biogeography, as aye-ayes have the largest species distribution of any lemur. We generated and analyzed whole-genome sequence data for 12 aye-ayes from three regions of Madagascar (North, West, and East). We found that the North population is genetically distinct, with strong differentiation from other aye-ayes over relatively short geographic distances. For comparison, the average FST value between the North and East aye-aye populations--separated by only 248 km--is over 2.1-times greater than that observed between human Africans and Europeans. This finding is consistent with prior watershed- and climate-based hypotheses of a center of endemism in northern Madagascar. Taken together, these results suggest a strong and long-term biogeographical barrier to gene flow. Thus, the specific attention that should be directed toward preserving large, contiguous aye-aye habitats in northern Madagascar may also benefit the conservation of other distinct taxonomic units. To help facilitate future ecological- and conservation-motivated population genomic analyses by noncomputational biologists, the analytical toolkit used in this study is available on the Galaxy Web site.

  5. Genome size variation and incidence of polyploidy in Scrophulariaceae sensu lato from the Iberian Peninsula.

    Science.gov (United States)

    Castro, Mariana; Castro, Sílvia; Loureiro, João

    2012-01-01

    In the last decade, genomic studies using DNA markers have strongly influenced the current phylogeny of angiosperms. Genome size and ploidy level have contributed to this discussion, being considered important characters in biosystematics, ecology and population biology. Despite the recent increase in studies related to genome size evolution and polyploidy incidence, only a few are available for Scrophulariaceae. In this context, we assessed the value of genome size, mostly as a taxonomic marker, and the role of polyploidy as a process of genesis and maintenance of plant diversity in Scrophulariaceae sensu lato in the Iberian Peninsula. Large-scale analyses of genome size and ploidy-level variation across the Iberian Peninsula were performed using flow cytometry. One hundred and sixty-two populations of 59 distinct taxa were analysed. A bibliographic review on chromosome counts was also performed. From the 59 sampled taxa, 51 represent first estimates of genome size. The majority of the Scrophulariaceae species presented very small to small genome sizes (2C ≤ 7.0 pg). Furthermore, in most of the analysed genera it was possible to use this character to separate several taxa, independently if these genera were homoploid or heteroploid. Also, some genome-related phenomena were detected, such as intraspecific variation of genome size in some genera and the possible occurrence of dysploidy in Verbascum spp. With respect to polyploidy, despite a few new DNA ploidy levels having been detected in Veronica, no multiple cytotypes have been found in any taxa. This work contributed with important basic scientific knowledge on genome size and polyploid incidence in the Scrophulariaceae, providing important background information for subsequent studies, with several perspectives for future studies being opened.

  6. [Analysis of genomic DNA methylation level in radish under cadmium stress by methylation-sensitive amplified polymorphism technique].

    Science.gov (United States)

    Yang, Jin-Lan; Liu, Li-Wang; Gong, Yi-Qin; Huang, Dan-Qiong; Wang, Feng; He, Ling-Li

    2007-06-01

    The level of cytosine methylation induced by cadmium in radish (Raphanus sativus L.) genome was analysed using the technique of methylation-sensitive amplified polymorphism (MSAP). The MSAP ratios in radish seedling exposed to cadmium chloride at the concentration of 50, 250 and 500 mg/L were 37%, 43% and 51%, respectively, and the control was 34%; the full methylation levels (C(m)CGG in double strands) were at 23%, 25% and 27%, respectively, while the control was 22%. The level of increase in MSAP and full methylation indicated that de novo methylation occurred in some 5'-CCGG sites under Cd stress. There was significant positive correlation between increase of total DNA methylation level and CdCl(2) concentration. Four types of MSAP patterns: de novo methylation, de-methylation, atypical pattern and no changes of methylation pattern were identified among CdCl(2) treatments and the control. DNA methylation alteration in plants treated with CdCl(2) was mainly through de novo methylation.

  7. The Small Nuclear Genomes of Selaginella Are Associated with a Low Rate of Genome Size Evolution.

    Science.gov (United States)

    Baniaga, Anthony E; Arrigo, Nils; Barker, Michael S

    2016-06-03

    The haploid nuclear genome size (1C DNA) of vascular land plants varies over several orders of magnitude. Much of this observed diversity in genome size is due to the proliferation and deletion of transposable elements. To date, all vascular land plant lineages with extremely small nuclear genomes represent recently derived states, having ancestors with much larger genome sizes. The Selaginellaceae represent an ancient lineage with extremely small genomes. It is unclear how small nuclear genomes evolved in Selaginella We compared the rates of nuclear genome size evolution in Selaginella and major vascular plant clades in a comparative phylogenetic framework. For the analyses, we collected 29 new flow cytometry estimates of haploid genome size in Selaginella to augment publicly available data. Selaginella possess some of the smallest known haploid nuclear genome sizes, as well as the lowest rate of genome size evolution observed across all vascular land plants included in our analyses. Additionally, our analyses provide strong support for a history of haploid nuclear genome size stasis in Selaginella Our results indicate that Selaginella, similar to other early diverging lineages of vascular land plants, has relatively low rates of genome size evolution. Further, our analyses highlight that a rapid transition to a small genome size is only one route to an extremely small genome. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  8. Genomic and secondary metabolite analyses of Streptomyces sp. 2AW provide insight into the evolution of the cycloheximide pathway

    Directory of Open Access Journals (Sweden)

    Elizabeth eStulberg

    2016-05-01

    Full Text Available The dearth of new antibiotics in the face of widespread antimicrobial resistance makes developing innovative strategies for discovering new antibiotics critical for the future management of infectious disease. Understanding the genetics and evolution of antibiotic producers will help guide the discovery and bioengineering of novel antibiotics. We discovered an isolate in Alaskan boreal forest soil that had broad antimicrobial activity. We elucidated the corresponding antimicrobial natural products and sequenced the genome of this isolate, designated Streptomyces sp. 2AW. This strain illustrates the chemical virtuosity typical of the Streptomyces genus, producing cycloheximide as well as two other biosynthetically unrelated antibiotics, neutramycin and hygromycin A. Combining bioinformatic and chemical analyses, we identified the gene clusters responsible for antibiotic production. Interestingly, 2AW appears dissimilar from other cycloheximide producers in that the gene encoding the polyketide synthase resides on a separate part of the chromosome from the genes responsible for tailoring cycloheximide-specific modifications. This gene arrangement and our phylogenetic analyses of the gene products suggest that 2AW holds an evolutionarily ancestral lineage of the cycloheximide pathway. Our analyses support the hypothesis that the 2AW glutaramide gene cluster is basal to the lineage wherein cycloheximide production diverged from other glutarimide antibiotics. This study illustrates the power of combining modern biochemical and genomic analyses to gain insight into the evolution of antibiotic-producing microorganisms.

  9. No evidence for genome-wide interactions on plasma fibrinogen by smoking, alcohol consumption and body mass index: results from meta-analyses of 80,607 subjects.

    Directory of Open Access Journals (Sweden)

    Jens Baumert

    Full Text Available Plasma fibrinogen is an acute phase protein playing an important role in the blood coagulation cascade having strong associations with smoking, alcohol consumption and body mass index (BMI. Genome-wide association studies (GWAS have identified a variety of gene regions associated with elevated plasma fibrinogen concentrations. However, little is yet known about how associations between environmental factors and fibrinogen might be modified by genetic variation. Therefore, we conducted large-scale meta-analyses of genome-wide interaction studies to identify possible interactions of genetic variants and smoking status, alcohol consumption or BMI on fibrinogen concentration. The present study included 80,607 subjects of European ancestry from 22 studies. Genome-wide interaction analyses were performed separately in each study for about 2.6 million single nucleotide polymorphisms (SNPs across the 22 autosomal chromosomes. For each SNP and risk factor, we performed a linear regression under an additive genetic model including an interaction term between SNP and risk factor. Interaction estimates were meta-analysed using a fixed-effects model. No genome-wide significant interaction with smoking status, alcohol consumption or BMI was observed in the meta-analyses. The most suggestive interaction was found for smoking and rs10519203, located in the LOC123688 region on chromosome 15, with a p value of 6.2 × 10(-8. This large genome-wide interaction study including 80,607 participants found no strong evidence of interaction between genetic variants and smoking status, alcohol consumption or BMI on fibrinogen concentrations. Further studies are needed to yield deeper insight in the interplay between environmental factors and gene variants on the regulation of fibrinogen concentrations.

  10. Comparative genomic characterization of three Streptococcus parauberis strains in fish pathogen, as assessed by wide-genome analyses.

    Directory of Open Access Journals (Sweden)

    Seong-Won Nho

    Full Text Available Streptococcus parauberis, which is the main causative agent of streptococcosis among olive flounder (Paralichthys olivaceus in northeast Asia, can be distinctly divided into two groups (type I and type II by an agglutination test. Here, the whole genome sequences of two Japanese strains (KRS-02083 and KRS-02109 were determined and compared with the previously determined genome of a Korean strain (KCTC 11537. The genomes of S. parauberis are intermediate in size and have lower GC contents than those of other streptococci. We annotated 2,236 and 2,048 genes in KRS-02083 and KRS-02109, respectively. Our results revealed that the three S. parauberis strains contain different genomic insertions and deletions. In particular, the genomes of Korean and Japanese strains encode different factors for sugar utilization; the former encodes the phosphotransferase system (PTS for sorbose, whereas the latter encodes proteins for lactose hydrolysis, respectively. And the KRS-02109 strain, specifically, was the type II strain found to be able to resist phage infection through the clustered regularly interspaced short palindromic repeats (CRISPR/Cas system and which might contribute valuably to serologically distribution. Thus, our genome-wide association study shows that polymorphisms can affect pathogen responses, providing insight into biological/biochemical pathways and phylogenetic diversity.

  11. Crowdsourcing genomic analyses of ash and ash dieback – power to the people

    Directory of Open Access Journals (Sweden)

    MacLean Dan

    2013-02-01

    Full Text Available Abstract Ash dieback is a devastating fungal disease of ash trees that has swept across Europe and recently reached the UK. This emergent pathogen has received little study in the past and its effect threatens to overwhelm the ash population. In response to this we have produced some initial genomics datasets and taken the unusual step of releasing them to the scientific community for analysis without first performing our own. In this manner we hope to ‘crowdsource’ analyses and bring the expertise of the community to bear on this problem as quickly as possible. Our data has been released through our website at oadb.tsl.ac.uk and a public GitHub repository.

  12. Comparative genomic and proteomic analyses of Clostridium acetobutylicum Rh8 and its parent strain DSM 1731 revealed new understandings on butanol tolerance

    International Nuclear Information System (INIS)

    Bao, Guanhui; Dong, Hongjun; Zhu, Yan; Mao, Shaoming; Zhang, Tianrui; Zhang, Yanping; Chen, Zugen; Li, Yin

    2014-01-01

    Highlights: • Genomes of a butanol tolerant strain and its parent strain were deciphered. • Comparative genomic and proteomic was applied to understand butanol tolerance. • None differentially expressed proteins have mutations in its corresponding genes. • Mutations in ribosome might be responsible for the global difference of proteomics. - Abstract: Clostridium acetobutylicum strain Rh8 is a butanol-tolerant mutant which can tolerate up to 19 g/L butanol, 46% higher than that of its parent strain DSM 1731. We previously performed comparative cytoplasm- and membrane-proteomic analyses to understand the mechanism underlying the improved butanol tolerance of strain Rh8. In this work, we further extended this comparison to the genomic level. Compared with the genome of the parent strain DSM 1731, two insertion sites, four deletion sites, and 67 single nucleotide variations (SNVs) are distributed throughout the genome of strain Rh8. Among the 67 SNVs, 16 SNVs are located in the predicted promoters and intergenic regions; while 29 SNVs are located in the coding sequence, affecting a total of 21 proteins involved in transport, cell structure, DNA replication, and protein translation. The remaining 22 SNVs are located in the ribosomal genes, affecting a total of 12 rRNA genes in different operons. Analysis of previous comparative proteomic data indicated that none of the differentially expressed proteins have mutations in its corresponding genes. Rchange Algorithms analysis indicated that the mutations occurred in the ribosomal genes might change the ribosome RNA thermodynamic characteristics, thus affect the translation strength of these proteins. Take together, the improved butanol tolerance of C. acetobutylicum strain Rh8 might be acquired through regulating the translational process to achieve different expression strength of genes involved in butanol tolerance

  13. Comparative genomic and proteomic analyses of Clostridium acetobutylicum Rh8 and its parent strain DSM 1731 revealed new understandings on butanol tolerance

    Energy Technology Data Exchange (ETDEWEB)

    Bao, Guanhui [CAS Key Laboratory of Microbial Physiological and Metabolic Engineering, Institute of Microbiology, Chinese Academy of Sciences, Beijing (China); University of Chinese Academy of Sciences, Beijing (China); Dong, Hongjun; Zhu, Yan; Mao, Shaoming [CAS Key Laboratory of Microbial Physiological and Metabolic Engineering, Institute of Microbiology, Chinese Academy of Sciences, Beijing (China); Zhang, Tianrui [CAS Key Laboratory of Microbial Physiological and Metabolic Engineering, Institute of Microbiology, Chinese Academy of Sciences, Beijing (China); Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin (China); Zhang, Yanping [CAS Key Laboratory of Microbial Physiological and Metabolic Engineering, Institute of Microbiology, Chinese Academy of Sciences, Beijing (China); Chen, Zugen [Department of Human Genetics, School of Medicine, University of California, Los Angeles, CA 90095 (United States); Li, Yin, E-mail: yli@im.ac.cn [CAS Key Laboratory of Microbial Physiological and Metabolic Engineering, Institute of Microbiology, Chinese Academy of Sciences, Beijing (China)

    2014-08-08

    Highlights: • Genomes of a butanol tolerant strain and its parent strain were deciphered. • Comparative genomic and proteomic was applied to understand butanol tolerance. • None differentially expressed proteins have mutations in its corresponding genes. • Mutations in ribosome might be responsible for the global difference of proteomics. - Abstract: Clostridium acetobutylicum strain Rh8 is a butanol-tolerant mutant which can tolerate up to 19 g/L butanol, 46% higher than that of its parent strain DSM 1731. We previously performed comparative cytoplasm- and membrane-proteomic analyses to understand the mechanism underlying the improved butanol tolerance of strain Rh8. In this work, we further extended this comparison to the genomic level. Compared with the genome of the parent strain DSM 1731, two insertion sites, four deletion sites, and 67 single nucleotide variations (SNVs) are distributed throughout the genome of strain Rh8. Among the 67 SNVs, 16 SNVs are located in the predicted promoters and intergenic regions; while 29 SNVs are located in the coding sequence, affecting a total of 21 proteins involved in transport, cell structure, DNA replication, and protein translation. The remaining 22 SNVs are located in the ribosomal genes, affecting a total of 12 rRNA genes in different operons. Analysis of previous comparative proteomic data indicated that none of the differentially expressed proteins have mutations in its corresponding genes. Rchange Algorithms analysis indicated that the mutations occurred in the ribosomal genes might change the ribosome RNA thermodynamic characteristics, thus affect the translation strength of these proteins. Take together, the improved butanol tolerance of C. acetobutylicum strain Rh8 might be acquired through regulating the translational process to achieve different expression strength of genes involved in butanol tolerance.

  14. EUPAN enables pan-genome studies of a large number of eukaryotic genomes.

    Science.gov (United States)

    Hu, Zhiqiang; Sun, Chen; Lu, Kuang-Chen; Chu, Xixia; Zhao, Yue; Lu, Jinyuan; Shi, Jianxin; Wei, Chaochun

    2017-08-01

    Pan-genome analyses are routinely carried out for bacteria to interpret the within-species gene presence/absence variations (PAVs). However, pan-genome analyses are rare for eukaryotes due to the large sizes and higher complexities of their genomes. Here we proposed EUPAN, a eukaryotic pan-genome analysis toolkit, enabling automatic large-scale eukaryotic pan-genome analyses and detection of gene PAVs at a relatively low sequencing depth. In the previous studies, we demonstrated the effectiveness and high accuracy of EUPAN in the pan-genome analysis of 453 rice genomes, in which we also revealed widespread gene PAVs among individual rice genomes. Moreover, EUPAN can be directly applied to the current re-sequencing projects primarily focusing on single nucleotide polymorphisms. EUPAN is implemented in Perl, R and C ++. It is supported under Linux and preferred for a computer cluster with LSF and SLURM job scheduling system. EUPAN together with its standard operating procedure (SOP) is freely available for non-commercial use (CC BY-NC 4.0) at http://cgm.sjtu.edu.cn/eupan/index.html . ccwei@sjtu.edu.cn or jianxin.shi@sjtu.edu.cn. Supplementary data are available at Bioinformatics online. © The Author (2017). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com

  15. Genomic analyses inform on migration events during the peopling of Eurasia

    OpenAIRE

    Pagani, L; Lawson, DJ; Jagoda, E; Mörseburg, A; Eriksson, A; Mitt, M; Clemente, F; Hudjashov, G; Degiorgio, M; Saag, L; Wall, JD; Cardona, A; Mägi, R; Sayres, MAW; Kaewert, S

    2016-01-01

    © 2016 Macmillan Publishers Limited, part of Springer Nature. High-Coverage whole-genome sequence studies have so far focused on a limited number of geographically restricted populations, or been targeted at specific diseases, such as cancer. Nevertheless, the availability of high-resolution genomic data has led to the development of new methodologies for inferring population history and refuelled the debate on the mutation rate in humans. Here we present the Estonian Biocentre Human Genome D...

  16. Quantifying Temporal Genomic Erosion in Endangered Species.

    Science.gov (United States)

    Díez-Del-Molino, David; Sánchez-Barreiro, Fatima; Barnes, Ian; Gilbert, M Thomas P; Dalén, Love

    2018-03-01

    Many species have undergone dramatic population size declines over the past centuries. Although stochastic genetic processes during and after such declines are thought to elevate the risk of extinction, comparative analyses of genomic data from several endangered species suggest little concordance between genome-wide diversity and current population sizes. This is likely because species-specific life-history traits and ancient bottlenecks overshadow the genetic effect of recent demographic declines. Therefore, we advocate that temporal sampling of genomic data provides a more accurate approach to quantify genetic threats in endangered species. Specifically, genomic data from predecline museum specimens will provide valuable baseline data that enable accurate estimation of recent decreases in genome-wide diversity, increases in inbreeding levels, and accumulation of deleterious genetic variation. Copyright © 2017 Elsevier Ltd. All rights reserved.

  17. Genomic and proteomic analyses of the fungus Arthrobotrys oligospora provide insights into nematode-trap formation.

    Science.gov (United States)

    Yang, Jinkui; Wang, Lei; Ji, Xinglai; Feng, Yun; Li, Xiaomin; Zou, Chenggang; Xu, Jianping; Ren, Yan; Mi, Qili; Wu, Junli; Liu, Shuqun; Liu, Yu; Huang, Xiaowei; Wang, Haiyan; Niu, Xuemei; Li, Juan; Liang, Lianming; Luo, Yanlu; Ji, Kaifang; Zhou, Wei; Yu, Zefen; Li, Guohong; Liu, Yajun; Li, Lei; Qiao, Min; Feng, Lu; Zhang, Ke-Qin

    2011-09-01

    Nematode-trapping fungi are "carnivorous" and attack their hosts using specialized trapping devices. The morphological development of these traps is the key indicator of their switch from saprophytic to predacious lifestyles. Here, the genome of the nematode-trapping fungus Arthrobotrys oligospora Fres. (ATCC24927) was reported. The genome contains 40.07 Mb assembled sequence with 11,479 predicted genes. Comparative analysis showed that A. oligospora shared many more genes with pathogenic fungi than with non-pathogenic fungi. Specifically, compared to several sequenced ascomycete fungi, the A. oligospora genome has a larger number of pathogenicity-related genes in the subtilisin, cellulase, cellobiohydrolase, and pectinesterase gene families. Searching against the pathogen-host interaction gene database identified 398 homologous genes involved in pathogenicity in other fungi. The analysis of repetitive sequences provided evidence for repeat-induced point mutations in A. oligospora. Proteomic and quantitative PCR (qPCR) analyses revealed that 90 genes were significantly up-regulated at the early stage of trap-formation by nematode extracts and most of these genes were involved in translation, amino acid metabolism, carbohydrate metabolism, cell wall and membrane biogenesis. Based on the combined genomic, proteomic and qPCR data, a model for the formation of nematode trapping device in this fungus was proposed. In this model, multiple fungal signal transduction pathways are activated by its nematode prey to further regulate downstream genes associated with diverse cellular processes such as energy metabolism, biosynthesis of the cell wall and adhesive proteins, cell division, glycerol accumulation and peroxisome biogenesis. This study will facilitate the identification of pathogenicity-related genes and provide a broad foundation for understanding the molecular and evolutionary mechanisms underlying fungi-nematodes interactions.

  18. Analyses of Tissue Culture Adaptation of Human Herpesvirus-6A by Whole Genome Deep Sequencing Redefines the Reference Sequence and Identifies Virus Entry Complex Changes.

    Science.gov (United States)

    Tweedy, Joshua G; Escriva, Eric; Topf, Maya; Gompels, Ursula A

    2017-12-31

    Tissue-culture adaptation of viruses can modulate infection. Laboratory passage and bacterial artificial chromosome (BAC)mid cloning of human cytomegalovirus, HCMV, resulted in genomic deletions and rearrangements altering genes encoding the virus entry complex, which affected cellular tropism, virulence, and vaccine development. Here, we analyse these effects on the reference genome for related betaherpesviruses, Roseolovirus, human herpesvirus 6A (HHV-6A) strain U1102. This virus is also naturally "cloned" by germline subtelomeric chromosomal-integration in approximately 1% of human populations, and accurate references are key to understanding pathological relationships between exogenous and endogenous virus. Using whole genome next-generation deep-sequencing Illumina-based methods, we compared the original isolate to tissue-culture passaged and the BACmid-cloned virus. This re-defined the reference genome showing 32 corrections and 5 polymorphisms. Furthermore, minor variant analyses of passaged and BACmid virus identified emerging populations of a further 32 single nucleotide polymorphisms (SNPs) in 10 loci, half non-synonymous indicating cell-culture selection. Analyses of the BAC-virus genome showed deletion of the BAC cassette via loxP recombination removing green fluorescent protein (GFP)-based selection. As shown for HCMV culture effects, select HHV-6A SNPs mapped to genes encoding mediators of virus cellular entry, including virus envelope glycoprotein genes gB and the gH/gL complex. Comparative models suggest stabilisation of the post-fusion conformation. These SNPs are essential to consider in vaccine-design, antimicrobial-resistance, and pathogenesis.

  19. Comparative scaffolding and gap filling of ancient bacterial genomes applied to two ancient Yersinia pestis genomes

    Science.gov (United States)

    Doerr, Daniel; Chauve, Cedric

    2017-01-01

    Yersinia pestis is the causative agent of the bubonic plague, a disease responsible for several dramatic historical pandemics. Progress in ancient DNA (aDNA) sequencing rendered possible the sequencing of whole genomes of important human pathogens, including the ancient Y. pestis strains responsible for outbreaks of the bubonic plague in London in the 14th century and in Marseille in the 18th century, among others. However, aDNA sequencing data are still characterized by short reads and non-uniform coverage, so assembling ancient pathogen genomes remains challenging and often prevents a detailed study of genome rearrangements. It has recently been shown that comparative scaffolding approaches can improve the assembly of ancient Y. pestis genomes at a chromosome level. In the present work, we address the last step of genome assembly, the gap-filling stage. We describe an optimization-based method AGapEs (ancestral gap estimation) to fill in inter-contig gaps using a combination of a template obtained from related extant genomes and aDNA reads. We show how this approach can be used to refine comparative scaffolding by selecting contig adjacencies supported by a mix of unassembled aDNA reads and comparative signal. We applied our method to two Y. pestis data sets from the London and Marseilles outbreaks, for which we obtained highly improved genome assemblies for both genomes, comprised of, respectively, five and six scaffolds with 95 % of the assemblies supported by ancient reads. We analysed the genome evolution between both ancient genomes in terms of genome rearrangements, and observed a high level of synteny conservation between these strains. PMID:29114402

  20. Integrated proteomic and genomic analysis of colorectal cancer

    Science.gov (United States)

    Investigators who analyzed 95 human colorectal tumor samples have determined how gene alterations identified in previous analyses of the same samples are expressed at the protein level. The integration of proteomic and genomic data, or proteogenomics, pro

  1. Genomic analyses inform on migration events during the peopling of Eurasia

    OpenAIRE

    Pagani, L.; Lawson, D. J.; Jagoda, E.; Moerseburg, A.; Eriksson, A.; Mitt, M.; Clemente, F.; Hudjashov, G.; DeGiorgio, M.; Saag, L.; Wall, J. D.; Cardona, A.; Maegi, R.; Sayres, M. A. W.; Kaewert, S.

    2016-01-01

    High-coverage whole-genome sequence studies have so far focused on a limited number1 of geographically restricted populations2, 3, 4, 5, or been targeted at specific diseases, such as cancer6. Nevertheless, the availability of high-resolution genomic data has led to the development of new methodologies for inferring population history7, 8, 9 and refuelled the debate on the mutation rate in humans10. Here we present the Estonian Biocentre Human Genome Diversity Panel (EGDP), a dataset of 483 h...

  2. Genomic analyses inform on migration events during the peopling of Eurasia

    OpenAIRE

    Pagani, Luca; Lawson, Daniel John; Jagoda, Evelyn; M?rseburg, Alexander; Eriksson, Anders; Mitt, Mario; Clemente, Florian; Hudjashov, Georgi; DeGiorgio, Michael; Saag, Lauri; Wall, Jeffrey D.; Cardona, Alexia; M?gi, Reedik; Wilson Sayres, Melissa A.; Kaewert, Sarah

    2016-01-01

    High-coverage whole-genome sequence studies have so far focused\\ud on a limited number1 of geographically restricted populations2–5,\\ud or been targeted at specific diseases, such as cancer6. Nevertheless,\\ud the availability of high-resolution genomic data has led to the\\ud development of new methodologies for inferring population\\ud history7–9 and refuelled the debate on the mutation rate in humans10.\\ud Here we present the Estonian Biocentre Human Genome Diversity\\ud Panel (EGDP), a datase...

  3. Genomic Analyses Reveal the Influence of Geographic Origin, Migration, and Hybridization on Modern Dog Breed Development

    Directory of Open Access Journals (Sweden)

    Heidi G. Parker

    2017-04-01

    Full Text Available There are nearly 400 modern domestic dog breeds with a unique histories and genetic profiles. To track the genetic signatures of breed development, we have assembled the most diverse dataset of dog breeds, reflecting their extensive phenotypic variation and heritage. Combining genetic distance, migration, and genome-wide haplotype sharing analyses, we uncover geographic patterns of development and independent origins of common traits. Our analyses reveal the hybrid history of breeds and elucidate the effects of immigration, revealing for the first time a suggestion of New World dog within some modern breeds. Finally, we used cladistics and haplotype sharing to show that some common traits have arisen more than once in the history of the dog. These analyses characterize the complexities of breed development, resolving longstanding questions regarding individual breed origination, the effect of migration on geographically distinct breeds, and, by inference, transfer of trait and disease alleles among dog breeds.

  4. Complete sequences of organelle genomes from the medicinal plant Rhazya stricta (Apocynaceae) and contrasting patterns of mitochondrial genome evolution across asterids.

    Science.gov (United States)

    Park, Seongjun; Ruhlman, Tracey A; Sabir, Jamal S M; Mutwakil, Mohammed H Z; Baeshen, Mohammed N; Sabir, Meshaal J; Baeshen, Nabih A; Jansen, Robert K

    2014-05-28

    Rhazya stricta is native to arid regions in South Asia and the Middle East and is used extensively in folk medicine to treat a wide range of diseases. In addition to generating genomic resources for this medicinally important plant, analyses of the complete plastid and mitochondrial genomes and a nuclear transcriptome from Rhazya provide insights into inter-compartmental transfers between genomes and the patterns of evolution among eight asterid mitochondrial genomes. The 154,841 bp plastid genome is highly conserved with gene content and order identical to the ancestral organization of angiosperms. The 548,608 bp mitochondrial genome exhibits a number of phenomena including the presence of recombinogenic repeats that generate a multipartite organization, transferred DNA from the plastid and nuclear genomes, and bidirectional DNA transfers between the mitochondrion and the nucleus. The mitochondrial genes sdh3 and rps14 have been transferred to the nucleus and have acquired targeting presequences. In the case of rps14, two copies are present in the nucleus; only one has a mitochondrial targeting presequence and may be functional. Phylogenetic analyses of both nuclear and mitochondrial copies of rps14 across angiosperms suggests Rhazya has experienced a single transfer of this gene to the nucleus, followed by a duplication event. Furthermore, the phylogenetic distribution of gene losses and the high level of sequence divergence in targeting presequences suggest multiple, independent transfers of both sdh3 and rps14 across asterids. Comparative analyses of mitochondrial genomes of eight sequenced asterids indicates a complicated evolutionary history in this large angiosperm clade with considerable diversity in genome organization and size, repeat, gene and intron content, and amount of foreign DNA from the plastid and nuclear genomes. Organelle genomes of Rhazya stricta provide valuable information for improving the understanding of mitochondrial genome evolution

  5. Comparative Genomics Reveals High Genomic Diversity in the Genus Photobacterium.

    Science.gov (United States)

    Machado, Henrique; Gram, Lone

    2017-01-01

    Vibrionaceae is a large marine bacterial family, which can constitute up to 50% of the prokaryotic population in marine waters. Photobacterium is the second largest genus in the family and we used comparative genomics on 35 strains representing 16 of the 28 species described so far, to understand the genomic diversity present in the Photobacterium genus. Such understanding is important for ecophysiology studies of the genus. We used whole genome sequences to evaluate phylogenetic relationships using several analyses (16S rRNA, MLSA, fur , amino-acid usage, ANI), which allowed us to identify two misidentified strains. Genome analyses also revealed occurrence of higher and lower GC content clades, correlating with phylogenetic clusters. Pan- and core-genome analysis revealed the conservation of 25% of the genome throughout the genus, with a large and open pan-genome. The major source of genomic diversity could be traced to the smaller chromosome and plasmids. Several of the physiological traits studied in the genus did not correlate with phylogenetic data. Since horizontal gene transfer (HGT) is often suggested as a source of genetic diversity and a potential driver of genomic evolution in bacterial species, we looked into evidence of such in Photobacterium genomes. Genomic islands were the source of genomic differences between strains of the same species. Also, we found transposase genes and CRISPR arrays that suggest multiple encounters with foreign DNA. Presence of genomic exchange traits was widespread and abundant in the genus, suggesting a role in genomic evolution. The high genetic variability and indications of genetic exchange make it difficult to elucidate genome evolutionary paths and raise the awareness of the roles of foreign DNA in the genomic evolution of environmental organisms.

  6. Molecular Characterization of Five Potyviruses Infecting Korean Sweet Potatoes Based on Analyses of Complete Genome Sequences

    Directory of Open Access Journals (Sweden)

    Hae-Ryun Kwak

    2015-12-01

    Full Text Available Sweet potatoes (Ipomea batatas L. are grown extensively, in tropical and temperate regions, and are important food crops worldwide. In Korea, potyviruses, including Sweet potato feathery mottle virus (SPFMV, Sweet potato virus C (SPVC, Sweet potato virus G (SPVG, Sweet potato virus 2 (SPV2, and Sweet potato latent virus (SPLV, have been detected in sweet potato fields at a high (~95% incidence. In the present work, complete genome sequences of 18 isolates, representing the five potyviruses mentioned above, were compared with previously reported genome sequences. The complete genomes consisted of 10,081 to 10,830 nucleotides, excluding the poly-A tails. Their genomic organizations were typical of the Potyvirus genus, including one target open reading frame coding for a putative polyprotein. Based on phylogenetic analyses and sequence comparisons, the Korean SPFMV isolates belonged to the strains RC and O with >98% nucleotide sequence identity. Korean SPVC isolates had 99% identity to the Japanese isolate SPVC-Bungo and 70% identity to the SPFMV isolates. The Korean SPVG isolates showed 99% identity to the three previously reported SPVG isolates. Korean SPV2 isolates had 97% identity to the SPV2 GWB-2 isolate from the USA. Korean SPLV isolates had a relatively low (88% nucleotide sequence identity with the Taiwanese SPLV-TW isolates, and they were phylogenetically distantly related to SPFMV isolates. Recombination analysis revealed that possible recombination events occurred in the P1, HC-Pro and NIa-NIb regions of SPFMV and SPLV isolates and these regions were identified as hotspots for recombination in the sweet potato potyviruses.

  7. Convergent evolution of the genomes of marine mammals

    Science.gov (United States)

    Foote, Andrew D.; Liu, Yue; Thomas, Gregg W.C.; Vinař, Tomáš; Alföldi, Jessica; Deng, Jixin; Dugan, Shannon; van Elk, Cornelis E.; Hunter, Margaret; Joshi, Vandita; Khan, Ziad; Kovar, Christie; Lee, Sandra L.; Lindblad-Toh, Kerstin; Mancia, Annalaura; Nielsen, Rasmus; Qin, Xiang; Qu, Jiaxin; Raney, Brian J.; Vijay, Nagarjun; Wolf, Jochen B. W.; Hahn, Matthew W.; Muzny, Donna M.; Worley, Kim C.; Gilbert, M. Thomas P.; Gibbs, Richard A.

    2015-01-01

    Marine mammals from different mammalian orders share several phenotypic traits adapted to the aquatic environment and therefore represent a classic example of convergent evolution. To investigate convergent evolution at the genomic level, we sequenced and performed de novo assembly of the genomes of three species of marine mammals (the killer whale, walrus and manatee) from three mammalian orders that share independently evolved phenotypic adaptations to a marine existence. Our comparative genomic analyses found that convergent amino acid substitutions were widespread throughout the genome and that a subset of these substitutions were in genes evolving under positive selection and putatively associated with a marine phenotype. However, we found higher levels of convergent amino acid substitutions in a control set of terrestrial sister taxa to the marine mammals. Our results suggest that, whereas convergent molecular evolution is relatively common, adaptive molecular convergence linked to phenotypic convergence is comparatively rare.

  8. Association Mapping and the Genomic Consequences of Selection in Sunflower

    Science.gov (United States)

    Mandel, Jennifer R.; Nambeesan, Savithri; Bowers, John E.; Marek, Laura F.; Ebert, Daniel; Rieseberg, Loren H.; Knapp, Steven J.; Burke, John M.

    2013-01-01

    The combination of large-scale population genomic analyses and trait-based mapping approaches has the potential to provide novel insights into the evolutionary history and genome organization of crop plants. Here, we describe the detailed genotypic and phenotypic analysis of a sunflower (Helianthus annuus L.) association mapping population that captures nearly 90% of the allelic diversity present within the cultivated sunflower germplasm collection. We used these data to characterize overall patterns of genomic diversity and to perform association analyses on plant architecture (i.e., branching) and flowering time, successfully identifying numerous associations underlying these agronomically and evolutionarily important traits. Overall, we found variable levels of linkage disequilibrium (LD) across the genome. In general, islands of elevated LD correspond to genomic regions underlying traits that are known to have been targeted by selection during the evolution of cultivated sunflower. In many cases, these regions also showed significantly elevated levels of differentiation between the two major sunflower breeding groups, consistent with the occurrence of divergence due to strong selection. One of these regions, which harbors a major branching locus, spans a surprisingly long genetic interval (ca. 25 cM), indicating the occurrence of an extended selective sweep in an otherwise recombinogenic interval. PMID:23555290

  9. Genome sequence and transcriptome analyses of the thermophilic zygomycete fungus Rhizomucor miehei.

    Science.gov (United States)

    Zhou, Peng; Zhang, Guoqiang; Chen, Shangwu; Jiang, Zhengqiang; Tang, Yanbin; Henrissat, Bernard; Yan, Qiaojuan; Yang, Shaoqing; Chen, Chin-Fu; Zhang, Bing; Du, Zhenglin

    2014-04-21

    The zygomycete fungi like Rhizomucor miehei have been extensively exploited for the production of various enzymes. As a thermophilic fungus, R. miehei is capable of growing at temperatures that approach the upper limits for all eukaryotes. To date, over hundreds of fungal genomes are publicly available. However, Zygomycetes have been rarely investigated both genetically and genomically. Here, we report the genome of R. miehei CAU432 to explore the thermostable enzymatic repertoire of this fungus. The assembled genome size is 27.6-million-base (Mb) with 10,345 predicted protein-coding genes. Even being thermophilic, the G + C contents of fungal whole genome (43.8%) and coding genes (47.4%) are less than 50%. Phylogenetically, R. miehei is more closerly related to Phycomyces blakesleeanus than to Mucor circinelloides and Rhizopus oryzae. The genome of R. miehei harbors a large number of genes encoding secreted proteases, which is consistent with the characteristics of R. miehei being a rich producer of proteases. The transcriptome profile of R. miehei showed that the genes responsible for degrading starch, glucan, protein and lipid were highly expressed. The genome information of R. miehei will facilitate future studies to better understand the mechanisms of fungal thermophilic adaptation and the exploring of the potential of R. miehei in industrial-scale production of thermostable enzymes. Based on the existence of a large repertoire of amylolytic, proteolytic and lipolytic genes in the genome, R. miehei has potential in the production of a variety of such enzymes.

  10. Genetic basis for spontaneous hybrid genome doubling during allopolyploid speciation of common wheat shown by natural variation analyses of the paternal species.

    Directory of Open Access Journals (Sweden)

    Yoshihiro Matsuoka

    Full Text Available The complex process of allopolyploid speciation includes various mechanisms ranging from species crosses and hybrid genome doubling to genome alterations and the establishment of new allopolyploids as persisting natural entities. Currently, little is known about the genetic mechanisms that underlie hybrid genome doubling, despite the fact that natural allopolyploid formation is highly dependent on this phenomenon. We examined the genetic basis for the spontaneous genome doubling of triploid F1 hybrids between the direct ancestors of allohexaploid common wheat (Triticum aestivum L., AABBDD genome, namely Triticumturgidum L. (AABB genome and Aegilopstauschii Coss. (DD genome. An Ae. tauschii intraspecific lineage that is closely related to the D genome of common wheat was identified by population-based analysis. Two representative accessions, one that produces a high-genome-doubling-frequency hybrid when crossed with a T. turgidum cultivar and the other that produces a low-genome-doubling-frequency hybrid with the same cultivar, were chosen from that lineage for further analyses. A series of investigations including fertility analysis, immunostaining, and quantitative trait locus (QTL analysis showed that (1 production of functional unreduced gametes through nonreductional meiosis is an early step key to successful hybrid genome doubling, (2 first division restitution is one of the cytological mechanisms that cause meiotic nonreduction during the production of functional male unreduced gametes, and (3 six QTLs in the Ae. tauschii genome, most of which likely regulate nonreductional meiosis and its subsequent gamete production processes, are involved in hybrid genome doubling. Interlineage comparisons of Ae. tauschii's ability to cause hybrid genome doubling suggested an evolutionary model for the natural variation pattern of the trait in which non-deleterious mutations in six QTLs may have important roles. The findings of this study demonstrated

  11. Genome-Wide Association Meta-Analyses to Identify Common Genetic Variants Associated with Hallux Valgus in Caucasian and African Americans

    Science.gov (United States)

    Hsu, Yi-Hsiang; Liu, Youfang; Hannan, Marian T.; Maixner, William; Smith, Shad B.; Diatchenko, Luda; Golightly, Yvonne M.; Menz, Hylton B.; Kraus, Virginia B.; Doherty, Michael; Wilson, A.G.; Jordan, Joanne M.

    2016-01-01

    Objective Hallux valgus (HV) affects ~36% of Caucasian adults. Although considered highly heritable, the underlying genetic determinants are unclear. We conducted the first genome-wide association study (GWAS) aimed to identify genetic variants associated with HV. Methods HV was assessed in 3 Caucasian cohorts (n=2,263, n=915, and n=1,231 participants, respectively). In each cohort, a GWAS was conducted using 2.5M imputed single nucleotide polymorphisms (SNPs). Mixed-effect regression with the additive genetic model adjusted for age, sex, weight and within-family correlations was used for both sex-specific and combined analyses. To combine GWAS results across cohorts, fixed-effect inverse-variance meta-analyses were used. Following meta-analyses, top-associated findings were also examined in an African American cohort (n=327). Results The proportion of HV variance explained by genome-wide genotyped SNPs was 50% in men and 48% in women. A higher proportion of genetic determinants of HV was sex-specific. The most significantly associated SNP in men was rs9675316 located on chr17q23-a24 near the AXIN2 gene (p=5.46×10−7); the most significantly associated SNP in women was rs7996797 located on chr13q14.1-q14.2 near the ESD gene (p=7.21×10−7). Genome-wide significant SNP-by-sex interaction was found for SNP rs1563374 located on chr11p15.1 near the MRGPRX3 gene (interaction p-value =4.1×10−9). The association signals diminished when combining men and women. Conclusion Findings suggest that the potential pathophysiological mechanisms of HV are complex and strongly underlined by sex-specific interactions. The identified genetic variants imply contribution of biological pathways observed in osteoarthritis as well as new pathways, influencing skeletal development and inflammation. PMID:26337638

  12. Genome reorganization in Nicotiana asymmetric somatic hybrids analysed by in situ hybridization

    International Nuclear Information System (INIS)

    Parokonny, A.S.; Kenton, A.Y.; Gleba, Y.Y.; Bennett, M.D.

    1992-01-01

    In situ hybridization was used to examine genome reorganization in asymmetric somatic hybrids between Nicotiana plumbaginifolia and Nicotiana sylvestris obtained by fusion of gamma-irradiated protoplasts from one of the parents (donor) with non-irradiated protoplasts from the other (recipient). Probing with biotinylated total genomic DNA from either the donor or the recipient species unequivocally identified genetic material from both parents in 31 regenerant plants, each originating from a different nuclear hybrid colony. This method, termed genomic in situ hybridization (GISH), allowed intergenomic translocations containing chromosome segments from both species to be recognized in four regenerants. A probe homologous to the consensus sequence of the Arabidopsis thaliana telomeric repeat (5'-TTTAGGG-3')n, identified telomeres on all chromosomes, including 'mini-chromosomes' originating from the irradiated donor genome. Genomic in situ hybridization to plant chromosomes provides a rapid and reliable means of screening for recombinant genotypes in asymmetric somatic hybrids. Used in combination with other DNA probes, it also contributes to a greater understanding of the events responsible for genomic recovery and restabilization following genetic manipulation in vitro

  13. A Rapid and Efficient Method for Purifying High Quality Total RNA from Peaches (Prunus persica for Functional Genomics Analyses

    Directory of Open Access Journals (Sweden)

    LEE MEISEL

    2005-01-01

    Full Text Available Prunus persica has been proposed as a genomic model for deciduous trees and the Rosaceae family. Optimized protocols for RNA isolation are necessary to further advance studies in this model species such that functional genomics analyses may be performed. Here we present an optimized protocol to rapidly and efficiently purify high quality total RNA from peach fruits (Prunus persica. Isolating high-quality RNA from fruit tissue is often difficult due to large quantities of polysaccharides and polyphenolic compounds that accumulate in this tissue and co-purify with the RNA. Here we demonstrate that a modified version of the method used to isolate RNA from pine trees and the woody plant Cinnamomun tenuipilum is ideal for isolating high quality RNA from the fruits of Prunus persica. This RNA may be used for many functional genomic based experiments such as RT-PCR and the construction of large-insert cDNA libraries.

  14. Microbial ecology in the age of genomics and metagenomics: concepts, tools, and recent advances.

    Science.gov (United States)

    Xu, Jianping

    2006-06-01

    genomes, a result consistent with those from multilocus sequence typing and representational difference analyses. The integration of various levels of ecological analyses coupled to the application and further development of high throughput technologies are accelerating the pace of discovery in microbial ecology.

  15. Modulation of genetic associations with serum urate levels by body-mass-index in humans

    NARCIS (Netherlands)

    J.E. Huffman (Jennifer); E. Albrecht (Eva); A. Teumer (Alexander); M. Mangino (Massimo); K. Kapur (Karen); T. Johnson (Toby); Z. Kutalik (Zoltán); N. Pirastu (Nicola); G. Pistis (Giorgio); L.M. Lopez (Lorna); T. Haller (Toomas); P. Salo (Perttu); A. Goel (Anuj); M. Li (Man); T. Tanaka (Toshiko); A. Dehghan (Abbas); D. Ruggiero; G. Malerba (Giovanni); A.V. Smith (Albert Vernon); Nolte, I.M. (Ilja M.); L. Portas (Laura); Phipps-Green, A. (Amanda); Boteva, L. (Lora); P. Navarro (Pau); A. Johansson (Åsa); A.A. Hicks (Andrew); O. Polasek (Ozren); T. Esko (Tõnu); J. Peden (John); S.E. Harris (Sarah); D. Murgia (Daniela); Wild, S.H. (Sarah H.); A. Tenesa (Albert); A. Tin (Adrienne); E. Mihailov (Evelin); A. Grotevendt (Anne); G.K. Gislason; J. Coresh (Josef); P. d' Adamo (Pio); S. Ulivi (Shelia); P. Vollenweider (Peter); G. Waeber (Gérard); Campbell, S. (Susan); I. Kolcic (Ivana); Fisher, K. (Krista); M. Viigimaa (Margus); Metter, J.E. (Jeffrey E.); C. Masciullo (Corrado); Trabetti, E. (Elisabetta); Bombieri, C. (Cristina); R. Sorice; A. Döring (Angela); G. Reischl (Gunilla); K. Strauch (Konstantin); A. Hofman (Albert); A.G. Uitterlinden (André); M. Waldenberger (Melanie); H.E. Wichmann (Heinz Erich); G. Davies (Gail); A.J. Gow (Alan J.); Dalbeth, N. (Nicola); Stamp, L. (Lisa); Smit, J.H. (Johannes H.); M. Kirin (Mirna); R. Nagaraja (Ramaiah); M. Nauck (Matthias); C. Schurmann (Claudia); K. Budde (Klemens); S.M. Farrington (Susan); E. Theodoratou (Evropi); A. Jula (Antti); V. Salomaa (Veikko); C. Sala (Cinzia); C. Hengstenberg (Christian); M. Burnier (Michel); Mägi, R. (Reedik); N. Klopp (Norman); S. Kloiber (Stefan); S. Schipf (Sabine); S. Ripatti (Samuli); Cabras, S. (Stefano); N. Soranzo (Nicole); G. Homuth (Georg); T. Nutile; P. Munroe (Patricia); N. Hastie (Nick); H. Campbell (H.); I. Rudan (Igor); Cabrera, C. (Claudia); Haley, C. (Chris); O.H. Franco (Oscar); Merriman, T.R. (Tony R.); V. Gudnason (Vilmundur); M. Pirastu (Mario); B.W.J.H. Penninx (Brenda); H. Snieder (Harold); A. Metspalu (Andres); M. Ciullo; P.P. Pramstaller (Peter Paul); C.M. van Duijn (Cornelia); L. Ferrucci (Luigi); G. Gambaro (Giovanni); Deary, I.J. (Ian J.); M.G. Dunlop (Malcolm); J.F. Wilson (James F); P. Gasparini (Paolo); U. Gyllensten (Ulf); T.D. Spector (Timothy); A.F. Wright (Alan); C. Hayward (Caroline); H. Watkins (Hugh); M. Perola (Markus); M. Bochud (Murielle); W.H.L. Kao (Wen); M. Caulfield (Mark); D. Toniolo (Daniela); H. Völzke (Henry); C. Gieger (Christian); A. Köttgen (Anna); V. Vitart (Veronique)

    2015-01-01

    textabstractWe tested for interactions between body mass index (BMI) and common genetic variants affecting serum urate levels, genome-wide, in up to 42569 participants. Both stratified genome-wide association (GWAS) analyses, in lean, overweight and obese individuals, and regression-type analyses in

  16. Molecular and FISH analyses of a 53-kbp intact DNA fragment inserted by biolistics in wheat (Triticum aestivum L.) genome.

    Science.gov (United States)

    Partier, A; Gay, G; Tassy, C; Beckert, M; Feuillet, C; Barret, P

    2017-10-01

    A large, 53-kbp, intact DNA fragment was inserted into the wheat ( Triticum aestivum L.) genome. FISH analyses of individual transgenic events revealed multiple insertions of intact fragments. Transferring large intact DNA fragments containing clusters of resistance genes or complete metabolic pathways into the wheat genome remains a challenge. In a previous work, we showed that the use of dephosphorylated cassettes for wheat transformation enabled the production of simple integration patterns. Here, we used the same technology to produce a cassette containing a 44-kb Arabidopsis thaliana BAC, flanked by one selection gene and one reporter gene. This 53-kb linear cassette was integrated in the bread wheat (Triticum aestivum L.) genome by biolistic transformation. Our results showed that transgenic plants harboring the entire cassette were generated. The inheritability of the cassette was demonstrated in the T1 and T2 generation. Surprisingly, FISH analysis performed on T1 progeny of independent events identified double genomic insertions of intact fragments in non-homoeologous positions. Inheritability of these double insertions was demonstrated by FISH analysis of the T1 generation. Relative conclusions that can be drawn from molecular or FISH analysis are discussed along with future prospects of the engineering of large fragments for wheat transformation or genome editing.

  17. BIGSdb: Scalable analysis of bacterial genome variation at the population level

    Directory of Open Access Journals (Sweden)

    Maiden Martin CJ

    2010-12-01

    Full Text Available Abstract Background The opportunities for bacterial population genomics that are being realised by the application of parallel nucleotide sequencing require novel bioinformatics platforms. These must be capable of the storage, retrieval, and analysis of linked phenotypic and genotypic information in an accessible, scalable and computationally efficient manner. Results The Bacterial Isolate Genome Sequence Database (BIGSDB is a scalable, open source, web-accessible database system that meets these needs, enabling phenotype and sequence data, which can range from a single sequence read to whole genome data, to be efficiently linked for a limitless number of bacterial specimens. The system builds on the widely used mlstdbNet software, developed for the storage and distribution of multilocus sequence typing (MLST data, and incorporates the capacity to define and identify any number of loci and genetic variants at those loci within the stored nucleotide sequences. These loci can be further organised into 'schemes' for isolate characterisation or for evolutionary or functional analyses. Isolates and loci can be indexed by multiple names and any number of alternative schemes can be accommodated, enabling cross-referencing of different studies and approaches. LIMS functionality of the software enables linkage to and organisation of laboratory samples. The data are easily linked to external databases and fine-grained authentication of access permits multiple users to participate in community annotation by setting up or contributing to different schemes within the database. Some of the applications of BIGSDB are illustrated with the genera Neisseria and Streptococcus. The BIGSDB source code and documentation are available at http://pubmlst.org/software/database/bigsdb/. Conclusions Genomic data can be used to characterise bacterial isolates in many different ways but it can also be efficiently exploited for evolutionary or functional studies. BIGSDB

  18. Genomic Signatures of Reinforcement

    Directory of Open Access Journals (Sweden)

    Austin G. Garner

    2018-04-01

    Full Text Available Reinforcement is the process by which selection against hybridization increases reproductive isolation between taxa. Much research has focused on demonstrating the existence of reinforcement, yet relatively little is known about the genetic basis of reinforcement or the evolutionary conditions under which reinforcement can occur. Inspired by reinforcement’s characteristic phenotypic pattern of reproductive trait divergence in sympatry but not in allopatry, we discuss whether reinforcement also leaves a distinct genomic pattern. First, we describe three patterns of genetic variation we expect as a consequence of reinforcement. Then, we discuss a set of alternative processes and complicating factors that may make the identification of reinforcement at the genomic level difficult. Finally, we consider how genomic analyses can be leveraged to inform if and to what extent reinforcement evolved in the face of gene flow between sympatric lineages and between allopatric and sympatric populations of the same lineage. Our major goals are to understand if genome scans for particular patterns of genetic variation could identify reinforcement, isolate the genetic basis of reinforcement, or infer the conditions under which reinforcement evolved.

  19. Genomic Signatures of Reinforcement

    Science.gov (United States)

    Goulet, Benjamin E.

    2018-01-01

    Reinforcement is the process by which selection against hybridization increases reproductive isolation between taxa. Much research has focused on demonstrating the existence of reinforcement, yet relatively little is known about the genetic basis of reinforcement or the evolutionary conditions under which reinforcement can occur. Inspired by reinforcement’s characteristic phenotypic pattern of reproductive trait divergence in sympatry but not in allopatry, we discuss whether reinforcement also leaves a distinct genomic pattern. First, we describe three patterns of genetic variation we expect as a consequence of reinforcement. Then, we discuss a set of alternative processes and complicating factors that may make the identification of reinforcement at the genomic level difficult. Finally, we consider how genomic analyses can be leveraged to inform if and to what extent reinforcement evolved in the face of gene flow between sympatric lineages and between allopatric and sympatric populations of the same lineage. Our major goals are to understand if genome scans for particular patterns of genetic variation could identify reinforcement, isolate the genetic basis of reinforcement, or infer the conditions under which reinforcement evolved. PMID:29614048

  20. Modulation of Genetic Associations with Serum Urate Levels by Body-Mass-Index in Humans

    NARCIS (Netherlands)

    Huffman, Jennifer E.; Albrecht, Eva; Teumer, Alexander; Mangino, Massimo; Kapur, Karen; Johnson, Toby; Kutalik, Zoltn; Pirastu, Nicola; Pistis, Giorgio; Lopez, Lorna M.; Haller, Toomas; Salo, Perttu; Goel, Anuj; Li, Man; Tanaka, Toshiko; Dehghan, Abbas; Ruggiero, Daniela; Malerba, Giovanni; Smith, Albert V.; Nolte, Ilja M.; Portas, Laura; Phipps-Green, Amanda; Boteva, Lora; Navarro, Pau; Johansson, Asa; Hicks, Andrew A.; Polasek, Ozren; Esko, Tonu; Peden, John F.; Harris, Sarah E.; Murgia, Federico; Wild, Sarah H.; Tenesa, Albert; Tin, Adrienne; Mihailov, Evelin; Grotevendt, Anne; Gislason, Gauti K.; Coresh, Josef; D'Adamo, Pio; Ulivi, Sheila; Vollenweider, Peter; Waeber, Gerard; Campbell, Susan; Kolcic, Ivana; Fisher, Krista; Viigimaa, Margus; Metter, Jeffrey E.; Masciullo, Corrado; Trabetti, Elisabetta; Bombieri, Cristina; Sorice, Rossella; Doering, Angela; Reischl, Eva; Strauch, Konstantin; Hofman, Albert; Uitterlinden, Andre G.; Waldenberger, Melanie; Wichmann, H-Erich; Davies, Gail; Gow, Alan J.; Dalbeth, Nicola; Stamp, Lisa; Smit, Johannes H.; Kirin, Mirna; Nagaraja, Ramaiah; Nauck, Matthias; Schurmann, Claudia; Budde, Kathrin; Farrington, Susan M.; Theodoratou, Evropi; Jula, Antti; Salomaa, Veikko; Sala, Cinzia; Hengstenberg, Christian; Burnier, Michel; Maegi, Reedik; Klopp, Norman; Kloiber, Stefan; Schipf, Sabine; Ripatti, Samuli; Cabras, Stefano; Soranzo, Nicole; Homuth, Georg; Nutile, Teresa; Munroe, Patricia B.; Hastie, Nicholas; Campbell, Harry; Rudan, Igor; Cabrera, Claudia; Haley, Chris; Franco, Oscar H.; Merriman, Tony R.; Gudnason, Vilmundur; Pirastu, Mario; Penninx, Brenda W.; Snieder, Harold; Metspalu, Andres; Ciullo, Marina; Pramstaller, Peter P.; van Duijn, Cornelia M.; Ferrucci, Luigi; Gambaro, Giovanni; Deary, Ian J.; Dunlop, Malcolm G.; Wilson, James F.; Gasparini, Paolo; Gyllensten, Ulf; Spector, Tim D.; Wright, Alan F.; Hayward, Caroline; Watkins, Hugh; Perola, Markus; Bochud, Murielle; Kao, W. H. Linda; Caulfield, Mark; Toniolo, Daniela; Voelzke, Henry; Gieger, Christian; Koettgen, Anna; Vitart, Veronique

    2015-01-01

    We tested for interactions between body mass index (BMI) and common genetic variants affecting serum urate levels, genome-wide, in up to 42569 participants. Both stratified genome-wide association (GWAS) analyses, in lean, overweight and obese individuals, and regression-type analyses in a non

  1. Genomic and Molecular Landscape of DNA Damage Repair Deficiency across The Cancer Genome Atlas

    Directory of Open Access Journals (Sweden)

    Theo A. Knijnenburg

    2018-04-01

    Full Text Available Summary: DNA damage repair (DDR pathways modulate cancer risk, progression, and therapeutic response. We systematically analyzed somatic alterations to provide a comprehensive view of DDR deficiency across 33 cancer types. Mutations with accompanying loss of heterozygosity were observed in over 1/3 of DDR genes, including TP53 and BRCA1/2. Other prevalent alterations included epigenetic silencing of the direct repair genes EXO5, MGMT, and ALKBH3 in ∼20% of samples. Homologous recombination deficiency (HRD was present at varying frequency in many cancer types, most notably ovarian cancer. However, in contrast to ovarian cancer, HRD was associated with worse outcomes in several other cancers. Protein structure-based analyses allowed us to predict functional consequences of rare, recurrent DDR mutations. A new machine-learning-based classifier developed from gene expression data allowed us to identify alterations that phenocopy deleterious TP53 mutations. These frequent DDR gene alterations in many human cancers have functional consequences that may determine cancer progression and guide therapy. : Knijnenburg et al. present The Cancer Genome Atlas (TCGA Pan-Cancer analysis of DNA damage repair (DDR deficiency in cancer. They use integrative genomic and molecular analyses to identify frequent DDR alterations across 33 cancer types, correlate gene- and pathway-level alterations with genome-wide measures of genome instability and impaired function, and demonstrate the prognostic utility of DDR deficiency scores. Keywords: The Cancer Genome Atlas PanCanAtlas project, DNA damage repair, somatic mutations, somatic copy-number alterations, epigenetic silencing, DNA damage footprints, mutational signatures, integrative statistical analysis, protein structure analysis

  2. The complete chloroplast genome sequence of Dodonaea viscosa: comparative and phylogenetic analyses.

    Science.gov (United States)

    Saina, Josphat K; Gichira, Andrew W; Li, Zhi-Zhong; Hu, Guang-Wan; Wang, Qing-Feng; Liao, Kuo

    2018-02-01

    The plant chloroplast (cp) genome is a highly conserved structure which is beneficial for evolution and systematic research. Currently, numerous complete cp genome sequences have been reported due to high throughput sequencing technology. However, there is no complete chloroplast genome of genus Dodonaea that has been reported before. To better understand the molecular basis of Dodonaea viscosa chloroplast, we used Illumina sequencing technology to sequence its complete genome. The whole length of the cp genome is 159,375 base pairs (bp), with a pair of inverted repeats (IRs) of 27,099 bp separated by a large single copy (LSC) 87,204 bp, and small single copy (SSC) 17,972 bp. The annotation analysis revealed a total of 115 unique genes of which 81 were protein coding, 30 tRNA, and four ribosomal RNA genes. Comparative genome analysis with other closely related Sapindaceae members showed conserved gene order in the inverted and single copy regions. Phylogenetic analysis clustered D. viscosa with other species of Sapindaceae with strong bootstrap support. Finally, a total of 249 SSRs were detected. Moreover, a comparison of the synonymous (Ks) and nonsynonymous (Ka) substitution rates in D. viscosa showed very low values. The availability of cp genome reported here provides a valuable genetic resource for comprehensive further studies in genetic variation, taxonomy and phylogenetic evolution of Sapindaceae family. In addition, SSR markers detected will be used in further phylogeographic and population structure studies of the species in this genus.

  3. The Contribution of Tissue Level Organization to Genomic Stability Following Low Dose/Low Dose Rate Gamma and Proton Irradiation

    Energy Technology Data Exchange (ETDEWEB)

    Cheryl G. Burrell, Ph.D.

    2012-05-14

    The formation of functional tissue units is necessary in maintaining homeostasis within living systems, with individual cells contributing to these functional units through their three-dimensional organization with integrin and adhesion proteins to form a complex extra-cellular matrix (ECM). This is of particular importance in those tissues susceptible to radiation-induced tumor formation, such as epithelial glands. The assembly of epithelial cells of the thyroid is critical to their normal receipt of, and response to, incoming signals. Traditional tissue culture and live animals present significant challenges to radiation exposure and continuous sampling, however, the production of bioreactor-engineered tissues aims to bridge this gap by improve capabilities in continuous sampling from the same functional tissue, thereby increasing the ability to extrapolate changes induced by radiation to animals and humans in vivo. Our study proposes that the level of tissue organization will affect the induction and persistence of low dose radiation-induced genomic instability. Rat thyroid cells, grown in vitro as 3D tissue analogs in bioreactors and as 2D flask grown cultures were exposed to acute low dose (1, 5, 10 and 200 cGy) gamma rays. To assess immediate (6 hours) and delayed (up to 30 days) responses post-irradiation, various biological endpoints were studied including cytogenetic analyses, apoptosis analysis and cell viability/cytotoxicity analyses. Data assessing caspase 3/7 activity levels show that, this activity varies with time post radiation and that, overall, 3D cultures display more genomic instability (as shown by the lower levels of apoptosis over time) when compared to the 2D cultures. Variation in cell viability levels were only observed at the intermediate and late time points post radiation. Extensive analysis of chromosomal aberrations will give further insight on the whether the level of tissue organization influences genomic instability patterns after

  4. The mitochondrial genome of Phallusia mammillata and Phallusia fumigata (Tunicata, Ascidiacea: high genome plasticity at intra-genus level

    Directory of Open Access Journals (Sweden)

    Pesole Graziano

    2007-08-01

    Full Text Available Abstract Background Within Chordata, the subphyla Vertebrata and Cephalochordata (lancelets are characterized by a remarkable stability of the mitochondrial (mt genome, with constancy of gene content and almost invariant gene order, whereas the limited mitochondrial data on the subphylum Tunicata suggest frequent and extensive gene rearrangements, observed also within ascidians of the same genus. Results To confirm this evolutionary trend and to better understand the evolutionary dynamics of the mitochondrial genome in Tunicata Ascidiacea, we have sequenced and characterized the complete mt genome of two congeneric ascidian species, Phallusia mammillata and Phallusia fumigata (Phlebobranchiata, Ascidiidae. The two mtDNAs are surprisingly rearranged, both with respect to one another and relative to those of other tunicates and chordates, with gene rearrangements affecting both protein-coding and tRNA genes. The new data highlight the extraordinary variability of ascidian mt genome in base composition, tRNA secondary structure, tRNA gene content, and non-coding regions (number, size, sequence and location. Indeed, both Phallusia genomes lack the trnD gene, show loss/acquisition of DHU-arm in two tRNAs, and have a G+C content two-fold higher than other ascidians. Moreover, the mt genome of P. fumigata presents two identical copies of trnI, an extra tRNA gene with uncertain amino acid specificity, and four almost identical sequence regions. In addition, a truncated cytochrome b, lacking a C-terminal tail that commonly protrudes into the mt matrix, has been identified as a new mt feature probably shared by all tunicates. Conclusion The frequent occurrence of major gene order rearrangements in ascidians both at high taxonomic level and within the same genus makes this taxon an excellent model to study the mechanisms of gene rearrangement, and renders the mt genome an invaluable phylogenetic marker to investigate molecular biodiversity and speciation

  5. Genome-wide meta-analysis uncovers novel loci influencing circulating leptin levels

    DEFF Research Database (Denmark)

    Kilpeläinen, Tuomas O; Carli, Jayne F Martin; Skowronski, Alicja A

    2016-01-01

    . Therefore, we performed a genome-wide association study (GWAS) of circulating leptin levels from 32,161 individuals and followed up loci reaching PFTO....... Although the association of the FTO obesity locus with leptin levels is abolished by adjustment for BMI, associations of the four other loci are independent of adiposity. The GCKR locus was found associated with multiple metabolic traits in previous GWAS and the CCNL1 locus with birth weight. Knockdown...

  6. Genetic and epigenetic alterations induced by different levels of rye genome integration in wheat recipient.

    Science.gov (United States)

    Zheng, X L; Zhou, J P; Zang, L L; Tang, A T; Liu, D Q; Deng, K J; Zhang, Y

    2016-06-17

    The narrow genetic variation present in common wheat (Triticum aestivum) varieties has greatly restricted the improvement of crop yield in modern breeding systems. Alien addition lines have proven to be an effective means to broaden the genetic diversity of common wheat. Wheat-rye addition lines, which are the direct bridge materials for wheat improvement, have been wildly used to produce new wheat cultivars carrying alien rye germplasm. In this study, we investigated the genetic and epigenetic alterations in two sets of wheat-rye disomic addition lines (1R-7R) and the corresponding triticales. We used expressed sequence tag-simple sequence repeat, amplified fragment length polymorphism, and methylation-sensitive amplification polymorphism analyses to analyze the effects of the introduction of alien chromosomes (either the entire genome or sub-genome) to wheat genetic background. We found obvious and diversiform variations in the genomic primary structure, as well as alterations in the extent and pattern of the genomic DNA methylation of the recipient. Meanwhile, these results also showed that introduction of different rye chromosomes could induce different genetic and epigenetic alterations in its recipient, and the genetic background of the parents is an important factor for genomic and epigenetic variation induced by alien chromosome addition.

  7. Genome-wide transcription analyses in rice using tiling microarrays

    DEFF Research Database (Denmark)

    Li, Lei; Wang, Xiangfeng; Stolc, Viktor

    2006-01-01

    . We report here a full-genome transcription analysis of the indica rice subspecies using high-density oligonucleotide tiling microarrays. Our results provided expression data support for the existence of 35,970 (81.9%) annotated gene models and identified 5,464 unique transcribed intergenic regions...... that share similar compositional properties with the annotated exons and have significant homology to other plant proteins. Elucidating and mapping of all transcribed regions revealed an association between global transcription and cytological chromosome features, and an overall similarity of transcriptional......Sequencing and computational annotation revealed several features, including high gene numbers, unusual composition of the predicted genes and a large number of genes lacking homology to known genes, that distinguish the rice (Oryza sativa) genome from that of other fully sequenced model species...

  8. The spotted gar genome illuminates vertebrate evolution and facilitates human-teleost comparisons.

    Science.gov (United States)

    Braasch, Ingo; Gehrke, Andrew R; Smith, Jeramiah J; Kawasaki, Kazuhiko; Manousaki, Tereza; Pasquier, Jeremy; Amores, Angel; Desvignes, Thomas; Batzel, Peter; Catchen, Julian; Berlin, Aaron M; Campbell, Michael S; Barrell, Daniel; Martin, Kyle J; Mulley, John F; Ravi, Vydianathan; Lee, Alison P; Nakamura, Tetsuya; Chalopin, Domitille; Fan, Shaohua; Wcisel, Dustin; Cañestro, Cristian; Sydes, Jason; Beaudry, Felix E G; Sun, Yi; Hertel, Jana; Beam, Michael J; Fasold, Mario; Ishiyama, Mikio; Johnson, Jeremy; Kehr, Steffi; Lara, Marcia; Letaw, John H; Litman, Gary W; Litman, Ronda T; Mikami, Masato; Ota, Tatsuya; Saha, Nil Ratan; Williams, Louise; Stadler, Peter F; Wang, Han; Taylor, John S; Fontenot, Quenton; Ferrara, Allyse; Searle, Stephen M J; Aken, Bronwen; Yandell, Mark; Schneider, Igor; Yoder, Jeffrey A; Volff, Jean-Nicolas; Meyer, Axel; Amemiya, Chris T; Venkatesh, Byrappa; Holland, Peter W H; Guiguen, Yann; Bobe, Julien; Shubin, Neil H; Di Palma, Federica; Alföldi, Jessica; Lindblad-Toh, Kerstin; Postlethwait, John H

    2016-04-01

    To connect human biology to fish biomedical models, we sequenced the genome of spotted gar (Lepisosteus oculatus), whose lineage diverged from teleosts before teleost genome duplication (TGD). The slowly evolving gar genome has conserved in content and size many entire chromosomes from bony vertebrate ancestors. Gar bridges teleosts to tetrapods by illuminating the evolution of immunity, mineralization and development (mediated, for example, by Hox, ParaHox and microRNA genes). Numerous conserved noncoding elements (CNEs; often cis regulatory) undetectable in direct human-teleost comparisons become apparent using gar: functional studies uncovered conserved roles for such cryptic CNEs, facilitating annotation of sequences identified in human genome-wide association studies. Transcriptomic analyses showed that the sums of expression domains and expression levels for duplicated teleost genes often approximate the patterns and levels of expression for gar genes, consistent with subfunctionalization. The gar genome provides a resource for understanding evolution after genome duplication, the origin of vertebrate genomes and the function of human regulatory sequences.

  9. Ensembl Genomes 2016: more genomes, more complexity.

    Science.gov (United States)

    Kersey, Paul Julian; Allen, James E; Armean, Irina; Boddu, Sanjay; Bolt, Bruce J; Carvalho-Silva, Denise; Christensen, Mikkel; Davis, Paul; Falin, Lee J; Grabmueller, Christoph; Humphrey, Jay; Kerhornou, Arnaud; Khobova, Julia; Aranganathan, Naveen K; Langridge, Nicholas; Lowy, Ernesto; McDowall, Mark D; Maheswari, Uma; Nuhn, Michael; Ong, Chuang Kee; Overduin, Bert; Paulini, Michael; Pedro, Helder; Perry, Emily; Spudich, Giulietta; Tapanari, Electra; Walts, Brandon; Williams, Gareth; Tello-Ruiz, Marcela; Stein, Joshua; Wei, Sharon; Ware, Doreen; Bolser, Daniel M; Howe, Kevin L; Kulesha, Eugene; Lawson, Daniel; Maslen, Gareth; Staines, Daniel M

    2016-01-04

    Ensembl Genomes (http://www.ensemblgenomes.org) is an integrating resource for genome-scale data from non-vertebrate species, complementing the resources for vertebrate genomics developed in the context of the Ensembl project (http://www.ensembl.org). Together, the two resources provide a consistent set of programmatic and interactive interfaces to a rich range of data including reference sequence, gene models, transcriptional data, genetic variation and comparative analysis. This paper provides an update to the previous publications about the resource, with a focus on recent developments. These include the development of new analyses and views to represent polyploid genomes (of which bread wheat is the primary exemplar); and the continued up-scaling of the resource, which now includes over 23 000 bacterial genomes, 400 fungal genomes and 100 protist genomes, in addition to 55 genomes from invertebrate metazoa and 39 genomes from plants. This dramatic increase in the number of included genomes is one part of a broader effort to automate the integration of archival data (genome sequence, but also associated RNA sequence data and variant calls) within the context of reference genomes and make it available through the Ensembl user interfaces. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

  10. The Genomes OnLine Database (GOLD) v.5: a metadata management system based on a four level (meta)genome project classification

    Science.gov (United States)

    Reddy, T.B.K.; Thomas, Alex D.; Stamatis, Dimitri; Bertsch, Jon; Isbandi, Michelle; Jansson, Jakob; Mallajosyula, Jyothi; Pagani, Ioanna; Lobos, Elizabeth A.; Kyrpides, Nikos C.

    2015-01-01

    The Genomes OnLine Database (GOLD; http://www.genomesonline.org) is a comprehensive online resource to catalog and monitor genetic studies worldwide. GOLD provides up-to-date status on complete and ongoing sequencing projects along with a broad array of curated metadata. Here we report version 5 (v.5) of the database. The newly designed database schema and web user interface supports several new features including the implementation of a four level (meta)genome project classification system and a simplified intuitive web interface to access reports and launch search tools. The database currently hosts information for about 19 200 studies, 56 000 Biosamples, 56 000 sequencing projects and 39 400 analysis projects. More than just a catalog of worldwide genome projects, GOLD is a manually curated, quality-controlled metadata warehouse. The problems encountered in integrating disparate and varying quality data into GOLD are briefly highlighted. GOLD fully supports and follows the Genomic Standards Consortium (GSC) Minimum Information standards. PMID:25348402

  11. The Genomes OnLine Database (GOLD) v.5: a metadata management system based on a four level (meta)genome project classification

    Energy Technology Data Exchange (ETDEWEB)

    Reddy, Tatiparthi B. K. [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States); Thomas, Alex D. [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States); Stamatis, Dimitri [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States); Bertsch, Jon [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States); Isbandi, Michelle [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States); Jansson, Jakob [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States); Mallajosyula, Jyothi [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States); Pagani, Ioanna [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States); Lobos, Elizabeth A. [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States); Kyrpides, Nikos C. [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States); King Abdulaziz Univ., Jeddah (Saudi Arabia)

    2014-10-27

    The Genomes OnLine Database (GOLD; http://www.genomesonline.org) is a comprehensive online resource to catalog and monitor genetic studies worldwide. GOLD provides up-to-date status on complete and ongoing sequencing projects along with a broad array of curated metadata. Within this paper, we report version 5 (v.5) of the database. The newly designed database schema and web user interface supports several new features including the implementation of a four level (meta)genome project classification system and a simplified intuitive web interface to access reports and launch search tools. The database currently hosts information for about 19 200 studies, 56 000 Biosamples, 56 000 sequencing projects and 39 400 analysis projects. More than just a catalog of worldwide genome projects, GOLD is a manually curated, quality-controlled metadata warehouse. The problems encountered in integrating disparate and varying quality data into GOLD are briefly highlighted. Lastly, GOLD fully supports and follows the Genomic Standards Consortium (GSC) Minimum Information standards.

  12. Analyses of pig genomes provide insight into porcine demography and evolution

    Science.gov (United States)

    Groenen, Martien A. M.; Archibald, Alan L.; Uenishi, Hirohide; Tuggle, Christopher K.; Takeuchi, Yasuhiro; Rothschild, Max F.; Rogel-Gaillard, Claire; Park, Chankyu; Milan, Denis; Megens, Hendrik-Jan; Li, Shengting; Larkin, Denis M.; Kim, Heebal; Frantz, Laurent A. F.; Caccamo, Mario; Ahn, Hyeonju; Aken, Bronwen L.; Anselmo, Anna; Anthon, Christian; Auvil, Loretta; Badaoui, Bouabid; Beattie, Craig W.; Bendixen, Christian; Berman, Daniel; Blecha, Frank; Blomberg, Jonas; Bolund, Lars; Bosse, Mirte; Botti, Sara; Bujie, Zhan; Bystrom, Megan; Capitanu, Boris; Silva, Denise Carvalho; Chardon, Patrick; Chen, Celine; Cheng, Ryan; Choi, Sang-Haeng; Chow, William; Clark, Richard C.; Clee, Christopher; Crooijmans, Richard P. M. A.; Dawson, Harry D.; Dehais, Patrice; De Sapio, Fioravante; Dibbits, Bert; Drou, Nizar; Du, Zhi-Qiang; Eversole, Kellye; Fadista, João; Fairley, Susan; Faraut, Thomas; Faulkner, Geoffrey J.; Fowler, Katie E.; Fredholm, Merete; Fritz, Eric; Gilbert, James G. R.; Giuffra, Elisabetta; Gorodkin, Jan; Griffin, Darren K.; Harrow, Jennifer L.; Hayward, Alexander; Howe, Kerstin; Hu, Zhi-Liang; Humphray, Sean J.; Hunt, Toby; Hornshøj, Henrik; Jeon, Jin-Tae; Jern, Patric; Jones, Matthew; Jurka, Jerzy; Kanamori, Hiroyuki; Kapetanovic, Ronan; Kim, Jaebum; Kim, Jae-Hwan; Kim, Kyu-Won; Kim, Tae-Hun; Larson, Greger; Lee, Kyooyeol; Lee, Kyung-Tai; Leggett, Richard; Lewin, Harris A.; Li, Yingrui; Liu, Wansheng; Loveland, Jane E.; Lu, Yao; Lunney, Joan K.; Ma, Jian; Madsen, Ole; Mann, Katherine; Matthews, Lucy; McLaren, Stuart; Morozumi, Takeya; Murtaugh, Michael P.; Narayan, Jitendra; Nguyen, Dinh Truong; Ni, Peixiang; Oh, Song-Jung; Onteru, Suneel; Panitz, Frank; Park, Eung-Woo; Park, Hong-Seog; Pascal, Geraldine; Paudel, Yogesh; Perez-Enciso, Miguel; Ramirez-Gonzalez, Ricardo; Reecy, James M.; Zas, Sandra Rodriguez; Rohrer, Gary A.; Rund, Lauretta; Sang, Yongming; Schachtschneider, Kyle; Schraiber, Joshua G.; Schwartz, John; Scobie, Linda; Scott, Carol; Searle, Stephen; Servin, Bertrand; Southey, Bruce R.; Sperber, Goran; Stadler, Peter; Sweedler, Jonathan V.; Tafer, Hakim; Thomsen, Bo; Wali, Rashmi; Wang, Jian; Wang, Jun; White, Simon; Xu, Xun; Yerle, Martine; Zhang, Guojie; Zhang, Jianguo; Zhang, Jie; Zhao, Shuhong; Rogers, Jane; Churcher, Carol; Schook, Lawrence B.

    2013-01-01

    For 10,000 years pigs and humans have shared a close and complex relationship. From domestication to modern breeding practices, humans have shaped the genomes of domestic pigs. Here we present the assembly and analysis of the genome sequence of a female domestic Duroc pig (Sus scrofa) and a comparison with the genomes of wild and domestic pigs from Europe and Asia. Wild pigs emerged in South East Asia and subsequently spread across Eurasia. Our results reveal a deep phylogenetic split between European and Asian wild boars ~1 million years ago, and a selective sweep analysis indicates selection on genes involved in RNA processing and regulation. Genes associated with immune response and olfaction exhibit fast evolution. Pigs have the largest repertoire of functional olfactory receptor genes, reflecting the importance of smell in this scavenging animal. The pig genome sequence provides an important resource for further improvements of this important livestock species, and our identification of many putative disease-causing variants extends the potential of the pig as a biomedical model. PMID:23151582

  13. Whole genome sequencing as a tool for phylogenetic analysis of clinical strains of Mitis group streptococci

    DEFF Research Database (Denmark)

    Rasmusen, L. H.; Dargis, R.; Iversen, Katrine Højholt

    2016-01-01

    observed in single gene analyses. Species identification based on single gene analysis showed their limitations when more strains were included. In contrast, analyses incorporating more sequence data, like MLSA, SNPs and core-genome analyses, provided more distinct clustering. The core-genome tree showed......Identification of Mitis group streptococci (MGS) to the species level is challenging for routine microbiology laboratories. Correct identification is crucial for the diagnosis of infective endocarditis, identification of treatment failure, and/or infection relapse. Eighty MGS from Danish patients...

  14. Cross-Disorder Genome-Wide Analyses Suggest a Complex Genetic Relationship Between Tourette Syndrome and Obsessive-Compulsive Disorder

    Science.gov (United States)

    Yu, Dongmei; Mathews, Carol A.; Scharf, Jeremiah M.; Neale, Benjamin M.; Davis, Lea K.; Gamazon, Eric R.; Derks, Eske M.; Evans, Patrick; Edlund, Christopher K.; Crane, Jacquelyn; Fagerness, Jesen A.; Osiecki, Lisa; Gallagher, Patience; Gerber, Gloria; Haddad, Stephen; Illmann, Cornelia; McGrath, Lauren M.; Mayerfeld, Catherine; Arepalli, Sampath; Barlassina, Cristina; Barr, Cathy L.; Bellodi, Laura; Benarroch, Fortu; Berrió, Gabriel Bedoya; Bienvenu, O. Joseph; Black, Donald; Bloch, Michael H.; Brentani, Helena; Bruun, Ruth D.; Budman, Cathy L.; Camarena, Beatriz; Campbell, Desmond D.; Cappi, Carolina; Cardona Silgado, Julio C.; Cavallini, Maria C.; Chavira, Denise A.; Chouinard, Sylvain; Cook, Edwin H.; Cookson, M. R.; Coric, Vladimir; Cullen, Bernadette; Cusi, Daniele; Delorme, Richard; Denys, Damiaan; Dion, Yves; Eapen, Valsama; Egberts, Karin; Falkai, Peter; Fernandez, Thomas; Fournier, Eduardo; Garrido, Helena; Geller, Daniel; Gilbert, Donald; Girard, Simon L.; Grabe, Hans J.; Grados, Marco A.; Greenberg, Benjamin D.; Gross-Tsur, Varda; Grünblatt, Edna; Hardy, John; Heiman, Gary A.; Hemmings, Sian M.J.; Herrera, Luis D.; Hezel, Dianne M.; Hoekstra, Pieter J.; Jankovic, Joseph; Kennedy, James L.; King, Robert A.; Konkashbaev, Anuar I.; Kremeyer, Barbara; Kurlan, Roger; Lanzagorta, Nuria; Leboyer, Marion; Leckman, James F.; Lennertz, Leonhard; Liu, Chunyu; Lochner, Christine; Lowe, Thomas L.; Lupoli, Sara; Macciardi, Fabio; Maier, Wolfgang; Manunta, Paolo; Marconi, Maurizio; McCracken, James T.; Mesa Restrepo, Sandra C.; Moessner, Rainald; Moorjani, Priya; Morgan, Jubel; Muller, Heike; Murphy, Dennis L.; Naarden, Allan L.; Ochoa, William Cornejo; Ophoff, Roel A.; Pakstis, Andrew J.; Pato, Michele T.; Pato, Carlos N.; Piacentini, John; Pittenger, Christopher; Pollak, Yehuda; Rauch, Scott L.; Renner, Tobias; Reus, Victor I.; Richter, Margaret A.; Riddle, Mark A.; Robertson, Mary M.; Romero, Roxana; Rosário, Maria C.; Rosenberg, David; Ruhrmann, Stephan; Sabatti, Chiara; Salvi, Erika; Sampaio, Aline S.; Samuels, Jack; Sandor, Paul; Service, Susan K.; Sheppard, Brooke; Singer, Harvey S.; Smit, Jan H.; Stein, Dan J.; Strengman, Eric; Tischfield, Jay A.; Turiel, Maurizio; Valencia Duarte, Ana V.; Vallada, Homero; Veenstra-VanderWeele, Jeremy; Walitza, Susanne; Walkup, John; Wang, Ying; Weale, Mike; Weiss, Robert; Wendland, Jens R.; Westenberg, Herman G.M.; Yao, Yin; Hounie, Ana G.; Miguel, Euripedes C.; Nicolini, Humberto; Wagner, Michael; Ruiz-Linares, Andres; Cath, Danielle C.; McMahon, William; Posthuma, Danielle; Oostra, Ben A.; Nestadt, Gerald; Rouleau, Guy A.; Purcell, Shaun; Jenike, Michael A.; Heutink, Peter; Hanna, Gregory L.; Conti, David V.; Arnold, Paul D.; Freimer, Nelson; Stewart, S. Evelyn; Knowles, James A.; Cox, Nancy J.; Pauls, David L.

    2014-01-01

    Obsessive-compulsive disorder (OCD) and Tourette Syndrome (TS) are highly heritable neurodevelopmental disorders that are thought to share genetic risk factors. However, the identification of definitive susceptibility genes for these etiologically complex disorders remains elusive. Here, we report a combined genome-wide association study (GWAS) of TS and OCD in 2723 cases (1310 with OCD, 834 with TS, 579 with OCD plus TS/chronic tics (CT)), 5667 ancestry-matched controls, and 290 OCD parent-child trios. Although no individual single nucleotide polymorphisms (SNPs) achieved genome-wide significance, the GWAS signals were enriched for SNPs strongly associated with variations in brain gene expression levels, i.e. expression quantitative loci (eQTLs), suggesting the presence of true functional variants that contribute to risk of these disorders. Polygenic score analyses identified a significant polygenic component for OCD (p=2×10−4), predicting 3.2% of the phenotypic variance in an independent data set. In contrast, TS had a smaller, non-significant polygenic component, predicting only 0.6% of the phenotypic variance (p=0.06). No significant polygenic signal was detected across the two disorders, although the sample is likely underpowered to detect a modest shared signal. Furthermore, the OCD polygenic signal was significantly attenuated when cases with both OCD and TS/CT were included in the analysis (p=0.01). Previous work has shown that TS and OCD have some degree of shared genetic variation. However, the data from this study suggest that there are also distinct components to the genetic architectures of TS and OCD. Furthermore, OCD with co-occurring TS/CT may have different underlying genetic susceptibility compared to OCD alone. PMID:25158072

  15. Divergent and convergent modes of interaction between wheat and Puccinia graminis f. sp. tritici isolates revealed by the comparative gene co-expression network and genome analyses.

    Science.gov (United States)

    Rutter, William B; Salcedo, Andres; Akhunova, Alina; He, Fei; Wang, Shichen; Liang, Hanquan; Bowden, Robert L; Akhunov, Eduard

    2017-04-12

    Two opposing evolutionary constraints exert pressure on plant pathogens: one to diversify virulence factors in order to evade plant defenses, and the other to retain virulence factors critical for maintaining a compatible interaction with the plant host. To better understand how the diversified arsenals of fungal genes promote interaction with the same compatible wheat line, we performed a comparative genomic analysis of two North American isolates of Puccinia graminis f. sp. tritici (Pgt). The patterns of inter-isolate divergence in the secreted candidate effector genes were compared with the levels of conservation and divergence of plant-pathogen gene co-expression networks (GCN) developed for each isolate. Comprative genomic analyses revealed substantial level of interisolate divergence in effector gene complement and sequence divergence. Gene Ontology (GO) analyses of the conserved and unique parts of the isolate-specific GCNs identified a number of conserved host pathways targeted by both isolates. Interestingly, the degree of inter-isolate sub-network conservation varied widely for the different host pathways and was positively associated with the proportion of conserved effector candidates associated with each sub-network. While different Pgt isolates tended to exploit similar wheat pathways for infection, the mode of plant-pathogen interaction varied for different pathways with some pathways being associated with the conserved set of effectors and others being linked with the diverged or isolate-specific effectors. Our data suggest that at the intra-species level pathogen populations likely maintain divergent sets of effectors capable of targeting the same plant host pathways. This functional redundancy may play an important role in the dynamic of the "arms-race" between host and pathogen serving as the basis for diverse virulence strategies and creating conditions where mutations in certain effector groups will not have a major effect on the pathogen

  16. Genome chaos: survival strategy during crisis.

    Science.gov (United States)

    Liu, Guo; Stevens, Joshua B; Horne, Steven D; Abdallah, Batoul Y; Ye, Karen J; Bremer, Steven W; Ye, Christine J; Chen, David J; Heng, Henry H

    2014-01-01

    Genome chaos, a process of complex, rapid genome re-organization, results in the formation of chaotic genomes, which is followed by the potential to establish stable genomes. It was initially detected through cytogenetic analyses, and recently confirmed by whole-genome sequencing efforts which identified multiple subtypes including "chromothripsis", "chromoplexy", "chromoanasynthesis", and "chromoanagenesis". Although genome chaos occurs commonly in tumors, both the mechanism and detailed aspects of the process are unknown due to the inability of observing its evolution over time in clinical samples. Here, an experimental system to monitor the evolutionary process of genome chaos was developed to elucidate its mechanisms. Genome chaos occurs following exposure to chemotherapeutics with different mechanisms, which act collectively as stressors. Characterization of the karyotype and its dynamic changes prior to, during, and after induction of genome chaos demonstrates that chromosome fragmentation (C-Frag) occurs just prior to chaotic genome formation. Chaotic genomes seem to form by random rejoining of chromosomal fragments, in part through non-homologous end joining (NHEJ). Stress induced genome chaos results in increased karyotypic heterogeneity. Such increased evolutionary potential is demonstrated by the identification of increased transcriptome dynamics associated with high levels of karyotypic variance. In contrast to impacting on a limited number of cancer genes, re-organized genomes lead to new system dynamics essential for cancer evolution. Genome chaos acts as a mechanism of rapid, adaptive, genome-based evolution that plays an essential role in promoting rapid macroevolution of new genome-defined systems during crisis, which may explain some unwanted consequences of cancer treatment.

  17. Biochemical and full genome sequence analyses of clinical Vibrio cholerae isolates in Mexico reveals the presence of novel V. cholerae strains.

    Science.gov (United States)

    Díaz-Quiñonez, José Alberto; Hernández-Monroy, Irma; Montes-Colima, Norma Angélica; Moreno-Pérez, María Asunción; Galicia-Nicolás, Adriana Guadalupe; López-Martínez, Irma; Ruiz-Matus, Cuitláhuac; Kuri-Morales, Pablo; Ortíz-Alcántara, Joanna María; Garcés-Ayala, Fabiola; Ramírez-González, José Ernesto

    2016-05-01

    The first week of September 2013, the National Epidemiological Surveillance System identified two cases of cholera in Mexico City. The cultures of both samples were confirmed as Vibrio cholerae serogroup O1, serotype Ogawa, biotype El Tor. Initial analyses by PFGE and by PCR-amplification of the virulence genes, suggested that both strains were similar, but different from those previously reported in Mexico. The following week, four more cases were identified in a community in the state of Hidalgo, located 121 km northeast of Mexico City. Thereafter a cholera outbreak started in the region of La Huasteca. Genomic analyses of the four strains obtained in this study confirmed the presence of Pathogenicity Islands VPI-1 and -2, VSP-1 and -2, and of the integrative element SXT. The genomic structure of the 4 isolates was similar to that of V. cholerae strain 2010 EL-1786, identified during the epidemic in Haiti in 2010. Copyright © 2016 Institut Pasteur. Published by Elsevier Masson SAS. All rights reserved.

  18. Savant Genome Browser 2: visualization and analysis for population-scale genomics.

    Science.gov (United States)

    Fiume, Marc; Smith, Eric J M; Brook, Andrew; Strbenac, Dario; Turner, Brian; Mezlini, Aziz M; Robinson, Mark D; Wodak, Shoshana J; Brudno, Michael

    2012-07-01

    High-throughput sequencing (HTS) technologies are providing an unprecedented capacity for data generation, and there is a corresponding need for efficient data exploration and analysis capabilities. Although most existing tools for HTS data analysis are developed for either automated (e.g. genotyping) or visualization (e.g. genome browsing) purposes, such tools are most powerful when combined. For example, integration of visualization and computation allows users to iteratively refine their analyses by updating computational parameters within the visual framework in real-time. Here we introduce the second version of the Savant Genome Browser, a standalone program for visual and computational analysis of HTS data. Savant substantially improves upon its predecessor and existing tools by introducing innovative visualization modes and navigation interfaces for several genomic datatypes, and synergizing visual and automated analyses in a way that is powerful yet easy even for non-expert users. We also present a number of plugins that were developed by the Savant Community, which demonstrate the power of integrating visual and automated analyses using Savant. The Savant Genome Browser is freely available (open source) at www.savantbrowser.com.

  19. Genome-wide methylation analysis identifies genes silenced in non-seminoma cell lines.

    Science.gov (United States)

    Noor, Dzul Azri Mohamed; Jeyapalan, Jennie N; Alhazmi, Safiah; Carr, Matthew; Squibb, Benjamin; Wallace, Claire; Tan, Christopher; Cusack, Martin; Hughes, Jaime; Reader, Tom; Shipley, Janet; Sheer, Denise; Scotting, Paul J

    2016-01-01

    Silencing of genes by DNA methylation is a common phenomenon in many types of cancer. However, the genome-wide effect of DNA methylation on gene expression has been analysed in relatively few cancers. Germ cell tumours (GCTs) are a complex group of malignancies. They are unique in developing from a pluripotent progenitor cell. Previous analyses have suggested that non-seminomas exhibit much higher levels of DNA methylation than seminomas. The genomic targets that are methylated, the extent to which this results in gene silencing and the identity of the silenced genes most likely to play a role in the tumours' biology have not yet been established. In this study, genome-wide methylation and expression analysis of GCT cell lines was combined with gene expression data from primary tumours to address this question. Genome methylation was analysed using the Illumina infinium HumanMethylome450 bead chip system and gene expression was analysed using Affymetrix GeneChip Human Genome U133 Plus 2.0 arrays. Regulation by methylation was confirmed by demethylation using 5-aza-2-deoxycytidine and reverse transcription-quantitative PCR. Large differences in the level of methylation of the CpG islands of individual genes between tumour cell lines correlated well with differential gene expression. Treatment of non-seminoma cells with 5-aza-2-deoxycytidine verified that methylation of all genes tested played a role in their silencing in yolk sac tumour cells and many of these genes were also differentially expressed in primary tumours. Genes silenced by methylation in the various GCT cell lines were identified. Several pluripotency-associated genes were identified as a major functional group of silenced genes.

  20. Comparison of carnivore, omnivore, and herbivore mammalian genomes with a new leopard assembly.

    Science.gov (United States)

    Kim, Soonok; Cho, Yun Sung; Kim, Hak-Min; Chung, Oksung; Kim, Hyunho; Jho, Sungwoong; Seomun, Hong; Kim, Jeongho; Bang, Woo Young; Kim, Changmu; An, Junghwa; Bae, Chang Hwan; Bhak, Youngjune; Jeon, Sungwon; Yoon, Hyejun; Kim, Yumi; Jun, JeHoon; Lee, HyeJin; Cho, Suan; Uphyrkina, Olga; Kostyria, Aleksey; Goodrich, John; Miquelle, Dale; Roelke, Melody; Lewis, John; Yurchenko, Andrey; Bankevich, Anton; Cho, Juok; Lee, Semin; Edwards, Jeremy S; Weber, Jessica A; Cook, Jo; Kim, Sangsoo; Lee, Hang; Manica, Andrea; Lee, Ilbeum; O'Brien, Stephen J; Bhak, Jong; Yeo, Joo-Hong

    2016-10-11

    There are three main dietary groups in mammals: carnivores, omnivores, and herbivores. Currently, there is limited comparative genomics insight into the evolution of dietary specializations in mammals. Due to recent advances in sequencing technologies, we were able to perform in-depth whole genome analyses of representatives of these three dietary groups. We investigated the evolution of carnivory by comparing 18 representative genomes from across Mammalia with carnivorous, omnivorous, and herbivorous dietary specializations, focusing on Felidae (domestic cat, tiger, lion, cheetah, and leopard), Hominidae, and Bovidae genomes. We generated a new high-quality leopard genome assembly, as well as two wild Amur leopard whole genomes. In addition to a clear contraction in gene families for starch and sucrose metabolism, the carnivore genomes showed evidence of shared evolutionary adaptations in genes associated with diet, muscle strength, agility, and other traits responsible for successful hunting and meat consumption. Additionally, an analysis of highly conserved regions at the family level revealed molecular signatures of dietary adaptation in each of Felidae, Hominidae, and Bovidae. However, unlike carnivores, omnivores and herbivores showed fewer shared adaptive signatures, indicating that carnivores are under strong selective pressure related to diet. Finally, felids showed recent reductions in genetic diversity associated with decreased population sizes, which may be due to the inflexible nature of their strict diet, highlighting their vulnerability and critical conservation status. Our study provides a large-scale family level comparative genomic analysis to address genomic changes associated with dietary specialization. Our genomic analyses also provide useful resources for diet-related genetic and health research.

  1. Comparative analyses of the complete mitochondrial genomes of Dosinia clams and their phylogenetic position within Veneridae.

    Science.gov (United States)

    Lv, Changda; Li, Qi; Kong, Lingfeng

    2018-01-01

    Mitochondrial genomes have proved to be a powerful tool in resolving phylogenetic relationship. In order to understand the mitogenome characteristics and phylogenetic position of the genus Dosinia, we sequenced the complete mitochondrial genomes of Dosinia altior and Dosinia troscheli (Bivalvia: Veneridae), compared them with that of Dosinia japonica and established a phylogenetic tree for Veneridae. The mitogenomes of D. altior (17,536 bp) and D. troscheli (17,229 bp) are the two smallest in Veneridae, which include 13 protein-coding genes, 2 ribosomal RNA genes, 22 tRNA genes, and non-coding regions. The mitogenomes of the Dosinia species are similar in size, gene content, AT content, AT- and GC- skews, and gene arrangement. The phylogenetic relationships of family Veneridae were established based on 12 concatenated protein-coding genes using maximum likelihood and Bayesian analyses, which supported that Dosininae and Meretricinae have a closer relationship, with Tapetinae being the sister taxon. The information obtained in this study will contribute to further understanding of the molecular features of bivalve mitogenomes and the evolutionary history of the genus Dosinia.

  2. Comparative genomic analyses of Mycoplasma hyopneumoniae pathogenic 168 strain and its high-passaged attenuated strain

    Science.gov (United States)

    2013-01-01

    Background Mycoplasma hyopneumoniae is the causative agent of porcine enzootic pneumonia (EP), a mild, chronic pneumonia of swine. Despite presenting with low direct mortality, EP is responsible for major economic losses in the pig industry. To identify the virulence-associated determinants of M. hyopneumoniae, we determined the whole genome sequence of M. hyopneumoniae strain 168 and its attenuated high-passage strain 168-L and carried out comparative genomic analyses. Results We performed the first comprehensive analysis of M. hyopneumoniae strain 168 and its attenuated strain and made a preliminary survey of coding sequences (CDSs) that may be related to virulence. The 168-L genome has a highly similar gene content and order to that of 168, but is 4,483 bp smaller because there are 60 insertions and 43 deletions in 168-L. Besides these indels, 227 single nucleotide variations (SNVs) were identified. We further investigated the variants that affected CDSs, and compared them to reported virulence determinants. Notably, almost all of the reported virulence determinants are included in these variants affected CDSs. In addition to variations previously described in mycoplasma adhesins (P97, P102, P146, P159, P216, and LppT), cell envelope proteins (P95), cell surface antigens (P36), secreted proteins and chaperone protein (DnaK), mutations in genes related to metabolism and growth may also contribute to the attenuated virulence in 168-L. Furthermore, many mutations were located in the previously described repeat motif, which may be of primary importance for virulence. Conclusions We studied the virulence attenuation mechanism of M. hyopneumoniae by comparative genomic analysis of virulent strain 168 and its attenuated high-passage strain 168-L. Our findings provide a preliminary survey of CDSs that may be related to virulence. While these include reported virulence-related genes, other novel virulence determinants were also detected. This new information will form

  3. Comparative genomic and phylogenomic analyses of the Bifidobacteriaceae family

    Czech Academy of Sciences Publication Activity Database

    Lugli, G. A.; Milani, C.; Turroni, F.; Duranti, S.; Mancabelli, L.; Mangifesta, M.; Ferrario, C.; Modesto, M.; Mattarelli, P.; Killer, Jiří; van Sinderen, D.

    2017-01-01

    Roč. 18, č. 1 (2017), č. článku 568. ISSN 1471-2164 Institutional support: RVO:67985904 Keywords : Bifidobacteriaceae * genomics * phlogenomics Subject RIV: EE - Microbiology, Virology OBOR OECD: Microbiology Impact factor: 3.729, year: 2016

  4. Elucidating the triplicated ancestral genome structure of radish based on chromosome-level comparison with the Brassica genomes.

    Science.gov (United States)

    Jeong, Young-Min; Kim, Namshin; Ahn, Byung Ohg; Oh, Mijin; Chung, Won-Hyong; Chung, Hee; Jeong, Seongmun; Lim, Ki-Byung; Hwang, Yoon-Jung; Kim, Goon-Bo; Baek, Seunghoon; Choi, Sang-Bong; Hyung, Dae-Jin; Lee, Seung-Won; Sohn, Seong-Han; Kwon, Soo-Jin; Jin, Mina; Seol, Young-Joo; Chae, Won Byoung; Choi, Keun Jin; Park, Beom-Seok; Yu, Hee-Ju; Mun, Jeong-Hwan

    2016-07-01

    This study presents a chromosome-scale draft genome sequence of radish that is assembled into nine chromosomal pseudomolecules. A comprehensive comparative genome analysis with the Brassica genomes provides genomic evidences on the evolution of the mesohexaploid radish genome. Radish (Raphanus sativus L.) is an agronomically important root vegetable crop and its origin and phylogenetic position in the tribe Brassiceae is controversial. Here we present a comprehensive analysis of the radish genome based on the chromosome sequences of R. sativus cv. WK10039. The radish genome was sequenced and assembled into 426.2 Mb spanning >98 % of the gene space, of which 344.0 Mb were integrated into nine chromosome pseudomolecules. Approximately 36 % of the genome was repetitive sequences and 46,514 protein-coding genes were predicted and annotated. Comparative mapping of the tPCK-like ancestral genome revealed that the radish genome has intermediate characteristics between the Brassica A/C and B genomes in the triplicated segments, suggesting an internal origin from the genus Brassica. The evolutionary characteristics shared between radish and other Brassica species provided genomic evidences that the current form of nine chromosomes in radish was rearranged from the chromosomes of hexaploid progenitor. Overall, this study provides a chromosome-scale draft genome sequence of radish as well as novel insight into evolution of the mesohexaploid genomes in the tribe Brassiceae.

  5. Comparative genomic analysis by microbial COGs self-attraction rate.

    Science.gov (United States)

    Santoni, Daniele; Romano-Spica, Vincenzo

    2009-06-21

    Whole genome analysis provides new perspectives to determine phylogenetic relationships among microorganisms. The availability of whole nucleotide sequences allows different levels of comparison among genomes by several approaches. In this work, self-attraction rates were considered for each cluster of orthologous groups of proteins (COGs) class in order to analyse gene aggregation levels in physical maps. Phylogenetic relationships among microorganisms were obtained by comparing self-attraction coefficients. Eighteen-dimensional vectors were computed for a set of 168 completely sequenced microbial genomes (19 archea, 149 bacteria). The components of the vector represent the aggregation rate of the genes belonging to each of 18 COGs classes. Genes involved in nonessential functions or related to environmental conditions showed the highest aggregation rates. On the contrary genes involved in basic cellular tasks showed a more uniform distribution along the genome, except for translation genes. Self-attraction clustering approach allowed classification of Proteobacteria, Bacilli and other species belonging to Firmicutes. Rearrangement and Lateral Gene Transfer events may influence divergences from classical taxonomy. Each set of COG classes' aggregation values represents an intrinsic property of the microbial genome. This novel approach provides a new point of view for whole genome analysis and bacterial characterization.

  6. SNUGB: a versatile genome browser supporting comparative and functional fungal genomics

    Directory of Open Access Journals (Sweden)

    Kim Seungill

    2008-12-01

    Full Text Available Abstract Background Since the full genome sequences of Saccharomyces cerevisiae were released in 1996, genome sequences of over 90 fungal species have become publicly available. The heterogeneous formats of genome sequences archived in different sequencing centers hampered the integration of the data for efficient and comprehensive comparative analyses. The Comparative Fungal Genomics Platform (CFGP was developed to archive these data via a single standardized format that can support multifaceted and integrated analyses of the data. To facilitate efficient data visualization and utilization within and across species based on the architecture of CFGP and associated databases, a new genome browser was needed. Results The Seoul National University Genome Browser (SNUGB integrates various types of genomic information derived from 98 fungal/oomycete (137 datasets and 34 plant and animal (38 datasets species, graphically presents germane features and properties of each genome, and supports comparison between genomes. The SNUGB provides three different forms of the data presentation interface, including diagram, table, and text, and six different display options to support visualization and utilization of the stored information. Information for individual species can be quickly accessed via a new tool named the taxonomy browser. In addition, SNUGB offers four useful data annotation/analysis functions, including 'BLAST annotation.' The modular design of SNUGB makes its adoption to support other comparative genomic platforms easy and facilitates continuous expansion. Conclusion The SNUGB serves as a powerful platform supporting comparative and functional genomics within the fungal kingdom and also across other kingdoms. All data and functions are available at the web site http://genomebrowser.snu.ac.kr/.

  7. Full-length genome sequences of porcine epidemic diarrhoea virus strain CV777; Use of NGS to analyse genomic and sub-genomic RNAs

    DEFF Research Database (Denmark)

    Rasmussen, Thomas Bruun; Boniotti, Maria Beatrice; Papetti, Alice

    2018-01-01

    Porcine epidemic diarrhoea virus, strain CV777, was initially characterized in 1978 as the causative agent of a disease first identified in the UK in 1971. This coronavirus has been widely distributed among laboratories and has been passaged both within pigs and in cell culture. To determine...... the variability between different stocks of the PEDV strain CV777, sequencing of the full-length genome (ca. 28kb) has been performed in 6 different laboratories, using different protocols. Not surprisingly, each of the different full genome sequences were distinct from each other and from the reference sequence...... the analysis of sub-genomic mRNAs from infected cells. It is clearly important to know the features of the specific sample of CV777 being used for experimental studies....

  8. The spotted gar genome illuminates vertebrate evolution and facilitates human-to-teleost comparisons

    Science.gov (United States)

    Braasch, Ingo; Gehrke, Andrew R.; Smith, Jeramiah J.; Kawasaki, Kazuhiko; Manousaki, Tereza; Pasquier, Jeremy; Amores, Angel; Desvignes, Thomas; Batzel, Peter; Catchen, Julian; Berlin, Aaron M.; Campbell, Michael S.; Barrell, Daniel; Martin, Kyle J.; Mulley, John F.; Ravi, Vydianathan; Lee, Alison P.; Nakamura, Tetsuya; Chalopin, Domitille; Fan, Shaohua; Wcisel, Dustin; Cañestro, Cristian; Sydes, Jason; Beaudry, Felix E. G.; Sun, Yi; Hertel, Jana; Beam, Michael J.; Fasold, Mario; Ishiyama, Mikio; Johnson, Jeremy; Kehr, Steffi; Lara, Marcia; Letaw, John H.; Litman, Gary W.; Litman, Ronda T.; Mikami, Masato; Ota, Tatsuya; Saha, Nil Ratan; Williams, Louise; Stadler, Peter F.; Wang, Han; Taylor, John S.; Fontenot, Quenton; Ferrara, Allyse; Searle, Stephen M. J.; Aken, Bronwen; Yandell, Mark; Schneider, Igor; Yoder, Jeffrey A.; Volff, Jean-Nicolas; Meyer, Axel; Amemiya, Chris T.; Venkatesh, Byrappa; Holland, Peter W. H.; Guiguen, Yann; Bobe, Julien; Shubin, Neil H.; Di Palma, Federica; Alföldi, Jessica; Lindblad-Toh, Kerstin; Postlethwait, John H.

    2016-01-01

    To connect human biology to fish biomedical models, we sequenced the genome of spotted gar (Lepisosteus oculatus), whose lineage diverged from teleosts before the teleost genome duplication (TGD). The slowly evolving gar genome conserved in content and size many entire chromosomes from bony vertebrate ancestors. Gar bridges teleosts to tetrapods by illuminating the evolution of immunity, mineralization, and development (e.g., Hox, ParaHox, and miRNA genes). Numerous conserved non-coding elements (CNEs, often cis-regulatory) undetectable in direct human-teleost comparisons become apparent using gar: functional studies uncovered conserved roles of such cryptic CNEs, facilitating annotation of sequences identified in human genome-wide association studies. Transcriptomic analyses revealed that the sum of expression domains and levels from duplicated teleost genes often approximate patterns and levels of gar genes, consistent with subfunctionalization. The gar genome provides a resource for understanding evolution after genome duplication, the origin of vertebrate genomes, and the function of human regulatory sequences. PMID:26950095

  9. Genome-wide analysis of the human Alu Yb-lineage

    Directory of Open Access Journals (Sweden)

    Carter Anthony B

    2004-03-01

    Full Text Available Abstract The Alu Yb-lineage is a 'young' primarily human-specific group of short interspersed element (SINE subfamilies that have integrated throughout the human genome. In this study, we have computationally screened the draft sequence of the human genome for Alu Yb-lineage subfamily members present on autosomal chromosomes. A total of 1,733 Yb Alu subfamily members have integrated into human autosomes. The average ages of Yb-lineage subfamilies, Yb7, Yb8 and Yb9, are estimated as 4.81, 2.39 and 2.32 million years, respectively. In order to determine the contribution of the Alu Yb-lineage to human genomic diversity, 1,202 loci were analysed using polymerase chain reaction (PCR-based assays, which amplify the genomic regions containing individual Yb-lineage subfamily members. Approximately 20 per cent of the Yb-lineage Alu elements are polymorphic for insertion presence/absence in the human genome. Fewer than 0.5 per cent of the Yb loci also demonstrate insertions at orthologous positions in non-human primate genomes. Genomic sequencing of these unusual loci demonstrates that each of the orthologous loci from non-human primate genomes contains older Y, Sg and Sx Alu family members that have been altered, through various mechanisms, into Yb8 sequences. These data suggest that Alu Yb-lineage subfamily members are largely restricted to the human genome. The high copy number, level of insertion polymorphism and estimated age indicate that members of the Alu Yb elements will be useful in a wide range of genetic analyses.

  10. Relationship between Deleterious Variation, Genomic Autozygosity, and Disease Risk: Insights from The 1000 Genomes Project.

    Science.gov (United States)

    Pemberton, Trevor J; Szpiech, Zachary A

    2018-04-05

    Genomic regions of autozygosity (ROAs) represent segments of individual genomes that are homozygous for haplotypes inherited identical-by-descent (IBD) from a common ancestor. ROAs are nonuniformly distributed across the genome, and increased ROA levels are a reported risk factor for numerous complex diseases. Previously, we hypothesized that long ROAs are enriched for deleterious homozygotes as a result of young haplotypes with recent deleterious mutations-relatively untouched by purifying selection-being paired IBD as a consequence of recent parental relatedness, a pattern supported by ROA and whole-exome sequence data on 27 individuals. Here, we significantly bolster support for our hypothesis and expand upon our original analyses using ROA and whole-genome sequence data on 2,436 individuals from The 1000 Genomes Project. Considering CADD deleteriousness scores, we reaffirm our previous observation that long ROAs are enriched for damaging homozygotes worldwide. We show that strongly damaging homozygotes experience greater enrichment than weaker damaging homozygotes, while overall enrichment varies appreciably among populations. Mendelian disease genes and those encoding FDA-approved drug targets have significantly increased rates of gain in damaging homozygotes with increasing ROA coverage relative to all other genes. In genes implicated in eight complex phenotypes for which ROA levels have been identified as a risk factor, rates of gain in damaging homozygotes vary across phenotypes and populations but frequently differ significantly from non-disease genes. These findings highlight the potential confounding effects of population background in the assessment of associations between ROA levels and complex disease risk, which might underlie reported inconsistencies in ROA-phenotype associations. Copyright © 2018 American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.

  11. Genome and transcriptome analyses of the mountain pine beetle-fungal symbiont Grosmannia clavigera, a lodgepole pine pathogen.

    Science.gov (United States)

    DiGuistini, Scott; Wang, Ye; Liao, Nancy Y; Taylor, Greg; Tanguay, Philippe; Feau, Nicolas; Henrissat, Bernard; Chan, Simon K; Hesse-Orce, Uljana; Alamouti, Sepideh Massoumi; Tsui, Clement K M; Docking, Roderick T; Levasseur, Anthony; Haridas, Sajeet; Robertson, Gordon; Birol, Inanc; Holt, Robert A; Marra, Marco A; Hamelin, Richard C; Hirst, Martin; Jones, Steven J M; Bohlmann, Jörg; Breuil, Colette

    2011-02-08

    In western North America, the current outbreak of the mountain pine beetle (MPB) and its microbial associates has destroyed wide areas of lodgepole pine forest, including more than 16 million hectares in British Columbia. Grosmannia clavigera (Gc), a critical component of the outbreak, is a symbiont of the MPB and a pathogen of pine trees. To better understand the interactions between Gc, MPB, and lodgepole pine hosts, we sequenced the ∼30-Mb Gc genome and assembled it into 18 supercontigs. We predict 8,314 protein-coding genes, and support the gene models with proteome, expressed sequence tag, and RNA-seq data. We establish that Gc is heterothallic, and report evidence for repeat-induced point mutation. We report insights, from genome and transcriptome analyses, into how Gc tolerates conifer-defense chemicals, including oleoresin terpenoids, as they colonize a host tree. RNA-seq data indicate that terpenoids induce a substantial antimicrobial stress in Gc, and suggest that the fungus may detoxify these chemicals by using them as a carbon source. Terpenoid treatment strongly activated a ∼100-kb region of the Gc genome that contains a set of genes that may be important for detoxification of these host-defense chemicals. This work is a major step toward understanding the biological interactions between the tripartite MPB/fungus/forest system.

  12. Comparison of analyses of the QTLMAS XIII common dataset. I: genomic selection

    NARCIS (Netherlands)

    Bastiaansen, J.W.M.; Bink, M.C.A.M.; Coster, A.; Maliepaard, C.A.; Calus, M.P.L.

    2010-01-01

    Background - Genomic selection, the use of markers across the whole genome, receives increasing amounts of attention and is having more and more impact on breeding programs. Development of statistical and computational methods to estimate breeding values based on markers is a very active area of

  13. Nucleotide sequence analyses of genomic RNAs of peanut stunt virus Mi, the type strain representative of a novel PSV subgroup from China

    NARCIS (Netherlands)

    Yan, L.; Xu, Z.; Goldbach, R.W.; Chen, Y.K.; Prins, M.W.

    2005-01-01

    The complete nucleotide sequence of Peanut stunt virus strain Mi (PSV-Mi) from China was determined and compared to other viruses of the genus Cucumovirus. The tripartite genome of PSV-Mi encoded five open reading frames (ORFs) typical of cucumoviruses. Distance analyses of four ORFs indicated that

  14. Genomic ancestry and education level independently influence abdominal fat distributions in a Brazilian admixed population.

    Science.gov (United States)

    França, Giovanny Vinícius Araújo de; De Lucia Rolfe, Emanuella; Horta, Bernardo Lessa; Gigante, Denise Petrucci; Yudkin, John S; Ong, Ken K; Victora, Cesar Gomes

    2017-01-01

    We aimed to identify the independent associations of genomic ancestry and education level with abdominal fat distributions in the 1982 Pelotas birth cohort study, Brazil. In 2,890 participants (1,409 men and 1,481 women), genomic ancestry was assessed using genotype data on 370,539 genome-wide variants to quantify ancestral proportions in each individual. Years of completed education was used to indicate socio-economic position. Visceral fat depth and subcutaneous abdominal fat thickness were measured by ultrasound at age 29-31y; these measures were adjusted for BMI to indicate abdominal fat distributions. Linear regression models were performed, separately by sex. Admixture was observed between European (median proportion 85.3), African (6.6), and Native American (6.3) ancestries, with a strong inverse correlation between the African and European ancestry scores (ρ = -0.93; pabdominal fat distributions in men (both P = 0.001), and inversely associated with subcutaneous abdominal fat distribution in women (p = 0.009). Independent of genomic ancestry, higher education level was associated with lower visceral fat, but higher subcutaneous fat, in both men and women (all pabdominal fat distribution in adults. African ancestry appeared to lower abdominal fat distributions, particularly in men.

  15. Comparative Genetic Analyses of Human Rhinovirus C (HRV-C) Complete Genome from Malaysia

    Science.gov (United States)

    Khaw, Yam Sim; Chan, Yoke Fun; Jafar, Faizatul Lela; Othman, Norlijah; Chee, Hui Yee

    2016-01-01

    Human rhinovirus-C (HRV-C) has been implicated in more severe illnesses than HRV-A and HRV-B, however, the limited number of HRV-C complete genomes (complete 5′ and 3′ non-coding region and open reading frame sequences) has hindered the in-depth genetic study of this virus. This study aimed to sequence seven complete HRV-C genomes from Malaysia and compare their genetic characteristics with the 18 published HRV-Cs. Seven Malaysian HRV-C complete genomes were obtained with newly redesigned primers. The seven genomes were classified as HRV-C6, C12, C22, C23, C26, C42, and pat16 based on the VP4/VP2 and VP1 pairwise distance threshold classification. Five of the seven Malaysian isolates, namely, 3430-MY-10/C22, 8713-MY-10/C23, 8097-MY-11/C26, 1570-MY-10/C42, and 7383-MY-10/pat16 are the first newly sequenced complete HRV-C genomes. All seven Malaysian isolates genomes displayed nucleotide similarity of 63–81% among themselves and 63–96% with other HRV-Cs. Malaysian HRV-Cs had similar putative immunogenic sites, putative receptor utilization and potential antiviral sites as other HRV-Cs. The genomic features of Malaysian isolates were similar to those of other HRV-Cs. Negative selections were frequently detected in HRV-Cs complete coding sequences indicating that these sequences were under functional constraint. The present study showed that HRV-Cs from Malaysia have diverse genetic sequences but share conserved genomic features with other HRV-Cs. This genetic information could provide further aid in the understanding of HRV-C infection. PMID:27199901

  16. Recent developments in genome and exome-wide analyses of plasma lipids.

    Science.gov (United States)

    Lange, Leslie A; Willer, Cristen J; Rich, Stephen S

    2015-04-01

    Genome-wide association scans (GWAS) have identified over 100 human loci associated with variation in lipids. The identification of novel genes and variants that affect lipid levels is made possible by next-generation sequencing, rare variant discovery and analytic advances. The current status of the genetic basis of lipid traits will be presented. Expansion of GWAS sample sizes for lipid traits has not substantially increased the proportion of trait variance explained by common genetic variants (less than 15% of trait variation captured). Although GWAS has discovered novel loci and pathways with putative biological function and impact on cardiovascular disease risk, discovery of the genes in these loci remains challenging. Exome sequencing promises to identify genes with protein-coding variants with a large impact on lipids, as shown for LDL-cholesterol levels associated with novel (PNPLA5) and known (LDLR, PCSK9, APOB) genes. Current results have increased our understanding of the genetic architecture of lipids, expanding the range of effect and frequency for variants identified for lipid traits. Identification of novel lipid-associated gene variants, even if small in effect or rare in the population, could provide important novel drug targets and biological pathways for dyslipidemia.

  17. saSNP Approach for Scalable SNP Analyses of Multiple Bacterial or Viral Genomes

    Energy Technology Data Exchange (ETDEWEB)

    Gardner, Shea [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Slezak, Tom [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)

    2010-07-27

    With the flood of whole genome finished and draft microbial sequences, we need faster, more scalable bioinformatics tools for sequence comparison. An algorithm is described to find single nucleotide polymorphisms (SNPs) in whole genome data. It scales to hundreds of bacterial or viral genomes, and can be used for finished and/or draft genomes available as unassembled contigs. The method is fast to compute, finding SNPs and building a SNP phylogeny in seconds to hours. We use it to identify thousands of putative SNPs from all publicly available Filoviridae, Poxviridae, foot-and-mouth disease virus, Bacillus, and Escherichia coli genomes and plasmids. The SNP-based trees that result are consistent with known taxonomy and trees determined in other studies. The approach we describe can handle as input hundreds of gigabases of sequence in a single run. The algorithm is based on k-mer analysis using a suffix array, so we call it saSNP.

  18. Permanent Draft Genome of Strain ESFC-1: Ecological Genomics of a Newly Discovered Lineage of Filamentous Diazotrophic Cyanobacteria

    Science.gov (United States)

    Everroad, R. Craig; Stuart, Rhona K.; Bebout, Brad M.; Detweiler, Angela M.; Lee, Jackson Zan; Woebken, Dagmar; Bebout, Leslie E.; Pett-Ridge, Jennifer

    2016-01-01

    The nonheterocystous filamentous cyanobacterium, strain ESFC-1, is a recently described member of the order Oscillatoriales within the Cyanobacteria. ESFC-1 has been shown to be a major diazotroph in the intertidal microbial mat system at Elkhorn Slough, CA, USA. Based on phylogenetic analyses of the 16S RNA gene, ESFC-1 appears to belong to a unique, genus-level divergence; the draft genome sequence of this strain has now been determined. Here we report features of this genome as they relate to the ecological functions and capabilities of strain ESFC-1. The 5,632,035 bp genome sequence encodes 4914 protein-coding genes and 92 RNA genes. One striking feature of this cyanobacterium is the apparent lack of either uptake or bi-directional hydrogenases typically expected within a diazotroph. Additionally, a large genomic island is found that contains numerous low GC-content genes and genes related to extracellular polysaccharide production and cell wall synthesis and maintenance.

  19. Phylogenomic analyses data of the avian phylogenomics project

    DEFF Research Database (Denmark)

    Jarvis, Erich D; Mirarab, Siavash; Aberer, Andre J

    2015-01-01

    BACKGROUND: Determining the evolutionary relationships among the major lineages of extant birds has been one of the biggest challenges in systematic biology. To address this challenge, we assembled or collected the genomes of 48 avian species spanning most orders of birds, including all Neognathae...... and two of the five Palaeognathae orders. We used these genomes to construct a genome-scale avian phylogenetic tree and perform comparative genomic analyses. FINDINGS: Here we present the datasets associated with the phylogenomic analyses, which include sequence alignment files consisting of nucleotides......ML algorithm or when using statistical binning with the coalescence-based MP-EST algorithm (which we refer to as MP-EST*). Other data sets, such as the coding sequence of some exons, revealed other properties of genome evolution, namely convergence. CONCLUSIONS: The Avian Phylogenomics Project is the largest...

  20. Ultrafast comparison of personal genomes

    OpenAIRE

    Mauldin, Denise; Hood, Leroy; Robinson, Max; Glusman, Gustavo

    2017-01-01

    We present an ultra-fast method for comparing personal genomes. We transform the standard genome representation (lists of variants relative to a reference) into 'genome fingerprints' that can be readily compared across sequencing technologies and reference versions. Because of their reduced size, computation on the genome fingerprints is fast and requires little memory. This enables scaling up a variety of important genome analyses, including quantifying relatedness, recognizing duplicative s...

  1. Genetic variants associated with subjective well-being, depressive symptoms and neuroticism identified through genome-wide analyses

    Science.gov (United States)

    Derringer, Jaime; Gratten, Jacob; Lee, James J; Liu, Jimmy Z; de Vlaming, Ronald; Ahluwalia, Tarunveer S; Buchwald, Jadwiga; Cavadino, Alana; Frazier-Wood, Alexis C; Davies, Gail; Furlotte, Nicholas A; Garfield, Victoria; Geisel, Marie Henrike; Gonzalez, Juan R; Haitjema, Saskia; Karlsson, Robert; van der Laan, Sander W; Ladwig, Karl-Heinz; Lahti, Jari; van der Lee, Sven J; Miller, Michael B; Lind, Penelope A; Liu, Tian; Matteson, Lindsay; Mihailov, Evelin; Minica, Camelia C; Nolte, Ilja M; Mook-Kanamori, Dennis O; van der Most, Peter J; Oldmeadow, Christopher; Qian, Yong; Raitakari, Olli; Rawal, Rajesh; Realo, Anu; Rueedi, Rico; Schmidt, Börge; Smith, Albert V; Stergiakouli, Evie; Tanaka, Toshiko; Taylor, Kent; Thorleifsson, Gudmar; Wedenoja, Juho; Wellmann, Juergen; Westra, Harm-Jan; Willems, Sara M; Zhao, Wei; Amin, Najaf; Bakshi, Andrew; Bergmann, Sven; Bjornsdottir, Gyda; Boyle, Patricia A; Cherney, Samantha; Cox, Simon R; Davis, Oliver S P; Ding, Jun; Direk, Nese; Eibich, Peter; Emeny, Rebecca T; Fatemifar, Ghazaleh; Faul, Jessica D; Ferrucci, Luigi; Forstner, Andreas J; Gieger, Christian; Gupta, Richa; Harris, Tamara B; Harris, Juliette M; Holliday, Elizabeth G; Hottenga, Jouke-Jan; De Jager, Philip L; Kaakinen, Marika A; Kajantie, Eero; Karhunen, Ville; Kolcic, Ivana; Kumari, Meena; Launer, Lenore J; Franke, Lude; Li-Gao, Ruifang; Liewald, David C; Koini, Marisa; Loukola, Anu; Marques-Vidal, Pedro; Montgomery, Grant W; Mosing, Miriam A; Paternoster, Lavinia; Pattie, Alison; Petrovic, Katja E; Pulkki-Råback, Laura; Quaye, Lydia; Räikkönen, Katri; Rudan, Igor; Scott, Rodney J; Smith, Jennifer A; Sutin, Angelina R; Trzaskowski, Maciej; Vinkhuyzen, Anna E; Yu, Lei; Zabaneh, Delilah; Attia, John R; Bennett, David A; Berger, Klaus; Bertram, Lars; Boomsma, Dorret I; Snieder, Harold; Chang, Shun-Chiao; Cucca, Francesco; Deary, Ian J; van Duijn, Cornelia M; Eriksson, Johan G; Bültmann, Ute; de Geus, Eco J C; Groenen, Patrick J F; Gudnason, Vilmundur; Hansen, Torben; Hartman, Catharine A; Haworth, Claire M A; Hayward, Caroline; Heath, Andrew C; Hinds, David A; Hyppönen, Elina; Iacono, William G; Järvelin, Marjo-Riitta; Jöckel, Karl-Heinz; Kaprio, Jaakko; Kardia, Sharon L R; Keltikangas-Järvinen, Liisa; Kraft, Peter; Kubzansky, Laura D; Lehtimäki, Terho; Magnusson, Patrik K E; Martin, Nicholas G; McGue, Matt; Metspalu, Andres; Mills, Melinda; de Mutsert, Renée; Oldehinkel, Albertine J; Pasterkamp, Gerard; Pedersen, Nancy L; Plomin, Robert; Polasek, Ozren; Power, Christine; Rich, Stephen S; Rosendaal, Frits R; den Ruijter, Hester M; Schlessinger, David; Schmidt, Helena; Svento, Rauli; Schmidt, Reinhold; Alizadeh, Behrooz Z; Sørensen, Thorkild I A; Spector, Tim D; Starr, John M; Stefansson, Kari; Steptoe, Andrew; Terracciano, Antonio; Thorsteinsdottir, Unnur; Thurik, A Roy; Timpson, Nicholas J; Tiemeier, Henning; Uitterlinden, André G; Vollenweider, Peter; Wagner, Gert G; Weir, David R; Yang, Jian; Conley, Dalton C; Smith, George Davey; Hofman, Albert; Johannesson, Magnus; Laibson, David I; Medland, Sarah E; Meyer, Michelle N; Pickrell, Joseph K; Esko, Tõnu; Krueger, Robert F; Beauchamp, Jonathan P; Koellinger, Philipp D; Benjamin, Daniel J; Bartels, Meike; Cesarini, David

    2016-01-01

    We conducted genome-wide association studies of three phenotypes: subjective well-being (N = 298,420), depressive symptoms (N = 161,460), and neuroticism (N = 170,910). We identified three variants associated with subjective well-being, two with depressive symptoms, and eleven with neuroticism, including two inversion polymorphisms. The two depressive symptoms loci replicate in an independent depression sample. Joint analyses that exploit the high genetic correlations between the phenotypes (|ρ^| ≈ 0.8) strengthen the overall credibility of the findings, and allow us to identify additional variants. Across our phenotypes, loci regulating expression in central nervous system and adrenal/pancreas tissues are strongly enriched for association. PMID:27089181

  2. Five Complete Chloroplast Genome Sequences from Diospyros: Genome Organization and Comparative Analysis.

    Science.gov (United States)

    Fu, Jianmin; Liu, Huimin; Hu, Jingjing; Liang, Yuqin; Liang, Jinjun; Wuyun, Tana; Tan, Xiaofeng

    2016-01-01

    Diospyros is the largest genus in Ebenaceae, comprising more than 500 species with remarkable economic value, especially Diospyros kaki Thunb., which has traditionally been an important food resource in China, Korea, and Japan. Complete chloroplast (cp) genomes from D. kaki, D. lotus L., D. oleifera Cheng., D. glaucifolia Metc., and Diospyros 'Jinzaoshi' were sequenced using Illumina sequencing technology. This is the first cp genome reported in Ebenaceae. The cp genome sequences of Diospyros ranged from 157,300 to 157,784 bp in length, presenting a typical quadripartite structure with two inverted repeats each separated by one large and one small single-copy region. For each cp genome, 134 genes were annotated, including 80 protein-coding, 31 tRNA, and 4 rRNA unique genes. In all, 179 repeats and 283 single sequence repeats were identified. Four hypervariable regions, namely, intergenic region of trnQ_rps16, trnV_ndhC, and psbD_trnT, and intron of ndhA, were identified in the Diospyros genomes. Phylogenetic analyses based on the whole cp genome, protein-coding, and intergenic and intron sequences indicated that D. oleifera is closely related to D. kaki and could be used as a model plant for future research on D. kaki; to our knowledge, this is proposed for the first time. Further, these analyses together with two large deletions (301 and 140 bp) in the cp genome of D. 'Jinzaoshi', support its placement as a new species in Diospyros. Both maximum parsimony and likelihood analyses for 19 taxa indicated the basal position of Ericales in asterids and suggested that Ebenaceae is monophyletic in Ericales.

  3. Five Complete Chloroplast Genome Sequences from Diospyros: Genome Organization and Comparative Analysis.

    Directory of Open Access Journals (Sweden)

    Jianmin Fu

    Full Text Available Diospyros is the largest genus in Ebenaceae, comprising more than 500 species with remarkable economic value, especially Diospyros kaki Thunb., which has traditionally been an important food resource in China, Korea, and Japan. Complete chloroplast (cp genomes from D. kaki, D. lotus L., D. oleifera Cheng., D. glaucifolia Metc., and Diospyros 'Jinzaoshi' were sequenced using Illumina sequencing technology. This is the first cp genome reported in Ebenaceae. The cp genome sequences of Diospyros ranged from 157,300 to 157,784 bp in length, presenting a typical quadripartite structure with two inverted repeats each separated by one large and one small single-copy region. For each cp genome, 134 genes were annotated, including 80 protein-coding, 31 tRNA, and 4 rRNA unique genes. In all, 179 repeats and 283 single sequence repeats were identified. Four hypervariable regions, namely, intergenic region of trnQ_rps16, trnV_ndhC, and psbD_trnT, and intron of ndhA, were identified in the Diospyros genomes. Phylogenetic analyses based on the whole cp genome, protein-coding, and intergenic and intron sequences indicated that D. oleifera is closely related to D. kaki and could be used as a model plant for future research on D. kaki; to our knowledge, this is proposed for the first time. Further, these analyses together with two large deletions (301 and 140 bp in the cp genome of D. 'Jinzaoshi', support its placement as a new species in Diospyros. Both maximum parsimony and likelihood analyses for 19 taxa indicated the basal position of Ericales in asterids and suggested that Ebenaceae is monophyletic in Ericales.

  4. Genome-Wide Approaches to Drosophila Heart Development

    Directory of Open Access Journals (Sweden)

    Manfred Frasch

    2016-05-01

    Full Text Available The development of the dorsal vessel in Drosophila is one of the first systems in which key mechanisms regulating cardiogenesis have been defined in great detail at the genetic and molecular level. Due to evolutionary conservation, these findings have also provided major inputs into studies of cardiogenesis in vertebrates. Many of the major components that control Drosophila cardiogenesis were discovered based on candidate gene approaches and their functions were defined by employing the outstanding genetic tools and molecular techniques available in this system. More recently, approaches have been taken that aim to interrogate the entire genome in order to identify novel components and describe genomic features that are pertinent to the regulation of heart development. Apart from classical forward genetic screens, the availability of the thoroughly annotated Drosophila genome sequence made new genome-wide approaches possible, which include the generation of massive numbers of RNA interference (RNAi reagents that were used in forward genetic screens, as well as studies of the transcriptomes and proteomes of the developing heart under normal and experimentally manipulated conditions. Moreover, genome-wide chromatin immunoprecipitation experiments have been performed with the aim to define the full set of genomic binding sites of the major cardiogenic transcription factors, their relevant target genes, and a more complete picture of the regulatory network that drives cardiogenesis. This review will give an overview on these genome-wide approaches to Drosophila heart development and on computational analyses of the obtained information that ultimately aim to provide a description of this process at the systems level.

  5. Clustering of Pan- and Core-genome of Lactobacillus provides Novel Evolutionary Insights for Differentiation.

    Science.gov (United States)

    Inglin, Raffael C; Meile, Leo; Stevens, Marc J A

    2018-04-24

    Bacterial taxonomy aims to classify bacteria based on true evolutionary events and relies on a polyphasic approach that includes phenotypic, genotypic and chemotaxonomic analyses. Until now, complete genomes are largely ignored in taxonomy. The genus Lactobacillus consists of 173 species and many genomes are available to study taxonomy and evolutionary events. We analyzed and clustered 98 completely sequenced genomes of the genus Lactobacillus and 234 draft genomes of 5 different Lactobacillus species, i.e. L. reuteri, L. delbrueckii, L. plantarum, L. rhamnosus and L. helveticus. The core-genome of the genus Lactobacillus contains 266 genes and the pan-genome 20'800 genes. Clustering of the Lactobacillus pan- and core-genome resulted in two highly similar trees. This shows that evolutionary history is traceable in the core-genome and that clustering of the core-genome is sufficient to explore relationships. Clustering of core- and pan-genomes at species' level resulted in similar trees as well. Detailed analyses of the core-genomes showed that the functional class "genetic information processing" is conserved in the core-genome but that "signaling and cellular processes" is not. The latter class encodes functions that are involved in environmental interactions. Evolution of lactobacilli seems therefore directed by the environment. The type species L. delbrueckii was analyzed in detail and its pan-genome based tree contained two major clades whose members contained different genes yet identical functions. In addition, evidence for horizontal gene transfer between strains of L. delbrueckii, L. plantarum, and L. rhamnosus, and between species of the genus Lactobacillus is presented. Our data provide evidence for evolution of some lactobacilli according to a parapatric-like model for species differentiation. Core-genome trees are useful to detect evolutionary relationships in lactobacilli and might be useful in taxonomic analyses. Lactobacillus' evolution is directed

  6. i-Genome: A database to summarize oligonucleotide data in genomes

    Directory of Open Access Journals (Sweden)

    Chang Yu-Chung

    2004-10-01

    Full Text Available Abstract Background Information on the occurrence of sequence features in genomes is crucial to comparative genomics, evolutionary analysis, the analyses of regulatory sequences and the quantitative evaluation of sequences. Computing the frequencies and the occurrences of a pattern in complete genomes is time-consuming. Results The proposed database provides information about sequence features generated by exhaustively computing the sequences of the complete genome. The repetitive elements in the eukaryotic genomes, such as LINEs, SINEs, Alu and LTR, are obtained from Repbase. The database supports various complete genomes including human, yeast, worm, and 128 microbial genomes. Conclusions This investigation presents and implements an efficiently computational approach to accumulate the occurrences of the oligonucleotides or patterns in complete genomes. A database is established to maintain the information of the sequence features, including the distributions of oligonucleotide, the gene distribution, the distribution of repetitive elements in genomes and the occurrences of the oligonucleotides. The database can provide more effective and efficient way to access the repetitive features in genomes.

  7. A Trichosporonales genome tree based on 27 haploid and three evolutionarily conserved 'natural' hybrid genomes.

    Science.gov (United States)

    Takashima, Masako; Sriswasdi, Sira; Manabe, Ri-Ichiroh; Ohkuma, Moriya; Sugita, Takashi; Iwasaki, Wataru

    2018-01-01

    To construct a backbone tree consisting of basidiomycetous yeasts, draft genome sequences from 25 species of Trichosporonales (Tremellomycetes, Basidiomycota) were generated. In addition to the hybrid genomes of Trichosporon coremiiforme and Trichosporon ovoides that we described previously, we identified an interspecies hybrid genome in Cutaneotrichosporon mucoides (formerly Trichosporon mucoides). This hybrid genome had a gene retention rate of ~55%, and its closest haploid relative was Cutaneotrichosporon dermatis. After constructing the C. mucoides subgenomes, we generated a phylogenetic tree using genome data from the 27 haploid species and the subgenome data from the three hybrid genome species. It was a high-quality tree with 100% bootstrap support for all of the branches. The genome-based tree provided superior resolution compared with previous multi-gene analyses. Although our backbone tree does not include all Trichosporonales genera (e.g. Cryptotrichosporon), it will be valuable for future analyses of genome data. Interest in interspecies hybrid fungal genomes has recently increased because they may provide a basis for new technologies. The three Trichosporonales hybrid genomes described in this study are different from well-characterized hybrid genomes (e.g. those of Saccharomyces pastorianus and Saccharomyces bayanus) because these hybridization events probably occurred in the distant evolutionary past. Hence, they will be useful for studying genome stability following hybridization and speciation events. Copyright © 2017 John Wiley & Sons, Ltd. Copyright © 2017 John Wiley & Sons, Ltd.

  8. Genome-wide linkage, exome sequencing and functional analyses identify ABCB6 as the pathogenic gene of dyschromatosis universalis hereditaria.

    Directory of Open Access Journals (Sweden)

    Hong Liu

    Full Text Available As a genetic disorder of abnormal pigmentation, the molecular basis of dyschromatosis universalis hereditaria (DUH had remained unclear until recently when ABCB6 was reported as a causative gene of DUH.We performed genome-wide linkage scan using Illumina Human 660W-Quad BeadChip and exome sequencing analyses using Agilent SureSelect Human All Exon Kits in a multiplex Chinese DUH family to identify the pathogenic mutations and verified the candidate mutations using Sanger sequencing. Quantitative RT-PCR and Immunohistochemistry was performed to verify the expression of the pathogenic gene, Zebrafish was also used to confirm the functional role of ABCB6 in melanocytes and pigmentation.Genome-wide linkage (assuming autosomal dominant inheritance mode and exome sequencing analyses identified ABCB6 as the disease candidate gene by discovering a coding mutation (c.1358C>T; p.Ala453Val that co-segregates with the disease phenotype. Further mutation analysis of ABCB6 in four other DUH families and two sporadic cases by Sanger sequencing confirmed the mutation (c.1358C>T; p.Ala453Val and discovered a second, co-segregating coding mutation (c.964A>C; p.Ser322Lys in one of the four families. Both mutations were heterozygous in DUH patients and not present in the 1000 Genome Project and dbSNP database as well as 1,516 unrelated Chinese healthy controls. Expression analysis in human skin and mutagenesis interrogation in zebrafish confirmed the functional role of ABCB6 in melanocytes and pigmentation. Given the involvement of ABCB6 mutations in coloboma, we performed ophthalmological examination of the DUH carriers of ABCB6 mutations and found ocular abnormalities in them.Our study has advanced our understanding of DUH pathogenesis and revealed the shared pathological mechanism between pigmentary DUH and ocular coloboma.

  9. Estimation of main diversification time-points of hantaviruses using phylogenetic analyses of complete genomes.

    Science.gov (United States)

    Castel, Guillaume; Tordo, Noël; Plyusnin, Alexander

    2017-04-02

    Because of the great variability of their reservoir hosts, hantaviruses are excellent models to evaluate the dynamics of virus-host co-evolution. Intriguing questions remain about the timescale of the diversification events that influenced this evolution. In this paper we attempted to estimate the first ever timing of hantavirus diversification based on thirty five available complete genomes representing five major groups of hantaviruses and the assumption of co-speciation of hantaviruses with their respective mammal hosts. Phylogenetic analyses were used to estimate the main diversification points during hantavirus evolution in mammals while host diversification was mostly estimated from independent calibrators taken from fossil records. Our results support an earlier developed hypothesis of co-speciation of known hantaviruses with their respective mammal hosts and hence a common ancestor for all hantaviruses carried by placental mammals. Copyright © 2017 Elsevier B.V. All rights reserved.

  10. A physical map for the Amborella trichopoda genome sheds light on the evolution of angiosperm genome structure

    OpenAIRE

    Zuccolo, Andrea; Bowers, John E; Estill, James C; Xiong, Zhiyong; Luo, Meizhong; Sebastian, Aswathy; Goicoechea, Jos? Luis; Collura, Kristi; Yu, Yeisoo; Jiao, Yuannian; Duarte, Jill; Tang, Haibao; Ayyampalayam, Saravanaraj; Rounsley, Steve; Kudrna, Dave

    2011-01-01

    Background Recent phylogenetic analyses have identified Amborella trichopoda, an understory tree species endemic to the forests of New Caledonia, as sister to a clade including all other known flowering plant species. The Amborella genome is a unique reference for understanding the evolution of angiosperm genomes because it can serve as an outgroup to root comparative analyses. A physical map, BAC end sequences and sample shotgun sequences provide a first view of the 870 Mbp Amborella genome....

  11. The Complete Chloroplast Genome Sequences of Five Epimedium Species: Lights into Phylogenetic and Taxonomic Analyses

    Science.gov (United States)

    Zhang, Yanjun; Du, Liuwen; Liu, Ao; Chen, Jianjun; Wu, Li; Hu, Weiming; Zhang, Wei; Kim, Kyunghee; Lee, Sang-Choon; Yang, Tae-Jin; Wang, Ying

    2016-01-01

    Epimedium L. is a phylogenetically and economically important genus in the family Berberidaceae. We here sequenced the complete chloroplast (cp) genomes of four Epimedium species using Illumina sequencing technology via a combination of de novo and reference-guided assembly, which was also the first comprehensive cp genome analysis on Epimedium combining the cp genome sequence of E. koreanum previously reported. The five Epimedium cp genomes exhibited typical quadripartite and circular structure that was rather conserved in genomic structure and the synteny of gene order. However, these cp genomes presented obvious variations at the boundaries of the four regions because of the expansion and contraction of the inverted repeat (IR) region and the single-copy (SC) boundary regions. The trnQ-UUG duplication occurred in the five Epimedium cp genomes, which was not found in the other basal eudicotyledons. The rapidly evolving cp genome regions were detected among the five cp genomes, as well as the difference of simple sequence repeats (SSR) and repeat sequence were identified. Phylogenetic relationships among the five Epimedium species based on their cp genomes showed accordance with the updated system of the genus on the whole, but reminded that the evolutionary relationships and the divisions of the genus need further investigation applying more evidences. The availability of these cp genomes provided valuable genetic information for accurately identifying species, taxonomy and phylogenetic resolution and evolution of Epimedium, and assist in exploration and utilization of Epimedium plants. PMID:27014326

  12. The complete chloroplast genome sequences of five Epimedium species: lights into phylogenetic and taxonomic analyses

    Directory of Open Access Journals (Sweden)

    Yanjun eZhang

    2016-03-01

    Full Text Available Epimedium L. is a phylogenetically and economically important genus in the family Berberidaceae. We here sequenced the complete chloroplast (cp genomes of four Epimedium species using Illumina sequencing technology via a combination of de novo and reference-guided assembly, which was also the first comprehensive cp genome analysis on Epimedium combining the cp genome sequence of E. koreanum previously reported. The five Epimedium cp genomes exhibited typical quadripartite and circular structure that was rather conserved in genomic structure and the synteny of gene order. However, these cp genomes presented obvious variations at the boundaries of the four regions because of the expansion and contraction of the inverted repeat (IR region and the single-copy (SC boundary regions. The trnQ-UUG duplication occurred in the five Epimedium cp genomes, which was not found in the other basal eudicotyledons. The rapidly evolving cp genome regions were detected among the five cp genomes, as well as the difference of simple sequence repeats (SSR and repeat sequence were identified. Phylogenetic relationships among the five Epimedium species based on their cp genomes showed accordance with the updated system of the genus on the whole, but reminded that the evolutionary relationships and the divisions of the genus need further investigation applying more evidences. The availability of these cp genomes provided valuable genetic information for accurately identifying species, taxonomy and phylogenetic resolution and evolution of Epimedium, and assist in exploration and utilization of Epimedium plants.

  13. The genomic landscape at a late stage of stickleback speciation: High genomic divergence interspersed by small localized regions of introgression.

    Directory of Open Access Journals (Sweden)

    Mark Ravinet

    2018-05-01

    Full Text Available Speciation is a continuous process and analysis of species pairs at different stages of divergence provides insight into how it unfolds. Previous genomic studies on young species pairs have revealed peaks of divergence and heterogeneous genomic differentiation. Yet less known is how localised peaks of differentiation progress to genome-wide divergence during the later stages of speciation in the presence of persistent gene flow. Spanning the speciation continuum, stickleback species pairs are ideal for investigating how genomic divergence builds up during speciation. However, attention has largely focused on young postglacial species pairs, with little knowledge of the genomic signatures of divergence and introgression in older stickleback systems. The Japanese stickleback species pair, composed of the Pacific Ocean three-spined stickleback (Gasterosteus aculeatus and the Japan Sea stickleback (G. nipponicus, which co-occur in the Japanese islands, is at a late stage of speciation. Divergence likely started well before the end of the last glacial period and crosses between Japan Sea females and Pacific Ocean males result in hybrid male sterility. Here we use coalescent analyses and Approximate Bayesian Computation to show that the two species split approximately 0.68-1 million years ago but that they have continued to exchange genes at a low rate throughout divergence. Population genomic data revealed that, despite gene flow, a high level of genomic differentiation is maintained across the majority of the genome. However, we identified multiple, small regions of introgression, occurring mainly in areas of low recombination rate. Our results demonstrate that a high level of genome-wide divergence can establish in the face of persistent introgression and that gene flow can be localized to small genomic regions at the later stages of speciation with gene flow.

  14. The genomic landscape at a late stage of stickleback speciation: High genomic divergence interspersed by small localized regions of introgression.

    Science.gov (United States)

    Ravinet, Mark; Yoshida, Kohta; Shigenobu, Shuji; Toyoda, Atsushi; Fujiyama, Asao; Kitano, Jun

    2018-05-01

    Speciation is a continuous process and analysis of species pairs at different stages of divergence provides insight into how it unfolds. Previous genomic studies on young species pairs have revealed peaks of divergence and heterogeneous genomic differentiation. Yet less known is how localised peaks of differentiation progress to genome-wide divergence during the later stages of speciation in the presence of persistent gene flow. Spanning the speciation continuum, stickleback species pairs are ideal for investigating how genomic divergence builds up during speciation. However, attention has largely focused on young postglacial species pairs, with little knowledge of the genomic signatures of divergence and introgression in older stickleback systems. The Japanese stickleback species pair, composed of the Pacific Ocean three-spined stickleback (Gasterosteus aculeatus) and the Japan Sea stickleback (G. nipponicus), which co-occur in the Japanese islands, is at a late stage of speciation. Divergence likely started well before the end of the last glacial period and crosses between Japan Sea females and Pacific Ocean males result in hybrid male sterility. Here we use coalescent analyses and Approximate Bayesian Computation to show that the two species split approximately 0.68-1 million years ago but that they have continued to exchange genes at a low rate throughout divergence. Population genomic data revealed that, despite gene flow, a high level of genomic differentiation is maintained across the majority of the genome. However, we identified multiple, small regions of introgression, occurring mainly in areas of low recombination rate. Our results demonstrate that a high level of genome-wide divergence can establish in the face of persistent introgression and that gene flow can be localized to small genomic regions at the later stages of speciation with gene flow.

  15. Nomadic lifestyle of Lactobacillus plantarum revealed by comparative genomics of 54 strains isolated from different habitats.

    Science.gov (United States)

    Martino, Maria Elena; Bayjanov, Jumamurat R; Caffrey, Brian E; Wels, Michiel; Joncour, Pauline; Hughes, Sandrine; Gillet, Benjamin; Kleerebezem, Michiel; van Hijum, Sacha A F T; Leulier, François

    2016-12-01

    The ability of bacteria to adapt to diverse environmental conditions is well-known. The process of bacterial adaptation to a niche has been linked to large changes in the genome content, showing that many bacterial genomes reflect the constraints imposed by their habitat. However, some highly versatile bacteria are found in diverse habitats that almost share nothing in common. Lactobacillus plantarum is a lactic acid bacterium that is found in a large variety of habitat. With the aim of unravelling the link between evolution and ecological versatility of L. plantarum, we analysed the genomes of 54 L. plantarum strains isolated from different environments. Comparative genome analysis identified a high level of genomic diversity and plasticity among the strains analysed. Phylogenomic and functional divergence studies coupled with gene-trait matching analyses revealed a mixed distribution of the strains, which was uncoupled from their environmental origin. Our findings revealed the absence of specific genomic signatures marking adaptations of L. plantarum towards the diverse habitats it is associated with. This suggests fundamentally similar trends of genome evolution in L. plantarum, which occur in a manner that is apparently uncoupled from ecological constraint and reflects the nomadic lifestyle of this species. © 2016 The Authors. Environmental Microbiology published by Society for Applied Microbiology and John Wiley & Sons Ltd.

  16. QuickRNASeq lifts large-scale RNA-seq data analyses to the next level of automation and interactive visualization.

    Science.gov (United States)

    Zhao, Shanrong; Xi, Li; Quan, Jie; Xi, Hualin; Zhang, Ying; von Schack, David; Vincent, Michael; Zhang, Baohong

    2016-01-08

    RNA sequencing (RNA-seq), a next-generation sequencing technique for transcriptome profiling, is being increasingly used, in part driven by the decreasing cost of sequencing. Nevertheless, the analysis of the massive amounts of data generated by large-scale RNA-seq remains a challenge. Multiple algorithms pertinent to basic analyses have been developed, and there is an increasing need to automate the use of these tools so as to obtain results in an efficient and user friendly manner. Increased automation and improved visualization of the results will help make the results and findings of the analyses readily available to experimental scientists. By combing the best open source tools developed for RNA-seq data analyses and the most advanced web 2.0 technologies, we have implemented QuickRNASeq, a pipeline for large-scale RNA-seq data analyses and visualization. The QuickRNASeq workflow consists of three main steps. In Step #1, each individual sample is processed, including mapping RNA-seq reads to a reference genome, counting the numbers of mapped reads, quality control of the aligned reads, and SNP (single nucleotide polymorphism) calling. Step #1 is computationally intensive, and can be processed in parallel. In Step #2, the results from individual samples are merged, and an integrated and interactive project report is generated. All analyses results in the report are accessible via a single HTML entry webpage. Step #3 is the data interpretation and presentation step. The rich visualization features implemented here allow end users to interactively explore the results of RNA-seq data analyses, and to gain more insights into RNA-seq datasets. In addition, we used a real world dataset to demonstrate the simplicity and efficiency of QuickRNASeq in RNA-seq data analyses and interactive visualizations. The seamless integration of automated capabilites with interactive visualizations in QuickRNASeq is not available in other published RNA-seq pipelines. The high degree

  17. Multi-platform whole-genome microarray analyses refine the epigenetic signature of breast cancer metastasis with gene expression and copy number.

    Directory of Open Access Journals (Sweden)

    Joseph Andrews

    2010-01-01

    Full Text Available We have previously identified genome-wide DNA methylation changes in a cell line model of breast cancer metastasis. These complex epigenetic changes that we observed, along with concurrent karyotype analyses, have led us to hypothesize that complex genomic alterations in cancer cells (deletions, translocations and ploidy are superimposed over promoter-specific methylation events that are responsible for gene-specific expression changes observed in breast cancer metastasis.We undertook simultaneous high-resolution, whole-genome analyses of MDA-MB-468GFP and MDA-MB-468GFP-LN human breast cancer cell lines (an isogenic, paired lymphatic metastasis cell line model using Affymetrix gene expression (U133, promoter (1.0R, and SNP/CNV (SNP 6.0 microarray platforms to correlate data from gene expression, epigenetic (DNA methylation, and combination copy number variant/single nucleotide polymorphism microarrays. Using Partek Software and Ingenuity Pathway Analysis we integrated datasets from these three platforms and detected multiple hypomethylation and hypermethylation events. Many of these epigenetic alterations correlated with gene expression changes. In addition, gene dosage events correlated with the karyotypic differences observed between the cell lines and were reflected in specific promoter methylation patterns. Gene subsets were identified that correlated hyper (and hypo methylation with the loss (or gain of gene expression and in parallel, with gene dosage losses and gains, respectively. Individual gene targets from these subsets were also validated for their methylation, expression and copy number status, and susceptible gene pathways were identified that may indicate how selective advantage drives the processes of tumourigenesis and metastasis.Our approach allows more precisely profiling of functionally relevant epigenetic signatures that are associated with cancer progression and metastasis.

  18. Practical Approaches for Detecting Selection in Microbial Genomes.

    Science.gov (United States)

    Hedge, Jessica; Wilson, Daniel J

    2016-02-01

    Microbial genome evolution is shaped by a variety of selective pressures. Understanding how these processes occur can help to address important problems in microbiology by explaining observed differences in phenotypes, including virulence and resistance to antibiotics. Greater access to whole-genome sequencing provides microbiologists with the opportunity to perform large-scale analyses of selection in novel settings, such as within individual hosts. This tutorial aims to guide researchers through the fundamentals underpinning popular methods for measuring selection in pathogens. These methods are transferable to a wide variety of organisms, and the exercises provided are designed for researchers with any level of programming experience.

  19. Post-genomic analyses of fungal lignocellulosic biomass degradation reveal the unexpected potential of the plant pathogen Ustilago maydis

    Directory of Open Access Journals (Sweden)

    Couturier Marie

    2012-02-01

    Full Text Available Abstract Background Filamentous fungi are potent biomass degraders due to their ability to thrive in ligno(hemicellulose-rich environments. During the last decade, fungal genome sequencing initiatives have yielded abundant information on the genes that are putatively involved in lignocellulose degradation. At present, additional experimental studies are essential to provide insights into the fungal secreted enzymatic pools involved in lignocellulose degradation. Results In this study, we performed a wide analysis of 20 filamentous fungi for which genomic data are available to investigate their biomass-hydrolysis potential. A comparison of fungal genomes and secretomes using enzyme activity profiling revealed discrepancies in carbohydrate active enzymes (CAZymes sets dedicated to plant cell wall. Investigation of the contribution made by each secretome to the saccharification of wheat straw demonstrated that most of them individually supplemented the industrial Trichoderma reesei CL847 enzymatic cocktail. Unexpectedly, the most striking effect was obtained with the phytopathogen Ustilago maydis that improved the release of total sugars by 57% and of glucose by 22%. Proteomic analyses of the best-performing secretomes indicated a specific enzymatic mechanism of U. maydis that is likely to involve oxido-reductases and hemicellulases. Conclusion This study provides insight into the lignocellulose-degradation mechanisms by filamentous fungi and allows for the identification of a number of enzymes that are potentially useful to further improve the industrial lignocellulose bioconversion process.

  20. Comparative Genomics Reveals High Genomic Diversity in the Genus Photobacterium

    DEFF Research Database (Denmark)

    Machado, Henrique; Gram, Lone

    2017-01-01

    was widespread and abundant in the genus, suggesting a role in genomic evolution. The high genetic variability and indications of genetic exchange make it difficult to elucidate genome evolutionary paths and raise the awareness of the roles of foreign DNA in the genomic evolution of environmental organisms.......Vibrionaceae is a large marine bacterial family, which can constitute up to 50% of the prokaryotic population in marine waters. Photobacterium is the second largest genus in the family and we used comparative genomics on 35 strains representing 16 of the 28 species described so far, to understand...... the genomic diversity present in the Photobacterium genus. Such understanding is important for ecophysiology studies of the genus. We used whole genome sequences to evaluate phylogenetic relationships using several analyses (16S rRNA, MLSA, fur, amino-acid usage, ANI), which allowed us to identify two...

  1. Characterization of Aeromonas hydrophila Wound Pathotypes by Comparative Genomic and Functional Analyses of Virulence Genes

    Science.gov (United States)

    Grim, Christopher J.; Kozlova, Elena V.; Sha, Jian; Fitts, Eric C.; van Lier, Christina J.; Kirtley, Michelle L.; Joseph, Sandeep J.; Read, Timothy D.; Burd, Eileen M.; Tall, Ben D.; Joseph, Sam W.; Horneman, Amy J.; Chopra, Ashok K.; Shak, Joshua R.

    2013-01-01

    ABSTRACT Aeromonas hydrophila has increasingly been implicated as a virulent and antibiotic-resistant etiologic agent in various human diseases. In a previously published case report, we described a subject with a polymicrobial wound infection that included a persistent and aggressive strain of A. hydrophila (E1), as well as a more antibiotic-resistant strain of A. hydrophila (E2). To better understand the differences between pathogenic and environmental strains of A. hydrophila, we conducted comparative genomic and functional analyses of virulence-associated genes of these two wound isolates (E1 and E2), the environmental type strain A. hydrophila ATCC 7966T, and four other isolates belonging to A. aquariorum, A. veronii, A. salmonicida, and A. caviae. Full-genome sequencing of strains E1 and E2 revealed extensive differences between the two and strain ATCC 7966T. The more persistent wound infection strain, E1, harbored coding sequences for a cytotoxic enterotoxin (Act), a type 3 secretion system (T3SS), flagella, hemolysins, and a homolog of exotoxin A found in Pseudomonas aeruginosa. Corresponding phenotypic analyses with A. hydrophila ATCC 7966T and SSU as reference strains demonstrated the functionality of these virulence genes, with strain E1 displaying enhanced swimming and swarming motility, lateral flagella on electron microscopy, the presence of T3SS effector AexU, and enhanced lethality in a mouse model of Aeromonas infection. By combining sequence-based analysis and functional assays, we characterized an A. hydrophila pathotype, exemplified by strain E1, that exhibited increased virulence in a mouse model of infection, likely because of encapsulation, enhanced motility, toxin secretion, and cellular toxicity. PMID:23611906

  2. Positive Selection Driving Cytoplasmic Genome Evolution of the Medicinally Important Ginseng Plant Genus Panax.

    Science.gov (United States)

    Jiang, Peng; Shi, Feng-Xue; Li, Ming-Rui; Liu, Bao; Wen, Jun; Xiao, Hong-Xing; Li, Lin-Feng

    2018-01-01

    Panax L. (the ginseng genus) is a shade-demanding group within the family Araliaceae and all of its species are of crucial significance in traditional Chinese medicine. Phylogenetic and biogeographic analyses demonstrated that two rounds of whole genome duplications accompanying with geographic and ecological isolations promoted the diversification of Panax species. However, contributions of the cytoplasmic genomes to the adaptive evolution of Panax species remained largely uninvestigated. In this study, we sequenced the chloroplast and mitochondrial genomes of 11 accessions belonging to seven Panax species. Our results show that heterogeneity in nucleotide substitution rate is abundant in both of the two cytoplasmic genomes, with the mitochondrial genome possessing more variants at the total level but the chloroplast showing higher sequence polymorphisms at the genic regions. Genome-wide scanning of positive selection identified five and 12 genes from the chloroplast and mitochondrial genomes, respectively. Functional analyses further revealed that these selected genes play important roles in plant development, cellular metabolism and adaptation. We therefore conclude that positive selection might be one of the potential evolutionary forces that shaped nucleotide variation pattern of these Panax species. In particular, the mitochondrial genes evolved under stronger selective pressure compared to the chloroplast genes.

  3. Functional genomics in renal transplantation and chronic kidney disease

    International Nuclear Information System (INIS)

    Wilflingseder, J.

    2010-01-01

    For the past decade, the development of genomic technology has revolutionized modern biological research. Functional genomic analyses enable biologists to study genetic events on a genome wide scale. Examples of applications are gene discovery, biomarker determination, disease classification, and drug target identification. Global expression profiles performed with microarrays enable a better understanding of molecular signature of human disease, including acute and chronic kidney disease. About 10 % of the population in western industrialized nations suffers from chronic kidney disease (CKD). Treatment of end stage renal disease, the final stage of CKD is performed by either hemo- or peritoneal dialysis or renal transplantation. The preferred treatment is renal transplantation, because of the higher quality of life. But the pathophysiology of the disease on a molecular level is not well enough understood and early biomarkers for acute and chronic kidney disease are missing. In my studies I focused on genomics of allograft biopsies, prevention of delayed graft function after renal transplantation, anemia after renal transplantation, biocompatibility of hemodialysis membranes and peritoneal dialysis fluids and cardiovascular diseases and bone disorders in CKD patients. Gene expression profiles, pathway analysis and protein-protein interaction networks were used to elucidate the underlying pathophysiological mechanism of the disease or phenomena, identifying early biomarkers or predictors of disease state and potentially drug targets. In summery my PhD thesis represents the application of functional genomic analyses in chronic kidney disease and renal transplantation. The results provide a deeper view into the molecular and cellular mechanisms of kidney disease. Nevertheless, future multicenter collaborative studies, meta-analyses of existing data, incorporation of functional genomics into large-scale prospective clinical trials are needed and will give biomedical

  4. Comparative genomics of chondrichthyan Hoxa clusters

    Directory of Open Access Journals (Sweden)

    Zhong Ying-Fu

    2009-09-01

    Full Text Available Abstract Background The chondrichthyan or cartilaginous fish (chimeras, sharks, skates and rays occupy an important phylogenetic position as the sister group to all other jawed vertebrates and as an early lineage to diverge from the vertebrate lineage following two whole genome duplication events in vertebrate evolution. There have been few comparative genomic analyses incorporating data from chondrichthyan fish and none comparing genomic information from within the group. We have sequenced the complete Hoxa cluster of the Little Skate (Leucoraja erinacea and compared to the published Hoxa cluster of the Horn Shark (Heterodontus francisci and to available data from the Elephant Shark (Callorhinchus milii genome project. Results A BAC clone containing the full Little Skate Hoxa cluster was fully sequenced and assembled. Analyses of coding sequences and conserved non-coding elements reveal a strikingly high level of conservation across the cartilaginous fish, with twenty ultraconserved elements (100%,100 bp found between Skate and Horn Shark, compared to three between human and marsupials. We have also identified novel potential non-coding RNAs in the Skate BAC clone, some of which are conserved to other species. Conclusion We find that the Little Skate Hoxa cluster is remarkably similar to the previously published Horn Shark Hoxa cluster with respect to sequence identity, gene size and intergenic distance despite over 180 million years of separation between the two lineages. We suggest that the genomes of cartilaginous fish are more highly conserved than those of tetrapods or teleost fish and so are more likely to have retained ancestral non-coding elements. While useful for isolating homologous DNA, this complicates bioinformatic approaches to identify chondrichthyan-specific non-coding DNA elements

  5. Combined genomic and structural analyses of a cultured magnetotactic bacterium reveals its niche adaptation to a dynamic environment

    Directory of Open Access Journals (Sweden)

    Ana Carolina Vieira Araujo

    2016-10-01

    Full Text Available Abstract Background Magnetotactic bacteria (MTB are a unique group of prokaryotes that have a potentially high impact on global geochemical cycling of significant primary elements because of their metabolic plasticity and the ability to biomineralize iron-rich magnetic particles called magnetosomes. Understanding the genetic composition of the few cultivated MTB along with the unique morphological features of this group of bacteria may provide an important framework for discerning their potential biogeochemical roles in natural environments. Results Genomic and ultrastructural analyses were combined to characterize the cultivated magnetotactic coccus Magnetofaba australis strain IT-1. Cells of this species synthesize a single chain of elongated, cuboctahedral magnetite (Fe3O4 magnetosomes that cause them to align along magnetic field lines while they swim being propelled by two bundles of flagella at velocities up to 300 μm s−1. High-speed microscopy imaging showed the cells move in a straight line rather than in the helical trajectory described for other magnetotactic cocci. Specific genes within the genome of Mf. australis strain IT-1 suggest the strain is capable of nitrogen fixation, sulfur reduction and oxidation, synthesis of intracellular polyphosphate granules and transporting iron with low and high affinity. Mf. australis strain IT-1 and Magnetococcus marinus strain MC-1 are closely related phylogenetically although similarity values between their homologous proteins are not very high. Conclusion Mf. australis strain IT-1 inhabits a constantly changing environment and its complete genome sequence reveals a great metabolic plasticity to deal with these changes. Aside from its chemoautotrophic and chemoheterotrophic metabolism, genomic data indicate the cells are capable of nitrogen fixation, possess high and low affinity iron transporters, and might be capable of reducing and oxidizing a number of sulfur compounds. The relatively

  6. Family genome browser: visualizing genomes with pedigree information.

    Science.gov (United States)

    Juan, Liran; Liu, Yongzhuang; Wang, Yongtian; Teng, Mingxiang; Zang, Tianyi; Wang, Yadong

    2015-07-15

    Families with inherited diseases are widely used in Mendelian/complex disease studies. Owing to the advances in high-throughput sequencing technologies, family genome sequencing becomes more and more prevalent. Visualizing family genomes can greatly facilitate human genetics studies and personalized medicine. However, due to the complex genetic relationships and high similarities among genomes of consanguineous family members, family genomes are difficult to be visualized in traditional genome visualization framework. How to visualize the family genome variants and their functions with integrated pedigree information remains a critical challenge. We developed the Family Genome Browser (FGB) to provide comprehensive analysis and visualization for family genomes. The FGB can visualize family genomes in both individual level and variant level effectively, through integrating genome data with pedigree information. Family genome analysis, including determination of parental origin of the variants, detection of de novo mutations, identification of potential recombination events and identical-by-decent segments, etc., can be performed flexibly. Diverse annotations for the family genome variants, such as dbSNP memberships, linkage disequilibriums, genes, variant effects, potential phenotypes, etc., are illustrated as well. Moreover, the FGB can automatically search de novo mutations and compound heterozygous variants for a selected individual, and guide investigators to find high-risk genes with flexible navigation options. These features enable users to investigate and understand family genomes intuitively and systematically. The FGB is available at http://mlg.hit.edu.cn/FGB/. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  7. Genome-wide identification, phylogeny and expression analyses of SCARECROW-LIKE(SCL) genes in millet (Setaria italica).

    Science.gov (United States)

    Liu, Hongyun; Qin, Jiajia; Fan, Hui; Cheng, Jinjin; Li, Lin; Liu, Zheng

    2017-07-01

    As a member of the GRAS gene family, SCARECROW - LIKE ( SCL ) genes encode transcriptional regulators that are involved in plant information transmission and signal transduction. In this study, 44 SCL genes including two SCARECROW genes in millet were identified to be distributed on eight chromosomes, except chromosome 6. All the millet genes contain motifs 6-8, indicating that these motifs are conserved during the evolution. SCL genes of millet were divided into eight groups based on the phylogenetic relationship and classification of Arabidopsis SCL genes. Several putative millet orthologous genes in Arabidopsis , maize and rice were identified. High throughput RNA sequencing revealed that the expressions of millet SCL genes in root, stem, leaf, spica, and along leaf gradient varied greatly. Analyses combining the gene expression patterns, gene structures, motif compositions, promoter cis -elements identification, alternative splicing of transcripts and phylogenetic relationship of SCL genes indicate that the these genes may play diverse functions. Functionally characterized SCL genes in maize, rice and Arabidopsis would provide us some clues for future characterization of their homologues in millet. To the best of our knowledge, this is the first study of millet SCL genes at the genome wide level. Our work provides a useful platform for functional analysis of SCL genes in millet, a model crop for C 4 photosynthesis and bioenergy studies.

  8. Enhanced annotations and features for comparing thousands of Pseudomonas genomes in the Pseudomonas genome database.

    Science.gov (United States)

    Winsor, Geoffrey L; Griffiths, Emma J; Lo, Raymond; Dhillon, Bhavjinder K; Shay, Julie A; Brinkman, Fiona S L

    2016-01-04

    The Pseudomonas Genome Database (http://www.pseudomonas.com) is well known for the application of community-based annotation approaches for producing a high-quality Pseudomonas aeruginosa PAO1 genome annotation, and facilitating whole-genome comparative analyses with other Pseudomonas strains. To aid analysis of potentially thousands of complete and draft genome assemblies, this database and analysis platform was upgraded to integrate curated genome annotations and isolate metadata with enhanced tools for larger scale comparative analysis and visualization. Manually curated gene annotations are supplemented with improved computational analyses that help identify putative drug targets and vaccine candidates or assist with evolutionary studies by identifying orthologs, pathogen-associated genes and genomic islands. The database schema has been updated to integrate isolate metadata that will facilitate more powerful analysis of genomes across datasets in the future. We continue to place an emphasis on providing high-quality updates to gene annotations through regular review of the scientific literature and using community-based approaches including a major new Pseudomonas community initiative for the assignment of high-quality gene ontology terms to genes. As we further expand from thousands of genomes, we plan to provide enhancements that will aid data visualization and analysis arising from whole-genome comparative studies including more pan-genome and population-based approaches. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

  9. Genomes-based phylogeny of the genus Xanthomonas

    Directory of Open Access Journals (Sweden)

    Rodriguez-R Luis M

    2012-03-01

    Full Text Available Abstract Background The genus Xanthomonas comprises several plant pathogenic bacteria affecting a wide range of hosts. Despite the economic, industrial and biological importance of Xanthomonas, the classification and phylogenetic relationships within the genus are still under active debate. Some of the relationships between pathovars and species have not been thoroughly clarified, with old pathovars becoming new species. A change in the genus name has been recently suggested for Xanthomonas albilineans, an early branching species currently located in this genus, but a thorough phylogenomic reconstruction would aid in solving these and other discrepancies in this genus. Results Here we report the results of the genome-wide analysis of DNA sequences from 989 orthologous groups from 17 Xanthomonas spp. genomes available to date, representing all major lineages within the genus. The phylogenetic and computational analyses used in this study have been automated in a Perl package designated Unus, which provides a framework for phylogenomic analyses which can be applied to other datasets at the genomic level. Unus can also be easily incorporated into other phylogenomic pipelines. Conclusions Our phylogeny agrees with previous phylogenetic topologies on the genus, but revealed that the genomes of Xanthomonas citri and Xanthomonas fuscans belong to the same species, and that of Xanthomonas albilineans is basal to the joint clade of Xanthomonas and Xylella fastidiosa. Genome reduction was identified in the species Xanthomonas vasicola in addition to the previously identified reduction in Xanthomonas albilineans. Lateral gene transfer was also observed in two gene clusters.

  10. Photobiomodulation effects on mRNA levels from genomic and chromosome stabilization genes in injured muscle.

    Science.gov (United States)

    da Silva Neto Trajano, Larissa Alexsandra; Trajano, Eduardo Tavares Lima; da Silva Sergio, Luiz Philippe; Teixeira, Adilson Fonseca; Mencalha, Andre Luiz; Stumbo, Ana Carolina; de Souza da Fonseca, Adenilson

    2018-04-26

    Muscle injuries are the most prevalent type of injury in sports. A great number of athletes have relapsed in muscle injuries not being treated properly. Photobiomodulation therapy is an inexpensive and safe technique with many benefits in muscle injury treatment. However, little has been explored about the infrared laser effects on DNA and telomeres in muscle injuries. Thus, the aim of this study was to evaluate photobiomodulation effects on mRNA relative levels from genes related to telomere and genomic stabilization in injured muscle. Wistar male rats were randomly divided into six groups: control, laser 25 mW, laser 75 mW, injury, injury laser 25 mW, and injury laser 75 mW. Photobiomodulation was performed with 904 nm, 3 J/cm 2 at 25 or 75 mW. Cryoinjury was induced by two applications of a metal probe cooled in liquid nitrogen directly on the tibialis anterior muscle. After euthanasia, skeletal muscle samples were withdrawn and total RNA extracted for evaluation of mRNA levels from genomic (ATM and p53) and chromosome stabilization (TRF1 and TRF2) genes by real-time quantitative polymerization chain reaction. Data show that photobiomodulation reduces the mRNA levels from ATM and p53, as well reduces mRNA levels from TRF1 and TRF2 at 25 and 75 mW in injured skeletal muscle. In conclusion, photobiomodulation alters mRNA relative levels from genes related to genomic and telomere stabilization in injured skeletal muscle.

  11. HGVA: the Human Genome Variation Archive

    OpenAIRE

    Lopez, Javier; Coll, Jacobo; Haimel, Matthias; Kandasamy, Swaathi; Tarraga, Joaquin; Furio-Tari, Pedro; Bari, Wasim; Bleda, Marta; Rueda, Antonio; Gr?f, Stefan; Rendon, Augusto; Dopazo, Joaquin; Medina, Ignacio

    2017-01-01

    Abstract High-profile genomic variation projects like the 1000 Genomes project or the Exome Aggregation Consortium, are generating a wealth of human genomic variation knowledge which can be used as an essential reference for identifying disease-causing genotypes. However, accessing these data, contrasting the various studies and integrating those data in downstream analyses remains cumbersome. The Human Genome Variation Archive (HGVA) tackles these challenges and facilitates access to genomic...

  12. A whole genome analyses of genetic variants in two Kelantan Malay individuals.

    Science.gov (United States)

    Wan Juhari, Wan Khairunnisa; Md Tamrin, Nur Aida; Mat Daud, Mohd Hanif Ridzuan; Isa, Hatin Wan; Mohd Nasir, Nurfazreen; Maran, Sathiya; Abdul Rajab, Nur Shafawati; Ahmad Amin Noordin, Khairul Bariah; Nik Hassan, Nik Norliza; Tearle, Rick; Razali, Rozaimi; Merican, Amir Feisal; Zilfalil, Bin Alwi

    2014-12-01

    The sequencing of two members of the Royal Kelantan Malay family genomes will provide insights on the Kelantan Malay whole genome sequences. The two Kelantan Malay genomes were analyzed for the SNP markers associated with thalassemia and Helicobacter pylori infection. Helicobacter pylori infection was reported to be low prevalence in the north-east as compared to the west coast of the Peninsular Malaysia and beta-thalassemia was known to be one of the most common inherited and genetic disorder in Malaysia. By combining SNP information from literatures, GWAS study and NCBI ClinVar, 18 unique SNPs were selected for further analysis. From these 18 SNPs, 10 SNPs came from previous study of Helicobacter pylori infection among Malay patients, 6 SNPs were from NCBI ClinVar and 2 SNPs from GWAS studies. The analysis reveals that both Royal Kelantan Malay genomes shared all the 10 SNPs identified by Maran (Single Nucleotide Polymorphims (SNPs) genotypic profiling of Malay patients with and without Helicobacter pylori infection in Kelantan, 2011) and one SNP from GWAS study. In addition, the analysis also reveals that both Royal Kelantan Malay genomes shared 3 SNP markers; HBG1 (rs1061234), HBB (rs1609812) and BCL11A (rs766432) where all three markers were associated with beta-thalassemia. Our findings suggest that the Royal Kelantan Malays carry the SNPs which are associated with protection to Helicobacter pylori infection. In addition they also carry SNPs which are associated with beta-thalassemia. These findings are in line with the findings by other researchers who conducted studies on thalassemia and Helicobacter pylori infection in the non-royal Malay population.

  13. DNA sequence explains seemingly disordered methylation levels in partially methylated domains of Mammalian genomes.

    Directory of Open Access Journals (Sweden)

    Dimos Gaidatzis

    2014-02-01

    Full Text Available For the most part metazoan genomes are highly methylated and harbor only small regions with low or absent methylation. In contrast, partially methylated domains (PMDs, recently discovered in a variety of cell lines and tissues, do not fit this paradigm as they show partial methylation for large portions (20%-40% of the genome. While in PMDs methylation levels are reduced on average, we found that at single CpG resolution, they show extensive variability along the genome outside of CpG islands and DNase I hypersensitive sites (DHS. Methylation levels range from 0% to 100% in a roughly uniform fashion with only little similarity between neighboring CpGs. A comparison of various PMD-containing methylomes showed that these seemingly disordered states of methylation are strongly conserved across cell types for virtually every PMD. Comparative sequence analysis suggests that DNA sequence is a major determinant of these methylation states. This is further substantiated by a purely sequence based model which can predict 31% (R(2 of the variation in methylation. The model revealed CpG density as the main driving feature promoting methylation, opposite to what has been shown for CpG islands, followed by various dinucleotides immediately flanking the CpG and a minor contribution from sequence preferences reflecting nucleosome positioning. Taken together we provide a reinterpretation for the nucleotide-specific methylation levels observed in PMDs, demonstrate their conservation across tissues and suggest that they are mainly determined by specific DNA sequence features.

  14. Genome-level comparisons provide insight into the phylogeny and metabolic diversity of species within the genus Lactococcus.

    Science.gov (United States)

    Yu, Jie; Song, Yuqin; Ren, Yan; Qing, Yanting; Liu, Wenjun; Sun, Zhihong

    2017-11-03

    The genomic diversity of different species within the genus Lactococcus and the relationships between genomic differentiation and environmental factors remain unclear. In this study, type isolates of ten Lactococcus species/subspecies were sequenced to assess their genomic characteristics, metabolic diversity, and phylogenetic relationships. The total genome sizes varied between 1.99 (Lactococcus plantarum) and 2.46 megabases (Mb; L. lactis subsp. lactis), and the G + C content ranged from 34.81 (L. lactis subsp. hordniae) to 39.67% (L. raffinolactis) with an average value of 37.02%. Analysis of genome dynamics indicated that the genus Lactococcus has an open pan-genome, while the core genome size decreased with sequential addition at the genus and species group levels. A phylogenetic dendrogram based on the concatenated amino acid sequences of 643 core genes was largely consistent with the phylogenetic tree obtained by 16S ribosomal RNA (rRNA) genes, but it provided a more robust phylogenetic resolution than the 16S rRNA gene-based analysis. Comparative genomics indicated that species in the genus Lactococcus had high degrees of diversity in genome size, gene content, and carbohydrate metabolism. This may be important for the specific adaptations that allow different Lactococcus species to survive in different environments. These results provide a quantitative basis for understanding the genomic and metabolic diversity within the genus Lactococcus, laying the foundation for future studies on taxonomy and functional genomics.

  15. Genomics using the Assembly of the Mink Genome

    DEFF Research Database (Denmark)

    Guldbrandtsen, Bernt; Cai, Zexi; Sahana, Goutam

    2018-01-01

    The American Mink’s (Neovison vison) genome has recently been sequenced. This opens numerous avenues of research both for studying the basic genetics and physiology of the mink as well as genetic improvement in mink. Using genotyping-by-sequencing (GBS) generated marker data for 2,352 Danish farm...... mink runs of homozygosity (ROH) were detect in mink genomes. Detectable ROH made up on average 1.7% of the genome indicating the presence of at most a moderate level of genomic inbreeding. The fraction of genome regions found in ROH varied. Ten percent of the included regions were never found in ROH....... The ability to detect ROH in the mink genome also demonstrates the general reliability of the new mink genome assembly. Keywords: american mink, run of homozygosity, genome, selection, genomic inbreeding...

  16. phiGENOME: an integrative navigation throughout bacteriophage genomes.

    Science.gov (United States)

    Stano, Matej; Klucar, Lubos

    2011-11-01

    phiGENOME is a web-based genome browser generating dynamic and interactive graphical representation of phage genomes stored in the phiSITE, database of gene regulation in bacteriophages. phiGENOME is an integral part of the phiSITE web portal (http://www.phisite.org/phigenome) and it was optimised for visualisation of phage genomes with the emphasis on the gene regulatory elements. phiGENOME consists of three components: (i) genome map viewer built using Adobe Flash technology, providing dynamic and interactive graphical display of phage genomes; (ii) sequence browser based on precisely formatted HTML tags, providing detailed exploration of genome features on the sequence level and (iii) regulation illustrator, based on Scalable Vector Graphics (SVG) and designed for graphical representation of gene regulations. Bringing 542 complete genome sequences accompanied with their rich annotations and references, makes phiGENOME a unique information resource in the field of phage genomics. Copyright © 2011 Elsevier Inc. All rights reserved.

  17. The plant ontology as a tool for comparative plant anatomy and genomic analyses

    Science.gov (United States)

    Plant science is now a major player in the fields of genomics, gene expression analysis, phenomics and metabolomics. Recent advances in sequencing technologies have led to a windfall of data, with new species being added rapidly to the list of species whose genomes have been decoded. The Plant Ontol...

  18. Unraveling the evolutionary history of the phosphoryl-transfer chain of the phosphoenolpyruvate:phosphotransferase system through phylogenetic analyses and genome context

    Directory of Open Access Journals (Sweden)

    Zúñiga Manuel

    2008-05-01

    Full Text Available Abstract Background The phosphoenolpyruvate phosphotransferase system (PTS plays a major role in sugar transport and in the regulation of essential physiological processes in many bacteria. The PTS couples solute transport to its phosphorylation at the expense of phosphoenolpyruvate (PEP and it consists of general cytoplasmic phosphoryl transfer proteins and specific enzyme II complexes which catalyze the uptake and phosphorylation of solutes. Previous studies have suggested that the evolution of the constituents of the enzyme II complexes has been driven largely by horizontal gene transfer whereas vertical inheritance has been prevalent in the general phosphoryl transfer proteins in some bacterial groups. The aim of this work is to test this hypothesis by studying the evolution of the phosphoryl transfer proteins of the PTS. Results We have analyzed the evolutionary history of the PTS phosphoryl transfer chain (PTS-ptc components in 222 complete genomes by combining phylogenetic methods and analysis of genomic context. Phylogenetic analyses alone were not conclusive for the deepest nodes but when complemented with analyses of genomic context and functional information, the main evolutionary trends of this system could be depicted. Conclusion The PTS-ptc evolved in bacteria after the divergence of early lineages such as Aquificales, Thermotogales and Thermus/Deinococcus. The subsequent evolutionary history of the PTS-ptc varied in different bacterial lineages: vertical inheritance and lineage-specific gene losses mainly explain the current situation in Actinobacteria and Firmicutes whereas horizontal gene transfer (HGT also played a major role in Proteobacteria. Most remarkably, we have identified a HGT event from Firmicutes or Fusobacteria to the last common ancestor of the Enterobacteriaceae, Pasteurellaceae, Shewanellaceae and Vibrionaceae. This transfer led to extensive changes in the metabolic and regulatory networks of these bacteria

  19. Mitogenomic analyses from ancient DNA

    DEFF Research Database (Denmark)

    Paijmans, Johanna L. A.; Gilbert, Tom; Hofreiter, Michael

    2013-01-01

    The analysis of ancient DNA is playing an increasingly important role in conservation genetic, phylogenetic and population genetic analyses, as it allows incorporating extinct species into DNA sequence trees and adds time depth to population genetics studies. For many years, these types of DNA...... analyses (whether using modern or ancient DNA) were largely restricted to the analysis of short fragments of the mitochondrial genome. However, due to many technological advances during the past decade, a growing number of studies have explored the power of complete mitochondrial genome sequences...... yielded major progress with regard to both the phylogenetic positions of extinct species, as well as resolving population genetics questions in both extinct and extant species....

  20. Karyotype and genome size analyses in species of Helichrysum (Asteraceae

    Directory of Open Access Journals (Sweden)

    Narjes Azizi

    2014-09-01

    Full Text Available Karyotype studies were performed in 18 populations of eight Helichrysum species in Iran. Those species showed chromosome numbers of 2n = 2x = 14; 2n = 4x = 24, 28 and 32; 2n = 6x = 36; 2n = 7x = 42; 2n = 8x = 48; 2n = 9x = 54; and 2n = 10x = 60. The chromosome numbers of H. davisianum, H. globiferum, H. leucocephalum and H. oocephalum are reported here for the first time. New ploidy levels are reported for H. oligocephalum (2n = 4x = 24 and H. plicatum (2n = 4x = 32. The chromosomes were metacentric and submetacentric. An ANOVA among H. globiferum and H. leucocephalum populations showed significant differences for the coefficient of variation for chromosome size, total form percentage and the asymmetry indices, indicating that changes in the chromosome structure of Helichrysum species occurred during their diversification. Significant positive correlations among the species and populations studied, in terms of the total chromosome length, lengths of the short arms and lengths of the long arms, indicate that these karyotypic features change simultaneously during speciation events. The genome sizes of Helichrysum species are reported here for first time. The 2C DNA content ranged from 8.13 pg (in H. rubicundum to 18.4 pg (in H. leucocephalum and H. davisianum. We found that C-value correlated significantly with ploidy level, total chromosome length, lengths of the long arms and lengths of the short arms (p<0.05, indicating that changes in chromosome structure are accompanied by changes in DNA content.

  1. Comparing sequencing assays and human-machine analyses in actionable genomics for glioblastoma.

    Science.gov (United States)

    Wrzeszczynski, Kazimierz O; Frank, Mayu O; Koyama, Takahiko; Rhrissorrakrai, Kahn; Robine, Nicolas; Utro, Filippo; Emde, Anne-Katrin; Chen, Bo-Juen; Arora, Kanika; Shah, Minita; Vacic, Vladimir; Norel, Raquel; Bilal, Erhan; Bergmann, Ewa A; Moore Vogel, Julia L; Bruce, Jeffrey N; Lassman, Andrew B; Canoll, Peter; Grommes, Christian; Harvey, Steve; Parida, Laxmi; Michelini, Vanessa V; Zody, Michael C; Jobanputra, Vaidehi; Royyuru, Ajay K; Darnell, Robert B

    2017-08-01

    To analyze a glioblastoma tumor specimen with 3 different platforms and compare potentially actionable calls from each. Tumor DNA was analyzed by a commercial targeted panel. In addition, tumor-normal DNA was analyzed by whole-genome sequencing (WGS) and tumor RNA was analyzed by RNA sequencing (RNA-seq). The WGS and RNA-seq data were analyzed by a team of bioinformaticians and cancer oncologists, and separately by IBM Watson Genomic Analytics (WGA), an automated system for prioritizing somatic variants and identifying drugs. More variants were identified by WGS/RNA analysis than by targeted panels. WGA completed a comparable analysis in a fraction of the time required by the human analysts. The development of an effective human-machine interface in the analysis of deep cancer genomic datasets may provide potentially clinically actionable calls for individual patients in a more timely and efficient manner than currently possible. NCT02725684.

  2. Hybridization Capture Using RAD Probes (hyRAD, a New Tool for Performing Genomic Analyses on Collection Specimens.

    Directory of Open Access Journals (Sweden)

    Tomasz Suchan

    Full Text Available In the recent years, many protocols aimed at reproducibly sequencing reduced-genome subsets in non-model organisms have been published. Among them, RAD-sequencing is one of the most widely used. It relies on digesting DNA with specific restriction enzymes and performing size selection on the resulting fragments. Despite its acknowledged utility, this method is of limited use with degraded DNA samples, such as those isolated from museum specimens, as these samples are less likely to harbor fragments long enough to comprise two restriction sites making possible ligation of the adapter sequences (in the case of double-digest RAD or performing size selection of the resulting fragments (in the case of single-digest RAD. Here, we address these limitations by presenting a novel method called hybridization RAD (hyRAD. In this approach, biotinylated RAD fragments, covering a random fraction of the genome, are used as baits for capturing homologous fragments from genomic shotgun sequencing libraries. This simple and cost-effective approach allows sequencing of orthologous loci even from highly degraded DNA samples, opening new avenues of research in the field of museum genomics. Not relying on the restriction site presence, it improves among-sample loci coverage. In a trial study, hyRAD allowed us to obtain a large set of orthologous loci from fresh and museum samples from a non-model butterfly species, with a high proportion of single nucleotide polymorphisms present in all eight analyzed specimens, including 58-year-old museum samples. The utility of the method was further validated using 49 museum and fresh samples of a Palearctic grasshopper species for which the spatial genetic structure was previously assessed using mtDNA amplicons. The application of the method is eventually discussed in a wider context. As it does not rely on the restriction site presence, it is therefore not sensitive to among-sample loci polymorphisms in the restriction sites

  3. Comparative Genomics Analyses Reveal Extensive Chromosome Colinearity and Novel Quantitative Trait Loci in Eucalyptus

    Science.gov (United States)

    Weng, Qijie; Li, Mei; Yu, Xiaoli; Guo, Yong; Wang, Yu; Zhang, Xiaohong; Gan, Siming

    2015-01-01

    Dense genetic maps, along with quantitative trait loci (QTLs) detected on such maps, are powerful tools for genomics and molecular breeding studies. In the important woody genus Eucalyptus, the recent release of E. grandis genome sequence allows for sequence-based genomic comparison and searching for positional candidate genes within QTL regions. Here, dense genetic maps were constructed for E. urophylla and E. tereticornis using genomic simple sequence repeats (SSR), expressed sequence tag (EST) derived SSR, EST-derived cleaved amplified polymorphic sequence (EST-CAPS), and diversity arrays technology (DArT) markers. The E. urophylla and E. tereticornis maps comprised 700 and 585 markers across 11 linkage groups, totaling at 1,208.2 and 1,241.4 cM in length, respectively. Extensive synteny and colinearity were observed as compared to three earlier DArT-based eucalypt maps (two maps with E. grandis × E. urophylla and one map of E. globulus) and with the E. grandis genome sequence. Fifty-three QTLs for growth (10–56 months of age) and wood density (56 months) were identified in 22 discrete regions on both maps, in which only one colocalizaiton was found between growth and wood density. Novel QTLs were revealed as compared with those previously detected on DArT-based maps for similar ages in Eucalyptus. Eleven to 585 positional candidate genes were obained for a 56-month-old QTL through aligning QTL confidence interval with the E. grandis genome. These results will assist in comparative genomics studies, targeted gene characterization, and marker-assisted selection in Eucalyptus and the related taxa. PMID:26695430

  4. Comparative Genomics Analyses Reveal Extensive Chromosome Colinearity and Novel Quantitative Trait Loci in Eucalyptus.

    Directory of Open Access Journals (Sweden)

    Fagen Li

    Full Text Available Dense genetic maps, along with quantitative trait loci (QTLs detected on such maps, are powerful tools for genomics and molecular breeding studies. In the important woody genus Eucalyptus, the recent release of E. grandis genome sequence allows for sequence-based genomic comparison and searching for positional candidate genes within QTL regions. Here, dense genetic maps were constructed for E. urophylla and E. tereticornis using genomic simple sequence repeats (SSR, expressed sequence tag (EST derived SSR, EST-derived cleaved amplified polymorphic sequence (EST-CAPS, and diversity arrays technology (DArT markers. The E. urophylla and E. tereticornis maps comprised 700 and 585 markers across 11 linkage groups, totaling at 1,208.2 and 1,241.4 cM in length, respectively. Extensive synteny and colinearity were observed as compared to three earlier DArT-based eucalypt maps (two maps with E. grandis × E. urophylla and one map of E. globulus and with the E. grandis genome sequence. Fifty-three QTLs for growth (10-56 months of age and wood density (56 months were identified in 22 discrete regions on both maps, in which only one colocalizaiton was found between growth and wood density. Novel QTLs were revealed as compared with those previously detected on DArT-based maps for similar ages in Eucalyptus. Eleven to 585 positional candidate genes were obained for a 56-month-old QTL through aligning QTL confidence interval with the E. grandis genome. These results will assist in comparative genomics studies, targeted gene characterization, and marker-assisted selection in Eucalyptus and the related taxa.

  5. Transposable element activity, genome regulation and human health.

    Science.gov (United States)

    Wang, Lu; Jordan, I King

    2018-03-02

    A convergence of novel genome analysis technologies is enabling population genomic studies of human transposable elements (TEs). Population surveys of human genome sequences have uncovered thousands of individual TE insertions that segregate as common genetic variants, i.e. TE polymorphisms. These recent TE insertions provide an important source of naturally occurring human genetic variation. Investigators are beginning to leverage population genomic data sets to execute genome-scale association studies for assessing the phenotypic impact of human TE polymorphisms. For example, the expression quantitative trait loci (eQTL) analytical paradigm has recently been used to uncover hundreds of associations between human TE insertion variants and gene expression levels. These include population-specific gene regulatory effects as well as coordinated changes to gene regulatory networks. In addition, analyses of linkage disequilibrium patterns with previously characterized genome-wide association study (GWAS) trait variants have uncovered TE insertion polymorphisms that are likely causal variants for a variety of common complex diseases. Gene regulatory mechanisms that underlie specific disease phenotypes have been proposed for a number of these trait associated TE polymorphisms. These new population genomic approaches hold great promise for understanding how ongoing TE activity contributes to functionally relevant genetic variation within and between human populations. Copyright © 2018 Elsevier Ltd. All rights reserved.

  6. Practical Approaches for Detecting Selection in Microbial Genomes.

    Directory of Open Access Journals (Sweden)

    Jessica Hedge

    2016-02-01

    Full Text Available Microbial genome evolution is shaped by a variety of selective pressures. Understanding how these processes occur can help to address important problems in microbiology by explaining observed differences in phenotypes, including virulence and resistance to antibiotics. Greater access to whole-genome sequencing provides microbiologists with the opportunity to perform large-scale analyses of selection in novel settings, such as within individual hosts. This tutorial aims to guide researchers through the fundamentals underpinning popular methods for measuring selection in pathogens. These methods are transferable to a wide variety of organisms, and the exercises provided are designed for researchers with any level of programming experience.

  7. The W22 genome: a foundation for maize functional genomics and transposon biology

    Science.gov (United States)

    The maize W22 inbred has served as a platform for maize genetics since the mid twentieth century. To streamline maize genome analyses, we have sequenced and de novo assembled a W22 reference genome using small-read sequencing technologies. We show that significant structural heterogeneity exists in ...

  8. Genome-wide identification and function analyses of heat shock transcription factors in potato

    Directory of Open Access Journals (Sweden)

    Ruimin eTang

    2016-04-01

    Full Text Available Heat shock transcription factors (Hsfs play vital roles in the regulation of tolerance to various stresses in living organisms. To dissect the mechanisms of the Hsfs in potato adaptation to abiotic stresses, genome and transcriptome analyses of Hsf gene family were investigated in Solanum tuberosum L. Twenty-seven StHsf members were identified by bioinformatics and phylogenetic analyses and were classified into A, B and C groups according to their structural and phylogenetic features. StHsfs in the same class shared similar gene structures and conserved motifs. The chromosomal location analysis showed that 27 Hsfs were located in 10 of 12 chromosomes (except chromosome 1 and chromosome 5 and that 18 of these genes formed 9 paralogous pairs. Expression profiles of StHsfs in 12 different organs and tissues uncovered distinct spatial expression patterns of these genes and their potential roles in the process of growth and development. Promoter and quantitative real-time polymerase chain reaction (qRT-PCR detections of StHsfs were conducted and demonstrated that these genes were all responsive to various stresses. StHsf004, StHsf007, StHsf009, StHsf014 and StHsf019 were constitutively expressed under non-stress conditions, and some specific Hsfs became the predominant Hsfs in response to different abiotic stresses, indicating their important and diverse regulatory roles in adverse conditions. A co-expression network between StHsfs and StHsf-co-expressed genes was generated based on the publicly-available potato transcriptomic databases and identified key candidate StHsfs for further functional studies.

  9. Genome-wide selection signatures in Pinzgau cattle

    Directory of Open Access Journals (Sweden)

    Radovan Kasarda

    2015-08-01

    Full Text Available The aim of this study was to identify the evidence of recent selection based on estimation of the integrated Haplotype Score (iHS, population differentiation index (FST and characterize affected regions near QTL associated with traits under strong selection in Pinzgau cattle. In total 21 Austrian and 19 Slovak purebreed bulls genotyped with Illumina bovineHD and  bovineSNP50 BeadChip were used to identify genomic regions under selection. Only autosomal loci with call rate higher than 90%, minor allele frequency higher than 0.01 and Hardy-Weinberg equlibrium limit of 0.001 were included in the subsequent analyses of selection sweeps presence. The final dataset was consisted from 30538 SNPs with 81.86 kb average adjacent SNPs spacing. The iHS score were averaged into non-overlapping 500 kb segments across the genome. The FST values were also plotted against genome position based on sliding windows approach and averaged over 8 consecutive SNPs. Based on integrated Haplotype Score evaluation only 7 regions with iHS score higher than 1.7 was found. The average iHS score observed for each adjacent syntenic regions indicated slight effect of recent selection in analysed group of Pinzgau bulls. The level of genetic differentiation between Austrian and Slovak bulls estimated based on FST index was low. Only 24% of FST values calculated for each SNP was greather than 0.01. By using sliding windows approach was found that 5% of analysed windows had higher value than 0.01. Our results indicated use of similar selection scheme in breeding programs of Slovak and Austrian Pinzgau bulls. The evidence for genome-wide association between signatures of selection and regions affecting complex traits such as milk production was insignificant, because the loci in segments identified as affected by selection were very distant from each other. Identification of genomic regions that may be under pressure of selection for phenotypic traits to better understanding of the

  10. Pathway analyses implicate glial cells in schizophrenia.

    Directory of Open Access Journals (Sweden)

    Laramie E Duncan

    Full Text Available The quest to understand the neurobiology of schizophrenia and bipolar disorder is ongoing with multiple lines of evidence indicating abnormalities of glia, mitochondria, and glutamate in both disorders. Despite high heritability estimates of 81% for schizophrenia and 75% for bipolar disorder, compelling links between findings from neurobiological studies, and findings from large-scale genetic analyses, are only beginning to emerge.Ten publically available gene sets (pathways related to glia, mitochondria, and glutamate were tested for association to schizophrenia and bipolar disorder using MAGENTA as the primary analysis method. To determine the robustness of associations, secondary analyses were performed with: ALIGATOR, INRICH, and Set Screen. Data from the Psychiatric Genomics Consortium (PGC were used for all analyses. There were 1,068,286 SNP-level p-values for schizophrenia (9,394 cases/12,462 controls, and 2,088,878 SNP-level p-values for bipolar disorder (7,481 cases/9,250 controls.The Glia-Oligodendrocyte pathway was associated with schizophrenia, after correction for multiple tests, according to primary analysis (MAGENTA p = 0.0005, 75% requirement for individual gene significance and also achieved nominal levels of significance with INRICH (p = 0.0057 and ALIGATOR (p = 0.022. For bipolar disorder, Set Screen yielded nominally and method-wide significant associations to all three glial pathways, with strongest association to the Glia-Astrocyte pathway (p = 0.002.Consistent with findings of white matter abnormalities in schizophrenia by other methods of study, the Glia-Oligodendrocyte pathway was associated with schizophrenia in our genomic study. These findings suggest that the abnormalities of myelination observed in schizophrenia are at least in part due to inherited factors, contrasted with the alternative of purely environmental causes (e.g. medication effects or lifestyle. While not the primary purpose of our study

  11. The complex jujube genome provides insights into fruit tree biology.

    Science.gov (United States)

    Liu, Meng-Jun; Zhao, Jin; Cai, Qing-Le; Liu, Guo-Cheng; Wang, Jiu-Rui; Zhao, Zhi-Hui; Liu, Ping; Dai, Li; Yan, Guijun; Wang, Wen-Jiang; Li, Xian-Song; Chen, Yan; Sun, Yu-Dong; Liu, Zhi-Guo; Lin, Min-Juan; Xiao, Jing; Chen, Ying-Ying; Li, Xiao-Feng; Wu, Bin; Ma, Yong; Jian, Jian-Bo; Yang, Wei; Yuan, Zan; Sun, Xue-Chao; Wei, Yan-Li; Yu, Li-Li; Zhang, Chi; Liao, Sheng-Guang; He, Rong-Jun; Guang, Xuan-Min; Wang, Zhuo; Zhang, Yue-Yang; Luo, Long-Hai

    2014-10-28

    The jujube (Ziziphus jujuba Mill.), a member of family Rhamnaceae, is a major dry fruit and a traditional herbal medicine for more than one billion people. Here we present a high-quality sequence for the complex jujube genome, the first genome sequence of Rhamnaceae, using an integrated strategy. The final assembly spans 437.65 Mb (98.6% of the estimated) with 321.45 Mb anchored to the 12 pseudo-chromosomes and contains 32,808 genes. The jujube genome has undergone frequent inter-chromosome fusions and segmental duplications, but no recent whole-genome duplication. Further analyses of the jujube-specific genes and transcriptome data from 15 tissues reveal the molecular mechanisms underlying some specific properties of the jujube. Its high vitamin C content can be attributed to a unique high level expression of genes involved in both biosynthesis and regeneration. Our study provides insights into jujube-specific biology and valuable genomic resources for the improvement of Rhamnaceae plants and other fruit trees.

  12. Mining microsatellites in the peach genome: development of new long-core SSR markers for genetic analyses in five Prunus species.

    Science.gov (United States)

    Dettori, Maria Teresa; Micali, Sabrina; Giovinazzi, Jessica; Scalabrin, Simone; Verde, Ignazio; Cipriani, Guido

    2015-01-01

    A wide inventory of molecular markers is nowadays available for individual fingerprinting. Microsatellites, or simple sequence repeats (SSRs), play a relevant role due to their relatively ease of use, their abundance in the plant genomes, and their co-dominant nature, together with the availability of primer sequences in many important agricultural crops. Microsatellites with long-core motifs are more easily scored and were adopted long ago in human genetics but they were developed only in few crops, and Prunus species are not among them. In the present work the peach whole-genome sequence was used to select 216 SSRs containing long-core motifs with tri-, tetra- and penta-nucleotide repeats. Microsatellite primer pairs were designed and tested for polymorphism in the five diploid Prunus species of economic relevance (almond, apricot, Japanese plum, peach and sweet cherry). A set of 26 microsatellite markers covering all the eight chromosomes, was also selected and used in the molecular characterization, population genetics and structure analyses of a representative sample of the five diploid Prunus species, assessing their transportability and effectiveness. The combined probability of identity between two random individuals for the whole set of 26 SSRs was quite low, ranging from 2.30 × 10(-7) in peach to 9.48 × 10(-10) in almond, confirming the usefulness of the proposed set for fingerprinting analyses in Prunus species.

  13. Comparative genomic data of the Avian Phylogenomics Project.

    Science.gov (United States)

    Zhang, Guojie; Li, Bo; Li, Cai; Gilbert, M Thomas P; Jarvis, Erich D; Wang, Jun

    2014-01-01

    The evolutionary relationships of modern birds are among the most challenging to understand in systematic biology and have been debated for centuries. To address this challenge, we assembled or collected the genomes of 48 avian species spanning most orders of birds, including all Neognathae and two of the five Palaeognathae orders, and used the genomes to construct a genome-scale avian phylogenetic tree and perform comparative genomics analyses (Jarvis et al. in press; Zhang et al. in press). Here we release assemblies and datasets associated with the comparative genome analyses, which include 38 newly sequenced avian genomes plus previously released or simultaneously released genomes of Chicken, Zebra finch, Turkey, Pigeon, Peregrine falcon, Duck, Budgerigar, Adelie penguin, Emperor penguin and the Medium Ground Finch. We hope that this resource will serve future efforts in phylogenomics and comparative genomics. The 38 bird genomes were sequenced using the Illumina HiSeq 2000 platform and assembled using a whole genome shotgun strategy. The 48 genomes were categorized into two groups according to the N50 scaffold size of the assemblies: a high depth group comprising 23 species sequenced at high coverage (>50X) with multiple insert size libraries resulting in N50 scaffold sizes greater than 1 Mb (except the White-throated Tinamou and Bald Eagle); and a low depth group comprising 25 species sequenced at a low coverage (~30X) with two insert size libraries resulting in an average N50 scaffold size of about 50 kb. Repetitive elements comprised 4%-22% of the bird genomes. The assembled scaffolds allowed the homology-based annotation of 13,000 ~ 17000 protein coding genes in each avian genome relative to chicken, zebra finch and human, as well as comparative and sequence conservation analyses. Here we release full genome assemblies of 38 newly sequenced avian species, link genome assembly downloads for the 7 of the remaining 10 species, and provide a guideline of

  14. Lightweight genome viewer: portable software for browsing genomics data in its chromosomal context.

    Science.gov (United States)

    Faith, Jeremiah J; Olson, Andrew J; Gardner, Timothy S; Sachidanandam, Ravi

    2007-09-18

    Lightweight genome viewer (lwgv) is a web-based tool for visualization of sequence annotations in their chromosomal context. It performs most of the functions of larger genome browsers, while relying on standard flat-file formats and bypassing the database needs of most visualization tools. Visualization as an aide to discovery requires display of novel data in conjunction with static annotations in their chromosomal context. With database-based systems, displaying dynamic results requires temporary tables that need to be tracked for removal. lwgv simplifies the visualization of user-generated results on a local computer. The dynamic results of these analyses are written to transient files, which can import static content from a more permanent file. lwgv is currently used in many different applications, from whole genome browsers to single-gene RNAi design visualization, demonstrating its applicability in a large variety of contexts and scales. lwgv provides a lightweight alternative to large genome browsers for visualizing biological annotations and dynamic analyses in their chromosomal context. It is particularly suited for applications ranging from short sequences to medium-sized genomes when the creation and maintenance of a large software and database infrastructure is not necessary or desired.

  15. Genomic analyses of the microsporidian Nosema ceranae, an emergent pathogen of honey bees.

    Directory of Open Access Journals (Sweden)

    R Scott Cornman

    2009-06-01

    Full Text Available Recent steep declines in honey bee health have severely impacted the beekeeping industry, presenting new risks for agricultural commodities that depend on insect pollination. Honey bee declines could reflect increased pressures from parasites and pathogens. The incidence of the microsporidian pathogen Nosema ceranae has increased significantly in the past decade. Here we present a draft assembly (7.86 MB of the N. ceranae genome derived from pyrosequence data, including initial gene models and genomic comparisons with other members of this highly derived fungal lineage. N. ceranae has a strongly AT-biased genome (74% A+T and a diversity of repetitive elements, complicating the assembly. Of 2,614 predicted protein-coding sequences, we conservatively estimate that 1,366 have homologs in the microsporidian Encephalitozoon cuniculi, the most closely related published genome sequence. We identify genes conserved among microsporidia that lack clear homology outside this group, which are of special interest as potential virulence factors in this group of obligate parasites. A substantial fraction of the diminutive N. ceranae proteome consists of novel and transposable-element proteins. For a majority of well-supported gene models, a conserved sense-strand motif can be found within 15 bases upstream of the start codon; a previously uncharacterized version of this motif is also present in E. cuniculi. These comparisons provide insight into the architecture, regulation, and evolution of microsporidian genomes, and will drive investigations into honey bee-Nosema interactions.

  16. G-InforBIO: integrated system for microbial genomics

    Directory of Open Access Journals (Sweden)

    Abe Takashi

    2006-08-01

    Full Text Available Abstract Background Genome databases contain diverse kinds of information, including gene annotations and nucleotide and amino acid sequences. It is not easy to integrate such information for genomic study. There are few tools for integrated analyses of genomic data, therefore, we developed software that enables users to handle, manipulate, and analyze genome data with a variety of sequence analysis programs. Results The G-InforBIO system is a novel tool for genome data management and sequence analysis. The system can import genome data encoded as eXtensible Markup Language documents as formatted text documents, including annotations and sequences, from DNA Data Bank of Japan and GenBank encoded as flat files. The genome database is constructed automatically after importing, and the database can be exported as documents formatted with eXtensible Markup Language or tab-deliminated text. Users can retrieve data from the database by keyword searches, edit annotation data of genes, and process data with G-InforBIO. In addition, information in the G-InforBIO database can be analyzed seamlessly with nine different software programs, including programs for clustering and homology analyses. Conclusion The G-InforBIO system simplifies genome analyses by integrating several available software programs to allow efficient handling and manipulation of genome data. G-InforBIO is freely available from the download site.

  17. Universal features in the genome-level evolution of protein domains.

    Science.gov (United States)

    Cosentino Lagomarsino, Marco; Sellerio, Alessandro L; Heijning, Philip D; Bassetti, Bruno

    2009-01-01

    Protein domains can be used to study proteome evolution at a coarse scale. In particular, they are found on genomes with notable statistical distributions. It is known that the distribution of domains with a given topology follows a power law. We focus on a further aspect: these distributions, and the number of distinct topologies, follow collective trends, or scaling laws, depending on the total number of domains only, and not on genome-specific features. We present a stochastic duplication/innovation model, in the class of the so-called 'Chinese restaurant processes', that explains this observation with two universal parameters, representing a minimal number of domains and the relative weight of innovation to duplication. Furthermore, we study a model variant where new topologies are related to occurrence in genomic data, accounting for fold specificity. Both models have general quantitative agreement with data from hundreds of genomes, which indicates that the domains of a genome are built with a combination of specificity and robust self-organizing phenomena. The latter are related to the basic evolutionary 'moves' of duplication and innovation, and give rise to the observed scaling laws, a priori of the specific evolutionary history of a genome. We interpret this as the concurrent effect of neutral and selective drives, which increase duplication and decrease innovation in larger and more complex genomes. The validity of our model would imply that the empirical observation of a small number of folds in nature may be a consequence of their evolution.

  18. Quantification of trace-level DNA by real-time whole genome amplification.

    Science.gov (United States)

    Kang, Min-Jung; Yu, Hannah; Kim, Sook-Kyung; Park, Sang-Ryoul; Yang, Inchul

    2011-01-01

    Quantification of trace amounts of DNA is a challenge in analytical applications where the concentration of a target DNA is very low or only limited amounts of samples are available for analysis. PCR-based methods including real-time PCR are highly sensitive and widely used for quantification of low-level DNA samples. However, ordinary PCR methods require at least one copy of a specific gene sequence for amplification and may not work for a sub-genomic amount of DNA. We suggest a real-time whole genome amplification method adopting the degenerate oligonucleotide primed PCR (DOP-PCR) for quantification of sub-genomic amounts of DNA. This approach enabled quantification of sub-picogram amounts of DNA independently of their sequences. When the method was applied to the human placental DNA of which amount was accurately determined by inductively coupled plasma-optical emission spectroscopy (ICP-OES), an accurate and stable quantification capability for DNA samples ranging from 80 fg to 8 ng was obtained. In blind tests of laboratory-prepared DNA samples, measurement accuracies of 7.4%, -2.1%, and -13.9% with analytical precisions around 15% were achieved for 400-pg, 4-pg, and 400-fg DNA samples, respectively. A similar quantification capability was also observed for other DNA species from calf, E. coli, and lambda phage. Therefore, when provided with an appropriate standard DNA, the suggested real-time DOP-PCR method can be used as a universal method for quantification of trace amounts of DNA.

  19. Sleep deprivation and activation of morning levels of cellular and genomic markers of inflammation.

    Science.gov (United States)

    Irwin, Michael R; Wang, Minge; Campomayor, Capella O; Collado-Hidalgo, Alicia; Cole, Steve

    2006-09-18

    Inflammation is associated with increased risk of cardiovascular disorders, arthritis, diabetes mellitus, and mortality. The effects of sleep loss on the cellular and genomic mechanisms that contribute to inflammatory cytokine activity are not known. In 30 healthy adults, monocyte intracellular proinflammatory cytokine production was repeatedly assessed during the day across 3 baseline periods and after partial sleep deprivation (awake from 11 pm to 3 am). We analyzed the impact of sleep loss on transcription of proinflammatory cytokine genes and used DNA microarray analyses to characterize candidate transcription-control pathways that might mediate the effects of sleep loss on leukocyte gene expression. In the morning after a night of sleep loss, monocyte production of interleukin 6 and tumor necrosis factor alpha was significantly greater compared with morning levels following uninterrupted sleep. In addition, sleep loss induced a more than 3-fold increase in transcription of interleukin 6 messenger RNA and a 2-fold increase in tumor necrosis factor alpha messenger RNA. Bioinformatics analyses suggested that the inflammatory response was mediated by the nuclear factor kappaB inflammatory signaling system as well as through classic hormone and growth factor response pathways. Sleep loss induces a functional alteration of the monocyte proinflammatory cytokine response. A modest amount of sleep loss also alters molecular processes that drive cellular immune activation and induce inflammatory cytokines; mapping the dynamics of sleep loss on molecular signaling pathways has implications for understanding the role of sleep in altering immune cell physiologic characteristics. Interventions that target sleep might constitute new strategies to constrain inflammation with effects on inflammatory disease risk.

  20. Lactobacillus paracasei comparative genomics: towards species pan-genome definition and exploitation of diversity.

    Directory of Open Access Journals (Sweden)

    Tamara Smokvina

    Full Text Available Lactobacillus paracasei is a member of the normal human and animal gut microbiota and is used extensively in the food industry in starter cultures for dairy products or as probiotics. With the development of low-cost, high-throughput sequencing techniques it has become feasible to sequence many different strains of one species and to determine its "pan-genome". We have sequenced the genomes of 34 different L. paracasei strains, and performed a comparative genomics analysis. We analysed genome synteny and content, focussing on the pan-genome, core genome and variable genome. Each genome was shown to contain around 2800-3100 protein-coding genes, and comparative analysis identified over 4200 ortholog groups that comprise the pan-genome of this species, of which about 1800 ortholog groups make up the conserved core. Several factors previously associated with host-microbe interactions such as pili, cell-envelope proteinase, hydrolases p40 and p75 or the capacity to produce short branched-chain fatty acids (bkd operon are part of the L. paracasei core genome present in all analysed strains. The variome consists mainly of hypothetical proteins, phages, plasmids, transposon/conjugative elements, and known functions such as sugar metabolism, cell-surface proteins, transporters, CRISPR-associated proteins, and EPS biosynthesis proteins. An enormous variety and variability of sugar utilization gene cassettes were identified, with each strain harbouring between 25-53 cassettes, reflecting the high adaptability of L. paracasei to different niches. A phylogenomic tree was constructed based on total genome contents, and together with an analysis of horizontal gene transfer events we conclude that evolution of these L. paracasei strains is complex and not always related to niche adaptation. The results of this genome content comparison was used, together with high-throughput growth experiments on various carbohydrates, to perform gene-trait matching analysis

  1. A genome-wide association study reveals variants in ARL15 that influence adiponectin levels

    NARCIS (Netherlands)

    J.B. Richards (Brent); D. Waterworth (Dawn); S. O'Rahilly (Stephen); M.-F. Hivert (Marie-France); R.J.F. Loos (Ruth); J.R.B. Perry (John); T. Tanaka (Toshiko); N.J. Timpson (Nicholas); R.K. Semple (Robert); N. Soranzo (Nicole); K. Song (Kijoung); N. Rocha (Nuno); E. Grundberg (Elin); J. Dupuis (Josée); J.C. Florez (Jose); C. Langenberg (Claudia); I. Prokopenko (Inga); R. Saxena (Richa); R. Sladek (Rob); Y.S. Aulchenko (Yurii); D.M. Evans (David); G. Waeber (Gérard); M.S. Burnett; N. Sattar (Naveed); J. Devaney (Joseph); C. Willenborg (Christina); A. Hingorani (Aroon); J.C.M. Witteman (Jacqueline); P. Vollenweider (Peter); B. Glaser (Beate); C. Hengstenberg (Christian); L. Ferrucci (Luigi); D. Melzer (David); K. Stark (Klaus); J. Deanfield (John); J. Winogradow (Janina); M. Grassl (Martina); A.S. Hall (Alistair); J.M. Egan (Josephine); J.R. Thompson (John); S.L. Ricketts (Sally); I.R. König (Inke); W. Reinhard (Wibke); S.M. Grundy (Scott); H.E. Wichmann (Heinz Erich); P. Barter (Phil); R. Mahley (Robert); Y.A. Kesaniemi (Antero); D.J. Rader (Daniel); M.P. Reilly (Muredach); S.E. Epstein (Stephen); A.F.R. Stewart (Alexandre); P. Tikka-Kleemola (Päivi); H. Schunkert (Heribert); K.A. Burling (Keith); J. Erdmann (Jeanette); P. Deloukas (Panagiotis); T. Pastinen (Tomi); N.J. Samani (Nilesh); R. McPherson (Ruth); G.D. Smith; T.M. Frayling (Timothy); N.J. Wareham (Nick); J.B. Meigs (James); V. Mooser (Vincent); T.D. Spector (Timothy)

    2009-01-01

    textabstractThe adipocyte-derived protein adiponectin is highly heritable and inversely associated with risk of type 2 diabetes mellitus (T2D) and coronary heart disease (CHD). We meta-analyzed 3 genome-wide association studies for circulating adiponectin levels (n = 8,531) and sought validation of

  2. Analysis of 90 Mb of the potato genome reveals conservation of gene structures and order with tomato but divergence in repetitive sequence composition

    Directory of Open Access Journals (Sweden)

    O'Brien Kimberly

    2008-06-01

    Full Text Available Abstract Background The Solanaceae family contains a number of important crop species including potato (Solanum tuberosum which is grown for its underground storage organ known as a tuber. Albeit the 4th most important food crop in the world, other than a collection of ~220,000 Expressed Sequence Tags, limited genomic sequence information is currently available for potato and advances in potato yield and nutrition content would be greatly assisted through access to a complete genome sequence. While morphologically diverse, Solanaceae species such as potato, tomato, pepper, and eggplant share not only genes but also gene order thereby permitting highly informative comparative genomic analyses. Results In this study, we report on analysis 89.9 Mb of potato genomic sequence representing 10.2% of the genome generated through end sequencing of a potato bacterial artificial chromosome (BAC clone library (87 Mb and sequencing of 22 potato BAC clones (2.9 Mb. The GC content of potato is very similar to Solanum lycopersicon (tomato and other dicotyledonous species yet distinct from the monocotyledonous grass species, Oryza sativa. Parallel analyses of repetitive sequences in potato and tomato revealed substantial differences in their abundance, 34.2% in potato versus 46.3% in tomato, which is consistent with the increased genome size per haploid genome of these two Solanum species. Specific classes and types of repetitive sequences were also differentially represented between these two species including a telomeric-related repetitive sequence, ribosomal DNA, and a number of unclassified repetitive sequences. Comparative analyses between tomato and potato at the gene level revealed a high level of conservation of gene content, genic feature, and gene order although discordances in synteny were observed. Conclusion Genomic level analyses of potato and tomato confirm that gene sequence and gene order are conserved between these solanaceous species and that

  3. Use of genomic recursions and algorithm for proven and young animals for single-step genomic BLUP analyses--a simulation study.

    Science.gov (United States)

    Fragomeni, B O; Lourenco, D A L; Tsuruta, S; Masuda, Y; Aguilar, I; Misztal, I

    2015-10-01

    The purpose of this study was to examine accuracy of genomic selection via single-step genomic BLUP (ssGBLUP) when the direct inverse of the genomic relationship matrix (G) is replaced by an approximation of G(-1) based on recursions for young genotyped animals conditioned on a subset of proven animals, termed algorithm for proven and young animals (APY). With the efficient implementation, this algorithm has a cubic cost with proven animals and linear with young animals. Ten duplicate data sets mimicking a dairy cattle population were simulated. In a first scenario, genomic information for 20k genotyped bulls, divided in 7k proven and 13k young bulls, was generated for each replicate. In a second scenario, 5k genotyped cows with phenotypes were included in the analysis as young animals. Accuracies (average for the 10 replicates) in regular EBV were 0.72 and 0.34 for proven and young animals, respectively. When genomic information was included, they increased to 0.75 and 0.50. No differences between genomic EBV (GEBV) obtained with the regular G(-1) and the approximated G(-1) via the recursive method were observed. In the second scenario, accuracies in GEBV (0.76, 0.51 and 0.59 for proven bulls, young males and young females, respectively) were also higher than those in EBV (0.72, 0.35 and 0.49). Again, no differences between GEBV with regular G(-1) and with recursions were observed. With the recursive algorithm, the number of iterations to achieve convergence was reduced from 227 to 206 in the first scenario and from 232 to 209 in the second scenario. Cows can be treated as young animals in APY without reducing the accuracy. The proposed algorithm can be implemented to reduce computing costs and to overcome current limitations on the number of genotyped animals in the ssGBLUP method. © 2015 Blackwell Verlag GmbH.

  4. Genome sequence and genetic diversity of European ash trees.

    Science.gov (United States)

    Sollars, Elizabeth S A; Harper, Andrea L; Kelly, Laura J; Sambles, Christine M; Ramirez-Gonzalez, Ricardo H; Swarbreck, David; Kaithakottil, Gemy; Cooper, Endymion D; Uauy, Cristobal; Havlickova, Lenka; Worswick, Gemma; Studholme, David J; Zohren, Jasmin; Salmon, Deborah L; Clavijo, Bernardo J; Li, Yi; He, Zhesi; Fellgett, Alison; McKinney, Lea Vig; Nielsen, Lene Rostgaard; Douglas, Gerry C; Kjær, Erik Dahl; Downie, J Allan; Boshier, David; Lee, Steve; Clark, Jo; Grant, Murray; Bancroft, Ian; Caccamo, Mario; Buggs, Richard J A

    2017-01-12

    Ash trees (genus Fraxinus, family Oleaceae) are widespread throughout the Northern Hemisphere, but are being devastated in Europe by the fungus Hymenoscyphus fraxineus, causing ash dieback, and in North America by the herbivorous beetle Agrilus planipennis. Here we sequence the genome of a low-heterozygosity Fraxinus excelsior tree from Gloucestershire, UK, annotating 38,852 protein-coding genes of which 25% appear ash specific when compared with the genomes of ten other plant species. Analyses of paralogous genes suggest a whole-genome duplication shared with olive (Olea europaea, Oleaceae). We also re-sequence 37 F. excelsior trees from Europe, finding evidence for apparent long-term decline in effective population size. Using our reference sequence, we re-analyse association transcriptomic data, yielding improved markers for reduced susceptibility to ash dieback. Surveys of these markers in British populations suggest that reduced susceptibility to ash dieback may be more widespread in Great Britain than in Denmark. We also present evidence that susceptibility of trees to H. fraxineus is associated with their iridoid glycoside levels. This rapid, integrated, multidisciplinary research response to an emerging health threat in a non-model organism opens the way for mitigation of the epidemic.

  5. Analyses of a whole-genome inter-clade recombination map of hepatitis delta virus suggest a host polymerase-driven and viral RNA structure-promoted template-switching mechanism for viral RNA recombination

    Science.gov (United States)

    Chao, Mei; Wang, Tzu-Chi; Lin, Chia-Chi; Yung-Liang Wang, Robert; Lin, Wen-Bin; Lee, Shang-En; Cheng, Ying-Yu; Yeh, Chau-Ting; Iang, Shan-Bei

    2017-01-01

    The genome of hepatitis delta virus (HDV) is a 1.7-kb single-stranded circular RNA that folds into an unbranched rod-like structure and has ribozyme activity. HDV redirects host RNA polymerase(s) (RNAP) to perform viral RNA-directed RNA transcription. RNA recombination is known to contribute to the genetic heterogeneity of HDV, but its molecular mechanism is poorly understood. Here, we established a whole-genome HDV-1/HDV-4 recombination map using two cloned sequences coexisting in cultured cells. Our functional analyses of the resulting chimeric delta antigens (the only viral-encoded protein) and recombinant genomes provide insights into how recombination promotes the genotypic and phenotypic diversity of HDV. Our examination of crossover distribution and subsequent mutagenesis analyses demonstrated that ribozyme activity on HDV genome, which is required for viral replication, also contributes to the generation of an inter-clade junction. These data provide circumstantial evidence supporting our contention that HDV RNA recombination occurs via a replication-dependent mechanism. Furthermore, we identify an intrinsic asymmetric bulge on the HDV genome, which appears to promote recombination events in the vicinity. We therefore propose a mammalian RNAP-driven and viral-RNA-structure-promoted template-switching mechanism for HDV genetic recombination. The present findings improve our understanding of the capacities of the host RNAP beyond typical DNA-directed transcription. PMID:28977829

  6. [Development of Plant Metabolomics and Medicinal Plant Genomics].

    Science.gov (United States)

    Saito, Kazuki

    2018-01-01

     A variety of chemicals produced by plants, often referred to as 'phytochemicals', have been used as medicines, food, fuels and industrial raw materials. Recent advances in the study of genomics and metabolomics in plant science have accelerated our understanding of the mechanisms, regulation and evolution of the biosynthesis of specialized plant products. We can now address such questions as how the metabolomic diversity of plants is originated at the levels of genome, and how we should apply this knowledge to drug discovery, industry and agriculture. Our research group has focused on metabolomics-based functional genomics over the last 15 years and we have developed a new research area called 'Phytochemical Genomics'. In this review, the development of a research platform for plant metabolomics is discussed first, to provide a better understanding of the chemical diversity of plants. Then, representative applications of metabolomics to functional genomics in a model plant, Arabidopsis thaliana, are described. The extension of integrated multi-omics analyses to non-model specialized plants, e.g., medicinal plants, is presented, including the identification of novel genes, metabolites and networks for the biosynthesis of flavonoids, alkaloids, sulfur-containing metabolites and terpenoids. Further, functional genomics studies on a variety of medicinal plants is presented. I also discuss future trends in pharmacognosy and related sciences.

  7. Phylogenetic tree based on complete genomes using fractal and correlation analyses without sequence alignment

    Directory of Open Access Journals (Sweden)

    Zu-Guo Yu

    2006-06-01

    Full Text Available The complete genomes of living organisms have provided much information on their phylogenetic relationships. Similarly, the complete genomes of chloroplasts have helped resolve the evolution of this organelle in photosynthetic eukaryotes. In this review, we describe two algorithms to construct phylogenetic trees based on the theories of fractals and dynamic language using complete genomes. These algorithms were developed by our research group in the past few years. Our distance-based phylogenetic tree of 109 prokaryotes and eukaryotes agrees with the biologists' "tree of life" based on the 16S-like rRNA genes in a majority of basic branchings and most lower taxa. Our phylogenetic analysis also shows that the chloroplast genomes are separated into two major clades corresponding to chlorophytes s.l. and rhodophytes s.l. The interrelationships among the chloroplasts are largely in agreement with the current understanding on chloroplast evolution.

  8. The Naked Mole Rat Genome Resource: facilitating analyses of cancer and longevity-related adaptations.

    Science.gov (United States)

    Keane, Michael; Craig, Thomas; Alföldi, Jessica; Berlin, Aaron M; Johnson, Jeremy; Seluanov, Andrei; Gorbunova, Vera; Di Palma, Federica; Lindblad-Toh, Kerstin; Church, George M; de Magalhães, João Pedro

    2014-12-15

    The naked mole rat (Heterocephalus glaber) is an exceptionally long-lived and cancer-resistant rodent native to East Africa. Although its genome was previously sequenced, here we report a new assembly sequenced by us with substantially higher N50 values for scaffolds and contigs. We analyzed the annotation of this new improved assembly and identified candidate genomic adaptations which may have contributed to the evolution of the naked mole rat's extraordinary traits, including in regions of p53, and the hyaluronan receptors CD44 and HMMR (RHAMM). Furthermore, we developed a freely available web portal, the Naked Mole Rat Genome Resource (http://www.naked-mole-rat.org), featuring the data and results of our analysis, to assist researchers interested in the genome and genes of the naked mole rat, and also to facilitate further studies on this fascinating species. © The Author 2014. Published by Oxford University Press.

  9. Quantitative and qualitative proteome characteristics extracted from in-depth integrated genomics and proteomics analysis

    NARCIS (Netherlands)

    Low, T.Y.; van Heesch, S.; van den Toorn, H.; Giansanti, P.; Cristobal, A.; Toonen, P.; Schafer, S.; Hubner, N.; van Breukelen, B.; Mohammed, S.; Cuppen, E.; Heck, A.J.R.; Guryev, V.

    2013-01-01

    Quantitative and qualitative protein characteristics are regulated at genomic, transcriptomic, and posttranscriptional levels. Here, we integrated in-depth transcriptome and proteome analyses of liver tissues from two rat strains to unravel the interactions within and between these layers. We

  10. Genome-Wide Tuning of Protein Expression Levels to Rapidly Engineer Microbial Traits.

    Science.gov (United States)

    Freed, Emily F; Winkler, James D; Weiss, Sophie J; Garst, Andrew D; Mutalik, Vivek K; Arkin, Adam P; Knight, Rob; Gill, Ryan T

    2015-11-20

    The reliable engineering of biological systems requires quantitative mapping of predictable and context-independent expression over a broad range of protein expression levels. However, current techniques for modifying expression levels are cumbersome and are not amenable to high-throughput approaches. Here we present major improvements to current techniques through the design and construction of E. coli genome-wide libraries using synthetic DNA cassettes that can tune expression over a ∼10(4) range. The cassettes also contain molecular barcodes that are optimized for next-generation sequencing, enabling rapid and quantitative tracking of alleles that have the highest fitness advantage. We show these libraries can be used to determine which genes and expression levels confer greater fitness to E. coli under different growth conditions.

  11. The wolf reference genome sequence (Canis lupus lupus) and its implications for Canis spp. population genomics

    DEFF Research Database (Denmark)

    Gopalakrishnan, Shyam; Samaniego Castruita, Jose Alfredo; Sinding, Mikkel Holger Strander

    2017-01-01

    Background An increasing number of studies are addressing the evolutionary genomics of dog domestication, principally through resequencing dog, wolf and related canid genomes. There is, however, only one de novo assembled canid genome currently available against which to map such data - that of a......Background An increasing number of studies are addressing the evolutionary genomics of dog domestication, principally through resequencing dog, wolf and related canid genomes. There is, however, only one de novo assembled canid genome currently available against which to map such data...... that regardless of the reference genome choice, most evolutionary genomic analyses yield qualitatively similar results, including those exploring the structure between the wolves and dogs using admixture and principal component analysis. However, we do observe differences in the genomic coverage of re-mapped...

  12. Between Two Fern Genomes

    Science.gov (United States)

    2014-01-01

    Ferns are the only major lineage of vascular plants not represented by a sequenced nuclear genome. This lack of genome sequence information significantly impedes our ability to understand and reconstruct genome evolution not only in ferns, but across all land plants. Azolla and Ceratopteris are ideal and complementary candidates to be the first ferns to have their nuclear genomes sequenced. They differ dramatically in genome size, life history, and habit, and thus represent the immense diversity of extant ferns. Together, this pair of genomes will facilitate myriad large-scale comparative analyses across ferns and all land plants. Here we review the unique biological characteristics of ferns and describe a number of outstanding questions in plant biology that will benefit from the addition of ferns to the set of taxa with sequenced nuclear genomes. We explain why the fern clade is pivotal for understanding genome evolution across land plants, and we provide a rationale for how knowledge of fern genomes will enable progress in research beyond the ferns themselves. PMID:25324969

  13. Genomic sequencing and analyses of HearMNPV—a new Multinucleocapsid nucleopolyhedrovirus isolated from Helicoverpa armigera

    Directory of Open Access Journals (Sweden)

    Tang Ping

    2012-08-01

    Full Text Available Abstract Background HearMNPV, a nucleopolyhedrovirus (NPV, which infects the cotton bollworm, Helicoverpa armigera, comprises multiple rod-shaped nucleocapsids in virion(as detected by electron microscopy. HearMNPV shows a different host range compared with H. armigera single-nucleocapsid NPV (HearSNPV. To better understand HearMNPV, the HearMNPV genome was sequenced and analyzed. Methods The morphology of HearMNPV was observed by electron microscope. The qPCR was used to determine the replication kinetics of HearMNPV infectious for H. armigera in vivo. A random genomic library of HearMNPV was constructed according to the “partial filling-in” method, the sequence and organization of the HearMNPV genome was analyzed and compared with sequence data from other baculoviruses. Results Real time qPCR showed that HearMNPV DNA replication included a decreasing phase, latent phase, exponential phase, and a stationary phase during infection of H. armigera. The HearMNPV genome consists of 154,196 base pairs, with a G + C content of 40.07%. 162 putative ORFs were detected in the HearMNPV genome, which represented 90.16% of the genome. The remaining 9.84% constitute four homologous regions and other non-coding regions. The gene content and gene arrangement in HearMNPV were most similar to those of Mamestra configurata NPV-B (MacoNPV-B, but was different to HearSNPV. Comparison of the genome of HearMNPV and MacoNPV-B suggested that HearMNPV has a deletion of a 5.4-kb fragment containing five ORFs. In addition, HearMNPV orf66, bro genes, and hrs are different to the corresponding parts of the MacoNPV-B genome. Conclusions HearMNPV can replicate in vivo in H. armigera and in vitro, and is a new NPV isolate distinguished from HearSNPV. HearMNPV is most closely related to MacoNPV-B, but has a distinct genomic structure, content, and organization.

  14. Genome Partitioner: A web tool for multi-level partitioning of large-scale DNA constructs for synthetic biology applications.

    Science.gov (United States)

    Christen, Matthias; Del Medico, Luca; Christen, Heinz; Christen, Beat

    2017-01-01

    Recent advances in lower-cost DNA synthesis techniques have enabled new innovations in the field of synthetic biology. Still, efficient design and higher-order assembly of genome-scale DNA constructs remains a labor-intensive process. Given the complexity, computer assisted design tools that fragment large DNA sequences into fabricable DNA blocks are needed to pave the way towards streamlined assembly of biological systems. Here, we present the Genome Partitioner software implemented as a web-based interface that permits multi-level partitioning of genome-scale DNA designs. Without the need for specialized computing skills, biologists can submit their DNA designs to a fully automated pipeline that generates the optimal retrosynthetic route for higher-order DNA assembly. To test the algorithm, we partitioned a 783 kb Caulobacter crescentus genome design. We validated the partitioning strategy by assembling a 20 kb test segment encompassing a difficult to synthesize DNA sequence. Successful assembly from 1 kb subblocks into the 20 kb segment highlights the effectiveness of the Genome Partitioner for reducing synthesis costs and timelines for higher-order DNA assembly. The Genome Partitioner is broadly applicable to translate DNA designs into ready to order sequences that can be assembled with standardized protocols, thus offering new opportunities to harness the diversity of microbial genomes for synthetic biology applications. The Genome Partitioner web tool can be accessed at https://christenlab.ethz.ch/GenomePartitioner.

  15. Genome Partitioner: A web tool for multi-level partitioning of large-scale DNA constructs for synthetic biology applications.

    Directory of Open Access Journals (Sweden)

    Matthias Christen

    Full Text Available Recent advances in lower-cost DNA synthesis techniques have enabled new innovations in the field of synthetic biology. Still, efficient design and higher-order assembly of genome-scale DNA constructs remains a labor-intensive process. Given the complexity, computer assisted design tools that fragment large DNA sequences into fabricable DNA blocks are needed to pave the way towards streamlined assembly of biological systems. Here, we present the Genome Partitioner software implemented as a web-based interface that permits multi-level partitioning of genome-scale DNA designs. Without the need for specialized computing skills, biologists can submit their DNA designs to a fully automated pipeline that generates the optimal retrosynthetic route for higher-order DNA assembly. To test the algorithm, we partitioned a 783 kb Caulobacter crescentus genome design. We validated the partitioning strategy by assembling a 20 kb test segment encompassing a difficult to synthesize DNA sequence. Successful assembly from 1 kb subblocks into the 20 kb segment highlights the effectiveness of the Genome Partitioner for reducing synthesis costs and timelines for higher-order DNA assembly. The Genome Partitioner is broadly applicable to translate DNA designs into ready to order sequences that can be assembled with standardized protocols, thus offering new opportunities to harness the diversity of microbial genomes for synthetic biology applications. The Genome Partitioner web tool can be accessed at https://christenlab.ethz.ch/GenomePartitioner.

  16. The Genomic Code: Genome Evolution and Potential Applications

    KAUST Repository

    Bernardi, Giorgio

    2016-01-25

    The genome of metazoans is organized according to a genomic code which comprises three laws: 1) Compositional correlations hold between contiguous coding and non-coding sequences, as well as among the three codon positions of protein-coding genes; these correlations are the consequence of the fact that the genomes under consideration consist of fairly homogeneous, long (≥200Kb) sequences, the isochores; 2) Although isochores are defined on the basis of purely compositional properties, GC levels of isochores are correlated with all tested structural and functional properties of the genome; 3) GC levels of isochores are correlated with chromosome architecture from interphase to metaphase; in the case of interphase the correlation concerns isochores and the three-dimensional “topological associated domains” (TADs); in the case of mitotic chromosomes, the correlation concerns isochores and chromosomal bands. Finally, the genomic code is the fourth and last pillar of molecular biology, the first three pillars being 1) the double helix structure of DNA; 2) the regulation of gene expression in prokaryotes; and 3) the genetic code.

  17. Genome analyses of an aggressive and invasive lineage of the Irish potato famine pathogen.

    Directory of Open Access Journals (Sweden)

    David E L Cooke

    Full Text Available Pest and pathogen losses jeopardise global food security and ever since the 19(th century Irish famine, potato late blight has exemplified this threat. The causal oomycete pathogen, Phytophthora infestans, undergoes major population shifts in agricultural systems via the successive emergence and migration of asexual lineages. The phenotypic and genotypic bases of these selective sweeps are largely unknown but management strategies need to adapt to reflect the changing pathogen population. Here, we used molecular markers to document the emergence of a lineage, termed 13_A2, in the European P. infestans population, and its rapid displacement of other lineages to exceed 75% of the pathogen population across Great Britain in less than three years. We show that isolates of the 13_A2 lineage are among the most aggressive on cultivated potatoes, outcompete other aggressive lineages in the field, and overcome previously effective forms of plant host resistance. Genome analyses of a 13_A2 isolate revealed extensive genetic and expression polymorphisms particularly in effector genes. Copy number variations, gene gains and losses, amino-acid replacements and changes in expression patterns of disease effector genes within the 13_A2 isolate likely contribute to enhanced virulence and aggressiveness to drive this population displacement. Importantly, 13_A2 isolates carry intact and in planta induced Avrblb1, Avrblb2 and Avrvnt1 effector genes that trigger resistance in potato lines carrying the corresponding R immune receptor genes Rpi-blb1, Rpi-blb2, and Rpi-vnt1.1. These findings point towards a strategy for deploying genetic resistance to mitigate the impact of the 13_A2 lineage and illustrate how pathogen population monitoring, combined with genome analysis, informs the management of devastating disease epidemics.

  18. Characterization of the first complete genome sequence of an Impatiens necrotic spot orthotospovirus isolate from the United States and worldwide phylogenetic analyses of INSV isolates.

    Science.gov (United States)

    Zhao, Kaixi; Margaria, Paolo; Rosa, Cristina

    2018-05-10

    Impatiens necrotic spot orthotospovirus (INSV) can impact economically important ornamental plants and vegetables worldwide. Characterization studies on INSV are limited. For most INSV isolates, there are no complete genome sequences available. This lack of genomic information has a negative impact on the understanding of the INSV genetic diversity and evolution. Here we report the first complete nucleotide sequence of a US INSV isolate. INSV-UP01 was isolated from an impatiens in Pennsylvania, US. RT-PCR was used to clone its full-length genome and Vector NTI to assemble overlapping sequences. Phylogenetic trees were constructed by using MEGA7 software to show the phylogenetic relationships with other available INSV sequences worldwide. This US isolate has genome and biological features classical of INSV species and clusters in the Western Hemisphere clade, but its origin appears to be recent. Furthermore, INSV-UP01 might have been involved in a recombination event with an Italian isolate belonging to the Asian clade. Our analyses support that INSV isolates infect a broad plant-host range they group by geographic origin and not by host, and are subjected to frequent recombination events. These results justify the need to generate and analyze complete genome sequences of orthotospoviruses in general and INSV in particular.

  19. Lightweight genome viewer: portable software for browsing genomics data in its chromosomal context

    Directory of Open Access Journals (Sweden)

    Gardner Timothy S

    2007-09-01

    Full Text Available Abstract Background Lightweight genome viewer (lwgv is a web-based tool for visualization of sequence annotations in their chromosomal context. It performs most of the functions of larger genome browsers, while relying on standard flat-file formats and bypassing the database needs of most visualization tools. Visualization as an aide to discovery requires display of novel data in conjunction with static annotations in their chromosomal context. With database-based systems, displaying dynamic results requires temporary tables that need to be tracked for removal. Results lwgv simplifies the visualization of user-generated results on a local computer. The dynamic results of these analyses are written to transient files, which can import static content from a more permanent file. lwgv is currently used in many different applications, from whole genome browsers to single-gene RNAi design visualization, demonstrating its applicability in a large variety of contexts and scales. Conclusion lwgv provides a lightweight alternative to large genome browsers for visualizing biological annotations and dynamic analyses in their chromosomal context. It is particularly suited for applications ranging from short sequences to medium-sized genomes when the creation and maintenance of a large software and database infrastructure is not necessary or desired.

  20. Quantitative and Qualitative Proteome Characteristics Extracted from In-Depth Integrated Genomics and Proteomics Analysis

    NARCIS (Netherlands)

    Low, Teck Yew; van Heesch, Sebastiaan; van den Toorn, Henk; Giansanti, Piero; Cristobal, Alba; Toonen, Pim; Schafer, Sebastian; Huebner, Norbert; van Breukelen, Bas; Mohammed, Shabaz; Cuppen, Edwin; Heck, Albert J. R.; Guryev, Victor

    2013-01-01

    Quantitative and qualitative protein characteristics are regulated at genomic, transcriptomic, and post-transcriptional levels. Here, we integrated in-depth transcriptome and proteome analyses of liver tissues from two rat strains to unravel the interactions within and between these layers. We

  1. Evolution of genes and genomes on the Drosophila phylogeny

    DEFF Research Database (Denmark)

    Clark, Andrew G; Eisen, Michael B; Smith, Douglas R

    2007-01-01

    Comparative analysis of multiple genomes in a phylogenetic framework dramatically improves the precision and sensitivity of evolutionary inference, producing more robust results than single-genome analyses can provide. The genomes of 12 Drosophila species, ten of which are presented here for the ......Comparative analysis of multiple genomes in a phylogenetic framework dramatically improves the precision and sensitivity of evolutionary inference, producing more robust results than single-genome analyses can provide. The genomes of 12 Drosophila species, ten of which are presented here...... tools that have made Drosophila melanogaster a pre-eminent model for animal genetics, and will further catalyse fundamental research on mechanisms of development, cell biology, genetics, disease, neurobiology, behaviour, physiology and evolution. Despite remarkable similarities among these Drosophila...

  2. OryzaGenome: Genome Diversity Database of Wild Oryza Species

    KAUST Repository

    Ohyanagi, Hajime

    2015-11-18

    The species in the genus Oryza, encompassing nine genome types and 23 species, are a rich genetic resource and may have applications in deeper genomic analyses aiming to understand the evolution of plant genomes. With the advancement of next-generation sequencing (NGS) technology, a flood of Oryza species reference genomes and genomic variation information has become available in recent years. This genomic information, combined with the comprehensive phenotypic information that we are accumulating in our Oryzabase, can serve as an excellent genotype-phenotype association resource for analyzing rice functional and structural evolution, and the associated diversity of the Oryza genus. Here we integrate our previous and future phenotypic/habitat information and newly determined genotype information into a united repository, named OryzaGenome, providing the variant information with hyperlinks to Oryzabase. The current version of OryzaGenome includes genotype information of 446 O. rufipogon accessions derived by imputation and of 17 accessions derived by imputation-free deep sequencing. Two variant viewers are implemented: SNP Viewer as a conventional genome browser interface and Variant Table as a textbased browser for precise inspection of each variant one by one. Portable VCF (variant call format) file or tabdelimited file download is also available. Following these SNP (single nucleotide polymorphism) data, reference pseudomolecules/ scaffolds/contigs and genome-wide variation information for almost all of the closely and distantly related wild Oryza species from the NIG Wild Rice Collection will be available in future releases. All of the resources can be accessed through http://viewer.shigen.info/oryzagenome/.

  3. Genome-wide divergence and linkage disequilibrium analyses for Capsicum baccatum revealed by genome-anchored single nucleotide polymorphisms

    Science.gov (United States)

    Principal component analysis (PCA) with 36,621 polymorphic genome-anchored single nucleotide polymorphisms (SNPs) identified collectively for Capsicum annuum and Capsicum baccatum was used to show the distribution of these 2 important incompatible cultivated pepper species. Estimated mean nucleotide...

  4. Architecture Level Safety Analyses for Safety-Critical Systems

    Directory of Open Access Journals (Sweden)

    K. S. Kushal

    2017-01-01

    Full Text Available The dependency of complex embedded Safety-Critical Systems across Avionics and Aerospace domains on their underlying software and hardware components has gradually increased with progression in time. Such application domain systems are developed based on a complex integrated architecture, which is modular in nature. Engineering practices assured with system safety standards to manage the failure, faulty, and unsafe operational conditions are very much necessary. System safety analyses involve the analysis of complex software architecture of the system, a major aspect in leading to fatal consequences in the behaviour of Safety-Critical Systems, and provide high reliability and dependability factors during their development. In this paper, we propose an architecture fault modeling and the safety analyses approach that will aid in identifying and eliminating the design flaws. The formal foundations of SAE Architecture Analysis & Design Language (AADL augmented with the Error Model Annex (EMV are discussed. The fault propagation, failure behaviour, and the composite behaviour of the design flaws/failures are considered for architecture safety analysis. The illustration of the proposed approach is validated by implementing the Speed Control Unit of Power-Boat Autopilot (PBA system. The Error Model Annex (EMV is guided with the pattern of consideration and inclusion of probable failure scenarios and propagation of fault conditions in the Speed Control Unit of Power-Boat Autopilot (PBA. This helps in validating the system architecture with the detection of the error event in the model and its impact in the operational environment. This also provides an insight of the certification impact that these exceptional conditions pose at various criticality levels and design assurance levels and its implications in verifying and validating the designs.

  5. The complete mitochondrial genomes of three parasitic nematodes of birds: a unique gene order and insights into nematode phylogeny

    Science.gov (United States)

    2013-01-01

    Background Analyses of mitochondrial (mt) genome sequences in recent years challenge the current working hypothesis of Nematoda phylogeny proposed from morphology, ecology and nuclear small subunit rRNA gene sequences, and raise the need to sequence additional mt genomes for a broad range of nematode lineages. Results We sequenced the complete mt genomes of three Ascaridia species (family Ascaridiidae) that infest chickens, pigeons and parrots, respectively. These three Ascaridia species have an identical arrangement of mt genes to each other but differ substantially from other nematodes. Phylogenetic analyses of the mt genome sequences of the Ascaridia species, together with 62 other nematode species, support the monophylies of seven high-level taxa of the phylum Nematoda: 1) the subclass Dorylaimia; 2) the orders Rhabditida, Trichinellida and Mermithida; 3) the suborder Rhabditina; and 4) the infraorders Spiruromorpha and Oxyuridomorpha. Analyses of mt genome sequences, however, reject the monophylies of the suborders Spirurina and Tylenchina, and the infraorders Rhabditomorpha, Panagrolaimomorpha and Tylenchomorpha. Monophyly of the infraorder Ascaridomorpha varies depending on the methods of phylogenetic analysis. The Ascaridomorpha was more closely related to the infraorders Rhabditomorpha and Diplogasteromorpha (suborder Rhabditina) than they were to the other two infraorders of the Spirurina: Oxyuridorpha and Spiruromorpha. The closer relationship among Ascaridomorpha, Rhabditomorpha and Diplogasteromorpha was also supported by a shared common pattern of mitochondrial gene arrangement. Conclusions Analyses of mitochondrial genome sequences and gene arrangement has provided novel insights into the phylogenetic relationships among several major lineages of nematodes. Many lineages of nematodes, however, are underrepresented or not represented in these analyses. Expanding taxon sampling is necessary for future phylogenetic studies of nematodes with mt genome

  6. Multi-level Bayesian analyses for single- and multi-vehicle freeway crashes.

    Science.gov (United States)

    Yu, Rongjie; Abdel-Aty, Mohamed

    2013-09-01

    This study presents multi-level analyses for single- and multi-vehicle crashes on a mountainous freeway. Data from a 15-mile mountainous freeway section on I-70 were investigated. Both aggregate and disaggregate models for the two crash conditions were developed. Five years of crash data were used in the aggregate investigation, while the disaggregate models utilized one year of crash data along with real-time traffic and weather data. For the aggregate analyses, safety performance functions were developed for the purpose of revealing the contributing factors for each crash type. Two methodologies, a Bayesian bivariate Poisson-lognormal model and a Bayesian hierarchical Poisson model with correlated random effects, were estimated to simultaneously analyze the two crash conditions with consideration of possible correlations. Except for the factors related to geometric characteristics, two exposure parameters (annual average daily traffic and segment length) were included. Two different sets of significant explanatory and exposure variables were identified for the single-vehicle (SV) and multi-vehicle (MV) crashes. It was found that the Bayesian bivariate Poisson-lognormal model is superior to the Bayesian hierarchical Poisson model, the former with a substantially lower DIC and more significant variables. In addition to the aggregate analyses, microscopic real-time crash risk evaluation models were developed for the two crash conditions. Multi-level Bayesian logistic regression models were estimated with the random parameters accounting for seasonal variations, crash-unit-level diversity and segment-level random effects capturing unobserved heterogeneity caused by the geometric characteristics. The model results indicate that the effects of the selected variables on crash occurrence vary across seasons and crash units; and that geometric characteristic variables contribute to the segment variations: the more unobserved heterogeneity have been accounted, the better

  7. Genomic Biomarkers for Personalized Medicine: Development and Validation in Clinical Studies

    Directory of Open Access Journals (Sweden)

    Shigeyuki Matsui

    2013-01-01

    Full Text Available The establishment of high-throughput technologies has brought substantial advances to our understanding of the biology of many diseases at the molecular level and increasing expectations on the development of innovative molecularly targeted treatments and molecular biomarkers or diagnostic tests in the context of clinical studies. In this review article, we position the two critical statistical analyses of high-dimensional genomic data, gene screening and prediction, in the framework of development and validation of genomic biomarkers or signatures, through taking into consideration the possible different strategies for developing genomic signatures. A wide variety of biomarker-based clinical trial designs to assess clinical utility of a biomarker or a new treatment with a companion biomarker are also discussed.

  8. Whole genome sequencing analyses of Listeria monocytogenes that persisted in a milkshake machine for a year and caused illnesses in Washington State

    OpenAIRE

    Li, Zhen; P?rez-Osorio, Ailyn; Wang, Yu; Eckmann, Kaye; Glover, William A.; Allard, Marc W.; Brown, Eric W.; Chen, Yi

    2017-01-01

    Background In 2015, in addition to a United States multistate outbreak linked to contaminated ice cream, another outbreak linked to ice cream was reported in the Pacific Northwest of the United States. It was a hospital-acquired outbreak linked to milkshakes, made from contaminated ice cream mixes and milkshake maker, served to patients. Here we performed multiple analyses on isolates associated with this outbreak: pulsed-field gel electrophoresis (PFGE), whole genome single nucleotide polymo...

  9. Comparative genomics reveals cotton-specific virulence factors in flexible genomic regions in Verticillium dahliae and evidence of horizontal gene transfer from Fusarium.

    Science.gov (United States)

    Chen, Jie-Yin; Liu, Chun; Gui, Yue-Jing; Si, Kai-Wei; Zhang, Dan-Dan; Wang, Jie; Short, Dylan P G; Huang, Jin-Qun; Li, Nan-Yang; Liang, Yong; Zhang, Wen-Qi; Yang, Lin; Ma, Xue-Feng; Li, Ting-Gang; Zhou, Lei; Wang, Bao-Li; Bao, Yu-Ming; Subbarao, Krishna V; Zhang, Geng-Yun; Dai, Xiao-Feng

    2018-01-01

    Verticillium dahliae isolates are most virulent on the host from which they were originally isolated. Mechanisms underlying these dominant host adaptations are currently unknown. We sequenced the genome of V. dahliae Vd991, which is highly virulent on its original host, cotton, and performed comparisons with the reference genomes of JR2 (from tomato) and VdLs.17 (from lettuce). Pathogenicity-related factor prediction, orthology and multigene family classification, transcriptome analyses, phylogenetic analyses, and pathogenicity experiments were performed. The Vd991 genome harbored several exclusive, lineage-specific (LS) genes within LS regions (LSRs). Deletion mutants of the seven genes within one LSR (G-LSR2) in Vd991 were less virulent only on cotton. Integration of G-LSR2 genes individually into JR2 and VdLs.17 resulted in significantly enhanced virulence on cotton but did not affect virulence on tomato or lettuce. Transcription levels of the seven LS genes in Vd991 were higher during the early stages of cotton infection, as compared with other hosts. Phylogenetic analyses suggested that G-LSR2 was acquired from Fusarium oxysporum f. sp. vasinfectum through horizontal gene transfer. Our results provide evidence that horizontal gene transfer from Fusarium to Vd991 contributed significantly to its adaptation to cotton and may represent a significant mechanism in the evolution of an asexual plant pathogen. © 2017 The Authors. New Phytologist © 2017 New Phytologist Trust.

  10. Whole genome sequencing reveals genomic heterogeneity and antibiotic purification in Mycobacterium tuberculosis isolates

    KAUST Repository

    Black, PA

    2015-10-24

    Background Whole genome sequencing has revolutionised the interrogation of mycobacterial genomes. Recent studies have reported conflicting findings on the genomic stability of Mycobacterium tuberculosis during the evolution of drug resistance. In an age where whole genome sequencing is increasingly relied upon for defining the structure of bacterial genomes, it is important to investigate the reliability of next generation sequencing to identify clonal variants present in a minor percentage of the population. This study aimed to define a reliable cut-off for identification of low frequency sequence variants and to subsequently investigate genetic heterogeneity and the evolution of drug resistance in M. tuberculosis. Methods Genomic DNA was isolated from single colonies from 14 rifampicin mono-resistant M. tuberculosis isolates, as well as the primary cultures and follow up MDR cultures from two of these patients. The whole genomes of the M. tuberculosis isolates were sequenced using either the Illumina MiSeq or Illumina HiSeq platforms. Sequences were analysed with an in-house pipeline. Results Using next-generation sequencing in combination with Sanger sequencing and statistical analysis we defined a read frequency cut-off of 30 % to identify low frequency M. tuberculosis variants with high confidence. Using this cut-off we demonstrated a high rate of genetic diversity between single colonies isolated from one population, showing that by using the current sequencing technology, single colonies are not a true reflection of the genetic diversity within a whole population and vice versa. We further showed that numerous heterogeneous variants emerge and then disappear during the evolution of isoniazid resistance within individual patients. Our findings allowed us to formulate a model for the selective bottleneck which occurs during the course of infection, acting as a genomic purification event. Conclusions Our study demonstrated true levels of genetic diversity

  11. Relation of genomic variants for Alzheimer disease dementia to common neuropathologies.

    Science.gov (United States)

    Farfel, Jose M; Yu, Lei; Buchman, Aron S; Schneider, Julie A; De Jager, Philip L; Bennett, David A

    2016-08-02

    To investigate the associations of previously reported Alzheimer disease (AD) dementia genomic variants with common neuropathologies. This is a postmortem study including 1,017 autopsied participants from 2 clinicopathologic cohorts. Analyses focused on 22 genomic variants associated with AD dementia in large-scale case-control genome-wide association study (GWAS) meta-analyses. The neuropathologic traits of interest were a pathologic diagnosis of AD according to NIA-Reagan criteria, macroscopic and microscopic infarcts, Lewy bodies (LB), and hippocampal sclerosis. For each variant, multiple logistic regression was used to investigate its association with neuropathologic traits, adjusting for age, sex, and subpopulation structure. We also conducted power analyses to estimate the sample sizes required to detect genome-wide significance (p dementia variants are not likely to be detected for association with pathologic AD with a sample size in excess of the largest GWAS meta-analyses of AD dementia. Many recently discovered genomic variants for AD dementia are not associated with the pathology of AD. Some genomic variants for AD dementia appear to be associated with other common neuropathologies. © 2016 American Academy of Neurology.

  12. Congruent Deep Relationships in the Grape Family (Vitaceae) Based on Sequences of Chloroplast Genomes and Mitochondrial Genes via Genome Skimming.

    Science.gov (United States)

    Zhang, Ning; Wen, Jun; Zimmer, Elizabeth A

    2015-01-01

    Vitaceae is well-known for having one of the most economically important fruits, i.e., the grape (Vitis vinifera). The deep phylogeny of the grape family was not resolved until a recent phylogenomic analysis of 417 nuclear genes from transcriptome data. However, it has been reported extensively that topologies based on nuclear and organellar genes may be incongruent due to differences in their evolutionary histories. Therefore, it is important to reconstruct a backbone phylogeny of the grape family using plastomes and mitochondrial genes. In this study,next-generation sequencing data sets of 27 species were obtained using genome skimming with total DNAs from silica-gel preserved tissue samples on an Illumina NextSeq 500 instrument [corrected]. Plastomes were assembled using the combination of de novo and reference genome (of V. vinifera) methods. Sixteen mitochondrial genes were also obtained via genome skimming using the reference genome of V. vinifera. Extensive phylogenetic analyses were performed using maximum likelihood and Bayesian methods. The topology based on either plastome data or mitochondrial genes is congruent with the one using hundreds of nuclear genes, indicating that the grape family did not exhibit significant reticulation at the deep level. The results showcase the power of genome skimming in capturing extensive phylogenetic data: especially from chloroplast and mitochondrial DNAs.

  13. Congruent Deep Relationships in the Grape Family (Vitaceae Based on Sequences of Chloroplast Genomes and Mitochondrial Genes via Genome Skimming.

    Directory of Open Access Journals (Sweden)

    Ning Zhang

    Full Text Available Vitaceae is well-known for having one of the most economically important fruits, i.e., the grape (Vitis vinifera. The deep phylogeny of the grape family was not resolved until a recent phylogenomic analysis of 417 nuclear genes from transcriptome data. However, it has been reported extensively that topologies based on nuclear and organellar genes may be incongruent due to differences in their evolutionary histories. Therefore, it is important to reconstruct a backbone phylogeny of the grape family using plastomes and mitochondrial genes. In this study,next-generation sequencing data sets of 27 species were obtained using genome skimming with total DNAs from silica-gel preserved tissue samples on an Illumina NextSeq 500 instrument [corrected]. Plastomes were assembled using the combination of de novo and reference genome (of V. vinifera methods. Sixteen mitochondrial genes were also obtained via genome skimming using the reference genome of V. vinifera. Extensive phylogenetic analyses were performed using maximum likelihood and Bayesian methods. The topology based on either plastome data or mitochondrial genes is congruent with the one using hundreds of nuclear genes, indicating that the grape family did not exhibit significant reticulation at the deep level. The results showcase the power of genome skimming in capturing extensive phylogenetic data: especially from chloroplast and mitochondrial DNAs.

  14. DMINDA: an integrated web server for DNA motif identification and analyses.

    Science.gov (United States)

    Ma, Qin; Zhang, Hanyuan; Mao, Xizeng; Zhou, Chuan; Liu, Bingqiang; Chen, Xin; Xu, Ying

    2014-07-01

    DMINDA (DNA motif identification and analyses) is an integrated web server for DNA motif identification and analyses, which is accessible at http://csbl.bmb.uga.edu/DMINDA/. This web site is freely available to all users and there is no login requirement. This server provides a suite of cis-regulatory motif analysis functions on DNA sequences, which are important to elucidation of the mechanisms of transcriptional regulation: (i) de novo motif finding for a given set of promoter sequences along with statistical scores for the predicted motifs derived based on information extracted from a control set, (ii) scanning motif instances of a query motif in provided genomic sequences, (iii) motif comparison and clustering of identified motifs, and (iv) co-occurrence analyses of query motifs in given promoter sequences. The server is powered by a backend computer cluster with over 150 computing nodes, and is particularly useful for motif prediction and analyses in prokaryotic genomes. We believe that DMINDA, as a new and comprehensive web server for cis-regulatory motif finding and analyses, will benefit the genomic research community in general and prokaryotic genome researchers in particular. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.

  15. MicroRNA-34a promotes genomic instability by a broad suppression of genome maintenance mechanisms downstream of the oncogene KSHV-vGPCR.

    Science.gov (United States)

    Krause, Claudia J; Popp, Oliver; Thirunarayanan, Nanthakumar; Dittmar, Gunnar; Lipp, Martin; Müller, Gerd

    2016-03-01

    The Kaposi's sarcoma-associated herpesvirus (KSHV)-encoded chemokine receptor vGPCR acts as an oncogene in Kaposi's sarcomagenesis. Until now, the molecular mechanisms by which the vGPCR contributes to tumor development remain incompletely understood. Here, we show that the KSHV-vGPCR contributes to tumor progression through microRNA (miR)-34a-mediated induction of genomic instability. Large-scale analyses on the DNA, gene and protein level of cell lines derived from a mouse model of vGPCR-driven tumorigenesis revealed that a vGPCR-induced upregulation of miR-34a resulted in a broad suppression of genome maintenance genes. A knockdown of either the vGPCR or miR-34a largely restored the expression of these genes and confirmed miR-34a as a downstream effector of the KSHV-vGPCR that compromises genome maintenance mechanisms. This novel, protumorigenic role of miR-34a questions the use of miR-34a mimetics in cancer therapy as they could impair genome stability.

  16. Acrylamide levels in Finnish foodstuffs analysed with liquid chromatography tandem mass spectrometry.

    Science.gov (United States)

    Eerola, Susanna; Hollebekkers, Koen; Hallikainen, Anja; Peltonen, Kimmo

    2007-02-01

    Sample clean-up and HPLC with tandem mass spectrometric detection (LC-MS/MS) was validated for the routine analysis of acrylamide in various foodstuffs. The method used proved to be reliable and the detection limit for routine monitoring was sensitive enough for foods and drinks (38 microg/kg for foods and 5 microg/L for drinks). The RSDs for repeatability and day-to-day variation were below 15% in all food matrices. Two hundred and one samples which included more than 30 different types of food and foods manufactured and prepared in various ways were analysed. The main types of food analysed were potato and cereal-based foods, processed foods (pizza, minced beef meat, meat balls, chicken nuggets, potato-ham casserole and fried bacon) and coffee. Acrylamide was detected at levels, ranging from nondetectable to 1480 microg/kg level in solid food, with crisp bread exhibiting the highest levels. In drinks, the highest value (29 microg/L) was found in regular coffee drinks.

  17. The Japanese Society of Pathology Guidelines on the handling of pathological tissue samples for genomic research: Standard operating procedures based on empirical analyses.

    Science.gov (United States)

    Kanai, Yae; Nishihara, Hiroshi; Miyagi, Yohei; Tsuruyama, Tatsuhiro; Taguchi, Kenichi; Katoh, Hiroto; Takeuchi, Tomoyo; Gotoh, Masahiro; Kuramoto, Junko; Arai, Eri; Ojima, Hidenori; Shibuya, Ayako; Yoshida, Teruhiko; Akahane, Toshiaki; Kasajima, Rika; Morita, Kei-Ichi; Inazawa, Johji; Sasaki, Takeshi; Fukayama, Masashi; Oda, Yoshinao

    2018-02-01

    Genome research using appropriately collected pathological tissue samples is expected to yield breakthroughs in the development of biomarkers and identification of therapeutic targets for diseases such as cancers. In this connection, the Japanese Society of Pathology (JSP) has developed "The JSP Guidelines on the Handling of Pathological Tissue Samples for Genomic Research" based on an abundance of data from empirical analyses of tissue samples collected and stored under various conditions. Tissue samples should be collected from appropriate sites within surgically resected specimens, without disturbing the features on which pathological diagnosis is based, while avoiding bleeding or necrotic foci. They should be collected as soon as possible after resection: at the latest within about 3 h of storage at 4°C. Preferably, snap-frozen samples should be stored in liquid nitrogen (about -180°C) until use. When intending to use genomic DNA extracted from formalin-fixed paraffin-embedded tissue, 10% neutral buffered formalin should be used. Insufficient fixation and overfixation must both be avoided. We hope that pathologists, clinicians, clinical laboratory technicians and biobank operators will come to master the handling of pathological tissue samples based on the standard operating procedures in these Guidelines to yield results that will assist in the realization of genomic medicine. © 2018 The Authors. Pathology International published by Japanese Society of Pathology and John Wiley & Sons Australia, Ltd.

  18. Pan-Genome Analysis of Human Gastric Pathogen H. pylori: Comparative Genomics and Pathogenomics Approaches to Identify Regions Associated with Pathogenicity and Prediction of Potential Core Therapeutic Targets

    DEFF Research Database (Denmark)

    Ali, Amjad; Naz, Anam; Soares, Siomar C.

    2015-01-01

    -genome approach; the predicted conserved gene families (1,193) constitute similar to 77% of the average H. pylori genome and 45% of the global gene repertoire of the species. Reverse vaccinology strategies have been adopted to identify and narrow down the potential core-immunogenic candidates. Total of 28 nonhost....... Pan-genome analyses of the global representative H. pylori isolates consisting of 39 complete genomes are presented in this paper. Phylogenetic analyses have revealed close relationships among geographically diverse strains of H. pylori. The conservation among these genomes was further analyzed by pan...

  19. Inversion variants in human and primate genomes.

    Science.gov (United States)

    Catacchio, Claudia Rita; Maggiolini, Flavia Angela Maria; D'Addabbo, Pietro; Bitonto, Miriana; Capozzi, Oronzo; Signorile, Martina Lepore; Miroballo, Mattia; Archidiacono, Nicoletta; Eichler, Evan E; Ventura, Mario; Antonacci, Francesca

    2018-05-18

    For many years, inversions have been proposed to be a direct driving force in speciation since they suppress recombination when heterozygous. Inversions are the most common large-scale differences among humans and great apes. Nevertheless, they represent large events easily distinguishable by classical cytogenetics, whose resolution, however, is limited. Here, we performed a genome-wide comparison between human, great ape, and macaque genomes using the net alignments for the most recent releases of genome assemblies. We identified a total of 156 putative inversions, between 103 kb and 91 Mb, corresponding to 136 human loci. Combining literature, sequence, and experimental analyses, we analyzed 109 of these loci and found 67 regions inverted in one or multiple primates, including 28 newly identified inversions. These events overlap with 81 human genes at their breakpoints, and seven correspond to sites of recurrent rearrangements associated with human disease. This work doubles the number of validated primate inversions larger than 100 kb, beyond what was previously documented. We identified 74 sites of errors, where the sequence has been assembled in the wrong orientation, in the reference genomes analyzed. Our data serve two purposes: First, we generated a map of evolutionary inversions in these genomes representing a resource for interrogating differences among these species at a functional level; second, we provide a list of misassembled regions in these primate genomes, involving over 300 Mb of DNA and 1978 human genes. Accurately annotating these regions in the genome references has immediate applications for evolutionary and biomedical studies on primates. © 2018 Catacchio et al.; Published by Cold Spring Harbor Laboratory Press.

  20. Yeast genome sequencing:

    DEFF Research Database (Denmark)

    Piskur, Jure; Langkjær, Rikke Breinhold

    2004-01-01

    For decades, unicellular yeasts have been general models to help understand the eukaryotic cell and also our own biology. Recently, over a dozen yeast genomes have been sequenced, providing the basis to resolve several complex biological questions. Analysis of the novel sequence data has shown...... of closely related species helps in gene annotation and to answer how many genes there really are within the genomes. Analysis of non-coding regions among closely related species has provided an example of how to determine novel gene regulatory sequences, which were previously difficult to analyse because...... they are short and degenerate and occupy different positions. Comparative genomics helps to understand the origin of yeasts and points out crucial molecular events in yeast evolutionary history, such as whole-genome duplication and horizontal gene transfer(s). In addition, the accumulating sequence data provide...

  1. Genome-Wide Search Identifies 1.9 Mb from the Polar Bear Y Chromosome for Evolutionary Analyses

    Science.gov (United States)

    Bidon, Tobias; Schreck, Nancy; Hailer, Frank; Nilsson, Maria A.; Janke, Axel

    2015-01-01

    The male-inherited Y chromosome is the major haploid fraction of the mammalian genome, rendering Y-linked sequences an indispensable resource for evolutionary research. However, despite recent large-scale genome sequencing approaches, only a handful of Y chromosome sequences have been characterized to date, mainly in model organisms. Using polar bear (Ursus maritimus) genomes, we compare two different in silico approaches to identify Y-linked sequences: 1) Similarity to known Y-linked genes and 2) difference in the average read depth of autosomal versus sex chromosomal scaffolds. Specifically, we mapped available genomic sequencing short reads from a male and a female polar bear against the reference genome and identify 112 Y-chromosomal scaffolds with a combined length of 1.9 Mb. We verified the in silico findings for the longer polar bear scaffolds by male-specific in vitro amplification, demonstrating the reliability of the average read depth approach. The obtained Y chromosome sequences contain protein-coding sequences, single nucleotide polymorphisms, microsatellites, and transposable elements that are useful for evolutionary studies. A high-resolution phylogeny of the polar bear patriline shows two highly divergent Y chromosome lineages, obtained from analysis of the identified Y scaffolds in 12 previously published male polar bear genomes. Moreover, we find evidence of gene conversion among ZFX and ZFY sequences in the giant panda lineage and in the ancestor of ursine and tremarctine bears. Thus, the identification of Y-linked scaffold sequences from unordered genome sequences yields valuable data to infer phylogenomic and population-genomic patterns in bears. PMID:26019166

  2. Genomic signatures of diet-related shifts during human origins.

    Science.gov (United States)

    Babbitt, Courtney C; Warner, Lisa R; Fedrigo, Olivier; Wall, Christine E; Wray, Gregory A

    2011-04-07

    There are numerous anthropological analyses concerning the importance of diet during human evolution. Diet is thought to have had a profound influence on the human phenotype, and dietary differences have been hypothesized to contribute to the dramatic morphological changes seen in modern humans as compared with non-human primates. Here, we attempt to integrate the results of new genomic studies within this well-developed anthropological context. We then review the current evidence for adaptation related to diet, both at the level of sequence changes and gene expression. Finally, we propose some ways in which new technologies can help identify specific genomic adaptations that have resulted in metabolic and morphological differences between humans and non-human primates.

  3. Community genomics among stratified microbial assemblages in the ocean's interior

    DEFF Research Database (Denmark)

    DeLong, Edward F; Preston, Christina M; Mincer, Tracy

    2006-01-01

    Microbial life predominates in the ocean, yet little is known about its genomic variability, especially along the depth continuum. We report here genomic analyses of planktonic microbial communities in the North Pacific Subtropical Gyre, from the ocean's surface to near-sea floor depths. Sequence......, and host-viral interactions. Comparative genomic analyses of stratified microbial communities have the potential to provide significant insight into higher-order community organization and dynamics....

  4. Examination of triacylglycerol biosynthetic pathways via de novo transcriptomic and proteomic analyses in an unsequenced microalga.

    Directory of Open Access Journals (Sweden)

    Michael T Guarnieri

    Full Text Available Biofuels derived from algal lipids represent an opportunity to dramatically impact the global energy demand for transportation fuels. Systems biology analyses of oleaginous algae could greatly accelerate the commercialization of algal-derived biofuels by elucidating the key components involved in lipid productivity and leading to the initiation of hypothesis-driven strain-improvement strategies. However, higher-level systems biology analyses, such as transcriptomics and proteomics, are highly dependent upon available genomic sequence data, and the lack of these data has hindered the pursuit of such analyses for many oleaginous microalgae. In order to examine the triacylglycerol biosynthetic pathway in the unsequenced oleaginous microalga, Chlorella vulgaris, we have established a strategy with which to bypass the necessity for genomic sequence information by using the transcriptome as a guide. Our results indicate an upregulation of both fatty acid and triacylglycerol biosynthetic machinery under oil-accumulating conditions, and demonstrate the utility of a de novo assembled transcriptome as a search model for proteomic analysis of an unsequenced microalga.

  5. On the road to quantitative genetic/genomic analyses of root growth and development components underlying root architecture

    International Nuclear Information System (INIS)

    Draye, X.; Dorlodot, S. de; Lavigne, T.

    2006-01-01

    The quantitative genetic and functional genomic analyses of root development, growth and plasticity will be instrumental in revealing the major regulatory pathways of root architecture. Such knowledge, combined with in-depth consideration of root physiology (e.g. uptake, exsudation), form (space-time dynamics of soil exploration) and ecology (including root environment), will settle the bases for designing root ideotypes for specific environments, for low-input agriculture or for successful agricultural production with minimal impact on the environment. This report summarizes root research initiated in our lab between 2000 and 2004 in the following areas: quantitative analysis of root branching in bananas, high throughput characterisation of root morphology, image analysis, QTL mapping of detailed features of root architecture in rice, and attempts to settle a Crop Root Research Consortium. (author)

  6. Genome based analyses of six hexacorallian species reject the “naked coral” hypothesis

    KAUST Repository

    Wang, Xin; Drillon, Gué nola; Ryu, Taewoo; Voolstra, Christian R.; Aranda, Manuel

    2017-01-01

    Scleractinian corals are the foundation species of the coral-reef ecosystem. Their calcium carbonate skeletons form extensive structures that are home to millions of species, making coral reefs one of the most diverse ecosystems of our planet. However, our understanding of how reef-building corals have evolved the ability to calcify and become the ecosystem builders they are today is hampered by uncertain relationships within their subclass Hexacorallia. Corallimorpharians have been proposed to originate from a complex scleractinian ancestor that lost the ability to calcify in response to increasing ocean acidification, suggesting the possibility for corals to lose and gain the ability to calcify in response to increasing ocean acidification. Here we employed a phylogenomic approach using whole-genome data from six hexacorallian species to resolve the evolutionary relationship between reef-building corals and their non-calcifying relatives. Phylogenetic analysis based on 1,421 single-copy orthologs, as well as gene presence/absence and synteny information, converged on the same topologies, showing strong support for scleractinian monophyly and a corallimorpharian sister clade. Our broad phylogenomic approach using sequence-based and sequence-independent analyses provides unambiguous evidence for the monophyly of scleractinian corals and the rejection of corallimorpharians as descendants of a complex coral ancestor.

  7. Genome based analyses of six hexacorallian species reject the “naked coral” hypothesis

    KAUST Repository

    Wang, Xin

    2017-09-23

    Scleractinian corals are the foundation species of the coral-reef ecosystem. Their calcium carbonate skeletons form extensive structures that are home to millions of species, making coral reefs one of the most diverse ecosystems of our planet. However, our understanding of how reef-building corals have evolved the ability to calcify and become the ecosystem builders they are today is hampered by uncertain relationships within their subclass Hexacorallia. Corallimorpharians have been proposed to originate from a complex scleractinian ancestor that lost the ability to calcify in response to increasing ocean acidification, suggesting the possibility for corals to lose and gain the ability to calcify in response to increasing ocean acidification. Here we employed a phylogenomic approach using whole-genome data from six hexacorallian species to resolve the evolutionary relationship between reef-building corals and their non-calcifying relatives. Phylogenetic analysis based on 1,421 single-copy orthologs, as well as gene presence/absence and synteny information, converged on the same topologies, showing strong support for scleractinian monophyly and a corallimorpharian sister clade. Our broad phylogenomic approach using sequence-based and sequence-independent analyses provides unambiguous evidence for the monophyly of scleractinian corals and the rejection of corallimorpharians as descendants of a complex coral ancestor.

  8. Microarray Expression Analyses of Arabidopsis Guard Cells and Isolation of a Recessive Abscisic Acid Hypersensitive Protein Phosphatase 2C MutantW⃞

    Science.gov (United States)

    Leonhardt, Nathalie; Kwak, June M.; Robert, Nadia; Waner, David; Leonhardt, Guillaume; Schroeder, Julian I.

    2004-01-01

    Oligomer-based DNA Affymetrix GeneChips representing about one-third of Arabidopsis (Arabidopsis thaliana) genes were used to profile global gene expression in a single cell type, guard cells, identifying 1309 guard cell–expressed genes. Highly pure preparations of guard cells and mesophyll cells were isolated in the presence of transcription inhibitors that prevented induction of stress-inducible genes during cell isolation procedures. Guard cell expression profiles were compared with those of mesophyll cells, resulting in identification of 64 transcripts expressed preferentially in guard cells. Many large gene families and gene duplications are known to exist in the Arabidopsis genome, giving rise to redundancies that greatly hamper conventional genetic and functional genomic analyses. The presented genomic scale analysis identifies redundant expression of specific isoforms belonging to large gene families at the single cell level, which provides a powerful tool for functional genomic characterization of the many signaling pathways that function in guard cells. Reverse transcription–PCR of 29 genes confirmed the reliability of GeneChip results. Statistical analyses of promoter regions of abscisic acid (ABA)–regulated genes reveal an overrepresented ABA responsive motif, which is the known ABA response element. Interestingly, expression profiling reveals ABA modulation of many known guard cell ABA signaling components at the transcript level. We further identified a highly ABA-induced protein phosphatase 2C transcript, AtP2C-HA, in guard cells. A T-DNA disruption mutation in AtP2C-HA confers ABA-hypersensitive regulation of stomatal closing and seed germination. The presented data provide a basis for cell type–specific genomic scale analyses of gene function. PMID:14973164

  9. Identification of Ohnolog Genes Originating from Whole Genome Duplication in Early Vertebrates, Based on Synteny Comparison across Multiple Genomes.

    Science.gov (United States)

    Singh, Param Priya; Arora, Jatin; Isambert, Hervé

    2015-07-01

    Whole genome duplications (WGD) have now been firmly established in all major eukaryotic kingdoms. In particular, all vertebrates descend from two rounds of WGDs, that occurred in their jawless ancestor some 500 MY ago. Paralogs retained from WGD, also coined 'ohnologs' after Susumu Ohno, have been shown to be typically associated with development, signaling and gene regulation. Ohnologs, which amount to about 20 to 35% of genes in the human genome, have also been shown to be prone to dominant deleterious mutations and frequently implicated in cancer and genetic diseases. Hence, identifying ohnologs is central to better understand the evolution of vertebrates and their susceptibility to genetic diseases. Early computational analyses to identify vertebrate ohnologs relied on content-based synteny comparisons between the human genome and a single invertebrate outgroup genome or within the human genome itself. These approaches are thus limited by lineage specific rearrangements in individual genomes. We report, in this study, the identification of vertebrate ohnologs based on the quantitative assessment and integration of synteny conservation between six amniote vertebrates and six invertebrate outgroups. Such a synteny comparison across multiple genomes is shown to enhance the statistical power of ohnolog identification in vertebrates compared to earlier approaches, by overcoming lineage specific genome rearrangements. Ohnolog gene families can be browsed and downloaded for three statistical confidence levels or recompiled for specific, user-defined, significance criteria at http://ohnologs.curie.fr/. In the light of the importance of WGD on the genetic makeup of vertebrates, our analysis provides a useful resource for researchers interested in gaining further insights on vertebrate evolution and genetic diseases.

  10. Development of a Rhizoctonia solani AG1-IB Specific Gene Model Enables Comparative Genome Analyses between Phytopathogenic R. solani AG1-IA, AG1-IB, AG3 and AG8 Isolates.

    Directory of Open Access Journals (Sweden)

    Daniel Wibberg

    Full Text Available Rhizoctonia solani, a soil-born plant pathogenic basidiomycetous fungus, affects various economically important agricultural and horticultural crops. The draft genome sequence for the R. solani AG1-IB isolate 7/3/14 as well as a corresponding transcriptome dataset (Expressed Sequence Tags--ESTs were established previously. Development of a specific R. solani AG1-IB gene model based on GMAP transcript mapping within the eukaryotic gene prediction platform AUGUSTUS allowed detection of new genes and provided insights into the gene structure of this fungus. In total, 12,616 genes were recognized in the genome of the AG1-IB isolate. Analysis of predicted genes by means of different bioinformatics tools revealed new genes whose products potentially are involved in degradation of plant cell wall components, melanin formation and synthesis of secondary metabolites. Comparative genome analyses between members of different R. solani anastomosis groups, namely AG1-IA, AG3 and AG8 and the newly annotated R. solani AG1-IB genome were performed within the comparative genomics platform EDGAR. It appeared that only 21 to 28% of all genes encoded in the draft genomes of the different strains were identified as core genes. Based on Average Nucleotide Identity (ANI and Average Amino-acid Identity (AAI analyses, considerable sequence differences between isolates representing different anastomosis groups were identified. However, R. solani isolates form a distinct cluster in relation to other fungi of the phylum Basidiomycota. The isolate representing AG1-IB encodes significant more genes featuring predictable functions in secondary metabolite production compared to other completely sequenced R. solani strains. The newly established R. solani AG1-IB 7/3/14 gene layout now provides a reliable basis for post-genomics studies.

  11. Implementing Genome-Driven Oncology

    Science.gov (United States)

    Hyman, David M.; Taylor, Barry S.; Baselga, José

    2017-01-01

    Early successes in identifying and targeting individual oncogenic drivers, together with the increasing feasibility of sequencing tumor genomes, have brought forth the promise of genome-driven oncology care. As we expand the breadth and depth of genomic analyses, the biological and clinical complexity of its implementation will be unparalleled. Challenges include target credentialing and validation, implementing drug combinations, clinical trial designs, targeting tumor heterogeneity, and deploying technologies beyond DNA sequencing, among others. We review how contemporary approaches are tackling these challenges and will ultimately serve as an engine for biological discovery and increase our insight into cancer and its treatment. PMID:28187282

  12. Population Genomics of Paramecium Species.

    Science.gov (United States)

    Johri, Parul; Krenek, Sascha; Marinov, Georgi K; Doak, Thomas G; Berendonk, Thomas U; Lynch, Michael

    2017-05-01

    Population-genomic analyses are essential to understanding factors shaping genomic variation and lineage-specific sequence constraints. The dearth of such analyses for unicellular eukaryotes prompted us to assess genomic variation in Paramecium, one of the most well-studied ciliate genera. The Paramecium aurelia complex consists of ∼15 morphologically indistinguishable species that diverged subsequent to two rounds of whole-genome duplications (WGDs, as long as 320 MYA) and possess extremely streamlined genomes. We examine patterns of both nuclear and mitochondrial polymorphism, by sequencing whole genomes of 10-13 worldwide isolates of each of three species belonging to the P. aurelia complex: P. tetraurelia, P. biaurelia, P. sexaurelia, as well as two outgroup species that do not share the WGDs: P. caudatum and P. multimicronucleatum. An apparent absence of global geographic population structure suggests continuous or recent dispersal of Paramecium over long distances. Intergenic regions are highly constrained relative to coding sequences, especially in P. caudatum and P. multimicronucleatum that have shorter intergenic distances. Sequence diversity and divergence are reduced up to ∼100-150 bp both upstream and downstream of genes, suggesting strong constraints imposed by the presence of densely packed regulatory modules. In addition, comparison of sequence variation at non-synonymous and synonymous sites suggests similar recent selective pressures on paralogs within and orthologs across the deeply diverging species. This study presents the first genome-wide population-genomic analysis in ciliates and provides a valuable resource for future studies in evolutionary and functional genetics in Paramecium. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  13. Phylogenomic, Pan-genomic, Pathogenomic and Evolutionary Genomic Insights into the Agronomically Relevant Enterobacteria Pantoea ananatis and Pantoea stewartii

    Directory of Open Access Journals (Sweden)

    Pieter De Maayer

    2017-09-01

    Full Text Available Pantoea ananatis is ubiquitously found in the environment and causes disease on a wide range of plant hosts. By contrast, its sister species, Pantoea stewartii subsp. stewartii is the host-specific causative agent of the devastating maize disease Stewart’s wilt. This pathogen has a restricted lifecycle, overwintering in an insect vector before being introduced into susceptible maize cultivars, causing disease and returning to overwinter in its vector. The other subspecies of P. stewartii subsp. indologenes, has been isolated from different plant hosts and is predicted to proliferate in different environmental niches. Here we have, by the use of comparative genomics and a comprehensive suite of bioinformatic tools, analyzed the genomes of ten P. stewartii and nineteen P. ananatis strains. Our phylogenomic analyses have revealed that there are two distinct clades within P. ananatis while far less phylogenetic diversity was observed among the P. stewartii subspecies. Pan-genome analyses revealed a large core genome comprising of 3,571 protein coding sequences is shared among the twenty-nine compared strains. Furthermore, we showed that an extensive accessory genome made up largely by a mobilome of plasmids, integrated prophages, integrative and conjugative elements and insertion elements has resulted in extensive diversification of P. stewartii and P. ananatis. While these organisms share many pathogenicity determinants, our comparative genomic analyses show that they differ in terms of the secretion systems they encode. The genomic differences identified in this study have allowed us to postulate on the divergent evolutionary histories of the analyzed P. ananatis and P. stewartii strains and on the molecular basis underlying their ecological success and host range.

  14. Phylogenomic, Pan-genomic, Pathogenomic and Evolutionary Genomic Insights into the Agronomically Relevant Enterobacteria Pantoea ananatis and Pantoea stewartii.

    Science.gov (United States)

    De Maayer, Pieter; Aliyu, Habibu; Vikram, Surendra; Blom, Jochen; Duffy, Brion; Cowan, Don A; Smits, Theo H M; Venter, Stephanus N; Coutinho, Teresa A

    2017-01-01

    Pantoea ananatis is ubiquitously found in the environment and causes disease on a wide range of plant hosts. By contrast, its sister species, Pantoea stewartii subsp. stewartii is the host-specific causative agent of the devastating maize disease Stewart's wilt. This pathogen has a restricted lifecycle, overwintering in an insect vector before being introduced into susceptible maize cultivars, causing disease and returning to overwinter in its vector. The other subspecies of P. stewartii subsp. indologenes , has been isolated from different plant hosts and is predicted to proliferate in different environmental niches. Here we have, by the use of comparative genomics and a comprehensive suite of bioinformatic tools, analyzed the genomes of ten P. stewartii and nineteen P. ananatis strains. Our phylogenomic analyses have revealed that there are two distinct clades within P. ananatis while far less phylogenetic diversity was observed among the P. stewartii subspecies. Pan-genome analyses revealed a large core genome comprising of 3,571 protein coding sequences is shared among the twenty-nine compared strains. Furthermore, we showed that an extensive accessory genome made up largely by a mobilome of plasmids, integrated prophages, integrative and conjugative elements and insertion elements has resulted in extensive diversification of P. stewartii and P. ananatis . While these organisms share many pathogenicity determinants, our comparative genomic analyses show that they differ in terms of the secretion systems they encode. The genomic differences identified in this study have allowed us to postulate on the divergent evolutionary histories of the analyzed P. ananatis and P. stewartii strains and on the molecular basis underlying their ecological success and host range.

  15. Genome-Wide Search Identifies 1.9 Mb from the Polar Bear Y Chromosome for Evolutionary Analyses.

    Science.gov (United States)

    Bidon, Tobias; Schreck, Nancy; Hailer, Frank; Nilsson, Maria A; Janke, Axel

    2015-05-27

    The male-inherited Y chromosome is the major haploid fraction of the mammalian genome, rendering Y-linked sequences an indispensable resource for evolutionary research. However, despite recent large-scale genome sequencing approaches, only a handful of Y chromosome sequences have been characterized to date, mainly in model organisms. Using polar bear (Ursus maritimus) genomes, we compare two different in silico approaches to identify Y-linked sequences: 1) Similarity to known Y-linked genes and 2) difference in the average read depth of autosomal versus sex chromosomal scaffolds. Specifically, we mapped available genomic sequencing short reads from a male and a female polar bear against the reference genome and identify 112 Y-chromosomal scaffolds with a combined length of 1.9 Mb. We verified the in silico findings for the longer polar bear scaffolds by male-specific in vitro amplification, demonstrating the reliability of the average read depth approach. The obtained Y chromosome sequences contain protein-coding sequences, single nucleotide polymorphisms, microsatellites, and transposable elements that are useful for evolutionary studies. A high-resolution phylogeny of the polar bear patriline shows two highly divergent Y chromosome lineages, obtained from analysis of the identified Y scaffolds in 12 previously published male polar bear genomes. Moreover, we find evidence of gene conversion among ZFX and ZFY sequences in the giant panda lineage and in the ancestor of ursine and tremarctine bears. Thus, the identification of Y-linked scaffold sequences from unordered genome sequences yields valuable data to infer phylogenomic and population-genomic patterns in bears. © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  16. Improved genome recovery and integrated cell-size analyses of individual uncultured microbial cells and viral particles.

    Science.gov (United States)

    Stepanauskas, Ramunas; Fergusson, Elizabeth A; Brown, Joseph; Poulton, Nicole J; Tupper, Ben; Labonté, Jessica M; Becraft, Eric D; Brown, Julia M; Pachiadaki, Maria G; Povilaitis, Tadas; Thompson, Brian P; Mascena, Corianna J; Bellows, Wendy K; Lubys, Arvydas

    2017-07-20

    Microbial single-cell genomics can be used to provide insights into the metabolic potential, interactions, and evolution of uncultured microorganisms. Here we present WGA-X, a method based on multiple displacement amplification of DNA that utilizes a thermostable mutant of the phi29 polymerase. WGA-X enhances genome recovery from individual microbial cells and viral particles while maintaining ease of use and scalability. The greatest improvements are observed when amplifying high G+C content templates, such as those belonging to the predominant bacteria in agricultural soils. By integrating WGA-X with calibrated index-cell sorting and high-throughput genomic sequencing, we are able to analyze genomic sequences and cell sizes of hundreds of individual, uncultured bacteria, archaea, protists, and viral particles, obtained directly from marine and soil samples, in a single experiment. This approach may find diverse applications in microbiology and in biomedical and forensic studies of humans and other multicellular organisms.Single-cell genomics can be used to study uncultured microorganisms. Here, Stepanauskas et al. present a method combining improved multiple displacement amplification and FACS, to obtain genomic sequences and cell size information from uncultivated microbial cells and viral particles in environmental samples.

  17. Whole-genome analyses resolve early branches in the tree of life of modern birds

    DEFF Research Database (Denmark)

    Sicheritz-Pontén, Thomas; Li, Cai; Li, Bo

    2014-01-01

    To better determine the history of modern birds, we performed a genome-scale phylogenetic analysis of 48 species representing all orders of Neoaves using phylogenomic methods created to handle genome-scale data. We recovered a highly resolved tree that confirms previously controversial sister...... or close relationships. We identified the first divergence in Neoaves, two groups we named Passerea and Columbea, representing independent lineages of diverse and convergently evolved land and water bird species. Among Passerea, we infer the common ancestor of core landbirds to have been an apex predator...... and confirm independent gains of vocal learning. Among Columbea, we identify pigeons and flamingoes as belonging to sister clades. Even with whole genomes, some of the earliest branches in Neoaves proved challenging to resolve, which was best explained by massive protein-coding sequence convergence and high...

  18. Evolution and Diversity of Transposable Elements in Vertebrate Genomes.

    Science.gov (United States)

    Sotero-Caio, Cibele G; Platt, Roy N; Suh, Alexander; Ray, David A

    2017-01-01

    Transposable elements (TEs) are selfish genetic elements that mobilize in genomes via transposition or retrotransposition and often make up large fractions of vertebrate genomes. Here, we review the current understanding of vertebrate TE diversity and evolution in the context of recent advances in genome sequencing and assembly techniques. TEs make up 4-60% of assembled vertebrate genomes, and deeply branching lineages such as ray-finned fishes and amphibians generally exhibit a higher TE diversity than the more recent radiations of birds and mammals. Furthermore, the list of taxa with exceptional TE landscapes is growing. We emphasize that the current bottleneck in genome analyses lies in the proper annotation of TEs and provide examples where superficial analyses led to misleading conclusions about genome evolution. Finally, recent advances in long-read sequencing will soon permit access to TE-rich genomic regions that previously resisted assembly including the gigantic, TE-rich genomes of salamanders and lungfishes. © The Author(s) 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  19. Large-scale association analyses identify new loci influencing glycemic traits and provide insight into the underlying biological pathways

    Science.gov (United States)

    Scott, Robert A; Lagou, Vasiliki; Welch, Ryan P; Wheeler, Eleanor; Montasser, May E; Luan, Jian’an; Mägi, Reedik; Strawbridge, Rona J; Rehnberg, Emil; Gustafsson, Stefan; Kanoni, Stavroula; Rasmussen-Torvik, Laura J; Yengo, Loïc; Lecoeur, Cecile; Shungin, Dmitry; Sanna, Serena; Sidore, Carlo; Johnson, Paul C D; Jukema, J Wouter; Johnson, Toby; Mahajan, Anubha; Verweij, Niek; Thorleifsson, Gudmar; Hottenga, Jouke-Jan; Shah, Sonia; Smith, Albert V; Sennblad, Bengt; Gieger, Christian; Salo, Perttu; Perola, Markus; Timpson, Nicholas J; Evans, David M; Pourcain, Beate St; Wu, Ying; Andrews, Jeanette S; Hui, Jennie; Bielak, Lawrence F; Zhao, Wei; Horikoshi, Momoko; Navarro, Pau; Isaacs, Aaron; O’Connell, Jeffrey R; Stirrups, Kathleen; Vitart, Veronique; Hayward, Caroline; Esko, Tönu; Mihailov, Evelin; Fraser, Ross M; Fall, Tove; Voight, Benjamin F; Raychaudhuri, Soumya; Chen, Han; Lindgren, Cecilia M; Morris, Andrew P; Rayner, Nigel W; Robertson, Neil; Rybin, Denis; Liu, Ching-Ti; Beckmann, Jacques S; Willems, Sara M; Chines, Peter S; Jackson, Anne U; Kang, Hyun Min; Stringham, Heather M; Song, Kijoung; Tanaka, Toshiko; Peden, John F; Goel, Anuj; Hicks, Andrew A; An, Ping; Müller-Nurasyid, Martina; Franco-Cereceda, Anders; Folkersen, Lasse; Marullo, Letizia; Jansen, Hanneke; Oldehinkel, Albertine J; Bruinenberg, Marcel; Pankow, James S; North, Kari E; Forouhi, Nita G; Loos, Ruth J F; Edkins, Sarah; Varga, Tibor V; Hallmans, Göran; Oksa, Heikki; Antonella, Mulas; Nagaraja, Ramaiah; Trompet, Stella; Ford, Ian; Bakker, Stephan J L; Kong, Augustine; Kumari, Meena; Gigante, Bruna; Herder, Christian; Munroe, Patricia B; Caulfield, Mark; Antti, Jula; Mangino, Massimo; Small, Kerrin; Miljkovic, Iva; Liu, Yongmei; Atalay, Mustafa; Kiess, Wieland; James, Alan L; Rivadeneira, Fernando; Uitterlinden, Andre G; Palmer, Colin N A; Doney, Alex S F; Willemsen, Gonneke; Smit, Johannes H; Campbell, Susan; Polasek, Ozren; Bonnycastle, Lori L; Hercberg, Serge; Dimitriou, Maria; Bolton, Jennifer L; Fowkes, Gerard R; Kovacs, Peter; Lindström, Jaana; Zemunik, Tatijana; Bandinelli, Stefania; Wild, Sarah H; Basart, Hanneke V; Rathmann, Wolfgang; Grallert, Harald; Maerz, Winfried; Kleber, Marcus E; Boehm, Bernhard O; Peters, Annette; Pramstaller, Peter P; Province, Michael A; Borecki, Ingrid B; Hastie, Nicholas D; Rudan, Igor; Campbell, Harry; Watkins, Hugh; Farrall, Martin; Stumvoll, Michael; Ferrucci, Luigi; Waterworth, Dawn M; Bergman, Richard N; Collins, Francis S; Tuomilehto, Jaakko; Watanabe, Richard M; de Geus, Eco J C; Penninx, Brenda W; Hofman, Albert; Oostra, Ben A; Psaty, Bruce M; Vollenweider, Peter; Wilson, James F; Wright, Alan F; Hovingh, G Kees; Metspalu, Andres; Uusitupa, Matti; Magnusson, Patrik K E; Kyvik, Kirsten O; Kaprio, Jaakko; Price, Jackie F; Dedoussis, George V; Deloukas, Panos; Meneton, Pierre; Lind, Lars; Boehnke, Michael; Shuldiner, Alan R; van Duijn, Cornelia M; Morris, Andrew D; Toenjes, Anke; Peyser, Patricia A; Beilby, John P; Körner, Antje; Kuusisto, Johanna; Laakso, Markku; Bornstein, Stefan R; Schwarz, Peter E H; Lakka, Timo A; Rauramaa, Rainer; Adair, Linda S; Smith, George Davey; Spector, Tim D; Illig, Thomas; de Faire, Ulf; Hamsten, Anders; Gudnason, Vilmundur; Kivimaki, Mika; Hingorani, Aroon; Keinanen-Kiukaanniemi, Sirkka M; Saaristo, Timo E; Boomsma, Dorret I; Stefansson, Kari; van der Harst, Pim; Dupuis, Josée; Pedersen, Nancy L; Sattar, Naveed; Harris, Tamara B; Cucca, Francesco; Ripatti, Samuli; Salomaa, Veikko; Mohlke, Karen L; Balkau, Beverley; Froguel, Philippe; Pouta, Anneli; Jarvelin, Marjo-Riitta; Wareham, Nicholas J; Bouatia-Naji, Nabila; McCarthy, Mark I; Franks, Paul W; Meigs, James B; Teslovich, Tanya M; Florez, Jose C; Langenberg, Claudia; Ingelsson, Erik; Prokopenko, Inga; Barroso, Inês

    2012-01-01

    Through genome-wide association meta-analyses of up to 133,010 individuals of European ancestry without diabetes, including individuals newly genotyped using the Metabochip, we have raised the number of confirmed loci influencing glycemic traits to 53, of which 33 also increase type 2 diabetes risk (q fasting insulin showed association with lipid levels and fat distribution, suggesting impact on insulin resistance. Gene-based analyses identified further biologically plausible loci, suggesting that additional loci beyond those reaching genome-wide significance are likely to represent real associations. This conclusion is supported by an excess of directionally consistent and nominally significant signals between discovery and follow-up studies. Functional follow-up of these newly discovered loci will further improve our understanding of glycemic control. PMID:22885924

  20. Protecting genomic data analytics in the cloud: state of the art and opportunities.

    Science.gov (United States)

    Tang, Haixu; Jiang, Xiaoqian; Wang, Xiaofeng; Wang, Shuang; Sofia, Heidi; Fox, Dov; Lauter, Kristin; Malin, Bradley; Telenti, Amalio; Xiong, Li; Ohno-Machado, Lucila

    2016-10-13

    The outsourcing of genomic data into public cloud computing settings raises concerns over privacy and security. Significant advancements in secure computation methods have emerged over the past several years, but such techniques need to be rigorously evaluated for their ability to support the analysis of human genomic data in an efficient and cost-effective manner. With respect to public cloud environments, there are concerns about the inadvertent exposure of human genomic data to unauthorized users. In analyses involving multiple institutions, there is additional concern about data being used beyond agreed research scope and being prcoessed in untrused computational environments, which may not satisfy institutional policies. To systematically investigate these issues, the NIH-funded National Center for Biomedical Computing iDASH (integrating Data for Analysis, 'anonymization' and SHaring) hosted the second Critical Assessment of Data Privacy and Protection competition to assess the capacity of cryptographic technologies for protecting computation over human genomes in the cloud and promoting cross-institutional collaboration. Data scientists were challenged to design and engineer practical algorithms for secure outsourcing of genome computation tasks in working software, whereby analyses are performed only on encrypted data. They were also challenged to develop approaches to enable secure collaboration on data from genomic studies generated by multiple organizations (e.g., medical centers) to jointly compute aggregate statistics without sharing individual-level records. The results of the competition indicated that secure computation techniques can enable comparative analysis of human genomes, but greater efficiency (in terms of compute time and memory utilization) are needed before they are sufficiently practical for real world environments.

  1. The pig genome project has plenty to squeal about.

    Science.gov (United States)

    Fan, B; Gorbach, D M; Rothschild, M F

    2011-01-01

    Significant progress on pig genetics and genomics research has been witnessed in recent years due to the integration of advanced molecular biology techniques, bioinformatics and computational biology, and the collaborative efforts of researchers in the swine genomics community. Progress on expanding the linkage map has slowed down, but the efforts have created a higher-resolution physical map integrating the clone map and BAC end sequence. The number of QTL mapped is still growing and most of the updated QTL mapping results are available through PigQTLdb. Additionally, expression studies using high-throughput microarrays and other gene expression techniques have made significant advancements. The number of identified non-coding RNAs is rapidly increasing and their exact regulatory functions are being explored. A publishable draft (build 10) of the swine genome sequence was available for the pig genomics community by the end of December 2010. Build 9 of the porcine genome is currently available with Ensembl annotation; manual annotation is ongoing. These drafts provide useful tools for such endeavors as comparative genomics and SNP scans for fine QTL mapping. A recent community-wide effort to create a 60K porcine SNP chip has greatly facilitated whole-genome association analyses, haplotype block construction and linkage disequilibrium mapping, which can contribute to whole-genome selection. The future 'systems biology' that integrates and optimizes the information from all research levels can enhance the pig community's understanding of the full complexity of the porcine genome. These recent technological advances and where they may lead are reviewed. Copyright © 2011 S. Karger AG, Basel.

  2. The tiger genome and comparative analysis with lion and snow leopard genomes.

    Science.gov (United States)

    Cho, Yun Sung; Hu, Li; Hou, Haolong; Lee, Hang; Xu, Jiaohui; Kwon, Soowhan; Oh, Sukhun; Kim, Hak-Min; Jho, Sungwoong; Kim, Sangsoo; Shin, Young-Ah; Kim, Byung Chul; Kim, Hyunmin; Kim, Chang-Uk; Luo, Shu-Jin; Johnson, Warren E; Koepfli, Klaus-Peter; Schmidt-Küntzel, Anne; Turner, Jason A; Marker, Laurie; Harper, Cindy; Miller, Susan M; Jacobs, Wilhelm; Bertola, Laura D; Kim, Tae Hyung; Lee, Sunghoon; Zhou, Qian; Jung, Hyun-Ju; Xu, Xiao; Gadhvi, Priyvrat; Xu, Pengwei; Xiong, Yingqi; Luo, Yadan; Pan, Shengkai; Gou, Caiyun; Chu, Xiuhui; Zhang, Jilin; Liu, Sanyang; He, Jing; Chen, Ying; Yang, Linfeng; Yang, Yulan; He, Jiaju; Liu, Sha; Wang, Junyi; Kim, Chul Hong; Kwak, Hwanjong; Kim, Jong-Soo; Hwang, Seungwoo; Ko, Junsu; Kim, Chang-Bae; Kim, Sangtae; Bayarlkhagva, Damdin; Paek, Woon Kee; Kim, Seong-Jin; O'Brien, Stephen J; Wang, Jun; Bhak, Jong

    2013-01-01

    Tigers and their close relatives (Panthera) are some of the world's most endangered species. Here we report the de novo assembly of an Amur tiger whole-genome sequence as well as the genomic sequences of a white Bengal tiger, African lion, white African lion and snow leopard. Through comparative genetic analyses of these genomes, we find genetic signatures that may reflect molecular adaptations consistent with the big cats' hypercarnivorous diet and muscle strength. We report a snow leopard-specific genetic determinant in EGLN1 (Met39>Lys39), which is likely to be associated with adaptation to high altitude. We also detect a TYR260G>A mutation likely responsible for the white lion coat colour. Tiger and cat genomes show similar repeat composition and an appreciably conserved synteny. Genomic data from the five big cats provide an invaluable resource for resolving easily identifiable phenotypes evident in very close, but distinct, species.

  3. The tiger genome and comparative analysis with lion and snow leopard genomes

    Science.gov (United States)

    Cho, Yun Sung; Hu, Li; Hou, Haolong; Lee, Hang; Xu, Jiaohui; Kwon, Soowhan; Oh, Sukhun; Kim, Hak-Min; Jho, Sungwoong; Kim, Sangsoo; Shin, Young-Ah; Kim, Byung Chul; Kim, Hyunmin; Kim, Chang-uk; Luo, Shu-Jin; Johnson, Warren E.; Koepfli, Klaus-Peter; Schmidt-Küntzel, Anne; Turner, Jason A.; Marker, Laurie; Harper, Cindy; Miller, Susan M.; Jacobs, Wilhelm; Bertola, Laura D.; Kim, Tae Hyung; Lee, Sunghoon; Zhou, Qian; Jung, Hyun-Ju; Xu, Xiao; Gadhvi, Priyvrat; Xu, Pengwei; Xiong, Yingqi; Luo, Yadan; Pan, Shengkai; Gou, Caiyun; Chu, Xiuhui; Zhang, Jilin; Liu, Sanyang; He, Jing; Chen, Ying; Yang, Linfeng; Yang, Yulan; He, Jiaju; Liu, Sha; Wang, Junyi; Kim, Chul Hong; Kwak, Hwanjong; Kim, Jong-Soo; Hwang, Seungwoo; Ko, Junsu; Kim, Chang-Bae; Kim, Sangtae; Bayarlkhagva, Damdin; Paek, Woon Kee; Kim, Seong-Jin; O’Brien, Stephen J.; Wang, Jun; Bhak, Jong

    2013-01-01

    Tigers and their close relatives (Panthera) are some of the world’s most endangered species. Here we report the de novo assembly of an Amur tiger whole-genome sequence as well as the genomic sequences of a white Bengal tiger, African lion, white African lion and snow leopard. Through comparative genetic analyses of these genomes, we find genetic signatures that may reflect molecular adaptations consistent with the big cats’ hypercarnivorous diet and muscle strength. We report a snow leopard-specific genetic determinant in EGLN1 (Met39>Lys39), which is likely to be associated with adaptation to high altitude. We also detect a TYR260G>A mutation likely responsible for the white lion coat colour. Tiger and cat genomes show similar repeat composition and an appreciably conserved synteny. Genomic data from the five big cats provide an invaluable resource for resolving easily identifiable phenotypes evident in very close, but distinct, species. PMID:24045858

  4. Genomics and the challenging translation into conservation practice

    Science.gov (United States)

    Aaron B. A. Shafer; Jochen B. W. Wolf; Paulo C. Alves; Linnea Bergstrom; Michael W. Bruford; Ioana Brannstrom; Guy Colling; Love Dalen; Luc De Meester; Robert Ekblom; Katie D. Fawcett; Simone Fior; Mehrdad Hajibabaei; Jason A. Hill; A. Rus Hoezel; Jacob Hoglund; Evelyn L. Jensen; Johannes Krause; Torsten N. Kristensen; Michael Krutzen; John K. McKay; Anita J. Norman; Rob Ogden; E. Martin Osterling; N. Joop Ouborg; John Piccolo; Danijela Popovic; Craig R. Primmer; Floyd A. Reed; Marie Roumet; Jordi Salmona; Tamara Schenekar; Michael K. Schwartz; Gernot Segelbacher; Helen Senn; Jens Thaulow; Mia Valtonen; Andrew Veale; Philippine Vergeer; Nagarjun Vijay; Carles Vila; Matthias Weissensteiner; Lovisa Wennerstrom; Christopher W. Wheat; Piotr Zielinski

    2015-01-01

    The global loss of biodiversity continues at an alarming rate. Genomic approaches have been suggested as a promising tool for conservation practice as scaling up to genome-wide data can improve traditional conservation genetic inferences and provide qualitatively novel insights. However, the generation of genomic data and subsequent analyses and interpretations remain...

  5. Genome update: the 1000th genome - a cautionary tale

    DEFF Research Database (Denmark)

    Lagesen, Karin; Ussery, David; Wassenaar, Gertrude Maria

    2010-01-01

    conclusions for example about the largest bacterial genome sequenced. Biological diversity is far greater than many have thought. For example, analysis of multiple Escherichia coli genomes has led to an estimate of around 45 000 gene families more genes than are recognized in the human genome. Moreover......There are now more than 1000 sequenced prokaryotic genomes deposited in public databases and available for analysis. Currently, although the sequence databases GenBank, DNA Database of Japan and EMBL are synchronized continually, there are slight differences in content at the genomes level...... for a variety of logistical reasons, including differences in format and loading errors, such as those caused by file transfer protocol interruptions. This means that the 1000th genome will be different in the various databases. Some of the data on the highly accessed web pages are inaccurate, leading to false...

  6. Minimum information about a single amplified genome (MISAG) and a metagenome-assembled genome (MIMAG) of bacteria and archaea

    Energy Technology Data Exchange (ETDEWEB)

    Bowers, Robert M.; Kyrpides, Nikos C.; Stepanauskas, Ramunas; Harmon-Smith, Miranda; Doud, Devin; Reddy, T. B. K.; Schulz, Frederik; Jarett, Jessica; Rivers, Adam R.; Eloe-Fadrosh, Emiley A.; Tringe, Susannah G.; Ivanova, Natalia N.; Copeland, Alex; Clum, Alicia; Becraft, Eric D.; Malmstrom, Rex R.; Birren, Bruce; Podar, Mircea; Bork, Peer; Weinstock, George M.; Garrity, George M.; Dodsworth, Jeremy A.; Yooseph, Shibu; Sutton, Granger; Glöckner, Frank O.; Gilbert, Jack A.; Nelson, William C.; Hallam, Steven J.; Jungbluth, Sean P.; Ettema, Thijs J. G.; Tighe, Scott; Konstantinidis, Konstantinos T.; Liu, Wen-Tso; Baker, Brett J.; Rattei, Thomas; Eisen, Jonathan A.; Hedlund, Brian; McMahon, Katherine D.; Fierer, Noah; Knight, Rob; Finn, Rob; Cochrane, Guy; Karsch-Mizrachi, Ilene; Tyson, Gene W.; Rinke, Christian; Kyrpides, Nikos C.; Schriml, Lynn; Garrity, George M.; Hugenholtz, Philip; Sutton, Granger; Yilmaz, Pelin; Meyer, Folker; Glöckner, Frank O.; Gilbert, Jack A.; Knight, Rob; Finn, Rob; Cochrane, Guy; Karsch-Mizrachi, Ilene; Lapidus, Alla; Meyer, Folker; Yilmaz, Pelin; Parks, Donovan H.; Eren, A. M.; Schriml, Lynn; Banfield, Jillian F.; Hugenholtz, Philip; Woyke, Tanja

    2017-08-08

    We present two standards developed by the Genomic Standards Consortium (GSC) for reporting bacterial and archaeal genome sequences. Both are extensions of the Minimum Information about Any (x) Sequence (MIxS). The standards are the Minimum Information about a Single Amplified Genome (MISAG) and the Minimum Information about a Metagenome-Assembled Genome (MIMAG), including, but not limited to, assembly quality, and estimates of genome completeness and contamination. These standards can be used in combination with other GSC checklists, including the Minimum Information about a Genome Sequence (MIGS), Minimum Information about a Metagenomic Sequence (MIMS), and Minimum Information about a Marker Gene Sequence (MIMARKS). Community-wide adoption of MISAG and MIMAG will facilitate more robust comparative genomic analyses of bacterial and archaeal diversity.

  7. MVisAGe Identifies Concordant and Discordant Genomic Alterations of Driver Genes in Squamous Tumors.

    Science.gov (United States)

    Walter, Vonn; Du, Ying; Danilova, Ludmila; Hayward, Michele C; Hayes, D Neil

    2018-06-15

    Integrated analyses of multiple genomic datatypes are now common in cancer profiling studies. Such data present opportunities for numerous computational experiments, yet analytic pipelines are limited. Tools such as the cBioPortal and Regulome Explorer, although useful, are not easy to access programmatically or to implement locally. Here, we introduce the MVisAGe R package, which allows users to quantify gene-level associations between two genomic datatypes to investigate the effect of genomic alterations (e.g., DNA copy number changes on gene expression). Visualizing Pearson/Spearman correlation coefficients according to the genomic positions of the underlying genes provides a powerful yet novel tool for conducting exploratory analyses. We demonstrate its utility by analyzing three publicly available cancer datasets. Our approach highlights canonical oncogenes in chr11q13 that displayed the strongest associations between expression and copy number, including CCND1 and CTTN , genes not identified by copy number analysis in the primary reports. We demonstrate highly concordant usage of shared oncogenes on chr3q, yet strikingly diverse oncogene usage on chr11q as a function of HPV infection status. Regions of chr19 that display remarkable associations between methylation and gene expression were identified, as were previously unreported miRNA-gene expression associations that may contribute to the epithelial-to-mesenchymal transition. Significance: This study presents an important bioinformatics tool that will enable integrated analyses of multiple genomic datatypes. Cancer Res; 78(12); 3375-85. ©2018 AACR . ©2018 American Association for Cancer Research.

  8. The Naked Mole Rat Genome Resource : facilitating analyses of cancer and longevity-related adaptations

    OpenAIRE

    Keane, Michael; Craig, Thomas; Alfoldi, Jessica; Berlin, Aaron M; Johnson, Jeremy; Seluanov, Andrei; Gorbunova, Vera; Di Palma, Federica; Lindblad-Toh, Kerstin; Church, George M; de Magalhaes, Joao Pedro

    2014-01-01

    MOTIVATION: The naked mole rat (Heterocephalus glaber) is an exceptionally long-lived and cancer-resistant rodent native to East Africa. Although its genome was previously sequenced, here we report a new assembly sequenced by us with substantially higher N50 values for scaffolds and contigs. RESULTS: We analyzed the annotation of this new improved assembly and identified candidate genomic adaptations which may have contributed to the evolution of the naked mole rat's extraordinary traits, inc...

  9. Low-pass sequencing for microbial comparative genomics

    Directory of Open Access Journals (Sweden)

    Kennedy Sean

    2004-01-01

    Full Text Available Abstract Background We studied four extremely halophilic archaea by low-pass shotgun sequencing: (1 the metabolically versatile Haloarcula marismortui; (2 the non-pigmented Natrialba asiatica; (3 the psychrophile Halorubrum lacusprofundi and (4 the Dead Sea isolate Halobaculum gomorrense. Approximately one thousand single pass genomic sequences per genome were obtained. The data were analyzed by comparative genomic analyses using the completed Halobacterium sp. NRC-1 genome as a reference. Low-pass shotgun sequencing is a simple, inexpensive, and rapid approach that can readily be performed on any cultured microbe. Results As expected, the four archaeal halophiles analyzed exhibit both bacterial and eukaryotic characteristics as well as uniquely archaeal traits. All five halophiles exhibit greater than sixty percent GC content and low isoelectric points (pI for their predicted proteins. Multiple insertion sequence (IS elements, often involved in genome rearrangements, were identified in H. lacusprofundi and H. marismortui. The core biological functions that govern cellular and genetic mechanisms of H. sp. NRC-1 appear to be conserved in these four other halophiles. Multiple TATA box binding protein (TBP and transcription factor IIB (TFB homologs were identified from most of the four shotgunned halophiles. The reconstructed molecular tree of all five halophiles shows a large divergence between these species, but with the closest relationship being between H. sp. NRC-1 and H. lacusprofundi. Conclusion Despite the diverse habitats of these species, all five halophiles share (1 high GC content and (2 low protein isoelectric points, which are characteristics associated with environmental exposure to UV radiation and hypersalinity, respectively. Identification of multiple IS elements in the genome of H. lacusprofundi and H. marismortui suggest that genome structure and dynamic genome reorganization might be similar to that previously observed in the

  10. HLA diversity in the 1000 genomes dataset.

    Directory of Open Access Journals (Sweden)

    Pierre-Antoine Gourraud

    Full Text Available The 1000 Genomes Project aims to provide a deep characterization of human genome sequence variation by sequencing at a level that should allow the genome-wide detection of most variants with frequencies as low as 1%. However, in the major histocompatibility complex (MHC, only the top 10 most frequent haplotypes are in the 1% frequency range whereas thousands of haplotypes are present at lower frequencies. Given the limitation of both the coverage and the read length of the sequences generated by the 1000 Genomes Project, the highly variable positions that define HLA alleles may be difficult to identify. We used classical Sanger sequencing techniques to type the HLA-A, HLA-B, HLA-C, HLA-DRB1 and HLA-DQB1 genes in the available 1000 Genomes samples and combined the results with the 103,310 variants in the MHC region genotyped by the 1000 Genomes Project. Using pairwise identity-by-descent distances between individuals and principal component analysis, we established the relationship between ancestry and genetic diversity in the MHC region. As expected, both the MHC variants and the HLA phenotype can identify the major ancestry lineage, informed mainly by the most frequent HLA haplotypes. To some extent, regions of the genome with similar genetic or similar recombination rate have similar properties. An MHC-centric analysis underlines departures between the ancestral background of the MHC and the genome-wide picture. Our analysis of linkage disequilibrium (LD decay in these samples suggests that overestimation of pairwise LD occurs due to a limited sampling of the MHC diversity. This collection of HLA-specific MHC variants, available on the dbMHC portal, is a valuable resource for future analyses of the role of MHC in population and disease studies.

  11. Genomics and fish adaptation

    Directory of Open Access Journals (Sweden)

    Agostinho Antunes

    2015-12-01

    Full Text Available The completion of the human genome sequencing in 2003 opened a new perspective into the importance of whole genome sequencing projects, and currently multiple species are having their genomes completed sequenced, from simple organisms, such as bacteria, to more complex taxa, such as mammals. This voluminous sequencing data generated across multiple organisms provides also the framework to better understand the genetic makeup of such species and related ones, allowing to explore the genetic changes underlining the evolution of diverse phenotypic traits. Here, recent results from our group retrieved from comparative evolutionary genomic analyses of varied fish species will be considered to exemplify how gene novelty and gene enhancement by positive selection might have been determinant in the success of adaptive radiations into diverse habitats and lifestyles.

  12. Genome-wide association study of circulating estradiol, testosterone, and sex hormone-binding globulin in postmenopausal women.

    Directory of Open Access Journals (Sweden)

    Jennifer Prescott

    Full Text Available Genome-wide association studies (GWAS have successfully identified common genetic variants that contribute to breast cancer risk. Discovering additional variants has become difficult, as power to detect variants of weaker effect with present sample sizes is limited. An alternative approach is to look for variants associated with quantitative traits that in turn affect disease risk. As exposure to high circulating estradiol and testosterone, and low sex hormone-binding globulin (SHBG levels is implicated in breast cancer etiology, we conducted GWAS analyses of plasma estradiol, testosterone, and SHBG to identify new susceptibility alleles. Cancer Genetic Markers of Susceptibility (CGEMS data from the Nurses' Health Study (NHS, and Sisters in Breast Cancer Screening data were used to carry out primary meta-analyses among ~1600 postmenopausal women who were not taking postmenopausal hormones at blood draw. We observed a genome-wide significant association between SHBG levels and rs727428 (joint β = -0.126; joint P = 2.09 × 10(-16, downstream of the SHBG gene. No genome-wide significant associations were observed with estradiol or testosterone levels. Among variants that were suggestively associated with estradiol (P<10(-5, several were located at the CYP19A1 gene locus. Overall results were similar in secondary meta-analyses that included ~900 NHS current postmenopausal hormone users. No variant associated with estradiol, testosterone, or SHBG at P<10(-5 was associated with postmenopausal breast cancer risk among CGEMS participants. Our results suggest that the small magnitude of difference in hormone levels associated with common genetic variants is likely insufficient to detectably contribute to breast cancer risk.

  13. Genes but not genomes reveal bacterial domestication of Lactococcus lactis.

    Directory of Open Access Journals (Sweden)

    Delphine Passerini

    Full Text Available BACKGROUND: The population structure and diversity of Lactococcus lactis subsp. lactis, a major industrial bacterium involved in milk fermentation, was determined at both gene and genome level. Seventy-six lactococcal isolates of various origins were studied by different genotyping methods and thirty-six strains displaying unique macrorestriction fingerprints were analyzed by a new multilocus sequence typing (MLST scheme. This gene-based analysis was compared to genomic characteristics determined by pulsed-field gel electrophoresis (PFGE. METHODOLOGY/PRINCIPAL FINDINGS: The MLST analysis revealed that L. lactis subsp. lactis is essentially clonal with infrequent intra- and intergenic recombination; also, despite its taxonomical classification as a subspecies, it displays a genetic diversity as substantial as that within several other bacterial species. Genome-based analysis revealed a genome size variability of 20%, a value typical of bacteria inhabiting different ecological niches, and that suggests a large pan-genome for this subspecies. However, the genomic characteristics (macrorestriction pattern, genome or chromosome size, plasmid content did not correlate to the MLST-based phylogeny, with strains from the same sequence type (ST differing by up to 230 kb in genome size. CONCLUSION/SIGNIFICANCE: The gene-based phylogeny was not fully consistent with the traditional classification into dairy and non-dairy strains but supported a new classification based on ecological separation between "environmental" strains, the main contributors to the genetic diversity within the subspecies, and "domesticated" strains, subject to recent genetic bottlenecks. Comparison between gene- and genome-based analyses revealed little relationship between core and dispensable genome phylogenies, indicating that clonal diversification and phenotypic variability of the "domesticated" strains essentially arose through substantial genomic flux within the dispensable

  14. Ecology and genomics of Bacillus subtilis.

    Science.gov (United States)

    Earl, Ashlee M; Losick, Richard; Kolter, Roberto

    2008-06-01

    Bacillus subtilis is a remarkably diverse bacterial species that is capable of growth within many environments. Recent microarray-based comparative genomic analyses have revealed that members of this species also exhibit considerable genomic diversity. The identification of strain-specific genes might explain how B. subtilis has become so broadly adapted. The goal of identifying ecologically adaptive genes could soon be realized with the imminent release of several new B. subtilis genome sequences. As we embark upon this exciting new era of B. subtilis comparative genomics we review what is currently known about the ecology and evolution of this species.

  15. Genome-wide SNP and haplotype analyses reveal a rich history underlying dog domestication

    Science.gov (United States)

    vonHoldt, Bridgett M.; Pollinger, John P.; Lohmueller, Kirk E.; Han, Eunjung; Parker, Heidi G.; Quignon, Pascale; Degenhardt, Jeremiah D.; Boyko, Adam R.; Earl, Dent A.; Auton, Adam; Reynolds, Andy; Bryc, Kasia; Brisbin, Abra; Knowles, James C.; Mosher, Dana S.; Spady, Tyrone C.; Elkahloun, Abdel; Geffen, Eli; Pilot, Malgorzata; Jedrzejewski, Wlodzimierz; Greco, Claudia; Randi, Ettore; Bannasch, Danika; Wilton, Alan; Shearman, Jeremy; Musiani, Marco; Cargill, Michelle; Jones, Paul G.; Qian, Zuwei; Huang, Wei; Ding, Zhao-Li; Zhang, Ya-ping; Bustamante, Carlos D.; Ostrander, Elaine A.; Novembre, John; Wayne, Robert K.

    2010-01-01

    Advances in genome technology have facilitated a new understanding of the historical and genetic processes crucial to rapid phenotypic evolution under domestication1,2. To understand the process of dog diversification better, we conducted an extensive genome-wide survey of more than 48,000 single nucleotide polymorphisms in dogs and their wild progenitor, the grey wolf. Here we show that dog breeds share a higher proportion of multi-locus haplotypes unique to grey wolves from the Middle East, indicating that they are a dominant source of genetic diversity for dogs rather than wolves from east Asia, as suggested by mitochondrial DNA sequence data3. Furthermore, we find a surprising correspondence between genetic and phenotypic/functional breed groupings but there are exceptions that suggest phenotypic diversification depended in part on the repeated crossing of individuals with novel phenotypes. Our results show that Middle Eastern wolves were a critical source of genome diversity, although interbreeding with local wolf populations clearly occurred elsewhere in the early history of specific lineages. More recently, the evolution of modern dog breeds seems to have been an iterative process that drew on a limited genetic toolkit to create remarkable phenotypic diversity. PMID:20237475

  16. The whole genome sequences and experimentally phased haplotypes of over 100 personal genomes.

    Science.gov (United States)

    Mao, Qing; Ciotlos, Serban; Zhang, Rebecca Yu; Ball, Madeleine P; Chin, Robert; Carnevali, Paolo; Barua, Nina; Nguyen, Staci; Agarwal, Misha R; Clegg, Tom; Connelly, Abram; Vandewege, Ward; Zaranek, Alexander Wait; Estep, Preston W; Church, George M; Drmanac, Radoje; Peters, Brock A

    2016-10-11

    Since the completion of the Human Genome Project in 2003, it is estimated that more than 200,000 individual whole human genomes have been sequenced. A stunning accomplishment in such a short period of time. However, most of these were sequenced without experimental haplotype data and are therefore missing an important aspect of genome biology. In addition, much of the genomic data is not available to the public and lacks phenotypic information. As part of the Personal Genome Project, blood samples from 184 participants were collected and processed using Complete Genomics' Long Fragment Read technology. Here, we present the experimental whole genome haplotyping and sequencing of these samples to an average read coverage depth of 100X. This is approximately three-fold higher than the read coverage applied to most whole human genome assemblies and ensures the highest quality results. Currently, 114 genomes from this dataset are freely available in the GigaDB repository and are associated with rich phenotypic data; the remaining 70 should be added in the near future as they are approved through the PGP data release process. For reproducibility analyses, 20 genomes were sequenced at least twice using independent LFR barcoded libraries. Seven genomes were also sequenced using Complete Genomics' standard non-barcoded library process. In addition, we report 2.6 million high-quality, rare variants not previously identified in the Single Nucleotide Polymorphisms database or the 1000 Genomes Project Phase 3 data. These genomes represent a unique source of haplotype and phenotype data for the scientific community and should help to expand our understanding of human genome evolution and function.

  17. Comparison of Genome-Wide Association Methods in Analyses of Admixed Populations with Complex Familial Relationships

    DEFF Research Database (Denmark)

    Kadri, Naveen; Guldbrandtsen, Bernt; Sørensen, Peter

    2014-01-01

    Population structure is known to cause false-positive detection in association studies. We compared the power, precision, and type-I error rates of various association models in analyses of a simulated dataset with structure at the population (admixture from two populations; P) and family (K......) levels. We also compared type-I error rates among models in analyses of publicly available human and dog datasets. The models corrected for none, one, or both structure levels. Correction for K was performed with linear mixed models incorporating familial relationships estimated from pedigrees or genetic...... corrected for P. In contrast, correction for P alone in linear models was insufficient. The power and precision of linear mixed models with and without correction for P were similar. Furthermore, power, precision, and type-I error rate were comparable in linear mixed models incorporating pedigree...

  18. Genome Sequences of Marine Shrimp Exopalaemon carinicauda Holthuis Provide Insights into Genome Size Evolution of Caridea.

    Science.gov (United States)

    Yuan, Jianbo; Gao, Yi; Zhang, Xiaojun; Wei, Jiankai; Liu, Chengzhang; Li, Fuhua; Xiang, Jianhai

    2017-07-05

    Crustacea, particularly Decapoda, contains many economically important species, such as shrimps and crabs. Crustaceans exhibit enormous (nearly 500-fold) variability in genome size. However, limited genome resources are available for investigating these species. Exopalaemon carinicauda Holthuis, an economical caridean shrimp, is a potential ideal experimental animal for research on crustaceans. In this study, we performed low-coverage sequencing and de novo assembly of the E. carinicauda genome. The assembly covers more than 95% of coding regions. E. carinicauda possesses a large complex genome (5.73 Gb), with size twice higher than those of many decapod shrimps. As such, comparative genomic analyses were implied to investigate factors affecting genome size evolution of decapods. However, clues associated with genome duplication were not identified, and few horizontally transferred sequences were detected. Ultimately, the burst of transposable elements, especially retrotransposons, was determined as the major factor influencing genome expansion. A total of 2 Gb repeats were identified, and RTE-BovB, Jockey, Gypsy, and DIRS were the four major retrotransposons that significantly expanded. Both recent (Jockey and Gypsy) and ancestral (DIRS) originated retrotransposons responsible for the genome evolution. The E. carinicauda genome also exhibited potential for the genomic and experimental research of shrimps.

  19. A hybrid reference-guided de novo assembly approach for generating Cyclospora mitochondrion genomes.

    Science.gov (United States)

    Gopinath, G R; Cinar, H N; Murphy, H R; Durigan, M; Almeria, M; Tall, B D; DaSilva, A J

    2018-01-01

    Cyclospora cayetanensis is a coccidian parasite associated with large and complex foodborne outbreaks worldwide. Linking samples from cyclosporiasis patients during foodborne outbreaks with suspected contaminated food sources, using conventional epidemiological methods, has been a persistent challenge. To address this issue, development of new methods based on potential genomically-derived markers for strain-level identification has been a priority for the food safety research community. The absence of reference genomes to identify nucleotide and structural variants with a high degree of confidence has limited the application of using sequencing data for source tracking during outbreak investigations. In this work, we determined the quality of a high resolution, curated, public mitochondrial genome assembly to be used as a reference genome by applying bioinformatic analyses. Using this reference genome, three new mitochondrial genome assemblies were built starting with metagenomic reads generated by sequencing DNA extracted from oocysts present in stool samples from cyclosporiasis patients. Nucleotide variants were identified in the new and other publicly available genomes in comparison with the mitochondrial reference genome. A consolidated workflow, presented here, to generate new mitochondrion genomes using our reference-guided de novo assembly approach could be useful in facilitating the generation of other mitochondrion sequences, and in their application for subtyping C. cayetanensis strains during foodborne outbreak investigations.

  20. Cultural energy analyses of dairy cattle receiving different concentrate levels

    International Nuclear Information System (INIS)

    Koknaroglu, Hayati

    2010-01-01

    Purpose of this study was to conduct cultural energy analyses of dairy cows receiving different levels of concentrate. Data were acquired by conducting a survey on 132 dairy farms selected by the stratified random sampling method. Dairy cattle farms were divided into three groups according to concentrate level and were analyzed. Accordingly concentrate levels were assigned as low (LLC) ( 50%, 44 farms). Cultural energy used for feed for cows was calculated by multiplying each ingredient with corresponding values of ingredients from literature. Transportation energy was also included in the analysis. Total cultural energy expended was highest for LLC (P < 0.05). Cultural energy expended for feed constituted more than half of the total cultural energy and was highest for LLC (P < 0.05). Cultural energy expended per kg milk and per Mcal protein energy was higher for LLC (P < 0.05). Efficiency defined as Mcal input/Mcal output was better for ILC and was worse for LLC (P < 0.05) and HLC was intermediate thus not differing from other groups. Results show that cultural energy use efficiency does not linearly increases as concentrate level increases and increasing concentrate level does not necessarily mean better efficiency. Thus optimum concentrate level not interfering cows performance should be sought for sustainable dairy production.

  1. Genomic and bioinformatics analyses of HAdV-4vac and HAdV-7vac, two human adenovirus (HAdV) strains that constituted original prophylaxis against HAdV-related acute respiratory disease, a reemerging epidemic disease.

    Science.gov (United States)

    Purkayastha, Anjan; Su, Jing; McGraw, John; Ditty, Susan E; Hadfield, Ted L; Seto, Jason; Russell, Kevin L; Tibbetts, Clark; Seto, Donald

    2005-07-01

    Vaccine strains of human adenovirus serotypes 4 and 7 (HAdV-4vac and HAdV-7vac) have been used successfully to prevent adenovirus-related acute respiratory disease outbreaks. The genomes of these two vaccine strains have been sequenced, annotated, and compared with their prototype equivalents with the goals of understanding their genomes for molecular diagnostics applications, vaccine redevelopment, and HAdV pathoepidemiology. These reference genomes are archived in GenBank as HAdV-4vac (35,994 bp; AY594254) and HAdV-7vac (35,240 bp; AY594256). Bioinformatics and comparative whole-genome analyses with their recently reported and archived prototype genomes reveal six mismatches and four insertions-deletions (indels) between the HAdV-4 prototype and vaccine strains, in contrast to the 611 mismatches and 130 indels between the HAdV-7 prototype and vaccine strains. Annotation reveals that the HAdV-4vac and HAdV-7vac genomes contain 51 and 50 coding units, respectively. Neither vaccine strain appears to be attenuated for virulence based on bioinformatics analyses. There is evidence of genome recombination, as the inverted terminal repeat of HAdV-4vac is initially identical to that of species C whereas the prototype is identical to species B1. These vaccine reference sequences yield unique genome signatures for molecular diagnostics. As a molecular forensics application, these references identify the circulating and problematic 1950s era field strains as the original HAdV-4 prototype and the Greider prototype, from which the vaccines are derived. Thus, they are useful for genomic comparisons to current epidemic and reemerging field strains, as well as leading to an understanding of pathoepidemiology among the human adenoviruses.

  2. Family-based Association Analyses of Imputed Genotypes Reveal Genome-Wide Significant Association of Alzheimer’s disease with OSBPL6, PTPRG and PDCL3

    Science.gov (United States)

    Herold, Christine; Hooli, Basavaraj V.; Mullin, Kristina; Liu, Tian; Roehr, Johannes T; Mattheisen, Manuel; Parrado, Antonio R.; Bertram, Lars; Lange, Christoph; Tanzi, Rudolph E.

    2015-01-01

    The genetic basis of Alzheimer's disease (AD) is complex and heterogeneous. Over 200 highly penetrant pathogenic variants in the genes APP, PSEN1 and PSEN2 cause a subset of early-onset familial Alzheimer's disease (EOFAD). On the other hand, susceptibility to late-onset forms of AD (LOAD) is indisputably associated to the ε4 allele in the gene APOE, and more recently to variants in more than two-dozen additional genes identified in the large-scale genome-wide association studies (GWAS) and meta-analyses reports. Taken together however, although the heritability in AD is estimated to be as high as 80%, a large proportion of the underlying genetic factors still remain to be elucidated. In this study we performed a systematic family-based genome-wide association and meta-analysis on close to 15 million imputed variants from three large collections of AD families (~3,500 subjects from 1,070 families). Using a multivariate phenotype combining affection status and onset age, meta-analysis of the association results revealed three single nucleotide polymorphisms (SNPs) that achieved genome-wide significance for association with AD risk: rs7609954 in the gene PTPRG (P-value = 3.98·10−08), rs1347297 in the gene OSBPL6 (P-value = 4.53·10−08), and rs1513625 near PDCL3 (P-value = 4.28·10−08). In addition, rs72953347 in OSBPL6 (P-value = 6.36·10−07) and two SNPs in the gene CDKAL1 showed marginally significant association with LOAD (rs10456232, P-value: 4.76·10−07; rs62400067, P-value: 3.54·10−07). In summary, family-based GWAS meta-analysis of imputed SNPs revealed novel genomic variants in (or near) PTPRG, OSBPL6, and PDCL3 that influence risk for AD with genome-wide significance. PMID:26830138

  3. Psychiatric genome-wide association study analyses implicate neuronal, immune and histone pathways

    DEFF Research Database (Denmark)

    O'Dushlaine, Colm; Rossin, Lizzy; Lee, Phil H.

    2015-01-01

    Genome-wide association studies (GWAS) of psychiatric disorders have identified multiple genetic associations with such disorders, but better methods are needed to derive the underlying biological mechanisms that these signals indicate. We sought to identify biological pathways in GWAS data from ...

  4. Psychiatric genome-wide association study analyses implicate neuronal, immune and histone pathways

    NARCIS (Netherlands)

    O'Dushlaine, Colm; Rossin, Lizzy; Lee, Phil H.; Duncan, Laramie; Parikshak, Neelroop N.; Newhouse, Stephen; Ripke, Stephan; Neale, Benjamin M.; Purcell, Shaun M.; Posthuma, Danielle; Nurnberger, John I.; Lee, S. Hong; Faraone, Stephen V.; Perlis, Roy H.; Mowry, Bryan J.; Thapar, Anita; Goddard, Michael E.; Witte, John S.; Absher, Devin; Agartz, Ingrid; Akil, Huda; Amin, Farooq; Andreassen, Ole A.; Anjorin, Adebayo; Anney, Richard; Anttila, Verneri; Arking, Dan E.; Asherson, Philip; Azevedo, Maria H.; Backlund, Lena; Badner, Judith A.; Bailey, Anthony J.; Banaschewski, Tobias; Barchas, Jack D.; Barnes, Michael R.; Barrett, Thomas B.; Bass, Nicholas; Battaglia, Agatino; Bauer, Michael; Bayes, Monica; Bellivier, Frank; Bergen, Sarah E.; Berrettini, Wade; Betancur, Catalina; Bettecken, Thomas; Biederman, Joseph; Binder, Elisabeth B.; Bruggeman, Richard; Nolen, Willem A.; Penninx, Brenda W.

    Genome-wide association studies (GWAS) of psychiatric disorders have identified multiple genetic associations with such disorders, but better methods are needed to derive the underlying biological mechanisms that these signals indicate. We sought to identify biological pathways in GWAS data from

  5. Psychiatric genome-wide association study analyses implicate neuronal, immune and histone pathways

    NARCIS (Netherlands)

    O'Dushlaine, Colm; Rossin, Lizzy; Lee, Phil H.; Duncan, Laramie; Parikshak, Neelroop N.; Newhouse, Stephen; Ripke, Stephan; Neale, Benjamin M.; Purcell, Shaun M.; Posthuma, Danielle; Nurnberger, John I.; Lee, S. Hong; Faraone, Stephen V.; Perlis, Roy H.; Mowry, Bryan J.; Thapar, Anita; Goddard, Michael E.; Witte, John S.; Absher, Devin; Agartz, Ingrid; Akil, Huda; Amin, Farooq; Andreassen, Ole A.; Anjorin, Adebayo; Anney, Richard; Anttila, Verneri; Arking, Dan E.; Asherson, Philip; Azevedo, Maria H.; Backlund, Lena; Badner, Judith A.; Bailey, Anthony J.; Banaschewski, Tobias; Barchas, Jack D.; Barnes, Michael R.; Barrett, Thomas B.; Bass, Nicholas; Battaglia, Agatino; Bauer, Michael; Bayés, Mònica; Bellivier, Frank; Bergen, Sarah E.; Berrettini, Wade; Betancur, Catalina; Bettecken, Thomas; Biederman, Joseph; Binder, Elisabeth B.; Black, Donald W.; de Haan, Lieuwe; Linszen, Don H.

    2015-01-01

    Genome-wide association studies (GWAS) of psychiatric disorders have identified multiple genetic associations with such disorders, but better methods are needed to derive the underlying biological mechanisms that these signals indicate. We sought to identify biological pathways in GWAS data from

  6. Considerations for creating and annotating the budding yeast Genome Map at SGD: a progress report.

    Science.gov (United States)

    Chan, Esther T; Cherry, J Michael

    2012-01-01

    The Saccharomyces Genome Database (SGD) is compiling and annotating a comprehensive catalogue of functional sequence elements identified in the budding yeast genome. Recent advances in deep sequencing technologies have enabled for example, global analyses of transcription profiling and assembly of maps of transcription factor occupancy and higher order chromatin organization, at nucleotide level resolution. With this growing influx of published genome-scale data, come new challenges for their storage, display, analysis and integration. Here, we describe SGD's progress in the creation of a consolidated resource for genome sequence elements in the budding yeast, the considerations taken in its design and the lessons learned thus far. The data within this collection can be accessed at http://browse.yeastgenome.org and downloaded from http://downloads.yeastgenome.org. DATABASE URL: http://www.yeastgenome.org.

  7. Diversity in non-repetitive human sequences not found in the reference genome.

    Science.gov (United States)

    Kehr, Birte; Helgadottir, Anna; Melsted, Pall; Jonsson, Hakon; Helgason, Hannes; Jonasdottir, Adalbjörg; Jonasdottir, Aslaug; Sigurdsson, Asgeir; Gylfason, Arnaldur; Halldorsson, Gisli H; Kristmundsdottir, Snaedis; Thorgeirsson, Gudmundur; Olafsson, Isleifur; Holm, Hilma; Thorsteinsdottir, Unnur; Sulem, Patrick; Helgason, Agnar; Gudbjartsson, Daniel F; Halldorsson, Bjarni V; Stefansson, Kari

    2017-04-01

    Genomes usually contain some non-repetitive sequences that are missing from the reference genome and occur only in a population subset. Such non-repetitive, non-reference (NRNR) sequences have remained largely unexplored in terms of their characterization and downstream analyses. Here we describe 3,791 breakpoint-resolved NRNR sequence variants called using PopIns from whole-genome sequence data of 15,219 Icelanders. We found that over 95% of the 244 NRNR sequences that are 200 bp or longer are present in chimpanzees, indicating that they are ancestral. Furthermore, 149 variant loci are in linkage disequilibrium (r 2 > 0.8) with a genome-wide association study (GWAS) catalog marker, suggesting disease relevance. Additionally, we report an association (P = 3.8 × 10 -8 , odds ratio (OR) = 0.92) with myocardial infarction (23,360 cases, 300,771 controls) for a 766-bp NRNR sequence variant. Our results underline the importance of including variation of all complexity levels when searching for variants that associate with disease.

  8. Assembly of the Lactuca sativa, L. cv. Tizian draft genome sequence reveals differences within major resistance complex 1 as compared to the cv. Salinas reference genome.

    Science.gov (United States)

    Verwaaijen, Bart; Wibberg, Daniel; Nelkner, Johanna; Gordin, Miriam; Rupp, Oliver; Winkler, Anika; Bremges, Andreas; Blom, Jochen; Grosch, Rita; Pühler, Alfred; Schlüter, Andreas

    2018-02-10

    Lettuce (Lactuca sativa, L.) is an important annual plant of the family Asteraceae (Compositae). The commercial lettuce cultivar Tizian has been used in various scientific studies investigating the interaction of the plant with phytopathogens or biological control agents. Here, we present the de novo draft genome sequencing and gene prediction for this specific cultivar derived from transcriptome sequence data. The assembled scaffolds amount to a size of 2.22 Gb. Based on RNAseq data, 31,112 transcript isoforms were identified. Functional predictions for these transcripts were determined within the GenDBE annotation platform. Comparison with the cv. Salinas reference genome revealed a high degree of sequence similarity on genome and transcriptome levels, with an average amino acid identity of 99%. Furthermore, it was observed that two large regions are either missing or are highly divergent within the cv. Tizian genome compared to cv. Salinas. One of these regions covers the major resistance complex 1 region of cv. Salinas. The cv. Tizian draft genome sequence provides a valuable resource for future functional and transcriptome analyses focused on this lettuce cultivar. Copyright © 2017 Elsevier B.V. All rights reserved.

  9. Protecting genomic data analytics in the cloud: state of the art and opportunities

    Directory of Open Access Journals (Sweden)

    Haixu Tang

    2016-10-01

    Full Text Available Abstract The outsourcing of genomic data into public cloud computing settings raises concerns over privacy and security. Significant advancements in secure computation methods have emerged over the past several years, but such techniques need to be rigorously evaluated for their ability to support the analysis of human genomic data in an efficient and cost-effective manner. With respect to public cloud environments, there are concerns about the inadvertent exposure of human genomic data to unauthorized users. In analyses involving multiple institutions, there is additional concern about data being used beyond agreed research scope and being prcoessed in untrused computational environments, which may not satisfy institutional policies. To systematically investigate these issues, the NIH-funded National Center for Biomedical Computing iDASH (integrating Data for Analysis, ‘anonymization’ and SHaring hosted the second Critical Assessment of Data Privacy and Protection competition to assess the capacity of cryptographic technologies for protecting computation over human genomes in the cloud and promoting cross-institutional collaboration. Data scientists were challenged to design and engineer practical algorithms for secure outsourcing of genome computation tasks in working software, whereby analyses are performed only on encrypted data. They were also challenged to develop approaches to enable secure collaboration on data from genomic studies generated by multiple organizations (e.g., medical centers to jointly compute aggregate statistics without sharing individual-level records. The results of the competition indicated that secure computation techniques can enable comparative analysis of human genomes, but greater efficiency (in terms of compute time and memory utilization are needed before they are sufficiently practical for real world environments.

  10. Draft genome sequences of two virulent serotypes of avian Pasteurella multocida

    Science.gov (United States)

    Here we report the draft genome sequences of two virulent avian strains of Pasteurella multocida. Comparative analyses of these genomes were done with the published genome sequence of avirulent Pasteurella multocida strain Pm70....

  11. Draft Genome Sequences of Two Virulent Serotypes of Avian Pasteurella multocida

    OpenAIRE

    Abrahante, Juan E.; Johnson, Timothy J.; Hunter, Samuel S.; Maheswaran, Samuel K.; Hauglund, Melissa J.; Bayles, Darrell O.; Tatum, Fred M.; Briggs, Robert E.

    2013-01-01

    Here we report the draft genome sequences of two virulent avian strains of Pasteurella multocida. Comparative analyses of these genomes were done with the published genome sequence of avirulent P.?multocida strain Pm70.

  12. Pathophysiology of MDS: genomic aberrations.

    Science.gov (United States)

    Ichikawa, Motoshi

    2016-01-01

    Myelodysplastic syndromes (MDS) are characterized by clonal proliferation of hematopoietic stem/progenitor cells and their apoptosis, and show a propensity to progress to acute myelogenous leukemia (AML). Although MDS are recognized as neoplastic diseases caused by genomic aberrations of hematopoietic cells, the details of the genetic abnormalities underlying disease development have not as yet been fully elucidated due to difficulties in analyzing chromosomal abnormalities. Recent advances in comprehensive analyses of disease genomes including whole-genome sequencing technologies have revealed the genomic abnormalities in MDS. Surprisingly, gene mutations were found in approximately 80-90% of cases with MDS, and the novel mutations discovered with these technologies included previously unknown, MDS-specific, mutations such as those of the genes in the RNA-splicing machinery. It is anticipated that these recent studies will shed new light on the pathophysiology of MDS due to genomic aberrations.

  13. Rules of engagement: perspectives on stakeholder engagement for genomic biobanking research in South Africa.

    Science.gov (United States)

    Staunton, Ciara; Tindana, Paulina; Hendricks, Melany; Moodley, Keymanthri

    2018-02-27

    Genomic biobanking research is undergoing exponential growth in Africa raising a host of legal, ethical and social issues. Given the scientific complexity associated with genomics, there is a growing recognition globally of the importance of science translation and community engagement (CE) for this type of research, as it creates the potential to build relationships, increase trust, improve consent processes and empower local communities. Despite this level of recognition, there is a lack of empirical evidence of the practise and processes for effective CE in genomic biobanking in Africa. To begin to address this vacuum, 17 in-depth face to face interviews were conducted with South African experts in genomic biobanking research and CE to provide insight into the process, benefits and challenges of CE in South Africa. Emerging themes were analysed using a contextualised thematic approach. Several themes emerged concerning the conduct of CE in genomic biobanking research in Africa. Although the literature tends to focus on the local community in CE, respondents in this study described three different layers of stakeholder engagement: community level, peer level and high level. Community level engagement includes potential participants, community advisory boards (CAB) and field workers; peer level engagement includes researchers, biobankers and scientists, while high level engagement includes government officials, funders and policy makers. Although education of each stakeholder layer is important, education of the community layer can be most challenging, due to the complexity of the research and educational levels of stakeholders in this layer. CE is time-consuming and often requires an interdisciplinary research team approach. However careful planning of the engagement strategy, including an understanding of the differing layers of stakeholder engagement, and the specific educational needs at each layer, can help in the development of a relationship based on trust

  14. Advances in Exercise, Fitness, and Performance Genomics in 2015.

    Science.gov (United States)

    Sarzynski, Mark A; Loos, Ruth J F; Lucia, Alejandro; Pérusse, Louis; Roth, Stephen M; Wolfarth, Bernd; Rankinen, Tuomo; Bouchard, Claude

    2016-10-01

    This review of the exercise genomics literature encompasses the highest-quality articles published in 2015 across seven broad topics: physical activity behavior, muscular strength and power, cardiorespiratory fitness and endurance performance, body weight and adiposity, insulin and glucose metabolism, lipid and lipoprotein metabolism, and hemodynamic traits. One study used a quantitative trait locus for wheel running in mice to identify single nucleotide polymorphisms (SNPs) in humans associated with physical activity levels. Two studies examined the association of candidate gene ACTN3 R577X genotype on muscular performance. Several studies examined gene-physical activity interactions on cardiometabolic traits. One study showed that physical inactivity exacerbated the body mass index (BMI)-increasing effect of an FTO SNP but only in individuals of European ancestry, whereas another showed that high-density lipoprotein cholesterol (HDL-C) SNPs from genome-wide association studies exerted a smaller effect in active individuals. Increased levels of moderate-to-vigorous-intensity physical activity were associated with higher Matsuda insulin sensitivity index in PPARG Ala12 carriers but not Pro12 homozygotes. One study combined genome-wide and transcriptome-wide profiling to identify genes and SNPs associated with the response of triglycerides (TG) to exercise training. The genome-wide association study results showed that four SNPs accounted for all of the heritability of △TG, whereas the baseline expression of 11 genes predicted 27% of △TG. A composite SNP score based on the top eight SNPs derived from the genomic and transcriptomic analyses was the strongest predictor of ΔTG, explaining 14% of the variance. The review concludes with a discussion of a conceptual framework defining some of the critical conditions for exercise genomics studies and highlights the importance of the recently launched National Institutes of Health Common Fund program titled "Molecular

  15. Comparative genomics using data mining tools

    Indian Academy of Sciences (India)

    We have analysed the genomes of representatives of three kingdoms of life, namely, archaea, eubacteria and eukaryota using data mining tools based on compositional analyses of the protein sequences. The representatives chosen in this analysis were Methanococcus jannaschii, Haemophilus influenzae and ...

  16. Supplementary Material for: Whole genome sequencing reveals genomic heterogeneity and antibiotic purification in Mycobacterium tuberculosis isolates

    KAUST Repository

    Black, PA

    2015-01-01

    Abstract Background Whole genome sequencing has revolutionised the interrogation of mycobacterial genomes. Recent studies have reported conflicting findings on the genomic stability of Mycobacterium tuberculosis during the evolution of drug resistance. In an age where whole genome sequencing is increasingly relied upon for defining the structure of bacterial genomes, it is important to investigate the reliability of next generation sequencing to identify clonal variants present in a minor percentage of the population. This study aimed to define a reliable cut-off for identification of low frequency sequence variants and to subsequently investigate genetic heterogeneity and the evolution of drug resistance in M. tuberculosis. Methods Genomic DNA was isolated from single colonies from 14 rifampicin mono-resistant M. tuberculosis isolates, as well as the primary cultures and follow up MDR cultures from two of these patients. The whole genomes of the M. tuberculosis isolates were sequenced using either the Illumina MiSeq or Illumina HiSeq platforms. Sequences were analysed with an in-house pipeline. Results Using next-generation sequencing in combination with Sanger sequencing and statistical analysis we defined a read frequency cut-off of 30 % to identify low frequency M. tuberculosis variants with high confidence. Using this cut-off we demonstrated a high rate of genetic diversity between single colonies isolated from one population, showing that by using the current sequencing technology, single colonies are not a true reflection of the genetic diversity within a whole population and vice versa. We further showed that numerous heterogeneous variants emerge and then disappear during the evolution of isoniazid resistance within individual patients. Our findings allowed us to formulate a model for the selective bottleneck which occurs during the course of infection, acting as a genomic purification event. Conclusions Our study demonstrated true levels of genetic

  17. The draft genome of Tibetan hulless barley reveals adaptive patterns to the high stressful Tibetan Plateau.

    Science.gov (United States)

    Zeng, Xingquan; Long, Hai; Wang, Zhuo; Zhao, Shancen; Tang, Yawei; Huang, Zhiyong; Wang, Yulin; Xu, Qijun; Mao, Likai; Deng, Guangbing; Yao, Xiaoming; Li, Xiangfeng; Bai, Lijun; Yuan, Hongjun; Pan, Zhifen; Liu, Renjian; Chen, Xin; WangMu, QiMei; Chen, Ming; Yu, Lili; Liang, Junjun; DunZhu, DaWa; Zheng, Yuan; Yu, Shuiyang; LuoBu, ZhaXi; Guang, Xuanmin; Li, Jiang; Deng, Cao; Hu, Wushu; Chen, Chunhai; TaBa, XiongNu; Gao, Liyun; Lv, Xiaodan; Abu, Yuval Ben; Fang, Xiaodong; Nevo, Eviatar; Yu, Maoqun; Wang, Jun; Tashi, Nyima

    2015-01-27

    The Tibetan hulless barley (Hordeum vulgare L. var. nudum), also called "Qingke" in Chinese and "Ne" in Tibetan, is the staple food for Tibetans and an important livestock feed in the Tibetan Plateau. The diploid nature and adaptation to diverse environments of the highland give it unique resources for genetic research and crop improvement. Here we produced a 3.89-Gb draft assembly of Tibetan hulless barley with 36,151 predicted protein-coding genes. Comparative analyses revealed the divergence times and synteny between barley and other representative Poaceae genomes. The expansion of the gene family related to stress responses was found in Tibetan hulless barley. Resequencing of 10 barley accessions uncovered high levels of genetic variation in Tibetan wild barley and genetic divergence between Tibetan and non-Tibetan barley genomes. Selective sweep analyses demonstrate adaptive correlations of genes under selection with extensive environmental variables. Our results not only construct a genomic framework for crop improvement but also provide evolutionary insights of highland adaptation of Tibetan hulless barley.

  18. [Sequencing and analysis of complete genome of rabies viruses isolated from Chinese Ferret-Badger and dog in Zhejiang province].

    Science.gov (United States)

    Lei, Yong-Liang; Wang, Xiao-Guang; Tao, Xiao-Yan; Li, Hao; Meng, Sheng-Li; Chen, Xiu-Ying; Liu, Fu-Ming; Ye, Bi-Feng; Tang, Qing

    2010-01-01

    Based on sequencing the full-length genomes of four Chinese Ferret-Badger and dog, we analyze the properties of rabies viruses genetic variation in molecular level, get the information about rabies viruses prevalence and variation in Zhejiang, and enrich the genome database of rabies viruses street strains isolated from China. Rabies viruses in suckling mice were isolated, overlapped fragments were amplified by RT-PCR and full-length genomes were assembled to analyze the nucleotide and deduced protein similarities and phylogenetic analyses from Chinese Ferret-Badger, dog, sika deer, vole, used vaccine strain were determined. The four full-length genomes were sequenced completely and had the same genetic structure with the length of 11, 923 nts or 11, 925 nts including 58 nts-Leader, 1353 nts-NP, 894 nts-PP, 609 nts-MP, 1575 nts-GP, 6386 nts-LP, and 2, 5, 5 nts- intergenic regions(IGRs), 423 nts-Pseudogene-like sequence (psi), 70 nts-Trailer. The four full-length genomes were in accordance with the properties of Rhabdoviridae Lyssa virus by BLAST and multi-sequence alignment. The nucleotide and amino acid sequences among Chinese strains had the highest similarity, especially among animals of the same species. Of the four full-length genomes, the similarity in amino acid level was dramatically higher than that in nucleotide level, so the nucleotide mutations happened in these four genomes were most synonymous mutations. Compared with the reference rabies viruses, the lengths of the five protein coding regions had no change, no recombination, only with a few point mutations. It was evident that the five proteins appeared to be stable. The variation sites and types of the four genomes were similar to the reference vaccine or street strains. And the four strains were genotype 1 according to the multi-sequence and phylogenetic analyses, which possessed the distinct district characteristics of China. Therefore, these four rabies viruses are likely to be street viruses

  19. Improved genomic resources and new bioinformatic workflow for the carcinogenic parasite Clonorchis sinensis: Biotechnological implications.

    Science.gov (United States)

    Wang, Daxi; Korhonen, Pasi K; Gasser, Robin B; Young, Neil D

    Clonorchis sinensis (family Opisthorchiidae) is an important foodborne parasite that has a major socioeconomic impact on ~35 million people predominantly in China, Vietnam, Korea and the Russian Far East. In humans, infection with C. sinensis causes clonorchiasis, a complex hepatobiliary disease that can induce cholangiocarcinoma (CCA), a malignant cancer of the bile ducts. Central to understanding the epidemiology of this disease is knowledge of genetic variation within and among populations of this parasite. Although most published molecular studies seem to suggest that C. sinensis represents a single species, evidence of karyotypic variation within C. sinensis and cryptic species within a related opisthorchiid fluke (Opisthorchis viverrini) emphasise the importance of studying and comparing the genes and genomes of geographically distinct isolates of C. sinensis. Recently, we sequenced, assembled and characterised a draft nuclear genome of a C. sinensis isolate from Korea and compared it with a published draft genome of a Chinese isolate of this species using a bioinformatic workflow established for comparing draft genome assemblies and their gene annotations. We identified that 50.6% and 51.3% of the Korean and Chinese C. sinensis genomic scaffolds were syntenic, respectively. Within aligned syntenic blocks, the genomes had a high level of nucleotide identity (99.1%) and encoded 15 variable proteins likely to be involved in diverse biological processes. Here, we review current technical challenges of using draft genome assemblies to undertake comparative genomic analyses to quantify genetic variation between isolates of the same species. Using a workflow that overcomes these challenges, we report on a high-quality draft genome for C. sinensis from Korea and comparative genomic analyses, as a basis for future investigations of the genetic structures of C. sinensis populations, and discuss the biotechnological implications of these explorations. Copyright © 2018

  20. Functional Insights from Structural Genomics

    Energy Technology Data Exchange (ETDEWEB)

    Forouhar,F.; Kuzin, A.; Seetharaman, J.; Lee, I.; Zhou, W.; Abashidze, M.; Chen, Y.; Montelione, G.; Tong, L.; et al

    2007-01-01

    Structural genomics efforts have produced structural information, either directly or by modeling, for thousands of proteins over the past few years. While many of these proteins have known functions, a large percentage of them have not been characterized at the functional level. The structural information has provided valuable functional insights on some of these proteins, through careful structural analyses, serendipity, and structure-guided functional screening. Some of the success stories based on structures solved at the Northeast Structural Genomics Consortium (NESG) are reported here. These include a novel methyl salicylate esterase with important role in plant innate immunity, a novel RNA methyltransferase (H. influenzae yggJ (HI0303)), a novel spermidine/spermine N-acetyltransferase (B. subtilis PaiA), a novel methyltransferase or AdoMet binding protein (A. fulgidus AF{_}0241), an ATP:cob(I)alamin adenosyltransferase (B. subtilis YvqK), a novel carboxysome pore (E. coli EutN), a proline racemase homolog with a disrupted active site (B. melitensis BME11586), an FMN-dependent enzyme (S. pneumoniae SP{_}1951), and a 12-stranded {beta}-barrel with a novel fold (V. parahaemolyticus VPA1032).

  1. Genome sequences and comparative genomics of two Lactobacillus ruminis strains from the bovine and human intestinal tracts

    LENUS (Irish Health Repository)

    2011-08-30

    Abstract Background The genus Lactobacillus is characterized by an extraordinary degree of phenotypic and genotypic diversity, which recent genomic analyses have further highlighted. However, the choice of species for sequencing has been non-random and unequal in distribution, with only a single representative genome from the L. salivarius clade available to date. Furthermore, there is no data to facilitate a functional genomic analysis of motility in the lactobacilli, a trait that is restricted to the L. salivarius clade. Results The 2.06 Mb genome of the bovine isolate Lactobacillus ruminis ATCC 27782 comprises a single circular chromosome, and has a G+C content of 44.4%. In silico analysis identified 1901 coding sequences, including genes for a pediocin-like bacteriocin, a single large exopolysaccharide-related cluster, two sortase enzymes, two CRISPR loci and numerous IS elements and pseudogenes. A cluster of genes related to a putative pilin was identified, and shown to be transcribed in vitro. A high quality draft assembly of the genome of a second L. ruminis strain, ATCC 25644 isolated from humans, suggested a slightly larger genome of 2.138 Mb, that exhibited a high degree of synteny with the ATCC 27782 genome. In contrast, comparative analysis of L. ruminis and L. salivarius identified a lack of long-range synteny between these closely related species. Comparison of the L. salivarius clade core proteins with those of nine other Lactobacillus species distributed across 4 major phylogenetic groups identified the set of shared proteins, and proteins unique to each group. Conclusions The genome of L. ruminis provides a comparative tool for directing functional analyses of other members of the L. salivarius clade, and it increases understanding of the divergence of this distinct Lactobacillus lineage from other commensal lactobacilli. The genome sequence provides a definitive resource to facilitate investigation of the genetics, biochemistry and host

  2. Fungal Genomics Program

    Energy Technology Data Exchange (ETDEWEB)

    Grigoriev, Igor

    2012-03-12

    The JGI Fungal Genomics Program aims to scale up sequencing and analysis of fungal genomes to explore the diversity of fungi important for energy and the environment, and to promote functional studies on a system level. Combining new sequencing technologies and comparative genomics tools, JGI is now leading the world in fungal genome sequencing and analysis. Over 120 sequenced fungal genomes with analytical tools are available via MycoCosm (www.jgi.doe.gov/fungi), a web-portal for fungal biologists. Our model of interacting with user communities, unique among other sequencing centers, helps organize these communities, improves genome annotation and analysis work, and facilitates new larger-scale genomic projects. This resulted in 20 high-profile papers published in 2011 alone and contributing to the Genomics Encyclopedia of Fungi, which targets fungi related to plant health (symbionts, pathogens, and biocontrol agents) and biorefinery processes (cellulose degradation, sugar fermentation, industrial hosts). Our next grand challenges include larger scale exploration of fungal diversity (1000 fungal genomes), developing molecular tools for DOE-relevant model organisms, and analysis of complex systems and metagenomes.

  3. Pedigree and genomic analyses of feed consumption and residual feed intake in laying hens.

    Science.gov (United States)

    Wolc, Anna; Arango, Jesus; Jankowski, Tomasz; Settar, Petek; Fulton, Janet E; O'Sullivan, Neil P; Fernando, Rohan; Garrick, Dorian J; Dekkers, Jack C M

    2013-09-01

    Efficiency of production is increasingly important with the current escalation of feed costs and demands to minimize the environmental footprint. The objectives of this study were 1) to estimate heritabilities for daily feed consumption and residual feed intake and their genetic correlations with production and egg-quality traits; 2) to evaluate accuracies of estimated breeding values from pedigree- and marker-based prediction models; and 3) to localize genomic regions associated with feed efficiency in a brown egg layer line. Individual feed intake data collected over 2-wk trial periods were available for approximately 6,000 birds from 8 generations. Genetic parameters were estimated with a multitrait animal model; methods BayesB and BayesCπ were used to estimate marker effects and find genomic regions associated with feed efficiency. Using pedigree information, feed efficiency was found to be moderately heritable (h(2) = 0.46 for daily feed consumption and 0.47 for residual feed intake). Hens that consumed more feed and had greater residual feed intake (lower efficiency) had a genetic tendency to lay slightly more eggs with greater yolk weights and albumen heights. Regions on chromosomes 1, 2, 4, 7, 13, and Z were found to be associated with feed intake and efficiency. The accuracy from genomic prediction was higher and more persistent (better maintained across generations) than that from pedigree-based prediction. These results indicate that genomic selection can be used to improve feed efficiency in layers.

  4. Genome of the Netherlands population-specific imputations identify an ABCA6 variant associated with cholesterol levels

    DEFF Research Database (Denmark)

    Van Leeuwen, Elisabeth M.; Karssen, Lennart C.; Deelen, Joris

    2015-01-01

    created by the Genome of the Netherlands Project and perform association testing with blood lipid levels. We report the discovery of five novel associations at four loci (P value -4), including a rare missense variant in ABCA6 (rs77542162, p.Cys1359Arg, frequency 0.034), which is predicted...

  5. Multiple Genome Sequences of Lactobacillus plantarum Strains

    OpenAIRE

    Kafka, Thomas A.; Geissler, Andreas J.; Vogel, Rudi F.

    2017-01-01

    ABSTRACT We report here the genome sequences of four Lactobacillus plantarum strains which vary in surface hydrophobicity. Bioinformatic analysis, using additional genomes of Lactobacillus plantarum strains, revealed a possible correlation between the cell wall teichoic acid-type and cell surface hydrophobicity and provide the basis for consecutive analyses.

  6. Comparative mitogenomics of Braconidae (Insecta: Hymenoptera) and the phylogenetic utility of mitochondrial genomes with special reference to Holometabolous insects

    Science.gov (United States)

    2010-01-01

    Background Animal mitochondrial genomes are potential models for molecular evolution and markers for phylogenetic and population studies. Previous research has shown interesting features in hymenopteran mitochondrial genomes. Here, we conducted a comparative study of mitochondrial genomes of the family Braconidae, one of the largest families of Hymenoptera, and assessed the utility of mitochondrial genomic data for phylogenetic inference at three different hierarchical levels, i.e., Braconidae, Hymenoptera, and Holometabola. Results Seven mitochondrial genomes from seven subfamilies of Braconidae were sequenced. Three of the four sequenced A+T-rich regions are shown to be inverted. Furthermore, all species showed reversal of strand asymmetry, suggesting that inversion of the A+T-rich region might be a synapomorphy of the Braconidae. Gene rearrangement events occurred in all braconid species, but gene rearrangement rates were not taxonomically correlated. Most rearranged genes were tRNAs, except those of Cotesia vestalis, in which 13 protein-coding genes and 14 tRNA genes changed positions or/and directions through three kinds of gene rearrangement events. Remote inversion is posited to be the result of two independent recombination events. Evolutionary rates were lower in species of the cyclostome group than those of noncyclostomes. Phylogenetic analyses based on complete mitochondrial genomes and secondary structure of rrnS supported a sister-group relationship between Aphidiinae and cyclostomes. Many well accepted relationships within Hymenoptera, such as paraphyly of Symphyta and Evaniomorpha, a sister-group relationship between Orussoidea and Apocrita, and monophyly of Proctotrupomorpha, Ichneumonoidea and Aculeata were robustly confirmed. New hypotheses, such as a sister-group relationship between Evanioidea and Aculeata, were generated. Among holometabolous insects, Hymenoptera was shown to be the sister to all other orders. Mecoptera was recovered as the

  7. Comparative mitogenomics of Braconidae (Insecta: Hymenoptera and the phylogenetic utility of mitochondrial genomes with special reference to Holometabolous insects

    Directory of Open Access Journals (Sweden)

    Shi Min

    2010-06-01

    Full Text Available Abstract Background Animal mitochondrial genomes are potential models for molecular evolution and markers for phylogenetic and population studies. Previous research has shown interesting features in hymenopteran mitochondrial genomes. Here, we conducted a comparative study of mitochondrial genomes of the family Braconidae, one of the largest families of Hymenoptera, and assessed the utility of mitochondrial genomic data for phylogenetic inference at three different hierarchical levels, i.e., Braconidae, Hymenoptera, and Holometabola. Results Seven mitochondrial genomes from seven subfamilies of Braconidae were sequenced. Three of the four sequenced A+T-rich regions are shown to be inverted. Furthermore, all species showed reversal of strand asymmetry, suggesting that inversion of the A+T-rich region might be a synapomorphy of the Braconidae. Gene rearrangement events occurred in all braconid species, but gene rearrangement rates were not taxonomically correlated. Most rearranged genes were tRNAs, except those of Cotesia vestalis, in which 13 protein-coding genes and 14 tRNA genes changed positions or/and directions through three kinds of gene rearrangement events. Remote inversion is posited to be the result of two independent recombination events. Evolutionary rates were lower in species of the cyclostome group than those of noncyclostomes. Phylogenetic analyses based on complete mitochondrial genomes and secondary structure of rrnS supported a sister-group relationship between Aphidiinae and cyclostomes. Many well accepted relationships within Hymenoptera, such as paraphyly of Symphyta and Evaniomorpha, a sister-group relationship between Orussoidea and Apocrita, and monophyly of Proctotrupomorpha, Ichneumonoidea and Aculeata were robustly confirmed. New hypotheses, such as a sister-group relationship between Evanioidea and Aculeata, were generated. Among holometabolous insects, Hymenoptera was shown to be the sister to all other orders

  8. Proxy decision making and dementia: Using Construal Level Theory to analyse the thoughts of decision makers.

    Science.gov (United States)

    Convey, Helen; Holt, Janet; Summers, Barbara

    2018-03-08

    This study explored the feasibility of using Construal Level Theory to analyse proxy decision maker thinking about a hypothetical ethical dilemma, relating to a person who has dementia. Proxy decision makers make decisions on behalf of individuals who are living with dementia when dementia affects that individual's decision making ability. Ethical dilemmas arise because there is a need to balance the individual's past and contemporary values and views. Understanding of how proxy decision makers respond is incomplete. Construal Level Theory contends that individuals imagine reactions and make predications about the future by crossing psychological distance. This involves abstract thinking, giving meaning to decisions. There is no empirical evidence of Construal Level Theory being used to analyse proxy decision maker thinking. Exploring the feasibility of using Construal Level Theory to understand dementia carer thinking regarding proxy decisions may provide insights which inform the support given. Descriptive qualitative research with semi-structured interviews. Seven participants were interviewed using a hypothetical dementia care scenario in February 2016. Interview transcripts were analysed for themes. Construal Level Theory was applied to analyse participant responses within themes using the Linguistic Category Model. Participants travelled across psychological distance, using abstract thinking to clarify goals and provide a basis for decisions. When thinking concretely participants established boundaries regarding the ethical dilemma. Construal Level Theory gives insight into proxy decision maker thinking and the levels of abstraction used. Understanding what dementia carers think about when making proxy decisions may help nurses to understand their perspectives and to provide appropriate support. © 2018 John Wiley & Sons Ltd.

  9. Draft Genome Sequences of Two Virulent Serotypes of Avian Pasteurella multocida

    Science.gov (United States)

    Abrahante, Juan E.; Johnson, Timothy J.; Hunter, Samuel S.; Maheswaran, Samuel K.; Hauglund, Melissa J.; Bayles, Darrell O.; Tatum, Fred M.

    2013-01-01

    Here we report the draft genome sequences of two virulent avian strains of Pasteurella multocida. Comparative analyses of these genomes were done with the published genome sequence of avirulent P. multocida strain Pm70. PMID:23405337

  10. ClusTrack: feature extraction and similarity measures for clustering of genome-wide data sets.

    Directory of Open Access Journals (Sweden)

    Halfdan Rydbeck

    Full Text Available Clustering is a popular technique for explorative analysis of data, as it can reveal subgroupings and similarities between data in an unsupervised manner. While clustering is routinely applied to gene expression data, there is a lack of appropriate general methodology for clustering of sequence-level genomic and epigenomic data, e.g. ChIP-based data. We here introduce a general methodology for clustering data sets of coordinates relative to a genome assembly, i.e. genomic tracks. By defining appropriate feature extraction approaches and similarity measures, we allow biologically meaningful clustering to be performed for genomic tracks using standard clustering algorithms. An implementation of the methodology is provided through a tool, ClusTrack, which allows fine-tuned clustering analyses to be specified through a web-based interface. We apply our methods to the clustering of occupancy of the H3K4me1 histone modification in samples from a range of different cell types. The majority of samples form meaningful subclusters, confirming that the definitions of features and similarity capture biological, rather than technical, variation between the genomic tracks. Input data and results are available, and can be reproduced, through a Galaxy Pages document at http://hyperbrowser.uio.no/hb/u/hb-superuser/p/clustrack. The clustering functionality is available as a Galaxy tool, under the menu option "Specialized analyzis of tracks", and the submenu option "Cluster tracks based on genome level similarity", at the Genomic HyperBrowser server: http://hyperbrowser.uio.no/hb/.

  11. The Banana Genome Hub

    Science.gov (United States)

    Droc, Gaëtan; Larivière, Delphine; Guignon, Valentin; Yahiaoui, Nabila; This, Dominique; Garsmeur, Olivier; Dereeper, Alexis; Hamelin, Chantal; Argout, Xavier; Dufayard, Jean-François; Lengelle, Juliette; Baurens, Franc-Christophe; Cenci, Alberto; Pitollat, Bertrand; D’Hont, Angélique; Ruiz, Manuel; Rouard, Mathieu; Bocs, Stéphanie

    2013-01-01

    Banana is one of the world’s favorite fruits and one of the most important crops for developing countries. The banana reference genome sequence (Musa acuminata) was recently released. Given the taxonomic position of Musa, the completed genomic sequence has particular comparative value to provide fresh insights about the evolution of the monocotyledons. The study of the banana genome has been enhanced by a number of tools and resources that allows harnessing its sequence. First, we set up essential tools such as a Community Annotation System, phylogenomics resources and metabolic pathways. Then, to support post-genomic efforts, we improved banana existing systems (e.g. web front end, query builder), we integrated available Musa data into generic systems (e.g. markers and genetic maps, synteny blocks), we have made interoperable with the banana hub, other existing systems containing Musa data (e.g. transcriptomics, rice reference genome, workflow manager) and finally, we generated new results from sequence analyses (e.g. SNP and polymorphism analysis). Several uses cases illustrate how the Banana Genome Hub can be used to study gene families. Overall, with this collaborative effort, we discuss the importance of the interoperability toward data integration between existing information systems. Database URL: http://banana-genome.cirad.fr/ PMID:23707967

  12. A genome-wide association study reveals variants in ARL15 that influence adiponectin levels.

    Directory of Open Access Journals (Sweden)

    J Brent Richards

    2009-12-01

    Full Text Available The adipocyte-derived protein adiponectin is highly heritable and inversely associated with risk of type 2 diabetes mellitus (T2D and coronary heart disease (CHD. We meta-analyzed 3 genome-wide association studies for circulating adiponectin levels (n = 8,531 and sought validation of the lead single nucleotide polymorphisms (SNPs in 5 additional cohorts (n = 6,202. Five SNPs were genome-wide significant in their relationship with adiponectin (P< or =5x10(-8. We then tested whether these 5 SNPs were associated with risk of T2D and CHD using a Bonferroni-corrected threshold of P< or =0.011 to declare statistical significance for these disease associations. SNPs at the adiponectin-encoding ADIPOQ locus demonstrated the strongest associations with adiponectin levels (P-combined = 9.2x10(-19 for lead SNP, rs266717, n = 14,733. A novel variant in the ARL15 (ADP-ribosylation factor-like 15 gene was associated with lower circulating levels of adiponectin (rs4311394-G, P-combined = 2.9x10(-8, n = 14,733. This same risk allele at ARL15 was also associated with a higher risk of CHD (odds ratio [OR] = 1.12, P = 8.5x10(-6, n = 22,421 more nominally, an increased risk of T2D (OR = 1.11, P = 3.2x10(-3, n = 10,128, and several metabolic traits. Expression studies in humans indicated that ARL15 is well-expressed in skeletal muscle. These findings identify a novel protein, ARL15, which influences circulating adiponectin levels and may impact upon CHD risk.

  13. Whole Genome Analyses of a Well-Differentiated Liposarcoma Reveals Novel SYT1 and DDR2 Rearrangements

    Science.gov (United States)

    Egan, Jan B.; Barrett, Michael T.; Champion, Mia D.; Middha, Sumit; Lenkiewicz, Elizabeth; Evers, Lisa; Francis, Princy; Schmidt, Jessica; Shi, Chang-Xin; Van Wier, Scott; Badar, Sandra; Ahmann, Gregory; Kortuem, K. Martin; Boczek, Nicole J.; Fonseca, Rafael; Craig, David W.; Carpten, John D.; Borad, Mitesh J.; Stewart, A. Keith

    2014-01-01

    Liposarcoma is the most common soft tissue sarcoma, but little is known about the genomic basis of this disease. Given the low cell content of this tumor type, we utilized flow cytometry to isolate the diploid normal and aneuploid tumor populations from a well-differentiated liposarcoma prior to array comparative genomic hybridization and whole genome sequencing. This work revealed massive highly focal amplifications throughout the aneuploid tumor genome including MDM2, a gene that has previously been found to be amplified in well-differentiated liposarcoma. Structural analysis revealed massive rearrangement of chromosome 12 and 11 gene fusions, some of which may be part of double minute chromosomes commonly present in well-differentiated liposarcoma. We identified a hotspot of genomic instability localized to a region of chromosome 12 that includes a highly conserved, putative L1 retrotransposon element, LOC100507498 which resides within a gene cluster (NAV3, SYT1, PAWR) where 6 of the 11 fusion events occurred. Interestingly, a potential gene fusion was also identified in amplified DDR2, which is a potential therapeutic target of kinase inhibitors such as dastinib, that are not routinely used in the treatment of patients with liposarcoma. Furthermore, 7 somatic, damaging single nucleotide variants have also been identified, including D125N in the PTPRQ protein. In conclusion, this work is the first to report the entire genome of a well-differentiated liposarcoma with novel chromosomal rearrangements associated with amplification of therapeutically targetable genes such as MDM2 and DDR2. PMID:24505276

  14. Whole genome analyses of a well-differentiated liposarcoma reveals novel SYT1 and DDR2 rearrangements.

    Directory of Open Access Journals (Sweden)

    Jan B Egan

    Full Text Available Liposarcoma is the most common soft tissue sarcoma, but little is known about the genomic basis of this disease. Given the low cell content of this tumor type, we utilized flow cytometry to isolate the diploid normal and aneuploid tumor populations from a well-differentiated liposarcoma prior to array comparative genomic hybridization and whole genome sequencing. This work revealed massive highly focal amplifications throughout the aneuploid tumor genome including MDM2, a gene that has previously been found to be amplified in well-differentiated liposarcoma. Structural analysis revealed massive rearrangement of chromosome 12 and 11 gene fusions, some of which may be part of double minute chromosomes commonly present in well-differentiated liposarcoma. We identified a hotspot of genomic instability localized to a region of chromosome 12 that includes a highly conserved, putative L1 retrotransposon element, LOC100507498 which resides within a gene cluster (NAV3, SYT1, PAWR where 6 of the 11 fusion events occurred. Interestingly, a potential gene fusion was also identified in amplified DDR2, which is a potential therapeutic target of kinase inhibitors such as dastinib, that are not routinely used in the treatment of patients with liposarcoma. Furthermore, 7 somatic, damaging single nucleotide variants have also been identified, including D125N in the PTPRQ protein. In conclusion, this work is the first to report the entire genome of a well-differentiated liposarcoma with novel chromosomal rearrangements associated with amplification of therapeutically targetable genes such as MDM2 and DDR2.

  15. The challenges of genome-wide interaction studies: lessons to learn from the analysis of HDL blood levels.

    Directory of Open Access Journals (Sweden)

    Elisabeth M van Leeuwen

    Full Text Available Genome-wide association studies (GWAS have revealed 74 single nucleotide polymorphisms (SNPs associated with high-density lipoprotein cholesterol (HDL blood levels. This study is, to our knowledge, the first genome-wide interaction study (GWIS to identify SNP×SNP interactions associated with HDL levels. We performed a GWIS in the Rotterdam Study (RS cohort I (RS-I using the GLIDE tool which leverages the massively parallel computing power of Graphics Processing Units (GPUs to perform linear regression on all genome-wide pairs of SNPs. By performing a meta-analysis together with Rotterdam Study cohorts II and III (RS-II and RS-III, we were able to filter 181 interaction terms with a p-value<1 · 10-8 that replicated in the two independent cohorts. We were not able to replicate any of these interaction term in the AGES, ARIC, CHS, ERF, FHS and NFBC-66 cohorts (Ntotal = 30,011 when adjusting for multiple testing. Our GWIS resulted in the consistent finding of a possible interaction between rs774801 in ARMC8 (ENSG00000114098 and rs12442098 in SPATA8 (ENSG00000185594 being associated with HDL levels. However, p-values do not reach the preset Bonferroni correction of the p-values. Our study suggest that even for highly genetically determined traits such as HDL the sample sizes needed to detect SNP×SNP interactions are large and the 2-step filtering approaches do not yield a solution. Here we present our analysis plan and our reservations concerning GWIS.

  16. Genomic analyses identify molecular subtypes of pancreatic cancer.

    Science.gov (United States)

    Bailey, Peter; Chang, David K; Nones, Katia; Johns, Amber L; Patch, Ann-Marie; Gingras, Marie-Claude; Miller, David K; Christ, Angelika N; Bruxner, Tim J C; Quinn, Michael C; Nourse, Craig; Murtaugh, L Charles; Harliwong, Ivon; Idrisoglu, Senel; Manning, Suzanne; Nourbakhsh, Ehsan; Wani, Shivangi; Fink, Lynn; Holmes, Oliver; Chin, Venessa; Anderson, Matthew J; Kazakoff, Stephen; Leonard, Conrad; Newell, Felicity; Waddell, Nick; Wood, Scott; Xu, Qinying; Wilson, Peter J; Cloonan, Nicole; Kassahn, Karin S; Taylor, Darrin; Quek, Kelly; Robertson, Alan; Pantano, Lorena; Mincarelli, Laura; Sanchez, Luis N; Evers, Lisa; Wu, Jianmin; Pinese, Mark; Cowley, Mark J; Jones, Marc D; Colvin, Emily K; Nagrial, Adnan M; Humphrey, Emily S; Chantrill, Lorraine A; Mawson, Amanda; Humphris, Jeremy; Chou, Angela; Pajic, Marina; Scarlett, Christopher J; Pinho, Andreia V; Giry-Laterriere, Marc; Rooman, Ilse; Samra, Jaswinder S; Kench, James G; Lovell, Jessica A; Merrett, Neil D; Toon, Christopher W; Epari, Krishna; Nguyen, Nam Q; Barbour, Andrew; Zeps, Nikolajs; Moran-Jones, Kim; Jamieson, Nigel B; Graham, Janet S; Duthie, Fraser; Oien, Karin; Hair, Jane; Grützmann, Robert; Maitra, Anirban; Iacobuzio-Donahue, Christine A; Wolfgang, Christopher L; Morgan, Richard A; Lawlor, Rita T; Corbo, Vincenzo; Bassi, Claudio; Rusev, Borislav; Capelli, Paola; Salvia, Roberto; Tortora, Giampaolo; Mukhopadhyay, Debabrata; Petersen, Gloria M; Munzy, Donna M; Fisher, William E; Karim, Saadia A; Eshleman, James R; Hruban, Ralph H; Pilarsky, Christian; Morton, Jennifer P; Sansom, Owen J; Scarpa, Aldo; Musgrove, Elizabeth A; Bailey, Ulla-Maja Hagbo; Hofmann, Oliver; Sutherland, Robert L; Wheeler, David A; Gill, Anthony J; Gibbs, Richard A; Pearson, John V; Waddell, Nicola; Biankin, Andrew V; Grimmond, Sean M

    2016-03-03

    Integrated genomic analysis of 456 pancreatic ductal adenocarcinomas identified 32 recurrently mutated genes that aggregate into 10 pathways: KRAS, TGF-β, WNT, NOTCH, ROBO/SLIT signalling, G1/S transition, SWI-SNF, chromatin modification, DNA repair and RNA processing. Expression analysis defined 4 subtypes: (1) squamous; (2) pancreatic progenitor; (3) immunogenic; and (4) aberrantly differentiated endocrine exocrine (ADEX) that correlate with histopathological characteristics. Squamous tumours are enriched for TP53 and KDM6A mutations, upregulation of the TP63∆N transcriptional network, hypermethylation of pancreatic endodermal cell-fate determining genes and have a poor prognosis. Pancreatic progenitor tumours preferentially express genes involved in early pancreatic development (FOXA2/3, PDX1 and MNX1). ADEX tumours displayed upregulation of genes that regulate networks involved in KRAS activation, exocrine (NR5A2 and RBPJL), and endocrine differentiation (NEUROD1 and NKX2-2). Immunogenic tumours contained upregulated immune networks including pathways involved in acquired immune suppression. These data infer differences in the molecular evolution of pancreatic cancer subtypes and identify opportunities for therapeutic development.

  17. Whole-genome sequence of the Tibetan frog Nanorana parkeri and the comparative evolution of tetrapod genomes.

    Science.gov (United States)

    Sun, Yan-Bo; Xiong, Zi-Jun; Xiang, Xue-Yan; Liu, Shi-Ping; Zhou, Wei-Wei; Tu, Xiao-Long; Zhong, Li; Wang, Lu; Wu, Dong-Dong; Zhang, Bao-Lin; Zhu, Chun-Ling; Yang, Min-Min; Chen, Hong-Man; Li, Fang; Zhou, Long; Feng, Shao-Hong; Huang, Chao; Zhang, Guo-Jie; Irwin, David; Hillis, David M; Murphy, Robert W; Yang, Huan-Ming; Che, Jing; Wang, Jun; Zhang, Ya-Ping

    2015-03-17

    The development of efficient sequencing techniques has resulted in large numbers of genomes being available for evolutionary studies. However, only one genome is available for all amphibians, that of Xenopus tropicalis, which is distantly related from the majority of frogs. More than 96% of frogs belong to the Neobatrachia, and no genome exists for this group. This dearth of amphibian genomes greatly restricts genomic studies of amphibians and, more generally, our understanding of tetrapod genome evolution. To fill this gap, we provide the de novo genome of a Tibetan Plateau frog, Nanorana parkeri, and compare it to that of X. tropicalis and other vertebrates. This genome encodes more than 20,000 protein-coding genes, a number similar to that of Xenopus. Although the genome size of Nanorana is considerably larger than that of Xenopus (2.3 vs. 1.5 Gb), most of the difference is due to the respective number of transposable elements in the two genomes. The two frogs exhibit considerable conserved whole-genome synteny despite having diverged approximately 266 Ma, indicating a slow rate of DNA structural evolution in anurans. Multigenome synteny blocks further show that amphibians have fewer interchromosomal rearrangements than mammals but have a comparable rate of intrachromosomal rearrangements. Our analysis also identifies 11 Mb of anuran-specific highly conserved elements that will be useful for comparative genomic analyses of frogs. The Nanorana genome offers an improved understanding of evolution of tetrapod genomes and also provides a genomic reference for other evolutionary studies.

  18. Meta-analysis of human genome-microbiome association studies: the MiBioGen consortium initiative.

    Science.gov (United States)

    Wang, Jun; Kurilshikov, Alexander; Radjabzadeh, Djawad; Turpin, Williams; Croitoru, Kenneth; Bonder, Marc Jan; Jackson, Matthew A; Medina-Gomez, Carolina; Frost, Fabian; Homuth, Georg; Rühlemann, Malte; Hughes, David; Kim, Han-Na; Spector, Tim D; Bell, Jordana T; Steves, Claire J; Timpson, Nicolas; Franke, Andre; Wijmenga, Cisca; Meyer, Katie; Kacprowski, Tim; Franke, Lude; Paterson, Andrew D; Raes, Jeroen; Kraaij, Robert; Zhernakova, Alexandra

    2018-06-08

    In recent years, human microbiota, especially gut microbiota, have emerged as an important yet complex trait influencing human metabolism, immunology, and diseases. Many studies are investigating the forces underlying the observed variation, including the human genetic variants that shape human microbiota. Several preliminary genome-wide association studies (GWAS) have been completed, but more are necessary to achieve a fuller picture. Here, we announce the MiBioGen consortium initiative, which has assembled 18 population-level cohorts and some 19,000 participants. Its aim is to generate new knowledge for the rapidly developing field of microbiota research. Each cohort has surveyed the gut microbiome via 16S rRNA sequencing and genotyped their participants with full-genome SNP arrays. We have standardized the analytical pipelines for both the microbiota phenotypes and genotypes, and all the data have been processed using identical approaches. Our analysis of microbiome composition shows that we can reduce the potential artifacts introduced by technical differences in generating microbiota data. We are now in the process of benchmarking the association tests and performing meta-analyses of genome-wide associations. All pipeline and summary statistics results will be shared using public data repositories. We present the largest consortium to date devoted to microbiota-GWAS. We have adapted our analytical pipelines to suit multi-cohort analyses and expect to gain insight into host-microbiota cross-talk at the genome-wide level. And, as an open consortium, we invite more cohorts to join us (by contacting one of the corresponding authors) and to follow the analytical pipeline we have developed.

  19. Genome-wide Analyses Identify KIF5A as a Novel ALS Gene

    NARCIS (Netherlands)

    Nicolas, Aude; Kenna, Kevin P.; Renton, Alan E.; Ticozzi, Nicola; Faghri, Faraz; Chia, Ruth; Dominov, Janice A.; Kenna, Brendan J.; Nalls, Mike A.; Keagle, Pamela; Rivera, Alberto M.; van Rheenen, Wouter; Murphy, Natalie A.; van Vugt, Joke J.F.A.; Geiger, Joshua T.; van der Spek, Rick; Pliner, Hannah A.; Smith, Bradley N.; Marangi, Giuseppe; Topp, Simon D.; Abramzon, Yevgeniya; Gkazi, Athina Soragia; Eicher, John D.; Kenna, Aoife; Logullo, Francesco O.; Simone, Isabella L.; Logroscino, Giancarlo; Salvi, Fabrizio; Bartolomei, Ilaria; Borghero, Giuseppe; Murru, Maria Rita; Costantino, Emanuela; Pani, Carla; Puddu, Roberta; Caredda, Carla; Piras, Valeria; Tranquilli, Stefania; Cuccu, Stefania; Corongiu, Daniela; Melis, Maurizio; Milia, Antonio; Marrosu, Francesco; Marrosu, Maria Giovanna; Floris, Gianluca; Cannas, Antonino; Capasso, Margherita; Caponnetto, Claudia; Mancardi, Gianluigi; Origone, Paola; Mandich, Paola; Conforti, Francesca L.; Cavallaro, Sebastiano; Mora, Gabriele; Marinou, Kalliopi; Sideri, Riccardo; Penco, Silvana; Mosca, Lorena; Lunetta, Christian; Pinter, Giuseppe Lauria; Corbo, Massimo; Riva, Nilo; Carrera, Paola; Volanti, Paolo; Mandrioli, Jessica; Fini, Nicola; Fasano, Antonio; Tremolizzo, Lucio; Arosio, Alessandro; Ferrarese, Carlo; Trojsi, Francesca; Tedeschi, Gioacchino; Monsurrò, Maria Rosaria; Piccirillo, Giovanni; Femiano, Cinzia; Ticca, Anna; Ortu, Enzo; La Bella, Vincenzo; Spataro, Rossella; Colletti, Tiziana; Sabatelli, Mario; Zollino, Marcella; Conte, Amelia; Luigetti, Marco; Lattante, Serena; Marangi, Giuseppe; Santarelli, Marialuisa; Petrucci, Antonio; Pugliatti, Maura; Pirisi, Angelo; Parish, Leslie D.; Occhineri, Patrizia; Giannini, Fabio; Battistini, Stefania; Ricci, Claudia; Benigni, Michele; Cau, Tea B.; Loi, Daniela; Calvo, Andrea; Moglia, Cristina; Brunetti, Maura; Barberis, Marco; Restagno, Gabriella; Casale, Federico; Marrali, Giuseppe; Fuda, Giuseppe; Ossola, Irene; Cammarosano, Stefania; Canosa, Antonio; Ilardi, Antonio; Manera, Umberto; Grassano, Maurizio; Tanel, Raffaella; Pisano, Fabrizio; Mora, Gabriele; Calvo, Andrea; Mazzini, Letizia; Riva, Nilo; Mandrioli, Jessica; Caponnetto, Claudia; Battistini, Stefania; Volanti, Paolo; La Bella, Vincenzo; Conforti, Francesca L.; Borghero, Giuseppe; Messina, Sonia; Simone, Isabella L.; Trojsi, Francesca; Salvi, Fabrizio; Logullo, Francesco O.; D'Alfonso, Sandra; Corrado, Lucia; Capasso, Margherita; Ferrucci, Luigi; Harms, Matthew B.; Goldstein, David B.; Shneider, Neil A.; Goutman, Stephen A.; Simmons, Zachary; Miller, Timothy M.; Chandran, Siddharthan; Pal, Suvankar; Manousakis, George; Appel, Stanley H.; Simpson, Ericka; Wang, Leo; Baloh, Robert H.; Gibson, Summer B.; Bedlack, Richard; Lacomis, David; Sareen, Dhruv; Sherman, Alexander; Bruijn, Lucie; Penny, Michelle; Moreno, Cristiane de Araujo Martins; Kamalakaran, Sitharthan; Goldstein, David B.; Allen, Andrew S.; Appel, Stanley; Baloh, Robert H.; Bedlack, Richard S.; Boone, Braden E.; Brown, Robert; Carulli, John P.; Chesi, Alessandra; Chung, Wendy K.; Cirulli, Elizabeth T.; Cooper, Gregory M.; Couthouis, Julien; Day-Williams, Aaron G.; Dion, Patrick A.; Gibson, Summer B.; Gitler, Aaron D.; Glass, Jonathan D.; Goldstein, David B.; Han, Yujun; Harms, Matthew B.; Harris, Tim; Hayes, Sebastian D.; Jones, Angela L.; Keebler, Jonathan; Krueger, Brian J.; Lasseigne, Brittany N.; Levy, Shawn E.; Lu, Yi Fan; Maniatis, Tom; McKenna-Yasek, Diane; Miller, Timothy M.; Myers, Richard M.; Petrovski, Slavé; Pulst, Stefan M.; Raphael, Alya R.; Ravits, John M.; Ren, Zhong; Rouleau, Guy A.; Sapp, Peter C.; Shneider, Neil A.; Simpson, Ericka; Sims, Katherine B.; Staropoli, John F.; Waite, Lindsay L.; Wang, Quanli; Wimbish, Jack R.; Xin, Winnie W.; Gitler, Aaron D.; Harris, Tim; Myers, Richard M.; Phatnani, Hemali; Kwan, Justin; Sareen, Dhruv; Broach, James R.; Simmons, Zachary; Arcila-Londono, Ximena; Lee, Edward B.; Van Deerlin, Vivianna M.; Shneider, Neil A.; Fraenkel, Ernest; Ostrow, Lyle W.; Baas, Frank; Zaitlen, Noah; Berry, James D.; Malaspina, Andrea; Fratta, Pietro; Cox, Gregory A.; Thompson, Leslie M.; Finkbeiner, Steve; Dardiotis, Efthimios; Miller, Timothy M.; Chandran, Siddharthan; Pal, Suvankar; Hornstein, Eran; MacGowan, Daniel J.L.; Heiman-Patterson, Terry D.; Hammell, Molly G.; Patsopoulos, Nikolaos A.; Dubnau, Joshua; Nath, Avindra; Phatnani, Hemali; Musunuri, Rajeeva Lochan; Evani, Uday Shankar; Abhyankar, Avinash; Zody, Michael C.; Kaye, Julia; Finkbeiner, Steven; Wyman, Stacia K.; LeNail, Alexander; Lima, Leandro; Fraenkel, Ernest; Rothstein, Jeffrey D.; Svendsen, Clive N.; Thompson, Leslie M.; Van Eyk, Jenny; Maragakis, Nicholas J.; Berry, James D.; Glass, Jonathan D.; Miller, Timothy M.; Kolb, Stephen J.; Baloh, Robert H.; Cudkowicz, Merit; Baxi, Emily; Kaye, Julia; Finkbeiner, Steven; Wyman, Stacia K.; Finkbeiner, Steven; LeNail, Alex; Lima, Leandro; Fraenkel, Ernest; Fraenkel, Ernest; Svendsen, Clive N.; Svendsen, Clive N.; Thompson, Leslie M.; Thompson, Leslie M.; Van Eyk, Jennifer E.; Berry, James D.; Berry, James D.; Miller, Timothy M.; Kolb, Stephen J.; Cudkowicz, Merit; Cudkowicz, Merit; Baxi, Emily; Benatar, Michael; Taylor, J. Paul; Wu, Gang; Rampersaud, Evadnie; Wuu, Joanne; Rademakers, Rosa; Züchner, Stephan; Schule, Rebecca; McCauley, Jacob; Hussain, Sumaira; Cooley, Anne; Wallace, Marielle; Clayman, Christine; Barohn, Richard; Statland, Jeffrey; Ravits, John M.; Swenson, Andrea; Jackson, Carlayne; Trivedi, Jaya; Khan, Shaida; Katz, Jonathan; Jenkins, Liberty; Burns, Ted; Gwathmey, Kelly; Caress, James; McMillan, Corey; Elman, Lauren; Pioro, Erik P.; Heckmann, Jeannine; So, Yuen; Walk, David; Maiser, Samuel; Zhang, Jinghui; Benatar, Michael; Taylor, J. Paul; Taylor, J. Paul; Rampersaud, Evadnie; Wu, Gang; Wuu, Joanne; Silani, Vincenzo; Ticozzi, Nicola; Gellera, Cinzia; Ratti, Antonia; Taroni, Franco; Lauria, Giuseppe; Verde, Federico; Fogh, Isabella; Tiloca, Cinzia; Comi, Giacomo P.; Sorarù, Gianni; Cereda, Cristina; D'Alfonso, Sandra; Corrado, Lucia; De Marchi, Fabiola; Corti, Stefania; Ceroni, Mauro; Mazzini, Letizia; Siciliano, Gabriele; Filosto, Massimiliano; Inghilleri, Maurizio; Peverelli, Silvia; Colombrita, Claudia; Poletti, Barbara; Maderna, Luca; Del Bo, Roberto; Gagliardi, Stella; Querin, Giorgia; Bertolin, Cinzia; Pensato, Viviana; Castellotti, Barbara; Lauria, Giuseppe; Verde, Federico; Fogh, Isabella; Tiloca, Cinzia; Fogh, Isabella; Comi, Giacomo P.; Sorarù, Gianni; Cereda, Cristina; Camu, William; Mouzat, Kevin; Lumbroso, Serge; Corcia, Philippe; Meininger, Vincent; Besson, Gérard; Lagrange, Emmeline; Clavelou, Pierre; Guy, Nathalie; Couratier, Philippe; Vourch, Patrick; Danel, Véronique; Bernard, Emilien; Lemasson, Gwendal; Corcia, Philippe; Laaksovirta, Hannu; Myllykangas, Liisa; Jansson, Lilja; Valori, Miko; Ealing, John; Hamdalla, Hisham; Rollinson, Sara; Pickering-Brown, Stuart; Orrell, Richard W.; Sidle, Katie C.; Malaspina, Andrea; Hardy, John; Singleton, Andrew B.; Johnson, Janel O.; Arepalli, Sampath; Sapp, Peter C.; McKenna-Yasek, Diane; Polak, Meraida; Asress, Seneshaw; Al-Sarraj, Safa; King, Andrew; Troakes, Claire; Vance, Caroline; de Belleroche, Jacqueline; Baas, Frank; ten Asbroek, Anneloor L.M.A.; Muñoz-Blanco, José Luis; Hernandez, Dena G.; Ding, Jinhui; Gibbs, J. Raphael; Scholz, Sonja W.; Scholz, Sonja W.; Floeter, Mary Kay; Campbell, Roy H.; Landi, Francesco; Bowser, Robert; Pulst, Stefan M.; Ravits, John M.; MacGowan, Daniel J.L.; Kirby, Janine; Pioro, Erik P.; Pamphlett, Roger; Broach, James; Gerhard, Glenn; Dunckley, Travis L.; Brady, Christopher B.; Brady, Christopher B.; Kowall, Neil W.; Troncoso, Juan C.; Le Ber, Isabelle; Mouzat, Kevin; Lumbroso, Serge; Mouzat, Kevin; Lumbroso, Serge; Heiman-Patterson, Terry D.; Heiman-Patterson, Terry D.; Kamel, Freya; Van Den Bosch, Ludo; Van Den Bosch, Ludo; Baloh, Robert H.; Strom, Tim M.; Meitinger, Thomas; Strom, Tim M.; Shatunov, Aleksey; Van Eijk, Kristel R.; de Carvalho, Mamede; de Carvalho, Mamede; Kooyman, Maarten; Middelkoop, Bas; Moisse, Matthieu; McLaughlin, Russell; Van Es, Michael A.; Weber, Markus; Boylan, Kevin B.; Van Blitterswijk, Marka; Rademakers, Rosa; Morrison, Karen; Basak, A. Nazli; Mora, Jesús S.; Drory, Vivian; Shaw, Pamela; Turner, Martin R.; Talbot, Kevin; Hardiman, Orla; Williams, Kelly L.; Fifita, Jennifer A.; Nicholson, Garth A.; Blair, Ian P.; Nicholson, Garth A.; Rouleau, Guy A.; Esteban-Pérez, Jesús; García-Redondo, Alberto; Al-Chalabi, Ammar; Al Kheifat, Ahmad; Al-Chalabi, Ammar; Andersen, Peter M.; Basak, A. Nazli; Blair, Ian P.; Chio, Adriano; Cooper-Knock, Jonathan; Corcia, Philippe; Couratier, Philippe; de Carvalho, Mamede; Dekker, Annelot; Drory, Vivian; Redondo, Alberto Garcia; Gotkine, Marc; Hardiman, Orla; Hide, Winston; Iacoangeli, Alfredo; Glass, Jonathan D.; Kenna, Kevin P.; Kiernan, Matthew; Kooyman, Maarten; Landers, John E.; McLaughlin, Russell; Middelkoop, Bas; Mill, Jonathan; Neto, Miguel Mitne; Moisse, Matthieu; Pardina, Jesus Mora; Morrison, Karen; Newhouse, Stephen; Pinto, Susana; Pulit, Sara; Robberecht, Wim; Shatunov, Aleksey; Shaw, Pamela; Shaw, Chris; Silani, Vincenzo; Sproviero, William; Tazelaar, Gijs; Ticozzi, Nicola; Van Damme, Philip; van den Berg, Leonard; van der Spek, Rick; Van Eijk, Kristel R.; Van Es, Michael A.; van Rheenen, Wouter; van Vugt, Joke J.F.A.; Veldink, Jan H.; Weber, Markus; Williams, Kelly L.; Van Damme, Philip; Robberecht, Wim; Zatz, Mayana; Robberecht, Wim; Bauer, Denis C.; Twine, Natalie A.; Rogaeva, Ekaterina; Zinman, Lorne; Ostrow, Lyle W.; Maragakis, Nicholas J.; Rothstein, Jeffrey D.; Simmons, Zachary; Cooper-Knock, Johnathan; Brice, Alexis; Goutman, Stephen A.; Feldman, Eva L.; Gibson, Summer B.; Taroni, Franco; Ratti, Antonia; Ratti, Antonia; Gellera, Cinzia; Van Damme, Philip; Robberecht, Wim; Fratta, Pietro; Sabatelli, Mario; Lunetta, Christian; Ludolph, Albert C.; Andersen, Peter M.; Weishaupt, Jochen H.; Camu, William; Trojanowski, John Q.; Van Deerlin, Vivianna M.; Brown, Robert H.; van den Berg, Leonard; Veldink, Jan H.; Harms, Matthew B.; Glass, Jonathan D.; Stone, David J.; Tienari, Pentti; Silani, Vincenzo; Silani, Vincenzo; Chiò, Adriano; Shaw, Christopher E.; Chiò, Adriano; Traynor, Bryan J.; Landers, John E.; Traynor, Bryan J.

    2018-01-01

    To identify novel genes associated with ALS, we undertook two lines of investigation. We carried out a genome-wide association study comparing 20,806 ALS cases and 59,804 controls. Independently, we performed a rare variant burden analysis comparing 1,138 index familial ALS cases and 19,494

  20. A genomic approach to examine the complex evolution of laurasiatherian mammals.

    Directory of Open Access Journals (Sweden)

    Björn M Hallström

    Full Text Available Recent phylogenomic studies have failed to conclusively resolve certain branches of the placental mammalian tree, despite the evolutionary analysis of genomic data from 32 species. Previous analyses of single genes and retroposon insertion data yielded support for different phylogenetic scenarios for the most basal divergences. The results indicated that some mammalian divergences were best interpreted not as a single bifurcating tree, but as an evolutionary network. In these studies the relationships among some orders of the super-clade Laurasiatheria were poorly supported, albeit not studied in detail. Therefore, 4775 protein-coding genes (6,196,263 nucleotides were collected and aligned in order to analyze the evolution of this clade. Additionally, over 200,000 introns were screened in silico, resulting in 32 phylogenetically informative long interspersed nuclear elements (LINE insertion events. The present study shows that the genome evolution of Laurasiatheria may best be understood as an evolutionary network. Thus, contrary to the common expectation to resolve major evolutionary events as a bifurcating tree, genome analyses unveil complex speciation processes even in deep mammalian divergences. We exemplify this on a subset of 1159 suitable genes that have individual histories, most likely due to incomplete lineage sorting or introgression, processes that can make the genealogy of mammalian genomes complex. These unexpected results have major implications for the understanding of evolution in general, because the evolution of even some higher level taxa such as mammalian orders may sometimes not be interpreted as a simple bifurcating pattern.

  1. Re-exploration of U's Triangle Brassica Species Based on Chloroplast Genomes and 45S nrDNA Sequences.

    Science.gov (United States)

    Kim, Chang-Kug; Seol, Young-Joo; Perumal, Sampath; Lee, Jonghoon; Waminal, Nomar Espinosa; Jayakodi, Murukarthick; Lee, Sang-Choon; Jin, Seungwoo; Choi, Beom-Soon; Yu, Yeisoo; Ko, Ho-Cheol; Choi, Ji-Weon; Ryu, Kyoung-Yul; Sohn, Seong-Han; Parkin, Isobel; Yang, Tae-Jin

    2018-05-09

    The concept of U's triangle, which revealed the importance of polyploidization in plant genome evolution, described natural allopolyploidization events in Brassica using three diploids [B. rapa (A genome), B. nigra (B), and B. oleracea (C)] and derived allotetraploids [B. juncea (AB genome), B. napus (AC), and B. carinata (BC)]. However, comprehensive understanding of Brassica genome evolution has not been fully achieved. Here, we performed low-coverage (2-6×) whole-genome sequencing of 28 accessions of Brassica as well as of Raphanus sativus [R genome] to explore the evolution of six Brassica species based on chloroplast genome and ribosomal DNA variations. Our phylogenomic analyses led to two main conclusions. (1) Intra-species-level chloroplast genome variations are low in the three allotetraploids (2~7 SNPs), but rich and variable in each diploid species (7~193 SNPs). (2) Three allotetraploids maintain two 45SnrDNA types derived from both ancestral species with maternal dominance. Furthermore, this study sheds light on the maternal origin of the AC chloroplast genome. Overall, this study clarifies the genetic relationships of U's triangle species based on a comprehensive genomics approach and provides important genomic resources for correlative and evolutionary studies.

  2. Large-scale genome-wide association studies and meta-analyses of longitudinal change in adult lung function.

    Directory of Open Access Journals (Sweden)

    Wenbo Tang

    Full Text Available Genome-wide association studies (GWAS have identified numerous loci influencing cross-sectional lung function, but less is known about genes influencing longitudinal change in lung function.We performed GWAS of the rate of change in forced expiratory volume in the first second (FEV1 in 14 longitudinal, population-based cohort studies comprising 27,249 adults of European ancestry using linear mixed effects model and combined cohort-specific results using fixed effect meta-analysis to identify novel genetic loci associated with longitudinal change in lung function. Gene expression analyses were subsequently performed for identified genetic loci. As a secondary aim, we estimated the mean rate of decline in FEV1 by smoking pattern, irrespective of genotypes, across these 14 studies using meta-analysis.The overall meta-analysis produced suggestive evidence for association at the novel IL16/STARD5/TMC3 locus on chromosome 15 (P  =  5.71 × 10(-7. In addition, meta-analysis using the five cohorts with ≥3 FEV1 measurements per participant identified the novel ME3 locus on chromosome 11 (P  =  2.18 × 10(-8 at genome-wide significance. Neither locus was associated with FEV1 decline in two additional cohort studies. We confirmed gene expression of IL16, STARD5, and ME3 in multiple lung tissues. Publicly available microarray data confirmed differential expression of all three genes in lung samples from COPD patients compared with controls. Irrespective of genotypes, the combined estimate for FEV1 decline was 26.9, 29.2 and 35.7 mL/year in never, former, and persistent smokers, respectively.In this large-scale GWAS, we identified two novel genetic loci in association with the rate of change in FEV1 that harbor candidate genes with biologically plausible functional links to lung function.

  3. IDEA: Interactive Display for Evolutionary Analyses.

    Science.gov (United States)

    Egan, Amy; Mahurkar, Anup; Crabtree, Jonathan; Badger, Jonathan H; Carlton, Jane M; Silva, Joana C

    2008-12-08

    The availability of complete genomic sequences for hundreds of organisms promises to make obtaining genome-wide estimates of substitution rates, selective constraints and other molecular evolution variables of interest an increasingly important approach to addressing broad evolutionary questions. Two of the programs most widely used for this purpose are codeml and baseml, parts of the PAML (Phylogenetic Analysis by Maximum Likelihood) suite. A significant drawback of these programs is their lack of a graphical user interface, which can limit their user base and considerably reduce their efficiency. We have developed IDEA (Interactive Display for Evolutionary Analyses), an intuitive graphical input and output interface which interacts with PHYLIP for phylogeny reconstruction and with codeml and baseml for molecular evolution analyses. IDEA's graphical input and visualization interfaces eliminate the need to edit and parse text input and output files, reducing the likelihood of errors and improving processing time. Further, its interactive output display gives the user immediate access to results. Finally, IDEA can process data in parallel on a local machine or computing grid, allowing genome-wide analyses to be completed quickly. IDEA provides a graphical user interface that allows the user to follow a codeml or baseml analysis from parameter input through to the exploration of results. Novel options streamline the analysis process, and post-analysis visualization of phylogenies, evolutionary rates and selective constraint along protein sequences simplifies the interpretation of results. The integration of these functions into a single tool eliminates the need for lengthy data handling and parsing, significantly expediting access to global patterns in the data.

  4. Ginseng Genome Database: an open-access platform for genomics of Panax ginseng.

    Science.gov (United States)

    Jayakodi, Murukarthick; Choi, Beom-Soon; Lee, Sang-Choon; Kim, Nam-Hoon; Park, Jee Young; Jang, Woojong; Lakshmanan, Meiyappan; Mohan, Shobhana V G; Lee, Dong-Yup; Yang, Tae-Jin

    2018-04-12

    The ginseng (Panax ginseng C.A. Meyer) is a perennial herbaceous plant that has been used in traditional oriental medicine for thousands of years. Ginsenosides, which have significant pharmacological effects on human health, are the foremost bioactive constituents in this plant. Having realized the importance of this plant to humans, an integrated omics resource becomes indispensable to facilitate genomic research, molecular breeding and pharmacological study of this herb. The first draft genome sequences of P. ginseng cultivar "Chunpoong" were reported recently. Here, using the draft genome, transcriptome, and functional annotation datasets of P. ginseng, we have constructed the Ginseng Genome Database http://ginsengdb.snu.ac.kr /, the first open-access platform to provide comprehensive genomic resources of P. ginseng. The current version of this database provides the most up-to-date draft genome sequence (of approximately 3000 Mbp of scaffold sequences) along with the structural and functional annotations for 59,352 genes and digital expression of genes based on transcriptome data from different tissues, growth stages and treatments. In addition, tools for visualization and the genomic data from various analyses are provided. All data in the database were manually curated and integrated within a user-friendly query page. This database provides valuable resources for a range of research fields related to P. ginseng and other species belonging to the Apiales order as well as for plant research communities in general. Ginseng genome database can be accessed at http://ginsengdb.snu.ac.kr /.

  5. Comparative Genomics of the Bacterial Genus Streptococcus Illuminates Evolutionary Implications of Species Groups

    Science.gov (United States)

    Gao, Xiao-Yang; Zhi, Xiao-Yang; Li, Hong-Wei; Klenk, Hans-Peter; Li, Wen-Jun

    2014-01-01

    Members of the genus Streptococcus within the phylum Firmicutes are among the most diverse and significant zoonotic pathogens. This genus has gone through considerable taxonomic revision due to increasing improvements of chemotaxonomic approaches, DNA hybridization and 16S rRNA gene sequencing. It is proposed to place the majority of streptococci into “species groups”. However, the evolutionary implications of species groups are not clear presently. We use comparative genomic approaches to yield a better understanding of the evolution of Streptococcus through genome dynamics, population structure, phylogenies and virulence factor distribution of species groups. Genome dynamics analyses indicate that the pan-genome size increases with the addition of newly sequenced strains, while the core genome size decreases with sequential addition at the genus level and species group level. Population structure analysis reveals two distinct lineages, one including Pyogenic, Bovis, Mutans and Salivarius groups, and the other including Mitis, Anginosus and Unknown groups. Phylogenetic dendrograms show that species within the same species group cluster together, and infer two main clades in accordance with population structure analysis. Distribution of streptococcal virulence factors has no obvious patterns among the species groups; however, the evolution of some common virulence factors is congruous with the evolution of species groups, according to phylogenetic inference. We suggest that the proposed streptococcal species groups are reasonable from the viewpoints of comparative genomics; evolution of the genus is congruent with the individual evolutionary trajectories of different species groups. PMID:24977706

  6. Complete chloroplast genome sequences of Hordeum vulgare, Sorghum bicolor and Agrostis stolonifera, and comparative analyses with other grass genomes

    Science.gov (United States)

    Saski, Christopher; Lee, Seung-Bum; Fjellheim, Siri; Guda, Chittibabu; Jansen, Robert K.; Luo, Hong; Tomkins, Jeffrey; Rognli, Odd Arne; Clarke, Jihong Liu

    2009-01-01

    Comparisons of complete chloroplast genome sequences of Hordeum vulgare, Sorghum bicolor and Agrostis stolonifera to six published grass chloroplast genomes reveal that gene content and order are similar but two microstructural changes have occurred. First, the expansion of the IR at the SSC/IRa boundary that duplicates a portion of the 5′ end of ndhH is restricted to the three genera of the subfamily Pooideae (Agrostis, Hordeum and Triticum). Second, a 6 bp deletion in ndhK is shared by Agrostis, Hordeum, Oryza and Triticum, and this event supports the sister relationship between the subfamilies Erhartoideae and Pooideae. Repeat analysis identified 19–37 direct and inverted repeats 30 bp or longer with a sequence identity of at least 90%. Seventeen of the 26 shared repeats are found in all the grass chloroplast genomes examined and are located in the same genes or intergenic spacer (IGS) regions. Examination of simple sequence repeats (SSRs) identified 16–21 potential polymorphic SSRs. Five IGS regions have 100% sequence identity among Zea mays, Saccharum officinarum and Sorghum bicolor, whereas no spacer regions were identical among Oryza sativa, Triticum aestivum, H. vulgare and A. stolonifera despite their close phylogenetic relationship. Alignment of EST sequences and DNA coding sequences identified six C–U conversions in both Sorghum bicolor and H. vulgare but only one in A. stolonifera. Phylogenetic trees based on DNA sequences of 61 protein-coding genes of 38 taxa using both maximum parsimony and likelihood methods provide moderate support for a sister relationship between the subfamilies Erhartoideae and Pooideae. PMID:17534593

  7. Salmon and steelhead genetics and genomics - Epigenetic and genomic variation in salmon and steelhead

    Data.gov (United States)

    National Oceanic and Atmospheric Administration, Department of Commerce — Conduct analyses of epigenetic and genomic variation in Chinook salmon and steelhead to determine influence on phenotypic expression of life history traits. Genetic,...

  8. Intrinsic disorder in Viral Proteins Genome-Linked: experimental and predictive analyses

    Directory of Open Access Journals (Sweden)

    Van Dorsselaer Alain

    2009-02-01

    Full Text Available Abstract Background VPgs are viral proteins linked to the 5' end of some viral genomes. Interactions between several VPgs and eukaryotic translation initiation factors eIF4Es are critical for plant infection. However, VPgs are not restricted to phytoviruses, being also involved in genome replication and protein translation of several animal viruses. To date, structural data are still limited to small picornaviral VPgs. Recently three phytoviral VPgs were shown to be natively unfolded proteins. Results In this paper, we report the bacterial expression, purification and biochemical characterization of two phytoviral VPgs, namely the VPgs of Rice yellow mottle virus (RYMV, genus Sobemovirus and Lettuce mosaic virus (LMV, genus Potyvirus. Using far-UV circular dichroism and size exclusion chromatography, we show that RYMV and LMV VPgs are predominantly or partly unstructured in solution, respectively. Using several disorder predictors, we show that both proteins are predicted to possess disordered regions. We next extend theses results to 14 VPgs representative of the viral diversity. Disordered regions were predicted in all VPg sequences whatever the genus and the family. Conclusion Based on these results, we propose that intrinsic disorder is a common feature of VPgs. The functional role of intrinsic disorder is discussed in light of the biological roles of VPgs.

  9. A methodological study of genome-wide DNA methylation analyses using matched archival formalin-fixed paraffin embedded and fresh frozen breast tumors.

    Science.gov (United States)

    Espinal, Allyson C; Wang, Dan; Yan, Li; Liu, Song; Tang, Li; Hu, Qiang; Morrison, Carl D; Ambrosone, Christine B; Higgins, Michael J; Sucheston-Campbell, Lara E

    2017-02-28

    DNA from archival formalin-fixed and paraffin embedded (FFPE) tissue is an invaluable resource for genome-wide methylation studies although concerns about poor quality may limit its use. In this study, we compared DNA methylation profiles of breast tumors using DNA from fresh-frozen (FF) tissues and three types of matched FFPE samples. For 9/10 patients, correlation and unsupervised clustering analysis revealed that the FF and FFPE samples were consistently correlated with each other and clustered into distinct subgroups. Greater than 84% of the top 100 loci previously shown to differentiate ER+ and ER- tumors in FF tissues were also FFPE DML. Weighted Correlation Gene Network Analyses (WCGNA) grouped the DML loci into 16 modules in FF tissue, with ~85% of the module membership preserved across tissue types. Restored FFPE and matched FF samples were profiled using the Illumina Infinium HumanMethylation450K platform. Methylation levels (β-values) across all loci and the top 100 loci previously shown to differentiate tumors by estrogen receptor status (ER+ or ER-) in a larger FF study, were compared between matched FF and FFPE samples using Pearson's correlation, hierarchical clustering and WCGNA. Positive predictive values and sensitivity levels for detecting differentially methylated loci (DML) in FF samples were calculated in an independent FFPE cohort. FFPE breast tumors samples show lower overall detection of DMLs versus FF, however FFPE and FF DMLs compare favorably. These results support the emerging consensus that the 450K platform can be employed to investigate epigenetics in large sets of archival FFPE tissues.

  10. Genome-wide association analyses identify variants in developmental genes associated with hypospadias

    DEFF Research Database (Denmark)

    Geller, Frank; Feenstra, Bjarke; Carstensen, Lisbeth

    2014-01-01

    Hypospadias is a common congenital condition in boys in which the urethra opens on the underside of the penis. We performed a genome-wide association study on 1,006 surgery-confirmed hypospadias cases and 5,486 controls from Denmark. After replication genotyping of an additional 1,972 cases and 1...

  11. C-RAF function at the genome-wide transcriptome level: A systematic view.

    Science.gov (United States)

    Huang, Ying; Zhang, Xin-Yu; An, Su; Yang, Yang; Liu, Ying; Hao, Qian; Guo, Xiao-Xi; Xu, Tian-Rui

    2018-05-20

    C-RAF was the first member of the RAF kinase family to be discovered. Since its discovery, C-RAF has been found to regulate many fundamental cell processes, such as cell proliferation, cell death, and metabolism. However, the majority of these functions are achieved through interactions with different proteins; the genes regulated by C-RAF in its active or inactive state remain unclear. In the work, we used RNA-seq analysis to study the global transcriptomes of C-RAF bearing or C-RAF knockout cells in quiescent or EGF activated states. We identified 3353 genes that are promoted or suppressed by C-RAF. Gene ontology and Kyoto Encyclopedia of Genes and Genomes analyses revealed that these genes are involved in drug addiction, cardiomyopathy, autoimmunity, and regulation of cell metabolism. Our results provide a panoramic view of C-RAF function, including known and novel functions, and have revealed potential targets for elucidating the role of C-RAF. Copyright © 2018 Elsevier B.V. All rights reserved.

  12. Reconstructing ancient genomes and epigenomes

    DEFF Research Database (Denmark)

    Orlando, Ludovic Antoine Alexandre; Gilbert, M. Thomas P.; Willerslev, Eske

    2015-01-01

    DNA studies have now progressed to whole-genome sequencing for an increasing number of ancient individuals and extinct species, as well as to epigenomic characterization. Such advances have enabled the sequencing of specimens of up to 1 million years old, which, owing to their extensive DNA damage...... and contamination, were previously not amenable to genetic analyses. In this Review, we discuss these varied technical challenges and solutions for sequencing ancient genomes and epigenomes....

  13. Genome-health nutrigenomics and nutrigenetics: nutritional requirements or 'nutriomes' for chromosomal stability and telomere maintenance at the individual level.

    Science.gov (United States)

    Bull, Caroline; Fenech, Michael

    2008-05-01

    It is becoming increasingly evident that (a) risk for developmental and degenerative disease increases with more DNA damage, which in turn is dependent on nutritional status, and (b) the optimal concentration of micronutrients for prevention of genome damage is also dependent on genetic polymorphisms that alter the function of genes involved directly or indirectly in the uptake and metabolism of micronutrients required for DNA repair and DNA replication. The development of dietary patterns, functional foods and supplements that are designed to improve genome-health maintenance in individuals with specific genetic backgrounds may provide an important contribution to an optimum health strategy based on the diagnosis and individualised nutritional prevention of genome damage, i.e. genome health clinics. The present review summarises some of the recent knowledge relating to micronutrients that are associated with chromosomal stability and provides some initial insights into the likely nutritional factors that may be expected to have an impact on the maintenance of telomeres. It is evident that developing effective strategies for defining nutrient doses and combinations or 'nutriomes' for genome-health maintenance at the individual level is essential for further progress in this research field.

  14. Governance in genomics: a conceptual challenge for public health genomics law

    Directory of Open Access Journals (Sweden)

    Tobias Schulte in den Bäumen

    2006-12-01

    Full Text Available Increasing levels of genomic knowledge has led to awareness that new governance issues need to be taken into consideration. While some countries have created new statutory laws in the last 10 years, science supports the idea that genomic data should be treated like other medical data. In this article we discuss the three core models of governance in medical law on a conceptual level. The three models, the Medical, Public Health and Fundamental Rights Model stress different values, or in legal terms serve different principles. The Medical Model stands for expert knowledge and the standardisation of quality in healthcare. The Public Health Model fosters a social point of view as it advocates distribution justice in healthcare and an awareness of healthcare as a broader concept. The Fundamental Rights Model focuses on individual rights such as the right to privacy and autonomy. We argue that none of the models can be used in a purist fashion as governance in genomics should enable society and individuals to protect individual rights, to strive for a distribution justice and to ensure the quality of genomic services in one coherent process. Thus, genomic governance in genomics requires procedural law and a set of applicable principles. The principle which underlies all three models is the principle of medical beneficence. Therefore genomic governance should refer to it as a key principle when conflicting rights of individuals or communities need to be balanced.

  15. A tutorial of diverse genome analysis tools found in the CoGe web-platform using Plasmodium spp. as a model

    Science.gov (United States)

    Castillo, Andreina I; Nelson, Andrew D L; Haug-Baltzell, Asher K; Lyons, Eric

    2018-01-01

    Abstract Integrated platforms for storage, management, analysis and sharing of large quantities of omics data have become fundamental to comparative genomics. CoGe (https://genomevolution.org/coge/) is an online platform designed to manage and study genomic data, enabling both data- and hypothesis-driven comparative genomics. CoGe’s tools and resources can be used to organize and analyse both publicly available and private genomic data from any species. Here, we demonstrate the capabilities of CoGe through three example workflows using 17 Plasmodium genomes as a model. Plasmodium genomes present unique challenges for comparative genomics due to their rapidly evolving and highly variable genomic AT/GC content. These example workflows are intended to serve as templates to help guide researchers who would like to use CoGe to examine diverse aspects of genome evolution. In the first workflow, trends in genome composition and amino acid usage are explored. In the second, changes in genome structure and the distribution of synonymous (Ks) and non-synonymous (Kn) substitution values are evaluated across species with different levels of evolutionary relatedness. In the third workflow, microsyntenic analyses of multigene families’ genomic organization are conducted using two Plasmodium-specific gene families—serine repeat antigen, and cytoadherence-linked asexual gene—as models. In general, these example workflows show how to achieve quick, reproducible and shareable results using the CoGe platform. We were able to replicate previously published results, as well as leverage CoGe’s tools and resources to gain additional insight into various aspects of Plasmodium genome evolution. Our results highlight the usefulness of the CoGe platform, particularly in understanding complex features of genome evolution. Database URL: https://genomevolution.org/coge/

  16. In silico genomic analyses reveal three distinct lineages of Escherichia coli O157:H7, one of which is associated with hyper-virulence.

    Science.gov (United States)

    Laing, Chad R; Buchanan, Cody; Taboada, Eduardo N; Zhang, Yongxiang; Karmali, Mohamed A; Thomas, James E; Gannon, Victor Pj

    2009-06-29

    accessed locally in an easily transferable, informative and extensible format based on comparative genomic analyses.

  17. In silico genomic analyses reveal three distinct lineages of Escherichia coli O157:H7, one of which is associated with hyper-virulence

    Directory of Open Access Journals (Sweden)

    Karmali Mohamed A

    2009-06-01

    methods should provide data that can be stored centrally and accessed locally in an easily transferable, informative and extensible format based on comparative genomic analyses.

  18. CMG-biotools, a free workbench for basic comparative microbial genomics.

    Science.gov (United States)

    Vesth, Tammi; Lagesen, Karin; Acar, Öncel; Ussery, David

    2013-01-01

    Today, there are more than a hundred times as many sequenced prokaryotic genomes than were present in the year 2000. The economical sequencing of genomic DNA has facilitated a whole new approach to microbial genomics. The real power of genomics is manifested through comparative genomics that can reveal strain specific characteristics, diversity within species and many other aspects. However, comparative genomics is a field not easily entered into by scientists with few computational skills. The CMG-biotools package is designed for microbiologists with limited knowledge of computational analysis and can be used to perform a number of analyses and comparisons of genomic data. The CMG-biotools system presents a stand-alone interface for comparative microbial genomics. The package is a customized operating system, based on Xubuntu 10.10, available through the open source Ubuntu project. The system can be installed on a virtual computer, allowing the user to run the system alongside any other operating system. Source codes for all programs are provided under GNU license, which makes it possible to transfer the programs to other systems if so desired. We here demonstrate the package by comparing and analyzing the diversity within the class Negativicutes, represented by 31 genomes including 10 genera. The analyses include 16S rRNA phylogeny, basic DNA and codon statistics, proteome comparisons using BLAST and graphical analyses of DNA structures. This paper shows the strength and diverse use of the CMG-biotools system. The system can be installed on a vide range of host operating systems and utilizes as much of the host computer as desired. It allows the user to compare multiple genomes, from various sources using standardized data formats and intuitive visualizations of results. The examples presented here clearly shows that users with limited computational experience can perform complicated analysis without much training.

  19. Global organization of a positive-strand RNA virus genome.

    Directory of Open Access Journals (Sweden)

    Baodong Wu

    Full Text Available The genomes of plus-strand RNA viruses contain many regulatory sequences and structures that direct different viral processes. The traditional view of these RNA elements are as local structures present in non-coding regions. However, this view is changing due to the discovery of regulatory elements in coding regions and functional long-range intra-genomic base pairing interactions. The ∼4.8 kb long RNA genome of the tombusvirus tomato bushy stunt virus (TBSV contains these types of structural features, including six different functional long-distance interactions. We hypothesized that to achieve these multiple interactions this viral genome must utilize a large-scale organizational strategy and, accordingly, we sought to assess the global conformation of the entire TBSV genome. Atomic force micrographs of the genome indicated a mostly condensed structure composed of interconnected protrusions extending from a central hub. This configuration was consistent with the genomic secondary structure model generated using high-throughput selective 2'-hydroxyl acylation analysed by primer extension (i.e. SHAPE, which predicted different sized RNA domains originating from a central region. Known RNA elements were identified in both domain and inter-domain regions, and novel structural features were predicted and functionally confirmed. Interestingly, only two of the six long-range interactions known to form were present in the structural model. However, for those interactions that did not form, complementary partner sequences were positioned relatively close to each other in the structure, suggesting that the secondary structure level of viral genome structure could provide a basic scaffold for the formation of different long-range interactions. The higher-order structural model for the TBSV RNA genome provides a snapshot of the complex framework that allows multiple functional components to operate in concert within a confined context.

  20. The Personal Genome Project Canada: findings from whole genome sequences of the inaugural 56 participants.

    Science.gov (United States)

    Reuter, Miriam S; Walker, Susan; Thiruvahindrapuram, Bhooma; Whitney, Joe; Cohn, Iris; Sondheimer, Neal; Yuen, Ryan K C; Trost, Brett; Paton, Tara A; Pereira, Sergio L; Herbrick, Jo-Anne; Wintle, Richard F; Merico, Daniele; Howe, Jennifer; MacDonald, Jeffrey R; Lu, Chao; Nalpathamkalam, Thomas; Sung, Wilson W L; Wang, Zhuozhi; Patel, Rohan V; Pellecchia, Giovanna; Wei, John; Strug, Lisa J; Bell, Sherilyn; Kellam, Barbara; Mahtani, Melanie M; Bassett, Anne S; Bombard, Yvonne; Weksberg, Rosanna; Shuman, Cheryl; Cohn, Ronald D; Stavropoulos, Dimitri J; Bowdin, Sarah; Hildebrandt, Matthew R; Wei, Wei; Romm, Asli; Pasceri, Peter; Ellis, James; Ray, Peter; Meyn, M Stephen; Monfared, Nasim; Hosseini, S Mohsen; Joseph-George, Ann M; Keeley, Fred W; Cook, Ryan A; Fiume, Marc; Lee, Hin C; Marshall, Christian R; Davies, Jill; Hazell, Allison; Buchanan, Janet A; Szego, Michael J; Scherer, Stephen W

    2018-02-05

    The Personal Genome Project Canada is a comprehensive public data resource that integrates whole genome sequencing data and health information. We describe genomic variation identified in the initial recruitment cohort of 56 volunteers. Volunteers were screened for eligibility and provided informed consent for open data sharing. Using blood DNA, we performed whole genome sequencing and identified all possible classes of DNA variants. A genetic counsellor explained the implication of the results to each participant. Whole genome sequencing of the first 56 participants identified 207 662 805 sequence variants and 27 494 copy number variations. We analyzed a prioritized disease-associated data set ( n = 1606 variants) according to standardized guidelines, and interpreted 19 variants in 14 participants (25%) as having obvious health implications. Six of these variants (e.g., in BRCA1 or mosaic loss of an X chromosome) were pathogenic or likely pathogenic. Seven were risk factors for cancer, cardiovascular or neurobehavioural conditions. Four other variants - associated with cancer, cardiac or neurodegenerative phenotypes - remained of uncertain significance because of discrepancies among databases. We also identified a large structural chromosome aberration and a likely pathogenic mitochondrial variant. There were 172 recessive disease alleles (e.g., 5 individuals carried mutations for cystic fibrosis). Pharmacogenomics analyses revealed another 3.9 potentially relevant genotypes per individual. Our analyses identified a spectrum of genetic variants with potential health impact in 25% of participants. When also considering recessive alleles and variants with potential pharmacologic relevance, all 56 participants had medically relevant findings. Although access is mostly limited to research, whole genome sequencing can provide specific and novel information with the potential of major impact for health care. © 2018 Joule Inc. or its licensors.

  1. In Depth Characterization of Repetitive DNA in 23 Plant Genomes Reveals Sources of Genome Size Variation in the Legume Tribe Fabeae.

    Science.gov (United States)

    Macas, Jiří; Novák, Petr; Pellicer, Jaume; Čížková, Jana; Koblížková, Andrea; Neumann, Pavel; Fuková, Iva; Doležel, Jaroslav; Kelly, Laura J; Leitch, Ilia J

    2015-01-01

    The differential accumulation and elimination of repetitive DNA are key drivers of genome size variation in flowering plants, yet there have been few studies which have analysed how different types of repeats in related species contribute to genome size evolution within a phylogenetic context. This question is addressed here by conducting large-scale comparative analysis of repeats in 23 species from four genera of the monophyletic legume tribe Fabeae, representing a 7.6-fold variation in genome size. Phylogenetic analysis and genome size reconstruction revealed that this diversity arose from genome size expansions and contractions in different lineages during the evolution of Fabeae. Employing a combination of low-pass genome sequencing with novel bioinformatic approaches resulted in identification and quantification of repeats making up 55-83% of the investigated genomes. In turn, this enabled an analysis of how each major repeat type contributed to the genome size variation encountered. Differential accumulation of repetitive DNA was found to account for 85% of the genome size differences between the species, and most (57%) of this variation was found to be driven by a single lineage of Ty3/gypsy LTR-retrotransposons, the Ogre elements. Although the amounts of several other lineages of LTR-retrotransposons and the total amount of satellite DNA were also positively correlated with genome size, their contributions to genome size variation were much smaller (up to 6%). Repeat analysis within a phylogenetic framework also revealed profound differences in the extent of sequence conservation between different repeat types across Fabeae. In addition to these findings, the study has provided a proof of concept for the approach combining recent developments in sequencing and bioinformatics to perform comparative analyses of repetitive DNAs in a large number of non-model species without the need to assemble their genomes.

  2. In Depth Characterization of Repetitive DNA in 23 Plant Genomes Reveals Sources of Genome Size Variation in the Legume Tribe Fabeae.

    Directory of Open Access Journals (Sweden)

    Jiří Macas

    Full Text Available The differential accumulation and elimination of repetitive DNA are key drivers of genome size variation in flowering plants, yet there have been few studies which have analysed how different types of repeats in related species contribute to genome size evolution within a phylogenetic context. This question is addressed here by conducting large-scale comparative analysis of repeats in 23 species from four genera of the monophyletic legume tribe Fabeae, representing a 7.6-fold variation in genome size. Phylogenetic analysis and genome size reconstruction revealed that this diversity arose from genome size expansions and contractions in different lineages during the evolution of Fabeae. Employing a combination of low-pass genome sequencing with novel bioinformatic approaches resulted in identification and quantification of repeats making up 55-83% of the investigated genomes. In turn, this enabled an analysis of how each major repeat type contributed to the genome size variation encountered. Differential accumulation of repetitive DNA was found to account for 85% of the genome size differences between the species, and most (57% of this variation was found to be driven by a single lineage of Ty3/gypsy LTR-retrotransposons, the Ogre elements. Although the amounts of several other lineages of LTR-retrotransposons and the total amount of satellite DNA were also positively correlated with genome size, their contributions to genome size variation were much smaller (up to 6%. Repeat analysis within a phylogenetic framework also revealed profound differences in the extent of sequence conservation between different repeat types across Fabeae. In addition to these findings, the study has provided a proof of concept for the approach combining recent developments in sequencing and bioinformatics to perform comparative analyses of repetitive DNAs in a large number of non-model species without the need to assemble their genomes.

  3. Genomic Characterization of DArT Markers Based on High-Density Linkage Analysis and Physical Mapping to the Eucalyptus Genome

    Science.gov (United States)

    Petroli, César D.; Sansaloni, Carolina P.; Carling, Jason; Steane, Dorothy A.; Vaillancourt, René E.; Myburg, Alexander A.; da Silva, Orzenil Bonfim; Pappas, Georgios Joannis; Kilian, Andrzej; Grattapaglia, Dario

    2012-01-01

    Diversity Arrays Technology (DArT) provides a robust, high throughput, cost-effective method to query thousands of sequence polymorphisms in a single assay. Despite the extensive use of this genotyping platform for numerous plant species, little is known regarding the sequence attributes and genome-wide distribution of DArT markers. We investigated the genomic properties of the 7,680 DArT marker probes of a Eucalyptus array, by sequencing them, constructing a high density linkage map and carrying out detailed physical mapping analyses to the Eucalyptus grandis reference genome. A consensus linkage map with 2,274 DArT markers anchored to 210 microsatellites and a framework map, with improved support for ordering, displayed extensive collinearity with the genome sequence. Only 1.4 Mbp of the 75 Mbp of still unplaced scaffold sequence was captured by 45 linkage mapped but physically unaligned markers to the 11 main Eucalyptus pseudochromosomes, providing compelling evidence for the quality and completeness of the current Eucalyptus genome assembly. A highly significant correspondence was found between the locations of DArT markers and predicted gene models, while most of the 89 DArT probes unaligned to the genome correspond to sequences likely absent in E. grandis, consistent with the pan-genomic feature of this multi-Eucalyptus species DArT array. These comprehensive linkage-to-physical mapping analyses provide novel data regarding the genomic attributes of DArT markers in plant genomes in general and for Eucalyptus in particular. DArT markers preferentially target the gene space and display a largely homogeneous distribution across the genome, thereby providing superb coverage for mapping and genome-wide applications in breeding and diversity studies. Data reported on these ubiquitous properties of DArT markers will be particularly valuable to researchers working on less-studied crop species who already count on DArT genotyping arrays but for which no reference

  4. Genomic characterization of DArT markers based on high-density linkage analysis and physical mapping to the Eucalyptus genome.

    Directory of Open Access Journals (Sweden)

    César D Petroli

    Full Text Available Diversity Arrays Technology (DArT provides a robust, high throughput, cost-effective method to query thousands of sequence polymorphisms in a single assay. Despite the extensive use of this genotyping platform for numerous plant species, little is known regarding the sequence attributes and genome-wide distribution of DArT markers. We investigated the genomic properties of the 7,680 DArT marker probes of a Eucalyptus array, by sequencing them, constructing a high density linkage map and carrying out detailed physical mapping analyses to the Eucalyptus grandis reference genome. A consensus linkage map with 2,274 DArT markers anchored to 210 microsatellites and a framework map, with improved support for ordering, displayed extensive collinearity with the genome sequence. Only 1.4 Mbp of the 75 Mbp of still unplaced scaffold sequence was captured by 45 linkage mapped but physically unaligned markers to the 11 main Eucalyptus pseudochromosomes, providing compelling evidence for the quality and completeness of the current Eucalyptus genome assembly. A highly significant correspondence was found between the locations of DArT markers and predicted gene models, while most of the 89 DArT probes unaligned to the genome correspond to sequences likely absent in E. grandis, consistent with the pan-genomic feature of this multi-Eucalyptus species DArT array. These comprehensive linkage-to-physical mapping analyses provide novel data regarding the genomic attributes of DArT markers in plant genomes in general and for Eucalyptus in particular. DArT markers preferentially target the gene space and display a largely homogeneous distribution across the genome, thereby providing superb coverage for mapping and genome-wide applications in breeding and diversity studies. Data reported on these ubiquitous properties of DArT markers will be particularly valuable to researchers working on less-studied crop species who already count on DArT genotyping arrays but for

  5. PSAT: A web tool to compare genomic neighborhoods of multiple prokaryotic genomes

    Directory of Open Access Journals (Sweden)

    Wasnick Michael

    2008-03-01

    Full Text Available Abstract Background The conservation of gene order among prokaryotic genomes can provide valuable insight into gene function, protein interactions, or events by which genomes have evolved. Although some tools are available for visualizing and comparing the order of genes between genomes of study, few support an efficient and organized analysis between large numbers of genomes. The Prokaryotic Sequence homology Analysis Tool (PSAT is a web tool for comparing gene neighborhoods among multiple prokaryotic genomes. Results PSAT utilizes a database that is preloaded with gene annotation, BLAST hit results, and gene-clustering scores designed to help identify regions of conserved gene order. Researchers use the PSAT web interface to find a gene of interest in a reference genome and efficiently retrieve the sequence homologs found in other bacterial genomes. The tool generates a graphic of the genomic neighborhood surrounding the selected gene and the corresponding regions for its homologs in each comparison genome. Homologs in each region are color coded to assist users with analyzing gene order among various genomes. In contrast to common comparative analysis methods that filter sequence homolog data based on alignment score cutoffs, PSAT leverages gene context information for homologs, including those with weak alignment scores, enabling a more sensitive analysis. Features for constraining or ordering results are designed to help researchers browse results from large numbers of comparison genomes in an organized manner. PSAT has been demonstrated to be useful for helping to identify gene orthologs and potential functional gene clusters, and detecting genome modifications that may result in loss of function. Conclusion PSAT allows researchers to investigate the order of genes within local genomic neighborhoods of multiple genomes. A PSAT web server for public use is available for performing analyses on a growing set of reference genomes through any

  6. Genomic prediction of complex human traits: relatedness, trait architecture and predictive meta-models

    Science.gov (United States)

    Spiliopoulou, Athina; Nagy, Reka; Bermingham, Mairead L.; Huffman, Jennifer E.; Hayward, Caroline; Vitart, Veronique; Rudan, Igor; Campbell, Harry; Wright, Alan F.; Wilson, James F.; Pong-Wong, Ricardo; Agakov, Felix; Navarro, Pau; Haley, Chris S.

    2015-01-01

    We explore the prediction of individuals' phenotypes for complex traits using genomic data. We compare several widely used prediction models, including Ridge Regression, LASSO and Elastic Nets estimated from cohort data, and polygenic risk scores constructed using published summary statistics from genome-wide association meta-analyses (GWAMA). We evaluate the interplay between relatedness, trait architecture and optimal marker density, by predicting height, body mass index (BMI) and high-density lipoprotein level (HDL) in two data cohorts, originating from Croatia and Scotland. We empirically demonstrate that dense models are better when all genetic effects are small (height and BMI) and target individuals are related to the training samples, while sparse models predict better in unrelated individuals and when some effects have moderate size (HDL). For HDL sparse models achieved good across-cohort prediction, performing similarly to the GWAMA risk score and to models trained within the same cohort, which indicates that, for predicting traits with moderately sized effects, large sample sizes and familial structure become less important, though still potentially useful. Finally, we propose a novel ensemble of whole-genome predictors with GWAMA risk scores and demonstrate that the resulting meta-model achieves higher prediction accuracy than either model on its own. We conclude that although current genomic predictors are not accurate enough for diagnostic purposes, performance can be improved without requiring access to large-scale individual-level data. Our methodologically simple meta-model is a means of performing predictive meta-analysis for optimizing genomic predictions and can be easily extended to incorporate multiple population-level summary statistics or other domain knowledge. PMID:25918167

  7. Genomic diversity of Bombyx mori nucleopolyhedrovirus strains.

    Science.gov (United States)

    Xu, Yi-Peng; Cheng, Ruo-Lin; Xi, Yu; Zhang, Chuan-Xi

    2013-07-01

    Bombyx mori nucleopolyhedrovirus (BmNPV) is a baculovirus that selectively infects the domestic silkworm. In this study, six BmNPV strains were compared at the whole genome level. We found that the number of bro genes and the composition of the homologous regions (hrs) are the two primary areas of divergence within these genomes. When we compared the ORFs of these BmNPV variants, we noticed a high degree of sequence divergence in the ORFs that are not baculovirus core genes. This result is consistent with the results derived from phylogenetic trees and evolutionary pressure analyses of these ORFs, indicating that ORFs that are not core genes likely play important roles in the evolution of BmNPV strains. The evolutionary relationships of these BmNPV strains might be explained by their geographic origins or those of their hosts. In addition, the total number of hr palindromes seems to affect viral DNA replication in Bm5 cells. Copyright © 2013 Elsevier Inc. All rights reserved.

  8. Infrastructure for genomic interactions: Bioconductor classes for Hi-C, ChIA-PET and related experiments [version 1; referees: 2 approved

    Directory of Open Access Journals (Sweden)

    Aaron T. L. Lun

    2016-05-01

    Full Text Available The study of genomic interactions has been greatly facilitated by techniques such as chromatin conformation capture with high-throughput sequencing (Hi-C. These genome-wide experiments generate large amounts of data that require careful analysis to obtain useful biological conclusions. However, development of the appropriate software tools is hindered by the lack of basic infrastructure to represent and manipulate genomic interaction data. Here, we present the InteractionSet package that provides classes to represent genomic interactions and store their associated experimental data, along with the methods required for low-level manipulation and processing of those classes. The InteractionSet package exploits existing infrastructure in the open-source Bioconductor project, while in turn being used by Bioconductor packages designed for higher-level analyses. For new packages, use of the functionality in InteractionSet will simplify development, allow access to more features and improve interoperability between packages.

  9. Infrastructure for genomic interactions: Bioconductor classes for Hi-C, ChIA-PET and related experiments [version 2; referees: 2 approved

    Directory of Open Access Journals (Sweden)

    Aaron T. L. Lun

    2016-06-01

    Full Text Available The study of genomic interactions has been greatly facilitated by techniques such as chromatin conformation capture with high-throughput sequencing (Hi-C. These genome-wide experiments generate large amounts of data that require careful analysis to obtain useful biological conclusions. However, development of the appropriate software tools is hindered by the lack of basic infrastructure to represent and manipulate genomic interaction data. Here, we present the InteractionSet package that provides classes to represent genomic interactions and store their associated experimental data, along with the methods required for low-level manipulation and processing of those classes. The InteractionSet package exploits existing infrastructure in the open-source Bioconductor project, while in turn being used by Bioconductor packages designed for higher-level analyses. For new packages, use of the functionality in InteractionSet will simplify development, allow access to more features and improve interoperability between packages.

  10. Whole-genome sequence-based analysis of thyroid function

    DEFF Research Database (Denmark)

    Taylor, Peter N.; Porcu, Eleonora; Chew, Shelby

    2015-01-01

    Normal thyroid function is essential for health, but its genetic architecture remains poorly understood. Here, for the heritable thyroid traits thyrotropin (TSH) and free thyroxine (FT4), we analyse whole-genome sequence data from the UK10K project (N = 2,287). Using additional whole-genome seque...

  11. Comparative genomics of toxigenic and non-toxigenic Staphylococcus hyicus

    DEFF Research Database (Denmark)

    Leekitcharoenphon, Pimlapas; Pamp, Sünje Johanna; Andresen, Lars Ole

    2016-01-01

    The most common causative agent of exudative epidermitis (EE) in pigs is Staphylococcus hyicus. S. hyicus can be grouped into toxigenic and non-toxigenic strains based on their ability to cause EE in pigs and specific virulence genes have been identified. A genome wide comparison between non......-toxigenic and toxigenic strains has never been performed. In this study, we sequenced eleven toxigenic and six non-toxigenic S. hyicus strains and performed comparative genomic and phylogenetic analysis. Our analyses revealed two genomic regions encoding genes that were predominantly found in toxigenic strains...... (polymorphic toxin) and was associated with the gene encoding ExhA. A clear differentiation between toxigenic and non-toxigenic strains based on genomic and phylogenetic analyses was not apparent. The results of this study support the observation that exfoliative toxins of S. hyicus and S. aureus are located...

  12. Whole genome sequencing in clinical and public health microbiology.

    Science.gov (United States)

    Kwong, J C; McCallum, N; Sintchenko, V; Howden, B P

    2015-04-01

    Genomics and whole genome sequencing (WGS) have the capacity to greatly enhance knowledge and understanding of infectious diseases and clinical microbiology.The growth and availability of bench-top WGS analysers has facilitated the feasibility of genomics in clinical and public health microbiology.Given current resource and infrastructure limitations, WGS is most applicable to use in public health laboratories, reference laboratories, and hospital infection control-affiliated laboratories.As WGS represents the pinnacle for strain characterisation and epidemiological analyses, it is likely to replace traditional typing methods, resistance gene detection and other sequence-based investigations (e.g., 16S rDNA PCR) in the near future.Although genomic technologies are rapidly evolving, widespread implementation in clinical and public health microbiology laboratories is limited by the need for effective semi-automated pipelines, standardised quality control and data interpretation, bioinformatics expertise, and infrastructure.

  13. UniPrimer: A Web-Based Primer Design Tool for Comparative Analyses of Primate Genomes

    Directory of Open Access Journals (Sweden)

    Nomin Batnyam

    2012-01-01

    Full Text Available Whole genome sequences of various primates have been released due to advanced DNA-sequencing technology. A combination of computational data mining and the polymerase chain reaction (PCR assay to validate the data is an excellent method for conducting comparative genomics. Thus, designing primers for PCR is an essential procedure for a comparative analysis of primate genomes. Here, we developed and introduced UniPrimer for use in those studies. UniPrimer is a web-based tool that designs PCR- and DNA-sequencing primers. It compares the sequences from six different primates (human, chimpanzee, gorilla, orangutan, gibbon, and rhesus macaque and designs primers on the conserved region across species. UniPrimer is linked to RepeatMasker, Primer3Plus, and OligoCalc softwares to produce primers with high accuracy and UCSC In-Silico PCR to confirm whether the designed primers work. To test the performance of UniPrimer, we designed primers on sample sequences using UniPrimer and manually designed primers for the same sequences. The comparison of the two processes showed that UniPrimer was more effective than manual work in terms of saving time and reducing errors.

  14. Steric sea level variability (1993-2010) in an ensemble of ocean reanalyses and objective analyses

    Science.gov (United States)

    Storto, Andrea; Masina, Simona; Balmaseda, Magdalena; Guinehut, Stéphanie; Xue, Yan; Szekely, Tanguy; Fukumori, Ichiro; Forget, Gael; Chang, You-Soon; Good, Simon A.; Köhl, Armin; Vernieres, Guillaume; Ferry, Nicolas; Peterson, K. Andrew; Behringer, David; Ishii, Masayoshi; Masuda, Shuhei; Fujii, Yosuke; Toyoda, Takahiro; Yin, Yonghong; Valdivieso, Maria; Barnier, Bernard; Boyer, Tim; Lee, Tony; Gourrion, Jérome; Wang, Ou; Heimback, Patrick; Rosati, Anthony; Kovach, Robin; Hernandez, Fabrice; Martin, Matthew J.; Kamachi, Masafumi; Kuragano, Tsurane; Mogensen, Kristian; Alves, Oscar; Haines, Keith; Wang, Xiaochun

    2017-08-01

    Quantifying the effect of the seawater density changes on sea level variability is of crucial importance for climate change studies, as the sea level cumulative rise can be regarded as both an important climate change indicator and a possible danger for human activities in coastal areas. In this work, as part of the Ocean Reanalysis Intercomparison Project, the global and regional steric sea level changes are estimated and compared from an ensemble of 16 ocean reanalyses and 4 objective analyses. These estimates are initially compared with a satellite-derived (altimetry minus gravimetry) dataset for a short period (2003-2010). The ensemble mean exhibits a significant high correlation at both global and regional scale, and the ensemble of ocean reanalyses outperforms that of objective analyses, in particular in the Southern Ocean. The reanalysis ensemble mean thus represents a valuable tool for further analyses, although large uncertainties remain for the inter-annual trends. Within the extended intercomparison period that spans the altimetry era (1993-2010), we find that the ensemble of reanalyses and objective analyses are in good agreement, and both detect a trend of the global steric sea level of 1.0 and 1.1 ± 0.05 mm/year, respectively. However, the spread among the products of the halosteric component trend exceeds the mean trend itself, questioning the reliability of its estimate. This is related to the scarcity of salinity observations before the Argo era. Furthermore, the impact of deep ocean layers is non-negligible on the steric sea level variability (22 and 12 % for the layers below 700 and 1500 m of depth, respectively), although the small deep ocean trends are not significant with respect to the products spread.

  15. Pan-genome and phylogeny of Bacillus cereus sensu lato.

    Science.gov (United States)

    Bazinet, Adam L

    2017-08-02

    Bacillus cereus sensu lato (s. l.) is an ecologically diverse bacterial group of medical and agricultural significance. In this study, I use publicly available genomes and novel bioinformatic workflows to characterize the B. cereus s. l. pan-genome and perform the largest phylogenetic and population genetic analyses of this group to date in terms of the number of genes and taxa included. With these fundamental data in hand, I identify genes associated with particular phenotypic traits (i.e., "pan-GWAS" analysis), and quantify the degree to which taxa sharing common attributes are phylogenetically clustered. A rapid k-mer based approach (Mash) was used to create reduced representations of selected Bacillus genomes, and a fast distance-based phylogenetic analysis of this data (FastME) was performed to determine which species should be included in B. cereus s. l. The complete genomes of eight B. cereus s. l. species were annotated de novo with Prokka, and these annotations were used by Roary to produce the B. cereus s. l. pan-genome. Scoary was used to associate gene presence and absence patterns with various phenotypes. The orthologous protein sequence clusters produced by Roary were filtered and used to build HaMStR databases of gene models that were used in turn to construct phylogenetic data matrices. Phylogenetic analyses used RAxML, DendroPy, ClonalFrameML, PAUP*, and SplitsTree. Bayesian model-based population genetic analysis assigned taxa to clusters using hierBAPS. The genealogical sorting index was used to quantify the phylogenetic clustering of taxa sharing common attributes. The B. cereus s. l. pan-genome currently consists of ≈60,000 genes, ≈600 of which are "core" (common to at least 99% of taxa sampled). Pan-GWAS analysis revealed genes associated with phenotypes such as isolation source, oxygen requirement, and ability to cause diseases such as anthrax or food poisoning. Extensive phylogenetic analyses using an unprecedented amount of data

  16. What can we learn about lyssavirus genomes using 454 sequencing?